
Review article

https://doi.org/10.5562/cca4130

The State of the Art on Chemical Databases and Libraries

Višnja Stepanić ; Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia *
Dalibor Hršak ; Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia
Renata Kobetić ; Division of Organic Chemistry and Biochemistry, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia

* Corresponding author.




Pages: P1-P12




Abstract

Molecules that act on a biological target with at least micromolar potency are called hits. The usual method for identifying hits is high-throughput screening (HTS) of chemical libraries in relevant in vitro assays. A more efficient, cost-effective and faster approach is virtual pre-screening, in which only the top-scoring compounds are validated in appropriate in vitro assays. Both wet HTS and virtual screening, whether structure- or ligand-based, draw on large libraries containing millions to billions of drug-like compounds. In this paper, we provide an insight into the state of the art in large collections of low-molecular-weight molecules: i) public databases of synthetic compounds (PubChem, ChEMBL) and natural products (COCONUT, LOTUS), together with commercial ultra-large chemical libraries, ii) make-on-demand virtual libraries (Enamine, Galaxi®, ZINC-22) and iii) wet DNA-encoded libraries (DELs). Machine learning methods for characterising and visualising molecular diversity in screening collections are also described.

Keywords

database; library; ultra-large library; DNA-encoded library; virtual screening; hits; machine learning; visualization; PCA; t-SNE; UMAP

Hrčak ID:

322894

URI

https://hrcak.srce.hr/322894

Publication date:

30.11.2024.
