Croatica Chemica Acta, Vol. 97 No. 4, 2024.
Review article
https://doi.org/10.5562/cca4130
The State of the Art on Chemical Databases and Libraries
Višnja Stepanić*; Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia
Dalibor Hršak; Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia
Renata Kobetić; Division of Organic Chemistry and Biochemistry, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia
* Corresponding author.
Abstract
Molecules that act on a biological target with at least micromolar potency are called hits. The usual method for identifying hits is high-throughput screening (HTS) of chemical libraries in relevant in vitro assays. A faster and more cost-effective approach is virtual pre-screening, in which only the top-scoring compounds are validated in appropriate in vitro assays. Both wet HTS and virtual screening, whether structure- or ligand-based, utilise large libraries containing millions to billions of drug-like compounds. In this paper, we provide insight into the state of the art in large collections of low-molecular-weight molecules: i) public databases for synthetic compounds (PubChem, ChEMBL) and natural products (COCONUT, LOTUS), and commercial ultra-large chemical libraries; ii) make-on-demand virtual libraries (Enamine, Galaxi®, ZINC-22); and iii) wet DNA-encoded libraries (DELs). Machine learning methods for characterising and visualising molecular diversity in screening collections are also described.
Keywords
database; library; ultra-large library; DNA-encoded library; virtual screening; hits; machine learning; visualization; PCA; t-SNE; UMAP
Hrčak ID: 322894
Publication date: 30.11.2024.