Original scientific paper

https://doi.org/10.5562/cca3776

QSPR Models for Prediction of Aqueous Solubility: Exploring the Potency of Randić-type Indices

Janja Sluga ; National Institute of Chemistry, Theory Department, Laboratory for Cheminformatics, Hajdrihova ulica 19, Ljubljana, Slovenia
Katja Venko ; National Institute of Chemistry, Theory Department, Laboratory for Cheminformatics, Hajdrihova ulica 19, Ljubljana, Slovenia
Viktor Drgan ; National Institute of Chemistry, Theory Department, Laboratory for Cheminformatics, Hajdrihova ulica 19, Ljubljana, Slovenia
Marjana Novič (ORCID: orcid.org/0000-0002-4243-2181) ; National Institute of Chemistry, Theory Department, Laboratory for Cheminformatics, Hajdrihova ulica 19, Ljubljana, Slovenia


page 311-319

Supplements: cca3776_Supplement.pdf


Abstract

This paper presents the development of QSPR models for predicting aqueous solubility (logS). A structurally diverse set of over 1600 compounds with experimentally determined solubility values, taken from the AqSolDB database, is used to build data-driven models based on multiple linear regression (MLR) and artificial neural networks (ANN). Molecular structures are encoded by numerous structural descriptors, including the connectivity index introduced by Randić in 1975 and the many variations later derived from it. To evaluate the potency of Randić-like descriptors in the structure-property relationship, we developed models based on two sets of descriptors: the first using only Randić-like descriptors calculated with Dragon, and the second using 17 commonly applied descriptors available in the AqSolDB database. All models were validated with external prediction sets, with RMSE values ranging from 0.8 to 1.1. Interestingly, the RMSE of the predicted logS values for models based only on the Randić-like descriptors was on average just 0.1 larger than that of the models built with the 17 descriptors preselected as suitable for modelling logS.
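
For readers unfamiliar with the two quantities named in the abstract, the following is a minimal illustrative sketch, not the authors' workflow (the paper computes its descriptors with Dragon). It assumes the standard definition of the Randić connectivity index as the sum of 1/sqrt(d_i * d_j) over all bonds of the hydrogen-suppressed molecular graph, where d_i is the number of heavy-atom neighbours of atom i, and the usual definition of RMSE. The function names `randic_index` and `rmse` and the edge-list representation of a molecule are assumptions made for this example.

```python
import math


def randic_index(edges):
    """Randić connectivity index of a hydrogen-suppressed molecular graph.

    `edges` is a list of (atom_a, atom_b) pairs, one per bond
    (hypothetical representation chosen for this sketch).
    """
    # Vertex degree = number of heavy-atom neighbours of each atom.
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Sum 1/sqrt(d_a * d_b) over all bonds.
    return sum(1.0 / math.sqrt(degree[a] * degree[b]) for a, b in edges)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Carbon skeletons of two C4H10 isomers (atoms are numbered arbitrarily).
n_butane = [(0, 1), (1, 2), (2, 3)]    # linear chain
isobutane = [(0, 1), (0, 2), (0, 3)]   # one central carbon, three branches

print(round(randic_index(n_butane), 4))
print(round(randic_index(isobutane), 4))
```

The branched isomer gives a smaller index than the linear one, which illustrates how the descriptor encodes molecular branching from connectivity alone.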

This work is licensed under a Creative Commons Attribution 4.0 International License.

Keywords

aqueous solubility; QSPR model; MLR; ANN; connectivity index; Randić-like indices

Hrčak ID:

261775

URI

https://hrcak.srce.hr/261775

Publication date:

25.4.2021.
