Three machine learning models for the 2019 Solubility Challenge

Mitchell, John

doi:10.5599/admet.835

ADMET and DMPK, Vol. 8 No. 3, 2020.

Original scientific paper

John Mitchell (orcid.org/0000-0002-0379-6097); EaStCHEM School of Chemistry and Biomedical Sciences Research Complex, University of St Andrews, North Haugh, St Andrews, Scotland, KY16 9ST, UK

page 215-250

cite

APA 6th Edition

Mitchell, J. (2020). Three machine learning models for the 2019 Solubility Challenge. ADMET and DMPK, 8 (3), 215-250. https://doi.org/10.5599/admet.835

Abstract

We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model being based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally well as one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds.
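The consensus idea can be illustrated with a minimal sketch. The abstract does not state how Vox Machinarum combines its three constituent predictions, so the code below assumes a simple arithmetic mean; the function names and all numerical values are hypothetical and are not taken from the paper. With mean averaging, the squared error of the consensus can never exceed the average squared error of its members, which is one way to read the Wisdom of Crowds effect.

```python
from math import sqrt
from statistics import fmean


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(fmean((p - o) ** 2 for p, o in zip(predicted, observed)))


def consensus(*prediction_sets):
    """Average the predictions of several models, molecule by molecule.

    Assumption: a plain arithmetic mean. The paper's actual combination
    rule is not given in the abstract.
    """
    return [fmean(values) for values in zip(*prediction_sets)]


# Made-up log S values for four hypothetical molecules (illustration only).
observed = [-3.0, -4.5, -2.0, -5.5]
rf_pred = [-3.4, -4.0, -2.5, -5.0]   # stand-in for Random Forest output
et_pred = [-2.8, -4.9, -1.6, -6.1]   # stand-in for Extra Trees output
bag_pred = [-3.3, -4.2, -2.6, -5.2]  # stand-in for Bagging output

vox_pred = consensus(rf_pred, et_pred, bag_pred)
individual_rmse = [rmse(p, observed) for p in (rf_pred, et_pred, bag_pred)]
vox_rmse = rmse(vox_pred, observed)

print("individual RMSE:", [round(r, 3) for r in individual_rmse])
print("consensus RMSE: ", round(vox_rmse, 3))
```

The bound concerns the average member, not the best one: a consensus can still be outperformed by its strongest constituent, which is consistent with Extra Trees being the best of the three on the first test set.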

Hrčak ID:

244116

URI

https://hrcak.srce.hr/244116

Publication date:

27.9.2020.
