AUTHOR=Bühlmann Sven, Reymond Jean-Louis TITLE=ChEMBL-Likeness Score and Database GDBChEMBL JOURNAL=Frontiers in Chemistry VOLUME=8 YEAR=2020 URL=https://www.frontiersin.org/journals/chemistry/articles/10.3389/fchem.2020.00046 DOI=10.3389/fchem.2020.00046 ISSN=2296-2646 ABSTRACT=

The generated database GDB17 enumerates 166.4 billion molecules of up to 17 atoms of C, N, O, S and halogens following simple rules of chemical stability and synthetic feasibility. However, most molecules in GDB17 are too complex to be considered for chemical synthesis. To address this limitation, we report GDBChEMBL, a subset of GDB17 featuring 10 million molecules selected according to a ChEMBL-likeness score (CLscore) calculated from the frequency of occurrence of circular substructures in ChEMBL, followed by uniform sampling across molecular size, stereocenters and heteroatoms. Compared to the previously reported subsets FDB17 and GDBMedChem, which were selected from GDB17 by fragment-likeness and medicinal chemistry criteria, respectively, our new subset features molecules with higher synthetic accessibility and possibly higher bioactivity, yet retains the broad and continuous coverage of chemical space typical of the entire GDB17. GDBChEMBL is accessible at http://gdb.unibe.ch for download and for browsing using an interactive chemical space map at http://faerun.gdb.tools.
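
To illustrate the idea of scoring a molecule by how often its circular substructures occur in a reference collection, the following is a minimal sketch in plain Python. It is not the published CLscore: the toy molecule representation (atom symbols plus index pairs for bonds, bond orders ignored), the Morgan-style identifier strings, the radius of 2, the function names, and the averaging of log10 frequencies are all assumptions made for illustration, and the small reference list merely stands in for ChEMBL.

```python
from collections import Counter
from math import log10

# Toy molecule: (atom_symbols, bonds), where bonds are pairs of atom indices.
# Bond orders, charges and stereochemistry are ignored in this sketch.

def circular_substructures(atoms, bonds, max_radius=2):
    """Return the set of Morgan-style environment identifiers up to max_radius."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    current = {i: atoms[i] for i in neighbors}  # radius 0: the atom itself
    shells = set(current.values())
    for _ in range(max_radius):
        # Grow each environment by one bond: own identifier plus the
        # sorted identifiers of all directly bonded atoms.
        current = {
            i: "{}({})".format(
                current[i], ",".join(sorted(current[j] for j in neighbors[i]))
            )
            for i in neighbors
        }
        shells.update(current.values())
    return shells

def reference_counts(reference_molecules, max_radius=2):
    """Count in how many reference molecules each substructure occurs."""
    counts = Counter()
    for atoms, bonds in reference_molecules:
        counts.update(circular_substructures(atoms, bonds, max_radius))
    return counts

def likeness_score(molecule, counts, max_radius=2):
    """Mean log10 reference frequency over the molecule's substructures.

    Substructures never seen in the reference contribute zero
    (an assumed weighting, chosen only to keep the sketch simple).
    """
    atoms, bonds = molecule
    shells = circular_substructures(atoms, bonds, max_radius)
    return sum(log10(counts[s]) for s in shells if counts[s] > 0) / len(shells)

# Hypothetical three-atom chains used as stand-in data.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
ethanethiol = (["C", "C", "S"], [(0, 1), (1, 2)])
trisulfane = (["S", "S", "S"], [(0, 1), (1, 2)])

counts = reference_counts([ethanol] * 10)  # stand-in for a reference database

for name, mol in [("ethanol", ethanol),
                  ("ethanethiol", ethanethiol),
                  ("trisulfane", trisulfane)]:
    print(name, likeness_score(mol, counts))
```

In this sketch a molecule whose environments are all frequent in the reference set scores highest, one that shares only some environments scores lower, and one with no shared environments scores zero. A threshold on such a score could then be used to select candidates before sampling across size, stereocenters and heteroatoms.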