AUTHOR=Li Ting, Tong Weida, Roberts Ruth, Liu Zhichao, Thakkar Shraddha

TITLE=DeepCarc: Deep Learning-Powered Carcinogenicity Prediction Using Model-Level Representation

JOURNAL=Frontiers in Artificial Intelligence

VOLUME=4

YEAR=2021

URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2021.757780

DOI=10.3389/frai.2021.757780

ISSN=2624-8212

ABSTRACT=<p>Carcinogenicity testing plays an essential role in identifying carcinogens in environmental chemistry and drug development. However, evaluating carcinogenic potency with conventional 2-year rodent studies is a time-consuming and labor-intensive process. Thus, there is an urgent need for alternative approaches that provide reliable and robust assessments of carcinogenicity. In this study, we propose DeepCarc, a model that predicts the carcinogenicity of small molecules using deep learning-based model-level representations. The DeepCarc model was developed using a data set of 692 compounds and evaluated on a test set of 171 compounds from the National Center for Toxicological Research liver cancer database (NCTRlcdb). The DeepCarc model yielded a Matthews correlation coefficient (MCC) of 0.432 on the test set, outperforming four advanced deep learning (DL)-powered quantitative structure-activity relationship (QSAR) models with an average improvement of 37%. Furthermore, the DeepCarc model was used to screen compounds from both DrugBank and Tox21 for carcinogenic potential. Altogether, the proposed DeepCarc model could serve as an early detection tool (<ext-link ext-link-type="uri" xlink:href="https://github.com/TingLi2016/DeepCarc" xmlns:xlink="http://www.w3.org/1999/xlink">https://github.com/TingLi2016/DeepCarc</ext-link>) for carcinogenicity assessment.</p>