AUTHOR=Machado Lucas A., Krempser Eduardo, Guimarães Ana Carolina Ramos TITLE=A machine learning-based virtual screening for natural compounds capable of inhibiting the HIV-1 integrase JOURNAL=Frontiers in Drug Discovery VOLUME=2 YEAR=2022 URL=https://www.frontiersin.org/journals/drug-discovery/articles/10.3389/fddsv.2022.954911 DOI=10.3389/fddsv.2022.954911 ISSN=2674-0338 ABSTRACT=

HIV-1 integrase is an essential enzyme in the HIV-1 replication cycle, and integrase inhibitors are currently recommended as first-line treatment in many guidelines. Despite the discovery of new inhibitors, including a new class of molecules with different mechanisms of action, resistance remains a relevant problem, and adding new options to the therapeutic arsenal to fight viral resistance is a Sisyphean task. Because in vitro screenings are difficult and costly, machine learning-driven, ligand-based virtual screenings offer an alternative that can not only cut costs but also exploit valuable information about active compounds whose mechanisms of action are still unknown. In this work, we describe a thorough model exploration and hyperparameter tuning procedure on a dataset with class imbalance and present several models capable of distinguishing between compounds that are active and inactive against HIV-1 integrase. The best of these models was then used to screen the Natural Products Atlas for active compounds, yielding numerous molecules that share features with known integrase inhibitors. We also explore the strengths and shortcomings of our models and discuss the use of the applicability domain to guide in vitro screenings and to differentiate between the “predictable” and “unknown” regions of the chemical space.
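
The abstract does not say how the applicability domain is defined, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes compounds are represented as binary fingerprints (here, sets of on-bit indices) and treats a candidate as "predictable" when its nearest training compound reaches a Tanimoto similarity threshold. The threshold of 0.5, the helper names `tanimoto` and `in_domain`, and the toy fingerprints are all hypothetical.

```python
from typing import FrozenSet, Iterable

# A binary fingerprint, stored as the indices of its "on" bits (assumption).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def in_domain(query: Fingerprint,
              training: Iterable[Fingerprint],
              threshold: float = 0.5) -> bool:
    """True if the most similar training compound reaches `threshold`.

    The threshold is an illustrative choice, not a value from the paper.
    """
    nearest = max((tanimoto(query, t) for t in training), default=0.0)
    return nearest >= threshold


# Toy fingerprints (hypothetical bit patterns, not real molecules).
training_set = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5, 8})]
candidates = {
    "close_analogue": frozenset({1, 2, 3, 9}),
    "novel_scaffold": frozenset({20, 21, 22, 23}),
}

# Label each candidate by the region of chemical space it falls in.
labels = {
    name: "predictable" if in_domain(fp, training_set) else "unknown"
    for name, fp in candidates.items()
}
print(labels)
```

In a screening setting, predictions for compounds labelled "unknown" would be treated with more caution than those labelled "predictable", since the model has seen nothing similar during training.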