MINI REVIEW article

Front. Artif. Intell., 18 August 2020

Sec. Medicine and Public Health

Volume 3 - 2020 | https://doi.org/10.3389/frai.2020.00065

Artificial Intelligence for COVID-19 Drug Discovery and Vaccine Development

  • 1. Burnett School of Biomedical Sciences, University of Central Florida, Orlando, FL, United States

  • 2. Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL, United States

  • 3. A2A Pharmaceuticals, Cambridge, MA, United States

  • 4. Atomwise Inc., San Francisco, CA, United States

  • 5. Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ, United States

  • 6. Department of Biochemistry and Biophysics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States

Abstract

SARS-CoV-2 has roused the scientific community with a call to action to combat the growing pandemic. At the time of this writing, there are as yet no novel antiviral agents or approved vaccines available for deployment as a frontline defense. Understanding the pathobiology of COVID-19 could aid scientists in their discovery of potent antivirals by elucidating unexplored viral pathways. One way to accomplish this is to leverage computational methods to discover new candidate drugs and vaccines in silico. In the last decade, machine learning-based models, trained on specific biomolecules, have offered inexpensive and rapid implementation methods for the discovery of effective viral therapies. Given a target biomolecule, these models are capable of predicting inhibitor candidates in a structure-based manner. If enough data are presented to a model, it can aid the search for a drug or vaccine candidate by identifying patterns within the data. In this review, we focus on the recent advances of COVID-19 drug and vaccine development using artificial intelligence and the potential of intelligent training for the discovery of COVID-19 therapeutics. To facilitate applications of deep learning for SARS-CoV-2, we highlight multiple molecular targets of COVID-19, inhibition of which may increase patient survival. Moreover, we present CoronaDB-AI, a dataset of compounds, peptides, and epitopes discovered either in silico or in vitro that can potentially be used for training models to identify COVID-19 treatments. The information and datasets provided in this review can be used to train deep learning-based models and accelerate the discovery of effective viral therapies.

Introduction

Coronaviridae is a viral family responsible for causing pneumonia-like symptoms that has been a global threat since its first outbreak in 2002 (Jabeer Khan et al., 2020). Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS), emerging in 2002 and 2012, respectively, caused diseases marked by both gastrointestinal and pulmonary dysfunction (Hilgenfeld and Peiris, 2013). In 2019, SARS-CoV-2 was the causative agent of a third coronavirus outbreak and has been identified as the virus responsible for COVID-19, the symptoms of which range from those of the common cold to more severe respiratory failure (Kong W.-H. et al., 2020). Despite its having been declared a pandemic by the World Health Organization (WHO), COVID-19 has continued to spread and has infected at least 20 million individuals, reaching a death toll of over half a million at the time of this review (Worldometer, 2020).

While hospitals are resorting to trial and error tactics for COVID-19 drug discovery, Virtual Screening (VS) has emerged as a popular method for discovering potent compounds due to the inefficiency of lab-based high throughput screening (HTS) (Jin et al., 2020; Kandeel and Al-Nazawi, 2020). VS for rational drug discovery is essentially an approach that involves computationally targeting a specific biomolecule (e.g., DNA, protein, RNA, lipid) of a cell to inhibit its growth and/or activation (Shoichet, 2004; Lionta et al., 2014). Structure-based and ligand-based drug discovery and design are two important subgroups of this type of screening (Lionta et al., 2014; Yu and Mackerell, 2017; Arshadi et al., 2020; Broom et al., 2020). Given our access to computationally and experimentally determined viral protein structures (Senior et al., 2020; Zhang L. et al., 2020), VS provides a rapid and cost-effective strategy for identifying antiviral candidates.

Additionally, conventional vaccine discovery methods have been costly, and it may take many years to develop an appropriate vaccine against a specified pathogen. In the early 1990s, the introduction of a genome-based vaccine design approach dubbed “Reverse Vaccinology” (RV) (Rappuoli, 2000; Bullock et al., 2020) made the field considerably more efficient, due in part to the fact that bacterial culturing was no longer required for identifying vaccine targets (Bruno et al., 2015; Heinson et al., 2015; Soria-Guerra et al., 2015). Moreover, all of the putative target protein antigens can be identified, rather than identification being limited to those isolated from bacterial cultures (Xiang and He, 2009; Bowman et al., 2011). Taken together, these advantages led scientists to develop RV prediction programs.

Over the past decade, artificial intelligence (AI)-based models have revolutionized drug discovery in general (Zhong et al., 2018; Duan et al., 2019; Lavecchia, 2019). AI has also led to the creation of many RV virtual frameworks, which are generally classified as rule-based filtering models (Naz et al., 2019; Ong et al., 2020a). Machine learning (ML) enables the creation of models that learn and generalize the patterns within the available data and can make inferences from previously unseen data. With the advent of deep learning (DL), the learning procedure can also include automatic feature extraction from raw data (Lecun et al., 2015). Moreover, it has recently been found that deep learning's feature extraction can result in superior performance compared to other computer-aided models (Ma et al., 2015; Chen et al., 2018; Zhavoronkov et al., 2019).

In this review, we provide a survey of AI-based models for COVID-19 drug discovery and vaccine development. Moreover, we identify and evaluate the best candidate targets for future treatment development. We propose that a concerted effort should be made to leverage the knowledge from pre-existing data by using machine learning approaches. To that end, we present a wide-ranging collection of small molecules, peptides, and epitopes for therapy discovery that could also direct AI-based models, screening, or generation, in an intelligent manner.

Background of Machine Learning Methods for Therapy Discovery

In recent years, machine learning has revolutionized many fields of science and engineering. It has largely transformed our daily lives, from speech and face recognition (Alaghband et al., 2020; Grover and Toghi, 2020; Sun et al., 2020) to customized targeted advertisements (Zhai et al., 2016). The power of automatic abstract feature learning, combined with a massive volume of data, has immensely contributed to the successful application of ML (Lecun et al., 2015). Two of the most impactful areas affected are drug and vaccine discovery (Chen et al., 2018), in which ML has offered compound property prediction (Ma et al., 2015), activity prediction (Zhavoronkov et al., 2019), reaction prediction (Fooshee et al., 2018), and ligand–protein interaction.

On the prediction front, Graph Convolutional Neural Networks (GCNN) have been the favorite tool for drug discovery applications (Duvenaud et al., 2015; Kearnes et al., 2016). These networks are able to handle graphs and extract features by encoding each node's adjacency information into its feature representation. Successful representation learning from molecules using GCNNs has been demonstrated in drug property prediction (Heskett et al., 2018; Bazgir et al., 2019; Liu et al., 2019), protein interface estimation (Fout et al., 2017), reactivity prediction (Coley et al., 2019), and drug–target interactions (Torng and Altman, 2019; Wang et al., 2020). Sequence-based models for genomic, proteomic, and transcriptomic data have also gained some attention in recent years due to the advancements made in the natural language processing domain. The more recent generation of context-based models are transformers that use attention mechanisms and self-supervision to extract representations from sequences (Vaswani et al., 2017; Devlin et al., 2018). Transformers have demonstrated the capacity to predict drug–target interactions (Shin et al., 2019), model protein sequences (Choromanski et al., 2020), and predict retrosynthetic reactions. These models learn to extract features from sequences based on the location, context, and order of the input tokens (Belinkov and Glass, 2018). Recurrent neural networks (RNNs) and long short-term memory (LSTM) networks have also performed well when trained on molecules or protein sequences for secondary structure prediction (Pollastri et al., 2002), quantitative structure–activity relationship (QSAR) modeling (Chakravarti et al., 2019), and function prediction (Liu, 2017).
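To make the idea of encoding adjacency information into node features concrete, the following is a minimal sketch of a single graph-convolution step on a toy three-atom molecule. It assumes the widely used symmetric-normalized formulation ReLU(Â X W) with self-loops; the atom features, the identity weight matrix, and all function names are illustrative assumptions and do not reproduce any of the cited models.

```python
import math

def normalized_adjacency(n_atoms, bonds):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected molecular graph."""
    a = [[1.0 if i == j else 0.0 for j in range(n_atoms)] for i in range(n_atoms)]
    for i, j in bonds:
        a[i][j] = 1.0
        a[j][i] = 1.0
    degree = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n_atoms)]
            for i in range(n_atoms)]

def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    mixed = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in mixed]

def mean_pool(node_features):
    """Average node vectors into one fixed-size molecule embedding."""
    n = len(node_features)
    return [sum(column) / n for column in zip(*node_features)]

# Toy molecule (heavy atoms of ethanol): C0 - C1 - O2
bonds = [(0, 1), (1, 2)]
# Illustrative one-hot atom features: [is_carbon, is_oxygen]
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Identity matrix as a placeholder for weights that would normally be learned
weights = [[1.0, 0.0], [0.0, 1.0]]

a_hat = normalized_adjacency(3, bonds)
hidden = gcn_layer(a_hat, features, weights)
embedding = mean_pool(hidden)
```

After one step, the carbon bonded to oxygen carries a non-zero oxygen component, while the carbon two bonds away does not; stacking further layers would propagate information across longer paths.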

On the lead generation front, de novo design has benefitted the most from the application of deep learning. This subfield has drastically evolved from its traditional usage of ligand-based models and creating molecules from sub-blocks (Acharya et al., 2010). Traditional de novo design cannot fully explore chemical space because it constrains the generation of molecules to ligand or fragment libraries. The current approach involves the use of state-of-the-art deep learning models such as Generative Adversarial Networks (GANs) to create data-oriented molecules (Guimaraes et al., 2017). More recent approaches utilize deep learning generative models such as variational autoencoders (VAE) (De Cao and Kipf, 2018) in order to create sequences of atoms. This approach lifts the constraints of ligand-based designs and allows the generation of unique molecules with greater diversity (Guimaraes et al., 2017; De Cao and Kipf, 2018; Jin et al., 2018; Liu et al., 2018; Simonovsky and Komodakis, 2018).

Machine learning has also improved the field of vaccine design over the past two decades. VaxiJen was the first implementation of ML in RV approaches and has shown promising results for antigen prediction (Doytchinova and Flower, 2007; Heinson et al., 2017). In addition, the recent development of Vaxign-ML, a web-based RV program leveraging machine learning approaches for bacterial antigen prediction, is a testament to the success of applying ML-based methods in RV (He et al., 2010a; Heinson et al., 2017). In essence, these pipelines consist of feature extraction, feature selection, data augmentation, and cross-validation implemented to predict vaccine candidates against various bacterial and viral pathogens known to cause infectious disease. The use of biological, structural, and physicochemical features is prevalent among the approaches in this domain, as seen in reverse vaccinology and immunoinformatic methods such as IEDB and BlastP, which serve as feature extractors for AI-based models like RNNs in the study of different pathogenic viruses (Flower et al., 2010; He and Zhu, 2015; Abbasi, 2020). More recently, graph-based features have also shown the ability to represent antibodies in place of expert-designed features; in the approach of Magar et al., graph featurization is followed by mean pooling, and classification is then performed using shallow and deep models (Magar et al., 2020). Deep learning approaches have also revolutionized the field of cancer vaccinology through the improved prediction of neoantigens and their HLA binding affinity (Sher et al., 2017; Tran et al., 2019; Wu et al., 2019). Deep learning autoencoders have shown promising improvement in extracting characteristics of Human Leukocyte Antigen (HLA-A), which could be utilized in both transplantations and vaccine discovery (Miyake et al., 2018).
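As a rough illustration of the feature extraction and cross-validation stages described above, the sketch below turns peptide sequences into simple physicochemical features (amino-acid composition plus mean Kyte-Doolittle hydropathy) and builds k-fold train/test splits. The choice of features, the toy peptides, and the function names are assumptions made for illustration; they are not the feature sets used by VaxiJen, Vaxign-ML, or any other cited tool.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KYTE_DOOLITTLE)

def featurize(peptide):
    """Return 20 composition fractions followed by the mean hydropathy."""
    total = len(peptide)
    composition = [peptide.count(aa) / total for aa in AMINO_ACIDS]
    hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / total
    return composition + [hydropathy]

def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; every sample is held out exactly once."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for fold in folds[:i] + folds[i + 1:] for idx in fold]
        yield train, test

# Hypothetical toy peptides, not taken from any dataset
peptides = ["ACDEFGHIK", "LLLVVVIII", "KKRRDDEEN",
            "GASTPGAST", "FWYFWYFWY", "MNQHCMNQH"]
feature_matrix = [featurize(p) for p in peptides]
splits = list(k_fold_indices(len(feature_matrix), k=3))
```

In a full pipeline, a classifier would be fitted on each training split and evaluated on the corresponding held-out split, with feature selection performed inside each fold to avoid information leakage.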

Key aspects of therapy discovery are safety and reliability. The Vaccine Adverse Event Reporting System (VAERS) and Vaccine Safety Datalink (VSD) have been among the most popular immunization registries for tracking, recording, and predicting vaccine safety. In prior decades, implementations of computational simulation and mathematical modeling have significantly improved the tradeoff between the assessment of safety and efficacy by using the aforementioned resources (He et al., 2010b; Vaishnav et al., 2015). Zheng et al. implemented Natural Language Processing (NLP) for the identification of adverse events related to Tdap vaccines (Zheng et al., 2019).

In drug development cases, the final drug candidate produced in the process of drug discovery needs to be safe for human consumption. This requires an observation of the drug's side effects as well as confirmation that the drug is non-toxic. To accomplish this, the Toxicology in the 21st Century program (Tox-21) has screened ~10,000 compounds from 70 screening assays, creating a database that can be used to facilitate toxicity modeling. Furthermore, the project has also expanded to contain 700 assays with nearly 1,800 molecules in the ToxCast dataset. On the side-effect prevention front, the off-target interactions are predicted and minimized in silico. In doing so, potential drug candidates are chosen, with consideration given to their off-target polypharmacological profiles (Zhou H. et al., 2015). In a different approach, AI-based studies were implemented to detect the potential prolongation of QT intervals and cardiotoxicity of a candidate drug, hydroxychloroquine, using ECG data from smartwatches (Li J. et al., 2020)1.

In summary, artificial intelligence has been applied to many subfields of drug discovery and vaccine development. This improvement is crucial for the current situation and immediate SARS-CoV-2 therapy discovery for several key reasons. Firstly, the automatic feature extraction ability of deep learning can support models with better accuracy and deliver more reliable results. Secondly, the generative ability demonstrated by deep learning models can be utilized to create more druggable molecules and better epitope prediction, lowering the chance of failure in the trial pipeline. Lastly, the novelty of the virus causes the data around its possible therapies to be scarce, which is a suitable scenario for transfer learning and leveraging the learned knowledge from previous tasks (e.g., Transcreen™) (Salem et al., 2020). Transfer learning has been shown to alleviate this problem through the transferring of learned knowledge and parameters from a secondary task with big data available to the task at hand (Weiss et al., 2016). Therefore, the use of deep learning in therapy discovery for SARS-CoV-2 is essential in order to make a timely and accurate response to the virus.
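The parameter-transfer idea can be illustrated with a deliberately small example: a linear model is first fitted on a data-rich source task, and its parameters are then used as the starting point for a few gradient steps on a related, data-poor target task. Both tasks, their data, and the hyperparameters below are hypothetical, and a one-dimensional linear model stands in for the deep networks discussed in this review.

```python
def mse(w, b, data):
    """Mean squared error of the model y = w * x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def gradient_descent(w, b, data, lr=0.1, steps=5):
    """Full-batch gradient descent on the mean squared error."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

# Hypothetical data-rich source task: y = 2x + 1
source = [(i / 10, 2 * (i / 10) + 1) for i in range(11)]
# Hypothetical data-poor target task, related to the source: y = 2x + 1.5
target = [(0.0, 1.5), (0.5, 2.5), (1.0, 3.5)]

# Pre-train on the source task until convergence
pre_w, pre_b = gradient_descent(0.0, 0.0, source, steps=2000)

# Fine-tune from the pre-trained parameters with only a few target steps
ft_w, ft_b = gradient_descent(pre_w, pre_b, target, steps=5)

# Baseline: the same few steps on the target task, starting from scratch
sc_w, sc_b = gradient_descent(0.0, 0.0, target, steps=5)

finetuned_loss = mse(ft_w, ft_b, target)
scratch_loss = mse(sc_w, sc_b, target)
```

With the same small training budget on the target task, the warm-started model ends at a lower target loss than the model trained from scratch, which is the behaviour transfer learning aims to exploit when target data are scarce.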

COVID-19 Molecular Mechanism and Target Selection

Coronaviruses are enveloped viruses with a positive-sense single-stranded RNA genome (Fehr and Perlman, 2015). They are known to infect both humans and other eukaryotes (Andersen et al., 2020; Hoffmann et al., 2020). The novel coronavirus manages to bind to the host receptor with a higher affinity than SARS due to the increased modification of its viral spike, among other structural proteins, resulting in enhanced transmission (Zhou Y. et al., 2020).

SARS-CoV-2 interaction with host cells begins with attachment via the viral spike (S) protein to the host ACE2 receptor (Hoffmann et al., 2020; Zhou P. et al., 2020). ACE2 binding induces the host surface serine protease, TMPRSS2, to prime the S protein via cleavage at its S1/S2 border, facilitating viral fusion with the cell membrane (Hoffmann et al., 2020). Once inside the cell, the viral RNA genome is released into the cytosol, where it is translated by host ribosome machinery, producing two polyproteins: pp1a and pp1ab, which are then cleaved by viral 3CL protease (main protease) and PL protease. This gives rise to several non-structural proteins (nsps) as the foundation of RNA-dependent RNA polymerase (RdRP); this RdRP then transcribes a template strand of the genomic RNA, from which it then transcribes subgenomic mRNA products to be translated. These products encode the structural proteins S, E, M, and N, as well as additional accessory nsps (Figure 1) (Lai and Cavanagh, 1997; Kim D. et al., 2020).

Figure 1

The severity of the host response depends on an innate response to viral recognition, involving the expression of type-1 IFNs and pro-inflammatory cytokines (Pazhouhandeh et al., 2018; Prompetchara et al., 2020). If the antiviral response is delayed or inhibited, viral proliferation can lead to the large-scale recruitment of neutrophils and monocyte-macrophages to the lungs, creating a hyperinflammatory environment (Prompetchara et al., 2020). Overactive release of pro-inflammatory cytokines, i.e., cytokine storm (CS), has been found in COVID-19 patients and can lead to severe complications like acute respiratory distress syndrome (ARDS) (Moore and June, 2020). It has been found that levels of IL-1B, IL-1RA, IL-8, IL-10, IFNγ, IP10, MCP1, and MIP1s are higher in COVID-19 patients than in healthy adults (Huang et al., 2020). IL-6, in particular, has been highly implicated in CS and COVID-19 severity, and inhibition of IL-6/IL-6R activity may lead to improved patient outcome, increasing its desirability as a target (Figure 1) (Scheller et al., 2014; Tanaka et al., 2016; Zhang C. et al., 2020).

Throughout the process of viral entry, replication, and dissemination, there are several proteins that can serve as suitable targets for therapeutic intervention. The S protein is one of the candidates receiving the most focus, as it is necessary for viral entry into host cells and is highly specific to the virus itself. The host receptor ACE2 is another possible target, but the presence of ACE2 in non-lung tissues such as heart, kidney, and intestine (Hamming et al., 2004) could complicate its inhibition. Another host protein, the TMPRSS2 protease, is essential for viral entry into the cell, making it an additional viable target (Hoffmann et al., 2020).

COVID-19 Drug Discovery

Protein-Based

The recent applications of Artificial Intelligence for COVID-19 include the virtual screening of both repurposed drug candidates and new chemical entities. For repurposed drugs, the goal has been to rapidly predict and exploit interconnected biological pathways or the off-target biology of existing medicines that are proven safe and can thus be readily tested in new clinical trials. In one of the early attempts, Gordon et al. paved the way for the repurposing of candidate drugs by experimentally identifying 66 human proteins linked with 26 SARS-CoV-2 proteins (Gordon et al., 2020). In addition to wet-lab approaches, network-based model simulation has been the main computational approach for analyzing the virus–host interactome (Messina et al., 2020). Li et al. identified 30 drugs for repurposing by analyzing the genome sequence of three main viral family members of the coronavirus and then relating them to the human disease-based pathways (Li X. et al., 2020). In a different approach, Zhou et al. offered a combination of network-based methodologies for repurposed drug combination (Zhou Y. et al., 2020).

UK-based BenevolentAI leveraged its AI-derived knowledge graph, which integrates biomedical data from structured and unstructured sources (Richardson et al., 2020). It targeted the inhibition of host protein AAK1 and identified Baricitinib, an approved drug for the treatment of rheumatoid arthritis (Stebbing et al., 2020). Similarly, Beck et al. published an application of their DL-based drug–target interaction model that predicted commercially available antiviral drugs that may target the SARS-COV-2-related protease and helicase (Beck et al., 2020a). Atomwise has also focused on targeting several SARS-CoV-2 protein binding sites that are highly conserved across multiple coronavirus species in an effort to develop new broad-spectrum antivirals. Using its AtomNet® deep convolutional neural network technology (Wallach et al., 2020), Atomwise is screening millions of virtual compounds against these diverse targets alongside 15 different partnerships with academic researchers that will test the predicted compounds in their in vitro assays2.

There have been several other applications of multi-task deep learning models for identifying existing drugs that can target the main viral proteins, especially the main protease (3CLpro) and spike protein (Hu et al., 2020; Kadioglu et al., 2020; Kim J. et al., 2020; Redka et al., 2020). One impressive example is Cyclica's creation and mining of PolypharmDB, a platform of known drugs and their predicted binding to human protein targets that uncovered off-target applications of 30 existing drugs against the viral protein 3CLpro and the ACE2 binding site as two examples (Redka et al., 2020). At least two other applications of DL-based virtual screening for the SARS-CoV-2 main protease have been published and include the open sharing of newly predicted chemical structures (Bung et al., 2020; Zhang H. et al., 2020).

ML-aided molecular docking has been one of the most prevalent approaches for virtual screening. This process normally requires the following: (1) Dataset of Druglike or Approved Molecules, (2) Crystal Structure or Homology Model of the target, (3) Molecular Docking Program, and (4) Compute Resources (Ewing et al., 2001; Pagadala et al., 2017). Through docking, many molecules have been reported to fit the binding site of various SARS-CoV-2 proteins essential for viral replication and infection. 3CLpro, Spike Protein, RdRP, and PLpro are among those screened, as well as the host ACE2 receptor and TMPRSS2 protease (Chen et al., 2020; Choudhary et al., 2020; Kong R. et al., 2020; Smith and Smith, 2020; Wu et al., 2020). As an example, Ton et al. identified at least 1000 protease inhibitors by creating and utilizing the Deep Docking (DD) network technology approach. However, as they used the QSAR for training their model, no novel docking score was provided (Ton et al., 2020).
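The general idea of using a learned model to reduce the number of docking calculations can be sketched as follows: dock a small random sample of the library, fit a surrogate model on those scores, and use its predictions to prioritize the remaining molecules for docking. This is only a loose illustration of surrogate-assisted screening and not the Deep Docking implementation of Ton et al.; the docking function is a hypothetical stand-in, the two-dimensional descriptors are random toy values, and a k-nearest-neighbor average replaces the neural network a real pipeline would use.

```python
import random

def mock_docking_score(descriptor):
    """Hypothetical stand-in for a docking program (lower score = better pose)."""
    x, y = descriptor
    return -(2.0 * x + y)

def squared_distance(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def knn_predict(query, labelled, k=3):
    """Surrogate model: mean score of the k nearest already-docked molecules."""
    nearest = sorted(labelled, key=lambda item: squared_distance(item[0], query))[:k]
    return sum(score for _, score in nearest) / len(nearest)

rng = random.Random(0)
# Toy library: each molecule is represented by two made-up descriptors
library = [(rng.random(), rng.random()) for _ in range(200)]

# 1) Dock only a small random sample of the library
sample = rng.sample(library, 40)
labelled = [(d, mock_docking_score(d)) for d in sample]

# 2) Rank the undocked molecules by their surrogate-predicted score
sampled = set(sample)
remaining = [d for d in library if d not in sampled]
ranked = sorted(remaining, key=lambda d: knn_predict(d, labelled))

# 3) Dock only the predicted top candidates
shortlist = ranked[:20]
docked_shortlist = sorted((mock_docking_score(d), d) for d in shortlist)
docking_calls = len(sample) + len(shortlist)
```

In this toy setting only 60 of the 200 molecules are ever docked; in practice the sample, train, and prioritize cycle would be repeated over several iterations and validated against held-out docking scores.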

It is clear that 3CLpro is the most popular target for virtual screening (Figure 1). The main reason for this is its pivotal role in viral replication and transcription and its well-defined structural information. Viral protease inhibitors have been extensively studied as treatments for other viruses. In addition, deep learning-aided approaches have been the main focus of research, as their automatic feature extraction accelerates discovery. The datasets cited often rely on the ZINC database (Wu et al., 2020), while other screened datasets include the FDA-approved LOPAC library (Choudhary et al., 2020), SWEETLEAD library (Smith and Smith, 2020), or all purchasable drugs (Drugs-lib) (Chen et al., 2020). Moreover, the publications sampled in this review used a wide range of computational resources. Virtual screening can be carried out on a small scale on a macOS Mojave workstation with an 8-core Xeon E5 processor or on a large scale on the world's most powerful supercomputer, SUMMIT, for enhanced parallelization (Choudhary et al., 2020; Smith and Smith, 2020).

RNA-Based

Conserved structured elements have already been shown to play critical functional roles in the life cycles of Coronaviruses (Yang and Leibowitz, 2015). Through direct interactions with host RNA-binding proteins and helicases, structural elements add a layer of complexity to the regulatory information that is encoded in the viral RNA. Targeted disruption of the regulatory functions of these structural elements provides a largely unexplored strategy that can limit viral loads with minimal impact on the biology of normal cells (Park et al., 2011). While this idea would have been farfetched a mere 5 years ago, advances in AI-driven computational modeling and high-throughput experimental RNA shape analyses have all but overcome the critical barriers (Alipanahi et al., 2015).

Highly conserved RNA structural elements have been identified in a number of viral families, many of which have been functionally validated (Jaafar and Kieft, 2019). Some of the stem loops in the SARS-CoV-2 5′UTR are conserved across betacoronaviruses and are known to impact viral replication (Yang and Leibowitz, 2015). There are many functional RNA structural elements that fall within the coding sequence and the 3′UTR as well (Plant and Dinman, 2008; Stammler et al., 2011). Rangan et al. identified 106 structurally conserved regions that would be suitable biotargets for unexplored antiviral agents (Rangan et al., 2020). Moreover, they predicted at least 59 unstructured regions that are conserved within SARS-CoV-2. Park et al. identified an RNA pseudoknot-binding molecule against SARS-CoV-1 in target-based virtual screening (Park et al., 2011; Nakagawa et al., 2016).

Studying the changes in RNA information also allows for the identification of new and evolved targets. In a different approach, Wu et al. showed that a recently FDA-approved drug named Remdesivir could bind to the RNA-binding channel of the novel coronavirus. They discovered other candidate drugs via analyzing the proteins critical to RNA processing and pathways (Wu et al., 2020). It seems that viral genome, RdRP, and processed mRNA would make promising targets for drug repurposing.

Generative Approaches

Molecule generation has been one of the fields of drug discovery most revolutionized by the implementation of artificial intelligence over the last decade. As mentioned, the VAE is a generative model that can enhance the diversity of generated data. Autoencoders encode molecules into a vector that captures properties such as bond order, element, and functional group (Bjerrum and Sattarov, 2018). Chenthamarakshan et al., together with IBM Research, demonstrated a VAE that captures molecules in a latent space. Once captured, variations are made on the original molecule vectors based on desired properties. These can then be decoded back into novel molecules (Chenthamarakshan et al., 2020). To optimize the structures, QED, Synthetic Accessibility, and LogP regressors were used to improve the latent space variations.

In a different approach, Tang et al. overcame many of the issues with traditional generative models by developing a novel advanced deep Q-learning network with fragment-based drug design (ADQN-FBDD). This allowed for the enhanced exploration of chemical space by assembling molecules targeting SARS-CoV-2 one fragment at a time rather than relying on latent space adjustments. After making connections and rewarding molecules with the most druglike connections, a pharmacophore and descriptor filter was used to refine the set. They demonstrated a robust method for designing novel, high-binding compounds refined to the structure of SARS-CoV-2 3CLpro (Tang et al., 2020). To design a drug-generative network, the following is necessary: (1) a collection of druglike molecules, (2) a representation of these molecules in silico (e.g., fingerprints, tokenizers), (3) a method of altering molecules to increase diversity, and (4) screening and modification of the altered molecules. Pursuing GAN-related models, Insilico Medicine used three of its previously validated generative chemistry approaches to target the main protease, namely, crystal-derived pocket-based generation, homology modeling-based generation, and ligand-based generation (Zhavoronkov et al., 2020). Similar to target-based virtual screening, the main protease has been the main object of interest for scientists for de novo drug discovery.
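The second requirement listed above, an in silico representation of molecules, is often met by tokenizing SMILES strings before feeding them to a sequence model. The sketch below shows one simplified way to do this with a regular expression; the pattern is an assumption that covers common SMILES syntax (bracket atoms, two-letter halogens, ring closures, bonds, and branches) but does not validate chemistry, and the function names are illustrative.

```python
import re

# Simplified SMILES token pattern: bracket atoms, Br/Cl, two-digit ring
# closures, single-letter atoms, ring digits, then bond and branch symbols.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[()=#+\-\\/.:@~*$]")

def tokenize_smiles(smiles):
    """Split a SMILES string into tokens, failing loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens

def build_vocab(smiles_list):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for smiles in smiles_list:
        for token in tokenize_smiles(smiles):
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(smiles, vocab):
    """Convert a SMILES string into the list of token ids a model would consume."""
    return [vocab[token] for token in tokenize_smiles(smiles)]

# Ethanol and acetyl chloride as small example molecules
corpus = ["CCO", "CC(=O)Cl"]
vocab = build_vocab(corpus)
encoded = [encode(smiles, vocab) for smiles in corpus]
```

Treating "Cl" and "Br" as single tokens prevents chlorine from being read as a carbon followed by a stray letter, which matters because generative models learn over the token vocabulary rather than over raw characters.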

COVID-19 Vaccine Discovery

Identification of the best possible targets for the development of a vaccine is crucial in order to counteract a virus's high infection rate (Choudhary et al., 2020). A host immune system fights virus-infected cells either through the production of antibodies by B cells or through the direct attack of T cells (Amanat and Krammer, 2020). HLA genes encode MHC-I and MHC-II proteins, which present epitopes as antigenic determinants. These proteins assist B-cell and T-cell antibodies in their ability to bind and attack invaders (Dangi et al., 2018; Gupta et al., 2020; Smith and Smith, 2020). Machine learning approaches, including Random Forest (RF), Support Vector Machine (SVM), and Recursive Feature Elimination (RFE), have been basic tools for identifying antigens from protein sequences (Bowick et al., 2010; Rahman et al., 2019). However, due to their low sensitivity in the prediction of locally clustered interactions in some cases, Deep Convolutional Neural Networks (DCNN) have been a more effective alternative for the binding prediction of MHC and peptides (Han and Kim, 2017).
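Before a neural network can score peptide binding to MHC molecules, candidate peptides have to be enumerated from a protein and converted into numeric input. The sketch below shows one common way to do this, assuming a sliding window of nine residues and a one-hot encoding over the 20 standard amino acids; the window length, the encoding, and the toy sequence are illustrative assumptions and do not reproduce the input scheme of any cited predictor.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def sliding_peptides(protein, length=9):
    """Return every overlapping peptide of the given length, in sequence order."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]

def one_hot(peptide):
    """Encode a peptide as a (len(peptide) x 20) matrix of 0/1 values."""
    matrix = []
    for aa in peptide:
        row = [0] * len(AMINO_ACIDS)
        row[AA_INDEX[aa]] = 1
        matrix.append(row)
    return matrix

# Made-up 22-residue sequence for illustration; not a SARS-CoV-2 protein
toy_protein = "MKTAYIAKQRQISFVKSHFSRQ"

candidates = sliding_peptides(toy_protein, length=9)
encoded = [one_hot(peptide) for peptide in candidates]
```

Each encoded peptide is a fixed-size matrix, which is the kind of grid-like input a convolutional network can scan for locally clustered residue patterns.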

Since the first coronavirus outbreak, different AI-based approaches have been used to predict potential epitopes so as to design vaccines (Park et al., 2011; Yang and Leibowitz, 2015; Ton et al., 2020). Fast and Chen used MARIA (Chen et al., 2019) and NetMHCPan4 (Jurtz et al., 2017), two supervised neural network-driven tools, to discover potential T-cell epitopes for SARS-CoV-2 close to the 2019-nCoV spike receptor-binding domain (RBD) (Fast and Chen, 2020). The Long Short-Term Memory (LSTM) network has also shown some promising results. Abbasi et al. used this type of RNN to predict epitopes for Spike (Abbasi, 2020). Using a similar tactic, Crossman et al. employed a deep-learning RNN and provided simulated sequences of Spike to identify possible targets for vaccine design (Crossman, 2020). The RNN provided sequences for a protein of interest with high sequence identity to the BLAST match.

Using a separate method, Feng et al. leveraged the iNeo tool to design a vaccine containing both B-cell and T-cell epitopes. This multi-peptide vaccine could provide a new strategy against SARS-CoV-2. Additionally, they discovered 17 vaccine peptides involving both immune cells (Nakagawa et al., 2016; Rangan et al., 2020). Ong et al. used Vaxign-RV to prioritize non-structural proteins as vaccine candidates for SARS-CoV-2 (Ong et al., 2020b). Nsp3, the largest non-structural protein of the coronavirus family, was identified as the most promising potential target for vaccine development after Spike (Ong et al., 2020b). Malone et al. also studied the entire SARS-CoV-2 proteome beyond Spike and provided a comprehensive vaccine design blueprint for SARS-CoV-2 using NEC Immune Profiler, IEDB, and BepiPred tools to create an epitope map for different HLA alleles (Malone et al., 2020).

Natural language processing models, specifically language modeling techniques, have also made an impact in the domain of COVID-19 vaccine discovery. Pre-trained transformers were used to predict protein interaction (Nambiar et al., 2020) and model molecular reactions in carbohydrate chemistry (Pesciullesi et al., 2020), which can be utilized in the process of vaccine development. Chen et al. discussed the use-case of an LSTM-based seq-2-seq model for predicting the secondary structure of certain SARS-COV-2 proteins (Karpov et al., 2019)3. Also, Beck et al. used transformers to repurpose commercially available drugs by predicting their interactions with viral proteins of SARS-COV-2 (Beck et al., 2020b).

Taking this work together, it is clear that spike protein has been the most popular candidate for virtual vaccine discovery (Oany et al., 2014). As the spike protein of SARS-COV-2 is crucial for viral entry, specific neutralizing antibodies against the receptor-binding domain of Spike can interrupt the attachment and fusion of viral proteins (Wan et al., 2019). This method could provide simulated sequences that can serve as a guide for further vaccine discovery against COVID-19 and possibly new zoonosis that may arise in the future.

Data Collection

Data-driven solutions rely on patterns embedded in the data in order to extract mathematical models. However, data collection for any recently emerged virus faces many challenges, primarily because the limited data available are biased and imbalanced. As a result, even sophisticated modeling approaches can be ineffective when trained on such datasets. To address this issue, we surveyed the existing literature, datasets, and online resources to compile potential small molecules, peptides, and epitopes. These elements can be beneficial in the process of discovering or designing novel drugs to treat COVID-19 when used with both conventional and data-driven AI-based approaches.
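To make the imbalance problem concrete, the sketch below computes inverse-frequency class weights, a common partial mitigation that the review itself does not prescribe. The label names and the 95/5 split are hypothetical, and weighting does not remove any underlying collection bias.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by N / (K * n_c): N samples, K classes, n_c in class c."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical screening labels: 95 inactive compounds, 5 active ones.
labels = ["inactive"] * 95 + ["active"] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With these made-up labels, each active compound counts roughly nineteen times as much as an inactive one in a weighted loss, so the rare class is not ignored during training.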

We chose to focus on both potential antiviral agents and host biotarget inhibitors. The provided data, entitled CoronaDB-AI (Table 1), include the small molecules and peptides proposed by both in-silico and in-vitro approaches. In addition to candidate scaffolds against the coronavirus's structural proteins, the potential inhibition of other respiratory tract viruses is taken into consideration to increase the therapeutic potential. Antimicrobial peptides have been validated as potent antivirals that disrupt either the viral membrane or an additional molecular mechanism of the virus (Akaji et al., 2011; Han and Král, 2020; Xia et al., 2020). As described before, the cytokine storm and an elevated host immune response play a vital role in disease complications, so candidate immunosuppressants were also added as host-targeted agents. In addition to the potency of a candidate drug, it is crucial that the drug have high selectivity and low toxicity. Therefore, we also gathered a toxicity dataset from distinct databases, including ToxCast and Tox21. Finally, we gathered a comprehensive epitope-based dataset that could also guide deep learning-based models for improved vaccine development and epitope generation.
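One common way to quantify the selectivity and toxicity criteria mentioned above is the selectivity index, the ratio of a compound's cytotoxic concentration (CC50) to its inhibitory concentration (IC50). The review does not name this metric, so the sketch below assumes it. The compound labels, the concentrations, and the threshold of 10 are hypothetical placeholders, not entries from CoronaDB-AI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical compound label
    ic50_um: float   # antiviral potency in micromolar (lower is more potent)
    cc50_um: float   # host-cell cytotoxicity in micromolar (higher is safer)


def selectivity_index(c: Candidate) -> float:
    """SI = CC50 / IC50; larger values mean a wider safety margin."""
    if c.ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return c.cc50_um / c.ic50_um


def filter_selective(candidates, min_si=10.0):
    """Keep candidates whose SI meets the assumed threshold, best first."""
    kept = [c for c in candidates if selectivity_index(c) >= min_si]
    return sorted(kept, key=selectivity_index, reverse=True)


# Illustrative values only, not measured data.
library = [
    Candidate("cmpd_A", ic50_um=0.5, cc50_um=100.0),  # SI = 200
    Candidate("cmpd_B", ic50_um=5.0, cc50_um=20.0),   # SI = 4
    Candidate("cmpd_C", ic50_um=2.0, cc50_um=50.0),   # SI = 25
]

for c in filter_selective(library):
    print(f"{c.name}: SI = {selectivity_index(c):.1f}")
```

In practice the potency and cytotoxicity values would come from separate assays, which is one reason a toxicity dataset is useful alongside the antiviral data.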

Table 1

| Data provided | Discovery | Type | Mechanism of action | References |
| --- | --- | --- | --- | --- |
| ANTIVIRAL DATA | | | | |
| Total of 59,107 | | Small molecules and peptides | | |
| 50,000 | In-silico | Small molecule | Antiviral | Table footnote 1 |
| 3,000 | In-silico | Small molecule | Anti SARS2 protein | Chenthamarakshan et al., 2020 |
| 1,000 | In-silico | Small molecule | Anti-protease | Ton et al., 2020 |
| 406 | In-vitro | Small molecule | Inhibiting autophagy | Table footnote 2 |
| 802 | In-vitro | Small molecule | Activating autophagy | Table footnote 2 |
| 393 | In-vitro | Small molecule | Biotargets of coronaviruses | Table footnote 3 |
| 110 | In-vitro | Peptide and small molecule | Coronavirus and respiratory disease | Pillaiyar et al., 2020 |
| 1,000 | In-silico | Small molecule | 3C protease inhibitor | Zhavoronkov et al., 2020 |
| 11 | In-silico | Small molecule | Main protease inhibitor | Fischer et al., 2020 |
| 20 | In-vitro | Antimicrobial peptide | Anti-SARS/MERS | Mustafa et al., 2018 |
| 7 | In-silico | Antimicrobial peptide | Anti-MERS | Mustafa et al., 2019 |
| 277 | In-vitro | Antimicrobial peptide | Antiviral | Wang et al., 2015 |
| 4 | In-silico | Antimicrobial peptide | Anti-spike of SARS-CoV-2 | Han and Král, 2020 |
| 379 | In-vitro | Small molecule | Anti-respiratory syncytial virus | Plant et al., 2015 |
| 13 | In-vitro | Small molecule | Anti-recurrent respiratory papillomatosis by HPV-6 | Alkhilaiwi et al., 2019 |
| 1,280 | In-vitro | Small molecule | Anti-respiratory syncytial virus | Rasmussen et al., 2011 |
| 16 | In-silico | Small molecules | Anti-SARS-CoV-2 | Zhou Y. et al., 2020 |
| 77 | In-silico | Small molecules | Anti-S protein of SARS-CoV-2 | Smith and Smith, 2020 |
| 10 | In-silico | Small molecules | Anti-SARS-CoV-2 | Hu et al., 2020 |
| 25 | In-silico | Small molecules | Anti SARS2 proteins | Kim J. et al., 2020 |
| 10 | In-silico | Small molecules | ACE2 and Spike inhibitors | Choudhary et al., 2020 |
| 78 | In-silico | Small molecules | All SARS2 proteins | Wu et al., 2020 |
| 47 | In-silico | Small molecules | 3CL protease and Mpro | Tang et al., 2020 |
| 16 | In-silico | Small molecules | 3CL protease inhibitor | Chen et al., 2020 |
| 36 | In-vitro | Small molecules | Anti-coronavirus OC43 | Shen et al., 2019 |
| 90 | In-vitro | Small molecules | Anti-SARS-CoV-2 | Touret et al., 2020 |
| ANTI-HOST PROTEINS | | | | |
| Total of 677 | | Small molecules and peptides | | |
| 6 | In-vitro | Small molecules | Anti-IL-1β and TNFα | Laufer et al., 2002 |
| 182 | In-vitro | Peptides | Cytokine signaling inhibitors | Table footnote 4 |
| 269 | In-silico | Small molecules | Anti-IL-6 | Shukla et al., 2019 |
| 121 | In-vitro | Small molecules | Severe acute respiratory | Table footnote 5 |
| 69 | In-silico | Small molecules | Anti-protein-protein interaction of virus-host | Gordon et al., 2020 |
| 30 | In-silico | Small molecules | Anti-host & virus interaction | Redka et al., 2020 |
| TOXICITY DATA | | | | |
| Total of 25,333 | | Small molecules | | |
| 11,800 | In-vitro | Small molecules | Tox21 and ToxCast | Toxicology, EPA's National Center for Computational, 2018 |
| 13,533 | In-vitro | Small molecules | Toxic for HepG2 cell line | Gamo et al., 2010 |
| VACCINE DATA | | | | |
| Total of 517 | | Epitopes and vaccines | | |
| 162 | In-silico | Epitopes | Anti-SARS-CoV-2 | Ahmed et al., 2020 |
| 174 | In-silico | Epitope | Anti-SARS-CoV-2 | Prachar et al., 2020 |
| 2 | In-silico | Epitope | Anti-SARS-CoV-2 | Fast and Chen, 2020 |
| 30 | In-silico | Vaccine candidate | Anti-SARS-CoV-2 | Feng et al., 2020 |
| 7 | In-silico | Epitope | Anti-SARS-CoV-2 | Lon et al., 2020 |
| 12 | In-silico | Epitope | Anti-SARS-CoV-2 | Tilocca et al., 2020 |
| 59 | In-silico | Epitope | Anti-SARS-CoV-2 | Sarkar et al., 2020 |
| 71 | In-silico | Epitope | Anti-SARS-CoV-2 | Bhattacharya et al., 2020 |

CoronaDB-AI is a collection of small molecules, peptides, and epitopes for the purpose of COVID-19 therapy discovery.
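As a quick consistency check on Table 1, the sketch below sums the per-source entry counts in each section and compares them with the section totals stated in the table. The dictionary keys and the function name are hypothetical; the numbers are transcribed from the table rows.

```python
# Row counts transcribed from Table 1, grouped by section.
TABLE_1_COUNTS = {
    "antiviral": [50000, 3000, 1000, 406, 802, 393, 110, 1000, 11, 20, 7, 277, 4,
                  379, 13, 1280, 16, 77, 10, 25, 10, 78, 47, 16, 36, 90],
    "anti_host": [6, 182, 269, 121, 69, 30],
    "toxicity": [11800, 13533],
    "vaccine": [162, 174, 2, 30, 7, 12, 59, 71],
}

# Section totals as stated in Table 1.
REPORTED_TOTALS = {"antiviral": 59107, "anti_host": 677, "toxicity": 25333, "vaccine": 517}


def section_totals(counts):
    """Sum the per-source entry counts for each section."""
    return {section: sum(rows) for section, rows in counts.items()}


totals = section_totals(TABLE_1_COUNTS)
for section, total in totals.items():
    status = "matches" if total == REPORTED_TOTALS[section] else "differs from"
    print(f"{section}: {total:,} ({status} the reported total)")
```

All four computed sums agree with the totals reported in the table.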

1. Download CAS COVID-19 Antiviral Candidate Compounds Dataset | CAS. Available online at: https://www.cas.org/covid-19-antiviral-compounds-dataset (accessed April 27, 2020).

2. Novel Coronavirus Information Center. Available online at: https://www.elsevier.com/connect/coronavirus-information-center (accessed April 27, 2020).

3. https://www.elsevier.com/__data/assets/pdf_file/0004/978745/Copy-of-RMC-substances-coronovirus-targets-pX6.pdf (accessed April 27, 2020).

4. Cytokines Inhibitor library | Targetmol | 96-well. Available online at: https://www.targetmol.com/compound-library/Cytokines-inhibitors-Library (accessed April 27, 2020).

5. https://www.elsevier.com/__data/assets/pdf_file/0007/977173/ResNet-Data_Coronavirus.pdf (accessed April 27, 2020).

Discussion

SARS-CoV-2 rapidly became a global challenge, costing thousands of lives, overwhelming healthcare systems, and threatening economies around the world. As discussed above, it can be extremely challenging to experimentally evaluate the potency of all drug and vaccine candidates in a timely fashion. We believe that computational models capable of filtering and generating reliable therapeutic candidates can significantly speed up these discovery efforts. Artificial neural networks and supervised learning methods have proven valuable for virtual filtering and de novo design. However, achieving the desired performance with such methods requires both knowledge of the most relevant biotargets and a large-scale training dataset. This motivated us to survey the biotargets that have been employed in the virtual drug and vaccine discovery literature. We observed that the viral spike protein and the main protease have been the most prevalent choices for vaccine development and drug discovery, respectively, owing to their importance. Furthermore, we gathered a collection of datasets, titled “CoronaDB-AI,” that can be used for this application. Access to these key elements reduces the burden of collecting training data and acquiring the required domain knowledge for both computer scientists and bioinformaticians, and should consequently benefit research outcomes.

Statements

Author contributions

AK organized and wrote most of the article and gathered all the data. JW contributed to the molecular part. MS contributed to the background for AI-based methods. EC, ED-C, and BK from A2A and SC-T from Atomwise contributed to the COVID-19 drug discovery section. NG and JC contributed to the vaccine discovery section. HG contributed to the RNA-based and molecular sections. JY provided guidance on the opportunities of deep learning in a multidisciplinary collaboration. All authors contributed to the article and approved the submitted version.

Acknowledgments

We thank Farnam Kavehei for designing the figure. Also, we thank Melana Francisco for her contribution to the introduction of the article.

Conflict of interest

EC, ED-C, and BK were employed by the company A2A Pharmaceuticals. SC-T was employed by the company Atomwise Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Footnotes

1. AI study launched to monitor cardiac safety of COVID-19 patients receiving hydroxychloroquine. Available online at: https://cardiacrhythmnews.com/ai-study-launched-to-monitor-cardiac-safety-of-covid-19-patients-receiving-hydroxychloroquine/ (accessed July 04, 2020).

2. Atomwise Partners with Global Research Teams to Pursue Broad-Spectrum Treatments Against COVID-19 and Future Coronavirus Outbreaks | Business Wire. Available online at: https://www.businesswire.com/news/home/20200521005238/en/Atomwise-Partners-Global-Research-Teams-Pursue-Broad-Spectrum (accessed June 28, 2020).

3. OSF Preprints. ZeroFold-Understanding Mutations of SARS-CoV-2 Spike Protein base on Secondary Structure Event Extracting for guiding Vaccine development. Available online at: https://osf.io/3vkuw/ (accessed July 01, 2020).

References

  • 1

    Abbasi, B. A. (2020). Identification of vaccine targets and design of vaccine against SARS. OSF Preprints. doi: 10.31219/osf.io/f8zyw

  • 2

    Acharya, C., Coop, A., Polli, J. E., and MacKerell, A. D. (2010). Recent advances in ligand-based drug design: relevance and utility of the conformationally sampled pharmacophore approach. Curr. Comput. Aided-Drug Des. 7, 10–22. doi: 10.2174/157340911793743547

  • 3

    Ahmed, S. F., Quadeer, A. A., and McKay, M. R. (2020). Preliminary identification of potential vaccine targets for the COVID-19 coronavirus (SARS-CoV-2) based on SARS-CoV immunological studies. Viruses 12:254. doi: 10.3390/v12030254

  • 4

    Akaji, K., Konno, H., Mitsui, H., Teruya, K., Shimamoto, Y., Hattori, Y., et al. (2011). Structure-based design, synthesis, and evaluation of peptide-mimetic SARS 3CL protease inhibitors. J. Med. Chem. 54, 7962–7973. doi: 10.1021/jm200870n

  • 5

    Alaghband, M., Yousefi, N., and Garibay, I. (2020). FePh: an annotated facial expression dataset for the RWTH-PHOENIX-weather 2014 dataset. arXiv [Preprint]. arXiv:2003.08759v1. Available online at: https://arxiv.org/pdf/2003.08759.pdf

  • 6

    Alipanahi, B., Delong, A., Weirauch, M. T., and Frey, B. J. (2015). Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838. doi: 10.1038/nbt.3300

  • 7

    Alkhilaiwi, F., Paul, S., Zhou, D., Zhang, X., Wang, F., Palechor-Ceron, N., et al. (2019). High-throughput screening identifies candidate drugs for the treatment of recurrent respiratory papillomatosis. Papillomavirus Res. 8:100181. doi: 10.1016/j.pvr.2019.100181

  • 8

    Amanat, F., and Krammer, F. (2020). SARS-CoV-2 vaccines: status report. Immunity 52, 583–589. doi: 10.1016/j.immuni.2020.03.007

  • 9

    Andersen, K. G., Rambaut, A., Lipkin, W. I., Holmes, E. C., and Garry, R. F. (2020). The proximal origin of SARS-CoV-2. Nat. Med. 26, 450–452. doi: 10.1038/s41591-020-0820-9

  • 10

    Arshadi, A. K., Salem, M., Collins, J., Yuan, J. S., and Chakrabarti, D. (2020). Deepmalaria: artificial intelligence driven discovery of potent antiplasmodials. Front. Pharmacol. 10:1526. doi: 10.3389/fphar.2019.01526

  • 11

    Bazgir, O., Zhang, R., Rahman Dhruba, S., Rahman, R., Ghosh, S., and Pal, R. (2019). REFINED (REpresentation of Features as Images With NEighborhood Dependencies): a novel feature representation for convolutional neural networks. arXiv [Preprint]. arXiv:1912.05687.

  • 12

    Beck, B. R., Shin, B., Choi, Y., Park, S., and Kang, K. (2020a). Predicting commercially available antiviral drugs that may act on the novel coronavirus (2019-nCoV), Wuhan, China through a drug-target interaction deep learning model. bioRxiv [Preprint]. doi: 10.1101/2020.01.31.929547

  • 13

    Beck, B. R., Shin, B., Choi, Y., Park, S., and Kang, K. (2020b). Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model. Comput. Struct. Biotechnol. J. 18, 784–790. doi: 10.1016/j.csbj.2020.03.025

  • 14

    Belinkov, Y., and Glass, J. (2018). Analysis methods in neural language processing: a survey. Trans. Assoc. Comput. Linguist. 7, 49–72. doi: 10.1162/tacl_a_00254

  • 15

    Bhattacharya, M., Sharma, A. R., Patra, P., Ghosh, P., Sharma, G., Patra, B. C., et al. (2020). Development of epitope-based peptide vaccine against novel coronavirus 2019 (SARS-COV-2): immunoinformatics approach. J. Med. Virol. 92, 618–631. doi: 10.1002/jmv.25736

  • 16

    Bjerrum, E. J., and Sattarov, B. (2018). Improving chemical autoencoder latent space and molecular de novo generation diversity with heteroencoders. Biomolecules 8:131. doi: 10.3390/biom8040131

  • 17

    Bowick, G. C., and Barrett, A. D. T. (2010). Comparative pathogenesis and systems biology for biodefense virus vaccine development. J. Biomed. Biotechnol. 2010:236528. doi: 10.1155/2010/236528

  • 18

    Bowman, B. N., McAdam, P. R., Vivona, S., Zhang, J. X., Luong, T., Belew, R. K., et al. (2011). Improving reverse vaccinology with a machine learning approach. Vaccine 29, 8156–8164. doi: 10.1016/j.vaccine.2011.07.142

  • 19

    Broom, A., Rakotoharisoa, R. V., Thompson, M. C., Zarifi, N., Nguyen, E., Mukhametzhanov, N., et al. (2020). Evolution of an enzyme conformational ensemble guides design of an efficient biocatalyst. bioRxiv [Preprint]. doi: 10.1101/2020.03.19.999235

  • 20

    Bruno, L., Cortese, M., Rappuoli, R., and Merola, M. (2015). Lessons from Reverse Vaccinology for viral vaccine design. Curr. Opin. Virol. 11, 89–97. doi: 10.1016/j.coviro.2015.03.001

  • 21

    Bullock, J., Alexandra, L., Pham, K. H., Lam, C. S. N., and Luengo-Oroz, M. (2020). Mapping the landscape of artificial intelligence applications against COVID-19. arXiv [Preprint]. arXiv:2003.11336.

  • 22

    Bung, N., Krishnan, S. R., Bulusu, G., and Roy, A. (2020). De Novo design of new chemical entities (NCEs) for SARS-CoV-2 using artificial intelligence. ChemRxiv [Preprint]. doi: 10.26434/chemrxiv.11998347.v2

  • 23

    Chakravarti, S. K., and Alla, S. R. M. (2019). Descriptor free QSAR modeling using deep learning with long short-term memory neural networks. Front. Artif. Intell. 2:17. doi: 10.3389/frai.2019.00017

  • 24

    Chen, B., Khodadoust, M. S., Olsson, N., Wagar, L. E., Fast, E., Liu, C. L., et al. (2019). Predicting HLA class II antigen presentation through integrated deep learning. Nat. Biotechnol. 37, 1332–1343. doi: 10.1038/s41587-019-0280-2

  • 25

    Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., and Blaschke, T. (2018). The rise of deep learning in drug discovery. Drug Discov. Today 23, 1241–1250. doi: 10.1016/j.drudis.2018.01.039

  • 26

    Chen, Y. W., Yiu, C.-P. B., and Wong, K.-Y. (2020). Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CLpro) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates. F1000Research 9:129. doi: 10.12688/f1000research.22457.2

  • 27

    Chenthamarakshan, V., Das, P., Padhi, I., Strobelt, H., Lim, K. W., Hoover, B., et al. (2020). Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models. Available online at: http://arxiv.org/abs/2004.01215 (accessed April 19, 2020).

  • 28

    Choromanski, K., Likhosherstov, V., Dohan, D., Song, X., Davis, J., Sarlos, T., et al. (2020). Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers. Available online at: http://arxiv.org/abs/2006.03555 (accessed July 01, 2020).

  • 29

    Choudhary, S., Malik, Y. S., and Tomar, S. (2020). Identification of SARS-CoV-2 cell entry inhibitors by drug repurposing using in silico structure-based virtual screening approach. ChemRxiv [Preprint]. doi: 10.3389/fimmu.2020.01664

  • 30

    Coley, C. W., Jin, W., Rogers, L., Jamison, T. F., Jaakkola, T. S., Green, W. H., et al. (2019). A graph-convolutional neural network model for the prediction of chemical reactivity. Chem. Sci. 10, 370–377. doi: 10.1039/C8SC04228D

  • 31

    Crossman, L. C. (2020). Leveraging deep learning to simulate coronavirus spike proteins has the potential to predict future zoonotic sequences. bioRxiv [Preprint]. doi: 10.1101/2020.04.20.046920

  • 32

    Dangi, M., Kumari, R., Singh, B., and Chhillar, A. K. (2018). Advanced in silico tools for designing of antigenic epitope as potential vaccine candidates against coronavirus. Bioinforma. Seq. Struct. Phylogeny 329–357. doi: 10.1007/978-981-13-1562-6_15

  • 33

    De Cao, N., and Kipf, T. (2018). MolGAN: An implicit generative model for small molecular graphs. Available online at: http://arxiv.org/abs/1805.11973 (accessed April 26, 2020).

  • 34

    Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: pre-training of deep bidirectional transformers for language understanding. arXiv [Preprint]. arXiv:1810.04805.

  • 35

    Doytchinova, I. A., and Flower, D. R. (2007). VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines. BMC Bioinformatics 8:4. doi: 10.1186/1471-2105-8-4

  • 36

    Duan, Y., Edwards, J. S., and Dwivedi, Y. K. (2019). Artificial intelligence for decision making in the era of Big Data – evolution, challenges and research agenda. Int. J. Inf. Manage. 48, 63–71. doi: 10.1016/j.ijinfomgt.2019.01.021

  • 37

    Duvenaud, D., Maclaurin, D., Aguilera-Iparraguirre, J., Gómez-Bombarelli, R., Hirzel, T., Aspuru-Guzik, A., et al. (2015). Convolutional networks on graphs for learning molecular fingerprints. arXiv [Preprint]. arXiv:1509.09292.

  • 38

    Ewing, T. J. A., Makino, S., Skillman, A. G., and Kuntz, I. D. (2001). DOCK 4.0: Search strategies for automated molecular docking of flexible molecule databases. J. Comput. Aided. Mol. Des. 15, 411–428. doi: 10.1023/A:1011115820450

  • 39

    Fast, E., and Chen, B. (2020). Potential T-cell and B-cell Epitopes of 2019-nCoV. bioRxiv [Preprint]. doi: 10.1101/2020.02.19.955484

  • 40

    Fehr, A. R., and Perlman, S. (2015). “Coronaviruses: an overview of their replication and pathogenesis,” in Coronaviruses: Methods and Protocols, Vol. 1282 (New York, NY: Springer), 1–23. doi: 10.1007/978-1-4939-2438-7_1

  • 41

    Feng, Y., Qiu, M., Zou, S., Li, Y., Luo, K., Chen, R., et al. (2020). Multi-epitope vaccine design using an immunoinformatics approach for 2019 novel coronavirus in China (SARS-CoV-2). bioRxiv [Preprint]. doi: 10.1101/2020.03.03.962332

  • 42

    Fischer, A., Sellner, M., Neranjan, S., Lill, M. A., and Smieško, M. (2020). Inhibitors for novel coronavirus protease identified by virtual screening of 687 million compounds. ChemRxiv [Preprint]. doi: 10.26434/chemrxiv.11923239.v1

  • 43

    Flower, D. R., MacDonald, I. K., Ramakrishnan, K., Davies, M. N., and Doytchinova, I. A. (2010). Computer aided selection of candidate vaccine antigens. Immunome Res. 6(Suppl. 2), 1–16. doi: 10.1186/1745-7580-6-S2-S1

  • 44

    Fooshee, D., Mood, A., Gutman, E., Tavakoli, M., Urban, G., Liu, F., et al. (2018). Deep learning for chemical reaction prediction. Mol. Syst. Des. Eng. 3, 442–452. doi: 10.1039/C7ME00107J

  • 45

    Fout, A., Byrd, J., Shariat, B., and Ben-Hur, A. (2017). “Protein interface prediction using graph convolutional networks,” in Advances in Neural Information Processing Systems (Long Beach, CA), 6530–6539.

  • 46

    Gamo, F.-J., Sanz, L. M., Vidal, J., de Cozar, C., Alvarez, E., Lavandera, J.-L., et al. (2010). Thousands of chemical starting points for antimalarial lead identification. Nature 465, 305–310. doi: 10.1038/nature09107

  • 47

    Gordon, D. E., Jang, G. M., Bouhaddou, M., Xu, J., Obernier, K., O'Meara, M. J., et al. (2020). A SARS-CoV-2-human protein-protein interaction map reveals drug targets and potential drug-repurposing. bioRxiv [Preprint]. doi: 10.1101/2020.03.22.002386

  • 48

    Grover, D., and Toghi, B. (2020). MNIST dataset classification utilizing k-NN classifier with modified sliding-window metric. Adv. Intel. Syst. Comp. 944, 583–591. doi: 10.1007/978-3-030-17798-0_47

  • 49

    Guimaraes, G. L., Sanchez-Lengeling, B., Outeiral, C., Farias, L. C., and Aspuru-Guzik, A. (2017). Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models. Available online at: http://arxiv.org/abs/1705.10843 (accessed April 26, 2020).

  • 50

    Gupta, E., Mishra, R. K., and Niraj, R. R. K. (2020). Identification of potential vaccine candidates against SARS-CoV-2, a step forward to fight novel coronavirus 2019-nCoV: a reverse vaccinology approach. bioRxiv [Preprint]. doi: 10.1101/2020.04.13.039198

  • 51

    Hamming, I., Timens, W., Bulthuis, M. L. C., Lely, A. T., Navis, G. J., and van Goor, H. (2004). Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus. A first step in understanding SARS pathogenesis. J. Pathol. 203, 631–637. doi: 10.1002/path.1570

  • 52

    Han, Y., and Kim, D. (2017). Deep convolutional neural networks for pan-specific peptide-MHC class I binding prediction. BMC Bioinformat. 18:585. doi: 10.1186/s12859-017-1997-x

  • 53

    Han, Y., and Král, P. (2020). Computational design of ACE2-based peptide inhibitors of SARS-CoV-2. ACS Nano 14, 5143–5147. doi: 10.1021/acsnano.0c02857

  • 54

    He, L., and Zhu, J. (2015). Computational tools for epitope vaccine design and evaluation. Curr. Opin. Virol. 11, 103–112. doi: 10.1016/j.coviro.2015.03.013

  • 55

    He, Y., Rappuoli, R., De Groot, A. S., and Chen, R. T. (2010b). Emerging vaccine informatics. J. Biomed. Biotechnol. 2010:218590. doi: 10.1155/2010/218590

  • 56

    He, Y., Xiang, Z., and Mobley, H. L. T. (2010a). Vaxign: the first web-based vaccine design program for reverse vaccinology and applications for vaccine development. J. Biomed. Biotechnol. 2010:297505. doi: 10.1155/2010/297505

  • 57

    Heinson, A. I., Gunawardana, Y., Moesker, B., Denman Hume, C. C., Vataga, E., Hall, Y., et al. (2017). Enhancing the biological relevance of machine learning classifiers for reverse vaccinology. Int. J. Mol. Sci. 18:312. doi: 10.3390/ijms18020312

  • 58

    Heinson, A. I., Woelk, C. H., and Newell, M. L. (2015). The promise of reverse vaccinology. Int. Health 7, 85–89. doi: 10.1093/inthealth/ihv002

  • 59

    Heskett, C., Faircloth, B., Roper, S., and Clay, M. (2018). Executive Insights Artificial Intelligence in Life Sciences: The Formula for Pharma Success Across the Drug Lifecycle. Available online at: https://www.lek.com/sites/default/files/insights/pdf-attachments/2060-AI-in-Life-Sciences.pdf (accessed June 18, 2019).

  • 60

    Hilgenfeld, R., and Peiris, M. (2013). From SARS to MERS: 10 years of research on highly pathogenic human coronaviruses. Antivir. Res. 100, 286–295. doi: 10.1016/j.antiviral.2013.08.015

  • 61

    Hoffmann, M., Kleine-Weber, H., Schroeder, S., Kruger, N., Herrler, T., Erichsen, S., et al. (2020). SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell 181, 271–280.e8. doi: 10.1016/j.cell.2020.02.052

  • 62

    Hu, F., Jiang, J., and Yin, P. (2020). Prediction of Potential Commercially Inhibitors Against SARS-CoV-2 by Multi-Task Deep Model. Available online at: https://arxiv.org/ftp/arxiv/papers/2003/2003.00728.pdf (accessed April 22, 2020).

  • 63

    Huang, C., Wang, Y., Li, X., Ren, L., Zhao, J., Hu, Y., et al. (2020). Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395, 497–506. doi: 10.1016/S0140-6736(20)30183-5

  • 64

    Jaafar, Z. A., and Kieft, J. S. (2019). Viral RNA structure-based strategies to manipulate translation. Nat. Rev. Microbiol. 17, 110–123. doi: 10.1038/s41579-018-0117-x

  • 65

    Jabeer Khan, R., Kumar Jha, R., Muluneh Amera, G., Jain, M., Singh, E., Pathak, A., et al. (2020). Targeting novel coronavirus 2019: a systematic drug repurposing approach to identify promising inhibitors against 3C-like proteinase and 2'-O-ribose methyltransferase. ChemRxiv [Preprint]. doi: 10.26434/chemrxiv.11888730.v1

  • 66

    Jin, W., Barzilay, R., and Jaakkola, T. (2018). Junction tree variational autoencoder for molecular graph generation. arXiv [Preprint]. arXiv:1802.04364.

  • 67

    Jin, Z., Du, X., Xu, Y., Deng, Y., Liu, M., Zhao, Y., et al. (2020). Structure of Mpro from COVID-19 virus and discovery of its inhibitors. bioRxiv [Preprint]. doi: 10.1101/2020.02.26.964882

  • 68

    Jurtz, V., Paul, S., Andreatta, M., Marcatili, P., Peters, B., and Nielsen, M. (2017). NetMHCpan-4.0: improved peptide–MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368. doi: 10.4049/jimmunol.1700893

  • 69

    Kadioglu, O., Saeed, M., Greten, H. J., and Efferth, Y. (2020). Identification of novel compounds against three targets of SARS CoV2 coronavirus by combined virtual screening and supervised machine learning. Bull World Heal. Organ. doi: 10.2471/BLT.20.255943

  • 70

    Kandeel, M., and Al-Nazawi, M. (2020). Virtual screening and repurposing of FDA approved drugs against COVID-19 main protease. Life Sci. 251:117627. doi: 10.1016/j.lfs.2020.117627

  • 71

    Karpov, P., Godin, G., and Tetko, I. V. (2019). “A transformer model for retrosynthesis,” in Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science, Vol. 11731, eds I. Tetko, V. Kurková, P. Karpov, and F. Theis (Cham: Springer). doi: 10.1007/978-3-030-30493-5_78

  • 72

    Kearnes, S., McCloskey, K., Berndl, M., Pande, V., and Riley, P. (2016). Molecular graph convolutions: moving beyond fingerprints. J. Comput. Aided. Mol. Des. 30, 595–608. doi: 10.1007/s10822-016-9938-8

  • 73

    Kim, D., Lee, J.-Y., Yang, J.-S., Kim, J. W., Kim, V. N., and Chang, H. (2020). The architecture of SARS-CoV-2 transcriptome. Cell 181, 914–921. doi: 10.1016/j.cell.2020.04.011

  • 74

    Kim, J., Zhang, J., Cha, Y., Kolitz, S., Funt, J., Escalante Chong, R., et al. (2020). Advanced bioinformatics rapidly identifies existing therapeutics for patients with coronavirus disease–2019 (COVID-19). ChemRxiv [Preprint]. doi: 10.26434/chemrxiv.12037416

  • 75

    Kong, R., Yang, G., Xue, R., Liu, M., Wang, F., Hu, J., et al. (2020). COVID-19 Docking Server: An Interactive Server for Docking Small Molecules, Peptides and Antibodies Against Potential Targets of COVID-19. Available online at: https://arxiv.org/abs/2003.00163 (accessed April 29, 2020). doi: 10.1093/bioinformatics/btaa645

  • 76

    Kong, W.-H., Li, Y., Peng, M.-W., Kong, D.-G., Yang, X.-B., Wang, L., et al. (2020). SARS-CoV-2 detection in patients with influenza-like illness. Nat. Microbiol. 5, 675–678. doi: 10.1038/s41564-020-0713-1

  • 77

    Lai, M. M., and Cavanagh, D. (1997). The molecular biology of coronaviruses. Adv. Virus Res. 48, 1–100. doi: 10.1016/S0065-3527(08)60286-9

  • 78

    Laufer, S., Greim, C., and Bertsche, T. (2002). An in-vitro screening assay for the detection of inhibitors of proinflammatory cytokine synthesis: A useful tool for the development of new antiarthritic and disease modifying drugs. Osteoarthr. Cartil. 10, 961–967. doi: 10.1053/joca.2002.0851

  • 79

    Lavecchia, A. (2019). Deep learning in drug discovery: opportunities, challenges and future prospects. Drug Discov. Today 24, 2017–2032. doi: 10.1016/j.drudis.2019.07.006

  • 80

    Lecun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521, 436–444. doi: 10.1038/nature14539

  • 81

    Li, J., Shao, J., Wang, C., and Li, W. (2020). The epidemiology and therapeutic options for the COVID-19. Precis. Clin. Med. 3, 71–84. doi: 10.1093/pcmedi/pbaa017

  • 82

    Li, X., Yu, J., Zhang, Z., Ren, J., Peluffo, A. E., Zhang, W., et al. (2020). Network bioinformatics analysis provides insight into drug repurposing for COVID-2019. Preprints 1–15. doi: 10.20944/preprints202003.0286.v1

  • 83

    Lionta, E., Spyrou, G., Vassilatis, D., and Cournia, Z. (2014). Structure-based virtual screening for drug discovery: principles, applications and recent advances. Curr. Top. Med. Chem. 14, 1923–1938. doi: 10.2174/1568026614666140929124445

  • 84

    Liu, K., Sun, X., Jia, L., Ma, J., Xing, H., Wu, J., et al. (2019). Chemi-net: A molecular graph convolutional network for accurate drug property prediction. Int. J. Mol. Sci. 20:3389. doi: 10.3390/ijms20143389

  • 85

    Liu, Q., Allamanis, M., Brockschmidt, M., and Gaunt, A. L. (2018). “Constrained graph variational autoencoders for molecule design,” in Advances in Neural Information Processing Systems (Montreal, QC), 7795–7804.

  • 86

    Liu, X. (2017). Deep Recurrent Neural Network for Protein Function Prediction from Sequence. Available online at: https://arxiv.org/abs/1701.08318 (accessed April 26, 2020). doi: 10.1101/103994

  • 87

    Lon, J. R., Bai, Y., Zhong, B., Cai, F., and Du, H. (2020). Prediction and evolution of B cell epitopes of surface protein in SARS-CoV-2. bioRxiv [Preprint]. doi: 10.1101/2020.04.03.022723

  • 88

    Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E., and Svetnik, V. (2015). Deep neural nets as a method for quantitative structure-activity relationships. J. Chem. Inf. Model. 55, 263–274. doi: 10.1021/ci500747n

  • 89

    Magar, R., Yadav, P., and Farimani, A. B. (2020). Potential Neutralizing Antibodies Discovered for Novel Corona Virus Using Machine Learning. Available online at: http://arxiv.org/abs/2003.08447 (accessed April 30, 2020). doi: 10.1101/2020.03.14.992156

  • 90

    Malone, B., Simovski, B., Moliné, C., Cheng, J., Fontenelle, H., Vardaxis, I., et al. (2020). Artificial intelligence predicts the immunogenic landscape of SARS-CoV-2: toward universal blueprints for vaccine designs. bioRxiv [Preprint]. doi: 10.1101/2020.04.21.052084

  • 91

    Messina, F., Giombini, E., Agrati, C., Vairo, F., Ascoli Bartoli, T., Al Moghazi, S., et al. (2020). COVID-19: viral-host interactome analyzed by network based-approach model to study pathogenesis of SARS-CoV-2 infection. J. Transl. Med. 18:233. doi: 10.1186/s12967-020-02405-w

  • 92

    Miyake, J., Kaneshita, Y., Asatani, S., Tagawa, S., Niioka, H., and Hirano, T. (2018). Graphical classification of DNA sequences of HLA alleles by deep learning. Hum. Cell 31, 102–105. doi: 10.1007/s13577-017-0194-6

  • 93

    Moore, B. J. B., and June, C. H. (2020). Cytokine release syndrome in severe COVID-19. Science 368, 473–474. doi: 10.1126/science.abb8925

  • 94

    Mustafa, S., Balkhy, H., and Gabere, M. (2019). Peptide-Protein Interaction Studies of Antimicrobial Peptides Targeting Middle East Respiratory Syndrome Coronavirus Spike Protein: An In Silico Approach. London: Hindawi. doi: 10.1155/2019/6815105

  • 95

    Mustafa, S., Balkhy, H., and Gabere, M. N. (2018). Current treatment options and the role of peptides as potential therapeutic components for Middle East Respiratory Syndrome (MERS): a review. J. Infect. Public Health 11, 9–17. doi: 10.1016/j.jiph.2017.08.009

  • 96

    Nakagawa, K., Lokugamage, K. G., and Makino, S. (2016). “Viral and cellular mRNA translation in coronavirus-infected cells,” in Advances in Virus Research, Vol. 96 (Cambridge, MA: Academic Press Inc.), 165–192. doi: 10.1016/bs.aivir.2016.08.001

  • 97

    Nambiar, A., Heflin, M. E., Liu, S., Maslov, S., Hopkins, M., and Ritz, A. (2020). Transforming the Language of Life: Transformer Neural Networks for Protein Prediction Tasks. bioRxiv [Preprint]. doi: 10.1101/2020.06.15.153643

  • 98

    Naz, K., Naz, A., Ashraf, S. T., Rizwan, M., Ahmad, J., Baumbach, J., et al. (2019). PanRV: pangenome-reverse vaccinology approach for identifications of potential vaccine candidates in microbial pangenome. BMC Bioinformatics 20, 1–10. doi: 10.1186/s12859-019-2713-9

  • 99

    Oany, A. R., Al Emran, A., and Jyoti, T. (2014). Design of an epitope-based peptide vaccine against spike protein of human coronavirus: an in silico approach. Drug Des. Devel. Ther. 8, 1139–1149. doi: 10.2147/DDDT.S67861

  • 100

    Ong, E., Wang, H., Wong, M. U., Seetharaman, M., Valdez, N., and He, Y. (2020a). Vaxign-ML: supervised machine learning reverse vaccinology model for improved prediction of bacterial protective antigens. Bioinformatics 36, 1–7. doi: 10.1093/bioinformatics/btaa119

  • 101

    Ong, E., Wong, M. U., Huffman, A., and He, Y. (2020b). COVID-19 coronavirus vaccine design using reverse vaccinology and machine learning. bioRxiv [Preprint]. doi: 10.1101/2020.03.20.000141

  • 102

    Pagadala, N. S., Syed, K., and Tuszynski, J. (2017). Software for molecular docking: a review. Biophys. Rev. 9, 91–102. doi: 10.1007/s12551-016-0247-1

  • 103

    Park, S. J., Kim, Y. G., and Park, H. J. (2011). Identification of RNA pseudoknot-binding ligand that inhibits the -1 ribosomal frameshifting of SARS-coronavirus by structure-based virtual screening. J. Am. Chem. Soc. 133, 10094–10100. doi: 10.1021/ja1098325

  • 104

    Pazhouhandeh, M., Sahraian, M.-A., Siadat, S. D., Fateh, A., Vaziri, F., Tabrizi, F., et al. (2018). A systems medicine approach reveals disordered immune system and lipid metabolism in multiple sclerosis patients. Clin. Exp. Immunol. 192, 18–32. doi: 10.1111/cei.13087

  • 105

    Pesciullesi, G., Schwaller, P., Laino, T., and Reymond, J.-L. (2020). Carbohydrate transformer: predicting regio- and stereoselective reactions using transfer learning. ChemRxiv [Preprint]. doi: 10.26434/chemrxiv.11935635

  • 106

    Pillaiyar, T., Meenakshisundaram, S., and Manickam, M. (2020). Recent discovery and development of inhibitors targeting coronaviruses. Drug Discov. Today 25, 668–688. doi: 10.1016/j.drudis.2020.01.015

  • 107

    Plant, E. P., and Dinman, J. D. (2008). The role of programmed-1 ribosomal frameshifting in coronavirus propagation. Front. Biosci. 13, 4873–4881. doi: 10.2741/3046

  • 108

    Plant, H., Stacey, C., Tiong-Yip, C. L., Walsh, J., Yu, Q., and Rich, K. (2015). High-throughput hit screening cascade to identify respiratory syncytial virus (RSV) inhibitors. J. Biomol. Screen. 20, 597–605. doi: 10.1177/1087057115569428

  • 109

    Pollastri, G., Przybylski, D., Rost, B., and Baldi, P. (2002). Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins Struct. Funct. Genet. 47, 228–235. doi: 10.1002/prot.10082

  • 110

    Prachar, M., Justesen, S., Steen-Jensen, D. B., Winther, O., and Bagger, F. O. (2020). COVID-19 vaccine candidates: prediction and validation of 174 SARS-CoV-2 epitopes. bioRxiv [Preprint]. doi: 10.1101/2020.03.20.000794

  • 111

    PrompetcharaE.KetloyC.PalagaT. (2020). Immune responses in COVID-19 and potential vaccines: Lessons learned from SARS and MERS epidemic. Asian Pacific J. Allergy Immunol.38, 19. 10.12932/AP-200220-0772

  • 112

    RahmanM. S.RahmanM. K.SahaS.KaykobadM.RahmanM. S. (2019). Antigenic: an improved prediction model of protective antigens. Artif. Intell. Med.94, 2841. 10.1016/j.artmed.2018.12.010

  • 113

    RanganR.ZheludevI. N.DasR. (2020). RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses. bioRxiv [Preprint]. 10.1101/2020.03.27.012906

  • 114

    RappuoliR. (2000). Reverse vaccinology rino rappuoli. Curr. Opin. Microbiol.3, 445450. 10.1016/S1369-5274(00)00119-3

  • 115

    RasmussenL.MaddoxC.MooreB. P.SeversonW.WhiteE. L. (2011). A high-throughput screening strategy to overcome virus instability. Assay Drug Dev Technol.9, 184190. 10.1089/adt.2010.0298

  • 116

    RedkaD. S.MacKinnonS. S.LandonM.WindemuthA.KurjiN.ShahaniV. (2020). PolypharmDB, a Deep Learning-Based Resource, Quickly Identifies Repurposed Drug Candidates for COVID-19. ChemRxiv [Preprint]10.26434/chemrxiv.12071271.v1

  • 117

    RichardsonP.GriffinI.TuckerC.SmithD.OechsleO.PhelanA.et al. (2020). Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. Lancet395, e30e31, 15. 10.1016/S0140-6736(20)30304-4

  • 118

    SalemM.KhormaliA.ArshadiA. K.WebbJ.YuanS.-J. (2020). Transcreen: transfer learning on graph-based anti-cancer virtual screening model. Big Data Cogn. Comput.4:16. 10.3390/bdcc4030016

  • 119

    SarkarB.UllahM. A.JohoraF. T.TaniyaM. A.ArafY. (2020). The essential facts of wuhan novel coronavirus outbreak in china and epitope-based vaccine designing against COVID-19. bioRxiv [Preprint]. 10.1101/2020.02.05.935072

  • 120

    SchellerJ.GarbersC.Rose-JohnS. (2014). Interleukin-6: From basic biology to selective blockade of pro-inflammatory activities. Sem. Immunol.26, 212. 10.1016/j.smim.2013.11.002

  • 121

    SeniorA. W.EvansR.JumperJ.KirkpatrickJ.SifreL.GreenT.et al. (2020). Improved protein structure prediction using potentials from deep learning. Nature577, 706710. 10.1038/s41586-019-1923-7

  • 122

    ShenL.NiuJ.WangC.HuangB.WangW.ZhuN.et al. (2019). High-throughput screening and identification of potent broad-spectrum inhibitors of coronaviruses. J. Virol.10.1128/JVI.00023-19

  • 123

    SherG.ZhiD.ZhangS. (2017). DRREP: deep ridge regressed epitope predictor. BMC Genomics18:676. 10.1186/s12864-017-4024-8

  • 124

    ShinB.ParkS.KangK.HoJ. C. (2019). Self-attention based molecule representation for predicting drug-target interaction. arXiv [Preprint] arXiv:1908.06760.

  • 125

    ShoichetB. K. (2004). Virtual screening of chemical libraries. Nature432, 862865. 10.1038/nature03197

  • 126

    ShuklaP.KhandelwalR.SharmaD.DharA.NayarisseriA.SinghS. K. (2019). Virtual screening of IL-6 inhibitors for idiopathic arthritis. Bioinformation15, 121130. 10.6026/97320630015121

  • 127

    SimonovskyM.KomodakisN. (2018). “GraphVAE: towards generation of small graphs using variational autoencoders,” in International Conference on Artificial Neural Networks (Cham: Springer), 412422. 10.1007/978-3-030-01418-6_41

  • 128

    SmithM.SmithJ. C. (2020). Repurposing therapeutics for COVID-19: supercomputer-based docking to the SARS-CoV-2 viral spike protein and viral spike protein-human ACE2 interface. ChemRxiv [Preprint].10.26434/chemrxiv.11871402.v4

  • 129

    Soria-GuerraR. E.Nieto-GomezR.Govea-AlonsoD. O.Rosales-MendozaS. (2015). An overview of bioinformatics tools for epitope prediction: Implications on vaccine development. J. Biomed. Inform.53, 405414. 10.1016/j.jbi.2014.11.003

  • 130

    StammlerS. N.CaoS.ChenS. J.GiedrocD. (2011). A conserved RNA pseudoknot in a putative molecular switch domain of the 3′-untranslated region of coronaviruses is only marginally stable. RNA17, 17471759. 10.1261/rna.2816711

  • 131

    StebbingJ.PhelanA.GriffinI.TuckerC.OechsleO.SmithD.et al. (2020). COVID-19: combining antiviral and anti-inflammatory treatments. The Lancet Infectious Diseases20, 400402. 10.1016/S1473-3099(20)30132-8

  • 132

    SunY.LiangD.WangX.TangX. (2020). DeepID3: Face Recognition with Very Deep Neural Networks. Available online at: http://arxiv.org/abs/1502.00873 (accessed April 26, 2020).

  • 133

    TanakaT.NarazakiM.KishimotoT. (2016). Immunotherapeutic implications of IL-6 blockade for cytokine storm. Immunotherapy8, 959970. 10.2217/imt-2016-0020

  • 134

    TangB.HeF.LiuD.FangM.WuZ.XuD. (2020). AI-aided design of novel targeted covalent inhibitors against SARS-CoV-2. bioRxiv [Preprint]. 10.1101/2020.03.03.972133

  • 135

    TiloccaB.SoggiuA.SanguinettiM.MusellaV.BrittiD.BonizziL.et al. (2020). Comparative computational analysis of SARS-CoV-2 nucleocapsid protein epitopes in taxonomically related coronaviruses. Microbes Infect.22, 188194. 10.1016/j.micinf.2020.04.002

  • 136

    TonA.-T.GentileF.HsingM.BanF.CherkasovA. (2020). Rapid identification of potential inhibitors of SARS-CoV-2 main protease by deep docking of 1.3 billion compounds. Mol. Inform.39:202000028. 10.1002/minf.202000028

  • 137

    TorngW.AltmanR. B. (2019). Graph convolutional neural networks for predicting drug-target interactions. J. Chem. Inf. Model. 59, 41314149. 10.1021/acs.jcim.9b00628

  • 138

    TouretF.GillesM.BarralK.NougairèdeA.DecrolyE.de LamballerieX.et al. (2020). In vitro screening of a FDA approved chemical library reveals potential inhibitors of SARS-CoV-2 replication. bioRxiv [Preprint]. 10.1101/2020.04.03.023846

  • 139

    Toxicology EPA's National Center for Computational. (2018). ToxCast Database (invitroDB). The United States Environmental Protection Agency's Center for Computational Toxicology and Exposure. Dataset. 10.23645/epacomptox.6062623.v5

  • 140

    TranN. H.QiaoR.XinL.ChenX.ShanB.LiM. (2019). Personalized deep learning of individual immunopeptidomes to identify neoantigens for cancer vaccines. bioRxiv [Preprint]. 10.1101/620468

  • 141

    VaishnavN.GuptaA.PaulS.JohnG. J. (2015). Overview of computational vaccinology: vaccine development through information technology. J. Appl. Genet.56, 381391. 10.1007/s13353-014-0265-2

  • 142

    VaswaniA.BrainG.ShazeerN.ParmarN.UszkoreitJ.JonesL.et al. (2017). “Attention is all you need,” in 31st Conference Neural Infection Processing System (NIPS 2017).

  • 143

    WallachI.DzambaM.HeifetsA. (2020). AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery. Available onlion at: http://arxiv.org/abs/1510.02855 (accessed April 22, 2020).

  • 144

    WanY.ShangJ.SunS.TaiW.ChenJ.GengQ.et al. (2019). Molecular mechanism for antibody-dependent enhancement of coronavirus entry. J. Virol.94, 115. 10.1128/JVI.02015-19

  • 145

    WangD.LiuW.ShenZ.JiangL.WangJ.LiS.et al. (2020). Deep learning based drug metabolites prediction. Front. Pharmacol.10:1586. 10.3389/fphar.2019.01586

  • 146

    WangG.LiX.WangZ. (2015). APD3: the antimicrobial peptide database as a tool for research and education. Nucleic Acids Res.44, 10871093. 10.1093/nar/gkv1278

  • 147

    WeissK.KhoshgoftaarT. M.WangD. D. (2016). A survey of transfer learning. Big Data J.3:9. 10.1186/s40537-016-0043-6

  • 148

    Worldometer (2020). Coronavirus Cases. Worldometer. Available online at: https://www.worldometers.info/coronavirus/coronavirus-cases/#daily-cases (accessed April 27, 2020).

  • 149

    WuC.LiuY.YangY.ZhangP.ZhongW.WangY.et al. (2020). Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods. Acta Pharm. Sin. B. 10, 766788. 10.1016/j.apsb.2020.02.008

  • 150

    WuJ.WangW.ZhangJ.ZhouB.ZhaoW.SuZ.et al. (2019). DeepHLApan: a deep learning approach for neoantigen prediction considering both HLA-peptide binding and immunogenicity. Front. Immunol.10:2559. 10.3389/fimmu.2019.02559

  • 151

    XiaS.XuW.WangQ.WangC.HuaC.LiW.et al (2020). Peptide-Based Membrane Fusion Inhibitors Targeting HCoV-229E Spike Protein HR1 and HR2 Domains. mdpi.com. Available online at: https://www.mdpi.com/1422-0067/19/2/487 (accessed April 28, 2020).

  • 152

    XiangZ.HeY. (2009). Vaxign: a web-based vaccine target design program for reverse vaccinology. Procedia Vaccinol.1, 2329. 10.1016/j.provac.2009.07.005

  • 153

    YangD.LeibowitzJ. L. (2015). The structure and functions of coronavirus genomic 3' and 5' ends. Virus Research206, 120133. 10.1016/j.virusres.2015.02.025

  • 154

    YuW.MackerellA. D. (2017). “Computer-aided drug design methods,” in Methods in Molecular Biology, Vol. 1520, ed SassP. (New York, NY: Humana Press Inc.), 85106. 10.1007/978-1-4939-6634-9_5

  • 155

    ZhaiS.ChangK.ZhangR.ZhangZ. (2016). DeepIntent: Learning attentions for online advertising with recurrent neural networks KDD'16. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (New York, NY: Association for Computing Machinery), 12951304.

  • 156

    ZhangC.WuZ.LiJ.-W.ZhaoH.WangG.-Q. (2020). The cytokine release syndrome (CRS) of severe COVID-19 and Interleukin-6 receptor (IL-6R) antagonist Tocilizumab may be the key to reduce the mortality. Int. J. Antimicrob. Agents55:105954. 10.1016/j.ijantimicag.2020.105954

  • 157

    ZhangH.SaravananK. M.YangY.HossainT. (2020). Deep learning based drug screening for novel coronavirus 2019-nCov. Prepr19, 117. 10.20944/preprints202002.0061.v1

  • 158

    ZhangL.LinD.SunX.CurthU.DrostenC.SauerheringL.et al. (2020). Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors. Science368:eabb3405. 10.1126/science.abb3405

  • 159

    ZhavoronkovA.AladinskiyV.ZhebrakA.ZagribelnyyB.TerentievV.BezrukovD. S.et al. (2020). Potential 2019-nCoV 3C-like protease inhibitors designed using generative deep learning approaches Potential COVID-19 3C-like protease inhibitors designed using generative deep learning approaches. Insilico Med. Hong Kong Ltd A307:E1. 10.26434/chemrxiv.11829102.v1

  • 160

    ZhavoronkovA.IvanenkovY. A.AliperA.VeselovM. S.AladinskiyV. A.AladinskayaA. V.et al. (2019). Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol.37, 10381040. 10.1038/s41587-019-0224-x

  • 161

    ZhengC.YuW.XieF.ChenW.MercadoC.SyL. S.et al. (2019). The use of natural language processing to identify Tdap-related local reactions at five health care systems in the Vaccine Safety Datalink. Int. J. Med. Inform.127, 2734. 10.1016/j.ijmedinf.2019.04.009

  • 162

    ZhongF.XingJ.LiX.LiuX.FuZ.XiongZ.et al. (2018). Artificial intelligence in drug design. Sci. China Life Sci.61, 11911204. 10.1007/s11427-018-9342-2

  • 163

    ZhouH.GaoM.SkolnickJ. (2015). Comprehensive prediction of drug-protein interactions and side effects for the human proteome. Sci. Rep.5:11090. 10.1038/srep11090

  • 164

    ZhouP.YangX.-L.WangX.-G.HuB.ZhangL.ZhangW.et al. (2020). A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature579, 270273. 10.1038/s41586-020-2012-7

  • 165

    ZhouY.HouY.ShenJ.HuangY.MartinW.ChengF. (2020). Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2. Cell Discov.6:14. 10.1038/s41421-020-0153-3

Summary

Keywords

COVID-19, SARS-COV-2, drug, vaccine, artificial intelligence, deep learning

Citation

Keshavarzi Arshadi A, Webb J, Salem M, Cruz E, Calad-Thomson S, Ghadirian N, Collins J, Diez-Cecilia E, Kelly B, Goodarzi H and Yuan JS (2020) Artificial Intelligence for COVID-19 Drug Discovery and Vaccine Development. Front. Artif. Intell. 3:65. doi: 10.3389/frai.2020.00065

Received

09 May 2020

Accepted

17 July 2020

Published

18 August 2020

Volume

3 - 2020

Edited by

Weida Tong, National Center for Toxicological Research (FDA), United States

Reviewed by

Xiaowei Xu, University of Arkansas at Little Rock, United States; Zhichao Liu, National Center for Toxicological Research (FDA), United States

Copyright

*Correspondence: Jiann Shiun Yuan

This article was submitted to Medicine and Public Health, a section of the journal Frontiers in Artificial Intelligence

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
