- 1School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China
- 2Shenzhen Research Institute of Xiamen University, Shenzhen, China
- 3Department of Automation, Xiamen University, Xiamen, China
- 4School of Economics and Management, Xiamen University of Technology, Xiamen, China
Drug-drug interaction prediction plays an important role in pharmacology and clinical applications. Most traditional methods predict drug interactions based on drug attributes or network structure. They usually have three limitations: 1) failing to integrate drug features and network structures well, resulting in less informative drug embeddings; 2) being restricted to a single view of drug interaction relationships; 3) ignoring the importance of different neighbors. To tackle these challenges, this paper proposed a multiview fusion based on dual-level attention to predict drug interactions (called MFDA). The MFDA first constructed multiple views for the drug interaction relationship, and then adopted a cross-fusion strategy to deeply fuse drug features with the drug interaction network under each view. To distinguish the importance of different neighbors and views, MFDA adopted a dual-level attention mechanism (node level and view level) to obtain the unified drug embedding for drug interaction prediction. Extensive experiments were conducted on real datasets, and the MFDA demonstrated superior performance compared to state-of-the-art baselines. In the multitask analysis of new drug reactions, MFDA obtained higher scores on multiple metrics. In addition, its prediction results corresponded to specific drug reaction events, which achieved more accurate predictions.
Introduction
Drug combination therapy is widely used in clinical practice and has been recognized as a safe and effective treatment for serious diseases (Sun et al., 2016). However, multiple medications can also lead to harmful drug-drug interactions (DDIs), which can be life-threatening in severe cases and lead to the withdrawal of drugs from the market. The methods currently available to detect DDIs are primarily through long-term in vivo and in vitro clinical trials, which are costly and tedious. Therefore, it is critical to propose an efficient method for DDI identification.
Nowadays, AI-based computational methods have been widely used in biomedical fields (Dong et al., 2022; Meng et al., 2022), such as drug repositioning (Pan et al., 2022), protein interactions (Song et al., 2022), and DDI predictions (Deng et al., 2020). For DDI prediction, feature or topological similarity-based approaches are the most common. Many methods have been developed based on these methods in different ways, including aggregating multiple types of drug features (Gottlieb et al., 2012), predicting multi-class drug reaction events (Ryu et al., 2018), and integrating drug attribute features with drug interaction relationship (Kang et al., 2022). These optimized methods are effective and make the DDI predictions more complete. However, most of those methods ignore the importance of multiview learning (Blum and Mitchell, 1998) and do not take it into consideration. In recent years, multiview learning has been successfully applied to many tasks. As illustrated in Figure 1 for the classification task, integrating multiple views can capture more discriminative features and thus enhance difference. It means that integrating multiple views facilitates to capture of more comprehensive relationships. The situation also applies to drug interaction networks, as the reactions between drugs are complex and diverse. The adjacency view (graph), which is often used in experiments, reflects only one aspect of direct similarity between interacting drugs. Indeed, similarities between nodes that do not directly interact have proven to be very useful in biological networks (Huang et al., 2020), including genetic interactions and protein-protein interaction networks (Costanzo et al., 2016; Kovács et al., 2018). Therefore, we propose a novel multiview fusion approach (MFDA) to predict DDI in this paper. In addition to the adjacency view that provides local topological relations, we add a diffusion view providing beyond first-order topological relationships and the nearest neighbor view constructed from feature similarities to capture a more comprehensive and accurate DDI pattern. Moreover, for the integration of drug features and network structures under each view, we employed cross-fusion to flexibly exchange and fuse the two sources of information. Our contributions are summarized as follows:
1) This study presents a novel MFDA method to predict drug interactions. It constructs multiple relational views of drug interactions and employs a cross-fusion strategy to handle the fusion of drug features and topological information under each view. The experiment results prove the effectiveness of the proposed method.
2) MFDA proposes a dual-level attention mechanism to distinguish the importance of drug nodes and different views for effective fusion. Furthermore, the attention score can also provide interpretable predictions for drug reactions.
3) Most traditional methods usually focus on binary prediction results, and MFDA can predict multiple drug reaction events, which is more useful for investigating the mechanism hidden behind the drug reactions.
FIGURE 1. Two different views. Cl and C2 denote two categories in two different colors. V1 and V2 represent two views, in view V1, the two categories can be easily distinguished by the classifier, while in view V2, the data of these two categories are mixed and difficult to separate.
Related work
Many computation-based methods have been proposed for DDI prediction, and they can be mainly divided into the following four groups.
Feature similarity-based methods assume that drugs with similar features share similar reaction patterns. In early DDI studies, researchers focused on the calculation of drug feature similarities for DDI prediction, such as side effects (Tatonetti et al., 2011), chemical structure (Vilar et al., 2012), and phenotype (Peng et al., 2015). Later, multiple features were combined to improve the model performance (Gottlieb et al., 2012; Cheng and Zhao, 2014; Yan et al., 2020). For example, Kastrin et al. (2018) considered semantic and topological similarities and adopted five classifiers to build a prediction model. Yan et al. (2019) proposed IDNDDI, which used cosine similarity to calculate information about chemical structure, biological and phenotypic features of drugs to infer drug reactions. Rohani and Eslahchi. (2019) calculated multiple drug similarities and used a heuristic similarity selection method to select features that are informative and less redundant for combination. These methods are natural and straightforward, but they ignore structural information. In addition, the selection of features relies on expert experience.
Graph embedding-based methods automatically learn low-dimensional node embeddings from relational networks and use them as node features. For example, Yue et al. (2020) proposed a common graph representation learning framework for predicting drug reactions. It utilized three common types of graph embedding methods (factorization, random walk, and neural networks) to obtain drug embeddings that were used to infer potential drug reactions. The above approaches follow an unsupervised general framework that leads to suboptimal model performance. Subsequent efforts were made to improve it for specific DDI tasks. For example, to obtain high-order drug topological relations, SkipGNN (Huang et al., 2020) aggregated neighboring topological information within two hops for DDI prediction. To introduce multiple drug-related entities, Yan et al. (2021) constructed five drug-related heterogeneous networks and implemented random wander with restart algorithms and positive point mutual information to obtain unified topological embeddings for prediction. The above optimized methods are promising, but they fail to take feature information into consideration, which can also provide discriminative information for prediction tasks.
Attribute-based network methods combine both network topology and attribute information to obtain more comprehensive information. They are usually optimized by incorporating drug feature information into previous graph embedding methods. For example, among the factorization-based approaches (Xiao et al., 2017; Yu et al., 2018; Zhu et al., 2020), Zhu et al. (2020) proposed an attribute supervised learning model that incorporated two drug attributes, molecular structure and side effects, and their correlation to infer adverse interactions among drugs. In graph neural network approaches (Zitnik et al., 2018; Kip F and Welling, 2017), where propagation-based methods usually incorporate the feature information into the relational network. For instance, Zitnik et al. (2018) proposed Decagon, which constructed a multimodal network and adopted convolutional operators to propagate and transform feature vectors across the multirelational network. As a result, the final obtained embedding vector learned both the drug attribute information and the structure information. Decagon has achieved favorable results by encoding feature information of multiple entities. To easily integrate multiple entities and handle multiple relationships flexibly, knowledge graph (KG)-based methods have received more attention (Zeng et al., 2022). Many excellent models have been proposed, such as the relation-aware network embedding model for DDI prediction (RANEDDI) (Hui et al., 2021), multimodal deep neural network (MDNN) (Lyu et al., 2021), and knowledge summarization GNN (Sum-GNN) (Yu et al., 2021), which can better capture the structure and semantic information of the network.
The above attribute-based network methods have made considerable improvements over pure structure-based methods, but they do not integrate the information about attributes and structures well. For example, factorization-based methods are shallow models that cannot capture the nonlinear relationship between the two information sources. Propagation-based methods prefer network structures to drug attributes and may result in oversmoothing problems. More importantly, these methods mainly focus on a single view of drug reaction relationships and ignore the importance of multiple views. As a result, the model’s results are highly dependent on the quality of the selected view, resulting in poor robustness.
Multiview-based methods construct multiple views of the same target and learn by exploiting the complementarity complementary between views. Many multiview learning methods have been proposed. For example, Wang et al. (2020) proposed AM-GCN, which constructed two views from network structure and node features, respectively, and then used an attention mechanism to automatically learn the weights of the views. Yuan et al. (2021) designed three complementary views and adopted convolutional operations to learn view embeddings. Finally, an attention mechanism was also used to fuse node representations for the classification task. Wang et al. (2021) proposed to capture inter-view molecule structure and intra-view interactions between drugs, and then used an unsupervised contrast learning component to balance and integrate multiview information. Although the above methods are effective, there is room for improvement. For instance, AM-GCN only constructed two views, and Yuan et al. (2021) used convolution operations to simply aggregate node features and topological information, which limits the model to capture important information. Therefore, we constructed three views to learn the comprehensive drug interaction relationships. Unlike before, we propose a cross-fusion strategy to fuse drug features and topologies under each view. Specifically, it utilizes dual channels to encode drug features and network structures separately and then exchanges information flexibly with convey operations. Through extensive experiments, the experimental results show that our model outperforms other optimal methods.
Methodology
Overview
The framework of the model MFDA is shown in Figure 2. It consists of four main parts: multiple-view construction, dual-level attention mechanism, cross-fusion strategy, and model optimization. The model first constructed three graphs
Problem definition
In this section, we use the relevant notation in Table 1 to state the problem definition. Given a multirelational DDI network
Multiple view construction
Given a graph
1) Adjacency graph (denoted
2) Diffusion graph (denoted
where
3) K-nearest neighbors graph (denoted
where
Cross fusion strategy
To effectively fuse the network structure and node attribute information of each view, this study adopts a cross-fusion strategy.
First, we need to process the three raw features given in this study (substructure, target, enzyme). Since each feature corresponds to a set of feature descriptors, and each drug can be represented by a binary vector, we calculate the pairwise similarity by performing the Jaccard similarity using Eq. 3.
where
We use dual channels to encode drug features and network structures separately. For clarity, we take drug feature
For drug features, we adopt the widely used AE to learn feature embeddings.
where
For the network structures, different neighbors play various roles for the central node. Therefore, we use a node-level attention mechanism to assign associated weights to neighbors, which is essentially implemented through GAT. The topological embedding learned at each layer is denoted
where
Dual-level attention mechanism
The influence of different neighbors on the central node is distinct, as are the different views. As an analogy, it seems that when evaluating a person, his or her parents, friends, classmates, and teachers (regarded as the node level) do not have the same opinions about him. Notably, parents and friends have better familiarity, and their judgment should be more comprehensive. Considering other aspects, such as learning and sociability (regarded as the view level), the impact of their evaluation should also be distinguished. This phenomenon may also apply to the prediction of drug reactions. Therefore, we use a dual-level attention mechanism to assign learnable weights to different nodes and views.
Node-level attention mechanism
Considering that the importance of different neighbor nodes is varied, we adopt a self-attention mechanism to adaptively learn the weights. Taking drug
where
View-level attention mechanism
We use view-level attention to assign the appropriate weights based on their contribution to the prediction task, which can help to reduce noise and achieve more efficient integration.
We first calculate the attention coefficients of three graph embeddings. For simplicity, we denote them by
Taking the calculation process of
where
Similarly, we can obtain the attention scores
Loss optimization
The candidate drug pairs
For the prediction task of drug reactions, this paper combines cross-entropy loss
where
where
Experiments
Experimental setup
Dataset
The experimental data in this paper come from DDIMDL (Deng et al., 2020), which contains 572 drugs with four different types of features: chemical substructures, targets, enzymes, and pathways. There are also 37,264 drug pairs with 65 types of reaction events. According to the DDIMDL experiment, we used the three most effective features, i.e., chemical substructure, target, and enzyme for experiments.
Baselines
The following state-of-the-art methods were used to evaluate the performance of MFDA.
Classic classifier: These methods are traditional supervised learning methods. The unified feature vector
DNN: We reproduced Lee’s idea (Lee et al., 2019) with a deep neural network (DNN). The drug pair vectors of the three features are concatenated and fed into the DNN classifier.
DeepDDI1: DeepDDI (Ryu et al., 2018) is the first model to apply deep learning to drug reaction events. It uses the chemical substructure similarity of the drugs as the input and predicts the interaction type through a deep learning model. In this work, we adjusted the output to 65 classes to accommodate multitype DDI prediction.
DDIMDL2: DDIMDL (Deng et al., 2020) optimizes the DeepDDI (Ryu et al., 2018) model and inputs three drug-related features of chemical substructure, target, and enzyme into the three constructed sub-models for training.
LINE: LINE (Tang et al., 2015) considers the first-order similarity and second-order similarity between nodes.
HOPE: HOPE (Ou et al., 2016) is a factorization method that maintains high-order proximity by reconstructing the adjacency matrix.
Node2Vec: Node2vec (Grover and Leskovec, 2016) employs a biased random walking strategy to obtain a sequence of neighbor nodes and then inputs them into the skip-gram model to get the embedding of nodes
SDNE: SDNE (Wang et al., 2016) improves the LINE method. It utilizes deep AE models to optimize both first- and second-order similarities instead of optimizing them separately.
GAE: GAE (Kipf and Welling, 2016) adopts an encoder-decoder structure to learn embedding representations and train the model by minimizing the reconstruction loss.
SkipGNN3: SkipGNN (Huang et al., 2020) aggregates messages from two-hop neighbors and immediate neighbors in the interaction network.
RANEDDI4: RANEDDI (Hui et al., 2021) considers multiple relationships between drugs. It uses the knowledge graph-based approach RotatE (Sun et al., 2019) to learn initial drug embeddings, which are fed into the relationship-aware network to predict multiple DDI reactions.
DPDDI5: To be fair, this experiment incorporates attribute feature information based on the original DPDDI (Feng et al., 2020) and then uses graph convolutional networks (GCNs) to learn the drug representation for prediction.
DANE6: DANE (Gao and Huang, 2018) not only captures potential high nonlinearity in topology and attributes but also maintains first-order and high-order the proximities in the original network.
AM-GCN7: AM-GCN (Wang et al., 2020) constructs KNN graphs from the node feature space; and network topology graphs from the interaction relationship of nodes. Then, the attention mechanism is used to adaptively integrate deep correlation information between topological structures and node features.
MGCCN: MGCCN (Wang et al., 2022) builds three graphs from drug relationships and uses multiple parallel convolutional layers for each graph to learn topological representation. Finally, the multigraph attention module is used to obtain a unified node representation.
Reimplement details: the above comparison methods can be classified into four groups, and the detailed settings refer to Table 3. To ensure fairness, the embedding results learned by the model are saved and uniformly classified by using a DNN classifier. Note that the embedding learning methods (LINE, HOPE, Node2vec, SDNE, GAE) can be re-implemented in the BioNEV framework (Yue et al., 2020).
Evaluation metrics
To comprehensively evaluate the performance of MFDA, we used six evaluation metrics, including the accuracy (ACC), area under the precision-recall curve (AUPR), area under the receiver operating characteristic (ROC) curve (AUC), F1, Precision (Pre), and Recall. We use micro metrics for AUPR and AUC and macro metrics for the other metrics (Pre_macro, F1_macro, Recall_macro).
Parameter and evaluation settings
The maximum iteration number is set to 1,000, and we selected the Adam optimizer with a learning rate of 0.003 to optimize MFDA.
Meanwhile, we applied fivefold cross-validation in our experiments and randomly divided all DDI pairs into five subsets. The final score was calculated by taking the average of the five rounds of output. To avoid overfitting, we used the early-stopping strategy, which automatically terminated training after 20 epochs if no improvement was observed. Regarding the settings of the comparison method parameters, they all follow the original paper. The experiment was conducted on Windows Server 2016 Datacenter, which was configured with Intel Xeon Processor (Skylake, IBRS), 2.3 GHz, 32 CPUs. It was running on python 3.8 and the total running time was about 1551.76 s.
Results and analysis
The results of the comparison experiments in Table 4 show that RANEDDI has the highest score on ACC, which may be due to the KG-based approach used. The KG-based approach can capture the relationships between multiple entities and obtain rich structural and semantic information. And Node2Vec scored highest on AUC_micro, which was due to its biased sampling that integrated structural features obtained by depth and breadth search. However, MFDA performed best overall, outperforming RANEDDI by 12.68, 9.71, and 13.97% in F1_macro, Pre_macro, and Recall_macro metrics, respectively, and by 7.13, 5.09, and 8.23% over Node2vec.
Compared to the multiview-based methods, we can see that MFDA is better than the best MGCCN, the difference is that in each view, we use a cross-fusion strategy to deeply fuse drug features and topological information, rather than simply fusing with the GCN. Besides, the proposed MFDA has the best results compared to other view ablation variants, which demonstrated that integrating multiple views is beneficial to improve model performance. In addition, we observed that DDIMDL performed better in feature-based methods. A plausible explanation was that DDIMDL used three types of drug features and captured different heterogeneous information from them. However, the GAE, and SDNE performed poorly in the network-based methods. This can be attributed to the unbalanced distribution of multiple DDI events. For verification, we counted the DDI events involved in the experiment, as shown in Figure 3. The results showed that there was a sharp decline over 65 DDI class distributions, and the data were unevenly distributed. This could directly affect the overall performance of the model using GCN as the encoder.
Ablation study
To explore the influence of each module in the model, we set five ablation variants, as detailed in Table 5 and the results of the ablation experiments are shown in Figure 4.
MFDAw/o AE: This variant has no AE component. The fusion coefficient alpha was set to 0, so the MFDA degenerated into a multilayer GAT model, capturing only the information of the network structure.
MFDAw/o GAT: This variant has no GAT module. It inputs the embedding representation obtained by AE into the model for end-to-end training and uses only the feature information of the drug.
MFDAw/o Convey: There is no convey operation to exchange information between layers. It combined the embeddings learned from two independent networks through an attention mechanism to obtain a unified embedding vector. This variant was designed to verify the importance of the convey operation.
MFDAw/o NodeAtt: This variant does not use the node-level attention mechanism. It replaced GAT with GCN, which uniformly aggregated neighboring nodes. The rest of the settings are the same as MFDA.
MFDAw/o LayerAtt: This variant does not use a view-level attention mechanism. When fusing three view embeddings, the final drug embedding is obtained by averaging method.
The results in Figure 4 show that the ablation variants scored lower on all metrics compared to MFDA, indicating that each component of the model is functional. In addition, we note that the MFDAw/o AE variant, which does not have the AE model to capture feature information, decreases significantly across all metrics, suggesting that feature information contributes the most to the prediction task, followed by the attention mechanism.
Parameter sensitivity analysis
In this section, we investigate the effects of the fusion coefficient
FIGURE 5. Results of ablation experiments. (A) Effect of fusion coefficient. (B) Effects of drug combination methods. (C) Effect of the number of fusion layers.
The fusion coefficient
Different methods of drug combination affect the meaning of the constructed drug-pair vectors. Here, we explore the effect of different combinations on the experimental results. The results in Figure 5B show that when the average method is applied, almost all metrics achieved the highest value.
To evaluate the fusion layer parameters, we varied layers one to five and the number of hidden layer nodes was empirically set to (2000, 512, 512, 256, 65). We fixed the number of neurons in the last hidden layer to 65, as there are 65 types of DDI events. Then, the number of layers was gradually increased to evaluate how the number of layers affects the experimental results. The results are shown in Figure 5C, where the model performance showed an increasing trend until it reached three levels, indicating that it incorporated more useful information. When the layer number is 3, all metrics reached the highest point. After that, all of them had a decreasing trend, indicating that too many layers may also incorporate noisy information, which instead reduced the model effect. Finally, we set the number of layers to 3 as a trade-off.
Multitask analysis
In general, we are more interested in the ability to predict reactions to new drugs. The previous experiment is equivalent to Task A in Figure 6, which evaluates the model’s ability to predict known drugs. While Task B and Task C were designed to assess the predictive ability of the new drug reactions. For clarity, the relationship between the three tasks is shown in Figure 6.
Task A: We applied fivefold cross-validation and randomly split all drug pair vectors into five subsets. We trained models based on DDIs in the training set and then made predictions for DDIs in the test set. The evaluation score was the average of the output of the five rounds.
Task B: Without using previous drug pairs, 572 drugs were randomly divided into five subsets. We used one of them as a test set to simulate drugs with no known interactions. The model was constructed on the training drugs and tested between the training drugs and testing drugs.
Task C: This design is similar to Task B. The difference is that the model is tested between the testing drugs, which is equal to predicting two completely new drugs.
We selected multiple drug reaction methods (DeepDDI, DPDDI, DDIMDL) for comparison experiments on three tasks, and the experimental results are shown in Table 6.
We observed a significant decrease in the scores of all models when predicted with new drugs. In particular, the AUPR_micro and F1_macro scores of DDIMDL in Task B dropped from 0.656 to 0.446, respectively, to 0.364, and 0.159 in Task C, respectively. However, the MFDA did not decrease drastically and always achieved optimal results. A possible explanation is that MFDA had a stable network structure, which constructed three views and was able to capture deep interaction information of drug features with the network topology under each view.
Case study
In this section, we conduct case studies to validate the predictive ability of MFDA in practice. We performed experiments on 572 drugs that had 37,264 DDIs and 65 types of reaction events. The experimental goal is to predict the reaction events between the remaining drug-drug pairs. We pay attention to the top ten most common drug reaction events and recorded drug pair reactions with the highest scores in each reaction type. We used the interactions checker tool provided by DrugBank8 for verification. The experimental results are shown in Table 7. Seven DDI events can be identified out of the top ten events, which achieved good prediction accuracy. In addition, the predicted results were well-readable. For example, the predicted reaction of abemaciclib with astemizole was type “1″, which suggested that the metabolism of abemaciclib would be weaker when combined with astemizole. This specific type of predictive result could provide additional insight into the underlying mechanisms behind the drug reactions.
Furthermore, we selected the top 100 drug pairs with the highest scores and then checked up evidence in DrugBank. The results were organized into a relational graph (Figure 7) for presentation, where confirmed reactions were represented by red edges, otherwise by gray edges, and node size was related to the number of reactions involved. Statistically, 62% of the interactions have been confirmed. This case shows that MFDA is promising for predicting unknown reactions between drugs.
Analysis of attention mechanism
To overcome the lack of interpretability of graph neural networks, we predicted specific reaction events, which made the results more readable. Furthermore, we used a dual-level attention mechanism to provide interpretability for DDI. We first output three view-level attention scores as shown in Figure 8. From the attention distribution, we found that the KNN graph contributed more to drug prediction. The KNN graph revealed similarities in the drug feature space, which was in line with the conclusion drawn from the previous ablation experiments that drug feature information would be important.
For further analysis, we chose to investigate node-level attention under the KNN view with a high importance score. Specifically, we saved the last layer of node-level attention scores under the KNN view and plotted it as a heatmap, as shown in Figure 9. We randomly selected a drug node 452 (Calcidiol) and found that it has a large correlation score with node 565 (Calcipotriol). They did not belong to the same type of drug, but we found six common neighbors between drug nodes 452 and 565 by analysing their relationships in known drug pair reactions. To make it clear, we used Echart9 for display. The two nodes shared many common neighbors, so there was a strong connection between the two nodes. To sum up, the dual-level attention mechanism allows us to know what view is the most important, and the relationship between neighbors. This can facilitate the understanding of drug reaction prediction.
Conclusion
We proposed a novel method MFDA for the prediction of multiple drug reaction events. MFDA integrated information from multiple views, which reduced the dependence on a single view and greatly improved the stability of the model. In addition, under each view, convey operations were applied to flexibly exchange feature information with topology information, which helped to fully capture interaction information and therefore obtained high-quality drug representations. Through extensive comparative experiments as well as prediction experiments for new drugs, MFDA achieved satisfactory performance compared to optimal baselines. In addition, the dual-level attention mechanism could also provide an interpretable prediction for drug reactions. In the future, we may extend MFDA to heterogeneous graphs and explore capturing high-level embedding representations using knowledge graph-based approaches.
Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: https://github.com/kk12321/SKjjdk-MFDA.
Author contributions
KL conceived and designed the method; LK implemented the method; JL collected andanalyzed the data; KL, LK and PL drafted the manuscript; FY managed the quality andprogress of the whole study; KL, PL, and FY jointly modified the manuscript. All the authorsreviewed and approved the final manuscript.
Funding
This work is partially supported by National Natural Science Foundation of China under Grant No. 62173282, the Shenzhen Fundamental Research Program No. JCYJ20190809161603551, the Science Foundation of Fujian Province No. 2021J011188, the Natural Science Foundation of Fujian under Grant No. 2020J01053, the Xiamen Science and Technology Planning Project No. 3502Z20206073, and the XMUT Scientific Research Project (No.YSK20002R, No. YKJCX2021053, No. YKJCX2020103).
Acknowledgments
We acknowledge reviewers for the valuable comments on the original manuscript.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Abbreviations
ACC, Accuracy; AE, Autoencoder; AUC, Area under the receiver operating characteristic curve; AUPR, Area under the precision-recall curve; DDI, Drug‒drug interaction; GAT, Graph attention network; KNN, K nearest neighbors; LR, logistic regression.
Footnotes
1https://bitbucket.org/kaistsystemsbiology/deepddi
2https://github.com/YifanDengWHU/DDIMDL
3https://github.com/kexinhuang12345/SkipGNN
4https://github.com/DongWenMin/RANEDDI
5https://github.com/NWPU-903PR/DPDDI
6https://github.com/gaoghc/DANE
7https://github.com/zhumeiqiBUPT/AM-GCN
8Drugbank: https://go.drugbank.com/
9https://echarts.apache.org/zh/index.html
References
Blum, A., and Mitchell, T. (1998). “Combining labeled and unlabeled data with co-training,” in Proceedings of the eleventh annual conference on Computational learning theory, 92–100. doi:10.1145/279943.279962
Cheng, F., and Zhao, Z. (2014). Machine learning-based prediction of drug-drug interactions by integrating drug phenotypic, therapeutic, chemical, and genomic properties. J. Am. Med. Inf. Assoc. 21 (2), e278–e286. doi:10.1136/amiajnl-2013-002512
Costanzo, M., VanderSluis, B., Koch, E. N., Baryshnikova, A., Pons, C., Tan, G., et al. (2016). A global genetic interaction network maps a wiring diagram of cellular function. Science. 353 (6306), aaf1420. doi:10.1126/science.aaf1420
Deng, Y., Xu, X., Qiu, Y., Xia, J., Zhang, W., and Liu, S. J. B. (2020). A multimodal deep learning framework for predicting drug-drug interaction events. Bioinformatics 36 (15), 4316–4322. doi:10.1093/bioinformatics/btaa501
DeStefano, J. J. (1990). “Logistic regression and the Boltzmann machine,” in International joint conference on neural networks, 199–204. doi:10.1109/ijcnn.1990.137845
Dong, J., Zhao, M., Liu, Y., Su, Y., and Zeng, X. (2022). Deep learning in retrosynthesis planning: Datasets, models and tools. Brief. Bioinform 23 (1), bbab391. doi:10.1093/bib/bbab391
Feng, Y.-H., Zhang, S.-W., and Shi, J.-Y. (2020). Dpddi: A deep predictor for drug-drug interactions. BMC Bioinforma. 21 (1), 419–515. doi:10.1186/s12859-020-03724-x
Gao, H., and Huang, H. (2018). “Deep attributed network embedding,” in Twenty-seventh international joint conference on artificial intelligence (IJCAI). doi:10.24963/ijcai.2o18/46710.24963/ijcai.2018/467
Gottlieb, A., Stein, G. Y., Oron, Y., Ruppin, E., and Sharan, R. (2012). Indi: A computational framework for inferring drug interactions and their associated recommendations. Mol. Syst. Biol. 8, 592. doi:10.1038/msb.2012.26
Grover, A., and Leskovec, J. (2016). “node2vec,” in The 22nd ACM SIGKDD international conference. doi:10.1145/2939672.2939754
Huang, H., Li, J., and Hu, H. (2017). “Accelerated attributed network embedding,” in Proceedings of the 2017 SIAM international conference on data mining (SDM), 633–641. doi:10.1137/1.9781611974973.71
Huang, K., Xiao, C., Glass, L., Zitnik, M., and Sun, J. (2020). SkipGNN: Predicting molecular interactions with skip-graph networks. Sci. Rep. 10 (11), 21092–21116. doi:10.1038/s41598-020-77766-9
Kang, L.-P., Lin, K.-B., Lu, P., Yang, F., and Chen, J.-P. (2022). Multitype drug interaction prediction based on the deep fusion of drug features and topological relationships. PloS one 17 (8), e0273764. doi:10.1371/journal.pone.0273764
Kastrin, A., Ferk, P., and Leskošek, B. (2018). Predicting potential drug-drug interactions on topological and semantic similarity features using statistical learning. PloS one 13 (5), e0196865. doi:10.1371/journal.pone.0196865
Kip F, T. N., and Welling, M. (2017). “Semi-supervised classification with graph convolutional networks,” in International conference on learning representations. doi:10.48550/arXiv.1609.02907
Kipf, T. N., and Welling, M. (2016). “Variational graph auto-encoders,” in Conference on neural information processing systems. doi:10.48550/arXiv.1611.07308
Kovács, I., Luck, K., Spirohn, K., Wang, W., Pollis, D. K., Schlabach, S., et al. (2018). Network-based prediction of protein interactions. Cold Spring Harb. Lab. (1). doi:10.1101/275529
Lee, G., Park, C., and Ahn, J. (2019). Novel deep learning model for more accurate prediction of drug-drug interaction effects. BMC Bioinforma. 20 (1), 415. doi:10.1186/s12859-019-3013-0
Li, L., Huang, H., Fu, Y., Wang, J., Wu, Z., Ru, J., et al. (2015). Large-scale exploration and analysis of drug combinations. Bioinformatics 31 (12), 2007–2016. doi:10.1093/bioinformatics/btv080
Lyu, T., Gao, J., Tian, L., Li, Z., Zhang, P., and Zhang, J. (2021). “Mdnn: A multimodal deep neural network for predicting drug-drug interaction events,” in Proceedings of the thirtieth international joint conference on artificial intelligence, 3536–3542. doi:10.24963/ijcai.2021/487
Meng, Y., Lu, C., Jin, M., Xu, J., Zeng, X., and Yang, J. (2022). A weighted bilinear neural collaborative filtering approach for drug repositioning. Brief. Bioinform 23 (2), bbab581. doi:10.1093/bib/bbab581
Ou, M., Cui, P., Pei, J., Zhang, Z., and Zhu, W. (2016). “Asymmetric transitivity preserving graph embedding,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 1105–1114. doi:10.1145/2939672.2939751
Pan, X., Lin, X., Cao, D., Zeng, X., Yu, P. S., He, L., et al. (2022). “Deep learning for drug repurposing: Methods, databases, and applications,” in Wiley interdisciplinary reviews: Computational molecular science, 12, e1597. doi:10.1002/wcms.1597WIREs Comput. Mol. Sci.
Rohani, N., and Eslahchi, C. (2019). Drug-drug interaction predicting by neural network using integrated similarity. Sci. Rep. 9 (1), 13645–13711. doi:10.1038/s41598-019-50121-3
Ryu, J. Y., Kim, H. U., and Lee, Y. L. (2018). Deep learning improves prediction of drug-drug and drug-food interactions. Proc. Natl. Acad. Sci. U. S. A. 115, E4304. doi:10.1073/pnas.1803294115
Song, B., Luo, X., Luo, X., Liu, Y., Niu, Z., and Zeng, X. (2022). Learning spatial structures of proteins improves protein-protein interaction prediction. Brief. Bioinform 23 (2), bbab558. doi:10.1093/bib/bbab558
Sun, W., Sanderson, P. E., and Zheng, W. (2016). Drug combination therapy increases successful drug repositioning. Drug Discov. Today 21 (7), 1189–1195. doi:10.1016/j.drudis.2016.05.015
Sun, Z., Deng, Z. H., Nie, J. Y., and Tang, J. (2019). “RotatE: Knowledge graph embedding by relational rotation in complex space,” in The international conference on learning representations. doi:10.48550/arXiv.1902.10197
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., and Mei, Q. (2015). “Line,” in Proceedings of the 24th international conference on world wide web, 1067–1077. doi:10.1145/2736277.2741093
Tatonetti, N. P., Denny, J. C., Murphy, S. N., Fernald, G. H., Krishnan, G., Castro, V., et al. (2011). Detecting drug interactions from adverse-event reports: Interaction between paroxetine and pravastatin increases blood glucose levels. Clin. Pharmacol. Ther. 90, 133–142. doi:10.1038/clpt.2011.15910.1038/clpt.2011.83
Vilar, S., Harpaz, R., Uriarte, E., Santana, L., Rabadan, R., and Friedman, C. (2012). Drug-drug interaction through molecular structure similarity analysis. J. Am. Med. Inf. Assoc. 19 (6), 1066–1074. doi:10.1136/amiajnl-2012-000935
Wang, B., Wang, Y., He, X., Hu, Y., and Yin, B. J. I. S. P. (2022). Multi‐graph convolutional clustering network. IET Signal Process. 16, 650–661. doi:10.1049/sil2.12116
Wang, D., Cui, P., and Zhu, W. (2016). “Structural deep network embedding,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 1225–1234. doi:10.1145/2939672.2939753
Wang, X., Zhu, M., Bo, D., Cui, P., Shi, C., and Pei, J. (2020). “Am-gcn: Adaptive multi-channel graph convolutional networks,” in Proceedings of the 26th ACM SIGKDD International conference on knowledge discovery & data mining, 1243–1253. doi:10.1145/3394486.3403177
Wang, Y., Min, Y., Chen, X., and Wu, J. (2021). “Multi-view graph contrastive representation learning for drug-drug interaction prediction,” in WWW '21: The web conference 2021. doi:10.1145/3442381.3449786
Yan, C., Duan, G., Zhang, Y., Wu, F.-X., Pan, Y., and Wang, J. (2019). Idnddi: An integrated drug similarity network method for predicting drug-drug interactions, 89–99. doi:10.1007/978-3-030-20242-2_8
Yan, C., Duan, G., Zhang, Y., Wu, F. X., Pan, J., and Wang, J. (2020). Predicting drug-drug interactions based on integrated similarity and semi-supervised learning. IEEE/ACM Trans. Comput. Biol. Bioinform 99, 1. doi:10.1109/TCBB.2020.2988018
Yan, X., Yin, P., Wu, X., and Han, J. (2021). Prediction of the drug-drug interaction types with the unified embedding features from drug similarity networks. Front. Pharmacol. 12. doi:10.3389/fphar.2021.794205
Yu, H., Mao, K. T., Shi, J. Y., Huang, H., Chen, Z., Dong, K., et al. (2018). Predicting and understanding comprehensive drug-drug interactions via semi-nonnegative matrix factorization. BMC Syst. Biol. 12 (1), 14. doi:10.1186/s12918-018-0532-7
Yu, Y. A., Dong, W. M., and Shi, B. (2022). Raneddi: Relation-aware network embedding for drug-drug interaction prediction. Inf. Sci. 582, 167–180. doi:10.1016/j.ins.2021.09.008
Yu, Y., Huang, K., Zhang, C., Glass, L. M., Sun, J., and Xiao, C. J. B. (2021). SumGNN: Multi-typed drug interaction prediction via efficient knowledge graph summarization. Bioinformatics 37 (18), 2988–2995. doi:10.1093/bioinformatics/btab207
Yuan, J., Yu, H., Cao, M., Xu, M., Xie, J., and Wang, C. (2021). “Semi-supervised and self-supervised classification with multi-view graph neural networks,” in Proceedings of the 30th ACM international conference on information & knowledge management, 2466–2476. doi:10.1145/3459637.3482477
Yue, X., Wang, Z., Huang, J., Parthasarathy, S., Moosavinasab, S., Huang, Y., et al. (2020). Graph embedding on biomedical networks: Methods, applications and evaluations. Bioinformatics 36 (4), 1241–1251. doi:10.1093/bioinformatics/btz718
Zeng, X., Tu, X., Liu, Y., Fu, X., and Su, Y. (2022). Toward better drug discovery with knowledge graph. Curr. Opin. Struct. Biol. 72, 114–126. doi:10.1016/j.sbi.2021.09.003
Zhu, J., Liu, Y., Zhang, Y., and Li, D. (2021). Attribute supervised probabilistic dependent matrix tri-factorization model for the prediction of adverse drug-drug interaction. IEEE J. Biomed. Health Inf. 25 (99), 2820–2832. doi:10.1109/jbhi.2020.3048059
Keywords: multiview, dual-level attention, cross-fusion strategy, graph attention network, drug-drug interactions
Citation: Lin K, Kang L, Yang F, Lu P and Lu J (2022) MFDA: Multiview fusion based on dual-level attention for drug interaction prediction. Front. Pharmacol. 13:1021329. doi: 10.3389/fphar.2022.1021329
Received: 17 August 2022; Accepted: 13 September 2022;
Published: 06 October 2022.
Edited by:
Pan Zheng, University of Canterbury, New ZealandReviewed by:
Aiqing Zhang, Anhui Normal University, ChinaYong Xie, Nanjing University of Posts and Telecommunications, China
Copyright © 2022 Lin, Kang, Yang, Lu and Lu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Liping Kang, MjAyMjAzMTM5OEBzLnhtdXQuZWR1LmNu; Fan Yang, eWFuZ0B4bXUuZWR1LmNu