Prediction of Antimicrobial Potential of a Chemically Modified Peptide From Its Tertiary Structure

Agrawal, Piyush; Raghava, Gajendra P. S.

doi:10.3389/fmicb.2018.02551

ORIGINAL RESEARCH article

Front. Microbiol., 26 October 2018

Sec. Antimicrobials, Resistance and Chemotherapy

Volume 9 - 2018 | https://doi.org/10.3389/fmicb.2018.02551

Prediction of Antimicrobial Potential of a Chemically Modified Peptide From Its Tertiary Structure

Piyush Agrawal^1,2

Gajendra P. S. Raghava^2*

¹CSIR-Institute of Microbial Technology, Chandigarh, India
²Center for Computational Biology, Indraprastha Institute of Information Technology, Delhi, New Delhi, India

Designing novel antimicrobial peptides is a hot area of research in the field of therapeutics especially after the emergence of resistant strains against the conventional antibiotics. In the past number of in silico methods have been developed for predicting the antimicrobial property of the peptide containing natural residues. This study describes models developed for predicting the antimicrobial property of a chemically modified peptide. Our models have been trained, tested and evaluated on a dataset that contains 948 antimicrobial and 931 non-antimicrobial peptides, containing chemically modified and natural residues. Firstly, the tertiary structure of all peptides has been predicted using software PEPstrMOD. Structure analysis indicates that certain type of modifications enhance the antimicrobial property of peptides. Secondly, a wide range of features was computed from the structure of these peptides using software PaDEL. Finally, models were developed for predicting the antimicrobial potential of chemically modified peptides using a wide range of structural features of these peptides. Our best model based on support vector machine achieve maximum MCC of 0.84 with an accuracy of 91.62% on training dataset and MCC of 0.80 with an accuracy of 89.89% on validation dataset. To assist the scientific community, we have developed a web server called “AntiMPmod” which predicts the antimicrobial property of the chemically modified peptide. The web server is present at the following link (http://webs.iiitd.edu.in/raghava/antimpmod/).

Introduction

The emergence of drug-resistant pathogenic strains is one of the major threats for the survival of humans and livestock; antibiotics designed to eliminate these pathogens are losing their sensitivity (Price et al., 2012; Veltri et al., 2018). The rapid emergence of the antibiotic resistance has endangered the efficacy of antibiotics, and one of the potential causes of this is the misuse and overuse of antibiotics (Gould and Bal, 2013; Sengupta et al., 2013; Wright, 2014). Hence, there is a need to develop more potent and effective drugs to combat deadly diseases occurring worldwide. In the past few decades, peptide-based therapeutics has been preferred for the drug development over the small molecule-based drugs. Peptide-based drugs are highly selective, efficacious, safer and well tolerated compared to conventional small molecule-based drugs (Otvos and Wade, 2014). Proteins and peptide-based drugs cover around 10% of the pharmaceutical market as per the current report and will continue to grow in future (Bruno et al., 2013; Craik et al., 2013). Currently, more than 239 therapeutic proteins and peptides have been already approved by US-FDA (Fosgerau and Hoffmann, 2015; Usmani et al., 2017) and therefore researchers nowadays are focusing more on peptide-based drugs.

Broadly, peptides can be classified in four classes based on their therapeutic potential; (i) peptides as drug delivery vehicle, (ii) peptides as vaccine candidates, (iii) peptide-based inhibitors, and (iv) peptides-based disease biomarkers. Group I peptide can be used for delivering small molecules or drugs at their targets such as cell penetrating peptides, tumor homing peptides, brain barrier penetrating peptides (Gautam et al., 2012, 2013, 2016; Kapoor et al., 2012; Van Dorpe et al., 2012; Holton et al., 2013; Sharma et al., 2013; Agrawal et al., 2016; Wei et al., 2017; Wolfe et al., 2018). Group II peptides can be used for designing epitope-based vaccines or subunit vaccine; these are generally synthetic peptides or subunits of the whole organism commonly known as epitopes (Singh and Raghava, 2001; Ansari and Raghava, 2010; Singh et al., 2013; Shi et al., 2015; Oyarzún and Kobe, 2016; Alonso-Padilla et al., 2017; Jespersen et al., 2017). Group II peptides are one of the important categories of peptide-based therapeutics and can be clearly seen by the number of in silico methods developed in last decade (Rammensee et al., 1999; Singh and Raghava, 2003; Bhasin and Raghava, 2007; Zhang et al., 2008, 2011; Kringelum et al., 2012; Dhanda et al., 2013; Singh et al., 2013; Jurtz et al., 2017). These peptides generate memory cells and hence are very important nowadays for treating pathogenic infections. Epitopes/peptides are poor immunogens on their own and hence need the assistance of molecules known as adjuvants for increasing its potency (Sayers et al., 2012; Chaudhary et al., 2016; Nagpal et al., 2015, 2017, 2018).

Group III, peptides are inhibitors which can be used as drug molecules or inhibiting activity of drug targets (Eldar-Finkelman and Eisenstein, 2009; Groner et al., 2012; Beekman and Howell, 2016). These peptides kill pathogens by disrupting their cell membranes, by inhibiting their regulatory enzymes or by carrying out lysis (Ivanciuc et al., 2003; Rashid et al., 2009; Pirtskhalava et al., 2016; Wang et al., 2016, 2018; Singh et al., 2018). AMPs represent one of the broadest class of this group, for which number of databases and prediction methods have been developed in order to identify novel peptides which could act as drugs (Saha and Raghava, 2006; Gautam et al., 2014; Mehta et al., 2014; Kumar et al., 2015; Meher et al., 2017; Agrawal et al., 2018). Lastly, Group IV consists of those peptides which could potentially act as a biomarker and can be useful in developing different diagnostic kits (Shao, 2015; Bhalla et al., 2017). For example, peptides obtained from urine have been used as potential biomarkers for identifying multiple diseases (Siwy et al., 2011). Likewise, many computational methods have been created to maintain information related to peptides which could act as biomarkers (Zhang et al., 2006; Bhalla et al., 2017). Despite tremendous potential of peptides, there are many challenges in designing therapeutic peptides that include short half-life, challenges in oral delivery, immunotoxicity, cytotoxicity, etc. To address these issues, a number of computational resources has been developed in last two decades (Gupta et al., 2013; Sharma et al., 2014; Mathur et al., 2016, 2018; Liu et al., 2017; Porto et al., 2017b).

In the past few years, numerous methods have been developed to predict AMPs. Broadly, these methods can be classified in the following two groups (i) General methods and (ii) Class specific methods. The first group includes methods like CAMPR3, APD, AmPEP, and CS-AMPPred which predicts whether the given peptide is AMP or non-AMP (Porto et al., 2012; Wang, 2015; Waghu et al., 2016; Bhadra et al., 2018). CAMPR3 implements four different machine learning techniques for developing a prediction model (Waghu et al., 2016). APD is a physicochemical property based method that predicts AMP from the physicochemical property of the peptide (Wang, 2015). AmPEP is a random forest-based model developed using distribution patterns of amino acid properties along the sequence (Bhadra et al., 2018). CS-AMPPred is a support vector machine (SVM) based AMP prediction method developed for cysteine-stabilized peptides (Porto et al., 2012). The second group, i.e., class specific methods are those methods which are designed to predict peptides that can kill/inhibit specific class of organism and not in general. For example we have methods which predicts and designed peptides which are effective specifically either to bacteria or fungi or viruses or parasites. For example, Antibp and Antibp2 are two widely used SVM based methods developed to predict the antibacterial nature of the given peptide (Lata et al., 2007, 2010). AVPpred is developed for predicting antiviral peptide using machine learning technique SVM; features like the amino acid composition and physiochemical properties were used in this method (Thakur et al., 2012). Similarly, a method called Antifp has been developed for predicting antifungal peptide, it uses features like amino acid composition, binary profile (Agrawal et al., 2018). In addition, there are methods that predict the class of AMP (e.g., antibacterial, antifungal, and antiviral) like ClassAMP (Joseph et al., 2012). Similarly, another method iAMPpred predicts the probability of a peptide as an antibacterial, antifungal, and antiviral by providing the probability score for all the three classes (Meher et al., 2017). In past methods also developed for predicting AMPs in first step and class of AMP in the second step (Xiao et al., 2013).

Despite tremendous advances in the field of prediction of antimicrobial peptides, limited attempt has been made to predict antimicrobial peptides of chemically modified peptides. CS-AMPPred is a only method developed for predicting antimicrobial activity of a specific-type of chemical modification (cysteine-stabilized peptides). Best of our knowledge no method has been developed in past that can predict antimicrobial activity of a modified-peptide, which supports wide range of chemical modifications. In reality, most of the FDA approved therapeutic peptides are chemically modified, as the chemical modification is important for improving the stability of peptides in the body fluid, protection of peptide from the immune system, reducing the toxicity of peptide (Usmani et al., 2017; Al Musaimi et al., 2018). Thus it is need of time to develop a method that can predict antimicrobial inhibition potential of a chemically modified peptide from its tertiary structure. In this study, a systematic attempt had been made to predict AMP potential of a chemically modified peptide.

Materials and Methods

Dataset Creation

Modified AMPs were extracted from the SATPDB database (Singh et al., 2016) which maintains information about more than 19,000 natural and modified peptides. All those peptides which show any modification (terminus, chemical, and D-amino acids), is antimicrobial and whose tertiary structure is present were assigned as modified AMPs. In total, we got 948 such peptides. To develop any prediction method, we need negative dataset also. In our case, we selected those peptides as modified non-AMPs/negative dataset which exhibits any modifications (terminus, chemical, and D-amino acids), is non-antimicrobial in nature and whose tertiary structure is present in the SATPDB database. In the end, we got 931 such peptides. Therefore, we built the dataset of 948 positive peptides and 931 negative peptides.

Internal and External Validation Dataset

The dataset was divided into two parts (i) training and (ii) validation dataset (Kumar et al., 2018). The training or main dataset consists of 80% of the total data, i.e., 758 modified AMPs and 745 non-AMPs. The validation dataset comprises of remaining 20% data, i.e., 190 modified AMPs and 186 non-AMPs. These peptides were selected randomly to avoid any biasness. Training dataset was used for internal validation, where models were trained and tested using fivefold cross-validation technique (Gautam et al., 2013). Performance of the best model achieved using training dataset was evaluated on the validation dataset, in the process commonly known as external validation.

Additional Dataset

Discriminating between peptides which are compositionally similar but show different activity is a challenging task (Loose et al., 2006; Porto et al., 2017a). In order to evaluate the performance of different models developed in this study, we prepared another dataset “Mod_AMP_similar” having compositionally similar modified AMPs and non-AMPs. The positive set consists of those peptides which are present in the validation dataset whereas negative set consists of those peptides which are compositionally similar to the positive peptides. Compositionally similar peptides were identified by computing Euclidean distance between the diatomic composition of two peptides and the peptides having minimum Euclidean distance were selected. This kind of methodology has already been used in earlier studies (Kumar et al., 2008; Agrawal et al., 2018).

Model Development

Feature Computation From Peptide Structures

Atom composition

Atom composition was calculated from modified AMPs and non-AMPs by converting peptides structures in SMILES format using openbabel (O’Boyle et al., 2011). The SMILES were further used to calculate atom composition of following atoms C, H, O, N, S, Cl, Br, and F. The atomic composition is calculated using formula 1 and provides a fixed length of eight vectors.

Fraction of atom (a) = \frac{total number of atom (a)}{total number of all possible atoms} \times 100

(1)

where atom (a) is one out of all eight atoms.

Diatom composition

Diatom composition was computed in a similar manner as atom composition. The diatomic composition provides information about the pairs of atoms in each residue (e.g., C-C, C-O, C-N, etc.) of the peptides. The diatomic composition was computed using formula 2 which provided us a fixed length of 64 (8 × 8) vectors.

Fraction of atom (a) = \frac{total number of diatom (a)}{total number of all possible diatoms} \times 100

(2)

where diatom (a) is one out of all 64 diatoms.

Chemical descriptors

Chemical or Molecular descriptors are terms that represents specific information of a given chemical molecule and determines its biological properties. Chemical descriptors represent the correlation between the physical, chemical and biological properties of a molecule and its chemical constitution in the form of numerical values (Roy et al., 2015). Majority of these chemical descriptors are classified on the basis of their dimensionality, which refers to the molecule representation from which descriptor values are calculated. Broadly, these descriptors are calculated as one dimensional (1D), two dimensional (2D), three dimensional (3D), and fingerprints (Xue and Bajorath, 2000). In the past, researchers have used the molecular descriptors to develop QSAR based prediction methods (Kumar et al., 2015). In our study, we used PaDEL software (Yap, 2011), which is a freely available software for calculating various descriptors of a given molecule. We calculated different types of descriptors which includes 2D descriptors and 10 different types of fingerprints. We performed feature selection technique to remove unnecessary descriptors, since all descriptors don’t correlate with the biological activity of the molecule, hence reducing noise from the dataset.

In this study, feature selection was performed using WEKA software (Data Mining: Practical Machine Learning Tools and Techniques, 2018) at default parameters. We selected “CfsSubsetEval” as an evaluator and “Best First” as a search method. The feature selection was performed in the forward direction with amount of backtracking, N = 5 and lookup size D = 1.

Statistical Analysis

To find out the significant difference between modified AMPs and non-AMPs, we performed the Mann–Whitney–Wilcoxon test, which is a non-parametric test, using in-house R-script on the selected features of 2D descriptors, fingerprints, and combination of 2D descriptors and fingerprints.

Binary Profiles

Differentiating AMPs with non-AMPs with similar peptide sequence is one of the challenging tasks. Although features like the composition and chemical descriptors can differentiate between AMPs and non-AMPs, they are unable to maintain the order of the residues in the peptide. To combat this situation, we converted the peptides into its SMILES format and extracted different numbers of atoms, symbols and both from the N and C terminus. Binary profiles of these atoms and symbols were generated, and prediction models were developed in three different categories. The first category includes only atoms present in the SMILES format, the second profile consists only of symbols, and the third contains the mixture of both. The binary profile was created from terminus (N, C, or both) for the first 25, 50, and 100 elements in case of only atoms and only symbols whereas for both (atom + symbol) first 50, 100, and 200 elements were considered. In the case of only atoms, there were total 8 atoms (C, H, O, N, S, F, Cl, and Br) where the presence of atom was represented by “1” and the absence by “0”, hence generating a vector of N × 8. In case of only symbols, we considered the most commonly occurring symbols (@, +, =, #, [, ],.). These symbols are the chemical notations of a given chemical. For example, “-” is used to represent single bond, “=” is used to represent double bond, “#” is used to represent triple bond and so on. These symbols are represented in such a way so when given as an input, computer can easily understand it. Here also, the presence of symbol was indicated by “1” and the absence by “0”, hence leading to the vector of N × 7. In case of both, atom and symbols as mentioned above were taken, generating the vector of length N × 15. Binary profile generation is explained in Figure 1.

FIGURE 1

FIGURE 1. Feature extraction using SMILES format. Different features were calculated using SMILES format (A) binary profile generation of only atoms, (B) binary profile generation of only symbols, (C) binary profile generation of both symbol and atoms, (D) atom composition, and (E) diatom composition.

Performance Measure

Performance of models were evaluated using different parameters which can be divided into two groups (i) threshold dependent parameters and (ii) threshold independent parameters.

The first group, i.e., threshold dependent parameters include Sensitivity (Sen), Specificity (Spc), Accuracy (Acc), and Matthew’s Correlation Coefficient (MCC). Here Sensitivity is defined as the true positive rate; Specificity is the true negative rate; Accuracy as the ability to differentiate between true positive and true negative whereas MCC is a correlation between observed and predicted value. These can be calculated using equations 3–6.

Sensitivity = \frac{T P}{T P + F N} \times 100

(3)

Specificity = \frac{T N}{T N + F P} \times 100

(4)

Accuracy = \frac{T P + T N}{T P + F P + T N + F N} \times 100

(5)

Matthew’s Correlation Coefficient = \frac{(T P * T N) - (F P * F N)}{\sqrt{(T P + F P) (T P + F N) (T N + F P) (T N + F N)}} \times 100

(6)

where TP and TN represents correctly predicted modified AMPs and non-AMPs, respectively. FP and FN represent wrongly predicted modified AMPs and non-AMPs, respectively.

The second group, i.e., threshold independent parameter includes AUROC, i.e., Area Under Receiver Operating Characteristic.

Results

Analysis

Percent average composition of atoms present in modified AMPs and non-AMPs was computed for understanding the type of atom preference. Overall, the profile was found to be more or less the same in modified AMPs and non-AMPs. AMPs were found to be slightly higher in “C” atom compared to non-AMPs whereas non-AMPs were found to be higher in “S” atom compared to AMPs. Halogens were found to be absent in AMPs and non-AMPs (Figure 2). We also analyzed the diatoms composition and observe that diatom “CC” is dominant in AMPs whereas “NC,” “OC,” “CS,” and “SC” were more abundant in non-AMPs (Figure 3).

FIGURE 2

FIGURE 2. Comparison of atom composition present in modified AMPs and non-AMPs.

FIGURE 3

FIGURE 3. Comparison of diatom composition present in modified AMPs and non-AMPs.

Machine Learning Based Prediction Model

Peptide tertiary structure can present different types of chemical modifications. Therefore, the structure of peptides was utilized to compute the feature and predict its antimicrobial nature. Various machine learning techniques like SVM (Cortes and Vapnik, 1995), Random Forest, Naive Bayes, J48, and SMO were used to develop the prediction model in the study. These models utilize different features for discriminating modified AMPs form non-AMPs. The results are explained below in the following sections:

Composition Based Prediction

We developed prediction models for the atomic and diatomic composition of the peptide using various classifiers. In case of atomic composition, SVM model performed better than other models with an accuracy of 86.83% with MCC of 0.74 on the training dataset and accuracy of 83.51% and MCC of 0.67 on the validation dataset (Table 1). For diatomic composition, Random Forest model achieved the highest accuracy of 89.75% with MCC of 0.80 on training dataset whereas on validation dataset the model showed the accuracy of 87.50% and MCC of 0.75 (Table 2).

TABLE 1

TABLE 1. The performance of atom composition based models developed using different machine learning techniques.

TABLE 2

TABLE 2. The performance of diatom composition based models developed using different machine learning techniques.

Chemical Descriptors Based Prediction

Individual models were developed for 2D descriptors and fingerprints as well as the single model was developed combining features of 2D descriptors and fingerprints. These model were developed on the complete features as well as features obtained after feature selection process (see section “Materials and Methods”). In case of 2D descriptors, initially 231 descriptors were calculated, and SVM based model achieved the highest accuracy of 61.29% with MCC of 0.23 on training dataset and accuracy of 60.90% and MCC of 0.28 on validation dataset (Table 3). We applied feature selection process on these 231 features reducing them to 4. List of these features is provided in Supplementary Table S1. Machine learning techniques were applied on these selected features, and we observed that SVM based model achieved the highest accuracy of 80.68% with MCC of 0.62 on training dataset and accuracy of 79.79% and MCC of 0.60 on validation dataset (Table 3).

TABLE 3

TABLE 3. The performance of 2D descriptors based models developed using different machine learning techniques.

In case of fingerprints initially, we calculated 4812 features and developed the SVM model which shows 91.62% accuracy with 0.84 MCC on training dataset and 89.89% accuracy and 0.80 MCC on the validation dataset (Table 4). We applied feature selection technique on these features reducing them to a total of 18 features (Supplementary Table S2). The SVM based model developed on these 18 features showed the accuracy of 81.77% with MCC of 0.64 on the training dataset and accuracy of 79.26% and MCC of 0.59 on the validation dataset. Therefore, we developed different machine learning models using complete features and reported the performance in Table 4.

TABLE 4

TABLE 4. The performance of fingerprints based models developed using different machine learning techniques.

In case of all combined features (2D descriptors + fingerprints), we calculated 5043 features initially. SVM model developed using complete feature showed the accuracy of 59.59% with MCC of 0.29 on training dataset and accuracy of 59.57% and MCC of 0.28 on the validation dataset. Feature selection technique reduced the number of features from 5043 to 20 (Supplementary Table S3). SVM model developed on these features showed the higher accuracy of 81.76% and MCC of 0.64 on the training dataset, and on the validation dataset, it achieved an accuracy of 82.71% and MCC of 0.65. Performance of other classifiers obtained on these features is provided in Supplementary Table S4. Random Forest performed best among all the models with accuracy of 90.35% and MCC of 0.81 on training dataset and accuracy of 88.56% and MCC of 0.77 on the validation dataset.

Significance of Features

Significant difference was observed between the positive and negative features based on p-values. For most of the features, we found the p-value less than 0.05. Therefore, we can conclude that these features are important and can be used to discriminate between modified AMPs and non-AMPs. Mean value of features (positive and negative) along with their p-value for 2D descriptors, fingerprints and hybrid feature (2D descriptors + fingerprints) is provided in Supplementary Tables S1–S3, respectively.

Binary Profile Based Prediction

In this part of the study, the binary profile was generated using SMILES format, and prediction models were developed in three different categories. In the first category, where only atoms were taken we developed SVM based models for the first 25, 50, and 100 elements from N terminus (N25, N50, and N100), C terminus (C25, C50, and C100) and joining both termini (N25C25, N50C50, and N100C100). We obtained the best performance for the N100C100 binary profile with an accuracy of 89.84% and MCC of 0.80 on training dataset and accuracy of 87.37% and MCC of 0.75 on validation dataset (Table 5). In the second category, we considered only symbols and calculated the binary profile in the same manner as for the first category. Here also, N100C100 binary profile achieved the highest accuracy of 87.42% and MCC of 0.75 on training dataset and accuracy of 80.53% and MCC of 0.61 on the validation dataset (Supplementary Table S5). For the last category, where both symbol and atoms were considered we calculated the binary profile for the first 50, 100, and 200 elements from N-terminus, C-terminus, and by joining elements of both termini. Here, the model developed on N200C200 binary profile performed better than other models with an accuracy of 89.35% and MCC of 0.79 on training dataset and accuracy of 85.86% and MCC of 0.72 on validation dataset (Table 6).

TABLE 5

TABLE 5. The performance of SVM based models developed using binary profile of atoms obtained from terminals of SMILES format.

TABLE 6

TABLE 6. The performance of SVM based models developed using binary profile of atoms and symbols together obtained from terminals of SMILES format.

Additional Dataset Performance

We evaluated the performance of the model developed on the additional dataset termed as “Mod_AMP_similar”. Performance of the SVM models developed on different features like the composition, chemical descriptors and binary profiles is compared in Table 7. It can be clearly seen that model developed using fingerprints with an accuracy of 90.26% and MCC of 0.81 followed by the model developed using N100C100 binary profile where only atoms were considered performed best with an accuracy of 89.66% and MCC of 0.80.

TABLE 7

TABLE 7. The performance of SVM based models developed using different features on additional dataset.

Implementation of the Web Server

To assist the researchers, we have developed a web server named “AntiMPmod” where the best prediction model has been incorporated. The PREDICTION module takes a tertiary structure of the modified peptide (PDB format) as an input for performing prediction. If a user does not have its own modified peptide tertiary structure, user can generate the peptide tertiary structure up to 25 residues in length using the server “PEPstrMOD”¹ (Singh et al., 2015). This server was developed by our group specifically for tertiary structure prediction of the modified peptide. A user can select the desired modification from the wide variety of modification options present at the server. Once the structure is generated, the user can submit the structure in the PREDICTION module and can predict whether the provided peptide was AMP or non-AMP. Another module termed as “DOWNLOAD”, provides the dataset used in this study.

Standalone

In order to assist the researchers, we have also developed the standalone software of AntiMPmod. User needs to pull the docker image “raghavagps/gpsraghava” and can run the software using the PERL code provided inside the folder termed as “gpsr.”

Discussion

Rapidly growing resistance and failure of conventional antibiotics to treat pathogenic infections are one of the serious public health concerns (Komolafe, 2003; Fair and Tor, 2014). In the “post-antibiotic era”, researchers are heading toward the peptide-based antibiotics due to its various advantages over the antibiotics. Natural AMPs because of its various therapeutic properties (bactericidal property, immunomodulatory activity, a broad spectrum of activity, etc.) have rapidly captured attention as novel drug candidates. AMPs are short innate immunity peptides present in almost all living organism and act as a universal host defense molecule. AMPs belong to diverse families which include cathelicidins (Zanetti, 2005), defensins (Lehrer, 2004), cercopins (Boman, 2000), and magainins (Berkowitz et al., 1990). AMPs possess a broad range of properties in terms of their physiochemical properties, composition, 3D structure and mechanism of action. Majority of them are small, positively charged and amphipathic, 4–100 amino acid in length with diverse amino acid composition (Gentilucci et al., 2006; Wang, 2012). Recently, the 3D structure of the natural AMPs has been classified into four broad families (i) α-helical (possess helix), (ii) β-sheet (consists of sheet usually stabilized by disulfide bonds), (iii) αβ (consists of both helix and sheet), and (iv) non-αβ (do not have clearly defined structures) (Fjell et al., 2012). AMPs mostly kill their targets by various mechanisms such as cell membrane damage or pore formation that leads to efflux of nutrients and ions (Melo et al., 2009), DNA interference or signaling responses (Wimley and Hristova, 2011).

Natural ecosystem has been proven a reservoir of a wide variety of compounds that may be explored for the development of potential drug molecule. Researchers have explored several biomes and discovered a large number of AMPs from the microorganism, plants and animals having therapeutic potentials, for example, bovine lactoferrin, LL-37 (de Castro and Franco, 2015; Mahlapuu et al., 2016). Literature is full of such discoveries, and a large number of databases have been developed which maintains a wide variety of information of AMPs (Novković et al., 2012; Pirtskhalava et al., 2016; Waghu et al., 2016; Wang et al., 2016). However, most of the natural AMPs based drug have not reached clinical trials. This is largely due to the high structural complexity of the compound, low compound stability, low activity toward the target, compound side effects, degradation of the compound by the host enzyme, and the high drug development cost (de Castro and Franco, 2015). To overcome the above-mentioned problems, researchers have tried to design the modified compounds by incorporating various chemical modifications such as capping, halogenation, hydroxylation, glycosylation, phosphorylation, designing antimicrobial peptide mimetic, AMP congeners, AMP conjugates, and immobilized AMPs. Details of the different kind of modifications for the novel antimicrobial peptide engineering are reviewed by Wang (2012). Computational methods have shown a wide variety of success in the field of drug discovery process (Dhanda et al., 2017).

In the past, numerous methods have been made for predicting and designing novel AMP, but one of the biggest limitations of these methods is that they can only handle the peptide sequence containing natural residues. In the current study, we have developed a prediction method which predicts the antimicrobial property of a given chemically modified peptide using its tertiary structure. One of the major advantage of using 3D structure over sequence is the inclusion of chemical modification information during prediction which is nearly impossible with sequence based prediction. It is because representing chemical modification in a sequence is a challenging task. Also, molecular descriptors can be calculated easily using 3D structure which covers information of all the chemical properties of a modified peptide in comparison to sequence. These structure-based methods have their own limitations which includes requirement of tertiary structure of peptides. Experimental techniques (e.g., X-ray crystallography, NMR, and cryo-electron microscopy) for determination of peptide structure are time consuming and costly. Computational techniques like molecular dynamics and method like PEPstrMOD for predicting structure from sequence have their own limitations including accuracy and speed of prediction. The overall scheme of the AntiMPmod has been shown in Figure 4. We extracted the modified AMPs and non-AMPs from the SATPDB database and analyzed these structures. We found different kind of modifications such as acetylation, amidation, methylation, glycosylation, and presence of non-natural residues such as ornithine, norleucine, D amino acids, etc. Secondary structure content was analyzed by running DSSP (Kabsch and Sander, 1983; Joosten et al., 2011) and we found that modified AMPs were highly dominated by turns, coils and extended loop regions (∼62.5%) followed by helical content (∼36%) and very little amount of sheet content (∼1.5%). We extracted different kind of features such as composition, chemical descriptors, fingerprints and binary profiles from these modified peptides and used them for developing prediction models using various machine learning classifiers. We found that SVM based model utilizing fingerprints as feature performed best among all the models followed by the model developed using binary profiles. In case of binary profile based models, we observed as the number of terminus elements was increasing their performance too was increasing and when we join the elements of both termini, they performed better than their individual terminus. This suggests that terminus information plays a significant role in predicting the nature of peptide. In addition to this, we created an additional dataset where positive and negative peptides were compositionally similar. We evaluated the performance of different models on this dataset and found that binary profile model which considers only atoms and fingerprint-based model performed best and can classify the modified AMPs and non-AMPs with higher accuracy. Overall summary of the result of this study is given in Table 8 where we have mentioned the best performance obtained by the prediction model on different input features. Performance achieved on the independent dataset by the best models developed using various input features is shown in the Figure 5, where we have calculated AUROC. We implemented our best model in the web server “AntiMPmod” and believes that this study will be helpful for the researchers working in the field of drug discovery.

FIGURE 4

FIGURE 4. Schematic representation of AntiMPmod workflow.

TABLE 8

TABLE 8. The performance of best models developed using different machine learning techniques based on different features.

FIGURE 5

FIGURE 5. The performance of best models on independent dataset, in terms of ROC curves developed using different input features.

Author Contributions

PA collected the data, created the datasets and the back-end server, performed all the experiments, and developed the front end user interface. PA and GPSR analyzed the results and wrote the manuscript. GPSR conceived the idea and coordinated the project.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The authors are thankful to funding agencies Department of Science and Technology (DST-INSPIRE) and J. C. Bose National Fellowship (DST) for fellowships and financial support.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.02551/full#supplementary-material

Footnotes

^http://webs.iiitd.edu.in/raghava/pepstrmod/

References

Agrawal, P., Bhalla, S., Chaudhary, K., Kumar, R., Sharma, M., and Raghava, G. P. S. (2018). In silico approach for prediction of antifungal peptides. Front. Microbiol. 9:323. doi: 10.3389/fmicb.2018.00323

PubMed Abstract | CrossRef Full Text | Google Scholar

Agrawal, P., Bhalla, S., Usmani, S. S., Singh, S., Chaudhary, K., Raghava, G. P. S., et al. (2016). CPPsite 2.0: a repository of experimentally validated cell-penetrating peptides. Nucleic Acids Res. 44, D1098–D1103. doi: 10.1093/nar/gkv1266

PubMed Abstract | CrossRef Full Text | Google Scholar

Al Musaimi, O., Al Shaer, D., de la Torre, B. G., and Albericio, F. (2018). 2017 FDA peptide harvest. Pharmaceuticals 11:E42. doi: 10.3390/ph11020042

PubMed Abstract | CrossRef Full Text | Google Scholar

Alonso-Padilla, J., Lafuente, E. M., and Reche, P. A. (2017). Computer-aided design of an epitope-based vaccine against epstein-barr virus. J. Immunol. Res. 2017:9363750. doi: 10.1155/2017/9363750

PubMed Abstract | CrossRef Full Text | Google Scholar

Ansari, H. R., and Raghava, G. P. (2010). Identification of conformational B-cell Epitopes in an antigen from its primary sequence. Immunome Res. 6:6. doi: 10.1186/1745-7580-6-6

PubMed Abstract | CrossRef Full Text | Google Scholar

Beekman, A. M., and Howell, L. A. (2016). Small-molecule and peptide inhibitors of the pro-survival protein Mcl-1. ChemMedChem 11, 802–813. doi: 10.1002/cmdc.201500497

PubMed Abstract | CrossRef Full Text | Google Scholar

Berkowitz, B. A., Bevins, C. L., and Zasloff, M. A. (1990). Magainins: a new family of membrane-active host defense peptides. Biochem. Pharmacol. 39, 625–629. doi: 10.1016/0006-2952(90)90138-B

PubMed Abstract | CrossRef Full Text | Google Scholar

Bhadra, P., Yan, J., Li, J., Fong, S., and Siu, S. W. I. (2018). AmPEP: sequence-based prediction of antimicrobial peptides using distribution patterns of amino acid properties and random forest. Sci. Rep. 8:1697. doi: 10.1038/s41598-018-19752-w

PubMed Abstract | CrossRef Full Text | Google Scholar

Bhalla, S., Verma, R., Kaur, H., Kumar, R., Usmani, S. S., Sharma, S., et al. (2017). CancerPDF: a repository of cancer-associated peptidome found in human biofluids. Sci. Rep. 7:1511. doi: 10.1038/s41598-017-01633-3

PubMed Abstract | CrossRef Full Text | Google Scholar

Bhasin, M., and Raghava, G. P. S. (2007). A hybrid approach for predicting promiscuous MHC class I restricted T cell epitopes. J. Biosci. 32, 31–42. doi: 10.1007/s12038-007-0004-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Boman, H. G. (2000). Innate immunity and the normal microflora. Immunol. Rev. 173, 5–16. doi: 10.1034/j.1600-065X.2000.917301.x

CrossRef Full Text | Google Scholar

Bruno, B. J., Miller, G. D., and Lim, C. S. (2013). Basics and recent advances in peptide and protein drug delivery. Ther. Deliv. 4, 1443–1467. doi: 10.4155/tde.13.104

PubMed Abstract | CrossRef Full Text | Google Scholar

Chaudhary, K., Nagpal, G., Dhanda, S. K., and Raghava, G. P. S. (2016). Prediction of Immunomodulatory potential of an RNA sequence for designing non-toxic siRNAs and RNA-based vaccine adjuvants. Sci. Rep. 6:20678. doi: 10.1038/srep20678

PubMed Abstract | CrossRef Full Text | Google Scholar

Cortes, C., and Vapnik, V. (1995). Support-vector networks. Mach. Learn. 20, 273–297. doi: 10.1023/A:1022627411411

CrossRef Full Text | Google Scholar

Craik, D. J., Fairlie, D. P., Liras, S., and Price, D. (2013). The future of peptide-based drugs. Chem. Biol. Drug Des. 81, 136–147. doi: 10.1111/cbdd.12055

PubMed Abstract | CrossRef Full Text | Google Scholar

Data Mining: Practical Machine Learning Tools and Techniques (2018). Available at: https://www.cs.waikato.ac.nz/ml/weka/book.html

Google Scholar

de Castro, A. P., and Franco, O. L. (2015). Modifying natural antimicrobial peptides to generate bioinspired antibiotics and devices. Future Med. Chem. 7, 413–415. doi: 10.4155/fmc.15.8

PubMed Abstract | CrossRef Full Text | Google Scholar

Dhanda, S. K., Gupta, S., Vir, P., and Raghava, G. P. S. (2013). Prediction of IL4 inducing peptides. Clin. Dev. Immunol. 2013:263952. doi: 10.1155/2013/263952

CrossRef Full Text | Google Scholar

Dhanda, S. K., Usmani, S. S., Agrawal, P., Nagpal, G., Gautam, A., and Raghava, G. P. S. (2017). Novel in silico tools for designing peptide-based subunit vaccines and immunotherapeutics. Brief. Bioinform. 18, 467–478. doi: 10.1093/bib/bbw025

PubMed Abstract | CrossRef Full Text | Google Scholar

Eldar-Finkelman, H., and Eisenstein, M. (2009). Peptide inhibitors targeting protein kinases. Curr. Pharm. Des. 15, 2463–2470. doi: 10.2174/138161209788682253

CrossRef Full Text | Google Scholar

Fair, R. J., and Tor, Y. (2014). Antibiotics and bacterial resistance in the 21st century. Perspect. Medicin. Chem. 6, 25–64. doi: 10.4137/PMC.S14459

PubMed Abstract | CrossRef Full Text | Google Scholar

Fjell, C. D., Hiss, J. A., Hancock, R. E. W., and Schneider, G. (2012). Designing antimicrobial peptides: form follows function. Nat. Rev. Drug Discov. 11, 37–51. doi: 10.1038/nrd3591

PubMed Abstract | CrossRef Full Text | Google Scholar

Fosgerau, K., and Hoffmann, T. (2015). Peptide therapeutics: current status and future directions. Drug Discov. Today 20, 122–128. doi: 10.1016/j.drudis.2014.10.003

PubMed Abstract | CrossRef Full Text | Google Scholar

Gautam, A., Chaudhary, K., Kumar, R., Sharma, A., Kapoor, P., Tyagi, A., et al. (2013). In silico approaches for designing highly effective cell penetrating peptides. J. Transl. Med. 11:74. doi: 10.1186/1479-5876-11-74

PubMed Abstract | CrossRef Full Text | Google Scholar

Gautam, A., Chaudhary, K., Singh, S., Joshi, A., Anand, P., Tuknait, A., et al. (2014). Hemolytik: a database of experimentally determined hemolytic and non-hemolytic peptides. Nucleic Acids Res. 42, D444–D449. doi: 10.1093/nar/gkt1008

PubMed Abstract | CrossRef Full Text | Google Scholar

Gautam, A., Nanda, J. S., Samuel, J. S., Kumari, M., Priyanka, P., Bedi, G., et al. (2016). Topical delivery of protein and peptide using novel cell penetrating peptide IMT-P8. Sci. Rep. 6:26278. doi: 10.1038/srep26278

PubMed Abstract | CrossRef Full Text | Google Scholar

Gautam, A., Singh, H., Tyagi, A., Chaudhary, K., Kumar, R., Kapoor, P., et al. (2012). CPPsite: a curated database of cell penetrating peptides. Database 2012:bas015. doi: 10.1093/database/bas015

PubMed Abstract | CrossRef Full Text | Google Scholar

Gentilucci, L., Tolomelli, A., and Squassabia, F. (2006). Peptides and peptidomimetics in medicine, surgery and biotechnology. Curr. Med. Chem. 13, 2449–2466. doi: 10.2174/092986706777935041

CrossRef Full Text | Google Scholar

Gould, I. M., and Bal, A. M. (2013). New antibiotic agents in the pipeline and how they can help overcome microbial resistance. Virulence 4, 185–191. doi: 10.4161/viru.22507

PubMed Abstract | CrossRef Full Text | Google Scholar

Groner, B., Weber, A., and Mack, L. (2012). Increasing the range of drug targets: interacting peptides provide leads for the development of oncoprotein inhibitors. Bioengineered 3, 320–325. doi: 10.4161/bioe.21272

PubMed Abstract | CrossRef Full Text | Google Scholar

Gupta, S., Kapoor, P., Chaudhary, K., Gautam, A., Kumar, R., Open Source Drug Discovery Consortium et al. (2013). In silico approach for predicting toxicity of peptides and proteins. PLoS One 8:e73957. doi: 10.1371/journal.pone.0073957

PubMed Abstract | CrossRef Full Text | Google Scholar

Holton, T. A., Pollastri, G., Shields, D. C., and Mooney, C. (2013). CPPpred: prediction of cell penetrating peptides. Bioinformatics 29, 3094–3096. doi: 10.1093/bioinformatics/btt518

PubMed Abstract | CrossRef Full Text | Google Scholar

Ivanciuc, O., Schein, C. H., and Braun, W. (2003). SDAP: database and computational tools for allergenic proteins. Nucleic Acids Res. 31, 359–362. doi: 10.1093/nar/gkg010

CrossRef Full Text | Google Scholar

Jespersen, M. C., Peters, B., Nielsen, M., and Marcatili, P. (2017). BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res. 45, W24–W29. doi: 10.1093/nar/gkx346

PubMed Abstract | CrossRef Full Text | Google Scholar

Joosten, R. P., te Beek, T. A. H., Krieger, E., Hekkelman, M. L., Hooft, R. W. W., Schneider, R., et al. (2011). A series of PDB related databases for everyday needs. Nucleic Acids Res. 39, D411–D419. doi: 10.1093/nar/gkq1105

PubMed Abstract | CrossRef Full Text | Google Scholar

Joseph, S., Karnik, S., Nilawe, P., Jayaraman, V. K., and Idicula-Thomas, S. (2012). ClassAMP: a prediction tool for classification of antimicrobial peptides. IEEE/ACM Trans. Comput. Biol. Bioinform. 9, 1535–1538. doi: 10.1109/TCBB.2012.89

PubMed Abstract | CrossRef Full Text | Google Scholar

Jurtz, V., Paul, S., Andreatta, M., Marcatili, P., Peters, B., and Nielsen, M. (2017). NetMHCpan-4.0: improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. J. Immunol. 199, 3360–3368. doi: 10.4049/jimmunol.1700893

PubMed Abstract | CrossRef Full Text | Google Scholar

Kabsch, W., and Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637. doi: 10.1002/bip.360221211

PubMed Abstract | CrossRef Full Text | Google Scholar

Kapoor, P., Singh, H., Gautam, A., Chaudhary, K., Kumar, R., and Raghava, G. P. S. (2012). TumorHoPe: a database of tumor homing peptides. PLoS One 7:e35187. doi: 10.1371/journal.pone.0035187

PubMed Abstract | CrossRef Full Text | Google Scholar

Komolafe, O. O. (2003). Antibiotic resistance in bacteria - an emerging public health problem. Malawi Med. J. 15, 63–67.

Google Scholar

Kringelum, J. V., Lundegaard, C., Lund, O., and Nielsen, M. (2012). Reliable B cell epitope predictions: impacts of method development and improved benchmarking. PLoS Comput. Biol. 8:e1002829. doi: 10.1371/journal.pcbi.1002829

PubMed Abstract | CrossRef Full Text | Google Scholar

Kumar, M., Thakur, V., and Raghava, G. P. S. (2008). COPid: composition based protein identification. In Silico Biol. 8, 121–128.

PubMed Abstract | Google Scholar

Kumar, R., Chaudhary, K., Singh Chauhan, J., Nagpal, G., Kumar, R., Sharma, M., et al. (2015). An in silico platform for predicting, screening and designing of antihypertensive peptides. Sci. Rep. 5:12512. doi: 10.1038/srep12512

PubMed Abstract | CrossRef Full Text | Google Scholar

Kumar, V., Agrawal, P., Kumar, R., Bhalla, S., Usmani, S. S., Varshney, G. C., et al. (2018). Prediction of cell-penetrating potential of modified peptides containing natural and chemically modified residues. Front. Microbiol. 9:725. doi: 10.3389/fmicb.2018.00725

PubMed Abstract | CrossRef Full Text | Google Scholar

Lata, S., Mishra, N. K., and Raghava, G. P. S. (2010). AntiBP2: improved version of antibacterial peptide prediction. BMC Bioinformatics 11(Suppl. 1):S19. doi: 10.1186/1471-2105-11-S1-S19

PubMed Abstract | CrossRef Full Text | Google Scholar

Lata, S., Sharma, B. K., and Raghava, G. P. S. (2007). Analysis and prediction of antibacterial peptides. BMC Bioinformatics 8:263. doi: 10.1186/1471-2105-8-263

CrossRef Full Text | Google Scholar

Lehrer, R. I. (2004). Primate defensins. Nat. Rev. Microbiol. 2, 727–738. doi: 10.1038/nrmicro976

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, S., Fan, L., Sun, J., Lao, X., and Zheng, H. (2017). Computational resources and tools for antimicrobial peptides. J. Pept. Sci. 23, 4–12. doi: 10.1002/psc.2947

PubMed Abstract | CrossRef Full Text | Google Scholar

Loose, C., Jensen, K., Rigoutsos, I., and Stephanopoulos, G. (2006). A linguistic model for the rational design of antimicrobial peptides. Nature 443, 867–869. doi: 10.1038/nature05233

PubMed Abstract | CrossRef Full Text | Google Scholar

Mahlapuu, M., Håkansson, J., Ringstad, L., and Björn, C. (2016). Antimicrobial peptides: an emerging category of therapeutic agents. Front. Cell. Infect. Microbiol. 6:194. doi: 10.3389/fcimb.2016.00194

CrossRef Full Text | Google Scholar

Mathur, D., Prakash, S., Anand, P., Kaur, H., Agrawal, P., Mehta, A., et al. (2016). PEPlife: a repository of the half-life of peptides. Sci. Rep. 6:36617. doi: 10.1038/srep36617

PubMed Abstract | CrossRef Full Text | Google Scholar

Mathur, D., Singh, S., Mehta, A., Agrawal, P., and Raghava, G. P. S. (2018). In silico approaches for predicting the half-life of natural and modified peptides in blood. PLoS One 13:e0196829. doi: 10.1371/journal.pone.0196829

PubMed Abstract | CrossRef Full Text | Google Scholar

Meher, P. K., Sahu, T. K., Saini, V., and Rao, A. R. (2017). Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general PseAAC. Sci. Rep. 7:42362. doi: 10.1038/srep42362

PubMed Abstract | CrossRef Full Text | Google Scholar

Mehta, D., Anand, P., Kumar, V., Joshi, A., Mathur, D., Singh, S., et al. (2014). ParaPep: a web resource for experimentally validated antiparasitic peptide sequences and their structures. Database 2014:bau051. doi: 10.1093/database/bau051

PubMed Abstract | CrossRef Full Text | Google Scholar

Melo, M. N., Ferre, R., and Castanho, M. A. R. B. (2009). Antimicrobial peptides: linking partition, activity and high membrane-bound concentrations. Nat. Rev. Microbiol. 7, 245–250. doi: 10.1038/nrmicro2095

PubMed Abstract | CrossRef Full Text | Google Scholar

Nagpal, G., Chaudhary, K., Agrawal, P., and Raghava, G. P. S. (2018). Computer-aided prediction of antigen presenting cell modulators for designing peptide-based vaccine adjuvants. J. Transl. Med. 16:181. doi: 10.1186/s12967-018-1560-1

PubMed Abstract | CrossRef Full Text | Google Scholar

Nagpal, G., Chaudhary, K., Dhanda, S. K., and Raghava, G. P. S. (2017). Computational prediction of the immunomodulatory potential of RNA sequences. Methods Mol. Biol. 1632, 75–90. doi: 10.1007/978-1-4939-7138-1_5

PubMed Abstract | CrossRef Full Text | Google Scholar

Nagpal, G., Gupta, S., Chaudhary, K., Dhanda, S. K., Prakash, S., and Raghava, G. P. S. (2015). VaccineDA: prediction, design and genome-wide screening of oligodeoxynucleotide-based vaccine adjuvants. Sci. Rep. 5:12478. doi: 10.1038/srep12478

PubMed Abstract | CrossRef Full Text | Google Scholar

Novković, M., Simunić, J., Bojović, V., Tossi, A., and Juretić, D. (2012). DADP: the database of anuran defense peptides. Bioinformatics 28, 1406–1407. doi: 10.1093/bioinformatics/bts141

PubMed Abstract | CrossRef Full Text | Google Scholar

O’Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., and Hutchison, G. R. (2011). Open Babel: an open chemical toolbox. J. Cheminform. 3:33. doi: 10.1186/1758-2946-3-33

PubMed Abstract | CrossRef Full Text | Google Scholar

Otvos, L., and Wade, J. D. (2014). Current challenges in peptide-based drug discovery. Front. Chem. 2:62. doi: 10.3389/fchem.2014.00062

CrossRef Full Text | Google Scholar

Oyarzún, P., and Kobe, B. (2016). Recombinant and epitope-based vaccines on the road to the market and implications for vaccine design and production. Hum. Vaccin. Immunother. 12, 763–767. doi: 10.1080/21645515.2015.1094595

PubMed Abstract | CrossRef Full Text | Google Scholar

Pirtskhalava, M., Gabrielian, A., Cruz, P., Griggs, H. L., Squires, R. B., Hurt, D. E., et al. (2016). DBAASP v.2: an enhanced database of structure and antimicrobial/cytotoxic activity of natural and synthetic peptides. Nucleic Acids Res. 44, D1104–D1112. doi: 10.1093/nar/gkv1174

PubMed Abstract | CrossRef Full Text | Google Scholar

Porto, W. F., Pires,Á. S., and Franco, O. L. (2017a). Antimicrobial activity predictors benchmarking analysis using shuffled and designed synthetic peptides. J. Theor. Biol. 426, 96–103. doi: 10.1016/j.jtbi.2017.05.011

PubMed Abstract | CrossRef Full Text | Google Scholar

Porto, W. F., Pires, A. S., and Franco, O. L. (2017b). Computational tools for exploring sequence databases as a resource for antimicrobial peptides. Biotechnol. Adv. 35, 337–349. doi: 10.1016/j.biotechadv.2017.02.001

PubMed Abstract | CrossRef Full Text | Google Scholar

Porto, W. F., Pires,Á. S., and Franco, O. L. (2012). CS-AMPPred: an updated SVM model for antimicrobial activity prediction in cysteine-stabilized peptides. PLoS One 7:e51444. doi: 10.1371/journal.pone.0051444

PubMed Abstract | CrossRef Full Text | Google Scholar

Price, L. B., Stegger, M., Hasman, H., Aziz, M., Larsen, J., Andersen, P. S., et al. (2012). Staphylococcus aureus CC398: host adaptation and emergence of methicillin resistance in livestock. mBio 3:e00305-11. doi: 10.1128/mBio.00305-11

PubMed Abstract | CrossRef Full Text | Google Scholar

Rammensee, H., Bachmann, J., Emmerich, N. P., Bachor, O. A., and Stevanović, S. (1999). SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 50, 213–219. doi: 10.1007/s002510050595

PubMed Abstract | CrossRef Full Text | Google Scholar

Rashid, M., Singla, D., Sharma, A., Kumar, M., and Raghava, G. P. S. (2009). Hmrbase: a database of hormones and their receptors. BMC Genomics 10:307. doi: 10.1186/1471-2164-10-307

PubMed Abstract | CrossRef Full Text | Google Scholar

Roy, K., Kar, S., Das, R. N., Roy, K., Kar, S., and Das, R. N. (eds). (2015). “Chemical information and descriptors,” in Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment, (Boston, MA: Academic Press), 47–80. doi: 10.1016/B978-0-12-801505-6.00002-8

CrossRef Full Text | Google Scholar

Saha, S., and Raghava, G. P. S. (2006). AlgPred: prediction of allergenic proteins and mapping of IgE epitopes. Nucleic Acids Res. 34, W202–W209. doi: 10.1093/nar/gkl343

PubMed Abstract | CrossRef Full Text | Google Scholar

Sayers, S., Ulysse, G., Xiang, Z., and He, Y. (2012). Vaxjo: a web-based vaccine adjuvant database and its application for analysis of vaccine adjuvants and their uses in vaccine development. J. Biomed. Biotechnol. 2012:831486. doi: 10.1155/2012/831486

PubMed Abstract | CrossRef Full Text | Google Scholar

Sengupta, S., Chattopadhyay, M. K., and Grossart, H.-P. (2013). The multifaceted roles of antibiotics and antibiotic resistance in nature. Front. Microbiol. 4:47. doi: 10.3389/fmicb.2013.00047

CrossRef Full Text | Google Scholar

Shao, C. (2015). Urinary protein biomarker database: a useful tool for biomarker discovery. Adv. Exp. Med. Biol. 845, 195–203. doi: 10.1007/978-94-017-9523-4_19

PubMed Abstract | CrossRef Full Text | Google Scholar

Sharma, A., Kapoor, P., Gautam, A., Chaudhary, K., Kumar, R., Chauhan, J. S., et al. (2013). Computational approach for designing tumor homing peptides. Sci. Rep. 3:1607. doi: 10.1038/srep01607

PubMed Abstract | CrossRef Full Text | Google Scholar

Sharma, A., Singla, D., Rashid, M., and Raghava, G. P. S. (2014). Designing of peptides with desired half-life in intestine-like environment. BMC Bioinformatics 15:282. doi: 10.1186/1471-2105-15-282

PubMed Abstract | CrossRef Full Text | Google Scholar

Shi, J., Zhang, J., Li, S., Sun, J., Teng, Y., Wu, M., et al. (2015). Epitope-based vaccine target screening against highly pathogenic MERS-CoV: an in silico approach applied to emerging infectious diseases. PLoS One 10:e0144475. doi: 10.1371/journal.pone.0144475

PubMed Abstract | CrossRef Full Text | Google Scholar

Singh, A., Deshpande, N., Pramanik, N., Jhunjhunwala, S., Rangarajan, A., and Atreya, H. S. (2018). Optimized peptide based inhibitors targeting the dihydrofolate reductase pathway in cancer. Sci. Rep. 8:3190. doi: 10.1038/s41598-018-21435-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Singh, H., Ansari, H. R., and Raghava, G. P. S. (2013). Improved method for linear B-cell epitope prediction using antigen’s primary sequence. PLoS One 8:e62216. doi: 10.1371/journal.pone.0062216

PubMed Abstract | CrossRef Full Text | Google Scholar

Singh, H., and Raghava, G. P. (2001). ProPred: prediction of HLA-DR binding sites. Bioinformatics 17, 1236–1237. doi: 10.1093/bioinformatics/17.12.1236

CrossRef Full Text | Google Scholar

Singh, H., and Raghava, G. P. S. (2003). ProPred1: prediction of promiscuous MHC Class-I binding sites. Bioinformatics 19, 1009–1014. doi: 10.1093/bioinformatics/btg108

PubMed Abstract | CrossRef Full Text | Google Scholar

Singh, S., Chaudhary, K., Dhanda, S. K., Bhalla, S., Usmani, S. S., Gautam, A., et al. (2016). SATPdb: a database of structurally annotated therapeutic peptides. Nucleic Acids Res. 44, D1119–D1126. doi: 10.1093/nar/gkv1114

PubMed Abstract | CrossRef Full Text | Google Scholar

Singh, S., Singh, H., Tuknait, A., Chaudhary, K., Singh, B., Kumaran, S., et al. (2015). PEPstrMOD: structure prediction of peptides containing natural, non-natural and modified residues. Biol. Direct 10:73. doi: 10.1186/s13062-015-0103-4

PubMed Abstract | CrossRef Full Text | Google Scholar

Siwy, J., Mullen, W., Golovko, I., Franke, J., and Zürbig, P. (2011). Human urinary peptide database for multiple disease biomarker discovery. Proteomics Clin. Appl. 5, 367–374. doi: 10.1002/prca.201000155

PubMed Abstract | CrossRef Full Text | Google Scholar

Thakur, N., Qureshi, A., and Kumar, M. (2012). AVPpred: collection and prediction of highly effective antiviral peptides. Nucleic Acids Res. 40, W199–W204. doi: 10.1093/nar/gks450

PubMed Abstract | CrossRef Full Text | Google Scholar

Usmani, S. S., Bedi, G., Samuel, J. S., Singh, S., Kalra, S., Kumar, P., et al. (2017). THPdb: database of FDA-approved peptide and protein therapeutics. PLoS One 12:e0181748. doi: 10.1371/journal.pone.0181748

PubMed Abstract | CrossRef Full Text | Google Scholar

Van Dorpe, S., Bronselaer, A., Nielandt, J., Stalmans, S., Wynendaele, E., Audenaert, K., et al. (2012). Brainpeps: the blood-brain barrier peptide database. Brain Struct. Funct. 217, 687–718. doi: 10.1007/s00429-011-0375-0

PubMed Abstract | CrossRef Full Text | Google Scholar

Veltri, D., Kamath, U., and Shehu, A. (2018). Deep learning improves antimicrobial peptide recognition. Bioinformatics 34, 2740–2747. doi: 10.1093/bioinformatics/bty179

PubMed Abstract | CrossRef Full Text | Google Scholar

Waghu, F. H., Barai, R. S., Gurung, P., and Idicula-Thomas, S. (2016). CAMPR3: a database on sequences, structures and signatures of antimicrobial peptides. Nucleic Acids Res. 44, D1094–D1097. doi: 10.1093/nar/gkv1051

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, G. (2012). Post-translational modifications of natural antimicrobial peptides and strategies for peptide engineering. Curr. Biotechnol. 1, 72–79. doi: 10.2174/2211550111201010072

CrossRef Full Text | Google Scholar

Wang, G. (2015). Improved methods for classification, prediction, and design of antimicrobial peptides. Methods Mol. Biol. 1268, 43–66. doi: 10.1007/978-1-4939-2285-7_3

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, G., Li, X., and Wang, Z. (2016). APD3: the antimicrobial peptide database as a tool for research and education. Nucleic Acids Res. 44, D1087–D1093. doi: 10.1093/nar/gkv1278

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, J., Yin, T., Xiao, X., He, D., Xue, Z., Jiang, X., et al. (2018). StraPep: a structure database of bioactive peptides. Database 2018:bay038. doi: 10.1093/database/bay038

PubMed Abstract | CrossRef Full Text | Google Scholar

Wei, L., Xing, P., Su, R., Shi, G., Ma, Z. S., and Zou, Q. (2017). CPPred-RF: a sequence-based predictor for identifying cell-penetrating peptides and their uptake efficiency. J. Proteome Res. 16, 2044–2053. doi: 10.1021/acs.jproteome.7b00019

PubMed Abstract | CrossRef Full Text | Google Scholar

Wimley, W. C., and Hristova, K. (2011). Antimicrobial peptides: successes, challenges and unanswered questions. J. Membr. Biol. 239, 27–34. doi: 10.1007/s00232-011-9343-0

PubMed Abstract | CrossRef Full Text | Google Scholar

Wolfe, J. M., Fadzen, C. M., Choo, Z.-N., Holden, R. L., Yao, M., Hanson, G. J., et al. (2018). Machine learning to predict cell-penetrating peptides for antisense delivery. ACS Cent. Sci. 4, 512–520. doi: 10.1021/acscentsci.8b00098

PubMed Abstract | CrossRef Full Text | Google Scholar

Wright, G. D. (2014). Something old, something new: revisiting natural products in antibiotic drug discovery. Can. J. Microbiol. 60, 147–154. doi: 10.1139/cjm-2014-0063

PubMed Abstract | CrossRef Full Text | Google Scholar

Xiao, X., Wang, P., Lin, W.-Z., Jia, J.-H., and Chou, K.-C. (2013). iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types. Anal. Biochem. 436, 168–177. doi: 10.1016/j.ab.2013.01.019

PubMed Abstract | CrossRef Full Text | Google Scholar

Xue, L., and Bajorath, J. (2000). Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening. Comb. Chem. High Throughput Screen. 3, 363–372. doi: 10.2174/1386207003331454

CrossRef Full Text | Google Scholar

Yap, C. W. (2011). PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J. Comput. Chem. 32, 1466–1474. doi: 10.1002/jcc.21707

PubMed Abstract | CrossRef Full Text | Google Scholar

Zanetti, M. (2005). The role of cathelicidins in the innate host defenses of mammals. Curr. Issues Mol. Biol. 7, 179–196.

Google Scholar

Zhang, G. L., DeLuca, D. S., Keskin, D. B., Chitkushev, L., Zlateva, T., Lund, O., et al. (2011). MULTIPRED2: a computational system for large-scale identification of peptides predicted to bind to HLA supertypes and alleles. J. Immunol. Methods 374, 53–61. doi: 10.1016/j.jim.2010.11.009

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, H., Loriaux, P., Eng, J., Campbell, D., Keller, A., Moss, P., et al. (2006). UniPep–a database for human N-linked glycosites: a resource for biomarker discovery. Genome Biol. 7:R73. doi: 10.1186/gb-2006-7-8-R73

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, Q., Wang, P., Kim, Y., Haste-Andersen, P., Beaver, J., Bourne, P. E., et al. (2008). Immune epitope database analysis resource (IEDB-AR). Nucleic Acids Res. 36, W513–W518. doi: 10.1093/nar/gkn254

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: chemically modified peptides, antimicrobial peptide prediction, machine learning technique, resistance, fingerprints, peptide therapeutics

Citation: Agrawal P and Raghava GPS (2018) Prediction of Antimicrobial Potential of a Chemically Modified Peptide From Its Tertiary Structure. Front. Microbiol. 9:2551. doi: 10.3389/fmicb.2018.02551

Received: 08 August 2018; Accepted: 05 October 2018;
Published: 26 October 2018.

Edited by:

Octavio Luiz Franco, Universidade Católica de Brasília, Brazil

Reviewed by:

William Farias Porto, Universidade Católica Dom Bosco, Brazil
Piyush Baindara, National Centre for Biological Sciences, India

Copyright © 2018 Agrawal and Raghava. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Gajendra P. S. Raghava, cmFnaGF2YUBpaWl0ZC5hYy5pbg==

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Prediction of Antimicrobial Potential of a Chemically Modified Peptide From Its Tertiary Structure

Introduction

Materials and Methods

Dataset Creation

Internal and External Validation Dataset

Additional Dataset

Model Development

Feature Computation From Peptide Structures

Atom composition

Diatom composition

Chemical descriptors

Statistical Analysis

Binary Profiles

Performance Measure

Results

Analysis

Machine Learning Based Prediction Model

Composition Based Prediction

Chemical Descriptors Based Prediction

Significance of Features

Binary Profile Based Prediction

Additional Dataset Performance

Implementation of the Web Server

Standalone

Discussion

Author Contributions

Conflict of Interest Statement

Acknowledgments

Supplementary Material

Footnotes

References

94% of researchers rate our articles as excellent or good

94% of researchers rate our articles as excellent or good