About this Research Topic
This Research Topic seeks high-quality research on the classification of DNA, RNA, and amino acid sequences. We also aim to address two recurring problems in biological data analysis.
1. Data imbalance
Most machine learning algorithms assume that the training data are balanced, whereas real-world data are usually imbalanced. This mismatch substantially reduces the reliability and applicability of prediction tools. Methods that address the class imbalance problem should therefore be proposed and their effectiveness compared.
2. Feature embedding models
Sequence data representation is a major factor in model performance, so selecting a suitable embedding model is essential for model training.
We welcome submissions in the research areas of bioinformatics and machine learning. Authors who focus on DNA, RNA, and protein classification, genome assembly, annotation, and functional analysis based on next-generation sequencing data are especially welcome.
The scope of this Research Topic includes, but is not limited to, the following topics:
1. Machine learning
2. Deep learning
3. Data imbalance
4. Sequence analysis
5. Sequence representation
6. Protein sequence embedding
7. Post-translational modifications
8. Sequence modification
Keywords: Machine learning; Deep learning; Biological sequence analysis
Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.