REVIEW article

Front. Bioinform.
Sec. Genomic Analysis
Volume 4 - 2024 | doi: 10.3389/fbinf.2024.1457619

A review of model evaluation metrics for machine learning in genetics and genomics

Provisionally accepted
  • 1 Liggins Institute, The University of Auckland, Auckland, Auckland, New Zealand
  • 2 Maurice Wilkins Centre for Molecular Biodiscovery, Faculty of Science, The University of Auckland, Auckland, Auckland, New Zealand
  • 3 MRC Lifecourse Epidemiology Unit, Faculty of Medicine, University of Southampton, Southampton, Hampshire, United Kingdom
  • 4 Singapore Institute for Clinical Sciences (A*STAR), Singapore, Singapore

The final, formatted version of the article will be published soon.

    Machine learning (ML) has shown great promise in genetics and genomics, where large and complex datasets have the potential to provide insight into many aspects of disease risk, the pathogenesis of genetic disorders, and the prediction of health and well-being. However, this potential comes with a responsibility to guard against biases and inflated results, which can have harmful unintended impacts. Researchers must therefore understand the metrics used to evaluate ML models, because these metrics influence the critical interpretation of results. In this review, we provide an overview of ML metrics for clustering, classification, and regression, and highlight the advantages and disadvantages of each. We also detail common pitfalls that occur during model evaluation. Finally, we provide examples of how researchers can assess and utilise the results of ML models, specifically from a genomics perspective.

    Keywords: metrics, machine learning, genomics prediction, clustering, classification, regression, disease prediction

    Received: 01 Jul 2024; Accepted: 27 Aug 2024.

    Copyright: © 2024 Miller, Portlock, Nyaga and O'Sullivan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Catriona Miller, Liggins Institute, The University of Auckland, Auckland, 1142, Auckland, New Zealand
    Justin M. O'Sullivan, Liggins Institute, The University of Auckland, Auckland, 1142, Auckland, New Zealand
