We aimed to develop machine learning-based myopia classification models for each schooling period and to use these models to compare the factors influencing myopia across periods.
Retrospective cross-sectional study.
We collected visual acuity, behavioral, environmental, and genetic data from 7,472 students in 21 primary and secondary schools (grades 1–12) in Jiamusi, Heilongjiang Province, using visual acuity screening and questionnaires.
Machine learning algorithms were used to construct myopia classification models for the whole schooling period and for the primary, junior high, and senior high school periods separately, and to rank the importance of the features in each model.
The main influencing factors differed by school period. The optimal machine learning model for the whole schooling period was Random Forest (AUC = 0.752), with the top three influencing factors being age, the mother's myopic grade, and whether myopia requires glasses. The optimal model for the primary school period was also Random Forest (AUC = 0.710), with the top three influencing factors being the mother's myopic grade, age, and weekly extracurricular tutorials. The optimal model for the junior high school period was a Support Vector Machine (SVM; AUC = 0.672), with the top three influencing factors being gender, weekly extracurricular tutorial subjects, and whether the student observes the “three ones” when reading and writing. The optimal model for the senior high school period was XGBoost (AUC = 0.722), with the top three influencing factors being whether myopia requires glasses, average daily time spent outdoors, and the mother's myopic grade.
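As an illustration of the selection criterion, the AUC can be read as the probability that a randomly chosen myopic student receives a higher predicted score than a randomly chosen non-myopic student. The sketch below computes it directly from that definition and picks the candidate with the highest value. It is a minimal example under stated assumptions, not the study's pipeline: the labels, the scores, and the names `model_a` and `model_b` are hypothetical toy values.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney form).

    labels: 1 for myopic, 0 for non-myopic.
    scores: predicted scores, higher meaning "more likely myopic".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six students and two candidate models.
labels = [1, 0, 1, 1, 0, 0]
candidate_scores = {
    "model_a": [0.9, 0.2, 0.6, 0.4, 0.5, 0.1],
    "model_b": [0.6, 0.7, 0.8, 0.3, 0.2, 0.4],
}

aucs = {name: auc(labels, s) for name, s in candidate_scores.items()}
best_model = max(aucs, key=aucs.get)  # candidate with the highest AUC
print(aucs, best_model)
```

Applied once per school period, a comparison of this kind yields one selected model per period. In practice the scores would come from held-out data rather than from the training set.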
Genetic factors and eye-use behavior both play essential roles in students’ myopia, but their relative importance differs between school periods: genetic factors are more prominent in the lower grades, whereas behavioral factors are more prominent in the higher grades.