Classification of autism spectrum disorder using electroencephalography in Chinese children: a cross-sectional retrospective study

Ke, Si Yang; Wu, Huiwen; Sun, Haoqi; Zhou, Aiqin; Liu, Jianhua; Zheng, Xiaoyun; Liu, Kevin; Westover, M. Brandon; Xu, Haiqing; Kong, Xue-jun

doi:10.3389/fnins.2024.1330556

ORIGINAL RESEARCH article

Front. Neurosci., 25 January 2024

Sec. Neurodevelopment

Volume 18 - 2024 | https://doi.org/10.3389/fnins.2024.1330556

A correction has been applied to this article in: Corrigendum: Classification of autism spectrum disorder using electroencephalography in Chinese children: a cross-sectional retrospective study

Si Yang Ke¹†
Huiwen Wu²†
Haoqi Sun¹,³
Aiqin Zhou²
Jianhua Liu⁴
Xiaoyun Zheng²
Kevin Liu¹,⁵
M. Brandon Westover³,⁶‡
Haiqing Xu²*‡
Xue-jun Kong¹,⁷*‡

¹Athinoula A. Martinos Center, Massachusetts General Hospital, Charlestown, MA, United States
²Hubei Maternity and Child Health Hospital, Wuhan, Hubei, China
³Department of Neurology, Beth Israel Deaconess Medical Center, Boston, MA, United States
⁴Huangshi Maternity and Child Health Care Hospital, Huangshi, Hubei, China
⁵Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
⁶Department of Neurology, Massachusetts General Hospital, Boston, MA, United States
⁷Department of Psychiatry, Beth Israel Deaconess Medical Center, Boston, MA, United States

Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by diverse clinical features. EEG biomarkers such as spectral power and functional connectivity have emerged as potential tools for enhancing early diagnosis and understanding of the neural processes underlying ASD. However, existing studies yield conflicting results, necessitating a comprehensive, data-driven analysis. We conducted a retrospective cross-sectional study involving 246 children with ASD and 42 control children. EEG was collected, and diverse EEG features, including spectral power and spectral coherence, were extracted. Statistical inference methods, coupled with machine learning models, were employed to identify differences in EEG features between the ASD and control groups and to develop classification models for diagnostic purposes. Our analysis revealed statistically significant differences in spectral coherence, particularly in the gamma and beta frequency bands, indicating elevated long-range functional connectivity between frontal and parietal regions in the ASD group. Machine learning models achieved modest classification performance, with an ROC-AUC of 0.65. While machine learning approaches offer some discriminative power in classifying individuals with ASD from controls, the results also indicate the need for further refinement.

1 Introduction

Autism spectrum disorder (ASD) is characterized by a range of neurodevelopmental disabilities affecting an individual’s social interactions, communication, behavior, learning, and overall functioning [National Institute of Mental Health (NIMH), 2022]. The diagnosis of ASD currently relies primarily on developmental and behavioral screenings, as there are no definitive medical tests available [Centers for Disease Control and Prevention (CDC), 2022]. Researchers have recently turned their attention to neuroimaging modalities, such as electroencephalography (EEG), as a source of biomarkers to enhance ASD diagnosis. These EEG biomarkers hold promise as objective tools for early screening and intervention, shedding light on the underlying neural processes associated with ASD (Gurau et al., 2017). Several studies have explored EEG characteristics, such as spectral power and functional connectivity across various frequency bands, in children with ASD (Brihadiswaran et al., 2019; Das et al., 2023). Although certain studies consistently demonstrate notable distinctions in spectral power and functional connectivity between autistic and typically developing children, conflicting results have also been reported (Garcés et al., 2022). These inconsistencies can be seen even in the well-supported U-shaped profile of electrophysiological power changes noted in children with ASD as compared to controls (Wang et al., 2013). As described by Wang and colleagues, the U-shaped profile of EEG power encompasses increased spectral power in the high-frequency (beta, gamma) and low-frequency (delta, theta) bands, with reduced power in the alpha band. However, several studies have revealed inconsistencies in this U-shaped pattern of EEG spectral power, with some reporting reduced power in ASD individuals and others reporting enhanced (Zhao et al., 2023) or unaffected (Liao et al., 2022) alpha power in most brain regions.

In addition, EEG coherence studies of ASD individuals demonstrate similar inconsistencies across various frequency bands, with alterations in local and long-distance coherence within and between different brain regions and hemispheres (Milovanovic and Grujicic, 2021). Thus, overall, no consensus has been achieved within the ASD research community. This underscores the importance of conducting a data-driven analysis on a sizable sample of young children with ASD, aiming to identify features specific to EEG functional connectivity/coherence, their relationship with ASD core symptoms, the underlying neurobiological basis, and their potential value for early ASD diagnosis and subtyping.

In this study, we analyzed resting-state EEG from a total of 288 children, comprising 246 with ASD and 42 controls. Our approach extracted diverse EEG features related to spectral power, functional connectivity, complexity, and time domain statistics. Our primary objective was to identify differences in these EEG features between the ASD and control groups, employing statistical inference and machine learning methods.

2 Methods

2.1 Cohort and study design

The study was a retrospective cross-sectional study conducted over a period of 3 years and 1 month, from October 2013 to November 2016, at the Hubei Provincial Women and Children’s Hospital in Wuhan, China. The study initially included a total of 362 participants with clinically diagnosed ASD and 57 healthy control participants. ASD participants were recruited from children who were visiting the neurology department for ASD and other neurological disorders, whereas control participants were recruited from children who came to the hospital for routine physicals (i.e., well-child visits). ASD diagnoses were made by a team of two clinicians based on the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (APA, 2013) and the Childhood Autism Rating Scale, second edition (Schopler, 2010). Participants with neurological organic lesions, genetic metabolic diseases, hearing impairment, or psychiatric disorders were excluded from the study. A total of 121 participants who had their eyes closed were excluded because of the heterogeneous electrophysiology between individuals with eyes open and those with eyes closed; the substantially higher number of participants with eyes open also contributed to this decision. Table 1 provides information on the final cohort of 288 participants included in the analysis.

Table 1. Participant characteristics.

2.2 Institutional review board approvals

Informed consent for all participants included in the study was given by their parents or guardians, subject to oversight and approval by the ethics committee at Hubei Women and Children’s Hospital. The research protocol for data analysis in this study was submitted to the Massachusetts General Hospital Institutional Review Board (IRB) on 18 August 2022. The protocol received approval under the IRB Number: 2022P002152.

2.3 EEG data acquisition

The EEG data acquisition process utilized BrainMaster Discovery 24 system, designed to capture EEG signals spanning from DC 0 Hz to 80 Hz, boasting 24-bit precision. Application of up to 24 scalp electrodes was meticulously performed using collodion and adhering to precise measurement protocols. The sampling frequency was 256 Hz. The participants were required to sit in a comfortable chair in a quiet room for about 5 min. EEG recordings were conducted during resting but wakeful state and no visual cues were presented to the participants. Participants were instructed to close their eyes during the recording; however, due to the young age of the participants, most were unable to follow the instructions and keep their eyes closed for the entire duration of the EEG measurement. The subsequent analytical focus was centered on 19 channels available for all participants with channel placement following the international 10–20 system. This set of channels encompassed Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, and O2 with A1 being the reference channel at left mastoid.

2.4 Data preprocessing

The EEG preprocessing process was primarily carried out using the MNE-python package version 1.3.1 (Larson et al., 2022). First, a notch filter was employed to eliminate the 50 Hz line noise caused by powerline interference. Following this step, a bandpass filter ranging from 0.5 Hz to 42 Hz was applied. Next, average referencing was applied as the re-referencing method.

To address problematic channels, EEGLAB tool (Delorme and Makeig, 2004) was employed. Channels were designated as “bad” and consequently removed under specific criteria: channels with flat activity for over 5 s, those exhibiting a high-frequency noise standard deviation lower than 4.5 μV, and channels displaying a correlation lower than 0.7 with neighboring channels (SCCN, 2023). Subsequently, EEG waveforms were plotted in the time domain and subjected to visual inspection. As a result of the visual inspection, 6 participants characterized by extreme EEG artifacts were excluded from subsequent analyses.

The EEG waveforms were then divided into 10-s epochs, each with a 2-s overlap. Any epochs deemed “bad” were excluded from further analysis if their maximum peak-to-peak signal amplitude exceeded 10,000 μV or their minimum peak-to-peak signal amplitude fell below 0.1 μV. Finally, 4 participants with total length less than 1 min after removal of bad epochs were excluded from subsequent analyses. The final number of EEG samples that went into the analysis was 246 samples for ASD and 42 samples for control.
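The epoching and rejection rules above can be sketched in plain Python. This is a minimal illustration and not the authors' pipeline: the function names are hypothetical, and it assumes that the peak-to-peak criteria are evaluated per channel, so that one offending channel is enough to reject an epoch.

```python
def epoch_starts(n_samples, sfreq=256, epoch_s=10.0, overlap_s=2.0):
    """Return start sample indices of fixed-length, overlapping epochs."""
    length = int(round(epoch_s * sfreq))               # samples per epoch
    step = int(round((epoch_s - overlap_s) * sfreq))   # hop between epoch starts
    return list(range(0, n_samples - length + 1, step))


def is_bad_epoch(epoch, max_ptp_uv=10_000.0, min_ptp_uv=0.1):
    """Flag an epoch given as a list of per-channel sample lists (microvolts).

    Assumption: the epoch is bad if any channel's peak-to-peak amplitude
    exceeds max_ptp_uv or falls below min_ptp_uv (a near-flat channel).
    """
    ptps = [max(ch) - min(ch) for ch in epoch]
    return max(ptps) > max_ptp_uv or min(ptps) < min_ptp_uv


# A 5-minute recording sampled at 256 Hz.
starts = epoch_starts(n_samples=5 * 60 * 256)
print(len(starts), starts[:3])
```

With a 10-s window and a 2-s overlap, consecutive epochs start 8 s apart, so a 5-min recording yields 37 full epochs before any rejection.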

2.5 Feature extraction

Because our approach is data-driven, we included a variety of EEG features. Five frequency bands were defined: delta (1–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), and gamma (30–42 Hz). EEG features were extracted from each epoch and averaged across all epochs, yielding 1,046 features in total from the EEG of each participant. Spectral power for each combination of band and channel was first obtained using multi-taper spectral estimation (Slepian, 1978) with 7 tapers, giving a frequency resolution of 0.4 Hz. Subsequently, relative spectral power was calculated by normalizing the spectral power by the total power (0.5–42 Hz), yielding values between 0 and 1. A total of 95 relative spectral power features were computed.
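One reading that is consistent with the reported 7 tapers and 0.4 Hz resolution is given below. It assumes the common multitaper convention K = 2TW − 1 for the number of tapers K, with W the half-bandwidth, and takes T as the 10-s epoch length; the text does not state the time-bandwidth product, so this is an assumption. The second line writes out the relative-power normalization for band b and channel c, where P_c(f) is a hypothetical symbol for the power spectrum of channel c.

```latex
K = 2TW - 1 = 7 \;\Rightarrow\; TW = 4,
\qquad
W = \frac{TW}{T} = \frac{4}{10\,\mathrm{s}} = 0.4\,\mathrm{Hz}

P_{\mathrm{rel}}(b, c) = \frac{\sum_{f \in b} P_c(f)}{\sum_{0.5 \le f \le 42\,\mathrm{Hz}} P_c(f)},
\qquad 0 \le P_{\mathrm{rel}}(b, c) \le 1
```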

Channel-to-channel spectral coherence was calculated for each of the 5 bands. 171 channel-to-channel spectral coherence features were computed for each band and a total of 855 spectral coherence features were computed for all 5 bands in total. The computation of spectral coherence features was done using the spectral_connectivity_epochs() function from the mne_connectivity package (MNE-Connectivity, 2023).

Several common statistical features in time-domain—sample entropy, skew, kurtosis, standard deviation, and mean—were all calculated for each of the 19 channels using respective functions from the mne-features package (Schiratti et al., 2018). This yielded a total of 95 common time-domain statistical features. Lastly, a binary feature was computed to check whether a participant’s EEG has alpha band using a method (Corcoran et al., 2018) and associated software (Corcoran et al., 2019) that quantifies individual alpha frequency.
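The feature counts reported in this section can be checked with a few lines of arithmetic. The sketch below only reproduces the bookkeeping (19 channels, 5 bands, 5 time-domain statistics, 1 binary alpha flag); the variable names are illustrative.

```python
from itertools import combinations

CHANNELS = ["Fp1", "Fp2", "F7", "F3", "Fz", "F4", "F8", "T3", "C3", "Cz",
            "C4", "T4", "T5", "P3", "Pz", "P4", "T6", "O1", "O2"]
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 42)}
TIME_STATS = ["sample_entropy", "skew", "kurtosis", "std", "mean"]

# Relative spectral power: one value per (band, channel).
n_power = len(BANDS) * len(CHANNELS)

# Spectral coherence: one value per (band, unordered channel pair).
pairs = list(combinations(CHANNELS, 2))
n_coherence = len(BANDS) * len(pairs)

# Time-domain statistics: one value per (statistic, channel).
n_time = len(TIME_STATS) * len(CHANNELS)

# Binary flag for the presence of an alpha peak.
n_alpha_flag = 1

total = n_power + n_coherence + n_time + n_alpha_flag
print(n_power, len(pairs), n_coherence, n_time, total)
```

The 171 coherence features per band are the number of unordered pairs among 19 channels, and the four groups sum to the 1,046 features stated above.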

2.6 Statistical inference

Here, the EEG feature served as the outcome variable, reflecting the dependent variable within the regression model. On the other hand, ASD/non-ASD represented the exposure variable, serving as the independent variable in the regression analysis. Additionally, demographic variables were integrated as covariates in the model to reduce the effects of confounding.

To further mitigate potential confounding effects stemming from the demographic variables, a propensity model based on logistic regression was executed. This facilitated an optimal full match (Hansen and Klopfer, 2006) procedure through the utilization of the MatchIt package (Ho et al., 2011) in R. The resultant matching weights obtained from the optimal full match were then incorporated as propensity score weights (Greifer, 2023) within the regression models using the Survey package (Lumley, 2023) in R. The entire process encompassed the execution of a total of 1,046 regression models, each corresponding to a distinct EEG feature that had been extracted. For a detailed explanation of the optimal full match and the utilization of matching weights in the regression models, please refer to Supplementary Data File 1.

Within this framework, logarithmic transformations were applied to EEG feature variables with strictly positive values, while EEG feature variables ranging between 0 and 1 underwent logit transformations. The coefficients linked to the exposure variable (ASD/non-ASD) were computed for every regression model, accompanied by their respective p-values. Specifically, EEG features whose exposure variable coefficients had a Bonferroni-adjusted p-value lower than 0.1 were designated as statistically significant. Figure 1 depicts the entire inference workflow.
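As a rough illustration of the transformations and the multiple-comparison threshold described above, the following sketch uses only the standard library. The function names are hypothetical, and the adjustment shown is the textbook Bonferroni rule (multiply by the number of tests, cap at 1), which is assumed to match what was applied.

```python
import math

N_TESTS = 1046   # one regression model per EEG feature
ALPHA = 0.1      # significance level applied to adjusted p-values


def log_transform(x):
    """For strictly positive features."""
    return math.log(x)


def logit_transform(p):
    """For features strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def bonferroni(p_raw, n_tests=N_TESTS):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_raw * n_tests)


# Raw p-value a feature must beat to be significant after adjustment.
raw_cutoff = ALPHA / N_TESTS
print(f"raw cutoff = {raw_cutoff:.2e}")
print(bonferroni(5e-5) < ALPHA, bonferroni(0.01) < ALPHA)
```

Under this rule, an adjusted threshold of 0.1 across 1,046 tests corresponds to a raw p-value of roughly 9.6e-5.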

Figure 1. Statistical inference workflow. ASD represents the binary variable for ASD/control. Transformation applied to EEG feature is dependent on feature type.

2.7 Machine learning classification

After the statistical inference, leave-one-out cross-validation (LOOCV) was performed, with the entire inference step nested within each split. That is, during each split, the inference step was applied to the training set to select the top K features ranked by Bonferroni-adjusted p-values. We chose K = 15 as a trade-off between (1) including as few features as possible to maintain a reasonable ratio between sample size and feature size to avoid overfitting; and (2) including as many features as possible to allow for the following nested forward feature selection.

The nested forward feature selection involved ridge logistic regression using the Scikit-learn package (Pedregosa et al., 2012). It was performed on each LOOCV training split using the top 15 features selected from the nested inference step. Notably, each ridge logistic regression was, in fact, a nested grid search to find the optimal alpha (i.e., regularization strength) that maximizes ROC-AUC. Moreover, to address the strong class imbalance, the minority class was upweighted in the loss function of the ridge logistic regression model. The features that achieved the best nested cross-validation performance were then used to fit another ridge logistic regression model, employing grid search on the training split to determine the optimal alpha. Subsequently, this model was used for inference on the single sample left out. We have included code to perform the aforementioned LOOCV procedure in Supplementary Data File 2.
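The structure of this procedure, with feature selection repeated inside every training split, can be sketched without any library. This is a schematic and not the code in Supplementary Data File 2: `select_features`, `fit`, and `predict` are hypothetical placeholders for the nested inference step and the ridge logistic regression, and the "balanced" weighting formula is an assumed way of upweighting the minority class.

```python
def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def loocv_splits(n_samples):
    """Yield (train_indices, test_index), holding out one sample at a time."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], i


def nested_loocv(X, y, select_features, fit, predict, k=15):
    """LOOCV in which feature selection only ever sees the training rows.

    select_features(X_train, y_train, k) -> list of column indices
    fit(X_train, y_train, weights)       -> fitted model
    predict(model, x_row)                -> score for the held-out row
    """
    scores = []
    for train, test in loocv_splits(len(y)):
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        cols = select_features(X_tr, y_tr, k)      # no access to the test row
        weights = balanced_class_weights(y_tr)     # recomputed per split
        model = fit([[row[c] for c in cols] for row in X_tr], y_tr, weights)
        scores.append(predict(model, [X[test][c] for c in cols]))
    return scores


# Class weights implied by the cohort sizes (1 = ASD, 0 = control).
labels = [1] * 246 + [0] * 42
print(balanced_class_weights(labels))
```

Because the held-out row is never passed to `select_features` or `fit`, the selection step cannot leak information from the test sample into the model, which is the point of nesting.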

2.8 Univariate classification of ASD using individual EEG features

In addition, each of the statistically significant features from the statistical inference step was used as a standalone feature to classify ASD/non-ASD for male and female participants separately. An optimal threshold was chosen on the receiver operating characteristic (ROC) curve as the point closest to the top-left corner of the ROC plot, which corresponds to perfect sensitivity and specificity.
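A minimal sketch of the closest-to-top-left rule is shown below. It assumes that higher feature values indicate ASD (scores at or above the threshold are classified as ASD), and the coherence values are made-up toy numbers rather than study data.

```python
import math


def closest_to_top_left(scores, labels):
    """Return (threshold, sensitivity, specificity) for the ROC point
    nearest to perfect classification (sensitivity = specificity = 1).

    Scores >= threshold are classified as positive (label 1).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        dist = math.hypot(1 - sens, 1 - spec)   # distance to the top-left corner
        if best is None or dist < best[0]:
            best = (dist, thr, sens, spec)
    return best[1:]


# Toy coherence values: 1 = ASD, 0 = control.
coherence = [0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
is_asd = [0, 0, 1, 0, 1, 1, 1, 1]
print(closest_to_top_left(coherence, is_asd))
```

Only observed feature values are tried as candidate thresholds, which is sufficient because sensitivity and specificity change only at those values.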

3 Results

3.1 Comparison of EEG features in ASD vs. control

The covariates were balanced after optimal full matching (Supplementary Table 1), which reduced confounding bias in the covariates. Out of the 1,046 regression models analyzed in the matched cohort, only 10 exhibited a Bonferroni-adjusted p-value of less than 0.1 for the coefficient related to the binary ASD/non-ASD variable (Table 2). To facilitate a better understanding of the 0.1 significance level, we present histograms illustrating the distribution of raw p-values and their corresponding adjustments in Supplementary Figure 1. The 10 statistically significant EEG features consisted exclusively of spectral coherence features, all revealing elevated channel-to-channel spectral coherence within the ASD group. Notably, most instances of channel-to-channel spectral coherence were concentrated in the gamma and beta bands, as depicted in Figure 2. Within the gamma band, enhanced spectral coherence was evident among pairs like P4 and C3, P4 and Fz, P4 and F4, Pz and Fz, as well as Pz and Fz. Similarly, in the beta band, augmented spectral coherence was observable between P4 and C3, P4 and Fz, and Pz and Fz. Moreover, the alpha and delta bands showed statistically significant spectral coherence between Pz and T5. No statistically significant spectral coherence was observed in the theta band.

Table 2. Statistically significant EEG features.

Figure 2. Statistically significant channel-to-channel spectral coherence features plotted on standard 10–20 montage. All connection lines indicate hyperconnectivity.

3.2 Machine learning classification of ASD using EEG features

The areas under the receiver operating characteristic curve (ROC-AUC) from the LOOCV were 0.6 (95% confidence interval 0.50–0.69), 0.64 (0.54–0.73), and 0.65 (0.54–0.74), respectively, for the baseline model that only included age and sex as features, the model that used inverse propensity score weighting (IPW) in the nested inference step for feature selection, and the model that used optimal full match in the nested inference step for feature selection (Figure 3A). The areas under the precision-recall curve (PRC-AUC) from LOOCV for the three models were 0.89 (0.83–0.93), 0.91 (0.86–0.95), and 0.90 (0.84–0.94), respectively (Figure 3B). The high PRC-AUCs are due to the 6 to 1 class imbalance of ASD participants (class 1) to control participants (class 0). Our models with EEG features resulted in, at best, only a 0.05 increase in ROC-AUC and a 0.02 increase in PRC-AUC. Moreover, the positive predictive values were (0.89, 0.90, 0.91) and the negative predictive values were (0.24, 0.24, 0.29) for the baseline model, the model that used optimal full match in the nested inference step for feature selection, and the model that used IPW in the nested inference step for feature selection, respectively. Figure 4 depicts the confusion matrices from the LOOCV for the three models.

Figure 3. ROC curves and PRC curves resulting from machine learning LOOCV. (A) ROC curves. (B) PRC curves. Baseline model represents a model that only uses age and sex as features. IPW and full match (i.e., optimal full match) represent different methods used in the nested inference step for feature selection. No skill model in panel (B) represents a model that classifies every participant as ASD. Bootstrapped confidence intervals for the curves are depicted by the shaded colors.

Figure 4. Confusion matrices resulting from machine learning LOOCV. (A) Confusion matrix for baseline model. (B) Confusion matrix for the model that uses IPW in the nested inference step for feature selection. (C) Confusion matrix for the model that uses full match in the nested inference step for feature selection.

3.3 Univariate classification of ASD using individual EEG features

Since machine learning classification using EEG features as predictive features only moderately enhanced model performance, we performed univariate analysis of the top 10 statistically significant EEG features to classify ASD/non-ASD for male and female participants separately. Supplementary Figure 2 depicts statistically significant spectral coherence feature plotted against CARS numeric total score for male and female separately, with the vertical line indicating the optimal cutoff. Although there is no statistically significant correlation between the 10 spectral coherence features with CARS numeric total score after Bonferroni correction, several spectral coherence features did exhibit relatively good discriminative performance for specific sex. For example, gamma band spectral coherence between Pz and Fz achieved sensitivity of 0.7 and specificity of 0.64 for male participants (Supplementary Figure 2J), and delta band spectral coherence between Pz and T5 achieved sensitivity of 0.75 and specificity of 0.78 for female participants (Supplementary Figure 2E).

4 Discussion

We extracted 1,046 features from the EEG of 246 children with ASD and 42 control children in a relaxed eyes-open condition. Statistical inference revealed only increased functional connectivity, which was concentrated mostly in the gamma and beta frequency bands but also appeared in the delta and alpha bands. Machine learning classification using a LOOCV framework showed that the statistically significant EEG features only moderately enhanced classification performance (0.05 increase in AUC).

Among the 10 statistically significant channel-to-channel spectral coherence features we measured, five pertain to the gamma band. This aligns with multiple studies that have previously reported gamma band abnormalities in individuals with ASD (Sheikhani et al., 2012; Rojas and Wilson, 2014). We observed increased long-range gamma coherence between frontal and parietal lobe channels for the ASD participants in both the statistical inference approach and the univariate approach. Although most previous studies have primarily reported reduced long-range spectral coherence in ASD (Perez Velazquez et al., 2009; Khan et al., 2013), a few studies have reported solely increased coherence at both short and long range (Sheikhani et al., 2012; Wang et al., 2020). Long-range connectivity is a higher-level brain function that allows different brain regions to quickly coordinate and integrate information, enabling coherent behavioral and cognitive responses (Wang et al., 2020). The increased coherence observed in ASD participants may indicate a failure of developmentally appropriate pruning or die-back and may interfere with normal cortical processing. Alternatively, the increased coherence may represent a compensatory attempt of the autistic brain, leading to the formation of atypical, spatially disparate cortical networks that replace functions normally subserved by more localized networks that are assumed to be deficient (Duffy and Als, 2012). Moreover, weak central coherence (WCC) is a characteristic, built-in cognitive style of ASD (Happé and Frith, 2006; Bojda et al., 2021) that presents as difficulty in forming a meaningful whole from similar parts, such as generalizing a forest from individual trees. In other words, individuals with ASD remain intensely focused on certain details, such as an individual tree, which may be reflected in hyperconnectivity or increased coherence between certain brain regions in response to specific stimuli.
This could be the root of restricted interests and repetitive behaviors (RRB), one of the two core symptoms of ASD. RRB is critical for ASD diagnosis (Hyman et al., 2020), and early restricted interest in non-social objects contributes to the other core symptom of ASD, social communication deficit (Sasson and Touchstone, 2014). Two previous studies indicated that increased α connectivity at 14 months was associated with later ASD diagnosis and RRB severity (Orekhova et al., 2014; Haartsen et al., 2019). A recent study showed a strong association between EEG functional connectivity and RRBs and suggested its potential use as a biomarker to differentiate individuals with and without ASD (Sun et al., 2023). The natural links between certain obsessive attentions and their corresponding hyperconnectivity pathways in different brain regions could provide guidance for early diagnosis and subtyping.

Other studies have used machine learning to classify ASD vs. control groups using resting-state EEG recordings. One study used features derived from recurrence quantification analysis (Heunis et al., 2018) to classify ASD vs. controls and achieved high (0.97) accuracy in a sample with an approximately 1:1 ratio of ASD to control participants (Bosl et al., 2017). Another study combined spectral power and eye-tracking features for classification and achieved a good (0.93) AUC (Kang et al., 2020). However, a similar study that used power spectrum and functional connectivity features for classification achieved poor results (accuracy 47%–57%; Garcés et al., 2022). These disparities in classification performance could be attributed to differences in EEG instruments, processing pipelines, feature extraction procedures, sample sizes, participant demographics, and machine learning methodologies. One methodology used in our study worth noting is the combination of optimal full matching and outcome regression analysis in the statistical inference. Matching is a non-parametric way of ensuring that the treated and control groups are similar in terms of subpopulation demographics, which reduces confounding bias during group comparison of EEG features. After matching, the use of outcome regression increases precision in the effect estimates. Moreover, nesting the inference step inside LOOCV as the feature selection method facilitates a feature selection that is based on statistical significance without leaking information into the test splits.

In the future, we plan to use novel machine learning methods, such as pre-training a deep learning foundation model on large publicly available EEG datasets and then fine-tuning it for ASD classification on this dataset.

4.1 Limitations

Several limitations exist in the study. First, the female sample is drastically smaller than the male sample size (44 vs. 244). There is also a big relative imbalance in sex where 87% of the participants with ASD are male and only 13% of ASD participants are female. Although propensity score matching was used to balance the subpopulation and demographic covariates were adjusted for in the outcome regression models, potential bias could still exist. Moreover, the choice of excluding eyes-closed participants significantly reduced our participant sample size for data analysis. In addition, the study follows a cross-sectional study design which has inherent weaknesses such as being prone to sampling bias and potential confounders (Wang and Cheng, 2020). Furthermore, the recruitment strategy resulted in a substantial disparity between ASD and control participants, with a disproportionate representation of ASD participants due to challenges in obtaining consent from parents of children in the control group. The reluctance of parents with healthy children to participate led to a significantly lower number of control participants, introducing a substantial 6 to 1 class imbalance. The class imbalance could have significant impact on both the machine learning model performance and the regression coefficient estimates during statistical inference, due to the potential of the models biasing the majority class (Luque et al., 2019).
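To make the class imbalance concrete, the short calculation below works out the ASD prevalence in the final cohort and the metrics of a "no skill" rule that labels every participant as ASD. It uses only the cohort sizes reported in this article; the variable names are illustrative.

```python
n_asd, n_control = 246, 42
n_total = n_asd + n_control

prevalence = n_asd / n_total   # fraction of participants with ASD
ratio = n_asd / n_control      # ASD-to-control ratio

# Confusion counts for a rule that classifies everyone as ASD.
tp, fp, tn, fn = n_asd, n_control, 0, 0
precision = tp / (tp + fp)     # equals the prevalence
recall = tp / (tp + fn)        # every ASD participant is flagged
specificity = tn / (tn + fp)   # no control is ever identified

print(f"prevalence={prevalence:.3f} ratio={ratio:.2f}")
print(f"precision={precision:.3f} recall={recall:.1f} specificity={specificity:.1f}")
```

Because precision for this trivial rule already equals the prevalence (about 0.85), precision-based summaries start from a high floor in this cohort, and metrics that depend on the control class, such as specificity and negative predictive value, are the more informative ones.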

Another major limitation of the paper is not having a held-out test set (external validation) that was never used in any model training, tuning, and feature selection. We used nested LOOCV during machine learning classification of ASD due to the small sample size; however, it poses risks for potential leakage, overfitting, and inflated estimation of model performance. The risk of overfitting and inflated performance from cross validation has been reported in multiple systematic reviews of machine learning classification in neurological and developmental disorders (Pulini et al., 2019; Vabalas et al., 2019). Even though we nested feature selection in the training splits of LOOCV, our approach could still be considered a form of circular analysis or “peeking” (Pulini et al., 2019). This is because statistical analysis was performed on the entire dataset prior to performing machine learning classification and statistical analysis helped to inform the rough number (i.e., threshold) of features to pass to the nested forward selection step during LOOCV. Thus, the reported machine learning classification performance in this paper could be inflated.

Lastly, CARS does not have sub-scores for social deficits and RRB as the Autism Diagnostic Observation Schedule (ADOS) does, so we were unable to directly correlate the increased coherence we observed with RRB, although we believe the two are very likely correlated.

Data availability statement

The dataset used in this research is protected information that is not publicly available. Requests to access the datasets should be directed to xkong1@mgh.harvard.edu.

Ethics statement

The studies involving humans were approved by Ethics Committee at Hubei Women and Children’s Hospital and Massachusetts General Hospital Institutional Review Board. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants’ legal guardians/next of kin. Written informed consent was obtained from the individual(s), and minor(s)’ legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

Author contributions

SK: Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft. HW: Data curation, Writing – review & editing. HS: Investigation, Methodology, Software, Writing – review & editing. AZ: Data curation, Writing – review & editing. JL: Data curation, Writing – review & editing. XZ: Data curation, Writing – review & editing. KL: Project administration, Writing – review & editing. MW: Methodology, Supervision, Writing – review & editing. HX: Funding acquisition, Supervision, Writing – review & editing. X-jK: Funding acquisition, Project administration, Supervision, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. The data collection part of the study was funded by Hubei Province Natural Science Foundation, grant number 2012FFA064. The data analysis part of the study was funded by Massachusetts General Hospital, grant number 233263.

Conflict of interest

MW has private equity as co-founder of Beacon Biosignals and receives compensation for consulting and scientific advisory roles.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2024.1330556/full#supplementary-material

References

APA (2013). The diagnostic and statistical manual of mental disorders (fifth). Arlington, Virginia, USA: American Psychiatric Association.

Google Scholar

Bojda, A., Srebnicki, T., Konowałek, Ł., and Bryńska, S. (2021). A. Weak central coherence—construct conception, development, research methods. Psychiatr. Pol. 55, 1373–1386. doi: 10.12740/PP/OnlineFirst/120931

PubMed Abstract | Crossref Full Text | Google Scholar

Bosl, W. J., Loddenkemper, T., and Nelson, C. A. (2017). Nonlinear EEG biomarker profiles for autism and absence epilepsy. Neuropsychiatr Electrophysiol 3:1. doi: 10.1186/s40810-017-0023-x

Crossref Full Text | Google Scholar

Brihadiswaran, G., Haputhanthri, D., Gunathilaka, S., Meedeniya, D., and Jayarathna, S. (2019). EEG-based processing and classification methodologies for autism Spectrum disorder: A review. J. Comput. Sci. 15, 1161–1183. doi: 10.3844/jcssp.2019.1161.1183

Crossref Full Text | Google Scholar

Centers for Disease Control and Prevention (CDC) (2022). "Screening and Diagnosis|Autism Spectrum Disorder (ASD)|NCBDDD." Available at: https://www.cdc.gov/ncbddd/autism/screening.html (Accessed 31 March 2022).

Google Scholar

Corcoran, A., Alday, P., and Bornkessel-Schlesewsky, I. (2019). Corcorana/restingIAF: RestingIAF v1.0.3 (v1.0.3) [computer software] Zenodo.

Google Scholar

Corcoran, A. W., Alday, P. M., Schlesewsky, M., and Bornkessel-Schlesewsky, I. (2018). Toward a reliable, automated method of individual alpha frequency (IAF) quantification. Psychophysiology 55:e13064. doi: 10.1111/psyp.13064

PubMed Abstract | Crossref Full Text | Google Scholar

Das, S., Zomorrodi, R., Mirjalili, M., Kirkovski, M., Blumberger, D. M., Rajji, T. K., et al. (2023). Machine learning approaches for electroencephalography and magnetoencephalography analyses in autism spectrum disorder: A systematic review. Prog. Neuropsychopharmacol. Biol. Psychiatry 123:110705. doi: 10.1016/j.pnpbp.2022.110705

PubMed Abstract | Crossref Full Text | Google Scholar

Delorme, A., and Makeig, S. (2004). EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J. Neurosci. Methods 134, 9–21. doi: 10.1016/j.jneumeth.2003.10.009

PubMed Abstract | Crossref Full Text | Google Scholar

Duffy, F. H., and Als, H. (2012). A stable pattern of EEG spectral coherence distinguishes children with autism from neuro-typical controls—A large case control study. BMC Med. 10:64. doi: 10.1186/1741-7015-10-64

PubMed Abstract | Crossref Full Text | Google Scholar

Garcés, P., Baumeister, S., Mason, L., Chatham, C. H., Holiga, S., Dukart, J., et al. (2022). Resting state EEG power spectrum and functional connectivity in autism: A cross-sectional analysis. Mol. Autism. 13:22. doi: 10.1186/s13229-022-00500-x

PubMed Abstract | Crossref Full Text | Google Scholar

Greifer, N. (2023). Matching Weights are Propensity Score Weights | Noah Greifer. Available at: https://ngreifer.github.io/blog/matching-weights/ (Accessed 6 September 2023).

Google Scholar

Gurau, O., Bosl, W. J., and Newton, C. R. (2017). How useful is electroencephalography in the diagnosis of autism Spectrum disorders and the delineation of subtypes: A systematic review. Front. Psych. 8:121. doi: 10.3389/fpsyt.2017.00121

PubMed Abstract | Crossref Full Text | Google Scholar

Haartsen, R., Jones, E. J. H., Orekhova, E. V., Charman, T., and Johnson, M. H.BASIS team (2019). Functional EEG connectivity in infants associates with later restricted and repetitive behaviours in autism; a replication study. Transl. Psychiatry 9:66. doi: 10.1038/s41398-019-0380-2

PubMed Abstract | Crossref Full Text | Google Scholar

Hansen, B. B., and Klopfer, S. O. (2006). Optimal full matching and related designs via network flows. J. Comput. Graph. Stat. 15, 609–627. doi: 10.1198/106186006X137047

Crossref Full Text | Google Scholar

Happé, F., and Frith, U. (2006). The weak coherence account: detail-focused cognitive style in autism spectrum disorders. J. Autism Dev. Disord. 36, 5–25. doi: 10.1007/s10803-005-0039-0

PubMed Abstract | Crossref Full Text | Google Scholar

Heunis, T., Aldrich, C., Peters, J. M., Jeste, S. S., Sahin, M., Scheffer, C., et al. (2018). Recurrence quantification analysis of resting state EEG signals in autism spectrum disorder – a systematic methodological exploration of technical and demographic confounders in the search for biomarkers. BMC Med. 16:101. doi: 10.1186/s12916-018-1086-7

PubMed Abstract | Crossref Full Text | Google Scholar

Ho, D. E., Imai, K., King, G., and Stuart, E. A. (2011). MatchIt: nonparametric preprocessing for parametric causal inference. J. Stat. Softw. 42:i08. doi: 10.18637/jss.v042.i08

Crossref Full Text | Google Scholar

Hyman, S. L., Levy, S. E., and Myers, S. M. (2020). Identification, evaluation, and Management of Children with Autism Spectrum Disorder. Pediatrics 145:e20193447. doi: 10.1542/peds.2019-3447

Crossref Full Text | Google Scholar

Kang, J., Han, X., Song, J., Niu, Z., and Li, X. (2020). The identification of children with autism spectrum disorder by SVM approach on EEG and eye-tracking data. Comput. Biol. Med. 120:103722. doi: 10.1016/j.compbiomed.2020.103722

PubMed Abstract | Crossref Full Text | Google Scholar

Khan, S., Gramfort, A., Shetty, N. R., Kitzbichler, M. G., Ganesan, S., Moran, J. M., et al. (2013). Local and long-range functional connectivity is reduced in concert in autism spectrum disorders. Proc. Natl. Acad. Sci USA. 110, 3107–3112. doi: 10.1073/pnas.1214533110

Crossref Full Text | Google Scholar

Larson, E., Gramfort, A., Engemann, D. A., Leppakangas, J., Brodbeck, C., Jas, M., et al. (2022). MNE-Python (1.2.3) [computer software] Zenodo.

Google Scholar

Liao, M., Duan, H., and Wang, G. (2022). Application of machine learning techniques to detect the children with autism Spectrum disorder. J Healthcare Engineer 2022, 9340027–9340010. doi: 10.1155/2022/9340027

PubMed Abstract | Crossref Full Text | Google Scholar

Lumley, T. (2023). Survey: analysis of complex survey samples (4.2) [R]. Available at: https://cran.r-project.org/web/packages/survey/survey.pdf

Google Scholar

Luque, A., Carrasco, A., Martín, A., and Las Heras, A. (2019). The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recogn. 91, 216–231. doi: 10.1016/j.patcog.2019.02.023

Crossref Full Text | Google Scholar

Milovanovic, M., and Grujicic, R. (2021). Electroencephalography in assessment of autism Spectrum disorders: A review. Front. Psych. 12:6021. doi: 10.3389/fpsyt.2021.686021

PubMed Abstract | Crossref Full Text | Google Scholar

MNE-Connectivity. MNE-Connectivity 0.6.0documentation (2023). Available at: https://mne.tools/mne-connectivity/dev/ (Accessed 25 August 2023).

Google Scholar

National Institute of Mental Health (NIMH) (2022). "Autism Spectrum Disorder." Available at: https://www.nimh.nih.gov/health/publications/autism-spectrum-disorder (Accessed 10 February 2023).

Google Scholar

Orekhova, E. V., Elsabbagh, M., Jones, E. J., Dawson, G., Charman, T., Johnson, M. H., et al. (2014). EEG hyper-connectivity in high-risk infants is associated with later autism. J. Neurodev. Disord. 6:40. doi: 10.1186/1866-1955-6-40

Crossref Full Text | Google Scholar

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2012). Scikit-learn: machine learning in Python. arXiv [Preprint].

Google Scholar

Perez Velazquez, J. L., Barcelo, F., Hung, Y., Leshchenko, Y., Nenadovic, V., Belkas, J., et al. (2009). Decreased brain coordinated activity in autism spectrum disorders during executive tasks: Reduced long-range synchronization in the fronto-parietal networks. Int. J. Psychophysiol. 73, 341–349.

Google Scholar

Pulini, A. A., Kerr, W. T., Loo, S. K., and Lenartowicz, A. (2019). Classification accuracy of neuroimaging biomarkers in attention-deficit/hyperactivity disorder: effects of sample size and circular analysis. Biol Psychiatry Cogn Neurosci Neuroimaging 4, 108–120. doi: 10.1016/j.bpsc.2018.06.003

PubMed Abstract | Crossref Full Text | Google Scholar

Rojas, D. C., and Wilson, L. B. (2014). Gamma-band abnormalities as markers of autism spectrum disorders. Biomark. Med. 8, 353–368. doi: 10.2217/bmm.14.15

PubMed Abstract | Crossref Full Text | Google Scholar

Sasson, N. J., and Touchstone, E. W. (2014). Visual attention to competing social and object images by preschool children with autism spectrum disorder. J. Autism Dev. Disord. 44, 584–592. doi: 10.1007/s10803-013-1910-z

PubMed Abstract | Crossref Full Text | Google Scholar

SCCN. (2023). Makoto’s preprocessing pipeline—SCCN. Available at: https://sccn.ucsd.edu/wiki/Makoto's_preprocessing_pipeline (Accessed 31 August 2023)

Google Scholar

Schiratti, J.-B., Le Douget, J.-E., Le Van Quyen, M., Essid, S., and Gramfort, A. (2018). An ensemble learning approach to detect epileptic seizures from long intracranial EEG recordings. In: 2018 IEEE international conference on Acousticmes, speech and signal processing (ICASSP), 856–860.

Google Scholar

Schopler, E. (2010). The childhood autism rating scale, second edition (CARS2): Manual. Torrance, CA, United States: Western Psychological Services.

Google Scholar

Sheikhani, A., Behnam, H., Mohammadi, M. R., Noroozian, M., and Mohammadi, M. (2012). Detection of abnormalities for diagnosing of children with autism disorders using of quantitative electroencephalography analysis. J. Med. Syst. 36, 957–963. doi: 10.1007/s10916-010-9560-6

PubMed Abstract | Crossref Full Text | Google Scholar

Slepian, D. (1978). Prolate spheroidal wave functions, fourier analysis, and uncertainty - V: the discrete case. Bell Syst. tech. j. 57, 1371–1430. doi: 10.1002/j.1538-7305.1978.tb02104.x

Crossref Full Text | Google Scholar

Sun, B., Wang, B., Wei, Z., Feng, Z., Wu, Z. L., Yassin, W., et al. (2023). Identification of diagnostic markers for ASD: a restrictive interest analysis based on EEG combined with eye tracking. Front. Neurosci. 17:1236637. doi: 10.3389/fnins.2023.1236637

PubMed Abstract | Crossref Full Text | Google Scholar

Vabalas, A., Gowen, E., Poliakoff, E., and Casson, A. J. (2019). Machine learning algorithm validation with a limited sample size. PloS One 14, –e0224365. doi: 10.1371/journal.pone.0224365

PubMed Abstract | Crossref Full Text | Google Scholar

Wang, J., Barstein, J., Ethridge, L. E., Mosconi, M. W., Takarae, Y., and Sweeney, J. A. (2013). Resting state EEG abnormalities in autism spectrum disorders. J. Neurodev. Disord. 5:24. doi: 10.1186/1866-1955-5-24

PubMed Abstract | Crossref Full Text | Google Scholar

Wang, X., and Cheng, Z. (2020). Cross-sectional studies: strengths, weaknesses, and recommendations. Chest 158, S65–S71. doi: 10.1016/j.chest.2020.03.012

Crossref Full Text | Google Scholar

Wang, J., Wang, X., Wang, X., Zhang, H., Zhou, Y., Chen, L., et al. (2020). Increased EEG coherence in long-distance and short-distance connectivity in children with autism spectrum disorders. Brain Behav. 10:e01796. doi: 10.1002/brb3.1796

PubMed Abstract | Crossref Full Text | Google Scholar

Zhao, Q., Luo, Y., Mei, X., and Shao, Z. (2023). Resting-state EEG patterns of preschool-aged boys with autism spectrum disorder: A pilot study. Appl. Neuropsychol. Child 12, 1–8. doi: 10.1080/21622965.2023.2211702

PubMed Abstract | Crossref Full Text | Google Scholar

Keywords: autism spectrum disorder, electroencephalography, machine learning, spectral power, functional connectivity, coherence

Citation: Ke SY, Wu H, Sun H, Zhou A, Liu J, Zheng X, Liu K, Westover MB, Xu H and Kong X-j (2024) Classification of autism spectrum disorder using electroencephalography in Chinese children: a cross-sectional retrospective study. Front. Neurosci. 18:1330556. doi: 10.3389/fnins.2024.1330556

Received: 31 October 2023; Accepted: 09 January 2024;
Published: 25 January 2024.

Edited by:

Zonglei Zhen, Beijing Normal University, China

Reviewed by:

Jing Li, Tianjin University of Technology, China
Yanli Zhang-James, Upstate Medical University, United States

Copyright © 2024 Ke, Wu, Sun, Zhou, Liu, Zheng, Liu, Westover, Xu and Kong. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xue-jun Kong, xkong1@mgh.harvard.edu; Haiqing Xu, xuhaiqing9@126.com

^†These authors share first authorship

^‡These authors share senior authorship
