AUTHOR=Arcot Desai Sharanya , Tcheng Thomas , Morrell Martha TITLE=Non-linear Embedding Methods for Identifying Similar Brain Activity in 1 Million iEEG Records Captured From 256 RNS System Patients JOURNAL=Frontiers in Big Data VOLUME=5 YEAR=2022 URL=https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2022.840508 DOI=10.3389/fdata.2022.840508 ISSN=2624-909X ABSTRACT=
Finding electrophysiological features that are similar across patients with epilepsy may help identify, for a given patient, treatment options that worked in other patients with similar brain activity patterns. Three non-linear iEEG (intracranial electroencephalogram) embedding methods for finding similar cross-patient iEEG records in a large iEEG dataset were developed and compared. About 1 million iEEG records from 256 patients with drug-resistant focal onset seizures who were treated in prospective trials of the RNS System were used for analyses. Data from 200, 25, and 31 patients were randomly selected to be in the training, validation, and test datasets, respectively. In method 1, a ResNet50 convolutional neural network (CNN) model pre-trained on the ImageNet dataset was used for extracting feature maps from spectrogram images of iEEG records (ImageNet-ResNet). In method 2, a ResNet50 model custom-trained on an iEEG classification task using ~138,000 manually labeled iEEG records was used as the feature extractor (ESC-ResNet). Feature maps were passed through dimensionality reduction, and the k nearest neighbors were found in the reduced feature space. In method 3, a 256-dimensional iEEG embedding space was learned via contrastive learning by training a ResNet50 model with triplet training sets generated using within-patient iEEG clustering (CL-ResNet). All three methods had comparable performance when identifying iEEG records from the search dataset similar to test iEEG records of baseline (non-seizure) and interictal spiking activity. Epileptic interictal spikes are represented by vertical (broadband) edges in spectrogram images, and hence even generic features extracted using models trained on everyday images appear to be sufficient to represent iEEG records with similar levels of interictal spiking activity in close proximity.
In the case of electrographic seizures, however, the ESC-ResNet model identified cross-patient iEEG records with electrographic seizure morphology features that were most similar to the test iEEG records. For nuanced electrographic seizure iEEG representation learning, domain-specific model training with manually generated labels had the advantage. Finally, representative iEEG records were selected from every patient using an unsupervised clustering method, which effectively reduced the number of iEEG records in the search dataset from ~750,000 to 2,148, thus substantially reducing the time required for finding similar cross-patient iEEG records.
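
The nearest-neighbor search step described in the abstract can be illustrated with a minimal sketch. It assumes each iEEG record has already been mapped to a vector in the reduced feature space. The `Record` container, the `nearest_cross_patient` function, the toy two-dimensional embeddings, and the choice of Euclidean distance are all hypothetical and chosen for illustration; the abstract does not specify the distance metric or the data structures used by the authors. Excluding records from the query's own patient is likewise an assumption meant to reflect the cross-patient nature of the search.

```python
import heapq
import math
from typing import List, NamedTuple, Sequence


class Record(NamedTuple):
    """Hypothetical container: one iEEG record's embedding plus its patient ID."""
    record_id: str
    patient_id: str
    embedding: Sequence[float]  # vector in the (already reduced) feature space


def nearest_cross_patient(query: Record, search_set: List[Record], k: int = 3) -> List[Record]:
    """Return the k search records closest to the query, skipping the query's own patient."""
    candidates = (r for r in search_set if r.patient_id != query.patient_id)
    # Euclidean distance is an assumption; the abstract does not name the metric.
    return heapq.nsmallest(
        k, candidates, key=lambda r: math.dist(query.embedding, r.embedding)
    )


# Toy search dataset with made-up 2-D embeddings (real embeddings would be much larger).
search_set = [
    Record("a1", "P01", (0.0, 0.1)),
    Record("b1", "P02", (0.2, 0.0)),
    Record("b2", "P02", (5.0, 5.0)),
    Record("c1", "P03", (0.1, 0.3)),
]
query = Record("q", "P01", (0.0, 0.0))

neighbors = nearest_cross_patient(query, search_set, k=2)
print([r.record_id for r in neighbors])  # ['b1', 'c1']
```

A brute-force scan like this costs time proportional to the size of the search dataset for every query, which is one way to see why reducing the search dataset to a small set of representative records per patient shortens the search.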