AUTHOR=Chen Peiji, Zou Bochao, Belkacem Abdelkader Nasreddine, Lyu Xiangwen, Zhao Xixi, Yi Weibo, Huang Zhaoyang, Liang Jun, Chen Chao

TITLE=An improved multi-input deep convolutional neural network for automatic emotion recognition

JOURNAL=Frontiers in Neuroscience

VOLUME=Volume 16 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2022.965871

DOI=10.3389/fnins.2022.965871

ISSN=1662-453X

ABSTRACT=<p>Current decoding algorithms based on one-dimensional (1D) convolutional neural networks (CNNs) have proven effective for automatic emotion recognition from physiological signals. However, these models usually take a single modality of physiological signal as input and ignore the correlations between modalities, which could be an important source of information for emotion recognition. Therefore, this study designed a complete end-to-end multi-input deep convolutional neural network (MI-DCNN). The new 1D-CNN structure takes full advantage of multi-modal physiological signals and automatically completes the whole process from feature extraction to emotion classification. To evaluate the proposed model, we designed an emotion elicitation experiment and collected physiological signals, including electrocardiography (ECG), electrodermal activity (EDA), and respiratory activity (RSP), from 52 participants while they watched emotion elicitation videos. Traditional machine learning methods were then applied as baselines: for arousal, the baseline accuracy and F1-score on our dataset were 62.9 ± 0.9% and 0.628 ± 0.01, respectively; for valence, they were 60.3 ± 0.8% and 0.600 ± 0.01, respectively. The MI-DCNN was also compared with a single-input DCNN, and the proposed method was verified on two public datasets (DEAP and DREAMER) as well as our dataset. On our dataset, the results showed a significant improvement in both tasks over traditional machine learning methods (<italic>t</italic>-test, arousal: <italic>p</italic> = 9.7E-03 &lt; 0.01, valence: <italic>p</italic> = 6.5E-03 &lt; 0.01), demonstrating the strength of a multi-input convolutional neural network for emotion recognition based on multi-modal physiological signals.</p>