AUTHOR=Wang Di, Zhao Fengyuan, Wang Rui, Guo Junwei, Zhang Cihai, Liu Huimin, Wang Yongsheng, Zong Guohao, Zhao Le, Feng Weihua TITLE=A Lightweight convolutional neural network for nicotine prediction in tobacco by near-infrared spectroscopy JOURNAL=Frontiers in Plant Science VOLUME=14 YEAR=2023 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2023.1138693 DOI=10.3389/fpls.2023.1138693 ISSN=1664-462X ABSTRACT=

Nicotine is a critical component of tobacco, and its content significantly influences the quality of tobacco leaves. Near-infrared (NIR) spectroscopy is widely used for rapid, non-destructive, and environmentally friendly analysis of nicotine levels in tobacco. In this paper, we propose a novel regression model, the Lightweight one-dimensional convolutional neural network (Lightweight 1D-CNN), which predicts nicotine content in tobacco leaves from one-dimensional (1D) NIR spectral data using a deep learning approach based on a convolutional neural network (CNN). The NIR spectra were preprocessed with Savitzky–Golay (SG) smoothing, and representative training and test datasets were generated at random. Batch normalization was used as network regularization to reduce overfitting and improve the generalization performance of the Lightweight 1D-CNN model given a limited training dataset. The network consists of four convolutional layers that extract high-level features from the input data. Their output is fed into a fully connected layer, which uses a linear activation function to output the predicted nicotine value. We compared several regression models under SG smoothing preprocessing: support vector regression (SVR), partial least squares regression (PLSR), 1D-CNN, and Lightweight 1D-CNN. Among them, the Lightweight 1D-CNN regression model with batch normalization achieved a root mean square error (RMSE) of 0.14, a coefficient of determination (R²) of 0.95, and a residual prediction deviation (RPD) of 5.09. These results indicate that the Lightweight 1D-CNN model is objective and robust and outperforms the other models compared in terms of accuracy, giving it the potential to significantly improve quality control in the tobacco industry through accurate and rapid analysis of nicotine content.
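The three reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not define the metrics, so the sketch assumes the common chemometrics conventions (R² as one minus the ratio of residual to total sum of squares, RPD as the sample standard deviation of the reference values divided by the RMSE). The function names and the sample values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, stdev


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Assumed definition: sample SD of the reference values over the RMSE."""
    return stdev(y_true) / rmse(y_true, y_pred)


# Made-up reference and predicted values, not data from the paper.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.9, 5.1]

print(f"RMSE = {rmse(reference, predicted):.3f}")
print(f"R2   = {r2(reference, predicted):.3f}")
print(f"RPD  = {rpd(reference, predicted):.2f}")
```

Under these assumed definitions, a lower RMSE and a higher R² and RPD indicate a better fit, which is the sense in which the abstract compares the four models.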