AUTHOR=Thorp Kelly R., Thompson Alison L., Herritt Matthew T. TITLE=Phenotyping cotton leaf chlorophyll via in situ hyperspectral reflectance sensing, spectral vegetation indices, and machine learning JOURNAL=Frontiers in Plant Science VOLUME=15 YEAR=2024 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2024.1495593 DOI=10.3389/fpls.2024.1495593 ISSN=1664-462X ABSTRACT=

Cotton (Gossypium hirsutum L.) leaf chlorophyll (Chl) has been targeted as a phenotype for breeding selection to improve cotton tolerance to environmental stress. However, high-throughput phenotyping methods based on hyperspectral reflectance sensing are needed to rapidly screen cultivars for chlorophyll in the field. The objectives of this study were to deploy a cart-based field spectroradiometer to measure cotton leaf reflectance in two field experiments over four growing seasons at Maricopa, Arizona, and to evaluate 148 spectral vegetation indices (SVIs) and 14 machine learning methods (MLMs) for estimating leaf chlorophyll from spectral information. Leaf tissue was sampled concurrently with reflectance measurements, and laboratory processing provided leaf Chl a, Chl b, and Chl a+b as both area-basis (µg cm⁻²) and mass-basis (mg g⁻¹) measurements. Leaf reflectance, along with several data transformations involving spectral derivatives, log-inverse reflectance, and SVIs, was evaluated as MLM input. Models trained with 2019–2020 data performed poorly in tests with 2021–2022 data (e.g., RMSE = 23.7% and r² = 0.46 for area-basis Chl a+b), indicating difficulty transferring models between experiments. Performance was more satisfactory when training and testing data were based on a random split of all data from both experiments (e.g., RMSE = 10.5% and r² = 0.88 for area-basis Chl a+b), but performance beyond the conditions of the present study cannot be guaranteed. SVI performance fell between these two cases (e.g., RMSE = 16.2% and r² = 0.69 for area-basis Chl a+b), and SVIs provided more consistent error metrics than MLMs. Ensemble MLMs, which combined estimates from several base estimators (e.g., random forest, gradient boosting, and AdaBoost regressors), and a multi-layer perceptron neural network method performed best among MLMs. Input features based on spectral derivatives or SVIs improved MLM performance compared to inputting reflectance data directly. 
Spectral reflectance data and SVIs involving red-edge radiation were the most important inputs to MLMs for estimating cotton leaf chlorophyll. Because MLMs struggled to perform beyond the constraints of their training data, SVIs should not be overlooked as practical plant trait estimators for high-throughput phenotyping, whereas MLMs offer great opportunity for data mining to develop more robust indices.
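
The data transformations and error metrics named in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the function names, the choice of a red-edge chlorophyll index of the form R_NIR / R_red-edge - 1 with bands near 780 and 710 nm, the definition of percent RMSE as RMSE divided by the mean observed value, the definition of r² as the squared Pearson correlation, and the synthetic spectrum.

```python
import numpy as np


def log_inverse(reflectance):
    """Log-inverse reflectance, log10(1/R)."""
    return np.log10(1.0 / np.asarray(reflectance, dtype=float))


def first_derivative(reflectance, wavelengths):
    """First spectral derivative dR/d(lambda) by finite differences."""
    return np.gradient(np.asarray(reflectance, dtype=float),
                       np.asarray(wavelengths, dtype=float))


def band(reflectance, wavelengths, target_nm):
    """Reflectance at the sampled wavelength nearest to target_nm."""
    idx = np.argmin(np.abs(np.asarray(wavelengths, dtype=float) - target_nm))
    return float(np.asarray(reflectance, dtype=float)[idx])


def red_edge_chlorophyll_index(reflectance, wavelengths,
                               nir_nm=780.0, red_edge_nm=710.0):
    """Hypothetical example SVI: R_NIR / R_red-edge - 1.

    The band centers are assumed values, not those used in the study.
    """
    nir = band(reflectance, wavelengths, nir_nm)
    red_edge = band(reflectance, wavelengths, red_edge_nm)
    return nir / red_edge - 1.0


def relative_rmse(observed, estimated):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    rmse = np.sqrt(np.mean((estimated - observed) ** 2))
    return 100.0 * rmse / np.mean(observed)


def r_squared(observed, estimated):
    """Squared Pearson correlation between observed and estimated values."""
    return float(np.corrcoef(observed, estimated)[0, 1] ** 2)


# Synthetic leaf-like spectrum (illustrative only): low visible reflectance
# rising through a sigmoid red edge to a near-infrared plateau.
wavelengths = np.arange(400.0, 1001.0, 1.0)
reflectance = 0.05 + 0.45 / (1.0 + np.exp(-(wavelengths - 715.0) / 15.0))

features = {
    "log_inverse": log_inverse(reflectance),
    "derivative": first_derivative(reflectance, wavelengths),
    "red_edge_index": red_edge_chlorophyll_index(reflectance, wavelengths),
}
print(features["red_edge_index"])
```

In a workflow of this kind, the transformed spectra or index values would serve as regression inputs, and the two metric functions would compare estimated chlorophyll against laboratory measurements on held-out samples.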