AUTHOR=Podlejski Witold , Descloitres Jacques , Chevalier Cristèle , Minghelli Audrey , Lett Christophe , Berline Léo TITLE=Filtering out false Sargassum detections using context features JOURNAL=Frontiers in Marine Science VOLUME=9 YEAR=2022 URL=https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2022.960939 DOI=10.3389/fmars.2022.960939 ISSN=2296-7745 ABSTRACT=

Since 2011, the distribution of pelagic Sargassum algae has expanded substantially and now covers the whole Tropical North Atlantic Ocean, with significant inter-annual variability. Ocean colour imagery is the only way to monitor such a vast area regularly. However, detection is hampered by cloud masking, sunglint, coastal contamination and other phenomena. Together, these lead to false detections that can hardly be discriminated by classic radiometric analysis, but that may be identified by considering the shape and the context of the detections. Here, we built a machine learning model based exclusively on spatial features to filter out false detections after the detection process. Moderate-Resolution Imaging Spectroradiometer (MODIS, 1 km) data from the Aqua and Terra satellites were used to generate daily maps of the Alternative Floating Algae Index (AFAI). Sargassum presence in the Tropical North Atlantic Ocean was inferred from this radiometric index. For every Sargassum aggregation, five contextual indices were extracted (number of neighbours, surface of neighbours, temporal persistence, distance to the coast and aggregation texture) and then used by a random forest binary classifier. Large-scale contextual features were the most important in the classifier. Trained with a multi-annual (2016-2020) learning set, the model filters daily false detections with an accuracy of ~90%. This reduces the number of detected Sargassum pixels by ~50% over the domain. The method provides reliable data while preserving high spatial and temporal resolutions (1 km, daily). The resulting distribution is consistent with the literature for seasonal and inter-annual fluctuations, with maximum coverage in 2018 and minimum in 2016. This dataset will be useful for understanding the drivers of Sargassum dynamics at fine and large scales and for validating future models.
The methodology used here demonstrates the usefulness of contextual features for complementing classical remote sensing approaches. Our model could easily be adapted to other datasets containing erroneous detections.
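
As an illustration of what a contextual feature can look like, the sketch below computes two of the five indices named in the abstract (number of neighbours and surface of neighbours) from a binary detection mask. It is not the authors' implementation: the abstract does not define how aggregations are delineated or how neighbourhoods are measured, so the 8-connected labelling, the centroid-to-centroid distance, the `radius` parameter and all function names here are assumptions made for the example. In a pipeline like the one described, such features would be passed, together with the others, to a binary classifier such as a random forest.

```python
from collections import deque


def label_aggregations(mask):
    """Group detected pixels into 8-connected aggregations (assumed definition).

    Returns a list of aggregations, each a list of (row, col) pixels,
    in row-major order of their first pixel.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    aggregations = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited detected pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            aggregations.append(pixels)
    return aggregations


def centroid(pixels):
    """Mean (row, col) position of an aggregation."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


def context_features(mask, radius):
    """Per-aggregation area, neighbour count and neighbour surface.

    Two aggregations are treated as neighbours when their centroids lie
    within `radius` pixels of each other (a hypothetical criterion).
    """
    aggs = label_aggregations(mask)
    centres = [centroid(p) for p in aggs]
    features = []
    for i, (cy, cx) in enumerate(centres):
        neighbours = [
            j for j, (oy, ox) in enumerate(centres)
            if j != i and (oy - cy) ** 2 + (ox - cx) ** 2 <= radius ** 2
        ]
        features.append({
            "area": len(aggs[i]),
            "n_neighbours": len(neighbours),
            "neighbour_area": sum(len(aggs[j]) for j in neighbours),
        })
    return features


# Toy 5 x 8 mask: two nearby aggregations and one isolated pixel.
mask = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
features = context_features(mask, radius=3)
for f in features:
    print(f)
```

On this toy mask, the isolated pixel in the bottom-right corner ends up with no neighbours, which is the kind of contextual signal a classifier could use alongside temporal persistence, distance to the coast and texture.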