OPINION article

Front. Ocean Sustain.

Sec. Marine Governance

Volume 3 - 2025 | doi: 10.3389/focsu.2025.1522648

Publishing datasets, using artificial intelligence to help with metadata, can enhance ocean sustainability research and management

Provisionally accepted
  • Marine Research Division, AZTI Foundation, Pasaia, Spain

The final, formatted version of the article will be published soon.

However, synthesizing heterogeneous data from different ecosystem components in a monitoring network, coding all data preparation, and creating standard formats and metadata, all of which are needed for reproducible, collaborative and transparent science (Lowndes et al., 2017), could deter scientists from publishing large open datasets.

Over the 40 years of my career, most of the methods used in marine monitoring, although evolving towards more and better technologies, can be considered traditional and standardized (Anonymous, 2002; Karydis and Kitsiou, 2013; UNEP, 2016). However, in the last 10-15 years, many innovative and practical tools for monitoring and assessing marine status have been developed and are increasingly used (Borja et al., 2024). The most common emerging methods include, among others, portable eDNA sequencers, underwater cameras, modelling methods, drones, satellites and artificial intelligence-assisted data processing (European Commission et al., 2023).

Regarding data, a range of technologies is now emerging for processing large volumes of heterogeneous environmental data (Vitolo et al., 2015). In fact, one of the ten strategic areas for strengthening the European Union's global leadership is capacity in data management, artificial intelligence and cutting-edge technologies (European Commission et al., 2022). In the introduction, I noted some factors that can prevent scientists from sharing datasets. Nowadays, the need for ever more sophisticated data processing makes it even harder to meet open data standards, which are needed to make data accessible and synoptic analyses possible (Addison et al., 2018). Hence, the increasing scope of the data collected and the potential future purposes for which they can be used (e.g. different sectors of the Blue Economy, such as fisheries, aquaculture, tourism and biotechnology, as well as maritime spatial planning, conservation, management, protection, restoration, assessment, etc.) mean that traditional and emerging tools and processes for collecting, storing and analysing datasets may become increasingly bespoke, particularly if the trend for repurposing data continues (e.g. the use of artificial intelligence and machine learning to extract new information from existing open access databases) (Addison et al., 2018).

In the last decade, several scientific journals have been created to publish open data, e.g. Data in Brief, Scientific Data, GigaScience, Biodiversity Data Journal, etc. However, when Frontiers Media invited me to the presentation of a new platform for publishing open data, which uses generative artificial intelligence to assist authors in preparing the datasets and metadata, as well as in writing the text accompanying the data, I was impressed by the first tests undertaken. Hence, I offered the developers of the tool the large database generated for the Basque Water Agency, to challenge the tool with real data and a good knowledge of the environment. The fact that the tool can learn not only from the dataset itself, but also from the ORCID numbers of the authors or additional information, added value to the experience.

After some interactions and tests, the text created had some shortcomings, but the authors' experience made it possible to build a final manuscript easily and quickly, the first to be published as a new article type (Open Data Article) in Frontiers in Ocean Sustainability (Borja et al., 2025). As lead author of that manuscript, I am fully committed to the five principles of human accountability and responsibility to protect the integrity of science in the age of generative artificial intelligence, as proposed by Blau et al. (2024): (i) transparent disclosure and attribution of the work done with the artificial intelligence in handling the dataset and writing the paper; (ii) verification of the content and analyses generated by the artificial intelligence, ensuring as scientists the accuracy of the data, imagery, and inferences drawn from the use of generative models in writing the paper; (iii) documentation of data and metadata generated by the artificial intelligence; (iv) a focus on ethics and equity, to ensure that products (i.e. metadata, texts, figures, tables) are scientifically sound and provide socially beneficial results (in this case, datasets fully and freely available); and (v) continuous monitoring, oversight, and public engagement to evaluate the impact of artificial intelligence on the scientific process, to maintain integrity and reproducibility.

In 2022, member states asked the United Nations Environment Programme (UNEP) to examine how artificial intelligence could accelerate work in three areas: climate action, nature protection, and pollution prevention (Wilson, 2024). In response, UNEP (i) launched the World Environment Situation Room (wesr.unep.org), a digital platform that plans to leverage artificial intelligence capabilities to analyse complex, multifaceted datasets, and (ii) committed to developing a Global Environmental Data Strategy by 2025, aiming to improve monitoring data standards and digital cooperation between countries, and ultimately helping to open new frontiers in ecological research and management (Wilson, 2024).

Most ecological and biodiversity monitoring data will be needed to make decisions on conservation and restoration, especially after the adoption of the Kunming-Montreal Global Biodiversity Framework of the Convention on Biological Diversity (CBD, 2022).
Similarly, the main policy goal of the European Biodiversity Strategy 2030 is to halt the decline of biodiversity and promote its recovery by 2030 (European Commission, 2020). One way to achieve this goal is through legally binding targets to restore 30% of degraded ecosystems by 2030, 60% by 2040, and 90% by 2050, as approved in the Nature Restoration Law (Hering et al., 2023).

In this context, in a standardized survey undertaken by Moersberger et al. (2024), European science and policy stakeholders identified four clusters of key policy questions related to biodiversity monitoring within the next decade: (i) "Assessing biodiversity and species trends", including biodiversity status and trends, indicators for the quality of habitats, and assessing the impact of invasive species on the environment; (ii) "Biodiversity policy impact and effectiveness", including the assessment of the effectiveness of biodiversity policies and the outcomes of conservation management and restoration; (iii) "Integrating biodiversity in other policy sectors", including agriculture, fisheries, water management, climate change, green and blue infrastructure projects, poverty, equity, and trade; and (iv) "Operationalization of monitoring", including ways to standardize and harmonize biodiversity monitoring programs and integrate novel technologies to meet policy targets. Among those novel technologies, artificial intelligence occupies a prominent position (Moersberger et al., 2024).

In the case of the ocean, the increasing threats to biodiversity from human activities, together with the effects of climate change, led to a workshop between the Intergovernmental Panel on Climate Change (IPCC) and the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) (Pörtner et al., 2021). After that, to build synergies between strategies for climate, biodiversity, ocean and human health, a group of scientists proposed the establishment of an International Panel for Ocean Sustainability (IPOS) (Gaill et al., 2022). According to these authors, IPOS could facilitate the implementation of a global, integrated and fit-for-purpose observing system, providing information for robust understanding, monitoring, predicting and projecting the state of the ocean, across requirements and scales (from global to local), in alignment with the Global Ocean Observing System. Again, innovative digital tools that use observation, advanced modelling and data management can be integrated into a digital twin of the ocean (an open source of combined ocean observations, artificial intelligence, and advanced modelling providing a consistent, high resolution, multi-dimensional and near real-time virtual representation of the ocean) (Gaill et al., 2022). This will be a multidisciplinary endeavour, involving the acquisition, integration and analysis of an increasing amount of ocean data. To accomplish this, Sagi et al. (2020) identified the key missing tools, including the "development of artificial intelligence-based tools for assisting ocean scientists in aligning their schema with existing ontologies when organizing their measurements in datasets".

Hence, one of the main lessons learnt during these years is that building on an adequate knowledge architecture is essential for sustainability transitions (Oliver et al., 2021).
To assist in this endeavour, Frontiers in Ocean Sustainability has introduced this new article type (Open Data Article), which makes data available and can benefit the ocean scientific community by providing the information needed to make informed decisions on marine management, for a sustainable use of ecosystem services. As pointed out by Borja (2023), this can also benefit multiple international initiatives that need available data and that address the sustainability of the planet and, specifically, the ocean: (i) the United Nations (UN) Sustainable Development Goals (SDGs), including SDG14, to conserve and sustainably use the ocean, seas and marine resources for sustainable development; (ii) the UN Decade of Ocean Science for Sustainable Development, which will increase international collaboration on scientific research; (iii) the UN Decade on Ecosystem Restoration, including degraded marine ecosystems; (iv) the "30-by-30" target of the "High Ambition Coalition for Nature and People", a worldwide initiative for governments to designate 30% of Earth's land and ocean area as protected areas by 2030; and (v) the Agreement under the United Nations Convention on the Law of the Sea on the conservation and sustainable use of marine biological diversity of areas beyond national jurisdiction.

Of course, artificial intelligence can also be used unethically, e.g. by creating fake datasets, or by creating new patterns of overexploitation and unforeseen interactions between human activities and marine ecosystems. This presents a paradox: while generative artificial intelligence can enhance sustainability through better data management, it may also drive the depletion of marine resources, creating new environmental costs and sustainability challenges. Hence, as editors of the journal, we must be attentive to any misuse of these technologies, verifying the content and analyses generated by the artificial intelligence, and ensuring the accuracy of the data and the accompanying information and explanations.

We encourage the whole ocean scientific community to provide data from surveys, monitoring networks, PhD and master's theses, national and international projects, etc., for the benefit of the sustainability of the ocean, through an informed management decision process.

The author declares that this opinion paper was prepared in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

AB had the idea and wrote the manuscript.

    Keywords: artificial intelligence, monitoring, dataset, ocean sustainability, research

    Received: 04 Nov 2024; Accepted: 10 Feb 2025.

    Copyright: © 2025 Borja. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Angel Borja, Marine Research Division, AZTI Foundation, Pasaia, 20110, Spain

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
