AUTHOR=Bayraktarov Elisa, Ehmke Glenn, O'Connor James, Burns Emma L., Nguyen Hoang A., McRae Louise, Possingham Hugh P., Lindenmayer David B. TITLE=Do Big Unstructured Biodiversity Data Mean More Knowledge? JOURNAL=Frontiers in Ecology and Evolution VOLUME=6 YEAR=2019 URL=https://www.frontiersin.org/journals/ecology-and-evolution/articles/10.3389/fevo.2018.00239 DOI=10.3389/fevo.2018.00239 ISSN=2296-701X ABSTRACT=

Conserving species biodiversity demands decisive and effective action. Effective action requires an understanding of species population dynamics. Therefore, robust measures that track temporal changes in species populations are needed. This need, however, must be balanced against the scale at which population change is being assessed. Advances in citizen science and remote sensing technology have heralded an era of “big unstructured data” for biodiversity conservation. However, the value of big unstructured data for assessing changes in species populations and effectively guiding conservation management has not been rigorously assessed. This can be achieved only by benchmarking big unstructured data against high-quality structured datasets and ensuring the latter are not lost through an over-emphasis on “big data.” Here, we illustrate the current trend of disproportionately prioritizing data quantity over data quality and highlight the discrepancy in global availability between the two data types. We propose a research agenda to test whether this trend will result in a net decrease of useful knowledge for biodiversity conservation. We illustrate this by examining the availability of big unstructured vs. standardized data in global repositories, using birds as an example. We share experiences from the data collation exercise needed to develop the Australian Threatened Species Index. We argue there is an urgent need to validate and enhance the utility of big unstructured data by: (1) maintaining existing well-designed, standardized long-term species population studies; (2) strengthening data quality control, management, and curation of any type of dataset; and (3) developing purpose-specific rankings to assess data quality.
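
The comparison of data availability in global repositories on birds could, in practice, begin with simple record tallies. The minimal Python sketch below shows one way such a tally might be made; it assumes GBIF as the repository, its public occurrence-search API, and the GBIF backbone key 212 for the class Aves. None of these are specified in the abstract itself, and the sketch is not the authors' actual method.

    # Illustrative sketch only: GBIF, the Aves taxon key (212), and the year
    # range are assumptions used to show how the accumulation of unstructured
    # occurrence records might be tallied; structured survey data would need
    # to be obtained and counted separately.
    import requests

    GBIF_SEARCH = "https://api.gbif.org/v1/occurrence/search"
    AVES_TAXON_KEY = 212  # assumed GBIF backbone key for the class Aves

    def record_count(**filters):
        """Return the number of GBIF occurrence records matching the filters."""
        params = {"limit": 0, **filters}  # limit=0 returns only the total count
        resp = requests.get(GBIF_SEARCH, params=params, timeout=30)
        resp.raise_for_status()
        return resp.json()["count"]

    if __name__ == "__main__":
        print(f"All bird occurrence records: {record_count(taxonKey=AVES_TAXON_KEY):,}")
        for year in range(2010, 2019):
            n = record_count(taxonKey=AVES_TAXON_KEY, year=year)
            print(f"  {year}: {n:,}")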