AUTHOR=Chen Yan , Sherren Kate , Lee Kyung Young , McCay-Peet Lori , Xue Shan , Smit Michael TITLE=From theory to practice: insights and hurdles in collecting social media data for social science research JOURNAL=Frontiers in Big Data VOLUME=7 YEAR=2024 URL=https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2024.1379921 DOI=10.3389/fdata.2024.1379921 ISSN=2624-909X ABSTRACT=

Social media has profoundly changed our modes of self-expression, communication, and participation in public discourse, generating volumes of conversations and content that cover every aspect of our social lives. Social media platforms have thus become increasingly important as data sources to identify social trends and phenomena. In recent years, academics have steadily lost ground on access to social media data as technology companies have set more restrictions on Application Programming Interfaces (APIs) or entirely closed public APIs. This circumstance halts the work of many social scientists who have used such data to study issues of public good. We considered the viability of eight approaches for image-based social media data collection: data philanthropy organizations, data repositories, data donation, third-party data companies, homegrown tools, and various web scraping tools and scripts. This paper discusses the advantages and challenges of these approaches from literature and from the authors' experience. We conclude the paper by discussing mechanisms for improving social media data collection that will enable this future frontier of social science research.