Session: Scraping for Change: Using Data Science Techniques to Gather Online Data for Social Work Research (Society for Social Work and Research 21st Annual Conference - Ensure Healthy Development for all Youth)

288 Scraping for Change: Using Data Science Techniques to Gather Online Data for Social Work Research

Schedule:
Sunday, January 15, 2017: 9:45 AM-11:15 AM
Bacchus (New Orleans Marriott)
Cluster: Research Design and Measurement
Speakers/Presenters:
John E. Sullivan, MSW, University of Texas at Austin; Catherine A. LaBrenz, MSW, University of Texas at Austin; Christopher P. Salas-Wright, PhD, Boston University; Michael G. Vaughn, PhD, Saint Louis University; and Brian Perron, PhD, University of Michigan-Ann Arbor
Publicly accessible information posted on websites comprises much of the data in the world. The challenges and opportunities of gathering and using online data for research are captured by the Three V’s: volume, variety, and velocity. Volume refers to the vast amount of data being generated; variety, to the wide-ranging kinds of data; and velocity, to the ever-increasing pace at which data accumulates with the advent of new technologies (Gandomi & Haider, 2015). Although much online data is available to the public, it is not usually organized into researcher-friendly row–column formats. Gathering data from online sites (web scraping) and curating the collected data for empirical analysis are important skills that open new avenues for social work research.
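To make the row–column idea concrete, here is a minimal sketch of turning an HTML table into researcher-friendly records. It is written in Python's standard library as a stand-in, not in R or the applications the roundtable covers, and it is not the presenters' method. The `TableRows` class, the `SAMPLE_HTML` string, and the county figures are all invented for illustration; the sketch parses an inline string rather than downloading a live page.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up HTML standing in for a page that was already downloaded.
SAMPLE_HTML = """
<table>
  <tr><th>County</th><th>Year</th><th>Cases</th></tr>
  <tr><td>Alpha</td><td>2015</td><td>120</td></tr>
  <tr><td>Beta</td><td>2015</td><td>85</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)

# First row is the header; the rest become one record per table row.
header, *records = parser.rows
table = [dict(zip(header, rec)) for rec in records]
```

Each entry of `table` is one observation keyed by column name, which is the shape most statistical software expects. Real pages are messier (nested tables, merged cells, content loaded by scripts), so a sketch like this is only a starting point.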

Yet for all the excitement surrounding data science methods that capture online data, ethical concerns abound. This roundtable will evaluate and discuss common data science methods for scraping data, big and small, from websites. We will cover current web scraping techniques that use the statistical software environment R or user-friendly applications (e.g., Tableau, Import.io). Addressing both the conference theme and the ongoing challenge to harness technology for social good, the roundtable will describe the potential benefits and pitfalls of using web-based data in social work research. The roundtable moderators will discuss the following:

  1. Possibilities and challenges for incorporating publicly available data into social work research: presenters will share exemplars of social work studies incorporating data scraped from websites to facilitate a discussion about the value of this method for social work research and practice.

  2. Illustrations from the roundtable presenters’ current social work research that demonstrate data science web scraping techniques: presenters will provide brief examples, using open-source software or user-friendly applications, to explain web scraping techniques. We will use the statistical software environment R to collect and reshape publicly available information presented on a government website. We will use Tableau and Import.io, two accessible and user-friendly applications, to retrieve and organize data from popular crowdfunding and fundraising websites.

  3. Ethical considerations for researchers using publicly available data: roundtable presenters will address ethical issues related to incorporating publicly available data scraped from websites into investigations and how reproducible research can mitigate unethical practice.

  4. Resources for developing skills with online data and web scraping: presenters will highlight online resources for continuing education and online communities that provide technical support.
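One small, checkable piece of the ethical considerations in item 3 is whether a site's robots.txt file permits automated access to a page. The sketch below uses Python's standard `urllib.robotparser` as a stand-in for the R workflow the roundtable describes. The `SAMPLE_ROBOTS` text, the `AGENT` name, the `may_fetch` helper, and the example.org URLs are hypothetical, and the rules are parsed from an inline string rather than fetched from a live site.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; a real file lives at the root of a website.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

rules = RobotFileParser()
rules.parse(SAMPLE_ROBOTS.splitlines())

AGENT = "research-scraper"  # hypothetical user-agent name


def may_fetch(url):
    """Return True if the parsed rules allow AGENT to request this URL."""
    return rules.can_fetch(AGENT, url)


allowed = may_fetch("https://example.org/reports/2016.html")
blocked = may_fetch("https://example.org/private/clients.html")
delay = rules.crawl_delay(AGENT)  # seconds to wait between requests, if stated
```

Recording checks like this in a shared script is one way a scraping workflow can be made reproducible and open to review. A robots.txt file speaks only to automated access; it does not settle questions of consent, privacy, or a site's terms of use.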

The roundtable will also describe preliminary techniques available to researchers for manipulating and preparing scraped data for statistical analysis. Each topic above will be addressed by the presenters using illustrations or brief tutorials. Handouts provided at the roundtable (and made available online) will include step-by-step guides, the R scripts used in each demonstration, and recommendations for online resources. Attendees are encouraged to ask questions throughout the roundtable and participate in discussions.
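As a rough illustration of preparing scraped data for statistical analysis, the sketch below converts currency strings to numbers, keeps missing values explicit, and writes a CSV that statistical software can read. It uses Python's standard library rather than the R scripts mentioned above. The `scraped` rows, the `to_number` helper, and the `share_of_goal` column are invented for illustration and are not drawn from any real crowdfunding site.

```python
import csv
import io

# Made-up rows shaped like scraped crowdfunding listings; not real data.
scraped = [
    {"campaign": "A", "goal": "$5,000", "raised": "$1,250"},
    {"campaign": "B", "goal": "$12,500", "raised": "$12,500"},
    {"campaign": "C", "goal": "$800", "raised": ""},
]


def to_number(text):
    """Turn '$1,250' into 1250.0; return None for blank or unparseable text."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


tidy = []
for row in scraped:
    goal = to_number(row["goal"])
    raised = to_number(row["raised"])
    # Share of the goal met; left missing when a value is missing or goal is 0.
    share = raised / goal if goal and raised is not None else None
    tidy.append({"campaign": row["campaign"], "goal": goal,
                 "raised": raised, "share_of_goal": share})

# Write to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(tidy[0].keys()))
writer.writeheader()
writer.writerows(tidy)
csv_text = buffer.getvalue()
```

Missing values are written as empty cells rather than zeros, so they are not mistaken for campaigns that raised nothing.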
