Yet for all the excitement surrounding data science methods that capture online data, ethical concerns abound. This roundtable will evaluate and discuss common data science methods for scraping data, big and small, from websites. We will cover current web scraping techniques that use the statistical software environment R or user-friendly applications (e.g., Tableau, Import.io). Addressing both the conference theme and the ongoing challenge to harness technology for social good, the roundtable will describe the potential benefits and pitfalls of using web-based data in social work research. The roundtable moderators will discuss the following:
- Possibilities and challenges for incorporating publicly available data into social work research: presenters will share exemplars of social work studies that incorporate data scraped from websites to facilitate a discussion about the value of this method for social work research and practice.
- Illustrations from the roundtable presenters’ current social work research that demonstrate data science web scraping techniques: panelists will provide brief examples, using open-source software or user-friendly applications, to explain web scraping techniques. We will use the statistical software environment R to collect and reshape publicly available information presented on a government website. We will use Tableau and Import.io, two accessible and user-friendly applications, to retrieve and organize data from popular crowdfunding and fundraising websites.
- Ethical considerations for researchers using publicly available data: roundtable presenters will address ethical issues related to incorporating publicly available data scraped from websites into investigations and how reproducible research can mitigate unethical practice.
- Resources for developing skills with online data and web scraping: presenters will highlight online resources for continuing education and online communities that provide technical support.
The roundtable will also describe preliminary techniques available to researchers for manipulating and preparing scraped data for statistical analysis. Presenters will address each topic above using illustrations or brief tutorials. Handouts provided at the roundtable (and made available online) will include step-by-step guides, the R scripts used in each demonstration, and recommendations for online resources. Attendees are encouraged to ask questions throughout the roundtable and participate in discussions.