The Guizhou Province of China offers a unique setting in which to systematically examine the benefits and challenges of embracing big data for social good. Recognized as the “big data valley” of China, Guizhou has recently undergone rapid technological growth. Yet it remains one of the poorest provinces. Many residents migrate in search of livable wages, making it one of five key “sending” provinces in China. The government also recently embarked on a planned effort to relocate two million Miao residents from mountain villages to urban centers as an anti-poverty measure. Government officials in Guiyang City, tasked with managing this technological growth, complex migration dynamics, ongoing poverty, and the relocation of disadvantaged residents, elected to harness their big data to improve overall social well-being. The Guizhou Berkeley Big Data Innovation Research Center (GBIC) was founded and funded by the city of Guiyang as an international government-university partnership dedicated to (1) applying data science approaches to improve the health and well-being of residents, and (2) serving as a leader in bringing big data to social welfare research in China and worldwide. Launched in 2017, GBIC has successfully identified and wrangled big data from four government bureaus, used the data to launch two programs of research, and collaborated with government officials to craft social policy. To support these foundational aims, GBIC has established a computational social welfare lab with embedded structures to securely store and manage sensitive data, promote collaboration among scholars and data scientists, apply advanced, cutting-edge modeling techniques, and train recent college graduates in data science for social good.
The first paper describes the Lab and addresses the benefits, risks, and challenges of embracing big data as raised in the grand challenge. The second and third papers present preliminary findings from two studies that respond to pressing local governmental concerns: (1) the relationship between school characteristics and the academic achievement of left-behind children, and (2) the relationship between negative health outcomes and clusters of health determinants in older adults. Each of these papers concludes with a discussion of the benefits and challenges of using big data and computation for its research purpose. The symposium ends with a discussion facilitated by a co-lead of the “Harnessing Technology for Social Good” grand challenge.