New Clinic Leverages Data Science for Social and Environmental Causes

December 17, 2020

For modern social good organizations, data can be both a blessing and a source of frustration. Data provides a valuable resource to realize an organization’s mission, identifying problems to target and individuals or communities to help, assessing the effectiveness of their work, and deepening communication with stakeholders and the public. But most non-profits can’t afford the large data science teams of tech companies or government agencies, the expertise that can help them use their numbers for maximum impact.

With the support of the 11th Hour Project, the grant-making arm of The Schmidt Family Foundation, the University of Chicago established a new Civic Data & Technology Clinic to help fill this need. Offered through the Master of Science in Computational Analysis & Public Policy (MS-CAPP) program, which is run jointly by the UChicago Department of Computer Science and the Harris School of Public Policy, and co-organized by the Center for Data and Computing, the clinic connected 13 students with three organizations working in social and economic justice, sustainability, and climate change.

Over a remote and very fast 10-week autumn quarter, the students formed teams to work on projects with FracTracker Alliance, Inclusive Development International, and Hohonu. The teams built an application to track palm oil deforestation, a system for informing the public about oil and gas industry effects on local air quality, and a data-driven dashboard for predicting water level changes across the country.

“The goal of this clinic is to partner our incredible students and programs here with public interest organizations to leverage data science, skills, and technology research, but with a real mission point of view, to press change for good in social and environmental challenges,” Uminsky said. “For the students, we really want to transcend conventional classroom experiences...to work with real world data, applying messy and hard algorithms to messy and hard data, but always with an eye on the mission of the project.”

On December 10th, the three projects presented their results in a virtual event. You can watch the videos here.

Data Watchdogs for Air Quality, Deforestation, and Water Levels

If there was a theme across all three projects, it was helping organizations use their data to monitor and inform the public about environmental changes. Teams developed tools that took unwieldy data sources (free-text citizen complaints, ocean monitors placed around the United States, and satellite images of palm forests) and refined that information into useful dashboards or apps for the organization, regulators, and communities to take further action.

With FracTracker Alliance, students Rui Chen, Tian Chen, and Ruize Liu turned a one-way data path, in which Colorado residents submitted complaints about the oil and gas industry, into a multi-pronged system that communicates rich information to regulators and back to the public. For each complaint submitted, the app built by the project team creates an automated report, attaching numbers and visualizations about the air quality in the user’s area and about nearby facilities that may contribute to pollution. The reports are then e-mailed to the Colorado Oil and Gas Conservation Commission and the original complainant.

“There are about 5.7 million people in Colorado, and we hope that with our efforts, a large number of them will understand the environment and how it may affect their health,” Ray Chen said. “We believe that the FracTracker team will better understand and analyze users’ complaints about the environment and what is happening around them. We also believe the government can identify which facilities caused problems to users in a shorter period of time, so as to solve environmental problems more efficiently.”

Hohonu provides coastal monitoring that assists communities harmed by frequent flooding, combining sensor data with machine learning to make predictions about future changes in ocean levels. The team of Marc Richardson, Ryan Webb, Jiaqi Yang, Jinfei Zhu, and Yiheng Zhu worked with the organization on improving their data quality and predictive models by focusing on anomaly detection — distinguishing data collection errors from real extreme events such as storms and abnormal tides to produce more accurate forecasts. The team also tested new algorithms against the model currently used by Hohonu and new ways of relaying predictions to the communities served by the organization.

“Detecting these natural anomalies is essential for mitigating the harm from extreme weather events,” Richardson said. “We also figured out how to effectively communicate anomalies to the appropriate stakeholders, using data visualization and a dashboard application that stakeholders could use to stay informed.”
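To make the anomaly-detection idea above concrete, here is a minimal Python sketch. It is not the method the student team or Hohonu used; it assumes a simple heuristic in which readings far from the series median are flagged, an isolated flagged reading is treated as a possible sensor error, and a sustained run of flagged readings is treated as a possible real event. The function names, thresholds, and sample readings are all hypothetical.

```python
from statistics import median


def flag_anomalies(levels, threshold=3.0, min_scale=0.05):
    """Return indices of readings far from the series median.

    Distance is measured in units of the median absolute deviation (MAD).
    min_scale is an assumed floor so a very flat series does not flag noise.
    """
    center = median(levels)
    mad = median(abs(x - center) for x in levels)
    scale = max(mad, min_scale)
    return [i for i, x in enumerate(levels) if abs(x - center) > threshold * scale]


def classify_runs(flagged, min_event_length=3):
    """Group consecutive flagged indices and label each run.

    Assumption: a glitch affects only a reading or two, while a storm or
    abnormal tide lasts for several consecutive readings.
    """
    runs = []
    for idx in flagged:
        if runs and idx == runs[-1][1] + 1:
            runs[-1][1] = idx  # extend the current run
        else:
            runs.append([idx, idx])  # start a new run
    return [
        (
            start,
            end,
            "possible real event"
            if end - start + 1 >= min_event_length
            else "possible sensor error",
        )
        for start, end in runs
    ]


# Hypothetical water-level readings in meters: one spike and one sustained rise.
readings = [1.0, 1.1, 0.9, 1.0, 9.0, 1.0, 1.1, 0.9, 1.0,
            2.0, 2.2, 2.1, 2.0, 1.0, 1.1, 0.9, 1.0, 1.0]

for start, end, label in classify_runs(flag_anomalies(readings)):
    print(f"readings {start}-{end}: {label}")
```

A production system would likely use a model that accounts for tides and seasonality; the point here is only the distinction between a one-off glitch and a sustained deviation.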

Palm oil is a fast-growing product, with production doubling every 10 years since the 1960s, and Inclusive Development International hopes to monitor this industry to inform consumers about the effect of palm oil mills on deforestation in countries such as Indonesia and Malaysia. The team of Launa Greer, Tim Hannifan, Daniel Lee, Olu Ogidan, and Amanda Whaley combined tree cover image data, a database of palm oil mill locations in Indonesia, and scraped data linking those mills to consumer brands to create a new web application of “risk scores” for each mill and brand.

The tool allows users to see which mills — and by association, which companies using palm oil in their products — contribute the most to environmental damage, to empower communities and regulators to fight for inclusive and sustainable harvesting, and help people make responsible choices at the grocery store.

“We chose to target one specific user story: an environmentally-conscious consumer hoping to make more informed purchasing decisions for products containing palm tree oil,” Greer said. “That consumer might ask what brands are the best and worst performers — what's the rate of historical tree loss per brand, and across brands in aggregate? This web app, to the best of our knowledge, is the first to bring together disparate data sources to begin answering those questions.”
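As a rough illustration of how per-mill and per-brand "risk scores" could be derived from the kinds of data described above, here is a minimal Python sketch. The scoring formula is an assumption made for this example, not the team's actual method: each mill's tree loss is scaled against the worst mill, and a brand's score is the average over the mills linked to it. All mill names, brand names, and figures are hypothetical.

```python
def mill_risk_scores(tree_loss_by_mill):
    """Scale each mill's tree loss to the range 0-1 relative to the worst mill."""
    worst = max(tree_loss_by_mill.values(), default=0)
    if worst == 0:
        return {mill: 0.0 for mill in tree_loss_by_mill}
    return {mill: loss / worst for mill, loss in tree_loss_by_mill.items()}


def brand_risk_scores(mill_scores, brand_to_mills):
    """Average the scores of the mills linked to each brand.

    Mills with no score are skipped; brands with no scored mills are omitted.
    """
    scores = {}
    for brand, mills in brand_to_mills.items():
        known = [mill_scores[m] for m in mills if m in mill_scores]
        if known:
            scores[brand] = sum(known) / len(known)
    return scores


# Hypothetical inputs: hectares of tree cover lost near each mill,
# and scraped links from consumer brands to the mills that supply them.
tree_loss = {"mill_a": 200.0, "mill_b": 50.0, "mill_c": 0.0}
supply_links = {"brand_x": ["mill_a", "mill_b"], "brand_y": ["mill_c"]}

mills = mill_risk_scores(tree_loss)
brands = brand_risk_scores(mills, supply_links)

# List brands from highest to lowest risk score.
for brand, score in sorted(brands.items(), key=lambda item: item[1], reverse=True):
    print(f"{brand}: {score:.3f}")
```

Averaging is only one possible design choice; weighting each mill by the volume it supplies to a brand would be a natural refinement if that data were available.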

Launching an Open Data Science Ecosystem

The inaugural Civic Data & Technology Clinic was only the first chapter in the partnership between the 11th Hour Project and the University of Chicago. A $500,000 grant from 11th Hour will fund the clinic and additional activities over the next year, connecting data science experts and students across University units and programs with the dozens of organizations that receive funding from the foundation. In addition to specific projects like those tackled in Autumn 2020, the grant will also fund the development of an open-source platform at the Center for Data and Computing that provides data science tools and applications for use beyond their original purpose.

“At CDAC, we can serve as the centralized hub for both software and data science solutions,” Grzenda said. “And not just solutions that affect the individual organization or the individual grantee, but also solutions that can be scaled across grantees and to the wider nonprofit community through open source software.”

The long-term partnership and these broader research efforts will help ensure that the output of these projects will live on beyond the work of one academic quarter, so that organizations can build upon the tools and use data more effectively to fulfill their social missions.

“The goal here is to really understand the model that allows for data science to continue in a sustainable way and succeed after the projects are over,” Uminsky said. “This is a really exciting opportunity to know that change is not just at the end of this presentation, but the change continues after the work was done.”