Each year, the Data for Good Exchange (D4GX) assembles a wide range of professionals in the fields of data science, technology, sustainable development, and more, who explore ways to make data more useful to the public. We have attended this event every year since its inaugural edition in 2014. It’s a favorite!
This year’s D4GX centered on the theme of data science for the UN Sustainable Development Goals. It kicked off with a keynote by Shawn Edwards, CTO of Bloomberg, Francesca Perucci, Assistant Director of the UN Statistics Division, and Claire Melamed, CEO of the Global Partnership for Sustainable Development Data. They reflected on the likelihood of achieving the Sustainable Development Goals by 2030, pointing out that one third of the allotted time has already passed since their adoption in 2015; more progress is urgently needed. One statistic partially illustrates the scope of the challenge: 7% of the world population is still not included in any data set. Their experiences are completely ignored, and the ability to achieve the goals in those communities is hindered.
Although sessions focused on a wide range of topics and locations, some common themes emerged. One common thread was the importance not just of data, but of data practice and practitioners in the creation of social equity. For example, many speakers mentioned that not all communities have equal access to data methods; a lack of technology or expertise often impedes the use of data outside of cities. In a room full of data scientists, only a few worked outside major metropolitan areas. Bringing talent to areas that need it is a challenge and an important part of ensuring data is used for the benefit of all. Also on the topic of equity, a panel focused on using data science to address gender inequality. A key problem there is a lack of data; the data gaps are themselves indicators of equity gaps. If we don’t have information on gender inequity, we cannot formulate effective policies to address it.
Another theme was the need for systems and practices that support making decisions and taking action. Data that is collected but not used is, well, not very useful. One organization that attempts to address this is Data Mermaid, an open platform that collects and visualizes data about coral reefs. While the deteriorating state of much of the world’s coral has been publicly discussed, much of the underlying data has been available to researchers only in forms that do not actually support quick decision making and effective management. Making data available does not necessarily make it useful; some thought must be given to presentation as well. We were reminded of some other crowd-sourced or collaborative data efforts that are yielding interesting results. For instance, uBiome is attempting to catalogue the surprisingly diverse microflora of the human body. The Great Backyard Bird Count engages amateur and professional birdwatchers alike in a massive, multi-country effort to count all of the birds in North America.
Some sessions zoomed in on the use of data in specific cities. A new data platform in Bogotá, Colombia showed that an overwhelming majority of Colombian citizens feel that the government lacks transparency and that corruption is a significant issue. To address this problem, data scientists at Bloomberg partnered with the city government of Bogotá to open up access to data on construction projects initiated and owned by the municipal government. For each project, the public can see the location, progress, cost, number of employees, and information about the construction company. Open access to this previously hidden data helps to build trust between the citizens of the city and their government.
A final takeaway: all of this must be done ethically. In order for data to be useful to the public, people must be able to trust that their data is safe and their privacy is being protected. Multiple organizations are taking on this challenge. The Global Data Ethics Project has rolled out the FORTS framework, short for Fairness, Openness, Reliability, Trust, and Social Benefit. It attempts to lay out guidelines for technologists to follow as they build the next generation of data systems. Mozilla is also developing a curriculum to teach data ethics to developers and other members of the technology community. We will be interested to see how these efforts evolve into a firm foundation on which to build data systems that serve the public interest.
We look forward to D4GX every year, and this edition did not disappoint!