A visit to the Do Good Data conference
Corporations are becoming more and more aware of how the growing availability of data will affect and transform their business models. But data literacy and analytical capacity are key factors in the nonprofit world, too. One of the biggest conferences worldwide dedicated to how nonprofits can harness the power of data responsibly is the Do Good Data conference, which took place at Stanford University at the beginning of February. Our colleague Daniel Kirsch attended the event and shares some of his insights in this post.
Nonprofits do data differently
While working with nonprofits is not exactly idalab’s main focus, the cause is certainly close to our hearts. As a data science agency, idalab necessarily focuses on the opportunities of data and algorithms across various industries and sectors. In my free time, I also run a pro bono intermediary for data science, so the Do Good Data conference has always been a major source of inspiration for me, both personally and professionally.
In the previous two years, the conference took place in Chicago. However, since it recently merged with the Data on Purpose conference, this year’s event was hosted at Stanford University. The move also brought a slight change of focus. Previous Do Good Data conferences had more “lessons learned”, “what we have done” and “practical skills” sessions, which I really liked because they showed me the state of the art in data utilization at nonprofits.
At this year’s conference, one overarching topic ran through many of the sessions: How should nonprofits approach data and algorithms, as opposed to for-profits? What are ethical processes? Who needs to be included in decisions? What about the people represented in the data? What are models for governance?
Of course, not all questions could be answered, but places like Do Good Data / Data on Purpose are ideal melting pots to spark these discussions and carry them into the organizations that actually do the work.
Nonprofits must develop internal expertise
Not all sessions were purely meta, though. The breakout sessions in particular were very hands-on. I attended one that covered, among other things, the challenges of attracting data nerds to nonprofits and government. Not many want to trade an attractive salary for a more meaningful job. Volunteering models like those of DataKind and Code for America (which have German equivalents in DSSG Berlin/CorrelAid and Code for Germany) can kick-start projects, but for these projects to be sustainable, nonprofits and government must develop internal expertise.
However, to get started, it’s not that important to have the hardcore data nerds in-house. It’s much more important to have a person who can translate the nonprofit’s needs into technical solutions, someone who speaks both languages. This mirrors our own work at DSSG Berlin, where we started running data impact mapping workshops as a format to co-discover data opportunities with nonprofits that approach us with often fuzzy requirements. idalab has experienced the same situation even with corporates and startups, where “data opportunity workshops” are often the beginning of client contact.
Causal inference is hard
As a mathematician, I found my favorite talk was the one by Hal Varian (Google’s Chief Economist) on causal inference. He talked about randomized controlled trials (RCTs), natural experiments, instrumental variables, regression discontinuity designs and difference-in-differences, and did not shy away from showing some math. My main new learning: I had never seen the use of synthetic controls before, where a model is trained and evaluated on the population before a treatment is given. The outcome observed after the treatment is then compared to the counterfactual predicted by the model, and the gap between the two is the estimated effect.
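To make the idea of a model-based counterfactual more concrete, here is a minimal sketch in Python. It is not taken from the talk: the data is simulated, the function names (`fit_line`, `estimate_effect`, `simulate`) are made up for illustration, and a plain least-squares line on a single untreated control series stands in for the model. A proper synthetic control would typically combine several control units with constrained weights.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def estimate_effect(treated, control, treatment_start):
    """Train on the pre-treatment periods only, then compare the observed
    post-treatment outcome with the counterfactual the model predicts."""
    a, b = fit_line(control[:treatment_start], treated[:treatment_start])
    counterfactual = [a + b * c for c in control[treatment_start:]]
    observed = treated[treatment_start:]
    return mean(o - cf for o, cf in zip(observed, counterfactual))


def simulate(true_effect=5.0, n_periods=100, treatment_start=70, seed=0):
    """Hypothetical data: the treated series follows the control series
    plus noise, and jumps by true_effect once the treatment starts."""
    rng = random.Random(seed)
    control, level = [], 20.0
    for _ in range(n_periods):
        level += rng.gauss(0, 1)  # control series is a random walk
        control.append(level)
    treated = [2.0 + 0.8 * c + rng.gauss(0, 0.1) for c in control]
    for t in range(treatment_start, n_periods):
        treated[t] += true_effect
    return treated, control


treated, control = simulate()
print(f"estimated effect: {estimate_effect(treated, control, 70):.2f}")
```

The key design choice is that the model never sees post-treatment data during fitting, so its prediction can serve as a stand-in for what would have happened without the treatment.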
Another highlight was Kristian Lum of the Human Rights Data Analysis Group talking about bias in predictive policing. The HRDAG is one of my long-time favorite “Data for Good” initiatives (read more about them at https://hrdag.org/). The problem with predictive policing is that policing is concentrated where the most crime has historically been observed. This creates a pernicious feedback loop: those same areas are heavily policed, which leads to more reported crime (due to the streetlight effect), which is fed back into an algorithm that can’t help but double down on the crime-infested areas that produce the most reports. The algorithm becomes a self-fulfilling prophecy. (More on similar effects can be found in the book “Weapons of Math Destruction” by Cathy O’Neil.)
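The feedback loop can be illustrated with a deliberately simple toy model in Python. This is my own sketch, not HRDAG’s analysis: the function `simulate_patrols`, the `sharpness` parameter and all numbers are assumptions chosen for illustration. Two districts have exactly the same true crime rate, crime is only recorded where patrols are present, and patrols are allocated according to past reports.

```python
def simulate_patrols(initial_reports, true_crime_rate, n_rounds, sharpness=2.0):
    """Return the patrol share of each district for every round.

    All districts share the same true_crime_rate; only the number of
    past reports differs. sharpness > 1 means the allocation favors the
    district with more reports more than proportionally.
    """
    reports = list(initial_reports)
    history = []
    for _ in range(n_rounds):
        weights = [r ** sharpness for r in reports]
        total = sum(weights)
        shares = [w / total for w in weights]
        history.append(shares)
        # Crime is only recorded where officers are present to observe it.
        for i, share in enumerate(shares):
            reports[i] += true_crime_rate * share
    return history


history = simulate_patrols([11.0, 10.0], true_crime_rate=10.0, n_rounds=200)
print(f"patrol share of district 0: first round {history[0][0]:.2f}, "
      f"last round {history[-1][0]:.2f}")
```

In this toy model, a single extra report at the start is enough for district 0 to end up with almost all patrols, although both districts have identical crime. With `sharpness=1.0` (purely proportional allocation) the shares do not drift further apart, but the initial imbalance is never corrected either.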
The conference was, once again, a great experience, and the 15-hour flight to San Francisco was well worth it. It’s a pity that Germany cannot offer anything comparable so far. But organizations like DSSG Berlin, CorrelAid and the Datenschule can be the spark for the movement. And there is good news: The Datenschule will have its own event, called Datensummit, this year. Data on Purpose / Do Good Data will also make a stop in Berlin, under its new name and as part of the DIGITALIMPACT World Tour. I’m very much looking forward to these events and hope they can intensify the discussion around data in nonprofits here in Germany, too. The discussion is certainly much needed.