“There is a fine line between potential benefits of data science and data privacy”


The potential and scope of social uprisings have been significantly enhanced by the rise of the internet and its modern communication tools. At the same time, the increased connectedness of protesters makes them more vulnerable, as governments gain better leverage to collect intelligence. Governments can also use their power to force a complete network shutdown as a strategic tool to advance their military offensives. Anita Gohdes, currently a postdoc at the Harvard Kennedy School, showed this in her PhD thesis using the example of the Syrian civil war. In our interview with Anita, we discussed her research and the role of data science for human rights.

The hypothesis that sparked Gohdes’ research is grounded in the twofold character of the internet for social movements. Social networks and internet-based messaging apps are crucial for coordinating any large group of people, especially if freedom of expression is already hampered in public spheres. While this dependency carries huge benefits, it also makes movements more vulnerable. On the one hand, governments have a central point of access to collect intelligence. On the other hand – and this is where Gohdes’ research comes in – governments can use network shutdowns as a tool to advance their military goals by making it harder to coordinate counterattacks. As a significant number of such shutdowns were observed in Syria, Gohdes carried out comprehensive empirical research on whether more killings occur overall during network shutdowns.

“There is actually a lot of data on the number of casualties in Syria available.”

“In previous civil wars, real-time information about killings was oftentimes scarce. In the Syrian conflict, many organizations on the ground now use modern tools for reporting, and there is actually a lot of data on the number of casualties in Syria available, despite the severity of the conflict. However, this doesn’t mean we know about every event – many atrocities that are happening remain unreported”, says Gohdes, explaining the basis of her empirical work. Indeed, Gohdes showed that the number of documented killings in Syria increases significantly during network shutdowns. This holds especially in regions with active fighting between rebels and government forces. Interestingly, though, there is no evidence that the Syrian government used network shutdowns to cover up its military advances: documented killings already rise in the days before a shutdown. “This is in line with the general observation that the Assad regime didn’t try to cover up atrocities”, says Gohdes, putting her findings in context. While this lends substantial support to the claim that the Assad regime used network shutdowns deliberately, it does not mean that every increase in violence coincides with a network shutdown.

“Every time there is a network shutdown, violence does increase.”

Gohdes currently conducts postdoctoral research at Harvard University and also works as a consultant for the Human Rights Data Analysis Group (HRDAG), which was called upon by the United Nations to establish baseline enumerations of civilian casualties in the Syrian conflict. While data science is truly cross-disciplinary by nature, every field of application has its own specifics. Here are three things essential for understanding data science and human rights.

1. Data cleansing, aggregation and integration are key

Data science endeavors are bound to fail if the underlying data lacks granularity, coherence and comprehensiveness. Data sources are fragmented in any kind of data science project, and the situation is understandably even more complex when the subject of analysis is a civil war.

“Data work is essential, but in the Syrian setting extremely tough.”

“We worked with five different sources for documented fatalities, which were to some degree overlapping. Thus, a large part of our work actually revolves around the sound integration of these sources to create a robust dataset”, Gohdes says of her research. These sources included, among others, the Syrian Center for Statistics and Research (CSR-SY), the Syrian Network for Human Rights (SNHR) and the Violations Documentation Centre (VDC). “Defining sampling rules to effectively match the records was the nitty-gritty work needed to make the research possible”, elaborates Gohdes. In the context of the data value chain, data source identification and integration are crucial value-creation steps in human rights research.
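
To give a feel for what integrating overlapping sources involves, here is a minimal sketch in Python. It is not the procedure Gohdes or HRDAG used, which the interview does not describe in detail. The matching rule (same normalized name, same location, dates at most two days apart), the `Record` fields, the source labels and the sample records are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """One documented fatality as reported by a single source (hypothetical schema)."""
    source: str
    name: str
    location: str
    died_on: date


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation and whitespace so spelling variants compare equal."""
    return " ".join(re.sub(r"[\W_]+", " ", text.lower()).split())


def same_person(a: Record, b: Record, max_days: int = 2) -> bool:
    """Illustrative matching rule: same name and location, dates within max_days."""
    return (
        normalize(a.name) == normalize(b.name)
        and normalize(a.location) == normalize(b.location)
        and abs((a.died_on - b.died_on).days) <= max_days
    )


def link_records(records: list[Record], max_days: int = 2) -> list[list[Record]]:
    """Group records that refer to the same person, using union-find over pairwise matches."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Compare every pair; fine for a sketch, too slow for large datasets.
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if same_person(records[i], records[j], max_days):
                parent[find(j)] = find(i)

    clusters: dict[int, list[Record]] = {}
    for i, rec in enumerate(records):
        clusters.setdefault(find(i), []).append(rec)
    return list(clusters.values())


# Made-up placeholder records, not real data.
records = [
    Record("source_a", "Person One", "Homs", date(2012, 11, 29)),
    Record("source_b", "person  one", "homs", date(2012, 11, 30)),
    Record("source_c", "Person Two", "Aleppo", date(2012, 11, 29)),
]
clusters = link_records(records)
print(len(clusters), "unique individuals from", len(records), "records")
```

Real matching is much harder than this: names are transliterated inconsistently, dates and locations are often missing, and exact rules like the one above will both merge distinct people and miss true duplicates. That is why the rule definition is described as the nitty-gritty part of the work.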

2. For human rights advocacy, the current focus on data science can be a double-edged sword

Generally, more data about human rights violations, conflict zones and the like enables human rights organizations to raise awareness of their cause. However, the direct value of data science for human rights always has to be put into perspective. Research funding for human rights might give priority to projects that rely on quantitative methods and data collection. These methods, however, are often only applicable if sufficient data is available. “Consequently, human rights research runs the risk of focussing on those conflicts where data already exists. At the same time, other geographic areas might be neglected in terms of funding, as data might be hard to obtain and data-driven methods cannot be applied”, Gohdes explains. If research is only conducted in conflict areas with high data availability, perceptions can become biased. Such areas are likely to be those where the local population is highly connected and literate, which leaves other conflicts without comparable research and field work.

3. In light of current hype, focus on ethics remains vital

The holy grail of data science work in human rights research remains operational applicability. How can insight be leveraged by organizations on the ground to improve local conditions? In light of the current public focus on data science, these methods are often invoked as a universal solution. “Interestingly, there is little difference in what we do now as compared to a few years ago, but now people are starting to call it data science”, Gohdes remarks. And with increased public attention comes increased responsibility. Gohdes calls upon the data science community not to infringe on basic ethical standards. “From an ethical perspective, carrying out research on personalized data of vulnerable populations – take for example the refugee and migration crisis – can be a very sensitive issue. There is a fine line between potential benefits of data science and data privacy, which needs to guide research”, explains Gohdes. While she is convinced that data science holds great potential for human rights advocacy, that potential needs to be unlocked in a highly responsible fashion.

Additional reading:

There are notable efforts to advance ethical standards in the context of data science, for example the Responsible Data Forum, where researchers reflect on the challenges of using data in their projects. Interested readers might want to read the article “Recognising uncertainty in statistics”, which deals with Gohdes’ work on Syria.

Julian Beimes

