Data Risk Checker

From Responsible Data Wiki

Categorizing harm levels on knowledge assets to inform mitigation and protection

Connection to previous RDFs

This output builds upon (and diverges from) work done in the RDF on private sector data.

The Output



Three-step process.

We assume that risk checking takes place as the second stage of a three-step process:

  1. Data (and responsible data) literacy
  2. Risk checking
  3. Mitigation

Data literacy

To use the risk checking tool effectively, practitioners are assumed to understand the basic concepts and components of data, such as metadata, collection strategies, formats and data types (boolean, integer, geographic coordinates, etc.), and to be comfortable working with data wrangling tools such as spreadsheets.

Practitioners should also understand the core Responsible Data (talk to Niels, and Mary) principles that apply when collecting data that might pose risks to entities providing the data (data owners).


The risk checking tool only assesses the risks; it does not propose or recommend risk mitigation techniques. It is assumed that risk checking will be followed by a concrete risk mitigation phase that will be informed by the results of the risk checking.


The risk checking is always tailored towards the audience. Thus, it assumes that whoever is using it has a deep knowledge of the audience, its needs and risks. As a recommendation, the audience should always be included in the process.

Data is inherently unsafe

The overarching assumption throughout this process, borne out by recent events, is that data is always at risk of exposure. The risk checking process is not intended to communicate or build awareness of how to secure data. We recommend reading and implementing best practices for the collection, storage and dissemination of data.

Types of threats

We also assume that the person using the risk checking tool understands the basic concepts of digital and physical threats: the categories of threat, the power of information, and what threat modelling means and what it is for.

Types of harm

To keep the assessment manageable rather than exhaustive, we have broadly classified harms into three types:

  1. Physical Harm: Any harm that makes the owner of the data a direct target and causes physical damage.
  2. Psychosocial/Emotional Harm: Any harm that causes emotional or social damage to the owner of the data or their acquaintances.
  3. Economic Harm: Any harm that causes damage to personal and financial assets.

Process for generating a Responsible Data Risk Map

Types of Harm:

  • Psycho-Social / Emotional
  • Physical
  • Economic

1. Identify the Persons at Risk in the event of exposure

Definition of Persons at Risk: Any entity at risk of being harmed by the exposure. This is therefore not restricted to the data owner or collector.

2. Identify Knowledge Assets that can be extracted from the data collected

Definition of Knowledge Assets: Discrete data points, information extracted from collections of discrete data points, information extracted from meta-analysis of data points, and information extracted from the mashup of the collected data and external data sources.

3. Evaluate the importance of each knowledge asset to the campaign

The importance is used in combination with Risk assessment to determine what data to collect. Importance is rated on this scale:

  • Low Importance: knowledge assets that have little or no relevance to the success of the campaign
  • High Importance: knowledge assets that have significant relevance to the success of the campaign
  • Must Have: knowledge assets that are crucial to the success of the campaign
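The importance scale above can be recorded alongside each knowledge asset so that it is available when deciding what to collect. The following is a minimal sketch, not part of the tool itself: the `Importance` and `KnowledgeAsset` names and the example assets are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Importance(IntEnum):
    """Importance scale from step 3, ordered so ratings can be compared."""
    LOW = 1        # little or no relevance to the success of the campaign
    HIGH = 2       # significant relevance to the success of the campaign
    MUST_HAVE = 3  # crucial to the success of the campaign


@dataclass(frozen=True)
class KnowledgeAsset:
    name: str
    importance: Importance


# Hypothetical assets, for illustration only.
assets = [
    KnowledgeAsset("respondent home location", Importance.LOW),
    KnowledgeAsset("polling station visited", Importance.MUST_HAVE),
    KnowledgeAsset("time of report", Importance.HIGH),
]

# List assets from most to least important to the campaign.
ranked = sorted(assets, key=lambda a: a.importance, reverse=True)
for asset in ranked:
    print(f"{asset.importance.name:<9} {asset.name}")
```

Using an ordered enumeration is only one option; it makes it easy to compare an asset's importance with the risk ratings assigned in the next step.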

4. For each Type of Harm:

For each knowledge asset and each Person at Risk, evaluate the probability and the severity of harm for that Type of Harm.

Probability of Harm:

  • Low - Assessed as less than 50% probability of harm
  • High - Assessed as 50% or more probability of harm

Severity of Harm:

  • Low - Assessed as causing little to no harm to the Person at Risk
  • High - Assessed as causing moderate to severe harm to the Person at Risk
  • No Go - Assessed as causing catastrophic harm to the Person at Risk
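The two scales above can be sketched in code as follows. This is an illustration under assumptions: the function name `rate_probability`, the idea of starting from a numeric estimate, and all example entries are hypothetical. Severity remains a judgement made by the assessor and is entered directly.

```python
from enum import Enum


class Probability(Enum):
    LOW = "Low"
    HIGH = "High"


class Severity(Enum):
    LOW = "Low"
    HIGH = "High"
    NO_GO = "No Go"


def rate_probability(estimate: float) -> Probability:
    """Map an estimated probability of harm (0.0 to 1.0) onto the two-level scale."""
    if not 0.0 <= estimate <= 1.0:
        raise ValueError("estimate must be between 0.0 and 1.0")
    # 50% or more is rated High; anything below is rated Low.
    return Probability.HIGH if estimate >= 0.5 else Probability.LOW


# One evaluation per (type of harm, person at risk, knowledge asset).
# All entries are hypothetical.
evaluations = {
    ("Physical", "survey respondent", "home location"):
        (rate_probability(0.2), Severity.NO_GO),
    ("Economic", "survey respondent", "employer name"):
        (rate_probability(0.6), Severity.HIGH),
    ("Psychosocial/Emotional", "data collector", "interview notes"):
        (rate_probability(0.1), Severity.LOW),
}

for (harm, person, asset), (prob, sev) in evaluations.items():
    print(f"{harm} | {person} | {asset}: probability={prob.value}, severity={sev.value}")
```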

The output of this process is a high-level score for each Person at Risk, with detailed matrices for each Type of Harm as supporting documentation.
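One way to produce this output is sketched below. The source does not define how the individual ratings are combined into a high-level score, so the rule used here (take the worst cell, comparing severity first and probability second) is an assumption, as are all names and example rows.

```python
from collections import defaultdict

# Rank order for comparison; higher is worse.
SEVERITY_RANK = {"Low": 0, "High": 1, "No Go": 2}
PROBABILITY_RANK = {"Low": 0, "High": 1}

# Hypothetical rows:
# (type of harm, person at risk, knowledge asset, probability, severity)
rows = [
    ("Physical", "survey respondent", "home location", "Low", "No Go"),
    ("Economic", "survey respondent", "employer name", "High", "High"),
    ("Economic", "data collector", "travel route", "High", "Low"),
    ("Physical", "data collector", "travel route", "Low", "High"),
]

# Detailed matrices: one per Type of Harm, keyed by (person, asset).
matrices = defaultdict(dict)
for harm, person, asset, probability, severity in rows:
    matrices[harm][(person, asset)] = (probability, severity)

# ASSUMPTION: the high-level score is the worst cell for that person,
# comparing severity first and probability second.
scores = {}
for harm, person, asset, probability, severity in rows:
    candidate = (SEVERITY_RANK[severity], PROBABILITY_RANK[probability])
    current = scores.get(person)
    if current is None or candidate > current[0]:
        scores[person] = (candidate, severity, probability)

for person, (_, severity, probability) in scores.items():
    print(f"{person}: severity={severity}, probability={probability}")
```

A worst-case rule is deliberately cautious: a single No Go rating dominates the score for that Person at Risk regardless of how the other cells are rated.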

RISK PROFILES of data collectors and data owners

  1. Severe/catastrophic risk: Clear, present, very high probability, direct threat with catastrophic impacts that cannot be mitigated. Severe risks include denial of civic rights, detainment, imprisonment, disabling physical injury, or death.
  2. High risk: Clear, present or future, high probability, direct or indirect threat with medium to high impacts. High risks include denigration, exclusion, restricted access to civic rights, psychosocial distress, social stigma, loss of reputation, loss of livelihood, economic deprivation, and moderate to severe physical injury with temporary or permanent effects on basic life functions. High-risk threats also include organizational infiltration, personal intimidation, persecution, harassment, targeting for rights violations, and organizational or team breakdown.
  3. Low/moderate risk: Clear, present or future, low to medium probability, direct or indirect threat with low to moderate impacts. Risks with low to moderate impacts include verbal aggression, temporary psychosocial distress, temporary economic deprivation, discrediting, or temporary organizational or team breakdown.

Mitigation/Safety planning

Responsible data practices require safety planning. This identifies actions you can take to address threats to data collectors and data owners. Questions that may help formulate your plan include:

  • What risks can be eliminated entirely?
  • Which risks can be mitigated?
  • Based on their likelihood and significance, which risks should be addressed first?
  • How can those risks be mitigated?
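The questions above can be turned into an ordered worklist. The sketch below is one possible approach, not a prescribed method: it assumes that significance (severity) outranks likelihood when deciding what to address first, and every risk listed is hypothetical.

```python
# Hypothetical risks identified during risk checking:
# (description, probability, severity, can_be_eliminated)
risks = [
    ("names stored in shared spreadsheet", "High", "High", True),
    ("GPS coordinates in photo metadata", "Low", "No Go", True),
    ("verbal aggression towards collectors", "High", "Low", False),
    ("device seizure at checkpoint", "Low", "High", False),
]

# Lower numbers sort first.
SEVERITY_ORDER = {"No Go": 0, "High": 1, "Low": 2}
PROBABILITY_ORDER = {"High": 0, "Low": 1}

# ASSUMPTION: severity outranks probability when ordering the work.
worklist = sorted(
    risks,
    key=lambda r: (SEVERITY_ORDER[r[2]], PROBABILITY_ORDER[r[1]]),
)

for description, probability, severity, can_eliminate in worklist:
    action = "eliminate" if can_eliminate else "mitigate"
    print(f"[{severity}/{probability}] {action}: {description}")
```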

It is assumed that data collectors and data owners will not be able to address all threats at once. They should be prepared to schedule work on risk-of-harm assessment as well as safety planning alongside project design, implementation, monitoring and evaluation activities, and across the data lifecycle. Risk assessment and safety planning should be repeated as changes come about in the project context or population of data collectors or owners. Safety plan implementation should be monitored for needed adjustments to the plan for different profiles of data collectors or data owners.


Be inclusive in your planning. A practitioner's own or participants' risks may depend on other people's habits. Having confidential discussions about organizational safety policies and practices is important.

Be judicious with permissions and access to digital data, software or hardware: Does everyone in the office have access to all the data or devices in that office? Should they?
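A simple access review can help answer these questions. The following sketch compares who currently has access with who needs it. The dataset names, people, and both records are hypothetical; in practice they would come from your own systems and role descriptions.

```python
# Hypothetical record of who currently has access to each dataset.
current_access = {
    "raw_interviews": {"amira", "ben", "chen", "dana"},
    "anonymised_summary": {"amira", "ben", "chen", "dana"},
}

# Hypothetical record of who needs each dataset for their role.
needed_access = {
    "raw_interviews": {"amira"},
    "anonymised_summary": {"amira", "ben", "chen", "dana"},
}

# Anyone with access they do not need is a candidate for review.
excess = {
    dataset: sorted(people - needed_access.get(dataset, set()))
    for dataset, people in current_access.items()
}

for dataset, people in excess.items():
    if people:
        print(f"{dataset}: review access for {', '.join(people)}")
```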

Use cases for validation and testing

  • NAZRA for human rights: piloting the tool with their forthcoming data collection process
  • Zasto Ne: testing the tool against election monitoring data

Next steps

  • Development of a spreadsheet that automatically maps and colours content according to input, and creates charts and visualizations of the broad picture to assist with decision making
  • Having a few people test this out, and document some case studies (without real data)
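The colour-mapping idea in the first item above could start from something like the sketch below. It is only an illustration: the traffic-light palette, the column names, and the sample rows are all assumptions, and no real data is used.

```python
import csv
import io

# ASSUMPTION: a simple traffic-light palette; the source does not specify colours.
COLOURS = {"Low": "#8BC34A", "High": "#FF9800", "No Go": "#F44336"}

# Hypothetical input, as it might be exported from a spreadsheet.
raw = """person,asset,severity
survey respondent,home location,No Go
survey respondent,employer name,High
data collector,travel route,Low
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Attach a colour to each row based on its severity rating.
coloured = [{**row, "colour": COLOURS[row["severity"]]} for row in rows]

# Count ratings to give the broad picture a chart could be drawn from.
summary = {}
for row in rows:
    summary[row["severity"]] = summary.get(row["severity"], 0) + 1

for row in coloured:
    print(row["colour"], row["person"], row["asset"], row["severity"])
print(summary)
```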


Darko Brkan, founder, Zasto Ne

Jennifer Schulte, researcher

Mahy Hassaan, campaign and ad-hoc coordinator, NAZRA for feminist studies

Sajjad Anwar, software developer

Tin Geber, project manager, the engine room

Zack Halloran, director, Crowdmap

Food for thought

  • concepts, problems
  • questions to ask frequently
  • preventions: what do you actually do in concrete terms to prevent these things from happening
  • reactions: responsible responses for when things go wrong

Resources (we <3 links!)

Frontline Defenders, Digital Rights and Security for Human Rights Defenders