Practical de-identification guide

From Responsible Data Wiki


Main Output

The actual output is the Basic De-identification Solution Matrix, an editable Google Spreadsheet that lists common variable types from the fields of health, education, finance, environment, and politics; de-identification solutions for these types of data; and suggestions about which forms of de-identification are most useful for each type of data.

This output complements the work done by the Responsible Data Risk Mapping group: to decide which data to de-identify, you first need to assess the harm that data could cause, and once that harm analysis is done, de-identification is one way to mitigate the harm.

Wishlist Output

The wishlist output of this group is a piece of software that would automate different de-identification functions: an individual could select the variables they'd like to de-identify and the method to use for each, and de-identification would then be carried out automatically.
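As a rough illustration of what such a tool might look like, here is a minimal Python sketch. It is not the group's software, which does not exist yet; every name in it (`deidentify`, `METHODS`, `plan`, the three method functions, and the sample record) is hypothetical and chosen only for this example. The user supplies a plan that maps each selected variable to a method, and the plan is applied to every record.

```python
import hashlib


def drop(value):
    """Suppress the value entirely (signalled by returning None)."""
    return None


def pseudonymize(value, salt="example-salt"):
    """Replace the value with a truncated salted SHA-256 digest.

    Note: this is pseudonymization, not anonymization. The same input
    always yields the same token, so records can still be linked.
    """
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:12]


def generalize_age(value, width=10):
    """Replace an exact age with a range such as '30-39'."""
    low = (int(value) // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical registry of available de-identification methods.
METHODS = {
    "drop": drop,
    "pseudonymize": pseudonymize,
    "generalize_age": generalize_age,
}


def deidentify(records, plan):
    """Apply the chosen method to each selected variable.

    `plan` maps variable names to method names; variables that are not
    in the plan are copied through unchanged. The input is not modified.
    """
    result = []
    for record in records:
        cleaned = dict(record)
        for variable, method_name in plan.items():
            if variable not in cleaned:
                continue
            new_value = METHODS[method_name](cleaned[variable])
            if new_value is None:
                del cleaned[variable]
            else:
                cleaned[variable] = new_value
        result.append(cleaned)
    return result


# Made-up sample data, for illustration only.
records = [{"name": "A. Person", "age": 34, "district": "North"}]
plan = {"name": "drop", "age": "generalize_age"}
cleaned = deidentify(records, plan)
```

Keeping the plan as plain data means someone without a data background could fill it in from a form or a spreadsheet row, while developers add new methods to the registry.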

Connection to previous RDFs


Intermediate Work Products

Variables to Watch out for When Anonymizing Data (Google Spreadsheet)
Types of Data Releasers: Organizations and Individuals (Google Spreadsheet)
Common Variables by Releaser and Field (Google Spreadsheet)
Day 1 Report-Back (Google Spreadsheet)


Matrix: Individuals without a data background making decisions about identification when releasing data; developers creating de-identification software

Next steps

Adding to the matrix (it's only partially filled in)
Making de-identification software?


Mary (other participants not listed due to lack of consent)