Practical de-identification techniques for advocacy initiatives and open data

From Responsible Data Wiki
Revision as of 16:27, 7 April 2015 by Kristin (Talk | contribs)

The last half of this year saw vigorous debates on the efficiency and reliability of de-identification techniques for large data sets. There remains disagreement about how effective and accessible different de-identification techniques are, how much risk is acceptable, and how feasible it is to use auxiliary data sets to re-identify individuals. These questions are important and should be discussed rigorously and at length by experts. But most people using data in the service of social good objectives are not experts: they are not statisticians or computer scientists, and many are not even very familiar with managing large data sets.

For organizations and advocacy initiatives leveraging data for transparency and accountability, the basic question remains: What can I do to make sure that the data I’m using cannot be used to harm people or violate their privacy?

It’s a simple question without a very simple answer, but one which needs to be posed and discussed at the expertise level of project managers and campaigners, not only security gurus.

To help normalize some of these concerns and provide an introduction to basic de-identification concepts and strategies, the Responsible Data Forum is hosting a 90-minute online discussion. A small group of practitioners will present basic de-identification strategies, such as perturbation, trimming and pseudonymization, discussing their limitations, the contexts in which they’re most appropriate, and the expertise and tools required to use them.
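
For readers who have not met these terms before, here is a rough sketch of what the three strategies can look like on a single record. It is only an illustration, not the approach the presenters will necessarily describe. The record, the field names, the secret key and the helper functions are all hypothetical, and "trimming" is taken here to mean dropping fields that are not needed, which is an assumption about how the term is used.

```python
import hashlib
import hmac
import random

# Hypothetical secret key; it would need to be stored separately from the released data.
SECRET_KEY = b"replace-with-a-long-random-secret"

def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash; the same input always gives the same pseudonym."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

def perturb(value: float, spread: float, rng: random.Random) -> float:
    """Add uniform random noise in [-spread, +spread] to a numeric value."""
    return value + rng.uniform(-spread, spread)

def trim(record: dict, drop_fields: set) -> dict:
    """Return a copy of the record without the listed fields (assumed meaning of 'trimming')."""
    return {k: v for k, v in record.items() if k not in drop_fields}

def deidentify(record: dict, rng: random.Random) -> dict:
    """Apply all three strategies to one record."""
    out = trim(record, {"phone"})                      # drop a field that is not needed
    out["name"] = pseudonymize(record["name"])         # swap the name for a pseudonym
    out["age"] = round(perturb(record["age"], 2, rng)) # blur the age by up to 2 years
    return out

# Made-up example record
record = {"name": "Jane Doe", "phone": "555-0100", "age": 34, "district": "North"}
released = deidentify(record, random.Random())
print(released)
```

None of these steps guarantees that a record cannot be re-identified. The remaining fields (here, district and approximate age) may still single someone out when combined with auxiliary data sets, which is exactly the kind of limitation the discussion is meant to cover.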

We’re currently coordinating with potential practitioners and subject matter experts. If you know someone who’d be a good fit, please pass this along or put us in touch.

Contact Kristin Antin, the engine room's Community Catalyst, if you're interested in joining this event: kristin [at] theengineroom [dot] org.

These hangouts will be public and open for anyone to join.

Instructions to join using Jitsi Meet:

  • You will need the Google Chrome internet browser. (This video hangout tool is browser-based, but is currently limited to Chrome, Chromium and Opera; Firefox support is coming soon.)
  • Jitsi Meet will want to use your video camera so be prepared! If you don't want to use video, you can put a sticker over the camera or turn that feature off *after* you join the meeting.
  • Plug in your headset with a microphone. This is the best way to ensure that we can hear you and you can hear others.
  • In Google Chrome, at the time of the hangout, go to the URL https://meet.jit.si/responsibledata
  • Note: We will be recording this hangout to share with a wider audience, so please do not share any personally identifiable information.

Join discussion and planning on this and other events: http://lists.theengineroom.org/lists/info/responsible_data