Open Data 2.0 (Emergent session): Open neuroimaging data and personal data privacy: convergence or divergence #70

jsheunis opened this issue Jun 16, 2020 · 2 comments

jsheunis (Contributor) commented Jun 16, 2020

Open neuroimaging data and personal data privacy: convergence or divergence

By:
Stephan Heunis
Emma Bluemke
Andrew Trask
Jonathan Passerat-Palmbach
PJ Toussaint
Abeba Birhane
Adam Thomas
Tonya White
Michael Beauvais
Gustav Nilsonne
Lyuba Zehl
Reubs J Walsh

  • Theme: Open Data 2.0
  • Format: Emergent session

Abstract

The virtues of openly sharing scientific data are clear, but sharing data derived from humans carries particular risks. In particular, such data may convey personally identifiable information that is protected under a range of legal regimes. How do we balance the need for easy sharing of data against the need to protect people's privacy? In this panel discussion we will take up the often conflicting demands of openness and privacy. We will consider the current status of data sharing practices with respect to data privacy and protection, and discuss whether current practices take sufficient care with personal data. While standards have improved (e.g. defacing is more common), are common practices good enough yet? In what ways do the twin ethical and ideological commitments to personal data privacy and open science conflict, and how can these conflicts be resolved? What compromises may prove necessary in seeking to both respect privacy and leverage the immense wealth of data to be shared? And what changes can we propose that would deliver a workable compromise and facilitate data sharing?

Specific talking points / questions (continuously updated):

  • Considering the deep institutional and ideological ties between systems of oppression and academic science, how might data from marginalised groups be subject to misuse in secondary analyses, and how can the (non)consent of participants be respected, given that they are unlikely to consent to certain uses of their data? What methodological correlates of this misuse, or associated questionable research practices (QRPs), can we identify? Can identifying these offer a path forward to opening data without risking its abuse or misuse?
  • Can (other) principles from (intersectional feminist) open science offer solutions to the problems presented by open data?
  • Is there an overview of existing MRI defacing algorithms and their efficacy in terms of anonymization? Do some anonymize "better" than others in some quantifiable way, e.g. can we assess the difficulty of re-identifying a person from supposedly anonymous data (see the sketch after this list for one naive starting point)? Can we ultimately say that such data are anonymous, or is that impossible? Must the data be strictly "fully anonymous" in order to share them?
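
As one naive, hedged starting point for the quantification question above: compare an original scan with its defaced counterpart and report how much of the volume was altered. The file names below are hypothetical placeholders, and the fraction of changed voxels is only a crude proxy for re-identification risk, not a validated metric.

```python
# Minimal sketch: quantify how much of a T1-weighted scan a defacing tool
# (e.g. pydeface) altered. File names are hypothetical placeholders.
import nibabel as nib

original = nib.load("sub-01_T1w.nii.gz").get_fdata()
defaced = nib.load("sub-01_T1w_defaced.nii.gz").get_fdata()

# Voxels zeroed out (or otherwise changed) by the defacing step.
changed = original != defaced
print(f"Voxels altered: {int(changed.sum())} "
      f"({100 * changed.mean():.2f}% of the volume)")
```

A real assessment would instead attempt face reconstruction or matching against a photo database, which is exactly the kind of evaluation the question above asks about.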

Useful Links

Public Mattermost channel for discussions prior to, during and after the session.

Tagging @jsheunis @tjhwhite @iamtrask @reubsjw @lzehl @jopasserat @GNilsonne @agt24

jsheunis (Author) commented

Hi everyone. I'm trying to isolate a few more concrete talking points so that we can cover good ground during what will probably feel like a very short discussion. I think we have quite wide representation here, so my idea is to structure a few questions around the distinct expertise or fields of the people in our group. Below are some ideas (in addition to the ones listed above by @reubsjw and myself). Note that these are all outside my field of expertise and comfort; I've just taken a shot at summarising some talking points, so please help me represent your work in the right light:

  • @iamtrask, @jopasserat, Emma, I propose talking about privacy-preserving tech/workflows/processing as a possible alternative to public data sharing. If systems are in place that let one run preprocessing, models, etc. on data that one never interacts with directly, risks to personal data privacy could be circumvented to some extent (a toy illustration follows this list). But then it is also important to consider and question the built-in (and exposed?) levels of validation that such a system allows the user. It would be interesting to dive into this on multiple levels: the technical infrastructure required for this to scale; the implications for brain imaging data; and the limitations and biases of such a system.

  • @lzehl @GNilsonne @agt24 @tjhwhite Perhaps we can focus on some of the challenges you've experienced when aiming to share brain research data, both when planning new studies / cohorts / repositories and when working with existing ones. What are the big showstoppers you've encountered, and how have you managed to overcome them? Or haven't you? It would also be exciting to hear what you consider wins / advancements in this domain, and which core areas you think we should be focusing on as a community.

  • Michael, I think your experience will be very helpful for basically all the points above, when we discuss things like GDPR interpretations, the balancing of risks, legal bases for processing data, etc. But beyond that, I would love to bring in some context from your experience in the field of genomics, to see whether as a community we can learn from some concrete examples.

  • @reubsjw and Abeba, I would also like to focus more on what you think current trends in data sharing and privacy protection mean for globally underrepresented folks, and for both informed and uninformed research participants. IMO we currently put the main focus on advancing tech, processes, legal frameworks, etc., but are we spending enough time (and early enough) considering the implications of what we do for marginalised people? Do the approaches taken in North America and Western Europe affect how research data and personal data privacy are regarded in the Global South? What should we be doing to counteract the bias embedded in tech when it comes to protecting individuals?
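
To make the first bullet's proposal concrete, here is a toy illustration of one privacy-preserving technique, the Laplace mechanism from differential privacy: release an aggregate statistic with calibrated noise instead of sharing row-level data. This is only a sketch of one idea in the space (frameworks for remote/federated analysis go much further), and the data, bounds, and epsilon below are illustrative assumptions, not recommendations.

```python
# Toy sketch of the Laplace mechanism: publish a noisy mean rather than the
# underlying records. All values here are hypothetical/illustrative.
import numpy as np

rng = np.random.default_rng(0)
ages = np.array([23.0, 31.0, 45.0, 28.0, 37.0])  # hypothetical participant ages

epsilon = 1.0              # privacy budget: smaller means more privacy, more noise
lower, upper = 18.0, 90.0  # assumed clipping bounds on each record
clipped = np.clip(ages, lower, upper)

# With bounded records, changing one participant shifts the mean by at most
# (upper - lower) / n, which calibrates the noise scale.
sensitivity = (upper - lower) / len(clipped)
noisy_mean = clipped.mean() + rng.laplace(scale=sensitivity / epsilon)
print(f"Differentially private mean age: {noisy_mean:.1f}")
```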

Again, these are my quick thoughts and they could do with a lot more expert input and improvement. Happy to hear your thoughts.

tjhwhite commented Jun 18, 2020 via email
