The Berkman Center is pleased to announce a new publication from the Privacy Tools for Sharing Research Data project, authored by a multidisciplinary group of project collaborators from the Berkman Center, MIT Libraries, and Harvard's Center for Research on Computation and Society. This article summarizes research exploring various models by which governments release data to the public and the interventions in place to protect the privacy of individuals in the data. Applying concepts from the recent scientific and legal literature on privacy, the authors propose a framework for a modern privacy analysis and illustrate how governments can use the framework to select appropriate privacy controls that are calibrated to the specific benefits and risks in individual data releases.
Governments are under increasing pressure to promote transparency, accountability, and innovation by making the data they hold available to the public. Because the data often contain information about individuals, agencies rely on various standards and interventions to protect privacy interests while supporting a range of beneficial uses of the data. However, there are growing concerns among privacy scholars, policymakers, and the public that these approaches are incomplete, inconsistent, and difficult to navigate.
This article provides a survey of practices for releasing data in response to freedom of information and Privacy Act requests, traditional public and vital records, official statistics, and e-government and open government initiatives. This review yields a number of findings:
Governments rely on a narrow set of tools to analyze and mitigate privacy risks, compared to the wide range of privacy interventions proposed by computer scientists, legal scholars, and social scientists.
Most agencies address privacy concerns by withholding or redacting records that contain certain pieces of information considered to be directly or indirectly identifying based on an ad hoc balancing of interests.
Treatment of data across government actors is largely inconsistent. Similar privacy risks (and, in some cases, even identical sets of data) are addressed differently by different actors.
Guidance on interpreting regulatory standards for privacy and implementing appropriate methods for privacy protection in specific data release settings is very limited.
In light of these findings, this article proposes a framework for a modern privacy analysis informed by recent advances in data privacy from disciplines such as computer science, statistics, and law. Modeled on an information security approach, this framework characterizes and distinguishes between privacy controls, threats, vulnerabilities, and utility at each stage of the information lifecycle. In characterizing privacy controls, the article catalogs a range of procedural, economic, educational, legal, and technical interventions for protecting privacy and describes recent advances in each of these areas.
The authors argue that changes in science, technology, and the understanding of privacy risks offer the opportunity for sophisticated characterization of privacy risks and harms, as well as new tools for protecting privacy. Governments can now select from a distinct set of interventions to construct a comprehensive policy based on the desired data uses, the expected benefits, and the privacy threats and vulnerabilities associated with a data release. The article seeks to lay the groundwork for the future design of such policies by sketching the contours of a comprehensive analytical framework and populating selected portions of its contents. By applying the framework to two real-world examples of government data releases, the article also illustrates how the framework can inform the selection of suitable privacy controls, promote data releases that support varied uses while protecting privacy, and provide a natural foundation for increased transparency through documentation of the uses, the potential risks, and the privacy and security interventions selected.
This article, and other papers from the 19th Annual BCLT/BTLJ Symposium: Open Data: Addressing Privacy, Security, and Civil Rights Challenges, were published in Volume 30, Issue 3 of the Berkeley Technology Law Journal.
Microsoft Corporation, in collaboration with the Berkeley Center for Law & Technology, supported the research and the writing of this report. In addition, this material is based upon work supported by the National Science Foundation under Grant No. CNS-1237235, the Ford Foundation, and the John D. and Catherine T. MacArthur Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of Microsoft Corporation, the Berkeley Center for Law & Technology, the National Science Foundation, the Ford Foundation, or the John D. and Catherine T. MacArthur Foundation.
About the Privacy Tools for Sharing Research Data Project
Funded by the National Science Foundation and the Alfred P. Sloan Foundation, the Privacy Tools for Sharing Research Data project is a collaboration between the Berkman Center for Internet & Society, the Center for Research on Computation and Society (CRCS), the Institute for Quantitative Social Science, and the Data Privacy Lab at Harvard University, as well as the Program on Information Science at MIT Libraries, that seeks to develop methods, tools, and policies to facilitate the sharing of data while preserving individual privacy and data utility.
Executive Director and Harvard Law School Professor of Practice Urs Gasser leads the Berkman Center's role in this exciting initiative, which brings the Center's institutional knowledge and practical experience to bear on the legal and policy-based issues in the larger project.