Privacy Challenges in Genomic Medicine

Project summary

The rapidly declining cost of genomic sequencing promises many breakthroughs in understanding the genetic predisposition to disease and in developing medical treatments more precisely tailored to the individual patient. Much of this genomic data will end up in databases maintained by research and healthcare organisations (and increasingly by commercial ‘personal genomics’ companies), which will bear the ethical and legal responsibility for preserving the privacy of such sensitive information. Unfortunately, recent research suggests that preserving that privacy is much more difficult than was first imagined. Many existing methods for ‘de-identifying’ or ‘anonymising’ such data have been shown to be fragile: correlation of information from genomic databases, electronic health records and public sources such as genealogy and residence databases can often lead to surprisingly accurate inferences about the identities of individuals. If such information were to become widely available, it might compromise the ability of individuals to obtain health and life insurance, and might influence employment and even personal relationship decisions. Such information leakage might well also have a significant chilling effect on the public’s willingness to participate in research and clinical studies. With this in mind, the focus of this project was to organise a series of seminars to examine the current state of information privacy in this domain, and to look in particular at several questions:

  • To what extent can technology keep up with the arms race between ‘hackers’ and data curators? Will recent advances in cryptography, database security architectures and ‘privacy preserving’ data mining methods mitigate the risks, now and in the future?
  • What is the current state of legislation and regulation in this domain, and how is it likely to evolve in the face of developing attacks on privacy? Who actually owns and has control over genomic (and related health) data and its uses? Are there significant national and cultural differences which need to be taken into account (especially when data storage may transcend jurisdictional boundaries e.g. when data are stored in commercial ‘clouds’)?
  • To what extent does the appearance of patient-centric disease management portals such as PatientsLikeMe mitigate the concerns about privacy? Will patients’ altruistic urge to share information about themselves, their disease, and their interactions with the healthcare system outweigh their concerns about their personal privacy? What is the appropriate balance between the public good which results from data sharing and the potential private loss?
  • What changes need to be made to informed consent protocols to ensure that both researchers and donors fully understand and accept the risks associated with data collection and use?
  • If, as Scott McNealy (former CEO of Sun Microsystems) once said, ‘Privacy is dead – get used to it,’ and privacy is doomed to lose the arms race, what is the impact likely to be on public attitudes towards, and expectations of, personal genomic privacy? In a world where people are willing to commit intimate personal information to Facebook, should we even worry about the consequences of loss of genomic privacy? Or should we rather be addressing the issues inherent in completely open sharing of such information?

Answers to some or all of the above questions would have a profound impact on the practice of scientific research and medicine. A clear analysis of the risks, of methods for mitigating those risks, and, alternatively, of the consequences of a deliberate policy of transparency, will help policy makers to develop realistic approaches to educating the public about personal genomic information and to setting guidelines for future research on, and exploitation of, that information.

The main output of this BII project was a seminar series, with details and video recordings of the events available here.  The seminars were attended by a wide range of participants from many different disciplines, and explored many aspects of the privacy challenges of genomic medicine including:

  1. Policy challenges (Freeman, Taylor, Caldicott, Brownsword): how adequate are current statutes and regulations to address the new challenges? There was general agreement that the current regulatory frameworks in the UK, EU and the USA are not adequate and will require rapid evolution to cope with changes in technology. The NHS (Caldicott) has been in the vanguard of such efforts for many years, but the centralised collection in the UK of data from hospitals and GPs (the “” initiative) has recently created substantial controversy, and has highlighted the practical difficulties in managing large centralised collections of personally-identifiable information. The fear is that practice is getting ahead of the regulators and may cause a public backlash which will have a chilling effect on research use of such data.
  2. Models of Informed Consent (Wilbanks, Kaye, Caldicott): there was general agreement that existing models of informed consent are inadequate to address the new kinds of scientific and clinical studies which genomics makes possible. New, so-called “dynamic models” (Kaye), which are not “one-off” but can respond to rapidly evolving questions, seem better suited than traditional models. Even more radical models, which essentially involve individuals giving consent for any reasonable use (Wilbanks) and lead to the creation of large-scale “data commons”, were also explored. Commercial “personal genomics” companies such as 23andMe are in the vanguard of data collection under such models, but the US FDA has recently expressed concern about the use of data in this way. An important concept being developed is that patients who have given such consents should continually be kept informed of the use being made of their data, leading to the idea of “consent by accountability” described by Mayer-Schoenberger and Cukier in their book “Big Data” (Murray 2013). Other authors such as Jaron Lanier (“Who Owns the Future?” Simon & Schuster 2013) have even gone so far as to suggest that individuals should receive “micro-payments” for every instance when use is made of their data.
  3. Ethical challenges (Taylor, Parker, Kaye, Papoutsi): genomic information raises a number of issues over and above the usual ones in medicine. In particular, the fact that an individual’s genome is largely shared with his parents, siblings and children means that the concept of “ownership” of such information becomes fuzzy. What rights does an individual have to give consent to the use of information which in some sense also belongs to others, and when such use may infringe on others’ right to privacy? Similar issues arise in the sharing of information between different elements of the healthcare delivery system. What rights should healthcare workers have to share information with each other when not all the “owners” have given consent, or the sharing strains the boundaries of consented uses? Papoutsi has explored the inherent tension between effective information sharing and concerns to maintain privacy.
  4. Technical challenges: several speakers (Hubbard, Smart, Wright) explored the potential use of recent advances in cryptography and cybersecurity to protect personal genomic data. Recent, well-publicised incidents have revealed how difficult it is to truly “anonymise” personal data – surprisingly rich inferences about individuals can be made by comparing anonymised data with other, publicly-available sources. The consensus seemed to be that realising the potential of genomic data while preserving personal privacy will require the design of “safe haven” (Hubbard, Caldicott) software systems so that data are not “published” in the usual sense, but rather distinct queries can be sent to the data repositories, where the credentials of the person making the query, and its purpose, are validated, and reference made to the consent available for use of those data for that particular purpose.
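
The “safe haven” idea described under the technical challenges can be illustrated with a small sketch. This is not a system presented at the seminars; it is a minimal illustration, assuming hypothetical names (SafeHaven, Query, count_carriers) and invented sample data, of a repository that answers queries instead of publishing records. It checks the credentials of the person asking, checks the declared purpose, and uses only data from donors whose consent covers that purpose.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Query:
    researcher_id: str   # who is asking
    purpose: str         # declared purpose of the query
    variant: str         # variant whose carriers should be counted


@dataclass
class SafeHaven:
    """Hypothetical repository: data stay inside, only answers leave."""
    accredited: set          # researcher ids whose credentials are validated
    approved_purposes: set   # purposes the repository is willing to serve
    consents: dict           # donor id -> set of purposes consented to
    genotypes: dict          # donor id -> set of variants (never returned)
    audit_log: list = field(default_factory=list)

    def count_carriers(self, query: Query) -> int:
        # 1. Validate the credentials of the person making the query.
        if query.researcher_id not in self.accredited:
            self.audit_log.append((query.researcher_id, query.purpose, "refused"))
            raise PermissionError("researcher credentials not validated")
        # 2. Validate the declared purpose.
        if query.purpose not in self.approved_purposes:
            self.audit_log.append((query.researcher_id, query.purpose, "refused"))
            raise PermissionError("purpose not approved")
        # 3. Use only donors whose consent covers this purpose.
        eligible = [d for d, purposes in self.consents.items()
                    if query.purpose in purposes]
        # 4. Record the use, so that donors can later be told about it.
        self.audit_log.append((query.researcher_id, query.purpose, "answered"))
        # 5. Return an aggregate answer, not the underlying records.
        return sum(1 for d in eligible
                   if query.variant in self.genotypes.get(d, set()))


# Invented sample data, for illustration only.
haven = SafeHaven(
    accredited={"r-001"},
    approved_purposes={"cancer research", "cardiology"},
    consents={"d1": {"cancer research"},
              "d2": {"cancer research", "cardiology"},
              "d3": set()},
    genotypes={"d1": {"variant-A"}, "d2": {"variant-A"}, "d3": {"variant-A"}},
)
print(haven.count_carriers(Query("r-001", "cancer research", "variant-A")))  # 2
```

Donor d3 carries the variant but has consented to nothing, so is excluded from the count. The audit log is a first step towards keeping donors informed of the use made of their data. A real system would need considerably more than this, for example suppressing answers based on very small groups, since aggregate counts can themselves leak information about individuals.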

Overall, the consensus seemed to be that substantial, and rapid, evolution of our current information governance policies and practices along all of these dimensions will be required if genomic information is to be used effectively for both research and health care delivery without raising public alarm. The risk is a regulatory “backlash” which would in turn negate much of the value of the information being generated by increasingly sophisticated and lower-cost sequencing technologies and computational analysis methods. Position papers from each of the speakers are currently being developed, and will be published in due course.


Lead investigator

Dr Arthur Thomas, Oxford Internet Institute and Balliol College, University of Oxford 

Research team

Dr Lisa Walker, Division of Medical Sciences and Balliol College, University of Oxford

Dr Martin Burton, Director of UK Cochrane Centre, Division of Medical Sciences and Balliol College, University of Oxford

Professor Bill Dutton, Oxford Internet Institute and Balliol College, University of Oxford

Professor Ralph Schroeder, Oxford Internet Institute, University of Oxford

Dr Eric Meyer, Oxford Internet Institute, University of Oxford

Contact details for enquiries

Please email the lead investigator, Dr Arthur Thomas, for all queries regarding this project.