As part of Genomics England's Genomics Conversation initiative, the Progress Educational Trust (PET) – the charity that publishes BioNews – is producing a public event this month entitled 'With Great Genomic Data Comes Great Responsibility'.
The phrase 'with great power comes great responsibility' can be traced back centuries, but is probably best known from Spider-Man. In the superhero's first appearance in Marvel Comics in 1962, it was given as a salutary reminder for him to be careful about how he used his newly acquired superpowers.
It is now well established that genomic data has (super)power. What we do with this power is vitally important, given the growing number of high-profile projects that involve sequencing and studying our whole genomes – the complete set of genetic instructions contained in most cells in the body.
In the USA, the All of Us health research programme began recruiting participants this year. It plans to enrol at least one million people in longitudinal health-related research, which will involve a variety of data types including genomic data.
In Europe, the UK is one of 13 countries that signed a declaration this year committing to cross-border access to genomic data, with the aim of making one million whole genome sequences available to study by 2022 (see BioNews 945). This year will also see the completion of the 100,000 Genomes Project established by the UK Government – which has so far sequenced 66,443 genomes and counting – and the launch of a new national NHS Genomic Medicine Service.
This service will focus in the first instance on rare diseases and selected cancers, as part of a longer-term ambition to incorporate genomics into mainstream UK healthcare. As Professor Sir John Burn said in a recent interview on BBC Radio 4 (see BioNews 940): 'The term genomic medicine will disappear. It will just be medicine.'
Genomic data is formidable in size. An individual's genome consists of more than three billion base pairs (the 'letters' of the genome). A sequenced genome takes around 200 gigabytes to store digitally, meaning that the 100,000 Genomes Project is amassing more than 20 million gigabytes (20 petabytes) of data.
To help you understand just how much that is, one petabyte is equivalent to 20 million four-drawer filing cabinets filled with text. Or 13-plus years of high-definition video – that's nearly 60,000 movies (and a lot of popcorn). Multiply that by 20, and it totals 266 years (and a lot more popcorn).
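For readers who like to check the sums, the arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and constant names are made up for this sketch, petabytes are taken in decimal units (one million gigabytes), the figure of about 13.3 years of video per petabyte is inferred from the article's total of 266 years divided by 20, and a film is assumed to run for two hours.

```python
# Figures quoted in the article
GENOMES = 100_000            # target of the 100,000 Genomes Project
GB_PER_GENOME = 200          # approximate storage per sequenced genome

# Assumptions made for this sketch (not stated in the article)
GB_PER_PB = 1_000_000        # decimal units: 1 petabyte = 1,000,000 gigabytes
VIDEO_YEARS_PER_PB = 13.3    # "13-plus years", inferred from 266 years / 20
HOURS_PER_MOVIE = 2          # assumed running time of one film


def storage_estimate(genomes=GENOMES, gb_per_genome=GB_PER_GENOME):
    """Return (total gigabytes, total petabytes, equivalent years of HD video)."""
    total_gb = genomes * gb_per_genome
    total_pb = total_gb / GB_PER_PB
    video_years = total_pb * VIDEO_YEARS_PER_PB
    return total_gb, total_pb, video_years


def movies_per_petabyte():
    """Rough number of films in one petabyte's worth of HD video."""
    hours = VIDEO_YEARS_PER_PB * 365.25 * 24
    return hours / HOURS_PER_MOVIE


if __name__ == "__main__":
    total_gb, total_pb, video_years = storage_estimate()
    print(f"{total_gb:,} GB = {total_pb:g} PB, or about {video_years:.0f} years of video")
    print(f"About {movies_per_petabyte():,.0f} films per petabyte")
```

Under these assumptions the totals match the ones quoted above: 20 million gigabytes, 20 petabytes and roughly 266 years of video, with a little under 60,000 films per petabyte.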
Big Data is both an asset and a challenge. The more patients' genomes there are in a dataset, the greater the potential of that dataset to yield accurate insights into human disease and health, and the greater the importance of holding and using the data in a responsible way.
It isn't just the number of patients involved that makes genomic data powerful, but the ability to link it with other health-related data, including lifetime medical records. Extricating all this data from the silos in which it sits promises enormous benefits – a Parliamentary committee concluded recently that 'the data collected by the 100,000 Genomes Project, the Genomic Medicine Service and the wider NHS will constitute the best data resource for genomic medicine in the world'. But it also brings difficulties.
One challenge is working within the NHS 'data infrastructure', which involves the entire UK population of 65 million and an estimated 57,000 NHS bodies and providers. It encompasses everything from cutting-edge whole genome sequencing to the Paperless 2020 initiative to move beyond reliance on paper-based medical records and documents.
Many of us have personal experience of hospital letters and notes going missing, or of different parts of the NHS involved in our care not talking to one another. Seeking to link genomic data to comprehensive health records, while efforts are underway to improve the way these records are maintained and used, is not a simple undertaking.
But as our health data becomes centralised and standardised, there are rapidly advancing technologies that can help us sift through and make sense of it – algorithms, machine learning and artificial intelligence. The Prime Minister, Theresa May, discussed this last month when she announced that the UK would 'use data, artificial intelligence and innovation to transform the prevention, early diagnosis and treatment of diseases like cancer, diabetes, heart disease and dementia by 2030'.
Many different stakeholders have responsibilities regarding genomic data. This data must be collected, held and used in a responsible and competent way, so that its value to society can be realised. At the same time, there must be clarity around who can access it and to what end.
The responsibility to make best use of this data falls not only upon researchers and medical professionals but also upon the funders who make this work possible, and the public and private sector organisations that will help translate the research into clinical applications.
Then there are patients themselves. What responsibilities, if any, do patients have in relation to their genomic and other health data? And what of their family members? We share part of our genome with our blood relatives, and so our genomic data might have significance for their health (and vice versa).
In light of all this, we decided that it was PET's responsibility to work with Genomics England to organise a timely public discussion of these issues. We have a fabulous panel of speakers lined up for this free evening event on 26 June. Places are going fast, meaning that it's your responsibility to book your place soon by emailing firstname.lastname@example.org