The UK Biobank is a longitudinal study following 500,000 volunteers from the UK, collecting data on health-related, biochemical, and lifestyle factors. It has also collected genetic information to enable unbiased associations to be made between health and genetic make-up. The first 200,000 genomes to be sequenced have now been made available to researchers via the online, cloud-based 'Research Analysis Platform.'
Dr Michael Dunn, director of Discovery Research at the Wellcome Trust, one of the partner organisations that oversaw and funded the sequencing, said: 'The release of the first 200,000 whole genome sequences is a tremendous achievement, not only for UK Biobank, but also for the sequencing partners, deCODE Genetics and the Wellcome Sanger Institute. The integration of the sequences with the other characteristic data sets from participants will create a powerful resource to enable major discoveries that will benefit health outcomes.'
The participants in the study were between 40 and 69 years of age when they were recruited in 2007. Genetic information was initially collected in the form of single nucleotide polymorphism genotypes (released in 2017), followed by exome sequencing (released in 2019), and now whole genome sequencing. Whole genome sequencing of participants was funded by an industry consortium (made up of Amgen, AstraZeneca, GSK and Johnson & Johnson), the Wellcome Trust, and UK Research and Innovation. The remaining 300,000 sequences are expected to be released in 2023.
It is hoped that whole genome sequencing, which (unlike whole exome sequencing) analyses all of a person's genome, including the non-coding parts, will allow associations to be made between health and non-coding variation in the genome. This has so far remained an underexplored area of study, especially at this scale. Non-coding regions make up around 98 percent of the genome.
Major strengths of this study are not only its unprecedented size, but also that the results are freely and globally available to researchers. 'In terms of availability and data quality, [UK Biobank] surpasses all others,' physician and statistician Dr Omar Yaxmehen Bello-Chavolla, of the National Institute for Geriatrics in Mexico City, told Science magazine. 'This type of data availability is crucial for researchers in low- and middle-income countries,' he said.
Professor Sir Rory Collins, principal investigator at UK Biobank, said: 'The whole genome sequencing project will make UK Biobank the most detailed genomics database in the world and by sharing these data with the global research community our aim is to enable breakthroughs in understanding, diagnosis, prevention and treatment strategies for a range of common and life-threatening diseases.'
One limitation of the study is its lack of ethnic diversity, as the majority of UK Biobank participants are of European ancestry. Important associations between genetic make-up and disease may therefore be missed, potentially exacerbating existing healthcare disparities.