Time to confront genomics data problem, say scientists

13 July 2015

By Chris Baldacci

Appeared in BioNews 810

Scientists have warned that the world of genomics is headed for a data bottleneck.

The team of maths and computer specialists found that the data generated by genomic studies will soon outstrip that of social media giants such as YouTube and Twitter. Even astronomy, a computationally demanding field, does not currently generate as much data as genomics, they report in PLOS Biology.

YouTube, the current leader in the field of data generation, has 100 petabytes (100 million gigabytes) of video uploaded to its servers every year - over a thousand times what the average home computer could store. By comparison, genomics is currently generating 25 petabytes a year but the rate at which the data is produced is doubling every seven months, mostly due to the refinement and falling costs of sequencing techniques.
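To make the growth figures above concrete, here is a rough back-of-the-envelope sketch. It assumes that genomics output keeps doubling every seven months from 25 petabytes a year and, purely for illustration, that YouTube's 100 petabytes a year stays flat. Neither assumption comes from the researchers, and the function names are hypothetical.

```python
import math

PB_PER_EB = 1000  # decimal units: 1 exabyte = 1,000 petabytes


def projected_rate(months, start_pb=25.0, doubling_months=7.0):
    """Yearly data rate (in PB) after `months` of steady doubling."""
    return start_pb * 2 ** (months / doubling_months)


def months_to_reach(target_pb, start_pb=25.0, doubling_months=7.0):
    """Months until the yearly rate grows from start_pb to target_pb."""
    return doubling_months * math.log2(target_pb / start_pb)


if __name__ == "__main__":
    # Assumption: YouTube's rate is held constant at 100 PB a year.
    print(f"Months to match 100 PB/year: {months_to_reach(100):.1f}")
    print(f"Months to reach 1 EB/year: {months_to_reach(PB_PER_EB):.1f}")
```

Under these simplified assumptions, two doublings (about 14 months) would take genomics from 25 to 100 petabytes a year, and an exabyte a year would follow in a little over three years. Real growth on either side is unlikely to be this regular.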

'As genome-sequencing technologies improve and costs drop, we are expecting an explosion of genome sequencing that will cause a huge flood of data,' said Professor Gene Robinson, director of the Carl R Woese Institute for Genomic Biology at the University of Illinois.

By 2025 it is estimated that up to two billion people will have had their genomes sequenced, meaning the level of genomic data could hit exabyte levels - or billions of gigabytes. This huge influx of data raises the problem not just of how to store it, but of how to acquire, distribute and analyse it. The researchers say that all four of these challenges must be tackled if we are to solve the 'genomics data problem'.

Professor Robinson said, 'Genomics will soon pose some of the most severe computational challenges that we have ever experienced.

'If genomics is to realise the promise of having a transformative positive impact on medicine, agriculture, energy production and our understanding of life itself, there must be dramatic innovations in computing. Now is the time to start.'

According to an editorial appearing this week in Nature, one such innovation could be a more collaborative use of cloud storage.

An international group of prominent researchers, headed by Dr Lincoln Stein, put out a call to the community to collectively fund a cloud computing network that would take the strain from private networks of individual institutions. The group argues that the challenge of accessing large datasets is blocking scientists' progress, particularly when it comes to building on or replicating previous work.

They propose that funding bodies should pay for large genomic datasets to be stored and accessed in the cloud, so that researchers can save time and money by not having to download the data or process it on local computers.

'We have now reached a stage where these data sets are too large to move around - cloud computing offers us the flexibility to hold the data in one virtual location and unleash the world's researchers on it all together,' said co-author Dr Peter Campbell, head of cancer genomics at the Wellcome Trust Sanger Institute.

SOURCES & REFERENCES
Ontario Institute for Cancer Research | 09 July 2015
PLOS Biology | 07 July 2015
Eurekalert (press release) | 07 July 2015
Nature | 09 July 2015
Yahoo News (PA) | 07 July 2015
Nature News | 07 July 2015
Washington Post | 07 July 2015

Published by the Progress Educational Trust