In 1990, an ambitious international collaboration launched the Human Genome Project, which sought to sequence the base pairs that make up human DNA. Although the project was declared complete in 2003, a few sections of the human genome, including the centromeres that join the two arms of each chromosome, remained unsequenced.
'You're just trying to dig into this final unknown of the human genome. It's never been done before and the reason it hasn't been done before is because it's hard', Dr Karen Miga from the University of California, Santa Cruz, who co-led the international Telomere-to-Telomere (T2T) consortium, told STAT News.
The difficulty of elucidating the last unknown sequences of the human genome appears to have been overcome through advances in sequencing technology. The gaps in previous drafts of the human genome were typically in areas where the DNA sequence is made up of large repeating patterns of bases. Earlier sequencing technologies took small fragments of DNA, decoded them, and finally reassembled the resulting sequences. However, DNA fragments consisting of repeating patterns are indistinguishable from one another and thus could not be accurately pieced together.
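The repeat problem can be illustrated with a toy example. The sketch below is not the consortium's method: the two sequences, the read lengths and the helper name `all_reads` are invented for illustration. It shows that two sequences differing only in how many times a three-base unit repeats yield exactly the same collection of short fragments, while fragments long enough to span the repeat tell them apart.

```python
def all_reads(genome: str, length: int) -> set[str]:
    """Return every distinct fragment of the given length (hypothetical helper)."""
    return {genome[i:i + length] for i in range(len(genome) - length + 1)}

# Two made-up sequences that differ only in the number of "CAG" repeats.
genome_a = "GATTACA" + "CAG" * 5 + "TTGACC"
genome_b = "GATTACA" + "CAG" * 8 + "TTGACC"

# Short fragments: both sequences produce the same set of 6-base reads,
# so the repeat count cannot be recovered from them.
short_reads_match = all_reads(genome_a, 6) == all_reads(genome_b, 6)

# Long fragments: 20-base reads can span the shorter repeat together with
# its flanking sequence, so the two sets of reads now differ.
long_reads_match = all_reads(genome_a, 20) == all_reads(genome_b, 20)

print("short reads identical:", short_reads_match)
print("long reads identical:", long_reads_match)
```

Real assemblers also use how often each fragment is seen, but uneven coverage makes those counts unreliable for long repeats, which is one reason longer reads help.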
Using advanced technology developed by Pacific Biosciences and Oxford Nanopore, the T2T consortium was able to sequence the missing portion of the genome. The new technologies no longer require DNA to be broken into small fragments for later reassembly. Instead, the Oxford Nanopore technology pulls the DNA through a nanoscopic hole, producing much longer continuous reads. Alternatively, the laser-based technology developed by Pacific Biosciences can read the same sequence of DNA repeatedly to get a far more accurate reading.
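The benefit of reading the same stretch of DNA several times can be sketched as a simple majority vote. This is a simplified illustration, not Pacific Biosciences' actual algorithm: it assumes every pass has the same length and contains only substitution errors, and the sequences and the function name `consensus` are made up.

```python
from collections import Counter

def consensus(passes: list[str]) -> str:
    """Take the most common base at each position across repeated passes."""
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*passes)
    )

# Hypothetical true sequence and five noisy passes over the same molecule.
truth = "ACGTTGCA"
passes = [
    "ACGTTGCA",  # error-free pass
    "ACCTTGCA",  # one substitution at position 3
    "ACGTTGAA",  # one substitution at position 7
    "ACGTAGCA",  # one substitution at position 5
    "TCGTTGCA",  # one substitution at position 1
]

result = consensus(passes)
print(result == truth)
```

Because each error occurs in only one pass, the other passes outvote it at every position. Real reads also contain insertions and deletions, which require aligning the passes before voting.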
An earlier version of nanopore sequencing technology was used to obtain the first sequence of a human centromere (see BioNews 943).
The DNA sequence used for this study was obtained from cells isolated from a hydatidiform mole, a growth that forms in the uterus following the fertilisation of an egg that lacks a nucleus. Such cells contain two identical sets of 23 chromosomes, rather than the two differing sets usually found in normal human cells. This makes the computational effort more straightforward, but potentially affects the validity of the findings.
The paper is currently available as a preprint and has not yet been peer-reviewed. If the conclusions of this study are accepted by the wider scientific community, the new sequencing technologies could be applied to the complete 46-chromosome human genome.
Harvard biologist Professor George Church, who was not involved in the study, called the work 'very important'. He told STAT News that if this work successfully passes peer review, it will be the first time any vertebrate genome has been completely sequenced.