A controversial paper suggests that Chinese researchers deleted early COVID-19 viral sequences from a US database to obscure their origins.
Dr Jesse Bloom, an evolutionary biologist at the Fred Hutchinson Cancer Research Centre in Seattle, Washington, detailed in an unreviewed preprint how he recovered these deleted files. He explained that these sequences support other evidence that SARS-CoV-2, the virus that causes COVID-19, did not originate in the Huanan seafood market in Wuhan, Hubei, China.
'I don't think this bolsters either the lab origin or zoonosis hypothesis... it provides additional evidence that this virus was probably circulating in Wuhan before December ... and... we have a less than complete picture of the sequences of the early viruses,' said Dr Bloom.
Dr Bloom came across the deleted sequences through a study published in June 2020 in the journal Small. The paper contained sequence data that Dr Bloom had not previously seen, and that he could not find in the public Sequence Read Archive (SRA) database. However, he managed to find the deleted sequences in the SRA's Google Cloud storage, recovering data from 50 samples in total.
The National Institutes of Health (NIH), Bethesda, Maryland, which oversees the SRA, stated that it deleted the sequences at the request of the depositing author, who said that he planned to submit them to another database.
Dr Bloom has not found these sequences elsewhere and argues that these early sequences will increase our understanding of the origins of the virus. The deleted sequences, likely collected in January and February 2020, are more closely related to bat viruses than those seen in individuals linked to the seafood market. This supports the hypothesis that SARS-CoV-2 did not originate in the Huanan seafood market.
The preprint has drawn mixed reactions from across the globe. Professor Ian Lipkin from Columbia University, New York, said, 'This is a creative and rigorous approach to investigating the provenance of SARS-CoV-2... The two take-home points are that the virus was circulating before the outbreak linked to the Wuhan seafood market, and that there may have been active suppression of epidemiological and sequence data needed to track its origin.'
Others were less impressed with Dr Bloom's approach, including Professor Andrew Preston from the University of Bath, who said, 'The language of the paper is unusual, it contains a significant degree of supposition and conjecture, cites blog posts, and appears to be pointing towards a deliberate cover up by Chinese authorities of early sequence data from Wuhan. However, this is an entirely subjective appraisal of the situation, which will be very difficult to confirm or disprove.'
Dr Bloom, aware of the controversy surrounding his preprint, added: 'No matter how much people like [my paper] or don't like it, or agree with the interpretation or disagree with the interpretation, they can at least go download it and repeat it themselves.'
The paper is currently available as a preprint on bioRxiv and has not yet been peer reviewed.