U.S. Scientist Recovers Deleted Virus Sequences, Promising to Aid COVID-19 Origin Tracing

Biden administration officials have warned that a 90-day investigation into the origin of COVID-19 may not lead to a definitive conclusion. One U.S. media outlet said “the main obstacle is China’s refusal to provide further data.” But a U.S. scientist says his latest research shows that it is possible to advance understanding of the origin of the novel coronavirus even in the absence of field studies.

The Wall Street Journal reported June 28 that President Biden will receive a 45-day progress update on the investigation in mid-July, and administration officials said even partial progress could narrow differences in understanding the origins of COVID-19 and provide clues for further investigation.

But the report also noted that Biden administration officials warned that the 90-day investigation into the origin of COVID-19 may not lead to a definitive conclusion and that “the most significant obstacle is China’s refusal to provide further data or to allow investigators further access to scientists at the Wuhan Institute of Virology.”

Bloom: WHO-China origin-tracing report not representative of early Wuhan outbreak

But a paper published last week by a U.S. scientist said, “Even without further field studies, it is possible to advance understanding of the origin or early spread of the new coronavirus (SARS-CoV-2) by exploring in greater depth the data archived at the National Institutes of Health (NIH) and other agencies.”

Jesse Bloom, a principal investigator and virologist at the Fred Hutchinson Cancer Research Center in Seattle, published a paper on June 22 titled “Recovery of deleted deep sequencing data sheds more light on the early Wuhan SARS-CoV-2 epidemic.” The paper has not yet been peer-reviewed.

Bloom’s research yielded two findings. First, he recovered some deleted files from Google Cloud to reconstruct sequences from 13 early cases of the Wuhan COVID-19 outbreak, and phylogenetic analysis of these sequences suggests that the novel coronavirus was spreading in Wuhan, China, before the December 2019 seafood market-related outbreak.

Second, he found that some sequences originally stored in an NIH database had been deleted, and upon inquiry, the NIH confirmed that this was done after a request from the Chinese researchers who submitted these sequences.

Examination of these deleted virus sample sequences led Bloom to conclude that “the South China seafood market sequences that were the focus of the joint WHO-China report are not fully representative of the virus from the early Wuhan outbreak.”

Bloom’s paper found that the December 2019 virus sample from the seafood market contained three genetic mutations that were absent from a virus sample collected a few weeks later, meaning that the later-discovered virus was more similar to the coronavirus found in bats.

“This supports the idea that some of the early lineages of this virus did not pass through the seafood market,” the New York Times said. “The new analysis supports earlier claims that multiple coronaviruses may have spread in Wuhan before the initial outbreak associated with the fresh market in December 2019.”

Latham: Bloom study further advances the lab-leak theory

Virologist Jonathan Latham, executive director of the Bioscience Resource Project, an Ithaca, N.Y.-based science nonprofit, said Bloom’s study “comprehensively rules out the South China (seafood) market as the source” and takes a step forward in proving that the virus escaped from the lab.

“The study shows a single species jumping to humans, which is inconsistent with our understanding of a typical zoonotic outbreak,” Latham wrote in an email response to a request for comment. “The jump of a single species, on the other hand, suggests a laboratory escape.”

However, the Wall Street Journal quoted Dr. Cooper, a virologist at the University of Pittsburgh, as saying that the deleted sequences do not resolve the ongoing debate over whether the pandemic began with a laboratory accident or with animal-to-human transmission. “You can still argue both ways,” he said.

Bloom is a co-author of the open letter “Investigating the Origin of the New Coronavirus,” published May 14 of this year in the journal Science. The letter criticizes the WHO-China joint investigation report for “not giving balanced consideration to the two theories” of natural and laboratory spillover of the new coronavirus; argues that “further clarity on the origin of the epidemic is necessary, feasible and achievable”; and states that “a proper investigation should be transparent, objective, data-driven, inclusive of a wide range of expertise, subject to independent oversight, and managed responsibly to minimize the impact of conflicts of interest.” The letter was co-signed by 17 scientists in the U.S. and around the world.

In his study, Bloom described how he discovered that the sequence data had been deleted from the NIH archive. He said he was reading a paper on SARS-CoV-2 sequence data when he learned that the sequences had been stored in the National Institutes of Health’s Sequence Read Archive.

That paper included a table in which “most of the entries” concerned a project at Wuhan University that “represents 241 of the 282 SARS-CoV-2 sequencing runs in the Sequence Read Archive as of March 30, 2020.” These samples were collected by “Fu Aisu (ph) and Wuhan University People’s Hospital,” Bloom wrote.

He searched for the project at the National Center for Biotechnology Information, which manages the Sequence Read Archive (SRA) at the National Institutes of Health, and the search returned “No items found.” He then searched for a sample of individual sequencing runs instead, “and the results indicated that the sequencing runs had been removed,” he writes in the paper.

“Samples from early outpatients in Wuhan are a gold mine for anyone who wants to understand the spread of the virus,” Bloom wrote.

“At the very least, the NIH should be able to immediately determine the date and claimed reason for the deletion of the dataset analyzed here, since the only way to remove sequences from the SRA is to send an email request to the Sequence Read Archive staff (SRA 2021),” the Wall Street Journal reported.

Bloom then contacted the National Institutes of Health to ask why the sequences were deleted.

NIH confirms deletion

On June 24, the NIH confirmed the deletion and explained why in an email response to a query from Voice of America.

Emma Wojtowicz, a public affairs specialist at the NIH, wrote that “staff at the National Library of Medicine (NLM), which manages the Sequence Read Archive (SRA), have reviewed the submitting investigator’s request to withdraw the data.”

The email states: “These SARS-CoV-2 sequences were submitted to the SRA in March 2020, and the submitting investigator subsequently requested their withdrawal in June 2020. The requestor indicated that the sequence information had been updated and was being submitted to another database, and asked that the data be removed from the SRA to avoid version control issues.”

The submitting investigators published relevant information about the sequences in March 2020 via a preprint and in a journal in June 2020.

The NIH email said, “The submitting researchers have rights over their data and can request that the data be withdrawn.”

But Bloom said in his paper, “There is no sound scientific justification for the deletion. … There are no corrections to the paper, the paper states that human subjects approval was obtained, and sequencing showed no evidence of plasmid or inter-sample contamination. Therefore, it appears that the sequences may have been removed to mask their presence.”

U.S. Sen. Josh Hawley (R-Mo.) sent a June 24 letter to U.S. health officials demanding answers as to why important data on the earliest diagnosed COVID-19 patients disappeared from the National Institutes of Health’s database.

“I am increasingly concerned that the NIH is not taking the CCP’s obstruction and its motives seriously, especially after evidence suggests that NIH funds may have gone to the Wuhan Institute of Virology,” he wrote in the letter.

Publicly available information from the NIH shows that from 2014 to 2019, the NIH provided about $3.7 million in funding to the nonprofit EcoHealth Alliance to conduct a project called Understanding the Risk of Bat Coronavirus Emergence. The alliance has been using some of the funding for a collaborative project with the Wuhan Institute of Virology.

Why the deletion?

Latham, of the Bioscience Resource Project, told Voice of America that another key point raised by Bloom’s paper is “the question of why this important data was removed and why it is not being made available now.”

“The Chinese government is withholding evidence,” Latham wrote, because “it’s hard to come up with a good reason (for why it was deleted).”

Bloom argues that it is suspicious that the sequences were removed, and that the purpose of doing so “appears to be to conceal their existence.”

Bloom writes in the paper, “In other outbreaks where direct identification of early cases has been hampered, it is increasingly possible to use genomic epidemiology to infer the timing and dynamics of transmission from virus sequence analysis. For example, analysis of the SARS-CoV-2 sequence has been able to reconstruct the initial spread of SARS-CoV-2 in North America and Europe.”

However, genomic epidemiology has so far been unable to reconstruct the pattern of the virus’s initial spread in Wuhan, where the outbreak first occurred. The reason for this, according to Bloom, lies in the limited data available from Wuhan.

“Despite its advanced virology laboratory, Wuhan has only sporadic SARS-CoV-2 sequence samples from the first few months of the outbreak in the city. Apart from a set of multiplexed sequencing samples collected in late December 2019 from a dozen patients associated with the South China Seafood Market, only a small number of Wuhan sequences are available from before late January 2020.”

He believes that “this lack of sequences may be due to orders received that unauthorized Chinese laboratories must destroy all coronavirus samples from the early stages of the outbreak. This was reportedly done for ‘laboratory biosecurity’ reasons.”

On February 14, 2020, Xi Jinping chaired a meeting calling for “closing loopholes in response to the shortcomings revealed by the Wuhan pneumonia outbreak,” and stressed the need to promote a “biosafety law” as soon as possible.

On February 25, 2020, the Chinese Center for Disease Control and Prevention (China CDC) issued a document stating that “no one can provide information related to the COVID-19 outbreak, including data, biological specimens, pathogens, cultures, etc., to other institutions and individuals on behalf of individuals or research teams without authorization.” “Before publishing papers and results related to the COVID-19 outbreak, they must be submitted to the Science and Technology Group/Science and Technology Division for preliminary review and, if necessary, to the Emergency Response Leadership Team or the National Health and Science Education Division for approval.” “Submitted papers that have not been reviewed by the Science and Technology Group/Science and Technology Division will be withdrawn and these regulations enforced as soon as possible.” “Anyone who violates the above provisions will be dealt with seriously in accordance with the law and regulations.”

On March 3, 2020, the State Council issued another document calling on officials to “earnestly implement a series of important instructions from General Secretary Xi Jinping on the prevention and control of the epidemic … to form a coordinated national ‘single chessboard’ approach to the release of information on COVID-19 scientific research results.”

Despite measures taken by Chinese institutions at all levels to strictly prevent information from leaking out, the Wall Street Journal quoted Sergei Pond, a professor of biology at Temple University and an expert on the evolution of viral pathogens, as saying that Dr. Bloom’s paper suggests that other early sequence data may be forthcoming.

“If more sequences come to light, especially archival samples from earlier time points or elsewhere, everything could change again,” he said. “I think it’s very likely that will happen.”