Lesson behind hyperbole of Chinese researchers withdrawing COVID data
Chinese researchers withdraw COVID data.


EDITORIAL ERROR
SMALL published a paper titled "Nanopore Targeted Sequencing for the Accurate and Comprehensive Detection of SARS-CoV-2 and Other Respiratory Viruses" on June 24, 2020. The paper was written by a team of Chinese researchers.
The researchers collected throat swab clinical samples from Wuhan residents and then performed their nanopore targeted sequencing (NTS) on the samples, producing COVID-19 data, according to the paper.
The researchers uploaded their data to the US database and included a paragraph describing how to access the data, Zeng said on July 22.
The Wuhan researcher said the same thing. As proof, the researcher shared with the two Xinhua journalists a snapshot of the Data Accessibility paragraph in the paper they submitted to SMALL.
But when SMALL sent a draft for publication back to them, the paragraph was gone, according to another snapshot shared by the researcher.
"When we saw that the journal had deleted the paragraph, we believed that then the paragraph was unnecessary," the researcher said.
The published paper did not include the paragraph. Following that, the researchers withdrew their data from the US database, thinking its storage was no longer necessary. Furthermore, the information could have been deduced from the tables and graphs in the published paper, the researcher added.
The Wuhan researcher's account of the events was first reported by the two Xinhua journalists in newsletters in their personal capacity.
On July 29, SMALL confirmed the Chinese account in a correction on its website: "In the originally published article, the Data Availability paragraph of the experimental section was mistakenly deleted during the copyediting process. The original sequencing data has been submitted to China National Center for Bioinformation GSA database."
"The editorial office apologizes for any inconvenience caused," the correction said.
The Chinese database mentioned in the correction is open to international researchers.
WILD SUSPICION
But before the correction emerged, Bloom, the virologist in the United States, had already developed and published his suspicion.
According to a snapshot shared by the Wuhan researcher, Bloom asked in a June 7 email "why the raw sequencing data for the study are no longer available?"
The Chinese researchers didn't reply. The researcher said they didn't know Bloom, and that they thought, and still think, that the best way to share the raw sequencing data was to upload it to a database and make it public, not to give it exclusively to one person.
Bloom is the lead signatory to a May 14 letter saying, on the origin of SARS-CoV-2, "theories of accidental release from a lab and zoonotic spillover both remain viable," the Wuhan researcher also noted in the phone interview.
The lab leak theory was described as "extremely unlikely" in the joint World Health Organisation-China study on the origins of SARS-CoV-2.
On June 23 Beijing time, Bloom published a preprint -- a paper not yet peer-reviewed -- and a lengthy Twitter thread, describing how he discovered and recovered the withdrawn data, adding that "there are also broader implications" such as "this data was deleted should make us skeptical that all other relevant early Wuhan sequences have been shared."
The thread set off a wave of reports in major Western media outlets on the day of its publication or the next, at a time when the Chinese researchers "did not immediately respond to emails inquiring about Dr. Bloom's finding," as reported by The New York Times.
The Wuhan researcher said the authors, who have little experience talking to the media, especially Western media, found the accusation unacceptable and offensive, and were overwhelmed by the sudden, simultaneous Western media focus.
LITTLE VALUE
Another key argument put forward by Zeng, the Chinese vice-minister, and the Wuhan researcher was that the withdrawn data was of little value to COVID-19 origin tracing.
The paper said the data was based on samples from people in Wuhan with "suspected COVID-19 early in the epidemic (January 2020)," a description that Bloom and others seized on to speculate that the withdrawal was meant to conceal valuable data.
However, Zeng and the Wuhan researcher clarified that the earliest sample was collected on January 30, 2020, and thus was not valuable for COVID-19 origin tracing.
According to the China Part of the joint WHO-China study, the earliest onset date in China's COVID-19 reporting system was December 8, 2019.
By January 30, 2020, almost two months later, China had reported 9,692 confirmed cases and 15,238 suspected cases.
The Chinese researcher also said that since the Wuhan researchers' intention was only to test a new sequencing method, the quality of their sequence and data didn't have to -- and therefore didn't -- reach the level of accuracy meaningful for COVID-19 origin tracing.
The Chinese researcher said that their raw sequencing data was like having only a few digits of a dozen-digit ID number. Using that data to identify the person behind the ID number would be impossible, since they don't even have the whole number, and scientists in the same field worth their salt should have known this.