According to the National Genomics Data Center (NGDC), on January 26 the center added to its collection the complete genome sequences of five 2019 novel coronavirus strains provided by the Institute of Pathogen Biology, Chinese Academy of Medical Sciences / Peking Union Medical College.
This is the first time since the outbreak began in December 2019 that such data has been publicly released through a domestic public data platform. The 2019 novel coronavirus genome sequences that scientists obtained earlier have generally been submitted to the Global Initiative on Sharing All Influenza Data (GISAID) and to the GenBank database of the National Center for Biotechnology Information (NCBI).
Earlier, on January 22, the National Genomics Data Center officially released the 2019 Novel Coronavirus Information Database. The database integrates coronavirus genome sequence data, metadata, academic literature, news updates, and popular science articles published by the World Health Organization (WHO), the Chinese Center for Disease Control and Prevention (China CDC), the National Center for Biotechnology Information, and GISAID. It also analyzes and displays the genome sequences of different coronavirus strains.
The 2019 Novel Coronavirus Information Database performs genomic variation analysis of 2019-nCoV strains against different reference genome sequences and presents the results statistically and visually. By comparing genome-wide sequence similarity and analyzing mutation sites, it reports the degree of variation, the variant regions, and the specific base changes among 2019-nCoV strains, and between 2019-nCoV strains and both SARS coronaviruses and SARS-like bat coronavirus strains.
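To make the idea of comparing sequence similarity and locating mutation sites concrete, here is a minimal Python sketch. It is not the NGDC's actual pipeline, which the article does not describe; it assumes the two sequences have already been aligned to the same length (with "-" marking gaps), and the function name `compare_aligned` and the toy sequences are invented purely for illustration.

```python
from typing import List, Tuple

GAP = "-"  # assumed gap character in the pre-aligned input


def compare_aligned(ref: str, query: str) -> Tuple[float, List[Tuple[int, str, str]]]:
    """Compare two pre-aligned sequences of equal length.

    Returns the percent identity over ungapped columns and a list of
    (1-based alignment column, reference base, query base) for each mismatch.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")

    matches = 0
    compared = 0
    variants: List[Tuple[int, str, str]] = []

    for column, (r, q) in enumerate(zip(ref.upper(), query.upper()), start=1):
        if r == GAP or q == GAP:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if r == q:
            matches += 1
        else:
            variants.append((column, r, q))

    identity = 100.0 * matches / compared if compared else 0.0
    return identity, variants


# Toy sequences, invented for illustration only (not real viral data).
ref_toy = "ATGGCTAGCTTAGC"
query_toy = "ATGACTAGC-TAGT"

identity, variants = compare_aligned(ref_toy, query_toy)
print(f"identity: {identity:.1f}%")
for column, ref_base, query_base in variants:
    print(f"column {column}: {ref_base} -> {query_base}")
```

A real analysis would first align whole genomes with a dedicated alignment tool and would usually report positions in reference-genome coordinates rather than alignment columns; the sketch only shows the counting step that follows alignment.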
This analysis of genomic variation in 2019-nCoV strains provides an important data foundation and decision support for tracing the source of the virus, tracking the mutation path of its strains, preventing and controlling the outbreak caused by the novel coronavirus, and treating viral pneumonia.
When the database was released, Bao Yiming, a researcher at the Beijing Institute of Genomics of the Chinese Academy of Sciences and director of the National Genomics Data Center, said in an interview with “China Science News”: “After the release of this database, some institutions will contact us and send their Wuhan novel coronavirus genome data here, instead of sending it abroad and leaving domestic researchers to bring it back, like goods ‘exported and then sold domestically’.”
According to the National Genomics Data Center, coronaviruses belong to the order Nidovirales, family Coronaviridae, genus Coronavirus. They are enveloped RNA viruses with a linear, positive-sense, single-stranded genome, and they form a large class of viruses that is widespread in nature. Certain coronaviruses can infect humans and cause illnesses ranging from the common cold to severe lung infections such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS).
The coronavirus first discovered in Wuhan is a strain that had not previously been found in humans, and the WHO has named it the 2019 novel coronavirus (2019-nCoV). NCBI’s virus classification tool PASC assigns 2019-nCoV to the species Severe acute respiratory syndrome-related coronavirus. This species also includes the SARS virus behind the 2003 outbreak, with which it has a genome sequence similarity of 80%.
The accession numbers of the five 2019 novel coronavirus genomes released this time are GWHABKF00000000, GWHABKG00000000, GWHABKH00000000, GWHABKI00000000, and GWHABKJ00000000, and the related project number is PRJCA002165.
These sequences can be searched and downloaded from the project database or the coronavirus sequence database of the China National Center for Bioinformation / National Genomics Data Center, without registering or applying to the data submitter.