Publication #340799

Research Project: Developmental Determinants of Obesity in Infants and Children

Location: Children's Nutrition Research Center

Title: Long-read sequencing and de novo assembly of a Chinese genome

Author
SHI, LINGLING - Jinan University
GUO, YUNFEI - University of Southern California
DONG, CHENGLIANG - University of Southern California
HUDDLESTON, JOHN - University of Washington
YANG, HUI - University of Southern California
HAN, XIAOLU - University of Southern California
FU, AISI - Wuhan University
LI, QUAN - University of Southern California
LI, NA - Jinan University
GONG, SIYI - Jinan University
LINTNER, KATHERINE - The Ohio State University
DING, QIONG - Wuhan University
WANG, ZOU - Wuhan University
HU, JIANG - Nextomics Biosciences Co, Ltd
WANG, DEPENG - Nextomics Biosciences Co, Ltd
WANG, FENG - Wuhan University
WANG, LIN - Huazhong University of Science and Technology
LYON, GHOLSON - Cold Spring Harbor Laboratory
GUAN, YONGTAO - Children's Nutrition Research Center (CNRC)
SHEN, YUFENG - Columbia University - New York
EVGRAFOV, OLEG - University of Southern California
KNOWLES, JAMES - University of Southern California
THIBAUD-NISSEN, FRANCOISE - US National Library of Medicine
SCHNEIDER, VALERIE - US National Library of Medicine
YU, CHACK - The Ohio State University
ZHOU, LIBING - Jinan University
EICHLER, EVAN - University of Washington
SO, KWOK - Jinan University
WANG, KAI - University of Southern California

Submitted to: Nature Communications
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/26/2016
Publication Date: 6/30/2016
Citation: Shi, L., Guo, Y., Dong, C., Huddleston, J., Yang, H., Han, X., Fu, A., Li, Q., Li, N., Gong, S., Lintner, K.E., Ding, Q., Wang, Z., Hu, J., Wang, D., Wang, F., Wang, L., Lyon, G.J., Guan, Y., Shen, Y., Evgrafov, O.V., Knowles, J.A., Thibaud-Nissen, F., Schneider, V., Yu, C.Y., Zhou, L., Eichler, E.E., So, K.F., Wang, K. 2016. Long-read sequencing and de novo assembly of a Chinese genome. Nature Communications. 7:12065.

Interpretive Summary: Short-read sequencing has enabled the de novo assembly of several human genomes, but it has inherent limitations in characterizing repeat elements. Single-molecule real-time (SMRT) long-read sequencing technology was used to sequence a Chinese genome. Our results demonstrate the advantage of long-read technology in genome assembly and gene annotation, and imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.

Technical Abstract: Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arrays and generate a de novo assembly of 2.93 Gb (contig N50: 8.3 Mb, scaffold N50: 22.0 Mb, including 39.3 Mb N-bases), together with 206 Mb of alternative haplotypes. The assembly fully or partially fills 274 (28.4%) N-gaps in the reference genome GRCh38. Comparison to GRCh38 reveals 12.8 Mb of HX1-specific sequences, including 4.1 Mb that are not present in previously reported Asian genomes. Furthermore, long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq. Our results imply that improved characterization of genome functional variation may require the use of a range of genomic technologies on diverse human populations.
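
For readers unfamiliar with the contig and scaffold N50 figures quoted in the abstract, the following is a minimal sketch of how an N50 value is conventionally computed from a list of sequence lengths. It is not taken from the paper: the function name n50 and the toy lengths are hypothetical, chosen only for illustration, and the authors' own tooling is not described in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases
    # until at least half of the total has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up contig lengths (arbitrary units), not data from the HX1 assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Under this definition, a contig N50 of 8.3 Mb means that contigs of at least 8.3 Mb make up half or more of the assembled bases.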