|WALLBERG, ANDREAS - Uppsala University|
|BUNIKIS, IGNAS - Uppsala University|
|PETTERSON, OLGA - Uppsala University|
|MOSBECH, MAI-BRITT - Uppsala University|
|MIKHEYEV, ALEXANDER - Okinawa Institute Of Science And Technology|
|ROBERTSON, HUGH - University Of Illinois|
|ROBINSON, GENE - University Of Illinois|
|WEBSTER, MATTHEW - Uppsala University|
Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/24/2018
Publication Date: 2/6/2019
Citation: Wallberg, A., Bunikis, I., Petterson, O.V., Mosbech, M., Childers, A.K., Evans, J.D., Mikheyev, A.S., Robertson, H.M., Robinson, G., Webster, M. 2019. A hybrid de novo genome assembly of the honeybee, Apis mellifera, with chromosome-length scaffolds. BMC Genomics. 20:275. https://doi.org/10.1186/s12864-019-5642-0.
Interpretive Summary: Honey bees are the predominant insect pollinator worldwide and are also the subject of widespread research efforts as a model for the social insects, including ants, bees, termites, and wasps. Honey bees face challenges from disease agents, chemical stress, and nutritional deficits. Genetic techniques can improve breeding and management strategies to address these challenges and maintain honey bee populations. The draft genome of the honey bee has led to numerous research and application breakthroughs in the past twelve years. This effort describes a nearly complete honey bee genome, a resource that will provide a lasting tool for honey bee biology and for comparisons across the insects. This international effort will be used widely to advance honey bee science and honey bee health.
Technical Abstract: The ability to generate long sequencing reads and access long-range linkage information is revolutionizing the quality and completeness of genome assemblies. Here we use a hybrid approach that combines data from four genome sequencing and mapping technologies to generate a new genome assembly of the honeybee Apis mellifera. We first generated contigs based on PacBio sequencing libraries, which were then merged with linked-read 10x Chromium data, followed by scaffolding using a BioNano optical genome map and a Hi-C chromatin interaction map, complemented by a genetic linkage map. Each of the assembly steps reduced the number of gaps and incorporated a substantial amount of additional sequence into scaffolds. The new assembly (Amel_HAv3) is significantly more contiguous and complete than the previous one (Amel_4.5), which was based largely on Sanger sequencing reads. The contig N50 is about 100-fold higher (5.381 Mbp compared to 0.053 Mbp), and we anchor >98% of the sequence to chromosomes. All of the 16 chromosomes are represented as single scaffolds with an average of three sequence gaps per chromosome. The improvements are largely due to the inclusion of repetitive sequence that was unplaced in previous assemblies. In particular, our assembly is highly contiguous across centromeres and telomeres and includes hundreds of AvaI and AluI repeats associated with these features. The improved assembly will be of utility for refining gene models, studying genome function, mapping functional genetic variation, identifying structural variants, and comparative genomics.
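For readers unfamiliar with the contiguity statistic quoted above, the following is a minimal sketch of how a contig N50 is commonly computed, together with the fold-change implied by the two reported values. The function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the authors' pipeline; only the figures 5.381 Mbp and 0.053 Mbp come from the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (not real honeybee data).
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)

# Fold-change between the contig N50 values reported in the abstract (Mbp).
fold_change = 5.381 / 0.053

print(f"toy N50: {toy_n50}")
print(f"fold change: {fold_change:.1f}")
```

Dividing the two reported values gives a ratio slightly above 100, consistent with the roughly 100-fold improvement stated in the abstract.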