Submitted to: GigaByte
Publication Type: Database / Dataset
Publication Acceptance Date: 1/15/2022
Publication Date: N/A
Technical Abstract: Red clover (Trifolium pratenseL) is used as a forage crop due to a variety of favorable traits relative to other crops. Improved varieties have been developed through conventional breeding approaches, but progress could be accelerated and gene discovery facilitated using modern genomic methods. Existing short-read based genome assemblies of the ~420 Megabase (Mb) genome are fragmented into >135,000 contigs with numerous errors in order and orientation within scaffolds, likely due to the biology of the plant which displays gametophytic self-incompatibility resulting in inherent high heterozygosity. A high-quality long-read based assembly of red clover is presented that reduces the number of contigs by more than 500-fold, improves the per-base quality, and increases the contig N50 statistic by three orders of magnitude. The 413.5 Mb assembly is nearly 20% longer than the 350 Mb short read assembly, closer to the predicted genome size. Quality measures are presented and full-length isoform sequence of RNA transcripts reported for use in assessing accuracy and for future annotation of the genome. The assembly accurately represents the seven main linkage groups present in the genome of an allogamous (outcrossing), highly heterozygous plant species. This database entry contains supporting data for "Chromosome-scale assembly of the highly heterozygous genome of red clover (Trifolium pratense L.), an allogamous forage crop species" published in GigaByte Journal.