Location: Animal Genomics and Improvement Laboratory
Title: Technical options for all-breed Single Step GBLUP for US dairy cattle
Authors:
LEGARRA, ANDRES - Council On Dairy Cattle Breeding
BERMANN, MATIAS - University Of Georgia
Vanraden, Paul
NICOLAZZI, EZEQUIEL - Council On Dairy Cattle Breeding
MOTA, RODRIGO - Council On Dairy Cattle Breeding
TABET, JOE-MENWER - University Of Georgia
LOURENCO, DANIELA - University Of Georgia
MISZTAL, IGNACY - University Of Georgia
Submitted to: Interbull Annual Meeting Proceedings
Publication Type: Proceedings
Publication Acceptance Date: 6/28/2024
Publication Date: 9/4/2024
Citation: Legarra, A., Bermann, M., Van Raden, P.M., Nicolazzi, E., Mota, R., Tabet, J., Lourenco, D.L., Misztal, I. 2024. Technical options for all-breed Single Step GBLUP for US dairy cattle. Interbull Bulletin. 60:143-147.
Technical Abstract: The multi-step method for genomic prediction has worked remarkably well for US dairy cattle, but intense genomic selection makes recent genetic trends difficult to estimate in pedigree-only BLUP evaluations. Thus, the introduction of routine Single Step GBLUP (ssGBLUP) is under study. The large size of the US dairy cattle data precludes naïve approaches to genomic prediction. Here we present the technical choices and needs of an all-breed (6 breeds and all existing crosses) ssGBLUP applied to different sets of traits within trait groups such as fertility, livability and health. For each trait group, we first prune the pedigree to animals with records and their ancestors, which reduces the size of the pedigree and improves memory use and convergence. The model includes only genotypes of animals in this pruned pedigree, and we predict the other animals later using either Parent Average (if not genotyped) or the sum of SNP effects (if genotyped). The set of markers is the usual CDCB set of 78,964 markers and includes autosomes and sex chromosomes. The method for ssGBLUP was a G matrix with the Algorithm for Proven and Young (APY) and metafounders (MF). APY greatly reduces computational needs, whereas MF provides smooth solutions for unknown origins and automatic compatibility of pedigree and genomic relationships within and across breeds. The gamma matrix was constructed from base allele frequencies across breeds and increases of inbreeding within breeds.
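The pruning step described above (keep animals with records plus all of their ancestors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `prune_pedigree`, the dictionary layout, and the toy animal ids are all hypothetical and chosen only for illustration.

```python
from collections import deque

def prune_pedigree(pedigree, animals_with_records):
    """Return the set of animals with records plus all of their ancestors.

    pedigree: dict mapping animal id -> (sire id, dam id); None marks an
    unknown parent. This layout is an assumption made for the sketch.
    """
    keep = set()
    queue = deque(animals_with_records)
    while queue:
        animal = queue.popleft()
        if animal is None or animal in keep:
            continue
        keep.add(animal)
        # Founders may be missing from the dict; treat their parents as unknown.
        sire, dam = pedigree.get(animal, (None, None))
        queue.append(sire)
        queue.append(dam)
    return keep

# Toy pedigree with hypothetical ids.
pedigree = {
    "cow1": ("sire1", "dam1"),
    "cow2": ("sire1", None),
    "calf3": ("sire2", "cow1"),  # no record and no descendant with a record
    "sire1": ("gsire", None),
}
kept = prune_pedigree(pedigree, ["cow1", "cow2"])
print(sorted(kept))  # ['cow1', 'cow2', 'dam1', 'gsire', 'sire1']
```

In this toy case `calf3` and `sire2` are dropped because neither has a record nor a descendant with one; such animals would be predicted afterwards, as the abstract describes.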
Core animals were chosen within breed, in a heuristic but complete and repeatable manner: genotyped sires with more than a certain number of daughters with records, and a deterministic subset of genotyped cows with records. For fertility, this resulted in ~45K core animals and ~2M non-core animals. Memory needs are still large, as the inverse of G_APY, stored in double precision, takes ~720 GB. Thus, we used memory mapping (mmap) to map memory to disk space. For fertility (4 traits), computing the inverse of G_APY took 28 h and 100 GB of RAM using mmap. Solving the mixed model equations (MME) took 22 h, 120 GB of RAM and 476 rounds of preconditioned conjugate gradient (PCG). Genomic reliabilities took 120 GB of RAM and 8 h per trait. Backsolving for SNP solutions took negligible time and memory. Computations for ssGBLUP in this very large database can therefore be done in reasonable time, provided the correct technical options are chosen and the computations are correctly organized.
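As a rough check on the memory figure quoted above, a dense block of ~2M non-core rows by ~45K core columns in double precision (8 bytes per value) comes to about 720 gigabytes. The abstract does not state the exact storage layout, so this accounting is an assumption that happens to match the reported size. The sketch below also shows the general idea of memory mapping, using `numpy.memmap` on a small temporary file as a stand-in; the authors' software and mmap usage are not described in the abstract.

```python
import os
import tempfile
import numpy as np

def dense_block_bytes(n_core, n_noncore, bytes_per_value=8):
    """Bytes needed for a dense n_noncore x n_core block (assumed layout)."""
    return n_core * n_noncore * bytes_per_value

# Approximate counts taken from the abstract (~45K core, ~2M non-core).
approx = dense_block_bytes(45_000, 2_000_000)
print(approx / 1e9)  # 720.0 (gigabytes)

# Memory mapping on a toy scale: the array lives in a file on disk and
# pages are loaded on demand rather than held entirely in RAM.
path = os.path.join(tempfile.mkdtemp(), "block.bin")
block = np.memmap(path, dtype=np.float64, mode="w+", shape=(1000, 50))
block[:10, :] = 1.0   # write to a slice; only the touched pages are loaded
block.flush()         # push changes to disk
del block

reopened = np.memmap(path, dtype=np.float64, mode="r", shape=(1000, 50))
total = float(reopened[:10, :].sum())
print(total)  # 500.0
```

The toy file here is only 400 KB; the point is that the same access pattern applies to a file far larger than available RAM, at the cost of disk reads.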