Van Vleck, Lloyd
Submitted to: Journal of Animal Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/2/2007
Publication Date: 10/1/2007
Citation: Kachman, S.D., Van Vleck, L.D. 2007. Technical Note: Calculation of standard errors of estimates of genetic parameters with the multiple-trait derivative-free restricted maximal likelihood programs. Journal of Animal Science. 85:2375-2381.
Interpretive Summary: The MTDFREML set of programs was written to handle partially missing data efficiently. The programs estimate (co)variance components for multiple-trait models with partially missing data, but they can estimate standard errors of those estimates only when no data are partially missing. The standard practice has been to eliminate records with partially missing data, which uses only a subset of the available data. In some situations, eliminating partial records eliminates all the records. An alternative approach was developed that requires only minor manipulation of the original data and model and that provides estimates of the standard errors for multiple-trait analyses when not all traits are measured, based on the complete residual log likelihood. Because the same residual vector is used for the original data and the complete data, the resulting REML estimators and their sampling properties are identical.
Technical Abstract: The MTDFREML (Boldman et al., 1995) set of programs was written to handle partially missing data in an expedient manner. When estimating (co)variance components and genetic parameters for multiple-trait models, however, the programs have not been able to estimate standard errors of those estimates when not all animals with observations have all traits measured. When some traits were not recorded on some units (e.g., animals), the standard approach was to discard the incompletely recorded units. In the worst case, when males have one trait measured and females have another, no animal has both traits measured. A similar case arises with genotype by environment interaction, where some animals have records in one environment and other animals (some of them related) have records in another environment. Although the program uses a derivative-free algorithm (simplex) to minimize -2logL|y, where L|y is the likelihood (L) given the data (y), the asymptotic standard errors for single-trait analyses and for multiple-trait analyses with all traits measured are based on the average information (AI) matrix (Johnson and Thompson, 1995) as implemented by Dodenhoff et al. (1998). The limitation of requiring complete data can now be overcome without any changes to the MTDFREML programs. A model-based procedure that makes use of properties of the mixed model equations is described to accomplish that goal. Only relatively trivial changes in the data file and model are needed. Each missing observation for a trait is assigned a unique level of a dummy factor associated with that trait. The value recorded for the missing observation itself is arbitrary, so every missing observation can be assigned the same dummy value, which the program accepts as a real observation. The result is that the program handles the analysis as if all traits were observed for each animal with records. In essence, the residual element associated with each dummy observation is estimated to be zero.
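The data-file change described above can be sketched in Python. This is an illustration only, not part of MTDFREML: the function name `recode_missing`, the `dummy_<trait>` column names, the trait names and values, and the convention that observed records share dummy level 0 are all assumptions made for the example.

```python
def recode_missing(records, traits, dummy_value=0.0):
    """Return copies of the records with missing trait values filled in.

    Each missing observation (None) is replaced by the same dummy_value and
    is given its own level (1, 2, ...) of a dummy factor for that trait.
    Observed values get dummy level 0 (an assumed convention).
    """
    next_level = {t: 1 for t in traits}
    recoded = []
    for rec in records:
        new = dict(rec)  # leave the input record untouched
        for t in traits:
            if rec.get(t) is None:
                new[t] = dummy_value
                new[f"dummy_{t}"] = next_level[t]
                next_level[t] += 1
            else:
                new[f"dummy_{t}"] = 0
        recoded.append(new)
    return recoded


# Hypothetical two-trait data: "wt" and "ht" are made-up trait names.
rows = [
    {"animal": 1, "wt": 210.0, "ht": None},
    {"animal": 2, "wt": None, "ht": 118.0},
    {"animal": 3, "wt": 198.0, "ht": 121.0},
    {"animal": 4, "wt": None, "ht": 115.0},
]
recoded = recode_missing(rows, ["wt", "ht"])
for r in recoded:
    print(r)
```

In the recoded file every animal has a value for every trait, and the dummy factor for each trait would then be added to the model for that trait as a fixed factor.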
The algorithm that computes the asymptotic average information matrix uses those zeros, which add nothing to the AI matrix, so the matrix is the same as if it had been computed with a more complex algorithm. Similarly, L|y is the same as if it had been computed with the missing observations ignored.
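The claim that the likelihood is unchanged can be checked numerically on a toy example. The sketch below is not MTDFREML code. It assumes a two-trait model with trait means as the only other fixed effects, made-up matrices `A`, `G0`, and `R0`, made-up observations, and the form log|V| + log|X'V^-1X| + y'Py for -2 log residual likelihood with the additive constant dropped. It compares the value computed from the observed records alone with the value computed after filling in dummy observations and adding one dummy fixed-effect column per missing record.

```python
import math

def inv_logdet(a):
    """Inverse and log|det| of a square matrix (Gauss-Jordan, partial pivoting)."""
    n = len(a)
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    logdet = 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        logdet += math.log(abs(piv))
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m], logdet

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def neg2_reml(y, X, V):
    """log|V| + log|X'V^-1 X| + y'Py, i.e. -2 log residual likelihood up to a constant."""
    Vi, logdet_V = inv_logdet(V)
    XtVi = matmul(transpose(X), Vi)
    C, logdet_XtViX = inv_logdet(matmul(XtVi, X))   # C = (X'V^-1 X)^-1
    ycol = [[v] for v in y]
    beta = matmul(C, matmul(XtVi, ycol))            # GLS solution
    fitted = matmul(X, beta)
    resid = [[yi[0] - fi[0]] for yi, fi in zip(ycol, fitted)]
    yPy = matmul(matmul(transpose(ycol), Vi), resid)[0][0]  # Py = V^-1 (y - X beta)
    return logdet_V + logdet_XtViX + yPy

# Hypothetical inputs: relationship matrix A, genetic (G0) and residual (R0)
# covariance matrices for two traits, and V = A (x) G0 + I (x) R0.
A = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.25], [0.5, 0.25, 1.0]]
G0 = [[2.0, 0.8], [0.8, 1.5]]
R0 = [[3.0, 1.0], [1.0, 2.0]]
n_animals, n_traits = 3, 2
records = [(i, t) for i in range(n_animals) for t in range(n_traits)]
V = [[A[i][j] * G0[s][t] + (R0[s][t] if i == j else 0.0) for (j, t) in records]
     for (i, s) in records]

# (animal, trait) -> value; animal 1 lacks trait 1 and animal 2 lacks trait 0.
observed = {(0, 0): 10.2, (0, 1): 5.1, (1, 0): 11.0, (2, 1): 4.4}
missing = [r for r in records if r not in observed]

def augmented_design(dummy_value):
    """All records present: trait-mean columns plus one dummy column per missing record."""
    y, X = [], []
    for r in records:
        row = [1.0 if r[1] == t else 0.0 for t in range(n_traits)]
        row += [1.0 if r == m else 0.0 for m in missing]
        X.append(row)
        y.append(observed.get(r, dummy_value))
    return y, X

# Observed records only: drop the missing rows of y and X and rows/columns of V.
keep = [k for k, r in enumerate(records) if r in observed]
y_obs = [observed[records[k]] for k in keep]
X_obs = [[1.0 if records[k][1] == t else 0.0 for t in range(n_traits)] for k in keep]
V_obs = [[V[a][b] for b in keep] for a in keep]

ll_obs = neg2_reml(y_obs, X_obs, V_obs)
ll_aug0 = neg2_reml(*augmented_design(0.0), V)
ll_aug9 = neg2_reml(*augmented_design(-999.0), V)
print(ll_obs, ll_aug0, ll_aug9)
```

Under these assumptions the three printed values agree to rounding error, and the choice of dummy value (0 or -999 here) does not matter, because each dummy fixed effect absorbs whatever value is entered for its record.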