Submitted to: World Congress of Genetics Applied in Livestock Production
Publication Type: Proceedings
Publication Acceptance Date: 10/9/2017
Publication Date: N/A
Citation: N/A
Interpretive Summary:
Technical Abstract: Despite reduced genotyping costs in recent years, obtaining genotypes for all individuals in a population may still not be feasible when the sample size is large. DNA pooling provides a useful alternative for determining genotype effects. Clustering algorithms group individuals (observations) with similar characteristics and thus may result in better pooling. The objective of the study was to determine the properties of pools constructed using clustering algorithms. Partitioning around medoids (PAM) was applied to a simulated sheep population in which both continuous and ordinal traits were generated. The effect of including the numerator relationship matrix (NRM) as a similarity measure was also determined. Calculated measures of clustering homogeneity included mean and maximum silhouette width, cluster size, proportion of clusters with fewer than 5 individuals, and cluster size variance. The silhouette width for individual i is defined as the difference between the lowest average dissimilarity of i to any other cluster and the average dissimilarity of i to the other members of its own cluster, divided by the larger of the two dissimilarities. In addition, variability of aggregate phenotype was measured. For continuous traits, including the NRM had little impact on the homogeneity of pools. For ordinal traits, clusters formed without the NRM were more variable in size, and many contained few observations. In conclusion, for categorical traits, including the NRM as a distance measure resulted in pools with better properties.
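The silhouette definition in the abstract can be written as s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the average dissimilarity of individual i to the other members of its own cluster and b(i) is the lowest average dissimilarity of i to any other cluster. The following is a minimal sketch of that calculation, not the software used in the study. The function name `silhouette_widths`, the list-of-lists dissimilarity input, and the convention that a singleton cluster receives s(i) = 0 are assumptions made for illustration.

```python
def silhouette_widths(dissim, labels):
    """Return the silhouette width s(i) for every individual i.

    dissim: square, symmetric list of lists of pairwise dissimilarities.
    labels: cluster label of each individual, in the same order as dissim.
    """
    # Group individual indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, own_label in enumerate(labels):
        own = [j for j in clusters[own_label] if j != i]
        if not own:
            # Assumed convention: a singleton cluster gets s(i) = 0.
            widths.append(0.0)
            continue
        # a(i): average dissimilarity to the rest of i's own cluster.
        a = sum(dissim[i][j] for j in own) / len(own)
        # b(i): lowest average dissimilarity to any other cluster.
        b = min(
            sum(dissim[i][j] for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != own_label
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


# Toy example (hypothetical data): four individuals on a line, two pools.
points = [0.0, 1.0, 10.0, 11.0]
dissim = [[abs(p - q) for q in points] for p in points]
labels = [0, 0, 1, 1]

widths = silhouette_widths(dissim, labels)
mean_width = sum(widths) / len(widths)
max_width = max(widths)
```

Values of s(i) close to 1 indicate that an individual sits well inside its pool, while values near 0 or below suggest it lies between pools. The mean and maximum of `widths` correspond to the two silhouette summaries listed among the homogeneity measures.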