MEDINA, CESAR AUGUSTO - Washington State University
HAWKINS, CHARLES - Washington State University
LIU, XIANG-PING - Heilongjiang Bayi Agricultural University (HLAU)
Submitted to: International Journal of Molecular Sciences
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/6/2020
Publication Date: 5/9/2020
Citation: Medina, C., Hawkins, C., Liu, X., Peel, M., Yu, L. 2020. Genome-wide association and prediction of traits related to salt tolerance in autotetraploid alfalfa (Medicago sativa L.). International Journal of Molecular Sciences. 21:3361. https://doi.org/10.3390/ijms21093361.
Interpretive Summary: Saline soil is a worldwide problem in agriculture, reducing productivity and increasing water usage. Breeding crops for greater salt tolerance can mitigate these effects. In alfalfa, however, breeding is complicated by a tetraploid genome and outcrossing reproduction, so more sophisticated breeding methods are required to make progress. Genomic selection is a technique that uses machine learning to assist breeding: a model trained on prior data predicts a plant’s traits from its genome. In this work, we test eight machine learning models on several alfalfa salt tolerance datasets for their ability to predict salt tolerance accurately. The models achieved accuracies of up to 43% for some of the datasets, which is sufficient to make progress in a real-world breeding program.
Technical Abstract: Soil salinity is a growing problem in world agriculture. In 2014, saline soil cost an estimated $27 B in lost crop yields worldwide. Continued improvement in crop salt tolerance will require the implementation of new breeding technologies such as genomic selection (GS). This technology uses machine learning to predict breeding values for candidates under selection, and selections are then made based on these predictions. GS offers high accuracy and the potential for gains that are more rapid than those of phenotypic selection while being more sustainable than those of other methods. In this work, we report the results of cross-validation of eight GS models on a population of alfalfa with three different phenotypic datasets related to salt tolerance: yield under salt stress in a field, general health score under salt stress in a field, and a set of health and productivity metrics collected from salt-exposed plants grown in a greenhouse. The highest-performing model, SVR with '-regression, achieved an accuracy of 0.43 for the field health score. Association mapping was also performed on the health and yield datasets. The most significant markers among these data had a -log10(p-value) of 4.5.
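The cross-validation procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's pipeline: it assumes that accuracy is measured, as is common in genomic selection, as the Pearson correlation between predicted and observed phenotypes; it uses a toy nearest-neighbour predictor as a stand-in for the eight GS models (it is not SVR); and it assumes genotypes are coded as allele dosages from 0 to 4 for an autotetraploid. All function names (`k_fold_indices`, `knn_predict`, `cross_validated_accuracy`, `neg_log10_p`) are hypothetical.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def knn_predict(train_x, train_y, query, k=5):
    """Toy stand-in model (NOT the SVR from the study): average the
    phenotypes of the k training plants with the most similar dosages."""
    dists = sorted(
        (sum(abs(a - b) for a, b in zip(row, query)), i)
        for i, row in enumerate(train_x)
    )
    nearest = [train_y[i] for _, i in dists[:k]]
    return sum(nearest) / len(nearest)


def cross_validated_accuracy(genotypes, phenotypes, k_folds=5, seed=0):
    """Predict every plant from a model that never saw it, then correlate
    the pooled predictions with the observed phenotypes."""
    predicted = [0.0] * len(phenotypes)
    for fold in k_fold_indices(len(phenotypes), k_folds, seed):
        held_out = set(fold)
        train = [i for i in range(len(phenotypes)) if i not in held_out]
        train_x = [genotypes[i] for i in train]
        train_y = [phenotypes[i] for i in train]
        for i in fold:
            predicted[i] = knn_predict(train_x, train_y, genotypes[i])
    return pearson(predicted, phenotypes)


def neg_log10_p(p_value):
    """Convert a p-value to the -log10 scale used in association mapping."""
    return -math.log10(p_value)
```

On the -log10 scale, the reported marker significance of 4.5 corresponds to a p-value of 10^-4.5, or roughly 3.2e-5.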