Agricultural Research Service United States Department of Agriculture

Research Project: DEVELOPMENT OF ACCURATE AND REPRESENTATIVE FOOD COMPOSITION DATA FOR THE U.S. FOOD SUPPLY

Location: Nutrient Data

Title: Methods of Imputation used in the USDA National Nutrient Database for Standard Reference

Authors
Gebhardt, Susan
Thomas, Robin

Submitted to: National Nutrient Databank Conference
Publication Type: Abstract Only
Publication Acceptance Date: March 6, 2008
Publication Date: May 12, 2008
Citation: Gebhardt, S.E., Thomas, R.G. 2008. Methods of imputation used in the USDA National Nutrient Database for Standard Reference. 32nd National Nutrient Data Bank Conference, May 12-14, 2008, Ottawa, Ontario, Canada.

Technical Abstract: Objective: To present the predominant imputation methods used to estimate nutrient values for foods in the USDA National Nutrient Database for Standard Reference (SR20).

Materials and Methods: The USDA Nutrient Data Laboratory developed standard methods for imputing nutrient values for foods where analytical data were not available. Beginning with SR14, a field for derivation codes was included in the Nutrient Data File. There are 54 derivation codes. Derivation code A indicates analytical data, whereas most other codes identify imputation methods. This field is being populated as data for more foods are processed through the new Nutrient Data Bank System; currently, about 60% of the nutrient values in SR20 have derivation codes. The field was queried to determine the most commonly used imputation methods for different types of foods and nutrients.

Results: About 200,000 nutrient values in SR20 have derivation codes indicating that the value is calculated (not analytical). About 20% of these have derivation code Z, meaning an assumed zero. Code Z is used for nutrients such as retinol and cholesterol that do not occur naturally in plant foods. About 17% have BF codes, meaning the value is based on analytical data for a similar food. These procedures are mainly used for commodity foods such as fruits, vegetables, and grains. About 16% have FL codes, indicating calculations based on a formulation. Formulations are used for multi-ingredient foods such as baked products. Code NC, which accounts for about 15% of the imputed values, indicates a nutrient that is always calculated rather than analyzed, such as carbohydrate by difference and calories.

Significance: Users of the database want to know the source of the nutrient values. This information is particularly useful to other database developers who may have to use imputation in their own database applications.
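The kind of query described in the abstract, tallying derivation codes to see which imputation methods are most common, might be sketched as follows. This is a minimal illustration only, not the Nutrient Data Laboratory's actual procedure: the record layout, the sample rows, and the function name `imputed_code_shares` are all hypothetical, and treating every code other than A as imputed is a simplifying assumption (the abstract says only that most codes identify imputation methods).

```python
from collections import Counter

# Hypothetical sample of (food_id, nutrient, derivation_code) records.
# An empty code stands for a value whose derivation code is not yet populated.
records = [
    ("F001", "retinol", "Z"),
    ("F001", "cholesterol", "Z"),
    ("F002", "vitamin C", "BF"),
    ("F003", "iron", "FL"),
    ("F003", "carbohydrate, by difference", "NC"),
    ("F004", "protein", "A"),
    ("F005", "energy", "NC"),
    ("F006", "calcium", ""),
]


def imputed_code_shares(rows, analytical_codes=("A",)):
    """Return each non-analytical code's share of all coded, non-analytical values.

    Assumption: any populated code not listed in analytical_codes is treated
    as an imputation method.
    """
    codes = [code for _, _, code in rows if code and code not in analytical_codes]
    counts = Counter(codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders codes from most to least frequent.
    return {code: n / total for code, n in counts.most_common()}


shares = imputed_code_shares(records)
for code, share in shares.items():
    print(f"{code}: {share:.0%}")
```

Values without a derivation code are skipped rather than counted as imputed, mirroring the abstract's note that only about 60% of SR20 values currently carry a code.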

   

 
Project Team
Holden, Joanne
Exler, Jacob - Jake
Haytowitz, David
Pehrsson, Pamela
 
 
 
Last Modified: 05/25/2013