Submitted to: Health Education Research
Publication Type: Peer reviewed journal
Publication Acceptance Date: 4/1/2006
Publication Date: 12/1/2006
Citation: Watson, K., Baranowski, T., Thompson, D. 2006. Item response modeling: an evaluation of the children's fruit and vegetable self-efficacy questionnaire. Health Education Research. 21(Suppl 1):i47-i57.
Interpretive Summary: This article applies Item Response Modeling (IRM) to a Fruit, Juice, and Vegetable Self-Efficacy Questionnaire administered to fourth-grade students to (1) identify areas of the scale that were less reliable, (2) identify areas of the scale that were not measured, (3) examine the functioning of the response format, and (4) examine response bias due to gender and ethnicity. Concerns have been raised about existing measures of psychosocial constructs. This paper applied an innovative psychometric technique, IRM, to previously validated measures of children's fruit and vegetable self-efficacy. IRM analyses revealed that children used only 2 or 3 of the 5 response categories, suggesting that the number of categories be reduced, and that the items did not cover the extremes of the distribution. Refining this scale will require generating items to measure the extremes of the scale and reducing the number of response categories, probably to two.
Technical Abstract: Perceived self-efficacy (SE) for eating fruit and vegetables (FV) is a key variable mediating FV change in interventions. This study applies item response modeling (IRM) to a fruit, juice and vegetable self-efficacy questionnaire (FVSEQ) previously validated with classical test theory (CTT) procedures. The 24-item FVSEQ (five-point Likert scale) was administered to 1578 fourth graders from 26 Houston schools. The IRM partial credit model indicated that the five response options were not fully utilized. The questionnaire exhibited acceptable (>0.70) reliability except at the extremes of the SE scale. Differential item functioning (DIF) analyses revealed no response bias due to gender. However, DIF across ethnic groups was detected in 10 items. IRM of this scale expanded what was known from CTT methods in three ways: (i) areas of the scale that were less reliable were identified, (ii) limitations were found in the response format and (iii) levels of the SE scale that were not measured were identified. The FVSEQ can be improved by including items at the extreme levels of difficulty. DIF analyses identified areas where IRM can be useful to improve the functioning of measures.
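To illustrate how a partial credit model can show that response options are under-used, the following is a minimal sketch of the standard partial credit model category probabilities for a single five-category item. It is not the authors' analysis code: the function name `pcm_category_probs` and the step difficulty values are hypothetical and were chosen only for illustration.

```python
import math


def pcm_category_probs(theta, step_difficulties):
    """Return P(X = k | theta) for k = 0..m under the partial credit model.

    theta: respondent's latent trait level (here, self-efficacy).
    step_difficulties: the m step parameters of one item (m + 1 categories).
    """
    # Cumulative sums of (theta - delta_j); category 0 is the empty sum, 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating, for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical step difficulties for one five-category item (assumed values,
# not estimates from the FVSEQ data). The third step is easier than the
# second, so these steps are disordered.
steps = [-1.0, 0.5, 0.2, 1.5]

# Find which category is most probable at each trait level from -4 to 4.
modal_categories = set()
for i in range(-40, 41):
    probs = pcm_category_probs(i / 10.0, steps)
    modal_categories.add(probs.index(max(probs)))

print(sorted(modal_categories))
```

A category that is never the most probable response at any trait level is one sign, under these assumptions, that respondents are not distinguishing all of the offered options. That is the kind of pattern that would motivate collapsing response categories.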