Devi Nina Bingham: Comparison of Assessment Instruments for Eating Disorders by Nina Bingham

Abstract: This article compares two assessment instruments designed to measure eating disorders, focusing on their validity and reliability. It also describes the purpose of each instrument, its cross-cultural validity, scales of measurement, test-retest stability, inter-scorer agreement, and suggested future utility. I will then evaluate which tool seems to be the stronger of the two. Much must be considered when comparing the constructs of two instruments designed to measure eating disorders.

    Assessing an eating disorder is a complex issue, as the Diagnostic and Statistical Manual (DSM-IV-TR) defines three separate categories of eating disorders: Anorexia Nervosa, Bulimia Nervosa, and Eating Disorders Not Otherwise Specified (NOS). The tests I chose to compare are markedly different. The older and widely used test, the “Eating Disorder Inventory-3” (EDI-3) (Garner & Garfinkel, 1979), is based upon 20 years of research in eating disorders and is in its third revision. It assesses the eating disorder categories listed in the DSM, except Binge Eating Disorder. However, “The EDI-3 is not designed to arrive at a diagnosis of eating disorder. Instead the emphasis is placed on the measurement of psychological traits relevant to the development and maintenance of such disorders” (Kagee, 1984-2004).

    The newer test, the “Questionnaire for Eating Disorder Diagnoses” (Q-EDD) (Mintz, O’Halloran, Mulholland, & Schneider, 1997), also operationalizes the eating disorder criteria of the DSM, but goes further than the EDI-3 in that it differentiates (a) between those with and without an eating disorder diagnosis, (b) between symptomatic and asymptomatic individuals, and (c) between anorexia and bulimia diagnoses. The Q-EDD’s developers also tested three different groups and included a supplementary clinical interview in arriving at their research conclusions.
    The EDI-3 can be administered in 20 minutes, has 25 questions, and uses a Likert scale. It can be used with individuals or groups. The symptom checklist is written at a sixth-grade level, so it can easily be scored by the administrator, or even by a layperson such as a teacher or athletic coach. This makes it an apt screening tool for academic and athletic settings, in which the administrator could use the test results for referral. As Kagee explains, “The rationale behind the development of the EDI-3 was to test the continuum model of anorexia nervosa, which states that this disorder is the final stage of a continual process beginning with voluntary dieting and progressing to more stringent forms of dieting accompanied by progressive loss of insight” (Kagee, 1984-2004). Although it has high cross-cultural validity (both U.S. and international samples), the sizes of the normative samples were not divulged, other than being described as “moderate-size samples of U.S. and international male and female adults, as well as international male and female adolescents” (Garner, 1984-2004).
    Overall, its factor-analytic validity is poor, apart from an inverse convergent validity coefficient of .82 between its low self-esteem scores and the Rosenberg Self-Esteem Scale (Rosenberg, 1965) in a nonclinical sample of 543 females. An exploratory factor analysis yielded a three-factor model that accounted for 63% of the variance, and for 60.8% and 65.6% of the variance in different sample groups (Atlas, 1984-2004). Reliability was more impressive, ranging from .90 to .97 across four diagnostic groups and three normative groups (Kagee, 1984-2004). Test-retest stability was assessed from the scores of 34 females retested after 1 to 7 days: “Correlation coefficients ranging from .86 (for Asceticism) to .98 (for Interpersonal Alienation) suggested excellent stability of subscale and composite scores, albeit on a very restricted study sample” (Atlas, 1984-2004). Overall, reviewers Kagee and Atlas (1984-2004) found the EDI-3 to be disappointing in construct. The instrument’s future utility seems limited to screening: detecting eating disorders and gauging their progression along a spectrum.
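    To make the test-retest figures above more concrete, here is a minimal sketch of how a stability coefficient of this kind can be computed, assuming it is an ordinary Pearson correlation between the scores from the first and second administrations. The scores below are hypothetical and are not taken from the EDI-3 manual or its reviews; the function name pearson_r is my own.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares of the deviations from each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical subscale scores for five respondents (NOT real EDI-3 data).
time_1 = [12, 18, 7, 22, 15]   # first administration
time_2 = [13, 17, 8, 21, 16]   # retest a few days later
r = pearson_r(time_1, time_2)
print(f"test-retest r = {r:.2f}")
```

    A coefficient close to 1 means respondents kept roughly the same rank order from one administration to the next. With a sample as small as the 34 females described above, a high coefficient should still be read cautiously.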
    The second test, the “Questionnaire for Eating Disorder Diagnoses” (Q-EDD), “…represents the first attempt in the field of eating disorders to provide a comprehensive assessment of the widely used methodology of operationalizing the DSM into a questionnaire format” (Mintz, O’Halloran, Mulholland, & Schneider, 1997). It is a 50-question self-report instrument that takes 5 to 10 minutes to complete. The sample was a non-clinical group of 1,400 college women who completed three eating disorder tests: the Q-EDD, the revised Bulimia Test (BULIT-R; Thelen, Farmer, Wonderlich, & Smith, 1991), and the Eating Attitudes Test (EAT; Garner & Garfinkel, 1979). In addition, participants completed a structured interview conducted by clinicians. I consider this to be a large sample and a rigorous methodology. In Study 1, “Criterion validity was assessed by an examination of the diagnosis yielded by the Q-EDD and those yielded by clinical interviews, and accuracy rate was 98% and 90%. Incremental validity was examined by comparing the level of agreement between Q-EDD diagnosis and clinical interview diagnosis with the level of agreement between preexisting inventory diagnosis and clinical interview diagnosis” (Mintz, O’Halloran, Mulholland, & Schneider, 1997). Incremental accuracy rates were 97% and 94%.
    Hence, the Q-EDD and the BULIT-R were roughly equivalent in all aspects except for positive predictive power: the Q-EDD correctly predicted bulimia 78% of the time, whereas the BULIT-R was correct 54% of the time. The incremental validity of the Q-EDD in comparison with the EAT was not examined, because there was only one interview-defined anorexic. In Study 1, the retest was delayed for 1 to 3 months, and test-retest agreement was 64% and 54%. These lower reliability figures could be due to the longer wait before retesting subjects. Inter-scorer agreement was 100% for fifty randomly selected Q-EDDs scored by two scorers. In Study 2, 167 college women were tested, and the test instruments were the same (Q-EDD, BULIT-R, and EAT). Convergent validity could not be calculated, as there was only one bulimic. Test-retest agreement was 94% and 85%; the higher figures are likely a result of retesting just 2 weeks later. Inter-scorer agreement was again 100% for fifty randomly selected Q-EDDs scored by two scorers. As for Study 3, “The purpose of Study 3 was to assess criterion validity. Study 1 indicated that…the Q-EDD demonstrated good criterion validity; we were thus interested in determining whether this good criterion validity would hold for the clinical sample” (Mintz, O’Halloran, Mulholland, & Schneider, 1997). Thirty-seven participants were recruited by therapists, and all had been diagnosed with eating disorders. When the clinicians’ diagnoses were compared with those of the Q-EDD, the accuracy rate was 78%. The sensitivity and accuracy rates for the differentiation of anorexia from bulimia were 100%, and the false-negative rate was 0%.
    Based on these three studies, “Strong support was obtained for the psychometric properties of the Q-EDD…convergent validity was demonstrated by significant correspondence between Q-EDD diagnosis and scores on the BULIT-R and the EAT. Test-retest reliabilities found that Q-EDD diagnosis were quite stable over a 2-week period and less stable over a 1 to 3 month period. The 100% inter-scorer agreement indicates that scoring of the Q-EDD can be easily mastered” (Mintz, O’Halloran, Mulholland, & Schneider, 1997). In terms of validity, the Q-EDD and the EAT could not be compared because of the low number of anorexics in Study 1. However, the high level of anorexia diagnosis in Study 3 leads to the conclusion that the Q-EDD is a better measure of DSM anorexia than the EAT. The Q-EDD and the BULIT-R performed equally well in measuring bulimia; therefore, clinicians wanting to distinguish bulimics from nonbulimics could use either instrument. Although these conclusions build a strong case for using the Q-EDD, perhaps the most significant psychometric support was the criterion validity of the Q-EDD across both the clinical interview and judgment scores. Accuracy rates were 98% and 90% in Study 1, and 78% and 78% in Study 3. In differentiating anorexia from bulimia, the accuracy rate was 100% in Study 3 (Mintz, O’Halloran, Mulholland, & Schneider, 1997).
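    The accuracy, sensitivity, false-negative, and positive predictive power figures discussed above all derive from the same cross-tabulation of questionnaire diagnoses against interview diagnoses. The sketch below shows the arithmetic under that assumption. The counts are hypothetical and are not data from the Mintz et al. studies, and the function names diagnostic_metrics and percent_agreement are my own, not part of any published scoring procedure.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary rates from a 2x2 table of questionnaire vs. interview diagnoses.

    tp: questionnaire and interview both say "disorder"
    fp: questionnaire says "disorder", interview says "no disorder"
    fn: questionnaire says "no disorder", interview says "disorder"
    tn: both say "no disorder"
    """
    total = tp + fp + fn + tn
    if total == 0 or (tp + fn) == 0 or (tp + fp) == 0:
        raise ValueError("table must contain interview-positive and test-positive cases")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "false_negative_rate": fn / (tp + fn),
        "positive_predictive_power": tp / (tp + fp),
    }


def percent_agreement(ratings_a, ratings_b):
    """Share of cases on which two scorers assign the same diagnosis."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty, equal-length lists of ratings")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


# Hypothetical counts for 100 respondents (NOT data from the Q-EDD studies).
metrics = diagnostic_metrics(tp=8, fp=4, fn=2, tn=86)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")

# Hypothetical diagnoses assigned by two scorers to the same four questionnaires.
scorer_1 = ["bulimia", "none", "anorexia", "none"]
scorer_2 = ["bulimia", "none", "anorexia", "none"]
print(f"inter-scorer agreement: {percent_agreement(scorer_1, scorer_2):.0%}")
```

    When a disorder is rare in the sample, overall accuracy can stay high even if positive predictive power is modest, because the many correctly identified non-cases dominate the accuracy figure. That may help explain how two instruments can look roughly equivalent in accuracy yet differ noticeably in positive predictive power.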
    Apparently, the Q-EDD is very effective at differentiating a diagnosis of anorexia from one of bulimia. Fairburn and colleagues (1990) and Williamson et al. (1995) wrote, “There is a great need in the eating disorder field for an instrument that can operationalize a full spectrum of eating disorders and make differential diagnosis.” The Q-EDD may be the first questionnaire to achieve that goal. In terms of future clinical utility, “Because the Q-EDD yields both a diagnosis and frequency data for individual behaviors, it can be used to track progress in therapy” (Mintz, O’Halloran, Mulholland, & Schneider, 1997). To use a crude analogy, one could compare these two test instruments as one would compare cars. Both are used for driving. However, the EDI-3 would drive like the trusty family station wagon: a time-tested and dependable ride, less concerned with safety, but roomy and easy to drive. The Q-EDD would drive like a model 10 years newer, designed with safety, speed, and utility in mind. If I had to pick between these two cars, I wouldn’t hesitate to choose the Q-EDD, even though it’s a relative “newcomer” to the market of testing instruments. It has greater validity, reliability, test-retest reliability, and inter-rater agreement than the EDI-3. Although the EDI-3 has served its purpose as an assessment tool, it’s time for a newer construct that can accurately distinguish and diagnose eating disorders. Sometimes, new is better.

References:

Atlas, J. A. (1984-2004). Review of the Eating Disorder Inventory-3. Mental Measurements Yearbook and Tests in Print. Accession Number: 17123228.

Fairburn, C. G., & Beglin, S. J. (1990). Studies of the epidemiology of bulimia nervosa. American Journal of Psychiatry, 147, 401-408.

Garner, D. M., & Garfinkel, P. E. (1979). The Eating Attitudes Test: An index of the symptoms of anorexia nervosa. Psychological Medicine, 9, 273-279.

Kagee, A. (1984-2004). Review of the Eating Disorder Inventory-3. Mental Measurements Yearbook and Tests in Print. Accession Number: 17123228.

Mintz, L. B., O'Halloran, M., Mulholland, A. M., & Schneider, P. A. (1997). Questionnaire for Eating Disorder Diagnoses: Reliability and validity of operationalizing DSM-IV criteria into a self-report format. Journal of Counseling Psychology, 44(1), 63-79. doi:10.1037/0022-0167.44.1.63

Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.

Thelen, M. H., Farmer, J., Wonderlich, S., & Smith, M. (1991). A revision of the Bulimia Test: The BULIT-R. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 3(1), 119-124.

Williamson, D. A., Anderson, D., Jackman, L. P., & Jackson, S. R. (1995). Assessment of eating disordered thoughts, feelings, and behaviors. In D. B. Allison (Ed.), Handbook of assessment methods for eating behaviors and weight-related problems (pp. 303-346). Thousand Oaks, CA: Sage.


Tuesday, May 22, 2012
