A review of the meta-analytic studies of reliability generalization

Authors

  • Julio Sánchez-Meca Dpto. Psicología Básica y Metodología. Facultad de Psicología. Universidad de Murcia Spain
  • José A. López-Pina Dpto. Psicología Básica y Metodología. Facultad de Psicología. Universidad de Murcia Spain
  • José A. López-López Dpto. Psicología Básica y Metodología. Facultad de Psicología. Universidad de Murcia Spain

DOI:

https://doi.org/10.24310/espsiescpsi.v2i1.13365

Keywords:

Reliability generalization, meta-analysis, reliability coefficient

Abstract

The meta-analytic approach of reliability generalization (RG) aims to show that reliability is an empirical property that varies from one test application to another. This recent approach is helping to make researchers aware of the importance of reporting reliability estimates obtained from their own data and of avoiding the malpractice of inducting reliability coefficients from other studies and previous applications of the test. The stages of an RG study are presented: (a) formulating the problem, (b) searching for the studies, (c) coding the studies, and (d) statistical analysis and interpretation. An updated overview of the statistical problems of this approach is also offered: (a) whether or not to transform the reliability coefficients, (b) whether or not to weight the coefficients, and (c) which statistical model is the most appropriate (fixed-, random-, or mixed-effects). A systematic review of the 49 RG studies published to date is presented with the purpose of analyzing the heterogeneity in how the data are statistically analyzed. Finally, the implications of RG studies for research and professional practice are discussed.
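
The first two statistical choices named in the abstract (transforming or not, weighting or not) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the alpha values and sample sizes are hypothetical, the weights are plain sample sizes rather than inverse variances, and the transformation shown is the cube-root form T = (1 - alpha)^(1/3) usually associated with Hakstian and Whalen (1976), chosen here only as one example of a transformation.

```python
# Hypothetical data: coefficient alphas and sample sizes from five
# applications of the same test (illustrative values, not from the review).
alphas = [0.70, 0.80, 0.85, 0.90, 0.75]
sizes = [50, 120, 200, 80, 150]


def weighted_mean(values, weights):
    """Weighted arithmetic mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def to_t(alpha):
    """Cube-root transformation of alpha: T = (1 - alpha)^(1/3)."""
    return (1.0 - alpha) ** (1.0 / 3.0)


def from_t(t):
    """Back-transform T to the alpha metric."""
    return 1.0 - t ** 3


unit_weights = [1] * len(alphas)

# Choice 1: average the untransformed coefficients,
# unweighted versus weighted by sample size (an assumed weighting scheme).
raw_unweighted = weighted_mean(alphas, unit_weights)
raw_weighted = weighted_mean(alphas, sizes)

# Choice 2: transform first, average, then back-transform.
t_values = [to_t(a) for a in alphas]
trans_unweighted = from_t(weighted_mean(t_values, unit_weights))
trans_weighted = from_t(weighted_mean(t_values, sizes))

print(f"raw, unweighted:         {raw_unweighted:.4f}")
print(f"raw, weighted by n:      {raw_weighted:.4f}")
print(f"transformed, unweighted: {trans_unweighted:.4f}")
print(f"transformed, weighted:   {trans_weighted:.4f}")
```

The four averages generally differ, which is why the choice of transformation and weighting matters when comparing RG studies. The third choice (fixed-, random-, or mixed-effects model) would additionally require an estimate of between-study variance and is not covered by this sketch.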

References

(Studies preceded by an asterisk were included in our review of RG studies.)

*Bachner, Y. G. & O’Rourke, N. (2007). Reliability generalization of responses by care providers to the Zarit Burden Interview. Aging and Mental Health, 11, 678-685.

*Barnes, L. L. B., Harp, D. & Jung, W. S. (2002). Reliability generalization of scores on the Spielberger State-Trait Anxiety Inventory. Educational and Psychological Measurement, 62, 603-618.

*Beretvas, S. N., Meyers, J. L. & Leite, W. L. (2002). A reliability generalization study of the Marlowe-Crowne Social Desirability Scale. Educational and Psychological Measurement, 62, 570-589.

Beretvas, S. N. & Pastor, D. A. (2003). Using mixed-effects models in reliability generalization studies. Educational and Psychological Measurement, 63, 75-95.

*Beretvas, S. N., Suizzo, M.-A., Durham, J. A. & Yarnell, L. M. (2008). A reliability generalization study of scores on Rotter’s and Nowicki-Strickland’s locus of control scales. Educational and Psychological Measurement, 68, 97-119.

*Campbell, J. S., Pulos, S., Hogan, M. & Murry, F. (2005). Reliability generalization of the Psychopathy Checklist applied in youthful samples. Educational and Psychological Measurement, 65, 639-656.

*Capraro, R. M. & Capraro, M. M. (2002). Myers-Briggs Type Indicator score reliability across studies: A meta-analytic reliability generalization study. Educational and Psychological Measurement, 62, 590-602.

*Capraro, M. M., Capraro, R. M. & Henson, R. K. (2001). Measurement error of scores on the Mathematics Anxiety Rating Scale across studies. Educational and Psychological Measurement, 61, 373-386.

*Caruso, J. C. (2000). Reliability generalization of the NEO personality scales. Educational and Psychological Measurement, 60, 236-254.

*Caruso, J. C. & Edwards, S. (2001). Reliability generalization of the Junior Eysenck Personality Questionnaire. Personality and Individual Differences, 31, 173-184.

*Caruso, J. C., Witkiewitz, K., Belcourt-Dittloff, A. & Gottlieb, J. D. (2001). Reliability scores from the Eysenck Personality Questionnaire: A reliability generalization study. Educational and Psychological Measurement, 61, 675-689.

*Charter, R. A. (2003). A breakdown of reliability coefficients by test type and reliability method, and the clinical implications of low reliability. Journal of General Psychology, 130, 290-304.

Crocker, L. & Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart and Winston.

Cooper, H. & Hedges, L. V. (Eds.) (1994). The handbook of research synthesis. New York: Russell Sage Foundation.

*De Ayala, R. J., Vonderharr-Carlson, D. J. & Kim, D. (2005). Assessing the reliability of the Beck Anxiety Inventory scores. Educational and Psychological Measurement, 65, 742-756.

*Deditius-Island, H. K. & Caruso, J. C. (2002). An examination of the reliability of scores from Zuckerman’s Sensation Seeking Scales, Form V. Educational and Psychological Measurement, 62, 728-734.

*Dierdorff, E. C. & Wilson, M. A. (2003). A meta-analysis of job analysis reliability. Journal of Applied Psychology, 88, 635-646.

Dimitrov, D. M. (2002). Reliability: Arguments for multiple perspectives and potential problems with generalization across studies. Educational and Psychological Measurement, 62, 783-801.

*Dunn, T. W., Smith, T. B. & Montoya, J. A. (2006). Multicultural competency instrumentation: A review and analysis of reliability generalization. Journal of Counseling and Development, 84, 471-482.

Feldt, L. S. & Brennan, R. L. (1989). Reliability. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 105-146). New York: American Council on Education and Macmillan.

Feldt, L. S. & Charter, R. A. (2006). Averaging internal consistency reliability coefficients. Educational and Psychological Measurement, 66, 215-227.

*Graham, J. M., Liu, Y. J. & Jeziorski, J. L. (2006). The Dyadic Adjustment Scale: A reliability generalization meta-analysis. Journal of Marriage and Family, 68, 701-717.

Gronlund, N. E. & Linn, R. L. (1990). Measurement and evaluation in teaching (6th ed.). New York: Macmillan.

Hakstian, A. R. & Whalen, T. E. (1976). A k-sample significance test for independent alpha coefficients. Psychometrika, 41, 219-231.

Hall, S. M. & Brannick, M. T. (2002). Comparison of two random-effects methods of meta-analysis. Journal of Applied Psychology, 87, 377-389.

*Hanson, W. E., Curry, K. T. & Bandalos, D. L. (2002). Reliability generalization of Working Alliance Inventory Scale scores. Educational and Psychological Measurement, 62, 659-673.

Hedges, L. V. & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.

Hedges, L. V. & Vevea, J. L. (1998). Fixed- and random-effects models in meta-analysis. Psychological Methods, 3, 486-504.

Heldref Foundation (1997). Guidelines for contributors. Journal of Experimental Education, 65, 95-96.

*Hellman, C. M., Fuqua, D. R. & Worley, J. (2006). A reliability generalization study on the Survey of Perceived Organizational Support: The effects of mean age and number of items on score reliability. Educational and Psychological Measurement, 66, 631-642.

*Helms, J. E. (1999). Another meta-analysis of the White Racial Identity Attitude Scale’s Cronbach alphas: Implications for validity. Measurement and Evaluation in Counseling and Development, 32, 122-137.

*Henson, R. K. & Hwang, D.-Y. (2002). Variability and prediction of measurement error in Kolb’s Learning Style Inventory scores: A reliability generalization study. Educational and Psychological Measurement, 62, 712-727.

*Henson, R. K., Kogan, L. R. & Vacha-Haase, T. (2001). A reliability generalization study of the Teacher Efficacy Scale and related instruments. Educational and Psychological Measurement, 61, 404-420.

Henson, R. K. & Thompson, B. (2002). Characterizing measurement error in scores across studies: Some recommendations for conducting “reliability generalization” studies. Measurement and Evaluation in Counseling and Development, 35, 113-126.

Hunter, J. E. & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage.

*Kieffer, K. M., Cronin, C. & Fister, M. C. (2004). Exploring variability and sources of measurement error in Alcohol Expectancy Questionnaire reliability coefficients: A meta-analytic reliability generalization study. Journal of Studies on Alcohol, 65, 663-671.

*Kieffer, K. M. & Reese, R. J. (2002). A reliability generalization study of the Geriatric Depression Scale. Educational and Psychological Measurement, 62, 969-994.

*Lane, G. G., White, A. E. & Henson, R. K. (2002). Expanding reliability generalization methods with KR-21 estimates: An RG study on the Coopersmith Self-esteem Inventory. Educational and Psychological Measurement, 62, 685-711.

*Leach, L. F., Henson, R. K., Odom, L. R. & Cagle, L. S. (2006). A reliability generalization study of the Self-Description Questionnaire. Educational and Psychological Measurement, 66, 285-304.

*Li, A. & Bagger, J. (2007). The Balanced Inventory of Desirable Responding (BIDR): A reliability generalization study. Educational and Psychological Measurement, 67, 525-544.

*López-Pina, J. A., Sánchez-Meca, J. & Rosa-Alcázar, A. I. (in press). The Hamilton Rating Scale for Depression: A reliability generalization study. International Journal of Clinical and Health Psychology.

Mason, C., Allam, R. & Brannick, M. T. (2007). How to meta-analyze coefficient-of-stability estimates: Some recommendations based on Monte Carlo studies. Educational and Psychological Measurement, 67, 765-783.

*Miller, C. S., Shields, A. L., Campfield, D., Wallace, K. A. & Weiss, R. D. (2007). Substance use scales of the Minnesota Multiphasic Personality Inventory: An exploration of score reliability via meta-analysis. Educational and Psychological Measurement, 67, 1052-1065.

*Mji, A. & Alkhateeb, H. M. (2005). Combining reliability coefficients: Toward reliability generalization of the Conceptions of Mathematics Questionnaire. Psychological Reports, 96, 627-634.

*Nilsson, J. E., Schmidt, C. K. & Meek, W. D. (2002). Reliability generalization: An examination of the Career Decision-Making Self-efficacy Scale. Educational and Psychological Measurement, 62, 647-658.

Onwuegbuzie, A. J. & Daniel, L. G. (2004). Reliability generalization: The importance of considering sample specificity, confidence intervals, and subgroup differences. Research in the Schools, 11, 60-71.

*O’Rourke, N. (2004). Reliability generalization of responses by care providers to the Center for Epidemiologic Studies-Depression Scale. Educational and Psychological Measurement, 64, 973-990.

*Reese, R. J., Kieffer, K. M. & Briggs, B. K. (2002). A reliability generalization study of select measures of adult attachment style. Educational and Psychological Measurement, 62, 619-646.

*Rexrode, K. R., Petersen, S. & O’Toole, S. (2008). The Ways of Coping Style Scale: A reliability generalization study. Educational and Psychological Measurement, 68, 262-280.

Rodriguez, M. C. & Maeda, Y. (2006). Meta-analysis of coefficient alpha. Psychological Methods, 11, 306-322.

*Ross, M. E., Blackburn, M. & Forbes, S. (2005). Reliability generalization of the patterns of adaptive learning survey goal orientation scales. Educational and Psychological Measurement, 65, 451-464.

*Rouse, S. V. (2007). Using reliability generalization methods to explore measurement error: An illustration using the MMPI-2 PSY-5 scales. Journal of Personality Assessment, 88, 264-275.

*Ryngala, D. J., Shields, A. L. & Caruso, J. C. (2005). Reliability generalization of the Revised Children’s Manifest Anxiety Scale. Educational and Psychological Measurement, 65, 259-271.

Sánchez-Meca, J. (2003). La revisión del estado de la cuestión: El meta-análisis. In C. Camisón, M. J. Oltra & M. L. Flor (Eds.), Enfoques, problemas y métodos de investigación en economía y dirección de empresas (pp. 101-110). Castellón: ACEDE/Fundació Universitat Jaime I–Empresa.

Sánchez-Meca, J. & Ato, M. (1989). Meta-análisis: Una alternativa metodológica a las revisiones tradicionales de la investigación. In J. Arnau & H. Carpintero (Coords.), Tratado de psicología general I: Historia, teoría y método (pp. 617-669). Madrid: Alhambra.

Sánchez-Meca, J. & López-Pina, J. A. (2008). El enfoque meta-analítico de generalización de la fiabilidad. Acción Psicológica, 5, 37-64.

Sánchez-Meca, J., López-Pina, J. A. & López-López, J. A. (in press). Generalización de la fiabilidad: Un enfoque meta-analítico aplicado a la fiabilidad. Fisioterapia.

Sánchez-Meca, J. & Marín-Martínez, F. (2008). Confidence intervals for the overall effect size in random-effects meta-analysis. Psychological Methods, 13, 31-48.

Sánchez-Meca, J., Marín-Martínez, F. & Huedo, T. (2006). Modelo de efectos fijos versus modelo de efectos aleatorios. In J. L. R. Martín, A. Tobías & T. Seoane (Coords.), Revisiones Sistemáticas en Ciencias de la Vida (pp. 189-204). Toledo: FISCAM.

Sawilowsky, S. S. (2000a). Psychometrics versus datametrics: Comment on Vacha-Haase’s ‘Reliability generalization’ method and some EPM editorial policies. Educational and Psychological Measurement, 60, 157-173.

Sawilowsky, S. S. (2000b). Reliability: Rejoinder to Thompson and Vacha-Haase. Educational and Psychological Measurement, 60, 196-200.

*Shields, A. L. & Caruso, J. C. (2003). Reliability generalization of the Alcohol Use Disorders Identification Test. Educational and Psychological Measurement, 63, 404-413.

*Shields, A. L. & Caruso, J. C. (2004). A reliability induction and reliability generalization study of the Cage Questionnaire. Educational and Psychological Measurement, 64, 254-270.

*Shields, A. L., Howell, R. T., Potter, J. S. & Weiss, R. D. (2007). The Michigan Alcoholism Screening Test and its shortened form: A meta-analytic inquiry into score reliability. Substance Use and Misuse, 42, 1-18.

Silver, N. & Dunlap, W. (1987). Averaging coefficients: Should Fisher’s z-transformation be used? Journal of Applied Psychology, 72, 3-9.

Thompson, B. (1994). Guidelines for authors. Educational and Psychological Measurement, 54, 837-847.

Thompson, B. (Ed.) (2003). Score reliability: Contemporary thinking on reliability issues. Thousand Oaks, CA: Sage.

*Thompson, B. & Cook, C. (2002). Stability of the reliability of LibQUAL+™ scores: A reliability generalization meta-analysis study. Educational and Psychological Measurement, 62, 735-743.

Thompson, B. & Vacha-Haase, T. (2000). Psychometrics is datametrics: The test is not reliable. Educational and Psychological Measurement, 60, 174-195.

Traub, R. E. (1994). Reliability for the social sciences: Theory and applications (Vol. 3). Thousand Oaks, CA: Sage.

*Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement, 58, 6-20.

Vacha-Haase, T., Henson, R. K. & Caruso, J. C. (2002). Reliability generalization: Moving toward improved understanding and use of score reliability. Educational and Psychological Measurement, 62, 562-569.

*Vacha-Haase, T., Kogan, L. R., Tani, C. R. & Woodall, R. A. (2001). Reliability generalization: Exploring variation of reliability coefficients of MMPI clinical scales scores. Educational and Psychological Measurement, 61, 45-59.

Vacha-Haase, T., Kogan, L. R. & Thompson, B. (2000). Sample compositions and variabilities in published studies versus those of test manuals: Validity of score reliability inductions. Educational and Psychological Measurement, 60, 509-522.

Vacha-Haase, T. & Ness, C. (1999). Practices regarding reporting of reliability coefficients: A review of three journals. Journal of Experimental Education, 67, 335-342.

*Vacha-Haase, T., Tani, C. R., Kogan, L. R., Woodall, R. A. & Thompson, B. (2001). Reliability generalization: Exploring reliability variations on MMPI/MMPI-2 validity scale scores. Assessment, 8, 391-401.

*Viswesvaran, C. & Ones, D. S. (2000). Measurement error in ‘big five factors’ personality assessment: Reliability generalization across studies and measures. Educational and Psychological Measurement, 60, 224-235.

*Voskuijl, O. F. & van Sliedregt, T. (2002). Determinants of interrater reliability of job analysis: A meta-analysis. European Journal of Psychological Assessment, 18, 52-62.

*Wallace, K. A. & Wheeler, A. J. (2002). Reliability generalization of the Life Satisfaction Index. Educational and Psychological Measurement, 62, 674-684.

Wilkinson, L. & APA Task Force on Statistical Inference (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.

Whittington, D. (1998). How well do researchers report their measures? An evaluation of measurement in published educational research. Educational and Psychological Measurement, 58, 21-37.

*Yin, P. & Fan, X. (2000). Assessing the reliability of Beck Depression Inventory scores: Reliability generalization across studies. Educational and Psychological Measurement, 60, 201-223.

*Youngstrom, E. A. & Green, K. W. (2003). Reliability generalization of self-report of emotions when using the Differential Emotions Scale. Educational and Psychological Measurement, 63, 279-295.

*Zangaro, G. A. & Soeken, K. L. (2005). Meta-analysis of the reliability and validity of Part B of the Index of Work Satisfaction across studies. Journal of Nursing Measurement, 13, 7-22.

Published

2008-12-31

How to Cite

Sánchez-Meca, J., López-Pina, J. A., & López-López, J. A. (2008). A review of the meta-analytic studies of reliability generalization. Escritos De Psicología - Psychological Writings, 2(1), 110–121. https://doi.org/10.24310/espsiescpsi.v2i1.13365

Section

Reports