Article type: Research article

Authors

1 Corresponding author, Ph.D. student in Measurement and Assessment, Allameh Tabataba'i University, Tehran, Iran.

2 Distinguished Professor, Department of Measurement and Assessment, Allameh Tabataba'i University, Tehran, Iran.

3 Professor, Department of Statistics, Allameh Tabataba'i University, Tehran, Iran.

Abstract

This study compared methods for detecting differential item functioning (DIF) in cognitive diagnostic models, using the English language items of the Master of Psychology series exam, with gender as the grouping variable. The sample comprised 2,455 female and 919 male applicants who took the Master of Psychology series exam in 1396 (Solar Hijri calendar). The G-DINA cognitive diagnostic model was fitted to the data. DIF was examined with three methods: the Wald test, the likelihood ratio test, and the revised likelihood ratio test. The three methods showed moderate agreement in detecting DIF. The two likelihood-based methods can be used alongside the Wald method to detect DIF. According to the results, 16 items showed DIF. Most of these showed uniform DIF in favor of men. Items with non-uniform DIF favored men among masters and women among non-masters. Follow-up studies are recommended to determine whether the items showing DIF are biased and, if so, why.

Keywords

Article title [English]

Examining Gender Differential Item Functioning of English Language Items in the Master of Psychology Series Exam to Compare the Wald Test, the Likelihood Ratio Test, and the Revised Likelihood Ratio Test Based on the Cognitive Diagnostic Model

Authors [English]

  • Shoeayb Qasemi 1
  • Ali Delavar 2
  • Farzad Eskandari 3

1 Corresponding Author, Ph.D. Student of Educational Measurement and Assessment, Allameh Tabataba'i University, Tehran, Iran.

2 Professor, Department of Educational Measurement and Assessment, Allameh Tabataba'i University, Tehran, Iran.

3 Professor, Department of Statistics, Allameh Tabataba'i University, Tehran, Iran.

Abstract [English]

This study examined gender differential item functioning (DIF) in the English language items of the Master of Psychology series exam, using methods available for cognitive diagnostic models. The sample comprised 2,455 female and 919 male applicants who took the Master of Psychology series exam in 1396 (Solar Hijri calendar). The G-DINA cognitive diagnostic model was fitted to the data. DIF in the items was examined with three methods: the Wald test, the likelihood ratio test, and the revised likelihood ratio test. The three methods showed moderate agreement in detecting DIF. According to the results, 16 items showed DIF. Most of these showed uniform DIF in favor of men. Items with non-uniform DIF favored men among masters and women among non-masters. Further studies are recommended to investigate whether these items are biased and what causes the DIF.

Keywords [English]

  • differential item functioning
  • gender
  • master's degree entrance exam
  • cognitive diagnostic model