中华医学教育杂志 (Chinese Journal of Medical Education) ›› 2018, Vol. 38 ›› Issue (6): 940-943. DOI: 10.3760/cma.j.issn.1673-677X.2018.06.031

• Medical Education Evaluation •

基于两种教育测量理论的全国医学博士英语统一考试质量分析

张泉慧, 张颖   

  1. 100097北京,国家医学考试中心研究评价处
  • 收稿日期:2018-03-08 发布日期:2020-12-09
  • 通讯作者: 张泉慧, Email: quanhui1018@163.com

Quality analysis of national English test for doctor of medicine based on two kinds of educational measurement theory

Zhang Quanhui, Zhang Ying   

  1. Department of Research and Evaluation, National Medical Examination Center, Beijing 100097, China
  • Received: 2018-03-08 Published: 2020-12-09
  • Contact: Zhang Quanhui, Email: quanhui1018@163.com

摘要: 目的 采用项目反应理论中的Rasch模型和经典测量理论,对2015年~2017年全国医学博士英语统一考试进行质量分析,比较两种理论估计结果的一致性。方法 采用资料分析方法,以2015年~2017年全国医学博士英语统一考试40 607份试卷为研究资料,采用项目反应理论的Rasch模型和经典测量理论分别分析考试的信度、难度和区分度,对比其分析结果的一致性。结果 Rasch模型的分析结果显示,3个年度试卷参数与模型拟合度较高,最大信息函数均>25.00,估计误差均<0.20,各个题型难度均在-0.70~0.70之间。经典测量理论的分析结果显示,3个年度试卷信度均>0.80,各个题型难度均在0.31~0.63之间,区分度均在0.12~0.29之间。Rasch模型和经典测量理论分析结果呈高度相关。结论 3个年度全国医学博士英语统一考试均能够较好地评价考生医学英语的应用能力。在参数估计和误差分析方面,Rasch模型的精度更高,在后续的考试评价中可以更多地使用。

关键词: 全国医学博士英语统一考试, Rasch模型, 经典测量理论

Abstract: Objective To analyze the quality of the national English test for doctor of medicine from 2015 to 2017 using the Rasch model of item response theory (IRT) and classical test theory (CTT), and to compare the consistency of the estimates from the two theories. Methods A total of 40 607 test papers from the 2015 to 2017 national English tests for doctor of medicine were used as research material. The reliability, difficulty and discrimination of the examinations were analyzed separately with the Rasch model of IRT and with CTT, and the consistency of the two sets of results was compared. Results The Rasch analysis showed that the parameters of the three annual papers fit the model well, the maximum information functions were all >25.00, the estimation errors were all <0.20, and the difficulties of all item types were between -0.70 and 0.70. The CTT analysis showed that the reliabilities of the three annual papers were all >0.80, the difficulties of all item types were between 0.31 and 0.63, and the discriminations were all between 0.12 and 0.29. The results of the Rasch and CTT analyses were highly correlated. Conclusions The tests in all three years evaluated candidates' ability to apply medical English well. In parameter estimation and error analysis, the Rasch model was more precise than CTT and can be used more in subsequent examination evaluation.

Key words: National English test for doctor of medicine, Rasch model, Classical test theory