Chinese Journal of Medical Education ›› 2020, Vol. 40 ›› Issue (4): 311-315. DOI: 10.3760/cma.j.cn115259-20190822-00679

• Medical Education Assessment •

Application of the many-facet Rasch model in the Stage 1 clinical fundamental skills test of the phased National Medical Licensing Examination

Lu Yan1, Zhang Ying1, He Ju2, Zou Jiewen1

  1. Research and Evaluation Department, National Medical Examination Center, Beijing 100097, China
  2. National Medical Examination Center, Beijing 100097, China
  • Received: 2019-08-22 Published: 2020-12-08
  • Corresponding author: Zhang Ying, Email: zy1392@126.com, Tel: 0086-10-59935029

Abstract: Objective To evaluate how well raters had mastered the scoring standards in the Stage 1 clinical fundamental skills test of the 2018 phased National Medical Licensing Examination (clinical category), and to examine the consistency between scores given by standardized patients (SP) and by examiners, as a reference for related research. Methods In 2018, one school was randomly selected from those taking the Stage 1 clinical fundamental skills test, and the communication and humanistic care scores of its 77 clinical medicine candidates were analyzed. The many-facet Rasch model (MFRM) was used to separate out the situational error attributable to raters (two examiners and one SP), to estimate the candidates' communication and humanistic care abilities, and to analyze the raters' internal consistency and their leniency or severity. Results The mean estimated ability of the 77 candidates was 2.75 logits (all MFRM results are reported in logits), and most candidates had a weighted fit statistic (Infit) below 1.5. The mean severity across all raters was -0.55 logits; the mean severity was -0.45 logits for examiners and -0.70 logits for SP, and the difference was not statistically significant (t=-0.129, P=0.903). Conclusions Raters had a good command of the scoring standards, the overall standard was relatively lenient, and the internal consistency between SP and examiner scores was high.
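
For readers unfamiliar with the logit scale used in the results, a common rating-scale form of the many-facet Rasch model can be sketched as follows. The abstract confirms only a candidate facet and a rater facet; the item facet $D_i$ and the shared category thresholds $F_k$ shown here are assumptions taken from the standard textbook formulation, not from the authors' stated specification.

```latex
% Assumed rating-scale form of the many-facet Rasch model (not confirmed by the source)
% P_{nijk}: probability that rater j gives candidate n category k on item i
% B_n: ability of candidate n            (logits)
% D_i: difficulty of item i              (logits, assumed facet)
% C_j: severity of rater j               (logits)
% F_k: threshold between categories k-1 and k (logits, assumed shared across items)
\[
  \ln\frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
\]
```

Under this sign convention a negative $C_j$ marks a lenient rater, which is consistent with the abstract reading a mean rater severity of -0.55 logits as a relatively lenient overall standard.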

Key words: National Medical Licensing Examination, Rater error, Rasch theory, Many-facet Rasch model, Standardized patients
