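Application of the many facets Rasch model in the first-stage clinical fundamental skills test of the phased National Medical Licensing Examination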

Chinese Journal of Medical Education, 2020, Vol. 40, Issue 4: 311-315. DOI: 10.3760/cma.j.cn115259-20190822-00679

Lu Yan¹, Zhang Ying¹, He Ju², Zou Jiewen¹

¹Research and Evaluation Department, National Medical Examination Center, Beijing 100097, China;
²National Medical Examination Center, Beijing 100097, China

Received: 2019-08-22 Published: 2020-12-08
Corresponding author: Zhang Ying, Email: zy1392@126.com, Tel: 0086-10-59935029

Abstract

Abstract: Objective To evaluate how well raters mastered the scoring standards in the 2018 first-stage clinical fundamental skills test of the phased National Medical Licensing Examination (clinical category), and to examine the consistency between scores given by standardized patients (SP) and by examiners, as a reference for related research. Methods In 2018, one school taking part in the test was randomly selected, and the communication and humanistic care scores of its 77 clinical medicine candidates were used as the study data. The many facets Rasch model (MFRM) was used to separate out the situational error attributable to raters (2 examiners and 1 SP), to estimate candidates' communication and humanistic care abilities, and to analyze the raters' internal consistency and severity. Results The mean estimated ability of the 77 candidates was 2.75 logits (all MFRM results are reported in logits), and most candidates had a weighted fit statistic (Infit) below 1.5. The mean severity was -0.55 logits for all raters, -0.45 logits for examiners, and -0.70 logits for SP; the difference between examiners and SP was not statistically significant (t=-0.129, P=0.903). Conclusions The raters had a good command of the scoring standards, and scoring was relatively lenient overall. SP and examiner scores showed high internal consistency.

Key words: National Medical Licensing Examination, Rater error, Rasch theory, Many facets Rasch model, Standardized patients

Chinese Library Classification (CLC) number:

R0
R-05

Lu Yan, Zhang Ying, He Ju, Zou Jiewen. The application of many facets Rasch model in the first stage clinical fundamental skills test of National Medical Licensing Examination phased examination[J]. Chinese Journal of Medical Education, 2020, 40(4): 311-315.
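
For readers unfamiliar with the many facets Rasch model named in the abstract, the following is a minimal sketch of its common rating-scale form, in which the log-odds of receiving score k rather than k-1 equal candidate ability minus item difficulty minus rater severity minus a category threshold. The abstract does not state which model specification or software the authors used, so this form, the function names, the item difficulty of 0, and the threshold values are all assumptions made for illustration. Only the ability (2.75 logits) and severity values (-0.45 and -0.70 logits) are taken from the abstract. In this convention a more negative severity means a more lenient rater.

```python
import math


def category_probabilities(ability, item_difficulty, rater_severity, thresholds):
    """Return P(score = k) for k = 0..K under a rating-scale many-facet Rasch model.

    Assumed form (not confirmed by the source):
    log(P_k / P_{k-1}) = ability - item_difficulty - rater_severity - thresholds[k-1]
    All quantities are in logits.
    """
    # Cumulative sums of the adjacent-category logits; category 0 is the reference.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + ability - item_difficulty - rater_severity - tau)
    # Normalize with a max shift for numerical stability.
    shift = max(logits)
    weights = [math.exp(v - shift) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probs):
    """Model-expected rating given the category probabilities."""
    return sum(k * p for k, p in enumerate(probs))


def score_variance(probs):
    """Model variance of the rating given the category probabilities."""
    mean = expected_score(probs)
    return sum((k - mean) ** 2 * p for k, p in enumerate(probs))


def infit_mean_square(observed, prob_rows):
    """Information-weighted fit: squared residuals summed, divided by summed variances."""
    numerator = sum((x - expected_score(p)) ** 2 for x, p in zip(observed, prob_rows))
    denominator = sum(score_variance(p) for p in prob_rows)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical thresholds for a 0-3 rating scale; not taken from the paper.
    thresholds = [-1.0, 0.0, 1.0]
    examiner = category_probabilities(2.75, 0.0, -0.45, thresholds)
    sp = category_probabilities(2.75, 0.0, -0.70, thresholds)
    print("expected score, examiner severity:", round(expected_score(examiner), 3))
    print("expected score, SP severity:", round(expected_score(sp), 3))
```

Under this sketch, the same candidate is expected to score slightly higher with the more lenient (more negative) severity value. An Infit mean square near 1 indicates ratings about as variable as the model predicts, which is the statistic the abstract compares against 1.5.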
