Chinese Journal of Medical Education ›› 2023, Vol. 43 ›› Issue (5): 391-396. DOI: 10.3760/cma.j.cn115259-20221114-01427

• Medical Education Assessment •

基于概化理论的临床医学专业(本科)水平测试临床基本技能考试质量分析

周文静1, 江哲涵2, 欧阳劲樱1, 王维民3   

Quality analysis of the clinical fundamental skills test for standardized competence test for clinical medicine undergraduates using generalizability theory

Zhou Wenjing1, Jiang Zhehan2, Ouyang Jinying1, Wang Weimin3   

  1. Department of Health Policy and Management, School of Public Health, Peking University, Beijing 100191, China;
  2. Institute of Medical Education & National Center for Health Professions Education Development, Peking University, Beijing 100191, China;
  3. Peking University Health Science Center, Beijing 100191, China
  • Received: 2022-11-14 Online: 2023-05-01 Published: 2023-05-05
  • Contact: Wang Weimin, Email: wwm@bjmu.edu.cn
  • Supported by:
    National Natural Science Foundation of China Youth Project (72104006)

Abstract: Objective To assess the reliability of the clinical fundamental skills test of the standardized competence test for clinical medicine undergraduates and to provide an empirical basis for improving the quality of the test. Methods Generalizability theory was applied to the scores of 7 322 examinees who completed the clinical fundamental skills test of the standardized competence test for clinical medicine undergraduates in 2020. According to the test design, examinees, stations and items were included in the analysis. The percentage of variance attributable to each of them was calculated to identify the main sources affecting test reliability, and the generalizability (G) coefficient and the phi index (index of dependability) were calculated to quantify reliability. Based on the scores of examinees who completed the same set of items, G coefficients and phi indices were recomputed for different numbers of items and stations to explore how the test design could be optimized. Results The percentage of variance was 70.0%~76.0% for items, 17.6%~24.4% for stations, and 0.3%~0.8% for examinees. G coefficients ranged from 0.502 to 0.717, and phi indices ranged from 0.052 to 0.223. With the number of stations held constant, adding one item increased the G coefficient by 0.002 and the phi index by 0.002, and adding five items increased the G coefficient by 0.007 and the phi index by 0.004. Adding one station increased the G coefficient by 0.025 and the phi index by 0.010, and adding two stations increased the G coefficient by 0.045 and the phi index by 0.017. Conclusions There is still room to improve the quality of the clinical fundamental skills test. Stations and items are the main sources affecting test reliability. Reliability can be improved by increasing the number of stations and items, but testing time and the cost of developing stations and items must be weighed together.

Key words: Students, medical, Generalizability theory, Standardized competence test for clinical medicine undergraduates, Clinical skills, Test quality, Reliability
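
The abstract describes computing G coefficients and phi indices and then recomputing them for different numbers of stations and items. The sketch below illustrates that kind of calculation under stated assumptions: it assumes a design in which examinees are crossed with items nested within stations, written p x (i:s), which the abstract does not confirm, and it uses made-up variance components rather than the study's estimates. The names `VarianceComponents` and `g_and_phi` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VarianceComponents:
    """Illustrative variance components for an ASSUMED p x (i:s) design."""
    p: float     # examinees (object of measurement)
    s: float     # stations
    i_s: float   # items nested within stations
    ps: float    # examinee-by-station interaction
    pi_s: float  # examinee-by-item-within-station interaction plus residual


def g_and_phi(vc: VarianceComponents, n_s: int, n_i: int) -> tuple[float, float]:
    """Return (G coefficient, phi index) for n_s stations with n_i items each."""
    # Relative error: only interactions with examinees change rank ordering.
    rel_err = vc.ps / n_s + vc.pi_s / (n_s * n_i)
    # Absolute error: station and item main effects also shift absolute scores.
    abs_err = rel_err + vc.s / n_s + vc.i_s / (n_s * n_i)
    g = vc.p / (vc.p + rel_err)
    phi = vc.p / (vc.p + abs_err)
    return g, phi


if __name__ == "__main__":
    # Made-up values for illustration only; NOT the estimates from the study.
    vc = VarianceComponents(p=1.0, s=2.0, i_s=4.0, ps=1.0, pi_s=4.0)
    for n_s, n_i in [(2, 2), (2, 4), (4, 2)]:
        g, phi = g_and_phi(vc, n_s, n_i)
        print(f"stations={n_s} items/station={n_i} G={g:.3f} phi={phi:.3f}")
```

Because the absolute error variance contains every term of the relative error variance plus the station and item main effects, the phi index can never exceed the G coefficient in this sketch, which is consistent with the direction of the gap between the two ranges reported above.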
