
Current Issue

    Humanistic Quality Education
    Opportunities, challenges and integration paths of medical humanities education in the AI era
    Zhang Kunsong, Zhang Junlong, Feng Shaoting, Chen Wei, Kuang Ming, Xiao Haipeng
    2026, 46 (4):  241-246.  DOI: 10.3760/cma.j.cn115259-20250707-00751
    The wide application of artificial intelligence (AI) technology in the medical field is reshaping the boundaries of medicine. AI technology not only significantly improves the efficiency of medical services and the accuracy of diagnostic decisions, but also brings new opportunities for scenario-based teaching, practice-oriented development, and precision assessment in medical humanities education. At the same time, due to the inherent limitations of AI technology itself and the excessive reliance on it by doctors, problems such as the weakening of doctors′ comprehensive capabilities, the crisis of physician-patient trust, and the difficulty in defining medical responsibilities have arisen, putting medical humanities education face to face with new challenges. In recent years, The First Affiliated Hospital of Sun Yat-sen University has fully integrated AI technology with medical humanities education through various initiatives, including top-level design, the reconstruction of a full-cycle integrated humanities education curriculum system, the construction of an interdisciplinary and composite ″AI+″ faculty team, the establishment of a diversified evaluation system, and hospital-industry collaborative innovation. While maintaining the core position of medical humanities education, the hospital has fully leveraged AI technology for empowerment, aiming to cultivate composite medical professionals with both clinical competence and medical humanistic spirit, and provide references for the reform of medical humanities education under the Healthy China strategy.
    Exploring a medical humanities education pathway in medical museums from the perspective of situated learning
    Zhang Xiaolei, Bao Han
    2026, 46 (4):  247-251.  DOI: 10.3760/cma.j.cn115259-20250427-00475
    Objective To explore the role of medical museums in medical humanities education and clarify their functional positioning, with the aim of developing a practice-oriented educational pathway. Methods Grounded in situated learning theory, a mixed-methods approach was employed. The museum complex of Shanghai Medical College, Fudan University served as the primary case, supplemented by a comparative analysis of publicly available information from medical museums worldwide. In June 2025, a questionnaire survey was administered to the museum′s docent team (n=55), and descriptive statistics were used to evaluate educational outcomes. Finally, a medical humanities education pathway in medical museums was constructed by integrating theoretical analysis and empirical findings. Results As a situated learning environment, the medical museum effectively fosters the transformation of medical humanities education from knowledge acquisition to value internalization by constructing multi-dimensional situations, forming learning communities, and designing participatory activities. The survey showed that medical students deeply involved in museum activities reported the highest agreement on ″professional mission″ (4.67±0.47) and ″communication skills″ (4.67±0.47). Furthermore, a positive correlation was found between the depth of participation and educational outcomes. Conclusions The educational pathway developed in this study provides a theoretical framework and practical guidance for leveraging medical museums in medical humanities education, and can effectively enhance medical students′ humanistic literacy and professional identity.
    English and Bilingual Teaching
    Effect of online Clinical Professional English course in improving medical students′ clinical English proficiency
    Liang Zhiqiao, Huang Rui, Bao Jing, Chang Panpan, Chen Xiuyuan, Wang Qi, Diao Tongxiang, Zhang Fang, Zhang Yuanyuan
    2026, 46 (4):  252-258.  DOI: 10.3760/cma.j.cn115259-20250206-00117
    Objective This study aims to evaluate the effectiveness of the online Clinical Professional English course in enhancing the clinical English proficiency of students in the eight-year medical program, and to explore the students′ satisfaction with the course. Methods Using convenience sampling, 60 students from the eight-year medical program (admitted between 2016 and 2018) at Peking University People′s Hospital were selected as the research subjects, including 18 students from the first year, 24 students from the second year, and 18 students from the third year. Teaching effectiveness was assessed through pre- and post-course tests and satisfaction surveys of the online Clinical Professional English course. The course incorporated diverse digital resources, virtual case studies, and pre- and post-course tests, focusing on enhancing medical students′ understanding of professional English terminology, fluency in expression, and cross-cultural communication skills. Results The average scores increased from 10.60±2.59 to 16.27±2.15 (t=-12.974, P<0.001), demonstrating significant improvement in students′ mastery of professional terminology, fluency, and confidence in cross-cultural communication. Subjective feedback revealed that 66.7% (40/60) of students reported significant progress in professional vocabulary, and 58.3% (35/60) self-assessed increased confidence in cross-cultural communication. However, some students highlighted a lack of interactivity in online teaching. Conclusions The online Clinical Professional English course significantly improves students′ mastery of professional terminology, fluency in expression, and confidence in cross-cultural communication, but it falls short in interactivity. A blended teaching model (combining online and offline learning) may therefore more comprehensively meet students′ practical application needs for clinical professional English.
    Teaching Methods
    Learning curve analysis of tracheal extubation skills in undergraduate medical interns under the ″Sandwich″ teaching method
    Liu Mingjuan, Dong Yuanyuan, Cao Jin, Yu Danhong, Yang Rui, Liu Ke, Ni Huadong
    2026, 46 (4):  259-263.  DOI: 10.3760/cma.j.cn115259-20240625-00652
    Objective To explore the minimum number of practice attempts required for undergraduate medical interns to master endotracheal extubation skills under the ″sandwich″ teaching method. Methods Fifteen fifth-year undergraduate clinical students from the Class of 2019 at Jiaxing University medical college, who completed a rotation in the Department of Anesthesiology at the Affiliated Hospital of Jiaxing University between June and September 2023, were enrolled in the study. The ″sandwich″ teaching method was employed for training in tracheal extubation procedures. All bedside extubation practices were independently supervised and assessed by the same two senior attending physicians. Each intern independently completed 30 cases of tracheal extubation (a total of 450 cases). The quality score of each operation was recorded using a scale validated by the Delphi method, and a learning curve was constructed using the Cumulative Sum (CUSUM) method. Quadratic and cubic polynomial functions were fitted and their goodness of fit (R²) was compared. Derivative analysis was used to examine the slope of the fitted curve and determine the skill turning point. Results The cubic polynomial function fitted the learning curve best (R²=0.914). Derivative analysis of the fitted polynomial identified the 18th operation as the inflection point of the learning curve slope, indicating that skill improvement slowed markedly and entered a stable plateau. Conclusions The ″sandwich″ teaching method can serve as an effective approach for training interns in endotracheal extubation skills. Learning curve analysis indicated that the minimum number of practice attempts required to achieve skill mastery was 18, providing evidence-based support for setting teaching objectives and optimizing resource allocation.
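The CUSUM learning-curve analysis described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not give the CUSUM formula, the scoring target, or the turning-point criterion, so the deviation-from-target definition, the `target` value, the sample scores, and the "first non-positive slope" rule are all assumptions made here for illustration. The fit uses plain normal equations, which is adequate for low degrees and short series.

```python
def cusum(scores, target):
    """Running sum of (target - score): rises while scores are below target."""
    total, curve = 0.0, []
    for s in scores:
        total += target - s
        curve.append(total)
    return curve


def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.
    Returns c with y ~ c[0] + c[1]*x + ... + c[degree]*x**degree."""
    n = degree + 1
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= f * a[col][k]
            b[r] -= f * b[col]
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def slope(coeffs, x):
    """First derivative of the fitted polynomial at x."""
    return sum(k * c * x ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fit."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def turning_point(coeffs, xs):
    """First attempt where the fitted slope is no longer positive (assumed criterion)."""
    for x in xs:
        if slope(coeffs, x) <= 0:
            return x
    return None


# Hypothetical quality scores for one trainee over 10 attempts (not study data).
demo_scores = [5, 6, 6, 7, 8, 9, 9, 10, 10, 10]
attempts = list(range(1, len(demo_scores) + 1))
curve = cusum(demo_scores, target=8)  # target of 8 is an assumed passing score
for degree in (2, 3):
    c = polyfit(attempts, curve, degree)
    print(degree, round(r_squared(attempts, curve, c), 3), turning_point(c, attempts))
```

Comparing the printed R² values for degree 2 and degree 3 mirrors the model comparison in the abstract; with real data, the attempt number returned by `turning_point` would depend on the criterion chosen.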
    Application of in situ simulation in comprehensive first aid skills training for emergency standardized training residents
    Guo Yawei, Zhu Dandan, Han Yingna, Liu Zhi, Wang Changyuan
    2026, 46 (4):  264-268.  DOI: 10.3760/cma.j.cn115259-20250412-00412
    Objective To study the application of in situ simulation (ISS) in comprehensive first aid skills training for emergency standardized training residents. Methods This study adopted an experimental control method. A total of 67 residency trainees who rotated through the Emergency Department of Xuanwu Hospital, Capital Medical University, from September 2023 to September 2024 were enrolled. They were randomly divided into an experimental group (n=34) and a control group (n=33). The experimental group received training through ISS using a SimMan 3G manikin, whereas the control group was trained in the medical simulation center using the same manikin but through off-site simulation. After training, assessments were conducted using high-fidelity physiologically driven manikins to compare the rescue success time and rescue success rate between the two groups. Non-technical skills of trainees in both groups were assessed, and scores in four areas, including task management, were compared. Questionnaires were distributed to compare self-evaluations of learning interest and competency. Data were analyzed using independent samples t-tests, χ² tests, etc. Results The rescue success time of the experimental group was shorter than that of the control group [(7.92±0.86) minutes vs. (8.43±0.78) minutes], and the rescue success rate was higher than that of the control group (91.18% vs. 72.73%). The non-technical skill scores for task management and situational awareness in the experimental group were higher than those in the control group [(14.54±3.65) vs. (12.36±3.54) and (9.86±1.63) vs. (8.83±1.74), respectively]. Questionnaire results showed that self-evaluation scores for learning interest and competency were higher in the experimental group than in the control group [(4.44±0.66) vs. (4.00±0.79) and (4.32±0.68) vs. (3.85±0.79), respectively]. All the above differences were statistically significant (all P<0.05). Conclusions The application of ISS in comprehensive emergency skill training for residency trainees helps improve rescue success rates, enhances non-technical skills such as task management and situational awareness, and promotes the development of their competency.
    Educational Technologies
    Application of dynamic cases based on large language models in pediatric medical education
    Wei Xiaotong, Wen Deliang
    2026, 46 (4):  269-274.  DOI: 10.3760/cma.j.cn115259-20250410-00404
    This study aims to break through the limitations of traditional static case design by constructing a ″dynamic prompt model″ based on large language models and exploring its application and effectiveness in pediatric medical education. The model encompasses disease course progression information (prodromal, acute, and recovery phases), a complete clinical decision-making chain (history taking, physical examination, auxiliary investigations, diagnosis, and treatment), and progressively advanced cognitive objectives (remembering, understanding, applying, analyzing, and evaluating). The disease list includes two categories: common pediatric diseases and rare diseases. The study selected 45 fourth-year pediatric medical students from China Medical University as research participants to conduct a 4-week learning program. The results indicate that after learning with dynamic cases, the virtual case scores significantly increased from (75.31±15.21) to (82.22±11.43). Critical thinking ability improved from (102.67±10.93) to (110.13±12.61), with system analysis skills rising from (42.85±3.91) to (45.13±4.61) and knowledge exploration willingness increasing from (31.55±3.74) to (36.32±5.11), all P<0.05. The realism scores of different cases were relatively high (all>4), and the difficulty ratings were reasonably distributed. In the experience evaluation, the ″enhancing learning″ dimension scored the highest (4.02±0.47). Therefore, the ″dynamic prompt model″ effectively improved medical students′ clinical thinking and critical thinking abilities, providing an efficient and innovative tool for pediatric medical education.
    Application and evaluation of teaching effects of knowledge graph in promoting deep learning in epidemiology
    Luo Yingyi, Li Jia, Sun Wenwen, Sheng Yueying, Weng Huachun, Chen Yanfeng
    2026, 46 (4):  275-279.  DOI: 10.3760/cma.j.cn115259-20250328-00340
    Objective To investigate the application and teaching effects of Knowledge Graph (KG) in promoting deep learning in an epidemiology course, and to provide references for mathematical intelligence curriculum reform. Methods From September 2024 to December 2024, undergraduate students majoring in health inspection and quarantine enrolled in 2022 were selected as the research participants. An online KG teaching module was used to assist in the teaching of the Epidemiology course. Information including grade point averages, learning attitudes, learning habits, and the learning effects of KG on their deep learning was collected through online questionnaires. Multiple linear regression analysis was used to study the influencing factors of students′ deep learning effects and of the process and final assessment scores. Results The KG learning effect score was 90.0 (12.5), and the self-assessment score of deep learning effect was 36.0 (7.2). The process assessment score was 88.0 (10.5), and the final assessment score was 71.5 (32.5). Univariate analysis found that students with long daily review time after class, a high grade point average in the previous semester, strong exploration ability, and strong ability to apply knowledge to practice had better learning effects with the knowledge graph (all P<0.05). Multiple linear regression analysis indicated that students′ knowledge graph learning effect was a positive factor for their self-assessment of deep learning (P=0.010). The knowledge graph learning effect score was a positive influencing factor for process assessment scores and final assessment scores (P<0.001; P=0.042). Strong exploration ability of students promoted the improvement of process assessment scores and final assessment scores (P=0.021; P=0.032). Conclusions Integrating KG into the teaching of Epidemiology courses could guide students to cultivate personalized learning characteristics, including deep learning abilities and exploration abilities, and enhance the deep learning effects and teaching quality in mathematical intelligence curriculum reforms.
    Clinical Teaching
    Impact of community medicine internship on medical students′ cognition of primary healthcare, learning interest and willingness to work in primary healthcare
    Yu Hongyan, Ye Yan, Zhang Li, Zhao Xueyan, Hu Chunyan, Chen Shaohua
    2026, 46 (4):  280-283.  DOI: 10.3760/cma.j.cn115259-20250328-00343
    Objective To explore the impact of community medicine internship on medical students′ cognition of and interest in primary healthcare, and their willingness to work in primary healthcare. Methods A survey was conducted using a self-designed questionnaire. From November 2023 to June 2024, 142 medical students from Zhejiang University who participated in the community medicine internship were enrolled in the study, and surveys were conducted before and after the internship. Data were analyzed using the Mann-Whitney U test, χ² test, and Spearman rank correlation analysis. Results The total score of medical students′ cognition of primary healthcare increased from 22.0 (20.5, 24.0) before the internship to 28.0 (27.0, 28.5) after the internship, with a statistically significant difference (P<0.05). The proportion of medical students interested in primary healthcare increased from 52.7% (49/93) before the internship to 67.4% (95/141) after the internship, with a statistically significant difference (P<0.05). The proportion of medical students willing to work in community health centers increased from 39.8% (37/93) before the internship to 48.2% (68/141) after the internship, and the proportion willing to engage in general practice increased from 40.9% (38/93) to 50.4% (71/141); these differences were not statistically significant (all P>0.05). The Spearman correlation coefficient between medical students′ primary healthcare cognition and interest was 0.289 before the internship and increased to 0.636 after the internship (all P<0.05). Conclusions Community medicine internship improved medical students′ cognition of and interest in primary healthcare, and strengthened the positive correlation between the two factors, but it did not have a significant impact on their willingness to work in primary healthcare.
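The before/after proportion comparisons reported in this abstract can be checked with a simple 2×2 chi-square test. The sketch below is only an illustration under stated assumptions: it treats the pre- and post-internship respondents as independent samples and uses Pearson's statistic without continuity correction, neither of which the abstract specifies. The function name `chi_square_2x2` is hypothetical; the counts are the ones printed in the abstract.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].
    Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Interest in primary healthcare: 49/93 before vs 95/141 after.
interest = chi_square_2x2(49, 93 - 49, 95, 141 - 95)

# Willingness to work in community health centers: 37/93 before vs 68/141 after.
willingness = chi_square_2x2(37, 93 - 37, 68, 141 - 68)

print("interest:", interest)
print("willingness:", willingness)
```

Under these assumptions, the interest comparison falls below the 0.05 threshold and the willingness comparison does not, which agrees with the direction of the reported results.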
    Evaluation of the ″establishing health records for relatives and friends″ practical teaching module in the teaching of introduction to general practice
    Wang Huguo, Guo Xue, Wang Rui, Wu Yun
    2026, 46 (4):  284-287.  DOI: 10.3760/cma.j.cn115259-20250527-00592
    This study aims to explore the application value of the ″Establishing Health Records for Relatives and Friends″ practical teaching module based on WONCA job competencies in the Introduction to General Practice course, and to evaluate its effect on enhancing the six core competencies of medical students in general practice. The practical teaching module was designed around five tasks across four stages: ″establishment, assessment, intervention, and follow-up″, and was implemented after the course. Seventy undergraduate clinical medicine students were recruited to participate in the practice, of whom 56 completed all tasks and were included in the analysis. The improvement in students′ competencies was evaluated through objective scoring and structured questionnaires. The results showed that in the objective scoring, the distribution of student scores was as follows: 90-100 points, 10.7% (6/56); 80-89 points, 64.3% (36/56); 70-79 points, 25.0% (14/56). According to the questionnaire survey results, the proportion of students reporting ″greatly improved″ or ″significantly improved″ in 7 evaluation items, including ″clinical thinking and comprehensive analytical skills″ and ″promoting family health″, exceeded 55%. The ″Establishing Health Records for Relatives and Friends″ practical teaching module helps strengthen medical students′ understanding of the concept of ″family-centered health care″, enhances their core competencies in health record management, risk assessment, communication, and humanistic care, and demonstrates the advantages of integrating theory and practice in teaching.
    Graduate Education
    Pathways and reflections on education through health-related public welfare practice for medical postgraduates
    Hu Mingyi, Zhou Zheng, Feng Hong
    2026, 46 (4):  288-292.  DOI: 10.3760/cma.j.cn115259-20250710-00767
    In the context of the Healthy China Initiative, health-related public welfare practice has become an important approach to cultivating medical postgraduates in the new era. To address current problems in such practice, including insufficient value identification, inadequate management mechanisms, and suboptimal activity quality, Xiangya School of Basic Medicine Sciences at Central South University has explored the construction of a practical education system from four aspects: focusing on the fundamental task of fostering virtue through education, expanding public welfare practice platforms, innovating the content and forms of practices, and optimizing management and incentive mechanisms. A questionnaire survey of 328 postgraduate students who participated in health-related public welfare practices showed that 97.3% (319/328) believed that these activities had deepened their understanding of public health needs, and 76.8% (252/328) reported that their communication skills had been improved. Furthermore, the School has produced a National ″100 Exemplary Graduate Student Party Branches″ in Chinese Universities and a group of outstanding student representatives serving in border and grassroots areas. It was awarded the National Outstanding Team in the ″Three Going-Down″ Social Practice, and its life education science popularization brand has served over 40 000 members of the public. Practice has shown that this pathway effectively promotes the improvement of postgraduate students′ professional competence and sense of social responsibility, providing valuable insights and references for the high-quality development of medical postgraduate education in the new era.
    A study on the pathways for selecting thesis topics for master′s students in clinical medicine under dual clinical-basic mentor guidance
    Liu Gongwen, Hu Yutong, Li Zhaohui, Xu Ying, Qian Zhiyuan, Xu Youjia
    2026, 46 (4):  293-298.  DOI: 10.3760/cma.j.cn115259-20241117-01186
    To facilitate the selection of thesis topics for master′s students in clinical medicine and to ensure that their research remains aligned with the most recent advancements in disease studies, while also balancing fundamental research with clinical relevance, the Second Affiliated Hospital of Soochow University has established a dual-mentorship system. This system involves collaboration between clinical and basic science faculty to provide guidance in selecting appropriate thesis topics. To establish collaborative partnerships between clinical and basic science mentors aligned with the clinical mentor′s research direction, small-scale experiments and preliminary foundational validation studies were conducted. Based on the experimental results, thesis topics were selected for 14 master′s students in clinical medicine during their topic selection phase. All students graduated on schedule, with their thesis topics integrating both basic and clinical research while emphasizing clinical application and translational potential. Among the students, 12 of them reported that this topic selection approach helped them master the methods for developing research topics. Three theses were recognized as outstanding at the university level, one received a provincial-level outstanding thesis award, and three were granted second-class provincial graduate innovation awards. The dual-supervision model involving both clinical and basic science mentors facilitates a smooth thesis topic selection process and helps ensure the quality of master′s education.
    Standardized Residency Training
    Exploration and practice of ″team-based″ medical aid to Xinjiang in promoting quality consistency of cross-regional standardized residency training
    Wang Guojie, Buerjier Abuduwayiti, Zhong Shuping, Zhu Ye, Wu Qing, He Tao, Wang Xuemei, Xiao Fei, Lin Tianxin
    2026, 46 (4):  299-303.  DOI: 10.3760/cma.j.cn115259-20250621-00696
    The purpose of Standardized Residency Training (SRT) is to cultivate competent, fit-for-practice physicians. However, the quality of training varies across regions, particularly between the east and the west. Relying on the ″team-based″ medical aid program in Xinjiang, the affiliated hospitals of Sun Yat-sen University explored ways to ensure consistent SRT quality across regions. Previous research employed a combination of retrospective analysis and interviews. Data on the SRT at the Kashi Prefecture First People′s Hospital (Kashi First Hospital) from 2017 to 2024 were collected, and interviews were conducted with 20 individuals, including training administrators, faculty members at Kashi First Hospital, and experts from Sun Yat-sen University. The study focused on the issues in cross-regional collaboration for SRT, such as institutional alignment and implementation challenges. Based on the above data, this paper examines the reform measures implemented by experts to address the main problems in the SRT at Kashi First Hospital and the outcomes achieved. By August 2024, Sun Yat-sen University had dispatched 138 medical experts to support Kashi First Hospital. Through measures such as transferring SRT management frameworks and systems, enhancing faculty development and incentives, optimizing training content, and implementing information management, the hospital was approved as a National Medical Licensing Examination Practical Skill Test center. The number of specialized training bases increased to 20, and the number of faculty with provincial-level or higher qualifications grew to 401. The pass rates for the National Medical Licensing Examination Practical Skill Test and the SRT final examination rose from 69.0% and 69.8% in 2019 to 85.0% and 84.5% in 2024, respectively. By leveraging the ″team-based″ medical aid program in Xinjiang, the affiliated hospitals of Sun Yat-sen University have improved the consistency of SRT quality across regions, providing valuable experience for training medical talent in western China.
    Continuing Medical Education
    A survey and analysis of the current competency status of community-based public health physicians in Shanghai
    Zhang Ziyi, Guo Yuxin, Shi Shimiao, Shi Lili, Cai Yuyang
    2026, 46 (4):  304-309.  DOI: 10.3760/cma.j.cn115259-20250515-00548
    Objective To investigate the current status of competencies among community-based public health physicians in Shanghai and to inform targeted competency improvement strategies. Methods A structured questionnaire was designed based on the McClelland competency dictionary and informed by literature review, behavioral event interviews, and expert consultation. In February 2023, a questionnaire survey was conducted among all 1 664 community-based public health physicians from 247 community health service centers in Shanghai. Statistical analyses were performed on the collected data. The Mann-Whitney U test and the Kruskal-Wallis test were used for comparison between groups. Results The overall competency score of community-based public health physicians was 153.00 (out of a maximum of 210). The scores of each competency dimension, ranked from highest to lowest, were as follows: professional ethics 4.00 (3.43, 4.29), general competencies 3.75 (3.13, 4.00), professional skills 3.67 (3.00, 4.00), research and development ability 3.50 (3.00, 4.00), and professional knowledge 3.25 (3.00, 4.00). The competency scores of community-based public health physicians with junior, intermediate, and senior professional titles were 151.00 (130.75, 168.00), 155.00 (137.00, 168.00), and 167.50 (149.00, 173.00), respectively. The competency scores of community-based public health physicians with 0-3, 4-6, and more than 6 professional training sessions were 149.00 (128.00, 168.00), 151.00 (133.00, 167.00), and 157.00 (136.00, 168.00), respectively. Statistically significant differences in competency scores were observed among community-based public health physicians with different professional titles and with different training frequencies (all P<0.05). Conclusions The current competency level of community-based public health physicians in Shanghai was found to be generally acceptable. However, professional knowledge and research and development ability still require further improvement. It is recommended that relevant institutions provide targeted training for community-based public health physicians based on differences in professional titles and training frequencies to further enhance their overall competency level.
    Medical Education Assessment
    Construction of general practice military doctors′ competency evaluation indicator system
    Dong Zhao, Shi Mingyue, Shi Kang, Dong Xiaojian, Liu Weiming, Yang Jiaxiao, Ge Wei
    2026, 46 (4):  310-314.  DOI: 10.3760/cma.j.cn115259-20250421-00446
    Objective To construct a competency evaluation indicator system for general practice military doctors, clarify the training objectives for grass-roots general practice military doctors under the new situation, and facilitate accurate evaluation of their competencies. Methods The evaluation index framework was initially established through literature review, expert consultation, and group discussion. The Delphi method was then used to finalize the index system through two rounds of correspondence with 18 experts. The Analytic Hierarchy Process was subsequently applied to form the judgment matrix, calculate the importance weights of indicators at all levels, and produce the evaluation indicator system. Results The recovery rates of questionnaires in both rounds of consultation were 100.0%. The expert authority coefficient was 0.875, and the coordination coefficient of expert opinions in the second round reached 0.400 (P<0.05), leading to the establishment of an evaluation indicator system of general practice military doctors′ competency. This system comprises 6 first-level indicators and 27 second-level indicators. The weight ranking of first-level indicators is as follows: occupational qualities (0.363), knowledge ability (0.260), rescue ability (0.175), military and political literacy (0.076), communication expansion (0.076), and lifelong development (0.050). Conclusions The evaluation indicator system determined by the Delphi method and Analytic Hierarchy Process is scientifically reliable. It clarifies the training objectives of the ″six core competencies″ of general practice military doctors. It is of great significance for establishing and improving the training, education, and evaluation system of general medical doctors guided by job competencies in the military, ensuring the sustainable and high-quality growth of grass-roots military doctors, and promoting the high-quality development of grass-roots health care in the military.
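The Analytic Hierarchy Process step mentioned in this abstract (judgment matrix to indicator weights) can be sketched as follows. This is a hedged illustration rather than the authors' computation: the abstract does not publish its judgment matrices or say how weights were derived, so the row geometric-mean method, the 3×3 `demo_matrix`, and the function names are assumptions. The random-index values are the commonly cited Saaty figures for 3 to 6 criteria.

```python
import math

# Commonly cited Saaty random consistency index values (assumed table).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}


def ahp_weights(matrix):
    """Priority weights from a pairwise judgment matrix (row geometric-mean method)."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI, with lambda_max estimated from A*w."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgment matrix for three indicators (not taken from the study).
demo_matrix = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
demo_weights = ahp_weights(demo_matrix)
print("weights:", [round(w, 3) for w in demo_weights])
print("CR:", round(consistency_ratio(demo_matrix, demo_weights), 3))

# The six first-level weights reported in the abstract should sum to 1.
reported = [0.363, 0.260, 0.175, 0.076, 0.076, 0.050]
print("reported weights sum:", round(sum(reported), 3))
```

A consistency ratio below 0.10 is the conventional threshold for accepting a judgment matrix; whether the study applied that threshold is not stated in the abstract.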
    Evaluating the capability of large language models in answering questions on radiologic contrast agents
    Ma Xiaowen, Zuo Xu, Gong Jing, Gu Yajia
    2026, 46 (4):  315-320.  DOI: 10.3760/cma.j.cn115259-20250729-00851
    Objective To evaluate the capability of large language models (LLMs) in answering questions related to radiologic contrast agents. Methods In March 2025, this study employed a multifactorial repeated-measures design to develop a test question bank and evaluation system for radiologic contrast agents. DeepSeek-R1, DeepSeek-V3, GPT-4, Phi-4, and Llama-3.3 were selected to answer the test items. Performance was compared before and after integration with a contrast agent knowledge base using repeated-measures analysis of variance (ANOVA). Results Before accessing the contrast agent knowledge base, the scores of the five models were as follows: DeepSeek-R1 (78.94±3.96), DeepSeek-V3 (76.11±3.31), GPT-4 (75.92±2.02), Phi-4 (55.78±2.18), and Llama-3.3 (66.58±4.04), with statistical differences among models (P<0.05). After integration, the scores were as follows: DeepSeek-R1 (75.89±2.65), DeepSeek-V3 (79.64±1.97), GPT-4 (77.97±2.19), Phi-4 (73.78±3.49), and Llama-3.3 (80.22±2.71), again with statistical differences (P<0.05). In multiple-choice questions, DeepSeek-R1 achieved a perfect score without the knowledge base, while Llama-3.3 attained a perfect score after integration. For subjective questions, DeepSeek-V3 and GPT-4 scored above 36 without the knowledge base, whereas DeepSeek-R1, DeepSeek-V3, and Llama-3.3 exceeded 36 after integration. Conclusions The five LLMs demonstrated the ability to answer basic questions on radiologic contrast agents, and the contrast agent knowledge base had a notable impact on their performance. DeepSeek-R1 performed best without the knowledge base, while Llama-3.3 showed the greatest improvement after integration.