Chinese Journal of Medical Education, 2024, Vol. 44, Issue 8: 561-569. DOI: 10.3760/cma.j.cn115259-20240524-00520


Exploratory practice of generative large language models in the construction of medical item banks

Jiang Zhehan1, Feng Shicong2, Wang Weimin1   

  1. Institute of Medical Education, Peking University, Beijing 100191, China;
  2. Master Degree Candidate, Medical Education Major (enrolled 2023), Graduate School of Education, Peking University, Beijing 100871, China
  • Received:2024-05-24 Online:2024-08-01 Published:2024-07-31
  • Contact: Wang Weimin, Email: wwm@bjmu.edu.cn
  • Supported by:
    Project of the Health Human Resources Development Center, National Health Commission, P.R.China(202110-335);National Natural Science Foundation Youth Project(72104006);Key Reform Project of the National Medical Examinations Centre, National Health Commission, P.R.China during the 14th Five-Year Plan Period(2022-21)

Abstract: Item development in health professions education is time-consuming and relies heavily on content experts. Large language models (LLMs) offer a new approach to reducing this burden, but output quality depends largely on the prompt. This article aims to guide educators in leveraging LLMs effectively for item development, improving output quality through prompt engineering. Using "postoperative bile leakage" as an example, the paper demonstrates the effectiveness of several prompt engineering strategies, including Zero-shot, Few-shot, Chain of Thought (CoT), CoT with Self-Consistency (CoT-SC), and Tree of Thoughts (ToT). It finds that while Zero-shot and Few-shot methods are straightforward, they are limited in item diversity and depth. By contrast, prompt strategies incorporating "Thought" elements can guide LLMs through stages of drafting, refining, comparing, and finalizing, thereby raising question quality. Although refined prompts yield notable improvements in question generation, substantial room remains for exploring and optimizing prompt formulations and strategies to further improve the quality of generated questions. Advances in prompt engineering hold the promise of significantly raising the standard of item bank development in medical education.
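To make the contrast between the strategies concrete, the sketch below builds prompt text for the Zero-shot, Few-shot, and Chain-of-Thought approaches named in the abstract. The topic string, prompt wording, and function names are illustrative assumptions, not the authors' actual prompts; no LLM is called, only the prompt text is constructed.

```python
# Illustrative sketch of three prompt-engineering strategies from the
# article: Zero-shot, Few-shot, and Chain of Thought. All wording here
# is an assumption for demonstration; only prompt strings are built.

TOPIC = "postoperative bile leakage"

def zero_shot_prompt(topic: str) -> str:
    """Ask directly for an item, with no examples or reasoning cues."""
    return (
        f"Write one multiple-choice question (options A-E) assessing "
        f"knowledge of {topic}. Indicate the correct answer."
    )

def few_shot_prompt(topic: str, examples: list[str]) -> str:
    """Prepend worked example items so the model imitates their format."""
    shots = "\n\n".join(f"Example:\n{e}" for e in examples)
    return (
        f"{shots}\n\nNow write one new multiple-choice question "
        f"(options A-E) on {topic} in the same format."
    )

def chain_of_thought_prompt(topic: str) -> str:
    """Walk the model through drafting, refining, and finalizing."""
    steps = [
        "1. List the key clinical concepts a learner must know.",
        "2. Draft a clinical vignette that tests one concept.",
        "3. Write five options with a single best answer.",
        "4. Check each distractor for plausibility, then finalize.",
    ]
    return (
        f"Create a multiple-choice question on {topic}. "
        "Reason step by step:\n" + "\n".join(steps)
    )
```

In this framing, the "Thought"-based strategies differ from Zero-shot and Few-shot only in that the prompt itself enumerates drafting, refining, and finalizing stages for the model to follow; CoT-SC and ToT would further sample or branch over such reasoning paths.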

Key words: Artificial intelligence, Generative large language models, Prompt engineering, Medical test questions, Item bank construction, Assessment development
