
Item Difficulty and Response Time Prediction with Large Language Models: An Empirical Analysis of USMLE Items

dc.contributor.author: Bulut, O.
dc.contributor.author: Gorgun, G.
dc.contributor.author: Tan, B.
dc.date.accessioned: 2025-05-01T20:51:22Z
dc.date.available: 2025-05-01T20:51:22Z
dc.date.issued: 2024-06-20
dc.description: This paper summarizes our methodology and results for the BEA 2024 Shared Task. This competition focused on predicting item difficulty and response time for retired multiple-choice items from the United States Medical Licensing Examination® (USMLE®). We extracted linguistic features from the item stem and response options using multiple methods, including the BiomedBERT model, FastText embeddings, and Coh-Metrix. The extracted features were combined with additional features available in item metadata (e.g., item type) to predict item difficulty and average response time. The results showed that the BiomedBERT model was the most effective in predicting item difficulty, while the fine-tuned model based on FastText word embeddings was the best model for predicting response time.
dc.identifier.doi: https://doi.org/10.7939/r3-0xjn-2446
dc.language.iso: en
dc.relation: https://aclanthology.org/2024.bea-1.44/
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject: NLP
dc.subject: LLM
dc.subject: education
dc.subject: USMLE
dc.subject: item difficulty
dc.subject: response time
dc.title: Item Difficulty and Response Time Prediction with Large Language Models: An Empirical Analysis of USMLE Items
dc.type: http://purl.org/coar/resource_type/R60J-J5BD
ual.jupiterAccess: http://terms.library.ualberta.ca/public
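The abstract describes a pipeline that converts item text into numeric features and then regresses item difficulty on those features. The following is a minimal sketch of that general idea only: TF-IDF plus ridge regression stands in for the BiomedBERT and FastText embeddings used in the actual study, and the item stems and difficulty values below are fabricated (retired USMLE items are not reproduced here).

```python
# Sketch of the text-features -> difficulty-regression idea from the abstract.
# TF-IDF + Ridge is a stand-in for the embedding-based models in the paper;
# all item texts and difficulty scores below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical item stems (not real USMLE content).
items = [
    "A 45-year-old man presents with chest pain radiating to the left arm.",
    "A 3-year-old girl has a fever and a barking cough.",
    "A 60-year-old woman reports progressive memory loss over two years.",
    "A 25-year-old athlete twists his knee and hears a popping sound.",
]
difficulty = [0.62, 0.35, 0.71, 0.28]  # fabricated difficulty targets

# Vectorize the text and fit a regularized linear regressor in one pipeline.
model = make_pipeline(TfidfVectorizer(), Ridge(alpha=1.0))
model.fit(items, difficulty)

# Predict difficulty for an unseen (also fabricated) item stem.
pred = model.predict(["A 50-year-old man presents with sudden chest pain."])
print(round(float(pred[0]), 3))
```

The same pipeline shape applies to response-time prediction by swapping the target variable; the paper additionally concatenates item-metadata features (e.g., item type) with the text features before regression.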

Files

Original bundle

Name: 2024.bea-1.44.pdf
Size: 128.36 KB
Format: Adobe Portable Document Format