
The efficacy of ChatGPT as a student resource for the diagnosis and treatment of neck pain
Disabil Rehabil. 2026 Jun 5:1-11. doi: 10.1080/09638288.2026.2681561. Online ahead of print.
ABSTRACT
INTRODUCTION: Chat Generative Pretrained Transformer (ChatGPT) is becoming a widely used source of clinical information. No study has evaluated ChatGPT's effectiveness in providing educational material to students about neck pain.
METHODS: This cross-sectional study included 21 queries evaluating the use of ChatGPT v3.5 and v4.0 as educational tools for neck pain diagnosis and treatment. Misinformation was quantified using a Likert scale. Flesch-Kincaid grade level scores and word counts assessed readability. The DISCERN instrument evaluated quality, and the Patient Education Materials Assessment Tool (PEMAT) quantified understandability and actionability.
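For readers unfamiliar with the readability metric named above, the following is a minimal sketch of the standard Flesch-Kincaid grade level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. It is not the tool used in the study, which the abstract does not specify. The function names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration, so scores may differ from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables with a rough heuristic (an assumption, not a dictionary lookup)."""
    word = word.lower()
    # Each run of consecutive vowels is treated as one syllable.
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a likely silent trailing 'e' (e.g. "care"), but not "-le" or "-ee" endings.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Return the Flesch-Kincaid grade level of `text`."""
    # Sentences are split on runs of terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters and apostrophes; this count is also the response word count.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    sample = "Neck pain is common. Most cases improve with exercise and time."
    print(round(flesch_kincaid_grade(sample), 1))
```

A score near 12 corresponds to text that a reader at roughly a 12th-grade level would be expected to follow; longer sentences and more polysyllabic words both raise the score.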
RESULTS: No misinformation was present. Both chatbots produced responses around a 12th-grade reading level. ChatGPT v4.0 (M = 318.8 ± 53.3) had more words per response than ChatGPT v3.5 (M = 229.3 ± 44.6), p < 0.0001. ChatGPT v3.5 had greater information quality than ChatGPT v4.0 for intervention-related queries (p < 0.0001). Actionability scores were far lower than understandability scores; however, intervention queries had greater actionability scores than nonintervention queries (22.5% versus 10.5%, p < 0.0001). The chatbots produced moderate-quality responses.
CONCLUSION: The reading level and understandability likely make these chatbots learner-friendly. The chatbots are likely suitable for generating basic facts rather than providing direct advice on neck pain.
PMID:42246561 | DOI:10.1080/09638288.2026.2681561
