Suitability of the Oxford Shoulder Score for measuring pain in clinical trials evaluating interventions for people with shoulder disorders according to the OMERACT filter 2.2

Published on June 4, 2026

Semin Arthritis Rheum. 2026 May 9;79:152998. doi: 10.1016/j.semarthrit.2026.152998. Online ahead of print.

ABSTRACT

BACKGROUND: Pain is a mandatory domain in Outcome Measures in Rheumatology (OMERACT) core outcome sets for shoulder disorder trials. Using the OMERACT Filter 2.2, we evaluated whether the Oxford Shoulder Score (OSS), a composite measure comprising four pain and eight function items with no separate subscales for these domains, is suitable for measuring pain.

METHODS: Following the OMERACT Handbook, we assessed domain match and feasibility, then systematically reviewed OSS measurement properties in shoulder disorders (rotator cuff disease, adhesive capsulitis, instability, osteoarthritis, dislocation, humeral head fractures and unspecified pain). MEDLINE, EMBASE and CINAHL were searched to June 2023. Reviewers independently screened studies, appraised methodological quality and extracted data. Measurement properties were synthesised and rated (green, amber, red or white). Results were summarised in a Summary of Measurement Properties (SOMP) table and discussed at the OMERACT 2025 workshop.

RESULTS: The OSS was rated amber for feasibility and domain match, reflecting concerns about its multidimensional nature while acknowledging the interrelatedness of pain and function. Twenty-three studies were included in the systematic review: eleven examined construct validity, three test-retest reliability, ten responsiveness, five clinical trial discrimination and five thresholds of meaning. Thirty of 34 components were judged suitable to proceed (eight green; 22 amber). All studies assessed the OSS total score (pain and function); one assessed a 4-item pain subscale. For the total OSS score, construct validity and responsiveness were green and reliability, trial discrimination and thresholds of meaning were amber. Two studies reported no floor or ceiling effects; one was equivocal. At OMERACT 2025, 95% of respondents agreed that multidimensional instruments should not be used to measure single domains without validated subscales.

CONCLUSION: The OSS total score shows adequate construct validity and responsiveness, but its combined assessment of pain and function makes it unsuitable for measuring pain alone.

PMID:42235149 | DOI:10.1016/j.semarthrit.2026.152998