Psychometric properties of a new treatment expectation scale in rheumatoid arthritis: an application of item response theory
BMC Musculoskeletal Disorders, volume 16, Article number: 239 (2015)
Patient-generated health outcome measures are important in the assessment of long-term treatment goals for rheumatoid arthritis (RA), but few psychometrically sound measures are available. The MAPLe-RA (Measuring Actual Patient-Led expectations in RA) is a new questionnaire whose psychometric properties have not yet been investigated. This study aims to examine these properties for each of the items using item response theory (IRT).
Participants were included if they completed the MAPLe-RA scale. A one-parameter (Rasch) model and a two-parameter logistic (2PL) model were applied to these data using Mplus software.
One hundred thirty-eight patients with RA were included in the analysis. The MAPLe-RA scale comprises 21 items; the mean score was 71 (SD 20.28), with a range of 0 to 105. Most items operated in the high expectations part of the item characteristic curves (ICC). Item discrimination varied widely. The items with the highest discrimination capacity from the three domains were: pain (physical domain); control of my RA (self-management) and maintaining social role (psycho-social domain); and feeling better overall and involvement in treatment decision making (impact of new treatment domain).
RA patients’ expectations of treatment are higher in the physical and psycho-social domains and less so in the impact of new treatment domain.
Patient-generated health outcome measures play an important role in the assessment of long-term treatment goals for people with rheumatoid arthritis (RA) and are therefore well positioned to be used in assessing new treatments. In RA, there is no gold standard measure for the assessment of patients’ expectations at the time of diagnosis or before commencing treatment. To fill this gap we developed a new patient-generated expectancy measure, the Measuring Actual Patient-Led expectations in RA (MAPLe-RA) scale. Item response theory (IRT) is an approach that emphasizes the influence of the individual’s qualities as well as the items’ qualities in a test or questionnaire. The method originated in education, where individual qualities may reflect abilities, and was then extended to other applications, with well-known examples in medicine and psychology [2–4]. In this study the underlying construct is patients’ expectations, and the method was used to understand the psychometric properties of the individual items [5, 6]. Although the IRT method has been applied in several long-term conditions to assess the properties of outcome measures and questionnaires, it is rarely used in patient-generated measures in RA [8, 9]. For the development of new scales, a factor extraction method based on eigenvalues is traditionally used [10, 11] to explore the number of domains and the strength of association of items within domains. IRT additionally provides important detail on the psychometric properties of each item, including its difficulty and discrimination. MAPLe-RA is a new questionnaire and its psychometric properties have not yet been investigated. This study aims to examine these properties for each of the items using IRT.
The development stages of MAPLe-RA have been published elsewhere. In brief, in stage one, three repeated focus groups and two expert panels with RA patients were conducted by a patient researcher. In stage two, a feasibility study of the draft scale was conducted with 22 consecutive outpatient attendees over 1 week. Stage three was the psychometric analysis, the results of which are presented here. The MAPLe-RA scale comprises 21 items, with response options given on a Likert scale from 5 to 0. High scores indicate better treatment expectations. The scale is intended to measure expectations of treatment in three domains: physical, psychosocial and impact of treatment. Table 1 shows how the MAPLe-RA questionnaire is scored. MAPLe-RA was approved by the National Research Ethics Committee London-Central (REC reference number 10/HO718/82). All participants provided informed consent.
The demographic information of participants is described in Table 2, using proportions, means and standard deviations (SD) as appropriate. The responses for all 21 items were skewed towards ‘better’ and ‘much better’. For the factor analysis we used the original scales of the items. For the IRT model, however, we dichotomised the items: (i) ‘better’ and ‘much better’ responses were collapsed and coded as 1; and (ii) ‘worse’, ‘much worse’, ‘same’ and ‘non-applicable’ responses were coded as 0. We fitted a one-parameter (Rasch) model, which assumes that the items are equally discriminating but vary in difficulty, and a two-parameter logistic (2PL) model, which allows the items to vary in their ability to discriminate among patients with different levels of the underlying construct [13, 14]. Uni-dimensionality was assumed a priori and was further assessed using the maximum likelihood method as well as principal-component factor methods. We present the results obtained from the 2PL model, as these provide more of the desired information, including items’ difficulty and discrimination [15, 16]. The model fitted the data well, as assessed by the Akaike information criterion (AIC) and Bayesian information criterion (BIC). We employed the item characteristic curve (ICC) to evaluate the profile of each item within the scale and to assess the relationship between a patient’s predicted response to an individual item and the underlying construct (expectations). For all analyses we used Mplus statistical software.
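The dichotomisation step can be illustrated with a minimal sketch. It assumes the response labels are stored as text; the exact label strings, the `dichotomise` helper and the example responses are hypothetical and are not taken from the study data.

```python
# Label sets are assumptions based on the description in the text;
# the wording on the actual questionnaire may differ.
POSITIVE = {"better", "much better"}
OTHER = {"worse", "much worse", "same", "non-applicable"}


def dichotomise(label: str) -> int:
    """Return 1 for 'better' or 'much better', 0 for any other valid label."""
    key = label.strip().lower()
    if key in POSITIVE:
        return 1
    if key in OTHER:
        return 0
    raise ValueError(f"unexpected response label: {label!r}")


# Hypothetical responses of one participant to five items.
responses = ["Much better", "same", "better", "worse", "non-applicable"]
binary = [dichotomise(r) for r in responses]
print(binary)
```

Raising an error on an unknown label, rather than silently coding it as 0, keeps data-entry problems visible before the model is fitted.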
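The two-parameter logistic model suits binary responses and may be described as:
$$P(x_j = 1 \mid \theta) = \frac{\exp[\alpha_j(\theta - \beta_j)]}{1 + \exp[\alpha_j(\theta - \beta_j)]}$$
where $x_j$ is the observed response to item $j$, $\alpha_j$ is the slope parameter, $\beta_j$ is the difficulty (location) of item $j$, and $\theta$ is the underlying construct being measured (expectations).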
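To make the roles of the slope and location parameters concrete, the following sketch evaluates an item characteristic curve, assuming the standard logistic form of the 2PL model. The function name `icc_2pl` and all parameter values are hypothetical and chosen only to contrast a steep item with a flat one; they are not estimates from this study.

```python
import math


def icc_2pl(theta: float, alpha: float, beta: float) -> float:
    """Probability of endorsing an item under a 2PL model.

    theta: level of the underlying construct (expectations)
    alpha: slope (discrimination) of the item
    beta:  difficulty (location) of the item
    """
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


# Hypothetical parameters: same location, different slopes.
steep = {"alpha": 1.7, "beta": -0.5}
flat = {"alpha": 0.4, "beta": -0.5}

for theta in (-2.0, -1.0, -0.5, 0.0, 1.0):
    p_steep = icc_2pl(theta, **steep)
    p_flat = icc_2pl(theta, **flat)
    print(f"theta={theta:+.1f}  steep={p_steep:.2f}  flat={p_flat:.2f}")
```

Both curves pass through a probability of 0.5 at the item location; the steeper item moves away from 0.5 much faster as the construct level changes.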
A total of 160 outpatient attendees were invited to take part in phase 3 of the study; 138 (86 %) consented and completed the MAPLe-RA questionnaire. The mean age was 54 (SD = 14.30) years; 101 (74 %) were women, 73 (53 %) were of white ethnicity and 48 (38 %) reported being registered disabled (Table 2).
MAPLe-RA scale properties
In stages one and two of the scale development, patients identified 21 dimensions of new treatment expectations, grouped into (i) physical, (ii) psycho-social and (iii) expectations relating to impact of treatment. This resulted in a draft questionnaire that was assessed in the feasibility study and the subsequent stage three analysis.
The overall mean score across the 21 MAPLe-RA items was 71 (SD: 20.28; range 0 to 105). The means for the three domains were: physical (7 items), 24.40 (SD: 7.21); psycho-social (6 items), 17.51 (SD: 8.17); and impact of new treatment (8 items), 30.76 (SD: 7.02).
Exploratory factor analysis showed that all items had strong positive associations with the first factor; for the second factor, most items had weak associations and items 13–21 had negative associations (Additional file 1: Table S1). The eigenvalue for the first factor was 6.25 and the proportion of variance explained was 77 %, supporting the uni-dimensionality of the items.
Most items had high rates of “yes” (positive) responses. In the IRT context, such items have low difficulty parameters: most patients would endorse them, and they appear to describe the expectations of the majority of patients. Item discrimination, on the other hand, reflects the strength of the association of an item with the underlying construct. Items with high discrimination are better at differentiating respondents around the item’s location point: a small change in the underlying construct (expectation) leads to a large change in the probability of endorsing the item (response = yes), and the reverse holds for items with low discrimination. Discrimination varied widely across items. In the physical domain, the two most discriminating items were “swelling of the joints” and “pain”, with discriminations of 1.38 (0.58) and 1.74 (0.82), respectively.
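The link between discrimination and the steepness of the curve can be made explicit. The short derivation below assumes the standard logistic form of the 2PL model with slope $\alpha_j$ and location $\beta_j$; the parameterisation reported by a given software package may differ by a scaling constant.

```latex
P_j(\theta) = \frac{1}{1 + e^{-\alpha_j(\theta - \beta_j)}}, \qquad
\frac{dP_j}{d\theta} = \alpha_j \, P_j(\theta)\,\bigl[1 - P_j(\theta)\bigr], \qquad
\left.\frac{dP_j}{d\theta}\right|_{\theta = \beta_j} = \frac{\alpha_j}{4}
```

The slope is greatest at the item location, where $P_j = 0.5$. Under this form, a discrimination of 1.74 would correspond to a maximum slope of 1.74 / 4 = 0.435 probability units per unit of $\theta$, compared with 1.38 / 4 = 0.345 for a discrimination of 1.38.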
In the psycho-social domain, “to maintain social role” and “emotional wellbeing” were the two items with the highest discrimination: 1.87 (0.63) and 1.77 (0.51), respectively. The two items with the highest discrimination in the impact of new treatment domain were “feeling better overall” and “involvement in treatment decision making”, at 1.47 (0.38) and 1.41 (0.52), respectively; the other items had lower discrimination. Details for all items are presented in Table 3. Within the physical domain, two items, “visible signs of RA” and “joint damage”, had a difficulty that was not different from zero; both also had very low discrimination. Model fit was assessed with the BIC (2807.48) and AIC (2743.08), which indicated a good fit of the two-parameter logistic model.
Figure 1 shows the item characteristic curves (ICC), which represent the probability of endorsing an item in relation to respondents’ expectations (the underlying construct), for the two items with the highest discrimination from each domain.
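For readers unfamiliar with the two information criteria, the sketch below shows their usual definitions. Both are normally read comparatively, with lower values preferred among candidate models fitted to the same data. The log-likelihood and the parameter count used here are hypothetical and do not reproduce the values reported above; only the sample size of 138 comes from the text.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


log_lik = -1350.0  # hypothetical maximised log-likelihood
k = 42             # hypothetical count: one slope and one location for each of 21 items
n = 138            # sample size given in the text

print(f"AIC = {aic(log_lik, k):.2f}")
print(f"BIC = {bic(log_lik, k, n):.2f}")
```

Because ln(138) is greater than 2, the BIC penalises each extra parameter more heavily than the AIC at this sample size, so the BIC exceeds the AIC for the same model.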
MAPLe-RA is a new questionnaire that has not yet been validated and whose items had not previously been examined. In this study, we used factor analysis to assess uni-dimensionality and IRT to describe the properties of the items. The study has shown that RA patients have high expectations of their treatment. These are particularly high in the physical and psycho-social domains rather than in the impact of new treatment domain. For the latter, most items were unlikely to be endorsed by patients with less than average expectations in this RA study cohort.
Strength and limitations of the study
All domains contained items with a strong association with the underlying construct (expectation); two to three items from each domain may be considered good candidates for differentiating between patients’ responses. Several items, however, appeared to be redundant (e.g. visible signs of RA; not to have to change medication), as they did not show a strong association with the underlying construct.
The IRT method is superior to the traditional factor extraction methods based on eigenvalues [10, 11]. It is also suitable when an instrument includes response categories that have several levels. In this study, the method determines whether the categories perform as envisioned and/or whether to collapse the responses into fewer categories. An advantage of IRT is that it estimates an underlying construct, giving items different weights depending on the response pattern and the frequency of response to each item, and yields values on that construct instead of sum scores [12, 18]. This technique has been successfully applied in the development and evaluation of new patient-reported outcome measures [19, 20]. To our knowledge, the IRT method has not been applied in many rheumatology-related scale studies.
Although the new instrument appeared to have very good reliability, it is important to interpret the findings with caution. This analysis was an exploratory phase of the scale development. The sample size was underpowered for the IRT 2PL model, and the population studied was homogeneous, drawn from one RA clinic only. Although there were some redundant items, we chose to keep these in the analysis to avoid making inappropriate decisions and/or conclusions at this early stage.
Other studies have similarly found that new measures of patients’ expectations in general need validation in larger multi-centre studies. We acknowledge that further analysis is necessary; MAPLe-RA is therefore currently included in a national longitudinal observational study of patients with early rheumatoid arthritis with diverse socio-demographic characteristics and a long-term follow-up of 18 months, to be completed in 2015. This large multi-centre study will allow us to conduct a confirmatory analysis of the new measure and to assess whether patients’ expectations change over time.
This study extends the evidence on the value of IRT models in the assessment of health outcomes and patient-generated measures. The results highlight that RA patients’ treatment expectations are higher in the physical and psycho-social domains and less so in the impact of new treatment domain. RA patients expect a high degree of involvement in their care from health care providers, and they rate control of their pain and emotional well-being highly.
Hofmann D, Ibrahim F, Rose D, Scott DL, Cope A, Wykes T, et al. Expectations of new treatment in rheumatoid arthritis: developing a patient-generated questionnaire. Health Expect. 2013. doi:10.1111/hex.12073.
Tay L, Diener E, Drasgow F, Vermunt JK. Multilevel Mixed-measurement IRT analysis: an explication and application to self-reported emotions across the world. Organ Res Methods. 2011;14(1):177–207.
Orlando M, Sherbourne CD, Thissen D. Summed-score linking using item response theory: application to depression measurement. Psychol Assess. 2000;12(3):354–9.
Hays RD, Liu H, Spritzer K, Cella D. Item response theory analyses of physical functioning items in the medical outcomes study. Med Care. 2007;45(5 Suppl 1):S32–8.
Wolfe F. Which HAQ is best? A comparison of the HAQ, MHAQ and RA-HAQ, a difficult 8 item HAQ (DHAQ), and a rescored 20 item HAQ (HAQ20): analyses in 2,491 rheumatoid arthritis patients following leflunomide initiation. J Rheumatol. 2001;28(5):982–9.
Wolfe F. Pain extent and diagnosis: development and validation of the regional pain scale in 12,799 patients with rheumatic disease. J Rheumatol. 2003;30(2):369–78.
Hays RD, Morales LS, Reise SP. Item response theory and health outcomes measurement in the 21st century. Med Care. 2000;38(9 Suppl):II28–42.
Tugwell P, Boers M, Brooks P, Simon L, Strand V, Idzerda L. OMERACT: an international initiative to improve outcome measurement in rheumatology. Trials. 2007;8:38.
van Hartingsveld F, Ostelo RW, Cuijpers P, de Vos R, Riphagen II, de Vet HC. Treatment-related and patient-related expectations of patients with musculoskeletal disorders: a systematic review of published measurement tools. Clin J Pain. 2010;26(6):470–88.
Cosco TD, Doyle F, Ward M, McGee H. Latent structure of the hospital anxiety and depression scale: a 10-year systematic review. J Psychosom Res. 2012;72(3):180–4.
Cosco TD, Doyle F, Watson R, Ward M, McGee H. Mokken scaling analysis of the hospital anxiety and depression scale in individuals with cardiovascular disease. Gen Hosp Psychiatry. 2012;34(2):167–72.
Siemons L, ten Klooster PM, Taal E, Kuper IH, van Riel PL, van de Laar MA, et al. Validating the 28-tender joint count using item response theory. J Rheumatol. 2011;38(12):2557–64.
Tennant A, McKenna SP, Hagell P. Application of Rasch analysis in the development and application of quality of life instruments. Value Health. 2004;7 Suppl 1:S22–6.
Bjorner JB, Rose M, Gandek B, Stone AA, Junghaenel DU, Ware Jr JE. Method of administration of PROMIS scales did not significantly impact score level, reliability, or validity. J Clin Epidemiol. 2014;67(1):108–13.
Scheerens J, Glas CAW, Thomas S. Educational evaluation, assessment, and monitoring: a systemic approach. Lisse [Netherlands]; Exton, PA: Swets & Zeitlinger; 2003.
Edelen MO, Reeve BB. Applying item response theory (IRT) modeling to questionnaire development, evaluation, and refinement. Qual Life Res. 2007;16 Suppl 1:5–18.
Tennant A, Conaghan PG. The Rasch measurement model in rheumatology: what is it and why use it? When should it be applied, and what should one look for in a Rasch paper? Arthritis Rheum. 2007;57(8):1358–62.
McKenna SP, Doward LC, Whalley D, Tennant A, Emery P, Veale DJ. Development of the PsAQoL: a quality of life instrument specific to psoriatic arthritis. Ann Rheum Dis. 2004;63(2):162–9.
Ropes MW, Bennett GA, Cobb S, Jacox R, Jessar RA. Proposed diagnostic criteria for rheumatoid arthritis. Ann Rheum Dis. 1957;16(1):118–25.
Ropes MW, Bennett GA, Cobb S, Jacox R, Jessar RA. Proposed diagnostic criteria for rheumatoid arthritis; report of a study conducted by a committee of the American Rheumatism Association. J Chronic Dis. 1957;5(6):630–5.
Siemons L, Ten Klooster PM, Taal E, Glas CA, Van de Laar MA. Modern psychometrics applied in rheumatology--a systematic review. BMC Musculoskelet Disord. 2012;13:216.
Bowling A, Rowe G, Lambert N, Waddington M, Mahtani KR, Kenten C, et al. The measurement of patients’ expectations for health care: a review and psychometric testing of a measure of patients’ expectations. Health Technol Assess (Winchester, England). 2012;16:30. i-xii, 1–509.
We are grateful to all the patients for their participation, commitment and enthusiasm in developing this questionnaire; to Carol Simpson and Patricia Rusling for their contribution as patient experts in the project; to all clinical staff at King’s College Hospital NHS Foundation Trust; to Joanna Dobson and Rosaria Salerno for their support and advice; and to the Project Management team for their support and guidance. This research is supported by the National Institute for Health Research (NIHR) Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust and King’s College London. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.
The funding for the one year project came from a Strategic Award from the National Institute for Health Research joined comprehensive Biomedical Research Centre at Guy’s and St. Thomas Hospital NHS Foundation Trust and the specific Biomedical Research Centre for Mental Health at the Institute of Psychiatry, Psychology and Neuroscience, King’s College London.
We are grateful for support from the National Institute for Health Research (NIHR) Programme Grants For Applied Research (http://www.ccf.nihr.ac.uk/PGfAR/Pages/Home.aspx) on “Treatment Intensities and Targets In Rheumatoid Arthritis Therapy: Integrating Patients’ And Clinicians’ Views – The TITRATE Programme (RP-PG-0610-10066)”.
The authors declare that they have no competing interests.
FI had main responsibility for conducting the data analysis, wrote the first draft of the paper and oversaw the submission process. SA contributed to the data analysis and editing of the paper. HL proposed the idea of the study and contributed to the editing of the paper. DLS, AC, TW, DR and DH contributed to editing the paper. All authors have seen and approved the final version of the paper before submission.
Factor loading for MAPLe-RA scale. (DOCX 17 kb)