Limited predictive value of illness perceptions for short-term poor recovery in musculoskeletal pain. A multi-center longitudinal study

Background Musculoskeletal pain (MSP) is recognized worldwide as a major cause of increased years lived with disability. In addition to known generic prognostic factors, illness perceptions (IPs) may have predictive value for poor recovery in MSP. We were interested in the added predictive value of baseline IPs, over and above the known generic prognostic factors, on clinical recovery from MSP. Also, it is hypothesized that there may be overlap between IPs and the domains covered by the Four-Dimensional Symptom Questionnaire (4DSQ), which measures distress, depression, anxiety and somatization. The aim of this study is twofold: 1) to assess the added predictive value of IPs for poor recovery and 2) to assess differences in predictive value for poor recovery between the Brief Illness Perception Questionnaire - Dutch Language Version (Brief IPQ-DLV) and the 4DSQ.
Methods An eligible sample of 251 patients with musculoskeletal pain attending outpatient physical therapy was included in a multi-center longitudinal cohort study. Pain intensity, physical functioning and Global Perceived Effect were the primary outcomes. Hierarchical logistic regression models were used to assess the added value of baseline IPs for predicting poor recovery. To investigate the performance of the models, the levels of calibration (Hosmer-Lemeshow test) and discrimination (Area Under the Curve (AUC)) were assessed.
Results Baseline ‘Treatment Control’ added little predictive value for poor recovery in pain intensity [Odds Ratio (OR) 0.80 (Confidence Interval (CI) 0.66–0.97), increase in AUC 2%] and global perceived effect [OR 0.78 (CI 0.65–0.93), increase in AUC 3%]. Baseline ‘Timeline’ added little predictive value for poor recovery in physical functioning [OR 1.16 (CI 1.03–1.30), increase in AUC 2%]. The difference between the AUCs of the Brief IPQ-DLV and the 4DSQ in predicting poor recovery was not significant.
Conclusions Based on the findings of this explorative study, assessing baseline IPs, over and above the known generic prognostic factors, does not result in a substantial improvement in the prediction of poor recovery. Also, no recommendations can be given for preferring either the 4DSQ or the Brief IPQ-DLV to assess psychological factors.


Introduction
Musculoskeletal pain (MSP) is a major cause of increased years lived with disability [1]. There are several generic factors prognostic of poor recovery from MSP [2]: widespread pain (≥ 2 pain sites), high functional disability, somatization, and high pain intensity. Psychological factors such as distress, depressive mood and somatization have also been identified as risk factors for the transition from acute to chronic low back pain [3][4][5][6]. Although these domains have been identified, no recommendation can be made as to the best instrument for assessing them. In The Netherlands, the Four-Dimensional Symptom Questionnaire (4DSQ) is commonly used to assess distress, depression, anxiety and somatization [7]. In addition, illness perceptions (IPs), as the core element of the Common-Sense Model of Self-Regulation of Health and Illness (CSM), have been recognized as possible risk factors for poor recovery from MSP. The Brief Illness Perceptions Questionnaire (Brief IPQ) is frequently used to assess these IPs [8]. A recent systematic review showed limited to moderate evidence for the association of some IPs with pain intensity (PI) and physical functioning (PF) in MSP [9]. The pathways by which IPs might influence MSP are not known. IPs might act as moderators or mediators, or affect MSP through fear avoidance or catastrophizing. Another important finding of the review was that longitudinal research is lacking. Therefore, it is desirable to explore the added predictive value of IPs, over and above the well-known generic factors for poor recovery from MSP, in the physiotherapy setting.
The CSM provides a framework for identifying unhelpful cognitions and emotions people may have about their MSP condition [10]. It is based on a parallel processing model, describing individual representations (i.e. IPs) in response to health threats (i.e. MSP). There are nine IP dimensions included in the CSM: Consequences, Timeline, Personal Control, Treatment Control, Identity, Concern, Coherence, Emotional Response, and Causal [11,12].
To investigate the added predictive value of IPs, we used the term 'predictor' defined as: "A patient characteristic that identifies subgroups of treated patients having different outcomes" [13]. In our study, IPs were seen as predictors, the treatment was usual care physiotherapy, and the disease was non-specific MSP.
Previous research has found that IPs are predictive of, and associated with, psychological factors such as depression and anxiety in patients with fibromyalgia [14], chronic back pain [15], systemic lupus erythematosus [16] and in informal carers of patients with depression [17]. Therefore, overlap may exist between the domains included in the 4DSQ and in the Brief IPQ. Because of this potential overlap, we were interested in the correlation between these questionnaires. We were also interested in the difference between the added predictive values of the 4DSQ and the Dutch language version of the Brief IPQ (Brief IPQ-DLV) for poor recovery.
Our three research questions were as follows. First, to what extent do baseline illness perceptions in MSP patients have added predictive value for poor recovery in PI, PF and GPE after 3 months? Second, what is the correlation between the 4DSQ and the Brief IPQ-DLV? Third, what is the difference in added predictive value for poor recovery between the 4DSQ and the Brief IPQ-DLV?

Design and setting
Twenty-eight primary care physiotherapy centres participated in this five-month-long exploratory study, approved by the Medical Ethical Committee of the University of Applied Sciences Utrecht (HU) (Ref. no. 430012019). Physiotherapists at these centres collected the data as part of their HU Master of Physiotherapy study. All participating patients were treated according to the Good Clinical Practice guidelines [18].
A consecutive sample of patients attending outpatient physiotherapy was invited at first contact by participating physiotherapists to take part. As part of an assignment in their master's program, these physiotherapists included 10-30 consecutive patients in the study over a period of 2 months (after screening for inclusion and exclusion criteria: Table 1). After the baseline (T0) assessment, a follow-up assessment was performed after 3 months (T1), using a questionnaire assessing the dependent and independent variables (see Measurements).
Patients who met the inclusion criteria and gave written informed consent were recruited. We defined MSP as: Pain felt within the context of the musculoskeletal conditions listed in Table 1, according to the European Musculoskeletal Conditions Surveillance and Information Network.
All clinical procedures used in this study were carried out in accordance with relevant guidelines and

Table 1 Inclusion criteria
- Musculoskeletal pain: Joint conditions (i.e. rheumatoid arthritis (RA), osteoarthritis (OA)), bone conditions (i.e. osteoporosis), spinal disorders (e.g. low back pain), regional and widespread pain disorders, musculoskeletal injuries, multisystem inflammatory diseases

Measurements
At baseline (T0), we collected data on demographic characteristics and the independent variables listed below:

Independent variables
Pain intensity (PI): Average pain in the last 24 h (11-point Numeric Rating Scale (NRS): 0 = no pain; 10 = worst pain imaginable) [19].
Physical functioning (PF): Difficulty in performing daily activities (11-point Patient-Specific Functional Scale (PSFS): 0 = no difficulty; 10 = unable to perform the activity). The PSFS is reportedly feasible and reliable [20,21].
Pain duration: Patients rated how long their pain had existed prior to consultation: 1: pain < 7 weeks; 2: pain 7-13 weeks; 3: pain > 13 weeks.
Number of pain sites: Based on patients' reports, the number of different pain sites was categorized as: 1: 1-2 sites; 2: > 2 sites.
Psychological measures: The Four-Dimensional Symptom Questionnaire (4DSQ) was used to assess patients' level of risk (low, medium or high) for developing Distress (16 items), Somatization (16 items), Anxiety (12 items), and Depression (6 items). The 4DSQ is suitable for clinical applications. The items are answered on a 5-point frequency scale. To calculate sum scores, responses are coded on a 3-point scale: "no" (0 points), "sometimes" (1 point), and "regularly", "often", or "very often or constantly" (2 points). Sum scores are then calculated for each dimension, and cut-off points are applied to categorize each patient as at low, medium or high risk [7].
Illness perceptions: The cross-culturally adapted and validated Brief Illness Perceptions Questionnaire-Dutch Language Version (Brief IPQ-DLV) was used [22,23]. It consists of nine questions, of which eight are scored on an 11-point scale and cover the IP dimensions of Consequences, Timeline, Personal Control, Treatment Control, Identity, Concern, Coherence, and Emotional Response. The control-belief dimensions (Personal Control and Treatment Control) and Coherence were converted before statistical analysis because they are scored in reverse. Higher scores on the Brief IPQ-DLV were hypothesized to indicate a greater chance of poor recovery.
The ninth IP question, the Causal dimension, has rank-ordered free-text responses and was not added as a predictor.
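The scoring steps described above can be illustrated with a minimal Python sketch. It is not the authors' scoring code. The function names are hypothetical, the reversal of the reverse-scored Brief IPQ-DLV dimensions is assumed to be `10 - score` on the 0-10 scale, and the 4DSQ cut-off points are not given in the text, so they are left as parameters to be filled in from the published manual [7].

```python
from typing import Dict, List

# 5-point frequency responses recoded to 0/1/2 points, as described in the text.
FOURDSQ_RECODE: Dict[str, int] = {
    "no": 0,
    "sometimes": 1,
    "regularly": 2,
    "often": 2,
    "very often or constantly": 2,
}

# Brief IPQ-DLV dimensions that the text says are scored in reverse.
REVERSED_IP_DIMENSIONS = {"Personal Control", "Treatment Control", "Coherence"}


def fourdsq_sum_score(responses: List[str]) -> int:
    """Sum the recoded item responses for one 4DSQ dimension."""
    return sum(FOURDSQ_RECODE[r] for r in responses)


def risk_category(sum_score: int, medium_from: int, high_from: int) -> str:
    """Map a sum score to low/medium/high risk.

    The cut-offs are NOT stated in the text; the caller must supply them.
    """
    if sum_score >= high_from:
        return "high"
    if sum_score >= medium_from:
        return "medium"
    return "low"


def convert_ip_score(dimension: str, score: int) -> int:
    """Reverse a 0-10 score for reverse-scored dimensions (assumed 10 - score)."""
    if not 0 <= score <= 10:
        raise ValueError("Brief IPQ items are scored from 0 to 10")
    return 10 - score if dimension in REVERSED_IP_DIMENSIONS else score
```

After conversion, a higher score on every dimension points in the same hypothesized direction (greater chance of poor recovery), which is what allows the eight items to be entered as comparable predictors.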

Dependent variables
For Global Perceived Effect (GPE), we used a 7-point scale ranging from 'completely recovered' to 'very much worsened'. The GPE is a reliable measurement [24] with a clinically meaningful improvement cut-off point at ≤2 on a 7-point scale [25].
We defined poor recovery in three different ways [26]:
- PI at follow-up: score of ≥3 on an 11-point NRS (0-10)
- PF at follow-up: score of ≥3 on an 11-point NRS (0-10)
- GPE: score of ≥3 on a 7-point ordinal scale
Pain intensity and physical function were also assessed at T1 together with the Global Perceived Effect.
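The three definitions of poor recovery can be written as a simple dichotomization. The sketch below is illustrative only; the class and function names are hypothetical, and the GPE is assumed to be coded 1 ('completely recovered') to 7 ('very much worsened'), which the text implies but does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict

# A follow-up score of 3 or higher counts as poor recovery on all three outcomes.
POOR_RECOVERY_CUTOFF = 3


@dataclass
class FollowUp:
    pain_intensity: int        # 0-10 at T1
    physical_functioning: int  # 0-10 at T1
    gpe: int                   # 1-7 at T1 (assumed coding, 1 = completely recovered)


def poor_recovery(follow_up: FollowUp) -> Dict[str, bool]:
    """Return one binary poor-recovery flag per outcome."""
    return {
        "PI": follow_up.pain_intensity >= POOR_RECOVERY_CUTOFF,
        "PF": follow_up.physical_functioning >= POOR_RECOVERY_CUTOFF,
        "GPE": follow_up.gpe >= POOR_RECOVERY_CUTOFF,
    }
```

Each flag then serves as the binary dependent variable of its own logistic regression model.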

Statistics
In addition to age and gender, baseline scores were assessed for PI, PSFS, pain duration, number of pain sites, the 4DSQ, and the Brief IPQ-DLV, as percentages or means (standard deviation (SD)).
Hierarchical logistic regression models were constructed to examine the added predictive value of baseline IPs for poor recovery (at 3 months). In the first block, age, gender and baseline scores for generic prognostic factors (psychological measures, PI, limitations in PF, number of pain sites and duration of pain) were entered as fixed (independent) variables. In the second block, baseline IPs with univariately significant ORs (p < 0.10) were added to the model. The final model was obtained by using the backward stepwise method. The goodness-of-fit of the model was described by the Nagelkerke R², and discrimination by the Receiver Operating Characteristic (ROC) curve with Area Under the Curve (AUC). The AUC was judged thus: 0.90-1.0 Excellent; 0.80-0.89 Good; 0.70-0.79 Fair; 0.60-0.69 Poor; 0.50-0.59 Fail. For calibration, we checked the goodness-of-fit using the Hosmer & Lemeshow test, with p < 0.05 indicating poor fit. SPSS version 25 was used to analyze the data.
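To make the two summary measures concrete, the following sketch computes an AUC from predicted probabilities, labels it with the bands listed above, and derives the Nagelkerke R² from model log-likelihoods. It is a plain-Python illustration under stated assumptions, not the SPSS procedure used in the study. The function names are hypothetical, and AUC values below 0.50 (not covered by the bands in the text) are labelled 'Fail' here by assumption.

```python
import math
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a poor-recovery case (1) gets a higher predicted score
    than a good-recovery case (0); ties count as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_band(value: float) -> str:
    """Label an AUC with the bands given in the text."""
    if value >= 0.90:
        return "Excellent"
    if value >= 0.80:
        return "Good"
    if value >= 0.70:
        return "Fair"
    if value >= 0.60:
        return "Poor"
    return "Fail"  # assumption: values under 0.50 are also labelled 'Fail'


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke R^2 from the log-likelihoods of the intercept-only and fitted models."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell
```

The added value of Block 2 is then the difference between the AUC (or R²) of the Block 1 + 2 model and that of the Block 1 model, computed on the same patients.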
For our research question 'Is there an association between the 4DSQ and the Brief IPQ-DLV?', we used the nonparametric Spearman's rank correlation coefficient. To interpret the strength of the correlation, we used the following classification: 0.00-0.10 negligible, 0.10-0.39 weak, 0.40-0.69 moderate, 0.70-0.89 strong and 0.90-1.00 very strong [27].
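As an illustration of this step, the sketch below computes Spearman's rank correlation (Pearson correlation on average ranks, so ties are handled) and applies the strength classification above. The function names are hypothetical, and because the text lists 0.10 in both the 'negligible' and 'weak' bands, the boundary value is assigned to 'weak' here by assumption.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(vx * vy)


def correlation_strength(rho: float) -> str:
    """Classify |rho| with the bands from the text (0.10 treated as 'weak')."""
    r = abs(rho)
    if r >= 0.90:
        return "very strong"
    if r >= 0.70:
        return "strong"
    if r >= 0.40:
        return "moderate"
    if r >= 0.10:
        return "weak"
    return "negligible"
```

For example, `correlation_strength(0.32)` returns `"weak"`, which is how a coefficient of that size would be read under this classification.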
For our research question 'Is there a difference in added predictive value for poor recovery between the 4DSQ and the Brief IPQ-DLV?', two regression models were built to examine the predictive value of each questionnaire for poor recovery (at 3 months). In our first model, we entered age, gender and the baseline scores for generic prognostic factors (PI, limitations in PF, number of pain sites and duration of pain) and added the baseline score of the 4DSQ. In our second model, we replaced the 4DSQ with the Brief IPQ-DLV. To test the discrimination of each model, a ROC curve with Area Under the Curve (AUC) was applied. To compare the two AUCs, we used the empirical (non-parametric) method with NCSS 2020 software.
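A comparison of two correlated AUCs can be sketched as follows. This is an assumption-laden illustration: the text only names an 'empirical (non-parametric) method', and the sketch assumes this corresponds to the approach of DeLong et al. for two models evaluated on the same patients. It does not reproduce the NCSS implementation, and `compare_aucs` is a hypothetical name.

```python
import math
from typing import List, Sequence, Tuple


def _psi(x: float, y: float) -> float:
    """1 if the positive case outranks the negative case, 0.5 for a tie, else 0."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(pos: List[float], neg: List[float]) -> Tuple[float, List[float], List[float]]:
    """AUC plus the per-case structural components V10 (positives) and V01 (negatives)."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return sum(v10) / len(v10), v10, v01


def _cov(a: List[float], mean_a: float, b: List[float], mean_b: float) -> float:
    """Sample covariance of two component vectors around their AUCs."""
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def compare_aucs(y_true: Sequence[int], scores_a: Sequence[float],
                 scores_b: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (auc_a, auc_b, z, two_sided_p) for two models scored on the same cases."""
    pos_idx = [i for i, y in enumerate(y_true) if y == 1]
    neg_idx = [i for i, y in enumerate(y_true) if y == 0]
    m, n = len(pos_idx), len(neg_idx)
    if m < 2 or n < 2:
        raise ValueError("need at least two cases in each outcome group")
    auc_a, v10_a, v01_a = _components([scores_a[i] for i in pos_idx],
                                      [scores_a[i] for i in neg_idx])
    auc_b, v10_b, v01_b = _components([scores_b[i] for i in pos_idx],
                                      [scores_b[i] for i in neg_idx])
    var_a = _cov(v10_a, auc_a, v10_a, auc_a) / m + _cov(v01_a, auc_a, v01_a, auc_a) / n
    var_b = _cov(v10_b, auc_b, v10_b, auc_b) / m + _cov(v01_b, auc_b, v01_b, auc_b) / n
    cov_ab = _cov(v10_a, auc_a, v10_b, auc_b) / m + _cov(v01_a, auc_a, v01_b, auc_b) / n
    var_diff = var_a + var_b - 2.0 * cov_ab
    if var_diff <= 0:
        raise ValueError("variance of the AUC difference is zero; the test is undefined")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value from the normal distribution
    return auc_a, auc_b, z, p
```

Because both models are scored on the same patients, the covariance term matters: ignoring it would treat the two AUCs as independent and would usually overstate the variance of their difference.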

Results
A total of 251 (N max) participants were included in this study (see Table 2). We found missing data to be Missing Completely at Random (Little's MCAR test p > 0.05). Numbers of missing items are reported in the 'n' column of Table 3. A total of 237 participants were present at follow-up. The baseline characteristics of the fourteen participants lost to follow-up are described in the last column of Table 2.
For the distribution of the generic prognostic factors and baseline IPs for good or poor recovery, see Table 4.

Univariate logistic regression of illness perceptions with poor clinical recovery
Table 5 shows the results of the univariate logistic regression of baseline IPs with poor clinical recovery.
For the hierarchical model, the following IP dimensions were statistically significant and were therefore selected for entering in Block 2: for the clinical outcome PI, Timeline, Treatment Control, Identity, Concern, Coherence and Emotional Response; for PF, Consequences, Timeline, Identity, Concern and Emotional Response; for GPE, Consequences, Timeline, Treatment Control, Identity, Concern and Emotional Response.

Hierarchical logistic regression for baseline illness perceptions predicting poor recovery at 3 months
Table 6 shows results of the hierarchical logistic regressions and the AUC. Entered as fixed variables in Block 1 for all regression models were age, gender and generic prognostic factors. In Block 2 of the model, we added all the univariate significantly associated IPs (see Table 5) with the backward stepwise method. We report only the final models.

Baseline IPs
After being added in Block 2, most IP dimensions did not add predictive value for poor outcomes on PI, PF or GPE. Two IP dimensions did add predictive value: lower scores on Treatment Control for PI and GPE, and a higher score on Timeline for PF. The discrimination of each model increased slightly after adding IPs (the AUC increased by 2-3%). The goodness-of-fit was adequate (Hosmer & Lemeshow test: PI, p = 0.57; PSFS, p = 0.68; GPE, p = 0.08).
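The calibration check reported above can be illustrated with a minimal sketch of the Hosmer & Lemeshow statistic. It is not the SPSS implementation. The number of groups is not stated in the text, so the default of 10 is an assumption, and to stay dependency-free the p-value helper only covers an even number of degrees of freedom (groups minus 2). Both function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; exact closed form, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only handles positive even degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y_true: Sequence[int], y_prob: Sequence[float],
                    groups: int = 10) -> Tuple[float, float]:
    """Return (statistic, p_value); a p-value below 0.05 suggests poor calibration."""
    if groups < 4 or groups % 2 != 0:
        raise ValueError("use an even number of groups (at least 4) with this sketch")
    pairs = sorted(zip(y_prob, y_true))  # order patients by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            raise ValueError("too few observations for the requested number of groups")
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected number of poor recoveries
        observed = sum(y for _, y in chunk)  # observed number of poor recoveries
        if expected <= 0 or expected >= size:
            raise ValueError("degenerate group: predicted risks are all 0 or all 1")
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / size))
    return stat, chi2_sf_even_df(stat, groups - 2)
```

Read this way, the reported p-values of 0.57, 0.68 and 0.08 all lie above 0.05, so none of the three models shows a significant difference between observed and predicted poor recovery across risk groups.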

Association of baseline scores in 4DSQ with the brief IPQ-DLV
The Spearman rank correlations showed small associations between the Brief IPQ-DLV and the 4DSQ. The IP dimensions 'Personal Control', 'Treatment Control' and 'Coherence' showed non-significant associations (Table 7).

Difference in predictive value of poor recovery between the brief IPQ-DLV and the 4DSQ
Table 8 presents the difference in predictive value for poor recovery between the Brief IPQ-DLV and the 4DSQ.

Discussion
In addition to generic prognostic factors, two of the IP dimensions, Treatment Control and Timeline, give a small added predictive value for poor recovery from MSP in pain intensity, physical functioning and Global Perceived Effect. The Brief IPQ-DLV showed weak correlation with the 4DSQ for all IP dimensions. The highest correlations (0.32 to 0.40) were for the IP dimensions Consequences and Emotional Response. There were no significant differences in the added predictive values for poor recovery between the Brief IPQ-DLV and the 4DSQ.

Added predictive value of illness perceptions
Most IPs did not add predictive value for poor recovery. The amount of explained variance increased from Block 1 to Block 2 (Table 6), but the increase was small and most of the variance remained unexplained. This is also reflected in the increase in the AUC from Step 1 to Step 2 of just 2-3%. Furthermore, in our data a higher score on Treatment Control, hypothesized to increase the chance of poor recovery, was instead associated with a lower chance of poor recovery. This is not in line with other research in patients attending a general physician, an inpatient rehabilitation program, or an acupuncturist for low back pain, where higher IP scores were predictive of greater limitations in PF [28][29][30][31]. We studied outpatients receiving usual physiotherapy care for a wide range of MSP, which makes comparison of results difficult. The differences in Treatment Control scores between good and poor clinical recovery (Table 4) are very small. This means that, although Treatment Control adds predictive value, its clinical importance is limited. In contrast with previous research, we adjusted our findings for known generic prognostic factors and psychological factors.
The IP Timeline (patients' beliefs about how long their condition will last) is an additional prognostic factor of poor recovery in PF (Table 6). This is in line with published research about recovery expectations, in which Timeline was found to be a factor in general expectations for individual recovery [32].
For interpretation of our findings on the additional predictive value of baseline IPs, the chosen generic prognostic factors must be taken into account. Using other prognostic factors may lead to different outcomes and interpretation of the predictive value of baseline IPs.

Association and difference in predictive value between 4DSQ and brief IPQ-DLV
The weak associations of the Brief IPQ-DLV with the 4DSQ indicate that they address different constructs. Additionally, both performed equally weakly as predictors of poor recovery in all three clinical outcomes. This indicates that the Brief IPQ-DLV (9 items) cannot be replaced by the 4DSQ (50 items), and that neither makes a clinically relevant contribution to the prediction of poor recovery.

Limitations and strengths
First, despite the large number of participating primary care physiotherapy centers, selection bias may have occurred. Gender differences have been reported, with women at increased risk of chronic pain and of more severe pain [33]. This might have influenced the outcome, since 68.9% of our population was female. Additionally, we have no information about patients who were invited but did not participate. Further, we used the Brief IPQ-DLV and, although this is frequently used [8], it is debatable whether dimensions of beliefs about MSP can be measured with questionnaires alone [34]. Qualitative research might add extra in-depth information, but this was outside the scope of this study. Finally, the general prognostic factors were based on a systematic review among a range of musculoskeletal disorders [2]. Though this suited our
A strength of this study is that it is the first multicenter study done in primary care physiotherapy, with 28 centers geographically spread throughout the Netherlands. Hence, our findings are generalizable to patients in private practice in the Netherlands. Secondly, according to Hayden et al.'s criteria [35], our design is the first Phase 3 outcome prediction study focusing on the added predictive value of IPs. A systematic review of association and prognosis of IPs in MSP reported no other similar studies [9]. Thirdly, although there were missing data, the highest rate was 11%, making our dataset robust enough without the need for imputation. As this is the first paper to report on IPs and poor recovery in primary care physiotherapy, we built exploratory models based on univariate p-values (Table 5). To overcome the issue of excluding possibly relevant IPs, we set the p-value threshold to 0.10.
Table 6 Final hierarchical logistic regression models for predicting poor recovery at 3 months and added predictive probability value (AUC) IPs for poor outcome (N max = 251)

Practical implications/future directions
Overall, the additional contribution of the two IP dimensions, Treatment Control and Timeline, to predictions of poor recovery after three months of usual physiotherapy care was small, the increase in the AUC being only 2-3%. Based on these results, assessing baseline IPs, over and above the known generic prognostic factors, does not result in a substantial improvement in the prediction of poor recovery. In addition, our results do not support using the baseline score of the Brief IPQ-DLV as a predictor of poor recovery. However, this does not rule out a value for IPs in MSP, as their possible role as mediators has yet to be researched. Other research designs, such as Single-Case Experimental Designs, have been shown to be of value when looking for relevant factors for recovery from low back pain [36,37].
In this study, treatment followed KNGF guidelines or, when these were not relevant, the physical therapist's usual practice. Therefore, specific interventions aimed at patients' beliefs cannot be assumed to have taken place. This could have influenced the observed poor recovery rates of 39% for PI, 55% for PF and 30% for GPE. Tailoring interventions to match specific risk factors and patients' needs has recently been brought forward as a preventative strategy for the transition from acute to chronic low back pain [38], so matching interventions with patients' high baseline IPs is conceivable. We recommend future research into the feasibility and effectiveness of an illness perception-based physiotherapy intervention for patients with disabling MSP.

Conclusion
Based on the findings of this explorative study, assessing baseline IPs, over and above the known generic prognostic factors, does not result in a substantial improvement in the prediction of poor recovery. Also, no recommendations can be given for preference between the 4DSQ and the Brief IPQ-DLV to assess psychological factors.
The role of IPs as possible mediators has still to be researched. We recommend future research with suitable designs that can look at changeability and possible effectiveness of high IPs on PI, PF and GPE in patients with musculoskeletal pain.