This is the first study to evaluate the Dutch version of the IPQ-B in ANLBP patients. The internal reliability, test-retest reliability, and concurrent validity indicate that the Dutch IPQ-B is of moderate psychometric quality. Even though the IPQ-B measures a psychological construct by means of a multidimensional scale with only a few items, we found its internal consistency of 0.73 to be adequate. Deleting any single item would not have affected the overall reliability. Kline, as well as other authors, agrees that an internal consistency of ≥ 0.60 indicates sufficient reliability for psychological constructs. According to Terwee et al., acceptable test-retest reliability requires an ICC value of > 0.70. Although the lower limit of the 95% CI in the present study was less than 0.60, the ICC was an acceptable 0.72 (95% CI, 0.53 – 0.82). For concurrent validity, no gold standard is available for assessing patients’ perception of acute low back pain. Therefore, we used the criteria of Nunnally et al. to determine that, in our study, the concurrent validity of the IPQ-B with the MCS of the SF-36 was adequate.
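The item-deletion check mentioned above can be illustrated with a minimal sketch. It assumes that internal consistency was quantified as Cronbach's alpha, which the text does not state explicitly. The function names and the example scores are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` is a list of per-item score lists; each inner list holds one
    score per respondent, and all inner lists have the same length.
    """
    k = len(items)
    n = len(items[0])
    # Total score per respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:j] + items[j + 1:]) for j in range(len(items))]


# Hypothetical scores for 3 items and 5 respondents (illustration only).
example_items = [
    [3, 5, 6, 8, 9],
    [2, 4, 7, 7, 10],
    [4, 4, 5, 9, 8],
]
print(round(cronbach_alpha(example_items), 2))
print([round(a, 2) for a in alpha_if_item_deleted(example_items)])
```

If the leave-one-out values stay close to the overall alpha, no single item is driving the reliability estimate, which is the pattern reported above.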
In contrast to the ICC values, which demonstrated adequate test-retest reliability, the LOAs in the Bland-Altman plot were wide. The wide LOAs might have been due to fewer low back complaints over time, resulting from intervention-related changes in the patients’ perception of their back pain. Participants reported more positive perceptions on the IPQ-B retest than on the initial test (t – 3.5, P < 0.05). A shorter test-retest interval would therefore have been preferable in this study. Most ANLBP decreases within the first weeks after onset, and as a result, negative perceptions concerning ANLBP also decrease. To mitigate this phenomenon as much as possible, we instructed the examiners in the primary care units who had contact with the patients to avoid giving them any information about the course of ANLBP that could influence their perception of pain. Nevertheless, the favourable natural course of ANLBP recovery might itself have influenced patients’ perceptions, especially during the acute stage.
The maximum score of the IPQ-B is 80. In the present study, the SDC was 42, which means that a change in IPQ-B score must exceed 42 points, more than half of the scale range, to reflect a true difference between test and retest scores; smaller decreases in IPQ-B score can be explained by random error. An SDC of 42 also indicates low agreement between the two scores and thus only moderate longitudinal responsiveness to real changes in the perception of complaints. We conclude, therefore, that the instrument is not suitable for detecting real changes in individual patients.
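The link between the limits of agreement and the SDC can be shown in a short sketch. It assumes the commonly used definitions LOA = mean difference ± 1.96 × SD of the differences and SDC = 1.96 × √2 × SEM, with SEM = SD of the differences / √2. The text does not state which formulas were used in the present study. The function name and the scores below are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, stdev


def agreement_stats(test, retest):
    """Bland-Altman limits of agreement and SDC for paired scores."""
    # Retest minus test; a negative mean indicates lower scores at retest.
    diffs = [r - t for t, r in zip(test, retest)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)
    # 95% limits of agreement around the mean difference.
    loa = (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff)
    # Standard error of measurement derived from the difference scores
    # (an assumed definition, see the lead-in).
    sem = sd_diff / sqrt(2)
    # Smallest change in an individual score that exceeds measurement error.
    sdc = 1.96 * sqrt(2) * sem
    return {"bias": bias, "sd_diff": sd_diff, "loa": loa, "sem": sem, "sdc": sdc}


# Hypothetical IPQ-B totals on the 0-80 scale (illustration only).
test_scores = [45, 52, 38, 60, 41, 55]
retest_scores = [40, 47, 39, 48, 35, 50]
stats = agreement_stats(test_scores, retest_scores)
print(stats["loa"], stats["sdc"], stats["sdc"] / 80)
```

Under these definitions the SDC equals half the width of the limits of agreement, so wide LOAs and a large SDC are two views of the same measurement error. Dividing the SDC by the maximum score of 80 expresses it as a fraction of the scale range.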
For concurrent validity, Terwee et al. proposed that a correlation of ≥ 0.50 is acceptable. In our study, the Pearson’s correlation coefficient was 0.51 and the ICC value for the IPQ-B and the mental health subscale of the SF-36 was 0.65 (95% CI, 0.46-0.80). However, since the items of the IPQ-B are derived from earlier versions of the IPQ, the content validity of the scale might have been affected during this derivation process. The IPQ-B was developed by ‘forming one question that best summarized the items contained in each subscale of the IPQ-R’. Indeed, more recent findings indicate that people have difficulty understanding the items of the IPQ-B, with some even misinterpreting them. This could affect the content validity of the instrument, leading to the question: ‘Is the scale really measuring the same construct?’ Nonetheless, in the present study, we did find the internal consistency of the scale to be adequate in ANLBP patients.
The results of our study are consistent with those of studies reporting on the psychometric properties of the IPQ-B for several illnesses. However, our findings differ from those of Broadbent et al., who also used the mental health component of the SF-36 to determine concurrent validity, in patients with myocardial infarction. They found negative associations for four items of the IPQ-B when compared to the mental health subscale. A possible explanation for this disparity is that psychological state has a greater impact on ANLBP patients than on patients with a specific medical condition such as myocardial infarction.
The small sample size was a major limitation of the study, so the results must be interpreted with caution. Another limitation was the relatively long test-retest period: patients could have been influenced by the favourable natural course, and a positive change in pain and activities might have occurred. Such developments may have changed patients’ perception of low back pain, which might have negatively biased the test-retest reliability results. A problem inherent in studies of this kind is minimizing treatment influence; hence, all data were collected just before the two interventions. Even so, the change in IPQ-B score might be explained by internal and/or external influences between the two administrations that affected patients’ perceptions of low back pain.
Main and George emphasized the importance of measuring a patient’s perception as part of a more psychologically informed physical therapy practice. The goal of doing so is to identify and alter the patient’s perception of musculoskeletal pain and response to pain in his or her daily coping behavior, as a patient’s cognition of his or her pain and disability might be essential for decreasing musculoskeletal disorders and for a more rapid recovery. Therefore, we emphasize the need to measure patients’ pain perception in several musculoskeletal disorders. At the same time, we need to acknowledge the complexity of this multilevel representation and the problems patients might have interpreting the items of instruments measuring this psychological construct. In this regard, we support the use of the IPQ-B in primary care physical therapy management, as it is a useful instrument for assessing patients’ initial perceptions of their disorder. Such assessments should address the more negative perceptions patients hold of their back pain, with the aim of decreasing the risk of more chronic low back pain problems.