Reliability and construct validity of the modified Finnish version of the 9-item Patient Health Questionnaire, and its content validity within the biopsychosocial frame among female healthcare workers with sub-acute or recurrent LBP

Background: Healthcare workers have an increased risk of chronic low back pain (LBP), leading to reduced work ability. Depression, a highly prevalent, costly and disabling condition, is commonly seen in patients with sub-acute LBP. This study investigated the psychometric properties and content validity of a modified 9-item Patient Health Questionnaire (PHQ-9-mFIN) in female healthcare workers with sub-acute LBP.
Methods: Reliability (internal consistency, test-retest repeatability) was assessed with standard methods. Construct validity of the PHQ-9-mFIN was assessed as level of depression (PHQ-9-mFIN: 0-4 none, 5-9 mild, ≥10 at least moderate) against the RAND-36 Health Survey, a valid measure of health-related quality of life (HRQoL). Content validity was determined as the strength of the association between the levels of PHQ-9-mFIN and the selected biopsychosocial factors.
Results: The internal consistency of the PHQ-9-mFIN was high (Cronbach’s α=0.82) and the test-retest repeatability scores (n=65) fair: Pearson’s correlation 0.76, Kappa value 0.42 for the diagnostic criterion (i.e. scores 0-9 vs. 10-27). In the construct validity analysis, Spearman correlations with the RAND-36 were much stronger for the Mental components (range -0.43 to -0.70; summary scale -0.68) than for the Physical components (range -0.06 to -0.41; summary scale -0.24). There was a clear stepwise association (p<0.001) between the levels of depressive symptoms and General health (physical component, range 59.1 to 78.8). The associations with all items of the Mental components were strong and graded (p<0.001). All participants had low scores for Bodily pain regardless of the level of depressive symptoms. There was a strong association (p≤0.003) between levels of PHQ-9-mFIN and multisite pain, lumbar exertion and recovery after workdays, neuromuscular fitness in Modified push-ups, work ability, and fear of pain related to work.
Conclusions: The PHQ-9-mFIN showed adequate reliability, and excellent construct and content validity among female healthcare workers with recurrent LBP and physically strenuous work.
Trial registration: NCT01465698

Background
Low back pain (LBP) affects people of all ages and is now the leading cause of disability and a contributor to a huge disease burden worldwide, with the highest prevalence in working-age groups [1][2][3]; disability-adjusted life-years increased by 54% from 1990 to 2015 [2]. In nearly all people LBP is described as non-specific because it is not possible to identify a specific nociceptive cause [3]. At the individual level, musculoskeletal pain reduces both the physical and mental aspects of health-related quality of life (HRQoL) [4]. Across all the European Union member states, including Finland, LBP and other musculoskeletal disorders are the leading causes of work disability, sickness absence from work and loss of productivity [5].
Depression, a highly prevalent, costly, and disabling condition [6], is commonly seen in patients with subacute LBP [7][8][9]. In 2018, LBP was the first and depressive disorders the third leading worldwide cause of years lived with disability [6].
Currently, it is not quite clear whether depression is a cause of LBP. Cross-sectional data on patients with sub-acute LBP indicate that men and women with LBP have significantly higher depression scores than those with no pain [7]. The prospective findings on the course of acute and subacute LBP suggest that depression might have an adverse effect on the prognosis of LBP [8]. Individuals with depressive symptoms may have an increased risk of developing an episode of LBP in the future, the risk being higher in patients with more severe levels of depression [9].
Healthcare is one of the employment sectors with significantly higher rates of sickness absence, with a negative impact on employee health, healthcare delivery and patient health [10]. The annual prevalence of LBP among hospital nurses and nurses' aides in Europe is between 51% and 57%, and new high-risk groups include home and long-term care nurses and physiotherapists [11]. According to a Scottish health board database (including approximately 12 000 healthcare employees) over a 6-year period, musculoskeletal disorders (MSDs) accounted for 24% and mental health problems for 20% of the total number of working days lost [10]. Of all sickness absence events, LBP had the highest incidence, i.e. 34%. The highest burden of work loss due to both musculoskeletal and mental conditions was observed among nurses and midwives [10]. In Finland, MSDs account for a third of the overall costs of sickness absence and a fifth of all disability pensions [12].
The Patient Health Questionnaire-9 (PHQ-9) is the most commonly used screening tool for depression in primary care: it is brief, self-administered, easy to score and well validated for detecting and monitoring changes in depression [13,14]. There is a Finnish translation of the original PHQ-9 questionnaire [15]. However, we have produced a modified Finnish version with shorter wording, in which questions 6, 8 and 9 have been replaced to make the questionnaire more applicable in interventions among apparently healthy working populations as well as in large-scale population studies on physical activity, fitness and health.
The purpose of this study was to investigate the reliability (internal consistency, test-retest repeatability) and construct validity of the modified Finnish version of PHQ-9 (PHQ-9-mFIN), as well as its content validity within the biopsychosocial frame [16,17] among female healthcare workers with recurrent non-specific LBP and physically strenuous work [18][19][20][21].

Data collection, study design and sample
This study contains cross-sectional baseline data of the NURSE-RCT (NCT01465698) [18][19][20][21] and data from a small test-retest repeatability study (n = 64) among volunteer participants of the NURSE-RCT. The test-retest data on selected questionnaire items, including PHQ-9-mFIN, were collected in sub-studies 2 and 3 of the NURSE-RCT in fall 2014 as part of the participants' 24-month (sub-study 2) and 12-month (sub-study 3) follow-up measurements at the UKK Institute, respectively (see Figure 1 and Table 1 of the study protocol) [18]. The participants first filled in the standard NURSE-study questionnaire [18] at home (1st measurement) a week before coming to the aforementioned follow-up measurements conducted at the UKK Institute; the Repeatability questionnaire (2nd measurement) was filled in during the follow-up measurement session at the institute. All participants provided their written consent to a research secretary at the beginning of the baseline measurements. The study protocol of NURSE-RCT is available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5117067/pdf/bmjsem-2015-000098.pdf [18]. The Ethics Committee of Pirkanmaa Hospital District (ETL code R08157) has approved the study protocol.

Assessment methods
The nine questions of the PHQ-9-mFIN and the original PHQ-9 [13] are provided in Table 1. The time frame of the PHQ-9-mFIN is "during the last week", which differs from the two weeks used in the original version [13]. The wording of the scoring (0-3) of both versions is provided at the bottom of Table 1.

Reliability
The internal consistency of the items of the PHQ-9-mFIN was assessed by the Cronbach's α coefficient. The 1-week test-retest repeatability of the total score (0-27) of PHQ-9-mFIN was assessed by Spearman correlation; Kappa was used for categorical analysis based on the diagnostic criterion (0-9 as No, ≥10 as Yes); we also present the test-retest scores as a scatter plot (Fig. 1).
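The two reliability statistics named above can be illustrated with a minimal sketch. The analyses in this study were run in SPSS; the Python functions below (`cronbach_alpha`, `cohen_kappa`) are hypothetical helpers written only to show the textbook formulas, assuming item scores are stored as one list of nine values per respondent and total scores as plain lists.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the total (summed) score across respondents.
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(first, second, cutoff=10):
    """Cohen's kappa for test-retest agreement on the criterion 'total score >= cutoff'."""
    a = [score >= cutoff for score in first]
    b = [score >= cutoff for score in second]
    n = len(a)
    # Observed proportion of respondents classified the same way on both occasions.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from the marginal proportions.
    p_a, p_b = sum(a) / n, sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)
```

Because alpha depends only on the ratio of the summed item variances to the total-score variance, using the population variance (as here) or the sample variance gives the same value, provided the same choice is made for both.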

Construct validity
The construct validity of PHQ-9-mFIN was assessed against the RAND 36-Item Health Survey (RAND-36), a validated Finnish questionnaire [22] assessing health-related quality of life (HRQoL). First, we studied the correlations of the total score of PHQ-9-mFIN against the four Physical and Mental components (0-100) of RAND-36 and their corresponding summary scores (0-100), which are presented in Table 2.
Second, we studied the associations of the eight components and the two summary scores of the RAND-36 [22] with the level of depressive symptoms according to the PHQ-9-mFIN using the original ratings [13]: scores 0-4 as None, 5-9 as Mild and ≥10 as at least Moderate. Descriptive data are presented as percentages, and mean values with standard deviation or 95% confidence intervals (CI).
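The two analysis steps described above can be sketched as follows. This is an illustration only, not the SPSS procedure used in the study: `phq9_level`, `average_ranks` and `spearman_rho` are hypothetical helper names, the cut-offs are the ones stated in the text, and the Spearman coefficient is computed as the Pearson correlation of average ranks, which is the usual way of handling tied scores.

```python
def phq9_level(total_score):
    """Map a PHQ-9 total score (0-27) to the three levels used in this study."""
    if not 0 <= total_score <= 27:
        raise ValueError("PHQ-9 total score must be between 0 and 27")
    if total_score <= 4:
        return "None"
    if total_score <= 9:
        return "Mild"
    return "At least moderate"


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value ties with the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A higher PHQ-9-mFIN score indicates more depressive symptoms, whereas a higher RAND-36 score indicates better health, so negative coefficients are the expected direction when the two are correlated.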

Content validity
The content validity of the PHQ-9-mFIN was assessed within the biopsychosocial model (i.e. Pain, Functioning, Participation, Individual), which provides a useful framework for understanding factors that may contribute to chronicity in LBP and are important targets for interventions among patients with subacute or recurrent LBP [16,17]. Standard methods were used to assess the background variables and selected biopsychosocial factors of the NURSE-RCT at baseline: intensity of LBP on the Visual Analog Scale (VAS) [23], number of musculoskeletal pain sites [24], lumbar exertion after workdays [25] and recovery after work [26]. Additionally, we measured trunk and upper-body muscular fitness using the Modified push-ups test [27,28]. Work ability was assessed with the work ability score [29] and work stress as effort-reward imbalance [30]. The Fear-Avoidance Beliefs Questionnaire [31] was used to measure pain-related fear towards work and physical activity.
We present the descriptive data of the study population by the level of depressive symptoms, assessed by the PHQ-9-mFIN and using the original categories [13] as described above. The aim was to gain knowledge on the factors that are related to increasing levels of depressive symptoms among female healthcare workers with recurrent LBP and physically strenuous work [18][19][20][21]. All statistical analyses were conducted by KT using SPSS statistics software, version 25 (IBM, Chicago, IL).

Results
The mean age of the women was 46 years, their mean time in the current job was 11 years, and 70% did shift work [19]. The majority of the participants were nurses (45%) or nurses' aides (41%). Of the women, 28% were current smokers; 59% had a body mass index (BMI) of 25 or more, indicating overweight, of whom 18% were obese (i.e. BMI ≥30) [32].
The majority (65%) of the women reported a pain duration [24] of less than 3 months in the back, 40% had clinically meaningful intensity of LBP (i.e. ≥40 mm on the VAS) [23] and 12% experienced daily pain [24]. Almost a third (31%) of the participants reported musculoskeletal pain of at least moderate intensity (≥4 on a numeric rating scale of 0-10) [24] at three or more body sites. The majority (78%) of the female healthcare workers reported no days of sickness absence due to LBP during the preceding 6 months [21].

Descriptive results of the PHQ-9-mFIN
Of the nine questions of the PHQ-9-mFIN (see Table 1), "Feeling yourself lonely" (question 8) had the highest proportion of scores of 2 and 3, indicating a higher level of depressive symptoms (20.7% and 11.9%, respectively), followed by "Feeling bored" (question 7; 22.6% and 6.0%) and "Lack of enthusiasm for doing anything" (question 1; 18.3% and 6.0%). The highest proportion of zero scores (no depression) was detected for the questions "Have trouble getting to sleep or staying asleep" (question 3; 64.2%) and "Feeling hopeless about the future" (question 9; 63.8%).
The mean value of the PHQ-9-mFIN in the present study population was 7.4 (range from 0 to 27).

Reliability and construct validity
The internal consistency of the PHQ-9-mFIN, assessed by Cronbach's α, was 0.82.
The Pearson's test-retest repeatability correlation (n = 65) over the 1-week test-retest interval was 0.76 and the Kappa value for the diagnostic criterion (No: 0-9; Yes: ≥10) was 0.42. The scatter plot (Fig. 1) indicates that the repeatability is lowest between the scores from 3 to 7 and highest from 9 on up to the highest possible Table 2.
Of the Physical components (see Table 3 Table 4 by the level of depressive symptoms measured with the PHQ-9-mFIN. The proportion of female healthcare workers with at least moderate symptoms (score ≥10) [13] was 28% (n = 61) as was the percentage of those with no depression (scores 0-4; n = 61).
The mean intensity of LBP during the past 4 weeks on the VAS was at a clinically meaningful level of 40 mm [23] among those with moderate depressive symptoms and lowest (i.e. 30 mm) among those with no symptoms (p = 0.039). There were stepwise associations (p≤0.003) between the level of depressive symptoms and number of musculoskeletal pain sites [24], lumbar exertion after workdays [25], recovery after workdays during the past 4 weeks [26], neuromuscular fitness in the Modified push-ups test [27,28], Work Ability Score [29] and fear of pain [31] related to work, but not that related to physical activity. The effort-reward imbalance (0.2-5), an indicator of work stress [30], slightly increased with the level of depression (p = 0.014).

Discussion
The nine-item Patient Health Questionnaire is a screening tool used worldwide for major depressive disorder in different healthcare settings, with acceptable diagnostic properties at a cut-off score of 10 or above [33,34]. A score of 10 was recently shown to maximize combined sensitivity and specificity overall and for subgroups [34]. The validity of both the PHQ-9 [35] and the Mental Component Summary score of the Short Form-36 Health Survey [36] for screening major depression has been established in patients with chronic LBP.
The present study investigated the psychometric properties of a modified Finnish version of the PHQ-9 among female healthcare workers with sub-acute or recurrent LBP. To our knowledge, there are no previous validation studies of the PHQ-9 among this target group. The RAND 36-Item Health Survey serves both as a general measure of functional health status and as a criterion measure for studying the construct validity of the PHQ-9-mFIN [36]. The assessment of the relationships of a variety of biopsychosocial factors with the level of depressive symptoms, measured with the PHQ-9-mFIN, provides knowledge on possible risk factors for long-term LBP among those with or without depressive symptoms.
The correlation coefficient of 0.76 indicates acceptable repeatability for the 1-week test-retest interval. Three former studies with a 2-week test-retest period reported similar (0.76) [38,40] and higher (0.86) [39] correlations. Our analysis of test-retest repeatability by Kappa agreement, 0.42 for the commonly used diagnostic cut-point of 10 (i.e. scores 0-9 vs ≥10), also indicated acceptable repeatability. The scatter plot in Figure 1 further showed that the repeatability is higher when the depressive symptom score is at least 9 (i.e. close to the moderate level of ≥10) or when the score is very low, from 0 to 3, indicating no depressive symptoms.
The original PHQ-9 assesses symptoms during the past two weeks. We chose the 1-week time frame because it is the period during which the participants wore accelerometers for "objective" assessment of physical activity [41] and sedentary behaviour [42]. "Subjective" questionnaire data on physical activity and/or exercise are usually also collected for a 1-week period. Because physical activity and exercise are recommended treatments for moderate depression [43] as well as for recurrent LBP [44], we chose to collect the data on both for one week.

Content validity within the biopsychosocial model
The main aim of assessing the content validity of the PHQ-9-mFIN was to identify possible biopsychosocial risk factors for adverse future events among female healthcare workers engaged in strenuous physical work and experiencing recurrent LBP with or without depressive symptoms. Depressive symptoms are expected to have an adverse effect on the prognosis among patients with recurrent LBP [8].
In our former cross-sectional study among the present study population, work-related Fear Avoidance Beliefs (p<0.001), lumbar exertion (p = 0.003), depressive symptoms (p = 0.01) and recovery after work (p = 0.03) best explained work ability [20]. Multi-site musculoskeletal pain has also been associated with poor physical work ability among healthcare workers, and the magnitude of the association is likely to increase with the number of pain sites [46]. In Finland, co-occurrence of musculoskeletal pain and depressive symptoms is strongly related to poor self-rated physical work ability [47].
A clear dose-response relationship has been reported between increasing levels of depressive symptoms and risk of long-term sickness absence (LTSA) [48].
Furthermore, the adverse effect of non-clinical depressive symptoms manifested at relatively low scores [48]. In Finland, musculoskeletal pain, but not depression, was associated with thoughts of early retirement [47]. Among Danish healthcare workers, depressive symptoms and the number of musculoskeletal pain locations were associated with an increased risk of LTSA for individuals who did not have comorbid symptoms [49].

Conclusion
The modified Finnish version of the PHQ-9 has shorter wording overall and replaces the psychologically most devastating statements of questions 6, 8 and 9 with more positive ones, so as to be applicable in interventions among apparently healthy worker populations or in large-scale population studies. The PHQ-9-mFIN showed adequate reliability and excellent construct and content validity among the study group of female healthcare workers with recurrent LBP and work that is physically strenuous for the lower back.