Intra-session absolute and relative reliability of pressure pain thresholds in the low back region of vine-workers: effect of the number of trials

Background Pressure pain thresholds (PPT) are commonly used to quantify mechanical pain sensitivity of deep structures. Excellent PPT reliability has previously been reported over the low back of healthy subjects. However, there is a lack of studies assessing PPT over the low back of workers exposed to biomechanical risk factors of low back pain. Thus, the purpose of this study was threefold: (1) to evaluate the intra-session absolute and relative reliability as well as minimal detectable change (MDC) values of PPT within 14 locations covering the low back region of vine-workers, (2) to determine the number of trials required to ensure reliable PPT assessments and (3) to assess the effect of modifier factors such as gender, age, body mass index (BMI) and pain intensity on PPT reliability. Methods Twenty-nine vine-workers voluntarily participated in this study. Twenty-two reported low-intensity low-back pain while seven were pain-free. PPTs were assessed at 14 anatomical locations in the lower back region. Three trials were performed at each location with an interval of at least one minute. Reliability was assessed by computing intraclass correlation coefficients (ICC) and standard error of measurement (SEM) for all possible combinations of trials. Bland-Altman plots were also generated to assess potential bias in the dataset. Finally, a repeated-measures analysis of variance (RM-ANOVA) with the number of trials as within-subject factor was performed on (1) PPT, (2) ICC and (3) SEM values. Results ICC ranged from 0.86 to 0.99 for all anatomical locations and for all possible combinations of trials. SEM for the comparisons between trials 1–2, 2–3, 1–3 and 1–2–3 ranged from 36.7–77.5, 27.8–77.7, 50–95.2 and 39.3–80.8 kPa, respectively. ICC and SEM remained similar to those obtained for the entire population when modifier factors were taken into consideration. The visual analysis of Bland-Altman plots suggested small measurement errors for all anatomical locations and for all possible combinations of trials. Conclusions The assessment of PPTs of the lower back among vine-workers was found to have excellent relative and absolute reliability. Moreover, reliable measurements can be achieved equally well with the mean of three PPT measurements or with the first one alone.


Background
Work-related Musculoskeletal Disorders (WMSDs) are considered a public health problem in numerous countries [1,2]. WMSDs are often accompanied by pain located in the low back region [3,4] and associated with muscle hyperalgesia [5]. Seventy percent of the population will experience low back pain at least once in their lifetime [6][7][8]. In France, the prevalence of low back pain is particularly high in viticulture, which is partly explained by the relatively high exposure to biomechanical risk factors, i.e. awkward postures and repetitiveness [9,10].
In many studies dealing with WMSDs, the visual analogue scale (VAS) and the numeric rating scale (NRS) are commonly used to measure pain intensity in the low back region [11,12]. Even if these self-reported measures of pain intensity are considered valid, reliable and responsive to change in the intensity of pain [13], they are also largely influenced by psychosocial aspects related to the environment and by beliefs concerning the expected duration of pain [14]. Moreover, pain sensitivity is not uniformly distributed in a body region or along a muscle [15][16][17][18]. Using a pain diagram to depict painful areas does not, in general, offer the possibility to visualize the spatial distribution of pain sensitivity [19].
Assessing pressure pain threshold (PPT) is a way of quantifying sensitivity of deep structures to mechanical pain [20,21]. PPT has a good to excellent relative reliability in many anatomical locations such as the neck [22][23][24], the knee [25], the temporalis and masseter muscles [26,27] and the low back region [28][29][30][31]. However, for this latter anatomical location, only a few studies have assessed PPT's absolute reliability [31][32][33][34][35]. While a topographical pain sensitivity mapping technique has been developed for the low back [36][37][38], the number of assessed points in studies dealing with PPT's reliability in the low back is limited to two locations (2 cm laterally from the L3 or L4 spinous processes). Moreover, most of these studies assessed PPT in young healthy subjects [31] and little is known about PPT's reliability among workers exposed to biomechanical risk factors of WMSDs in the low back region. Consequently, it is essential to have reliable tools for assessing mechanical sensitivity to pain in order to, e.g., monitor the effectiveness of an intervention among workers in occupations potentially leading to WMSDs, such as vine-workers. This need is further substantiated by the difference found in PPT when comparing workers to young asymptomatic individuals, underlining mechanical hyperalgesia [5,39,40].
The purpose of this study was threefold: (1) to evaluate the intra-session absolute and relative reliability as well as minimal detectable change (MDC) values of PPT within 14 locations covering the low back region of vine-workers, (2) to determine the number of trials required to ensure reliable PPT assessments and (3) to assess the effect of modifier factors such as gender, age, body mass index (BMI) and pain intensity on PPT reliability.

Participants
Twenty-nine adult vine-workers (16 men and 13 women) volunteered to participate in this study. Nineteen of the 29 vine-workers reported low-back pain lasting at least 3 consecutive days in the last 12 months, and seven were pain-free at the time of measurements. The workers were recruited from two vineyards in the Bordeaux (France) wine district. The vine-workers' characteristics (anthropometrics, low-back pain duration and intensity) are presented in Table 1. The inclusion criteria were: age between 25 and 60 years, full-time employment as a vine-worker, no history of spine or pelvis fracture, no tumor or spinal surgery and no pregnancy.

Experimental protocol
The PPT measurements were performed during working hours in one session lasting approx. 30 min using a hand-held electronic algometer (Somedic Algometer type 2, Sollentuna, Sweden) with a 1 cm² rubber tip. PPTs were assessed over 14 anatomical locations in the low back region, with 7 locations on each side of the lumbar spinous processes L1-L5. Each location was measured three times (Trial 1, Trial 2 and Trial 3) by a single rater, with at least 1 min between two consecutive trials on the same location to avoid temporal sensitization [41]. The algometer was calibrated prior to data collection. Pressure was applied at a constant rate of 30 kPa/s with the tip of the algometer perpendicular to the skin. The worker lay comfortably in a prone position on a table and was asked to press a button that locks the algometer display when the feeling of pressure changed to pain. The examiner then noted the pressure indicated on the algometer display, corresponding to the PPT. Prior to recordings, the worker was familiarized with PPT assessment by measuring PPT on the tibialis anterior muscle, considered as a reference point [42,43].

Procedure to mark the 14 anatomical locations
Eight paper grids were designed for PPT measurements in the low back region. The design was based on the studies by Binderup and colleagues [36,37], in which the distance between two adjacent locations is calculated from the distance (d1) between L1 and L5. We then calculated one quarter of this distance (d2). A first column of 5 points was placed bilaterally at the distance (d2) from an imaginary line joining L1 to L5. Then, a second column of 2 points was set bilaterally at twice the distance (d2) from L2 and L3 (Fig. 1). Binderup and colleagues [36] have shown that the distance L1-L5 is on average 14.3 ± 2.8 cm for adult men and 12.5 ± 0.9 cm for adult women. Based on these distances, we developed eight PPT grids with an L1-L5 distance ranging from 11 to 14.5 cm (in steps of 0.5 cm between two consecutive grids) [31]. The rater palpated the lumbar spinous processes L1 and L5 and marked both locations on the skin with a pencil. Then, the distance L1-L5 was measured to select the corresponding PPT grid. Once selected, the rater aligned the grid with the L1 and L5 marks on the skin and started the assessments. Using the pre-designed grids saved approx. 20 min without loss of spatial information.
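The grid selection and point layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the "nearest grid" rule for distances that fall between or outside the eight grid sizes is not stated in the text, and the coordinates assume that the five points of the first column are evenly spaced from L1 to L5 and that the second column sits at the level of L2 and L3.

```python
# Illustrative sketch only; names and the nearest-grid rule are assumptions.
GRID_SIZES_CM = [11.0 + 0.5 * i for i in range(8)]  # 11.0, 11.5, ..., 14.5 cm


def select_grid(d1_cm):
    """Return the grid whose L1-L5 distance is closest to the measured d1."""
    return min(GRID_SIZES_CM, key=lambda g: abs(g - d1_cm))


def grid_points(d1_cm):
    """Return 14 (x, y) points in cm for the grid matching d1_cm.

    Origin at L1, y increasing towards L5, x negative on the left side.
    Assumes evenly spaced levels L1..L5 (spacing d2 = d1 / 4).
    """
    d1 = select_grid(d1_cm)
    d2 = d1 / 4.0
    points = []
    for side in (-1, 1):  # left, then right
        # First column: 5 points at lateral distance d2 (levels L1..L5).
        points += [(side * d2, level * d2) for level in range(5)]
        # Second column: 2 points at lateral distance 2 * d2 (assumed L2, L3).
        points += [(side * 2 * d2, level * d2) for level in (1, 2)]
    return points
```

For a measured L1-L5 distance of 12.6 cm, `select_grid(12.6)` picks the 12.5 cm grid, and `grid_points(12.6)` returns 7 points per side.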

Data analysis
PPT measurements, intraclass correlation coefficients (ICC) and standard error of measurement (SEM) values were normally distributed (Shapiro-Wilk normality test). The magnitude of the systematic difference in PPT between trials was estimated using the 95 % confidence interval (CI) of the mean difference (Mean Diff) between trials 1-2, trials 1-3 and trials 2-3, calculated with the formula:

$$95\,\%\ \mathrm{CI} = \mathrm{Mean}_{\mathrm{Diff}} \pm t_{n-1} \times \frac{\mathrm{SD}_{\mathrm{Diff}}}{\sqrt{n}}$$

where $t_{n-1}$ corresponds to the value of the t distribution with n − 1 degrees of freedom, SD Diff is the standard deviation of the differences between trials and n corresponds to the number of participants. A repeated-measures analysis of variance (RM-ANOVA) with the number of trials as within-subject factor was performed on (1) PPT, (2) ICC and (3) SEM values. In case of a significant effect of the number of trials, a Tukey post-hoc test for pair-wise comparisons was used to compare differences between trials. The relative and absolute reliability across trials 1-2-3 were computed using ICC, SEM and minimal detectable change (MDC). The relative reliability was evaluated by calculating a 2-way fixed ICC$_{2,1}$ (for absolute agreement). ICC values were interpreted using the categories proposed previously, in which an ICC between 0.00 and 0.20 is considered poor, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial and 0.81-1.00 almost perfect [44]. SEM is an absolute measure of the variability of the errors of measurement and allows statements to be made about the precision of the test scores of individual examinees [45]. SEM has the same unit of measurement as the PPT (kPa). According to Harvill [35], SEM was generated with the following formula:

$$\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \mathrm{ICC}}$$

where SD is the standard deviation of the scores from all workers and ICC the relative reliability. MDC gives the minimum value for which a difference can be considered as "real". MDC was calculated with the formula:

$$\mathrm{MDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$$

Furthermore, Bland and Altman plots of the differences between trials against their mean, with limits of agreement (LOA), were used to assess the magnitude of disagreement between trials. A difference between trials outside the LOA can be considered as a real change [46]. Additional analyses using gender and a median split for age, BMI and pain intensity were conducted for the PPTs from the overall low-back (mean PPTs of the 14 anatomical locations) to address the effects of modifier factors. A Student's t-test was then used to compare groups. All data analyses were performed with R 3.0.1 software. Results are presented as mean (SD) or (95 % confidence interval), unless otherwise indicated. p < 0.05 was considered significant.

Fig. 1 Schematic representation of the low back pressure pain threshold recording grid of the left (blank squares) and right (black squares) erector spinae muscles. d1 represents the distance between the first (L1) and the fifth (L5) lumbar vertebrae. d2 equals one fourth of d1
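The reliability statistics named in the data analysis can be illustrated with a short Python sketch. It is not the authors' R code. It assumes the Shrout and Fleiss single-measurement, absolute-agreement ICC(2,1), the conventional definitions SEM = SD × √(1 − ICC) and MDC = 1.96 × √2 × SEM (95 % level), and a critical t value supplied by the caller (for example about 2.048 for 28 degrees of freedom). All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1) for absolute agreement (Shrout & Fleiss formulation).

    scores: list of rows, one row per participant, one column per trial.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-trial mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def sem(sd, icc):
    """Standard error of measurement, same unit as the scores (kPa here)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at the 95 % level (assumed convention)."""
    return 1.96 * math.sqrt(2.0) * sem_value


def mean_diff_ci(trial_a, trial_b, t_crit):
    """CI of the mean difference between two trials: mean +/- t * SD / sqrt(n)."""
    diffs = [a - b for a, b in zip(trial_a, trial_b)]
    half_width = t_crit * stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) - half_width, mean(diffs) + half_width
```

As a usage example with made-up numbers, an SD of 100 kPa and an ICC of 0.96 give `sem(100, 0.96)` of about 20 kPa, and `mdc95(20)` of about 55 kPa.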

Intra-session relative and absolute reliability of PPT in the low back
The ICCs of the 14 anatomical locations and of the left, right and overall low-back (P left, P right, P all) were almost perfect regardless of the conducted comparison (trials 1-2-3). Likewise, the absolute reliability, i.e. SEM, did not change significantly (Table 2). The ICC, SEM and MDC following a median split were similar to those obtained for the entire population (Table 3).
The visual analysis of the Bland-Altman plots (Fig. 2) suggested small measurement error whichever comparison between trials was considered. These plots also showed that zero was included in the 95 % confidence interval and that no apparent systematic bias was present in the data.
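The quantities behind a Bland-Altman plot can be sketched as follows. This is a minimal illustration assuming the usual limits of agreement (bias ± 1.96 × SD of the differences); the function name and the example values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(trial_a, trial_b):
    """Return the per-participant means and differences, the bias and the LOA."""
    diffs = [a - b for a, b in zip(trial_a, trial_b)]
    means = [(a + b) / 2.0 for a, b in zip(trial_a, trial_b)]
    bias = mean(diffs)            # systematic difference between trials
    sd_diff = stdev(diffs)        # sample SD of the differences
    return {
        "means": means,           # x-axis of the plot
        "diffs": diffs,           # y-axis of the plot
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
    }


# Made-up PPT values (kPa) for four participants, two trials.
result = bland_altman([300, 350, 400, 450], [310, 340, 410, 440])
```

A bias close to zero with all differences inside the limits corresponds to the pattern described for Fig. 2.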

Number of trials to ensure reliable measurements
The mean PPT values at each PPT location were not significantly different between trials regardless of the three conducted comparisons (trial 1 vs. trial 2, trial 1 vs. trial 3, trial 2 vs. trial 3), with p-values ranging from 0.7457 to 1.000 (Tables 4, 5 and 6). Concerning the left, right and overall low-back (P left, P right, P all), p-values ranged from 0.8884 to 0.9994 (Tables 4 and 5). Lower PPT values were found for women compared with men for trial 1, and for workers reporting pain intensity above 2.5 compared with workers reporting pain intensity below 2.5 for trial 2, trial 3 and the mean of the three trials (Table 5).
The comparison of mean ICC values across trial combinations showed statistically significant differences for Trials 1,3 vs. Trials 1,2 (p = 0.0171) and for Trials 2,3 vs. Trials 1,3 (p = 0.0009). The same analysis for SEM values showed a statistically significant difference for the comparison Trials 2,3 vs. Trials 1,3 (p = 0.0122).

Discussion
The purposes of this study were (1) to evaluate the intra-session absolute and relative reliability as well as MDC values of PPT assessments in the low back region, (2) to determine the number of trial recordings required to ensure reliable PPT measurements among vine-workers and (3) to assess the effect of modifier factors such as gender, age, BMI and pain intensity on PPT reliability. This study specifically targeted a population at high risk of developing WMSDs, i.e. vine-workers, and used pressure pain sensitivity maps of the lumbar region.
Approximately 66 % of the workers reported an episode of back pain lasting for more than 3 days within the last year. The low back pain intensity within the last 7 days was low. These self-reported values confirmed that low back pain is an issue among vine-workers. The PPT values of vine-workers were close to those observed among other workers such as cleaners and elderly administrative or nursing workers [37,47].
Statistical power is defined as the probability of correctly rejecting the null hypothesis, i.e., the probability of detecting a significant effect when one actually exists [48]. In a reference study, Cohen [49] proposed 80 % as the conventional threshold for adequate power. Our study population was small but sufficient to obtain substantial relative reliability values. An a posteriori calculation showed that the power achieved by all significant results was above 97 %. Still, significant differences might have gone undetected due to the relatively small population size (lack of adequate power for detecting a true difference of a meaningful magnitude).
The relative reliability was assessed over 14 lumbar locations and estimated for 3 distinct low back regions ((1) the left low back, (2) the right low back and (3) the overall low back). The first important result of this study using an electronic pressure algometer was that ICCs ranged from 0.90 to 0.99 and were almost perfect regardless of the conducted comparisons (trial 1 vs. trial 2, trial 1 vs. trial 3, trial 2 vs. trial 3). This finding is in accordance with the existing literature in which PPTs were assessed in other anatomical locations. For instance, Nussbaum and Downes [50] assessed PPT relative reliability in the biceps brachii muscle by means of a Fischer algometer using 3 consecutive trials over 3 consecutive days. They also reported excellent reliability regardless of the comparison considered (trial 1 vs. trial 2, trial 2 vs. trial 3 and trial 1 vs. trial 2 vs. trial 3). In 2011, Walton and colleagues [35] tested intra-rater reliability with 3 consecutive PPT measurements on 2 anatomical locations (trapezius and tibialis anterior) with an electronic pressure algometer and reported excellent ICC values (0.96 and 0.97). However, the authors only compared the mean between the second and the third PPT assessment.
A comparison between two consecutive assessments is common in studies dealing with PPT, but it calls into question the rationale for recording three PPT values, as is often done [33,37,51]. Farasyn and Meeusen [30] calculated the ICC of three consecutive PPT measurements on erector spinae muscles with a Fischer algometer in healthy volunteers, also compared these PPT measurements in series of two, and reported excellent relative reliability. In contrast to our findings, the PPT of the initial trial has been reported to be significantly higher than that of the second and the third trial [30]. A similar finding was obtained by Lacourt and colleagues [52], who suggested that the first trial measured using an electronic algometer should be considered as a practice trial and excluded from the analysis.

Table 2 Intraclass correlation coefficients (ICC), standard error of measurement (SEM) and minimum detectable change (MDC) for pressure pain thresholds assessed over 14 locations (P1 to P14) over the low back region, for left and right side (P left and P right) as well as overall low back (P all) between the mean of the first and second trials (T1-T2), the first and the third trials (T1-T3), the second and the third trials (T2-T3) and the means of the three trials (T1-T2-T3)

Table 3 Intraclass correlation coefficients (ICC), standard error of measurement (SEM) and minimum detectable change (MDC) for pressure pain thresholds assessed over the overall low back P all (representing the average of the 14 PPT assessments) between the mean of the first and second trials (T1-T2), the first and the third trials (T1-T3), the second and the third trials (T2-T3) and the means of the three trials (T1-T2-T3) using gender and a median split for age, BMI and pain intensity
Conversely, Chesterton and colleagues [32] and Nussbaum and Downes [50], using respectively an electronic and a Fischer algometer, have reported that the highest ICC values are obtained when the mean score of three trials is used. Our results differ from the above-mentioned works in two respects: (1) they showed that the mean of the three trials generated ICC and SEM values identical to those generated by 2 consecutive measurements (Trials 1-2 or Trials 1-3 or Trials 2-3); and (2) they showed that the first measurement did not generate higher PPT values than the second or third assessment. In other words, the first measurement does not necessarily need to be considered as a practice trial and can thus be taken into account for analysis when assessing PPT values over the low-back region of vine-workers. This result is, however, in accordance with a recent study assessing PPT over the low back of healthy individuals, which suggested not discarding the first trial in order to report higher PPT reliability [31]. Furthermore, this finding highlights the importance of the experimental procedure, namely the information given to the workers and the relevance of the familiarization PPT assessments on, e.g., a remote body part such as the tibialis anterior muscle in the present study.
Although numerous studies have assessed the relative reliability of PPT, only a few have actually reported the absolute reliability, which makes comparison with the existing literature difficult. The reported SEMs were similar to what has recently been reported by Madeleine and colleagues [33] after a test-retest on the erector spinae muscle of young football players using an electronic algometer (i.e. 60.4 kPa). In other anatomical locations such as the upper trapezius and tibialis anterior, Walton and colleagues [35] obtained SEMs ranging from 18.2 to 73.8 kPa, while Chesterton and colleagues [32] found an SEM of approx. 60 kPa for the first dorsal interosseous muscle (both using an electronic algometer). We found no statistical differences between SEMs regardless of the conducted comparisons between trials, except for the comparison Trials 2-3 vs. Trials 1-3, which suggests that the first assessment did not generate an error greater than the other two.

Fig. 2 Bland and Altman analyses plotted for workers' pressure pain threshold of the overall low back when the mean of the first and second trials (a), the first and third trials (b) and the second and third trials (c) are considered
With respect to MDC, we note that the reported values were regularly above 150 kPa, which is in accordance with the existing literature. Fischer [53], Chesterton and colleagues [32] and Madeleine and colleagues [33] have also reported MDC values above 100 kPa. This implies a low sensitivity to change: a change in PPT can be masked by the measurement error, regardless of the absolute changes in PPT due to an ergonomic intervention [33].
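To give a sense of scale, the short calculation below converts the lowest and highest SEM values reported for the three-trial comparison (39.3 and 80.8 kPa) into MDC values. It assumes the conventional 95 % definition MDC = 1.96 × √2 × SEM, so the figures are illustrative rather than values taken from the tables.

```latex
\begin{align*}
\mathrm{MDC} &= 1.96 \times \sqrt{2} \times \mathrm{SEM} \approx 2.77 \times \mathrm{SEM} \\
\mathrm{SEM} = 39.3\ \mathrm{kPa} &\;\Rightarrow\; \mathrm{MDC} \approx 108.9\ \mathrm{kPa} \\
\mathrm{SEM} = 80.8\ \mathrm{kPa} &\;\Rightarrow\; \mathrm{MDC} \approx 224.0\ \mathrm{kPa}
\end{align*}
```

Under this assumption, any SEM above roughly 54 kPa yields an MDC above 150 kPa, so a change in PPT would need to exceed that magnitude to be distinguished from measurement error.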

Limits and perspectives
This study presents several limitations. Although many studies have demonstrated that inter-rater [33,49] and test-retest reliability are excellent in pain-free subjects, our knowledge of populations of vine-workers suffering from musculoskeletal pain is still limited. Individual factors like gender, age, BMI and the intensity of pain affect PPT [36,54,55,56]. The studied population was composed of men and women of different ages, BMI and pain intensity. Consequently, we conducted additional analyses using a median split [54]. In line with the literature, we found lower PPT values for women and for workers reporting pain above 2.5 on a VAS [36,56]. However, the computed ICC, SEM and MDC remained similar to the values found for the entire population. Future studies comparing PPT values between pain-free workers and workers suffering from WMSDs, adjusted for individual factors, are thus warranted. Although no link between PPTs and risk of future low back pain and no PPT differences among workers with or without recurrent low back pain have been reported [47,57], a larger sample size could help to establish a set of normative PPT values for vine-workers [20]. Finally, since PPT is increasingly used to assess the effect of physical training on pain sensitivity [58] or to monitor the effectiveness of an intervention [33,59,60], PPT could be used as a pain biomarker among vine-workers to assess over time the effects of ergonomic interventions or physical training programs specifically designed to prevent WMSDs.

Table 4 Mean (standard deviation: SD) pressure pain thresholds (kPa) assessed over 14 locations (P1 to P14) covering the low back region, for left and right side (P left and P right) as well as overall low back (P all) and level of significance (p-values) among trials. See "Procedure to mark the 14 anatomical locations" for explanation concerning the locations of PPT assessments

Conclusions
The present study showed that PPTs assessed over the low back region of vine-workers have excellent relative and absolute reliability. Reliable PPT assessments can be achieved equally well with the mean of three PPT measurements or with the first measurement alone. The relative and absolute reliability remained similar to those obtained for the entire population when gender, age, BMI and pain intensity were taken into consideration, but PPTs were lower for women and in the presence of pain. The findings suggest that the assessment of PPT over the low back region of vine-workers can be used to measure the effects of interventions.

Funding
The presented work is part of the joint PhD thesis of Romain Balaguier at Univ. Grenoble Alpes (France) and Aalborg University (Denmark), who was supported by a grant of the French Ministry of Higher Education and Research.

Availability of data and materials
The datasets generated during and/or analysed during the current study are not publicly available. Regarding data availability, ethical restrictions prevent deposition in a public repository, but data are available from the corresponding author on reasonable request.

Authors' contributions
RB, PM and NV conceived the research idea. RB, PM and NV discussed and wrote the initial protocol as well as the design of the study. RB was responsible for

Table 6 Mean difference between trials (95 % confidence interval (CI)) in pressure pain threshold (PPT, kPa) for the 14 locations (P1 to P14) covering the low back region, for left and right side (P left and P right) as well as overall low back (P all). See "Procedure to mark the 14 anatomical locations" for explanation concerning the locations of PPT assessments