Study selection: inclusion and exclusion criteria
Participants (with and without LBP) were recruited by poster and word-of-mouth advertising from private physiotherapy clinics and a university. People with LBP (LBP group) were included if they had back +/− leg pain for > 12 weeks and a pain score of > 2 on a 0 to 10 Numerical Rating Scale (average of worst, current and usual pain intensity) [28]. Exclusion criteria were any of the following: (i) previous lumbar surgery, (ii) any invasive spinal procedures for LBP, including therapeutic injections, within the last 12 months, (iii) pregnancy, (iv) neoplasm, infection, fracture, inflammatory disease, neurological disease or any metabolic disorder that had the potential to affect the lumbo-pelvic region, (v) implanted electrical medical device, (vi) any medical abnormalities or conditions (e.g., knee or hip conditions) that in the opinion of the clinician would substantively interfere with an ability to participate in the study, (vii) a known allergic skin reaction to adhesive tapes or plasters, or (viii) BMI > 30 (above which it becomes difficult to palpate bony landmarks). Participants recruited into the sample without back pain (NoLBP group) were excluded if they had (i) back pain at the time of testing, (ii) an episode of back pain that had necessitated attending a medical practitioner or allied health professional in the last 12 months, (iii) time off work due to back pain in the last 12 months, or (iv) any back pain during or between testing procedures. All potential participants were screened for suitability by a trained administrator, by direct contact and follow-up phone call if clarification was required, and then invited to participate. Ethics approval was obtained from Monash University (approval number CF12/1995-2012001090). All participants gave written informed consent.
Measurement protocol
Figure 1 presents the test procedures. Each participant was tested on two separate days. On the first test day, they were tested twice (Test 1 and Test 2) by two different raters (Raters A and B). On the second test day, they were assessed once (Test 3) by Rater A. On each test occasion, participants were assessed while they performed five repetitions of each movement. Data were collected at two geographic locations by physiotherapists with a minimum of 2 years’ clinical experience.
To standardise the testing procedures, 3 h of practice for standardised palpation of bony landmarks, sensor placement and measurement procedures preceded the initial data collection. Standardised instructions were used by both raters with pre-determined verbal cues for each movement test. Rater order (i.e., who administered Tests 1 or 2) was randomised pragmatically by rater availability. Participants were tested in the same room for all tests, and where possible, were tested at a similar time of day.
All kinematic data were automatically captured by the ViMove system independently of actions by the rater.
Equipment
The ViMove system (DorsaVi, Australia) is an inertial measurement system comprising two wireless movement sensors containing a triaxial accelerometer, a triaxial gyroscope and a magnetometer; two wireless surface electromyography (EMG) sensors (EMG data are not reported in this paper); and a small wireless recording device that can be easily carried (e.g., in a pocket). The manufacturer reports average differences of < 1° for single plane, through-range movements when comparing matched measurements from the ViMove and a Fastrak opto-electronic device [29]. The ViMove movement sensors collect data at approximately 20 Hz.
Test procedures
Participants were partially undressed to expose the body from T12 to the posterior superior iliac spines (PSIS) (see Fig. 2). Shoes were removed. The upper border of each PSIS was palpated and marked by the first rater. To standardise sensor placement, the distance from the PSIS marker to the floor was recorded using a rigid vertical ruler and right-angled square. These measurements were used to replicate PSIS markings in subsequent testing [30]. A plastic template (part of the ViMove system) for standardising relative sensor placement was then aligned to the marking on the PSIS and used to guide sensor attachment. Movement sensors were attached to the skin over the T12 and S2 spinous processes using disposable adhesive pads. Movements were then demonstrated by the rater, after which participants were instructed to move through a standardised sequence of movements (summarised in Additional file 1).
During these movements, data on lumbo-pelvic angles and ROM were recorded automatically by the device. The only role of the rater was to request the required movement in the required sequence and initiate the data collection process. On completion of a test, sensors and adhesive pads were removed and the skin was wiped clean. Participants rested for 5 min, then the entire procedure was immediately repeated by a second rater. Each rater was blind to data collected by the other rater, with the exception of the measurement of the vertical distance of the PSIS from the floor. Participants then returned 7–14 days later for a repeat assessment (Test 3) by Rater A. For participants with LBP, pain was recorded using three Numerical Rating Scales (worst pain = 10, no pain = 0), and the average of current, usual and worst pain over the previous 2 weeks was used [31]. Activity limitation was assessed using the Roland Morris Disability Questionnaire [32]. Pain and activity limitation were recorded on both assessment occasions.
Sample size
No existing data were available to inform sample size estimates. A sample of 60 adults aged 18–60 years (n = 30 with LBP, n = 30 without LBP) was recruited. This sample size would allow detection of a correlation of 0.44 or more between repeated measures in each group of 30, with an alpha of 0.05 and power of 0.8 [33]. We assumed, arbitrarily, that this was an adequate threshold: lower retest correlations would indicate that individual variation in movement patterns was too large for those patterns to be clinically interpretable. In addition, a sample size of 30 is recommended where researchers are studying differences between two sets of scores, as difference scores for samples of 30 or more are likely to approximate a normal distribution and are therefore better suited to parametric tests.
Data analysis
Data on body position were sampled and recorded at approximately 20 Hz for each of the five repetitions of flexion, extension and left and right lateral flexion movements. Averaged lumbar lordosis angle was recorded in standing over a 5-s period.
Peak angles were calculated for trunk and pelvic sensors to indicate maximum angular displacement at T12 (trunk movement) and S2 (pelvic/hip movement). Lumbar movement (movement between T12 to S2) was calculated by subtracting pelvic movement (movement of the lower sensor at S2) from trunk movement (movement of the upper sensor at T12). In addition to static posture and ROM, data on ‘lumbar versus pelvic’ contribution to flexion, extension and lateral flexion were collected during each movement. This is shown graphically in Fig. 3. A summary measure of this pattern of lumbar versus pelvic contribution to trunk movement (lumbo-pelvic rhythm) was estimated by calculating the percentage contribution of lumbar ROM to peak trunk ROM for flexion, extension and lateral flexion.
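The subtraction and percentage calculation described above can be illustrated with a minimal sketch. The function name and the example angles are hypothetical and are not taken from the ViMove software or from the study data.

```python
def lumbar_contribution(trunk_peak_deg, pelvic_peak_deg):
    """Return (lumbar ROM in degrees, lumbar share of trunk ROM in percent).

    trunk_peak_deg:  peak angle of the upper sensor (T12)
    pelvic_peak_deg: peak angle of the lower sensor (S2)
    """
    if trunk_peak_deg == 0:
        raise ValueError("trunk ROM must be non-zero to compute a percentage")
    # Lumbar movement = trunk movement minus pelvic movement.
    lumbar_deg = trunk_peak_deg - pelvic_peak_deg
    # Lumbo-pelvic rhythm summary: lumbar ROM as a percentage of trunk ROM.
    return lumbar_deg, 100.0 * lumbar_deg / trunk_peak_deg


# Hypothetical flexion trial: 100 degrees at T12, 45 degrees at S2.
lumbar, share = lumbar_contribution(100.0, 45.0)
```

With these illustrative values, lumbar ROM is 55° and the lumbar contribution is 55 % of trunk ROM.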
Statistical analysis
Participant demographics (gender, BMI, pain and activity limitations) were summarised.
Comparing ROM for participants with and without LBP
Mean ROM scores for each of the repetitions (three tests each of five repetitions) for each movement, for LBP and NoLBP participants, were tested for differences between groups using a repeated measures ANOVA.
Consistency in repeated measurements
To examine the overall consistency in repeated movements, the standard deviation of all measurements of a movement for each participant was calculated. Differences in standard deviations between groups were tested using independent t-tests.
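As an illustration only, the per-participant standard deviation and the between-group comparison could be sketched as follows. The text does not state whether equal variances were assumed; the pooled-variance (Student) form is used here as an assumption, and all names and data are hypothetical. A p-value would be obtained from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev, variance


def participant_sd(measurements):
    """Sample standard deviation of all measurements of one movement."""
    return stdev(measurements)


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance (assumed form).

    Returns (t, degrees of freedom).
    """
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical per-participant SDs (degrees) for each group.
lbp_sds = [1.0, 2.0, 3.0]
nolbp_sds = [2.0, 3.0, 4.0]
t_stat, df = pooled_t(lbp_sds, nolbp_sds)
```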
Within-test repeated movement consistency
Each of the three tests consisted of five repetitions for each movement. We considered that the best estimate of a person’s ROM would most likely be an average of repeated measurements. Before commencing analysis of the magnitude of error in movement estimates, the five repetitions for Test 1 were examined to determine whether any of the repetitions were systematically different from others. Systematic variation for specific repetitions was assessed using a paired t-test to compare the mean for the first repetition to the mean for each of the other repetitions; this was repeated for repetitions 2, 3, 4 and 5, for each movement, and for LBP and NoLBP participants separately. Based on this analysis, we made decisions regarding the repetitions that were suitable for inclusion in subsequent analyses.
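The repetition-by-repetition comparison can be sketched as below, assuming the data for one movement and one group are held as a list of repetitions, each containing one ROM value per participant. The function names and data layout are hypothetical. Only the t statistic and degrees of freedom are returned, and the sketch assumes the paired differences are not all identical (otherwise their standard deviation is zero).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic for two equal-length lists; returns (t, df)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def compare_reference_repetition(reps, ref=0):
    """Compare repetition `ref` with every other repetition.

    reps[k] holds the ROM values of repetition k + 1 across participants.
    Returns {repetition number: (t, df)}.
    """
    return {k + 1: paired_t(reps[ref], reps[k])
            for k in range(len(reps)) if k != ref}


# Hypothetical flexion ROM (degrees) for three participants, two repetitions.
reps = [[10.0, 12.0, 14.0], [9.0, 10.0, 13.0]]
results = compare_reference_repetition(reps, ref=0)
```

Calling the function once per reference repetition (ref = 0 to 4) reproduces the pattern of comparisons described above.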
Movement consistency between tests on the same day (inter-rater reliability)
The average of stable repetitions was used as best evidence of the typical movement for each participant. Consistency between repeated tests was estimated using the two-way, random effects, absolute agreement Intraclass Correlation Coefficient for two raters (ICC 2,2). The magnitude of differences between repeated tests was summarised using Bland-Altman plots with 95 % limits of agreement (LOA) and the minimal detectable change (MDC90) statistic. These were calculated from the standard deviation of the differences between repeated tests, multiplied by 1.65 for the MDC90 and by 1.96 for the 95 % LOA. The MDC90 metric, with its 90 % CI, balances statistical rigour with clinical utility in interpreting changes in measurements.
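A minimal sketch of these agreement statistics is given below. It assumes that the LOA are centred on the mean difference (the usual Bland-Altman convention, not stated explicitly in the text) and that the ICC is the average-measures, two-way random effects, absolute agreement form computed from ANOVA mean squares. Function names and data are hypothetical; the analyses in the study were run in a statistical package, not with this code.

```python
from statistics import mean, stdev


def agreement_stats(test_a, test_b):
    """Bias, 95 % limits of agreement and MDC90 from paired test scores."""
    diffs = [a - b for a, b in zip(test_a, test_b)]
    bias = mean(diffs)
    sd_diff = stdev(diffs)
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,  # assumed centred on the bias
        "loa_upper": bias + 1.96 * sd_diff,
        "mdc90": 1.65 * sd_diff,
    }


def icc_2k(scores):
    """ICC(2,k): rows are participants, columns are raters or tests."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-participant mean square
    msc = ss_cols / (k - 1)                 # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (msc - mse) / n)


# Hypothetical flexion ROM (degrees) from two raters for three participants.
stats = agreement_stats([50.0, 52.0, 54.0], [49.0, 50.0, 53.0])
icc = icc_2k([[50.0, 49.0], [52.0, 50.0], [54.0, 53.0]])
```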
Movement consistency between tests on different days (7–14 days after the first test: intra-rater reliability)
Methods used to calculate the consistency of measurements taken on the same day were repeated for measurements taken on two test occasions 7–14 days apart. The conceptual framework and definitions of reliability used in this study were those published by the COSMIN group [34]. All analyses were performed using a statistical software package (STATA, version 12).