Passive range of motion in patients with adhesive shoulder capsulitis, an intertester reliability study over eight weeks

Background Measuring range of motion (ROM) in the shoulder joint is important for the diagnosis and monitoring of change over time. To what degree passive ROM can be trusted as a reliable outcome measure was examined as part of an on-going randomized controlled trial for patients with shoulder capsulitis. The aim of this study was to examine intertester reliability of passive ROM in the shoulder joint over a period of eight weeks in patients with adhesive capsulitis stage II. Methods Fifty patients with a clinical diagnosis of adhesive shoulder capsulitis were examined by two independent testers. A predefined protocol was used for measuring passive range of motion with an inclinometer, a plurimeter, in both affected and non-affected shoulders three times; at the start of the study and after 4 and 8 weeks. Results Very good to excellent intertester agreements were found for most parameters for the affected arm at all three test points. The intraclass correlation coefficient (ICC 2.1) values ranged from 0.76 to 0.98, i.e. from very reliable to excellent. The measurement error was in general small for the affected arm (5°–7°). ICCs were slightly lower for the non-affected arm at 8 weeks, but with acceptable measurement errors. Conclusions Intertester reliability between two testers was very good at three visits over a time period of eight weeks using a plurimeter to measure passive range of motion in patients with adhesive shoulder capsulitis. This method can reliably determine passive range of motion in this patient population and be a reliable outcome measure.


Background
Range of motion (ROM) in the shoulder joint is among the commonly used clinical criteria for diagnostic purposes and to monitor effectiveness of given treatment [1]. ROM is often used as an outcome measure in studies observing effect of intervention in stiff joints and in shoulder pain [2][3][4]. Therefore it is important that the movement measured is reproducible without much variation, independent of instrument being used.
Shoulder capsulitis is a painful condition affecting between 2 -5% of the adult population [5][6][7]. There is a global reduction of active and passive movement, generally in a capsular pattern characterized by most reduction of external rotation, less of abduction and least of internal rotation. Reliable measurement of ROM is therefore essential for the correct diagnosis of adhesive capsulitis of the shoulder, as this is mainly a clinical diagnosis. Measurement variations in patients with shoulder capsulitis are bound to occur due to pain, fear of pain, stiffness, fatigue and measurement error at any one given time point [1]. To our knowledge no intertester reliability study has been conducted in this patient group. Reliability of measurements is therefore essential on both the affected and the non-affected side for diagnostic purposes and over time to monitor progression. Earlier intertester reliability studies have usually measured only affected [8][9][10] or only nonaffected shoulders [11,12] and with participant numbers below 35, and ROM has only been measured at one visit or with an insignificant time difference between measurements [8,[10][11][12][13][14][15]. Most former studies have only reported correlation coefficients and have not reported standard error of measurement (s w ) [8,14,16,17]. Although as many as 50 patients have recently been recommended to be included in reliability studies (p. 126 in de Vet et al., [18]), only a few studies have examined this many participants [8,17,19,20]. See Table 1 for an overview of former studies.
To what degree ROM can be trusted as a reliable outcome measure was examined as part of an on-going randomized controlled trial for patients with adhesive capsulitis of the shoulder. The aims of the study were: To determine the reliability of shoulder passive ROM (PROM) bilaterally between two testers in a large number of participants with adhesive capsulitis using a validated measuring instrument, the plurimeter [16,25], over an eight week period. To determine if the measurement error remained the same during the eight week period, measured three times four weeks apart.
The intertester reliability was evaluated for PROM in abduction, external rotation, internal rotation and "hand behind back" in patients with shoulder capsulitis.

Methods
PROM in the shoulder joint is defined as to the extent an investigator can move the arm until pain or stiffness limits the movement. We measured PROM on both the affected side and the non-affected side. There are no set rules regarding measurement intervals or the number of times measurement should take place. To avoid too much pain provocation we decided to measure each movement only once for each tester. Therefore a total of eight measurements for PROM were carried out for each of the two testers on each patient. No standardized time interval was set between tester 1 and 2, and usually only a few minutes elapsed between the two measurement sessions.

Participants
Patients potentially eligible for inclusion in the randomized controlled trial for treatment of shoulder capsulitis were referred to a primary care clinic by physicians and physiotherapists in the period 2010-2012. The study was approved by the Regional Ethical Committee REK NORD, reference 148/2008, in compliance with the Helsinki Declaration. Informed consent was obtained from all participants, and from the tester appearing in the photographs.
The measurement took place on the second consultation and is hereafter referred to as visit 1 in the study. The PROM testing took place on visit 1, four weeks later (visit 2) and then at eight weeks (visit 3).
To be included in the study patients had to be above 18 years of age, should be able to understand and speak Norwegian, and there should be no contraindications for use of corticosteroids. Participants should have reduced range of motion in a capsular pattern with a reduction of more than 30% of two out of three shoulder movements and none of the three movements (Abduction = ABD, External rotation = ER and Internal rotation = IR) should be normal. All patients were having both pain and stiffness, and can be referred to as being in stage 1 and/or stage 2 [25,26]. Patients with diabetes, asthma, pregnant women and breast feeding mothers were excluded from the study.
The first 50 participants recruited in the main study were included in the intertester reliability study, comprising 22 men and 28 women, age ranging from 38 years to 75 years (mean age 52 years; SD 9.3). Along with other demographic data, information regarding the affected shoulder, such as the side affected, how long the condition had lasted and details of any previous treatment, was collected before taking the PROM measurements. Mean pain intensity measured with Numerical Pain rating Scale (NPRS) was 6.8 (SD 1.7) and Shoulder Pain and Disability Index (SPADI) was 63.0 (SD 19.3). Four of the included participants had bilateral capsulitis. In these patients we chose the more affected side as the "affected" and the other side as the "non-affected" side.

Testers
Both testers were experienced general practitioners and had experience with measurements of shoulder movements with goniometer from a former pilot study. They had also trained with plurimeter on each other and on patients with shoulder capsulitis before the start of the study. Tester 2 (SS), who is also a physiotherapist, had experience in measurements of ROM. The testers planned beforehand how the measurements were to be carried out and standardised the procedures. The two testers performed ROM-testing in the same order; tester 2 always tested first and tester 1 last. The two testers kept their measurements records confidential and inaccessible to each other throughout the duration of the study until all the data was collected.

Measurements
The Plurimeter-V gravity inclinometer (Plurimeter-V inclinometer; Dr. Rippstein, Zurich, Switzerland) was used in this study to measure PROM for abduction, external rotation and internal rotation. A plurimeter is a handheld instrument for measuring relative angles between surfaces. In the gravity referenced inclinometer the starting position for the measurement is fixed, which is either 0°or 180°. The reliability of this instrument for measuring shoulder and scapular passive range of motion (PROM) has been examined in previous studies [16,27]. The exact technique of measurement was standardised considering the position of the patient, position of the arm in relation to the body, and position of the plurimeter in relation to the arm (Figure 1 a-d). The starting position of the plurimeter was from 0 for every measurement with the instrument on the arm prior to the start of the shoulder movement. This minimizes placement error. To determine the end point of range, the arm was passively moved up to the tolerance level of pain or  when it was not possible to move it further due to stiffness.
The following individual passive movements were measured:

In standing
Passive gleno-humeral abduction (ABD) The patient was in standing position and the tester stood partly behind and partly to the side of the patient to be measured. The scapula was stabilized by the tester holding the inferior angle of the scapula between thumb and index finger of one hand and holding the patients arm just proximal to the patient's elbow while at the same time holding the plurimeter between 2nd and 3rd finger on the dorsal aspect of the upper arm. Care was taken to hold the plurimeter base in a straight line on the upper arm. The arm was then passively abducted. The end point was reached either when the pain was reported as unbearable by the patient or the scapula began to rotate and the examiner could not hold the scapula in place. The reading on the plurimeter was then registered.
Hand behind back (HBB) HBB was measured in centimetres. The patient was in standing position and the distance was measured in centimetres (cm) by placing the patient's hand behind the back as far as it could reach within pain limits with the ventral side of palm facing outwards. The end point was considered to be the highest landmark reached with the upper end of the radius proximal to the wrist. We chose the distal end of radius as the highest landmark to avoid measurement errors involving movement of wrist and thumb. The starting point (0 point) was taken from the posterior inferior iliac spine (PIIS). If the hand did not reach PIIS, the distance to PIIS was denoted in minus centimeters (−cm). In case the hand did not reach medially enough, parallel lines were drawn from the 0 point and the distance between them was measured horizontally. Though a complex movement, this is a pragmatic way of measuring internal rotation of the shoulder joint and is commonly used in clinical situations [16]. Studies have however demonstrated that HBB does not measure the exact range of internal rotation [28,29].

In supine lying
Passive external rotation (ER) in 45°of abduction The patient was lying supine with about 45°of glenohumeral abduction and the elbow was kept at 90°of flexion and the forearm was kept in mid position. The plurimeter was placed between the shaft of radius and ulna in a straight line. The arm was rotated in external rotation. If the arm did not reach 0°, the ROM was noted in minus degrees.

Passive internal rotation (IR) in 45°of abduction
The position of the patient was the same as for measuring ER. The arm was then rotated in internal rotation. The reading on plurimeter was registered at the end point of movement i.e. when the pain was unbearable or the arm could not be moved further.

Statistics
Descriptive statistics with mean measurement including standard deviation (SD) for each movement is presented. Reliability refers to relative agreement as well as absolute measurement error. For calculation of reliability the Intraclass correlation coefficients (ICC model 2.1) was used, as this accounts for both systematic and random error. ICC is a reliability parameter and ranges between 0 and 1, and there are no fixed standards regarding what can be considered as acceptable. According to Eliasziw and co-workers [30], ICC values from 0.60 to 0.79 indicate moderate reliability, values ≥0.80 to 0.90 as very reliable, and >0.90 as excellent. For absolute agreement, which is the actual difference in measurements (i.e. absolute measurement error in degrees and centimeters), the size of measurement error was calculated. Bland and Altman [31] have suggested estimating within-subject standard deviation (s w ), i.e. the common SD of repeated measurements, derived from one-way analysis of variance. Statistical analysis was carried out using the IBM SPSS Statistics version 19, software program.

Results
Mean raw data for measurement of abduction, external and internal rotation and hand behind back are listed for both the affected and the non-affected arm in Tables 2, 3, 4 and 5. Relative agreement is reported with intraclass correlation coefficient, ICC (2.1), and absolute measurement error is reported with within-subject standard deviation, s w, between the two testers. ROM = Range of motion.

Abduction (ABD)
Very good to excellent reliability calculated with ICC 2.1 was found during all three visits, except for the 3rd visit for the normal side ( Table 2). The measurement error (s w) for the affected arm ranged from 6.7°at first visit, 5.9°at the 2nd and 7.2°at the last visit, whereas s w for the non-affected arm was between 1.4 and 2.1°.

External rotation (ER)
Very good to excellent reliability was found at all three visits, except for a moderate reliability shown for the non-affected arm on the third visit ( Table 3).
The measurement error (s w ) for the affected arm ranged between 4.5°, 5.8°and 6.2°. The s w was almost the same for the healthy arm: 6.5°on the first visit and 5.7°on the last.

Internal rotation (IR)
Good reliability calculated with ICCs was found on both the normal and affected side for all three visits, except for moderate reliability (ICC 0.63) shown in the last visit for the normal arm. Measurement error for IR ranged from 5.6°to 7.0°on the affected side and from to 5.1°to 6.2 on the healthy side (Table 4).

Hand behind back (HBB)
Excellent reliability was found on both the normal and the affected side when measuring HBB. The measurement error ranged from 1.6 cm to 1.9 cm on the affected side, and from 1.1 cm to 2.1 cm on the normal side (Table 5).
Graphic scatter plots showed that there were a few outliers (see Figure 2a and b) where the two testers had measured a difference in range of 15°-20°in a couple of patients.

Discussion
This large cohort study demonstrated very good to excellent intertester reliability when examining PROM in patients with shoulder adhesive capsulitis stage II. The results in our study are comparable or better than other reliability studies measuring shoulder ROM in normal individuals or in other shoulder populations (Table 1) [8,13,16,17,20,22,24]. To our knowledge this is the first reliability study that has measured passive ROM in patients with adhesive shoulder capsulitis using a plurimeter, whereas most former studies have used a variety of measuring instruments and techniques. The intertester reliability remained excellent at all three visits for examination of the affected side. The unaffected arm had stable measurements over time, while the affected arm changed over time, possibly due to treatment and/ or general improvement. The measurement errors were found to vary between~5°-7°on all three visits when examining the affected shoulder for passive abduction, and bilaterally when examining external and internal rotation. The measurement error was relatively small when examining abduction (ABD) (~1.5°-2°) in the normal arm and for measuring hand behind back (HBB) in both arms (~1 cm −2 cm) on all three visits. Some of the good results found in our study may be attributed to training of the testers who practiced the procedures on each other and on patients before the start of the study. Better results have been observed due to increased practice earlier [16].
The ICCs values for the affected side were very reliable on all three visits (ICCs ≥ 0.83 -0.96). The measurements on the non-affected side had slightly lower ICC values than the affected side, but only at the third visit. De Winter et al. [17] had an ICC of 0.28 on the nonaffected side and 0.83 on the affected side for ABD, and 0.56 for the non-affected side and 0.90 for the affected side for ER in patients with painful shoulder. Possibly a combination of low spread in scores and low variance has resulted in a low ICC, albeit with a low measurement error as demonstrated in Figure 2 b.
The absolute measurement error found in our study is generally better than the few studies where ROM in the shoulder has been measured. However, no values  have formerly been reported on patients with capsulitis. Hayes et al. [10] found standard error of measurement (i.e. s w ) to range from 14°-25°for flexion, ABD and IR. Kolber et al. [11] reported small s w ranging from approximately 2°-4°in ABD, ER and IR among 30 normal participants. In the study by de Winter et al. [17], 155 participants with shoulder pain were examined, and s w ranged from 14°-20°for ABD and ER. Muir et al. [1] studied a mixed participant group of 17, and the s w ranged from 6°-9°in flexion, ABD, ER and IR in supine lying. The measurement error indicates that some variation must be expected when using ROM as an outcome measure. In our study the affected arm had about 1/3 to ½ of the ROM as compared to the non-affected arm at the first visit. Smallest detectable change SDD (√2 x1.96 = 2.77 s w ) is often used to indicate statistical significant change [32]. An s w of 5°-7°for ER, IR and ABD and an s w of~2 cm for HBB for the affected side would indicate that statistical change larger than the measurement error in the effect study would have to be 14°-19°, and~5.5 cm. In our study, the SDD values for the affected arm were close to statistical significant change above measurement error from the first to the third visit. Range of motion is an important and reliable outcome measure, and a change of ≥15°is necessary to represent a clinically significant change in patients with adhesive capsulitis. Patients with shoulder capsulitis in stage II generally have a large movement reduction and a change of >15°has a positive impact on functionality in activities of daily living. Clinically important change should be defined within a context, and may sometimes be smaller than the SDD [33].
A sample size of 50 participants and measurement of ROM on both the affected and non-affected side constitutes a large sample size for examination of reliability [18]. Inclusion of 50 patients was based on the recommendations made in "Measurement in medicine" [32]. However, since two testers tested both sides three times, a lower sample size would have been sufficient.
Our sample is representative concerning gender (56% female) and age (mean 52 years) for patients with shoulder adhesive capsulitis in stage II [15]. At inclusion, participants in our study were patients with moderate to severe capsulitis. The numerical pain rating scale (NPRS) ranged from 5 to 9, which characterizes moderate to severe pain and may pose problems in measuring ROM. However, the very good to excellent reliability proves otherwise, i.e. measurements were still reliable in patients with moderate to severe painful stiff shoulders corresponding to stage II. The pre-treatment value for pain and function indicated moderate to severe problems (SPADI values varied from 42 to 98, on average 63). The recruited patients had restricted shoulder movement with more than 30% reduction in two of three PROM values and none of the three movements were normal. We chose to only examine passive ABD, ER, IR and HBB as these are the standard movements for diagnosis of shoulder capsulitis and may also be used over time to monitor progression [34,35]. Since pain and stiffness pose particular problems while measuring PROM, for example in finding out the exact end point of movement, measurement of AROM could have been a good supplement. Studies have shown that AROM is more reliable than PROM, probably because the extra pressure from the examiner while measuring PROM may affect the ROM [36,37].
The strength of this study lies in its good power, representativeness of the condition studied and good to excellent results, as well as being the first study that measures intertester reliability in patients with shoulder adhesive capsulitis with plurimeter. Among limitations it may be mentioned that non-randomization of testers may have induced systematic measurement error, as tester 2 may have provoked pain and thus affected the PROM for tester 1. The testers had two criteria, pain and stiffness, for judging the end of movement and this may also have constituted some source of measurement variation, although small. Despite the non-randomised test-procedure our results are very good. Although tester 2, who always tested before tester 1, had a tendency to measure a larger range for external and internal rotation, and mostly for the non-affected arm, findings in our study show an overall very good to excellent reliability for measuring PROM in patients with this condition. This is an important finding because measuring PROM is the diagnostic test for adhesive shoulder capsulitis. Little difference in intertester reliability occurred for the duration of the study (eight weeks). Although an intra-tester reliability study with short time intervals was not performed, our results indicate that we can trust the measurements from one tester at different visits also in an effect study.

Conclusion
Intertester reliability between two testers was very good at three visits over a time period of eight weeks using a plurimeter to measure passive range of motion in patients with adhesive shoulder capsulitis. This method can reliably determine passive range of motion in this patient population and be a reliable outcome measure.