Bone shortening of clavicular fractures: comparison of measurement methods

Background The indication for operative treatment of clavicular fractures with bone shortening over 2 cm is much debated. Correct measurement of clavicular length is essential, and reliable measures of clavicular length are therefore in high demand among clinical decision-makers. The aim of this study was to investigate whether three measurement methods commonly used in research are interchangeable. Methods A retrospective study using radiographs collected as part of a previous study on clavicular fractures. Two independent raters measured clavicle shortening in 60 patients using conventional radiographs in two separate sessions. The two measurement methods described by Hill et al. and Silva et al. were used on unilateral radiographs. Side difference measurements according to Lazarides et al. were made on panoramic radiographs. The measurements were analyzed using intraclass correlation, Weir’s protocol for standard error of measurement (SEM) and minimal detectable change (MDC), and Bland-Altman plots. Results None of the methods were directly interchangeable. The side difference method by Lazarides et al. was the most reliable of the three methods, but had a high proportion of post-fracture bone lengthening, which indicated methodological problems. The Hill et al. and Silva et al. methods had high minimal detectable change, making their use unreliable. Conclusion As all three measurement methods had either reliability or methodological issues, we found it likely that differences in measurement methods have caused the differences in clavicular length observed in scientific studies.


Background
Clavicular fractures are common and represent approximately 2.5-7% [1] of all fractures. Depending on the site and severity of the fracture, they are predominantly treated non-operatively with good results. Various relative operative indications exist for mid-clavicular fractures, one such indication being post-fracture bone shortening above 20 mm [2][3][4][5][6][7][8][9], as conservative treatment has been linked to adverse outcomes in terms of decreased strength and overhead motion of the arm. This is much debated, however, as a handful of studies have not been able to confirm these adverse results [10][11][12][13].
Accurate measurement is essential for the correct classification and treatment of shortening. However, there is no standardized method of measuring shortening, and different methods seem to have been used interchangeably [3,7]. The most commonly used measurement methods can be divided into two concepts: fragment overlap [3,14] and side difference [7]. These two approaches appear to be very different, as they build on different concepts. The fragment overlap methods build on the principle of drawing a perpendicular line between the lateral and medial fragment. Shortening is then defined as the distance from this line to the tip of the upper fragment. The side difference method uses the uninjured clavicle as reference. Shortening is then the difference in length between the uninjured and the injured clavicle.
It is therefore very likely that the measurement method used could influence the conclusion on shortening, and in the end be the cause of the debate about clavicular bone shortening. A previous study comparing different methods to measure shortening of healed clavicular fractures has shown that the estimated shortening varied significantly according to the method used [15]. Whether the estimated shortening on initial radiographs of acute displaced clavicular fractures is influenced by the measurement method is unknown. To investigate whether the choice of measurement method could explain the different conclusions of peer-reviewed studies, we designed a validation study.
The aim of the current study was to compare three methods for measuring acute post-fracture mid-clavicular bone shortening, with the objectives of describing the measurement results of each method and estimating the inter- and intra-observer reliability and the inter-method agreement. The three methods were those of Silva et al. [14] and Hill et al. [3] (both based on the principle of fragment overlap) and Lazarides et al. [7] (based on the principle of side difference).

Ethical considerations
The study was a retrospective comparative study using radiographs that were collected as part of a not yet published study on clavicular fractures [ClinicalTrials.gov Identifier: NCT01483482]. The original study had been approved by the National Danish Data Registry (reference number: 2011-41-6031). Approval by the local ethical committee of the capital region was unnecessary.

Method and power calculation
Of the 105 radiographs from the original study [ClinicalTrials.gov Identifier: NCT01483482], 25 were excluded due to non-accessible x-rays in the database and a further 20 were excluded because of incomplete panorama radiographs. The final study thus included 60 radiographs (60 patients) with acutely displaced clavicular fractures.
In the sample size calculation, s is the standard deviation of the measurement difference between methods and n is the sample size. We wanted to estimate the limits of agreement within a margin of ±2.5 mm. We set s to the normal anatomical standard deviation of clavicle length, approximately 10 mm [16], and found that 48 patients were needed.
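The reported sample size can be checked with a short calculation. The formula itself is not given in the text, so the sketch below assumes the widely used approximation for the standard error of a Bland-Altman limit of agreement, sqrt(3 × s² / n), and treats the ±2.5 mm margin as that standard error. Under these assumptions, s = 10 mm reproduces the stated 48 patients. The function name is hypothetical.

```python
import math


def bland_altman_sample_size(s: float, margin: float) -> int:
    """Smallest n for which sqrt(3 * s**2 / n) <= margin.

    Assumption: the standard error of a limit of agreement is
    approximated by sqrt(3 * s**2 / n); the source text does not
    state which formula was used.
    """
    return math.ceil(3 * s**2 / margin**2)


# s = 10 mm (anatomical SD of clavicle length), margin = 2.5 mm
n_required = bland_altman_sample_size(s=10.0, margin=2.5)
print(n_required)  # 48
```

If the margin were instead interpreted as a 95% confidence half-width (1.96 times the standard error), the required number would be considerably larger, so the interpretation above is the one consistent with the reported figure.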
Two raters measured the radiographs in two separate sessions at least two weeks apart. The raters were experienced junior consultants in orthopedics and trauma medicine who reviewed the original articles for method instructions and agreed on how each measure was obtained. During the sessions, the original studies were consulted for guidance if the raters were in doubt about the methodology, and the raters were blinded to each other's results. The single anterior-posterior radiographic view was used for the methods described by Silva et al. and Hill et al., while the panorama view including both clavicles was used for the method described by Lazarides et al. (Fig. 1). Length was measured in centimeters using the available software (Carestream Health Inc., 150 Verona Street, Rochester, NY 14608) and converted to millimeters by the authors. For each radiograph, date and time were noted to ensure that the same radiograph was measured consecutively.

Analysis
Statistical analysis was performed using Stata software (version 13.1; StataCorp, College Station, Texas). Simple descriptive statistics were used. Measurement distributions were assessed after dividing the measurements into three groups: lengthening (over 0 mm), neutral (between 0 mm and −19 mm) or clinically significant shortening (over 20 mm).
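The three-group division can be expressed as a small helper. This is a minimal sketch, assuming the sign convention implied by the group limits (positive values are lengthening, negative values are shortening) and assuming that the shortening group starts at −20 mm, since the text does not say how values between −19 mm and −20 mm were handled. The function name is hypothetical.

```python
def classify_length_change(change_mm: float) -> str:
    """Assign a measured length change (in mm) to one of three groups.

    Assumed convention: positive = lengthening, negative = shortening.
    Assumed boundary: -20 mm or beyond counts as clinically
    significant shortening; everything else at or below 0 is neutral.
    """
    if change_mm > 0:
        return "lengthening"
    if change_mm <= -20:
        return "shortening"
    return "neutral"


for value in (4.0, 0.0, -12.5, -20.0, -27.0):
    print(value, classify_length_change(value))
```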
Fig. 1 The three methods for measuring post-fracture clavicular length compared in this study. Silva [14]: a line is drawn through the middle of each fragment. From each middle line, a perpendicular line between each fragment is drawn. Bone shortening is defined as the distance between the perpendicular lines on single anterior-posterior view. Hill [3]: a line is drawn from the bottom fragment perpendicular to the top fragment. Bone shortening is defined from the line to the tip of the top fragment on single anterior-posterior view. Lazarides [7]: the length of each clavicle is measured. Bone shortening is defined as uninjured clavicle length minus injured clavicle length on a panorama view.

For comparison of reliability, we used the protocol described by Weir [17]. Inter-rater comparison used the second measurements made by each rater. The bone shortening intra-class correlation (ICC) was calculated for all three methods. A one-way random-effects model was used for intra-observer reliability (one rater), while a two-way random-effects model was used for inter-observer reliability (two raters). The obtained ICC values were used to calculate the standard error of measurement (SEM = SD × √(1 − ICC)), describing the error of each measurement method. Afterwards, the minimal detectable change (MDC = 1.96 × √2 × SEM) was calculated from the SEM to estimate the smallest change each method would be able to detect.
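The SEM and MDC formulas above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the input values (SD 10 mm, ICC 0.75) are made-up numbers chosen for easy arithmetic, not results from this study.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem: float) -> float:
    """MDC = 1.96 * sqrt(2) * SEM (95% confidence level)."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical example values, not taken from the study
sem_example = standard_error_of_measurement(sd=10.0, icc=0.75)
mdc_example = minimal_detectable_change(sem_example)
print(f"SEM = {sem_example:.1f} mm, MDC = {mdc_example:.1f} mm")
```

With these illustrative inputs the MDC is close to 14 mm, which shows how a moderate ICC combined with a large SD quickly yields an MDC of the same order as the 20 mm clinical threshold.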
Agreement between methods was visualized using Bland-Altman plots [18], estimating the convergent validity and limits of agreement with each of the three methods used as reference.

Results
The final study group had a median age of 36.5 years (min. 18, max. 62) and comprised 51 men and 9 women. With all measurements pooled, the mean total length of the injured clavicle on the unilateral radiographs was 160.8 mm (SD 14.9 mm). On the panorama radiographs, with all measurements pooled, the mean length was 168.2 mm (SD 12.8 mm) for the uninjured clavicle and 160.4 mm (SD 14.4 mm) for the injured clavicle.
The plot of the results (Fig. 2) from all 240 measurements (n = 60) from each method showed visually that the side difference method by Lazarides et al. was very different from the two fragment overlap methods, which were more similar to each other. The method by Lazarides et al. also found more patients with lengthening and fewer with shortening over 20 mm. Histograms (Fig. 3) showed that all three methods had a normal distribution pattern for the 240 measurements (n = 60). When the measurements were divided into lengthening (over 0 mm), neutral (between 0 mm and −19 mm) or shortening (over 20 mm), the Silva Fig. 4).

Discussion
In this study comparing three previously described methods for defining bone shortening in acutely displaced mid-clavicular fractures, we found the side difference method by Lazarides et al. to be the most reliable of the three, although it had methodological problems.
Our results indicate that the measurement method chosen is critical to the measurement of post-fracture clavicular bone shortening, and studies involving clavicular bone shortening should be read with this in mind. Until now, different measurement methods have been described as equivalent, but our results show that there is a clear distinction. The simple descriptive statistics, graphs and analyses showed that the side difference method described by Lazarides et al. gave results that were very different from those of the fragment overlap methods described by Silva et al. and Hill et al. This is to be expected, given the difference in measurement concepts.
A similar pattern was seen when analyzing the reliability of the methods. The fragment overlap methods by Hill et al. and Silva et al. were comparable regarding the standard error of measurement (SEM) and minimal detectable change (MDC), but these methods had a very wide MDC. This suggests that their clinical use is unreliable for measurement purposes, as the minimal clinically important length is set to 20 mm and it would not help to change to a more arbitrary limit, e.g. 25 mm. In comparison, the Lazarides et al. side difference method showed a much better SEM and MDC, which should make it more clinically relevant. However, in this study we found methodological issues with the side difference method, as it showed three times as many measurements with post-fracture lengthening of the bone as the fragment overlap methods. Lengthening after a fracture should theoretically be unlikely, as muscle pull would always tend to shorten the bone. These findings are probably a consequence of previously stated methodological issues with the side difference method, as it relies on the concept of bilateral length symmetry within individuals. A previous study has shown that this is not always the case, and that clavicles follow a randomly distributed left and right length difference ranging between 0 and 15 mm [19]. Consequently, the bone lengthening observed in this study could be attributed to an underlying methodological error. Ultimately, even the use of the side difference method by Lazarides et al. is therefore problematic.

Table note: ICC: intra-class correlation. Mean (mm): mean bone shortening in millimeters. SD crude: standard deviation of bone shortening in millimeters. SEM (mm): standard error of measurement in millimeters. MDC (mm): minimal detectable change in millimeters. For inter-rater analysis, the second measurement made by each rater was compared.

The limitations of this study are that measurements were done consecutively and that only two raters were used. Any mistake or misunderstanding of the measurement methods could be aggravated over a larger series of measurements. We tried to avoid this by regularly consulting the original descriptions by Hill et al. and Silva et al. when in doubt. Our intra-class correlation was in fact slightly higher than the results reported by Silva et al. [14], indicating that we were able to minimize this bias. A final limitation was that we had to exclude 45 of 105 available participants due to lack of images or errors on the radiographs. In this particular study, this was not of great importance, as it was the rater agreement that was of interest and we managed to include 60 patients, which was well above the number required by our power calculation. We would not expect to see any significant change in the intra-class correlation if a higher number of participants were included. A strength of this study is that it is the first of its kind to compare intra-class correlation and to quantify SEM and MDC within and between clavicular bone shortening measurement methods.

Conclusion
Our findings show that it is very likely that differences in measurement methods have caused the variation in results from studies on post-fracture clavicular bone shortening. Whether bone shortening results in adverse outcomes is still subject to debate, but if shortening is used as a criterion in a clinical setting, it is important to have a reliable and accurate estimate. Based on our study, the side difference method by Lazarides et al. is the most accurate and reliable. However, as it relies on bilateral symmetry and identified a large proportion of patients as having bone lengthening rather than shortening, its use seems problematic. The two fragment overlap methods (described by Hill et al. and Silva et al.) appeared unreliable, and their use cannot be recommended. In conclusion, our findings raise a new question as to which method should be used when taking both scientific and clinical grounds into consideration.