Intraobserver and interobserver reliability of measures of cervical sagittal rotation
BMC Musculoskeletal Disorders volume 15, Article number: 332 (2014)
Diagnosis and treatment decisions of cervical instability are made, in part, based on the clinician’s assessment of sagittal rotation on flexion and extension radiographs. The objective of this study is to evaluate the intraobserver and interobserver reliability of three measurement techniques in assessing cervical sagittal rotation.
Fifty lateral radiographs of patients with single-level cervical degenerative disc disease were selected and measured on two separate occasions by three spine surgeons, each using three different techniques for measuring cervical sagittal rotation.
Intraclass correlation coefficients were most consistent for Method 2 (ICC 0.93-0.96) followed by Method 1 (ICC 0.88-0.91) and Method 3 (ICC 0.81-0.87). Intraobserver agreement (% of repeated measures within 0.5° of the original measurement) ranged between 76% and 96% for all techniques, with Method 2 showing the best agreement (92%-96%). Paired comparisons between observers varied considerably with interobserver reliability correlation coefficients ranging from 0.54 to 0.89. Method 2 showed the highest interobserver reliability coefficient (0.82, range 0.73-0.88). Method 2 was also more reliable for the classification of “instability”. Intraobserver percent agreements ranged from 94 to 98% for Method 2 versus 84% to 90% for Method 1 and 78% to 86% for Method 3, while interobserver percent agreements ranged from 90% to 98% for Method 2 versus 86% to 94% for Method 1 and 74% to 84% for Method 3.
Method 2 (measuring the angle from the inferior endplate of the vertebra above the degenerative disc and the inferior endplate of the vertebra below the degenerative disc) showed the best intraobserver and interobserver reliability overall in assessing cervical sagittal rotation.
The accuracy of a method can be defined as how close a measured value is to a true value. A reliable measurement should be both accurate and precise, with precision defined by agreement between different observers and agreement for an observer who repeats the measurement several times.
Clinical instability of the cervical spine must be diagnosed accurately to support clinical decision making. Flexion-extension X-rays are commonly used to assess stability of the cervical spine in conditions such as trauma, post-traumatic states, and degeneration [1–10]. Diagnosis and treatment decisions are made, in part, based on the clinician’s assessment of these X-rays. Sagittal translation (>3.5 mm) or segmental angulation (>11°) is typically used to infer instability, and radiographic measurements often play a pivotal role in orthopaedic decision making. The steps used for the analysis of sagittal translation are well described. In contrast, several techniques exist for the assessment of cervical sagittal rotation, and in the absence of a standard technique the measurement can even be regarded as a completely unreliable tool.
The intraobserver and interobserver variability of methods evaluating cervical sagittal rotation has not been studied. A reliability analysis is an essential step in the development of any classification system or treatment algorithm. This assessment gives critical information that often leads to the modification, and thus the improvement, of a proposed classification system or treatment algorithm. Digital measurement has been reported to be precise compared with manual measurement. In this study, three spine surgeons applied three different digital measurement techniques to 50 cases with single-level degenerative disc disease to determine the intraobserver and interobserver variability of cervical sagittal rotation.
Lateral plain radiographs of 50 cases in flexion and extension were retrieved from the institutional digital imaging system and stored on compact discs (CDs) in high-quality tagged image file format (TIFF) for measurement of mobility in the degenerative cervical segment. The inclusion criterion was single-level cervical degenerative disc disease confirmed by MRI. This study was approved by the Human Research Committee of the university, and all subsequent research adhered to the 'Guidelines for Human Research' of the university. Written informed consent was obtained from the patients for the publication of this report and any accompanying images.
Three spine surgeons, each with more than 10 years' experience in spine surgery, performed the measurements on computers using the necessary software. Before performing the experimental measurements, each spine surgeon was trained on the use of the software and the measurement techniques and demonstrated the ability to independently perform the measurements on one pair of flexion-extension radiographs. Each spine surgeon was assigned a set of radiographs and allowed to complete the measurements at his own pace over the course of 4 weeks.
Each pair of digital radiographs was opened using the software. For each image, the three spine surgeons measured the angle on the flexion and extension films according to three methods: Method 1, Method 2, and Method 3. Cervical sagittal rotation was defined as the change in angulation from extension to flexion. After three weeks, the three spine surgeons rated the same set of images again. The images and their order were blinded and randomized on the two occasions. Cervical sagittal rotation (>20°) was considered unstable. Figure 1 illustrates the lines and angles constructed by the computer in Method 1, Method 2 and Method 3 [15–17].
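The definition above (rotation as the change in segmental angle between the extension and flexion films) can be illustrated with a minimal sketch. It assumes each reference line is given as two digitized (x, y) points; the function names, the `line_at` helper, and the point format are hypothetical and do not describe the software used in this study. The 20° default mirrors the cutoff stated in this section and is passed as a parameter so that other cutoffs can be tried.

```python
import math

def line_angle_deg(p1, p2):
    """Orientation of the line through points p1 and p2, in degrees."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))

def segmental_angle_deg(upper_line, lower_line):
    """Signed angle from the upper reference line to the lower one.

    Each line is a pair of (x, y) points. A line has no direction, so the
    difference is folded into the interval [-90, 90) degrees.
    """
    diff = line_angle_deg(*lower_line) - line_angle_deg(*upper_line)
    return (diff + 90.0) % 180.0 - 90.0

def sagittal_rotation_deg(flexion_lines, extension_lines):
    """Absolute change in segmental angle between extension and flexion."""
    flexion = segmental_angle_deg(*flexion_lines)
    extension = segmental_angle_deg(*extension_lines)
    return abs(flexion - extension)

def is_unstable(rotation_deg, threshold_deg=20.0):
    """Classify a segment as unstable when rotation exceeds the cutoff."""
    return rotation_deg > threshold_deg

def line_at(angle_deg, length=10.0):
    """Hypothetical helper: a line through the origin at a given angle."""
    a = math.radians(angle_deg)
    return ((0.0, 0.0), (length * math.cos(a), length * math.sin(a)))
```

For example, `sagittal_rotation_deg((line_at(5), line_at(0)), (line_at(-7), line_at(0)))` combines a segmental angle of -5° in flexion with +7° in extension. Because only the absolute change is used, the result does not depend on whether the image y-axis points up or down.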
Statistical analysis was performed using SPSS 15.0 software (SPSS Inc, Chicago, IL). Three analyses were conducted to assess the reliability of sagittal rotation as a radiographic parameter in the cervical spine. The intraclass correlation coefficient (ICC) was calculated for both interobserver and intraobserver reliability. The intraobserver reliability assessed the reproducibility of each observer for each measurement technique. In this study, each observer measured the same radiograph twice for each technique. The interobserver reliabilities were obtained to assess the overall agreement among the three observers for all methods combined and for each method separately. For the analysis of interobserver reliability, the first measurement of each observer was entered in the ANOVA. The Pearson correlation was evaluated between the average angles estimated with the different measurement techniques by all three observers in the two sessions. A P value of less than 0.05 was considered significant. All reliability estimates were presented with a 95% confidence interval (CI).
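The text does not state which form of the ICC was requested from SPSS. As an illustration only, the sketch below computes ICC(2,1) in the notation of Shrout and Fleiss (two-way random effects, single rater, absolute agreement) from the ANOVA mean squares. The choice of this form and the function name `icc_2_1` are assumptions, not the authors' documented procedure.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    ratings: list of rows, one row per subject and one column per rater.
    """
    n = len(ratings)         # number of subjects (radiograph pairs)
    k = len(ratings[0])      # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    bms = ss_subjects / (n - 1)            # between-subjects mean square
    jms = ss_raters / (k - 1)              # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

# Illustrative ratings only (six targets, four judges), not study data.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
```

In this study's design, an interobserver analysis would pass one row per radiograph pair with three columns (the first measurement of each observer), while an intraobserver analysis would pass two columns (the two sessions of one observer). Confidence intervals are not computed here.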
Fifty lateral radiographs of the cervical spine in flexion and extension were measured by three independent observers on two separate occasions using three different measurement techniques. MRI revealed disc degeneration at C3/4 in 11 cases, at C4/5 in 19 cases, at C5/6 in 8 cases, and at C6/7 in 12 cases.
The mean sagittal rotation was 9.6° (SD, 1.6°) with Method 1, 9.8° (SD, 1.5°) with Method 2, and 10.3° (SD, 1.7°) with Method 3.
Cervical sagittal rotation measurement reliability
Reproducibility for each observer was quite high when comparing each of the three techniques (Table 1). The intraclass coefficient varied from 0.87 to 0.95 for Observer 1, 0.81 to 0.93 for Observer 2, and 0.83 to 0.96 for Observer 3. The intraclass coefficients were most consistent for Method 2 (ICC 0.93-0.96), measuring the angle from the inferior endplate of the vertebra above the degenerative disc to that below the degenerative disc. This was followed by Method 1 (ICC 0.88-0.91), measuring the angle from the inferior endplate of the vertebra above the degenerative disc to the superior endplate of the vertebra below. Method 3 (measuring the angle from the posterior edge of vertebra above the degenerative disc to that below the degenerative disc) produced the lowest intraclass coefficients of the three methods.
Intraobserver agreement (percent of repeated measures within 0.5° of the original measurement) ranged from 76% to 96% for each technique for all three observers. The confidence interval was set at 95% (Table 2). Once again, the most consistent results overall were obtained with Method 2 (94%, 92%, and 96%). Method 3 showed the least agreement.
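The agreement statistic defined above can be sketched as follows. This is a minimal illustration that assumes the 0.5° tolerance is inclusive; the function name and the sample values in the usage note are hypothetical.

```python
def percent_within(first_session, second_session, tolerance_deg=0.5):
    """Percent of repeated measurements within a tolerance of the original.

    Both arguments are equal-length lists of rotation angles in degrees,
    ordered so that the same index refers to the same radiograph pair.
    """
    if not first_session or len(first_session) != len(second_session):
        raise ValueError("sessions must be non-empty and of equal length")
    hits = sum(
        1
        for a, b in zip(first_session, second_session)
        if abs(a - b) <= tolerance_deg  # inclusive boundary is an assumption
    )
    return 100.0 * hits / len(first_session)
```

With the made-up sessions `[9.0, 10.4, 8.0, 12.0]` and `[9.25, 10.5, 9.0, 12.25]`, three of the four repeated measurements fall within 0.5° of the original, so the function returns 75.0.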
Using intraclass correlation coefficients for each measurement technique, paired comparisons between observers varied considerably (Table 3). Method 2 had the best interobserver reliability (ICC 0.82, CI: 0.74-0.89) and was the only method acceptable by statistical standards (0.80) as all other techniques fell well below this standard. Method 1 and Method 3 were consistently poor.
There was a statistically significant correlation between Method 1 and Method 2 (r = 0.982, P < 0.05), between Method 1 and Method 3 (r = 0.953, P < 0.05), and between Method 2 and Method 3 (r = 0.945, P < 0.05).
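For reference, the Pearson coefficient reported above can be computed as in the following sketch, which uses only the standard library. The function name and the sample values in the usage note are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A high correlation between two methods shows that their values rise and fall together, not that they are interchangeable. For instance, `pearson_r([8.0, 9.5, 11.0, 12.5], [8.6, 10.1, 11.6, 13.1])` is 1 up to floating-point rounding even though the second list is uniformly 0.6° larger.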
Reliability of instability classification
Intraobserver reliability for the classification of instability was substantially better for Method 2 compared with the other two measurement techniques. The percentage agreement between the two ratings of instability was 98%, 94%, and 96% for Method 2, 88%, 90%, and 84% for Method 1, 86%, 82%, and 78% for Method 3 (Table 4).
Method 2 also demonstrated substantially higher interobserver reliability for the classification of instability. Interobserver percent agreements ranged from 90% to 98% for Method 2 versus 86% to 94% for Method 1 and 74% to 84% for Method 3. The overall percentage agreement for the three sets of rater pairs was 96% for Method 2, 90% for Method 1, and 82% for Method 3 (Table 5).
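The pairwise percent agreement on the instability classification can be sketched as below. The text does not say how the overall figure across the three rater pairs was derived, so taking the mean of the pairwise values is an assumption, as are the function names; the cutoff is passed explicitly rather than fixed.

```python
from itertools import combinations

def classify(rotations_deg, threshold_deg):
    """Label each segment unstable (True) when rotation exceeds the cutoff."""
    return [r > threshold_deg for r in rotations_deg]

def percent_agreement(labels_a, labels_b):
    """Percent of cases on which two label lists agree."""
    same = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return 100.0 * same / len(labels_a)

def pairwise_agreement(rotations_by_rater, threshold_deg):
    """Percent agreement for every pair of raters, keyed by rater names."""
    labels = {
        name: classify(values, threshold_deg)
        for name, values in rotations_by_rater.items()
    }
    return {
        (a, b): percent_agreement(labels[a], labels[b])
        for a, b in combinations(sorted(labels), 2)
    }

def overall_agreement(pairwise):
    """Mean of the pairwise agreements (an assumed summary, see lead-in)."""
    return sum(pairwise.values()) / len(pairwise)
```

The same functions cover the intraobserver case: passing the two sessions of one observer as two "raters" yields that observer's percent agreement with his own earlier classification.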
This study demonstrated that digital measurement was precise and that Method 2 was the most reliable and least variable measurement technique. The intraobserver and interobserver reliability were markedly higher for Method 2 than for the other two methods. As expected, intraobserver reliability tended to be higher than interobserver reliability.
One of the more popular measurement techniques is Method 1. This method measures from the inferior endplate of the vertebra above the degenerative disc to the superior endplate of the vertebra below. Our study found that this method is variable. This appears to be secondary to including a smaller area over which to measure, which magnifies differences between measurements. Method 2 appeared to have the best interobserver reliability. One possible explanation is that the inferior endplate of the vertebra below the degenerative disc is easier to identify. Taylor et al. reported that interobserver agreement (kappa = 0.17) was poor with methods routinely used in clinical practice, and computer-assisted analysis improved agreement (kappa = 0.77). To date, there is no universal agreement on how to measure cervical segmental angulation. A reliable, reproducible measurement technique is imperative to provide meaningful interstudy evaluation and comparison. This study indicates that measuring from the inferior endplate of the vertebra above the degenerative disc to the inferior endplate of the vertebra below the degenerative disc is the most consistent in terms of intraobserver and interobserver reliability. With the inherent limitations of any radiographic measurement recognized, cervical stability may be evaluated reliably.
Many authors have suggested the need for instrumented fusion if instability is present, indicating that clinical decisions could be influenced by measurements of sagittal rotation. We therefore compared intra- and interobserver agreement on the classification of sagittal instability, using the criterion of 10° of rotation. Observers using Method 2 agreed with their own ratings of instability 90% to 98% of the time, compared with 76% to 80% agreement for Method 3. This suggests that on two separate occasions a surgeon could arrive at different treatment decisions based on the same flexion-extension radiographs up to 22% of the time because of the imprecision of Method 3. Use of Method 2 would likely reduce this rate of disagreement to less than 10%. The pattern of interobserver agreement on instability was similar, with the percent agreement ranging from 90% to 98% for Method 2, compared with 70% to 80% for Method 3. This indicates that two different surgeons evaluating the same radiographs could arrive at different treatment decisions up to 30% of the time using Method 3 compared with 10% of the time using Method 2.
A reliable, reproducible methodology for evaluating stability in a cervical segment is important, because it determines the modality of management. If instability is found, surgical management will be preferred. Otherwise, conservative treatment should be taken into consideration. Certain radiographic measurements, such as sagittal translation and segmental angulation in flexion and extension, are used to evaluate stability in the cervical spine. Measurement parameters should be critically examined for both validity and reliability before they are adopted into clinical practice. To ensure that a parameter is applicable for all practitioners caring for patients with cervical instability, one of the first key exercises is to demonstrate its reliability. This can be a complex exercise, often appearing cumbersome and lengthy. Yet statistical analysis of this type is integral to the adoption of any treatment algorithm. This study was such a statistical exercise, determining the reliability of these parameters in the assessment of sagittal rotation in the cervical spine.
This study has one limitation. Although C5/6 is the level most commonly involved in cervical degeneration, our study included more cases with C4/5 degeneration and fewer with C5/6 degeneration.
This study demonstrated that intraobserver reliability was more consistent than interobserver reliability, regardless of the method used. Method 2 had the best overall intraobserver and interobserver reliability in assessing cervical sagittal rotation.
Kristjansson E, Leivseth G, Brinckmann P, Frobin W: Increased sagittal plane segmental motion in the lower cervical spine in women with chronic whiplash-associated disorders, grades I-II: a case–control study using a new measurement protocol. Spine (Phila Pa 1976). 2003, 28: 2215-2221. 10.1097/01.BRS.0000089525.59684.49.
Dvorak J, Antinnes JA, Panjabi M, Loustalot D, Bonomo M: Age and gender related normal motion of the cervical spine. Spine. 1992, 17: S393-S398. 10.1097/00007632-199210001-00009.
Hino H, Abumi K, Kanayama M, Kaneda K: Dynamic motion analysis of normal and unstable cervical spines using cineradiography. An in vivo study. Spine (Phila Pa 1976). 1999, 24: 163-168. 10.1097/00007632-199901150-00018.
Buonocore E, Hartman JT, Nelson CL: Cineradiograms of cervical spine in diagnosis of soft-tissue injuries. JAMA. 1966, 198: 143-147.
Dvorak J, Panjabi MM, Grob D, Novotny JE, Antinnes JA: Clinical validation of functional flexion/extension radiographs of the cervical spine. Spine. 1993, 18: 120-127. 10.1097/00007632-199301000-00018.
Dimnet J, Pasquet A, Krag MH, Panjabi MM: Cervical spine motion in the sagittal plane: kinematic and geometric parameters. J Biomech. 1982, 15: 959-969. 10.1016/0021-9290(82)90014-8.
Knopp R, Parker J, Tashjian J, Ganz W: Defining radiographic criteria for flexion-extension studies of the cervical spine. Ann Emerg Med. 2001, 38 (1): 31-35. 10.1067/mem.2001.114319.
Dai L: Disc degeneration and cervical instability. Correlation of magnetic resonance imaging with radiography. Spine. 1998, 23: 1734-1738. 10.1097/00007632-199808150-00005.
Brown T, Reitman CA, Nguyen L, Hipp JA: Intervertebral motion after incremental damage to the posterior structures of the cervical spine. Spine. 2005, 30: E503-E508. 10.1097/01.brs.0000176245.46965.e8.
Subramanian N, Reitman CA, Nguyen L, Hipp JA: Radiographic assessment and quantitative motion analysis of the cervical spine after serial sectioning of the anterior ligamentous structures. Spine. 2007, 32: 518-526. 10.1097/01.brs.0000256449.95667.13.
Harris MB, Kronlage SC, Carboni PA, Robert KQ, Menmuir B, Ricciardi JE, Chutkan NB: Evaluation of the cervical spine in the polytrauma patient. Spine. 2000, 25: 2884-2891. 10.1097/00007632-200011150-00008.
Wu SK, Jou JY, Lee HM, Chen HY, Su FC, Kuo LC: The reproducibility comparison of two intervertebral translation measurements in cervical flexion-extension. Spine J. in press
Kuklo TR, Potter BK, Schroeder TM, O'Brien MF: Comparison of manual and digital measurements in adolescent idiopathic scoliosis. Spine. 2006, 31: 1240-1246. 10.1097/01.brs.0000217774.13433.a7.
White AA, Panjabi MM: The problem of clinical instability in the human spine: a systematic approach. Clinical Biomechanics of the Spine. Edited by: White AAIII, Panjabi MM. 1990, Philadelphia: J.B. Lippincott, 277-378. 2nd edition.
Penning L: Normal movements of the cervical spine. AJR Am J Roentgenol. 1978, 130: 317-326. 10.2214/ajr.130.2.317.
Harrison DE, Harrison DD, Cailliet R, Troyanovich SJ, Janik TJ, Holland B: Cobb method or Harrison posterior tangent method: which to choose for lateral cervical radiographic analysis. Spine. 2000, 25: 2072-2078. 10.1097/00007632-200008150-00011.
Mofidi A, Tansey C, Mahapatra SR, Mirza HA, Eisenstein SM: Cervical spondylolysis, radiologic pointers of stability and acute traumatic as opposed to chronic spondylolysis. J Spinal Disord Tech. 2007, 20: 473-479. 10.1097/BSD.0b013e31803bbb43.
Shrout PE, Fleiss JL: Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979, 86: 420-428.
Taylor M, Hipp JA, Gertzbein SD, Gopinath S, Reitman CA: Observer agreement in assessing flexion-extension X-rays of the cervical spine, with and without the use of quantitative measurements of intervertebral motion. Spine J. 2007, 7: 654-658. 10.1016/j.spinee.2006.10.017.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2474/15/332/prepub
This research was supported by a grant XYQ2011026 from Shanghai Municipal Commission of Health and Family Planning. The authors are indebted to all patients who participated in the study.
The authors declare that they have no competing interests.
SDJ was responsible for data collection, data analysis and drafting of the manuscript. JWC contributed to the design, critical revision and drafting of the manuscript. YHY helped with the design and the statistics. XDC contributed to the study design and critical revision. LSJ helped with data collection, data analysis and critical revisions and drafting of the manuscript. All of the authors have read and approved the final manuscript.