Visual detection of cortical breaks in hand joints: reliability and validity of high-resolution peripheral quantitative CT compared to microCT
BMC Musculoskeletal Disorders volume 17, Article number: 271 (2016)
To study the reliability and validity of high-resolution peripheral quantitative CT (HR-pQCT) with microCT (μCT) as gold standard in the visual detection of cortical breaks in metacarpophalangeal (MCP) and proximal interphalangeal (PIP) joints.
Ten cadaveric fingers (10 MCP and 9 PIP joints) were imaged by HR-pQCT and μCT and visually analyzed by two independent readers. Intra- and interreader reliability were evaluated for the presence (yes/no, kappa statistics) and the total number (intraclass correlation coefficient, ICC) of cortical breaks. Sensitivity, specificity, positive and negative predictive value (PPV respectively NPV) of HR-pQCT in detecting cortical breaks were calculated.
With HR-pQCT, a mean of 129 cortical breaks was identified and with μCT a mean of 149 (p < 0.05). Intrareader reliability for the presence of a cortical break per quadrant was 0.52 (95 % CI 0.48–0.56) and 0.71 (95 % CI 0.67–0.75) for HR-pQCT and μCT, respectively, and for the total number of cortical breaks 0.61 (95 % CI 0.49–0.70) and 0.75 (95 % CI 0.68–0.82). Interreader reliability for the presence of a cortical break per quadrant was 0.37 (95 % CI 0.33–0.41) and 0.45 (95 % CI 0.41–0.49) for HR-pQCT and μCT, respectively, and for the number of cortical breaks 0.55 (95 % CI 0.43–0.65) and 0.54 (95 % CI 0.35–0.67). Sensitivity, specificity, PPV and NPV of HR-pQCT were 81.6, 64.0, 81.6, and 64.0 %, respectively.
Cortical breaks were commonly visualized in MCP and PIP joints with HR-pQCT and μCT. Reliability of both HR-pQCT and μCT was fair to moderate. HR-pQCT was highly sensitive to detect cortical breaks with μCT as gold standard.
Peri-articular cortical breaks are one of the characteristic features of bone involvement in rheumatoid arthritis (RA) and predictors of further radiographic progression [1, 2]. Early detection of cortical breaks is an important indicator for intensifying treatment in order to modify the disease course. In daily clinical practice, conventional radiographs (CR) are considered the gold standard for detection of cortical breaks in the hand joints in rheumatic diseases. CR is widely available, fast to perform, relatively cheap, and extensively validated; however, its sensitivity to detect structural bone changes is low compared to computed tomography (CT), MRI and ultrasound [4–7]. A novel, sensitive imaging technique is high-resolution peripheral quantitative computed tomography (HR-pQCT). HR-pQCT allows analysis of the cortical and trabecular microarchitecture of peripheral bones with an isotropic resolution of 82 micrometers (μm). This technique is now also applied for 3D assessment of the bone microarchitecture in the hand joints [8–10]. A study by Stach et al. has demonstrated that HR-pQCT is more sensitive than CR in detecting cortical breaks in the hand joints in RA and also in healthy controls. However, the resolution of the HR-pQCT images can be of the same order as the thickness of the cortical bone in finger joints. Due to partial volume effects, it is possible that thin cortices are falsely identified as breaks. Therefore, in particular with thin cortices, the reliability, sensitivity and specificity of the measurements might be impaired and depend on the reader’s perception.
The aims of this study were (1) to investigate the intra- and interreader reliability, and (2) to determine the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of HR-pQCT in detecting cortical breaks in hand joints, using μCT images with a much higher resolution (18 μm) as gold standard. We were particularly interested in the methodology of identifying cortical breaks by HR-pQCT, not in the clinical value of these cortical breaks. We hypothesized that HR-pQCT is a reliable and sensitive imaging method for identifying cortical breaks in hand joints compared to μCT.
For this study, we used cadaveric specimens, because μCT imaging can only be performed in vitro. MCP and PIP joints of ten female right-hand human cadaveric index fingers were imaged by both HR-pQCT and μCT. The donors had dedicated their body by testament signed during life to the Department of Anatomy and Embryology of the University of Amsterdam, the Netherlands. The fingers were fixed in formalin.
HR-pQCT and μCT image acquisition
HR-pQCT (XtremeCT1, Scanco Medical AG, Switzerland) scans were performed at clinical in vivo settings, i.e., at 60 kVp tube voltage, 900 μA tube current, 100 ms integration time and 82 μm voxel size. μCT (μCT 80, Scanco Medical AG, Switzerland) scans were performed at 70 kVp tube voltage, 114 μA tube current, 300 ms integration time and 18 μm voxel size. On HR-pQCT, the region of interest covered a length of 18.04 mm (220 slices) for the MCP joint and 9.02 mm (110 slices) for the PIP joint.
On μCT, the region of interest covered a length of 15.26 mm (848 slices) for the MCP joint and 9.45 mm (525 slices) for the PIP joint (Additional file 1: Figure S1) (see supplementary data available at BMC Musculoskeletal Disorders online).
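The reported scan lengths follow directly from the number of slices and the isotropic voxel size. The short Python check below is an illustrative sketch only; the function and variable names are ours and not part of the scanner software.

```python
# Scan length along the stack axis = number of slices x isotropic voxel size.
# Names are hypothetical; slice counts and voxel sizes are those reported above.

def scan_length_mm(n_slices: int, voxel_um: float) -> float:
    """Return the length covered by a stack of slices, in millimetres."""
    return n_slices * voxel_um / 1000.0

# (modality, joint) -> (number of slices, voxel size in micrometres)
protocols = {
    ("HR-pQCT", "MCP"): (220, 82.0),
    ("HR-pQCT", "PIP"): (110, 82.0),
    ("uCT", "MCP"): (848, 18.0),
    ("uCT", "PIP"): (525, 18.0),
}

lengths = {key: round(scan_length_mm(n, v), 2) for key, (n, v) in protocols.items()}

for (modality, joint), length in lengths.items():
    print(f"{modality} {joint}: {length} mm")
```

The computed values reproduce the 18.04, 9.02, 15.26 and 9.45 mm stated in the text.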
HR-pQCT and μCT image analysis
Scans of HR-pQCT and μCT were exported in Digital Imaging and Communications in Medicine (DICOM) format and analyzed using Osirix (v.5.8.5 64-bit) multiplanar DICOM viewer. Differences in the extent of the scanned areas as well as in joint angles were noticed because the fingers were scanned horizontally on HR-pQCT and vertically on μCT. Corresponding first and last slices of the overlapping region were visually determined to ensure that the same region of interest was used in the detection of cortical breaks on both imaging modalities.
Two trained readers (AS and MP) independently scored the HR-pQCT and μCT images visually for the presence of cortical breaks. Before the current dataset was read, the readers had received extensive training from the Study grouP for xtrEme Computed Tomography in Rheumatoid Arthritis (SPECTRA) and had gained additional reading experience. Readers were aware of the hypothesis of this study. The images were not anonymized. HR-pQCT images were scored first, independently of the μCT images, with at least one day in between, and with a two-week interval before the rescoring by Reader 1.
A cortical break was defined as a clear disruption of the cortex, seen on two consecutive slices on two orthogonal planes (on transverse and on sagittal or coronal plane) on HR-pQCT, and similarly, but on nine consecutive slices on μCT to cover the same area as evaluated by HR-pQCT (Additional file 2: Figure S2).
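The slice-count rule in this definition can be sketched in code. In the study the breaks were scored visually; the Python fragment below is only an illustration of the criterion, under the simplifying assumption that a reader has already marked, per slice and per plane, whether the cortex appears disrupted. All names are hypothetical, and the sketch does not check that the runs in the two planes belong to the same lesion.

```python
HRPQCT_VOXEL_UM = 82.0   # reported HR-pQCT voxel size
UCT_VOXEL_UM = 18.0      # reported microCT voxel size
HRPQCT_MIN_SLICES = 2    # consecutive slices required on HR-pQCT

# Physical extent covered by the HR-pQCT criterion, and the microCT slice
# count that covers approximately the same extent.
extent_um = HRPQCT_MIN_SLICES * HRPQCT_VOXEL_UM      # 164 micrometres
uct_min_slices = round(extent_um / UCT_VOXEL_UM)     # 164 / 18 is about 9.1


def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def meets_definition(plane_a_flags, plane_b_flags, min_slices):
    """True if a disruption spans at least min_slices consecutive slices
    in both orthogonal planes (hypothetical per-slice reader marks)."""
    return (longest_run(plane_a_flags) >= min_slices
            and longest_run(plane_b_flags) >= min_slices)


# A disruption seen on eight consecutive microCT slices in one plane does
# not qualify, whereas nine consecutive slices in both planes does.
eight_slices = meets_definition([True] * 8, [True] * 9, uct_min_slices)
nine_slices = meets_definition([True] * 9, [True] * 9, uct_min_slices)
```

Because the threshold is a hard cut-off, a difference of a single slice between two readers changes the classification of a lesion.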
To assess the location of the breaks in each joint, the transverse plane was divided into four quadrants: palmar, ulnar, dorsal and radial (Additional file 3: Figure S3). The phalangeal base and metacarpal head of the MCP joints, and proximal phalanx and distal phalanx of the PIP joints were separately assessed. In total, eight quadrants per joint were analyzed: four in the proximal bone and four in the distal bone of the same joint. Each joint was systematically analyzed per quadrant. Quadrants with large discrepancies between the readers (i.e., a difference of more than four breaks) were re-examined to identify reasons for discrepancy. In addition, total volumetric bone mineral density (vBMD) of the specimens was calculated using HR-pQCT.
Descriptive analyses were done to calculate the total number of cortical breaks scored by the readers per quadrant for each imaging modality.
The difference in the total number of cortical breaks detected with HR-pQCT versus μCT was tested for statistical significance with the Wilcoxon signed-rank test. Intra- and interreader reliability were calculated using Cohen’s Kappa (k) and the intraclass correlation coefficient (ICC) with a two-way random model and absolute agreement. The k value was calculated for the presence (yes/no) of a cortical break per quadrant and ICC values were calculated for the total number of cortical breaks per quadrant. k and ICC were calculated on the level of all available quadrants. Kappa values were also recalculated with correction for potential prevalence and bias effects (Prevalence-Adjusted Bias-Adjusted Kappa, PABAK) [11, 12]. Reliability was rated according to Landis et al.: <0.00 poor, 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, 0.81–1.00 almost perfect. Sensitivity, specificity, PPV, and NPV of HR-pQCT in the detection of cortical breaks were calculated with μCT as gold standard. The mean value of the two readings of Reader 1 (AS) was used for this purpose. Statistical analyses were performed with SPSS Statistics for Windows version 23.0 (IBM Corp., Armonk, NY).
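For readers less familiar with these agreement statistics, the following Python sketch shows how Cohen's kappa and PABAK can be computed from a 2 × 2 table of yes/no scores from two readings. The analyses in this study were run in SPSS; the function names and the example counts below are hypothetical and serve only to illustrate the standard formulas.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 agreement table.

    a: both readings 'yes'           b: reading 1 'yes', reading 2 'no'
    c: reading 1 'no', reading 2 'yes'   d: both readings 'no'
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Agreement expected by chance from the marginal totals.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


def pabak(a: int, b: int, c: int, d: int) -> float:
    """Prevalence-adjusted bias-adjusted kappa: 2 * observed agreement - 1."""
    n = a + b + c + d
    return 2.0 * (a + d) / n - 1.0


# Hypothetical counts for 152 quadrants (not data from this study).
example = dict(a=70, b=20, c=25, d=37)
example_kappa = cohen_kappa(**example)
example_pabak = pabak(**example)
print(f"kappa = {example_kappa:.2f}, PABAK = {example_pabak:.2f}")
```

PABAK depends only on the observed agreement, which is why it is used as a check on whether a kappa value is being distorted by unbalanced prevalence or by systematic differences between readers.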
The mean age of the donors was 85.1 ± 9.6 years; the medical history was unknown. Average vBMD was 203 mgHA/cm3 for MCP joints, 293 mgHA/cm3 for PIP joints and 245 mgHA/cm3 for the total joints. The scans of ten MCP and nine PIP joints with in total 152 quadrants were available for analysis. One PIP joint could not be evaluated due to a missing μCT scan. Furthermore, Reader 1 considered the quality of the images of the metacarpal head in one MCP joint as too low on μCT due to a protocol error during scanning. Therefore, four quadrants were excluded in the analyses of the μCT images.
Table 1 shows the total number of cortical breaks each reader found on HR-pQCT and μCT images. The differences in scores between the first and second reading of Reader 1 were not statistically significant on HR-pQCT (139 vs 118 breaks, p = .064) and μCT (142 vs 156 breaks, p = .163). However, the difference in the mean score between HR-pQCT versus μCT was statistically significant (respectively, 129 and 149 breaks, p < 0.05). The difference in the total number of cortical breaks on HR-pQCT scored by Reader 1 (first reading) versus Reader 2 was not statistically significant (respectively, 139 versus 151 breaks, p = .288). On μCT, Reader 2 found significantly more breaks than Reader 1 (first reading) (241 vs 142 breaks, p < .001). In total, four quadrants with large discrepancies between the readers were re-examined. Several reasons for discrepancy were identified:
On μCT we defined a cortical break when present on nine consecutive slices. Sometimes Reader 1 observed the cortical break on eight consecutive slices, hence not considering it a break, whereas Reader 2 observed it on nine consecutive slices, thereby fulfilling the criteria for a break.
The smaller the break, the less agreement between the readers. An example of this discrepancy is shown in Additional file 4: Figure S4 panel A and B.
Due to the low bone mineral density and thin cortices, there was low contrast in some cases (example in Additional file 4: Figure S4 panel C).
Reader 1 considered a break as one large break, whereas Reader 2 counted several small cortical breaks inside the same large break (example in Additional file 4: Figure S4 panel D).
Table 2 shows the intra- and interreader reliability based on the 152 quadrants on HR-pQCT and 148 quadrants on μCT that were evaluated. Intrareader reliability was moderate to substantial for the presence of breaks (HR-pQCT: k = 0.52 and μCT: k = 0.71) and for the number of breaks (HR-pQCT: ICC = 0.61 and μCT: ICC = 0.75).
Interreader reliability was fair to moderate for the presence of breaks (HR-pQCT: k = 0.37 and μCT: k = 0.45) and for the number of breaks (HR-pQCT: ICC = 0.55 and μCT: ICC = 0.54). The values of PABAK were comparable (Table 2).
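The verbal ratings used here follow from applying the Landis and Koch bands to the reported coefficients. The Python sketch below makes that mapping explicit; the function name and dictionary are ours, and applying the same bands to ICC values follows the approach of this article rather than a general rule.

```python
# Upper bound of each Landis and Koch band and its label.
LANDIS_KOCH_BANDS = [
    (0.20, "slight"),
    (0.40, "fair"),
    (0.60, "moderate"),
    (0.80, "substantial"),
    (1.00, "almost perfect"),
]


def landis_koch(value: float) -> str:
    """Map an agreement coefficient to its Landis and Koch label."""
    if value < 0.0:
        return "poor"
    for upper, label in LANDIS_KOCH_BANDS:
        if value <= upper:
            return label
    raise ValueError("agreement coefficients cannot exceed 1.0")


# Coefficients as reported in the text and Table 2.
reported = {
    "intrareader kappa, HR-pQCT": 0.52,
    "intrareader kappa, uCT": 0.71,
    "intrareader ICC, HR-pQCT": 0.61,
    "intrareader ICC, uCT": 0.75,
    "interreader kappa, HR-pQCT": 0.37,
    "interreader kappa, uCT": 0.45,
    "interreader ICC, HR-pQCT": 0.55,
    "interreader ICC, uCT": 0.54,
}

ratings = {name: landis_koch(value) for name, value in reported.items()}
for name, label in ratings.items():
    print(f"{name}: {reported[name]:.2f} ({label})")
```

With these bands, the intrareader values fall in the moderate and substantial categories and the interreader values in the fair and moderate categories, matching the wording above.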
Sensitivity, specificity, PPV and NPV of HR-pQCT in the detection of cortical breaks with μCT as gold standard were calculated with the mean scores of Reader 1 (reading 1 and 2) and the score of Reader 2. For Reader 1, the sensitivity was 81.6 %, specificity 64 %, PPV 81.6 %, and NPV 64 %; for Reader 2, the sensitivity was 68.9 %, specificity 69.4 %, PPV 82.6 % and NPV 51.5 % (Table 3 and Additional file 5).
In Fig. 1, several examples of cortical breaks on corresponding HR-pQCT and μCT images are presented. Panels A and B show a cortical break on both HR-pQCT and μCT. In panel C, a discontinuity of the cortex is found on HR-pQCT. However, it did not meet the definition of a cortical break applied in this study, because it was visible on one slice only, leading to discrepancy with the results from μCT. In panel D, a cortical break was detected on HR-pQCT, but not on μCT, where a thin cortical lining was seen.
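The four accuracy measures are all derived from one 2 × 2 contingency table per reader. The Python sketch below shows the calculation. The counts used are an assumption: they were chosen by us so that they sum to the 148 quadrants evaluated on μCT and reproduce the percentages reported for Reader 1; the actual tables are given in Additional file 5.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy of an index test (HR-pQCT) against a gold standard (microCT).

    tp: break on both modalities     fp: break on HR-pQCT only
    fn: break on microCT only        tn: no break on either modality
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Illustrative counts only (assumed, not taken from Additional file 5).
illustrative_counts = dict(tp=80, fp=18, fn=18, tn=32)
metrics = diagnostic_metrics(**illustrative_counts)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
print(percentages)
```

The example also shows why sensitivity equals PPV and specificity equals NPV for Reader 1: this happens whenever the number of false positives equals the number of false negatives.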
This study is the first that reports on aspects of reliability and validity of detecting cortical breaks in hand joints using HR-pQCT with μCT as gold standard. Cortical breaks were found in all joints with both imaging modalities. Intrareader reliability of HR-pQCT and μCT was moderate to substantial, while interreader reliability was fair to moderate. The sensitivity of HR-pQCT in detecting cortical breaks was high (81.6 %).
In our study, only cortical discontinuities meeting the definition of a break, i.e., clearly visible cortical interruptions on at least 2 consecutive HR-pQCT (or 9 consecutive μCT) slices in two planes, were scored as a cortical break. They may have different pathological or physiological backgrounds, such as erosions or vascular channels [8, 9]. A formal classification system for defining breaks visualized on HR-pQCT and μCT is lacking. Histological examination is needed to provide more insight into the nature of these cortical breaks and for developing definitions.
In a previous study by Stach et al., almost perfect intrareader and substantial interreader reliability were reported using HR-pQCT for grading bone lesions and discrimination between healthy individuals and RA patients (k = 0.82 and k = 0.75, respectively), but reliability on the presence and number of cortical breaks was not reported. The precision in scoring abnormalities visualized with several imaging techniques varies widely, even by experienced readers, as has been demonstrated for example for scoring radiographs in RA (ICC ranged from 0.65 to 0.99). In general, lower values for interreader reliability in comparison with intrareader reliability are reported, corresponding to our findings. In our study, the breaks were scored visually, which is reader dependent. An automatic scoring algorithm that detects breaks according to predefined definitions could potentially improve reliability by minimizing reader interventions.
We investigated the sensitivity of HR-pQCT in detecting cortical breaks with μCT as the gold standard and found a high sensitivity (81.6 %). Unfortunately, no comparative studies are available. In contrast, two studies used HR-pQCT as the reference method for investigating the sensitivity to detect cortical breaks of other imaging modalities [9, 16]. These studies reported a sensitivity of 85.7 % for MRI, 60.9 % for CR, and 83–100 % for ultrasound with HR-pQCT as the reference method [9, 16]. We found a lower specificity of HR-pQCT in detecting cortical breaks (64 %) in comparison to sensitivity. A possible explanation for this could be a phenomenon attributed to a partial volume effect leading to a reduced cortical signal on HR-pQCT, giving the impression that a cortical break is present, whereas on μCT the cortex is intact. An example of this is shown in Fig. 1, panel D.
There are several limitations of this study. First, we evaluated whether the total number of breaks counted per quadrant corresponded between the two imaging modalities, but did not consider correspondence in exactly the same location. This might have led to an overestimation of the reliability. Second, we used fingers from cadaver specimens with unknown medical history and a relatively high mean age (85.1 years). Due to the old age of the donors and the preservation in formalin, the cortices might have become less mineralized. The average vBMD of the specimens was 245 mgHA/cm3, which is some 20 % lower than the average in the normal population (>300 mgHA/cm3) [10, 18]. This may hamper the scoring of a cortical break on HR-pQCT. It is also possible that thin regions were falsely identified as a cortical break. However, the use of cadaveric specimens was essential, as in vivo human subjects cannot be measured by μCT because of the long scanning time. Third, the cadaveric specimens had slightly different orientations in the HR-pQCT versus the μCT scanner. Despite the careful visual matching of the regions of interest on HR-pQCT and μCT, the angle at which the transverse images were viewed was slightly different in some joints and a cortical break might therefore have been missed. Fourth, a discrepancy between the readers regarding the number of cortical breaks identified on μCT was noticed. μCT images provide much detail, and in particular very small cortical interruptions were not always picked up by Reader 1. This indicates that, when visually analyzing μCT images, more stringent definitions are necessary than when using HR-pQCT because of the higher resolution.
Cortical breaks were commonly visualized in hand joints with HR-pQCT and μCT. Reliability of both HR-pQCT and μCT was fair to moderate. HR-pQCT was sensitive to detect cortical breaks with μCT as gold standard. In spite of the limitations of our study, including the discrepancy of μCT results between the readers, we have shown that HR-pQCT is highly sensitive to detect cortical breaks with a fair to moderate reliability compared to μCT. Our findings need further evaluation, preferably with focus on histological analyses to clarify the nature of the breaks and to establish more reliable definitions and a classification system for analyzing cortical breaks on high-resolution CT images.
CR, conventional radiographs; CT, computed tomography; DICOM, digital imaging and communications in medicine; HR-pQCT, high-resolution peripheral quantitative computed tomography; ICC, intraclass correlation coefficient; MCP, metacarpophalangeal; PIP, proximal interphalangeal; PPV, positive predictive value; RA, rheumatoid arthritis; k, Cohen’s Kappa; μCT, microCT; μm, micrometer
1. Schett G, Gravallese E. Bone erosion in rheumatoid arthritis: mechanisms, diagnosis and treatment. Nat Rev Rheumatol. 2012;8(11):656–64. doi:10.1038/nrrheum.2012.153.
2. Geusens P, van den Bergh J. Bone erosions in rheumatoid arthritis. Rheumatology (Oxford). 2014;53(1):4–5. doi:10.1093/rheumatology/ket358.
3. Sommer OJ, Kladosek A, Weiler V, Czembirek H, Boeck M, Stiskal M. Rheumatoid arthritis: a practical guide to state-of-the-art imaging, image interpretation, and clinical implications. Radiographics. 2005;25(2):381–98. doi:10.1148/rg.252045111.
4. Baillet A, Gaujoux-Viala C, Mouterde G, et al. Comparison of the efficacy of sonography, magnetic resonance imaging and conventional radiography for the detection of bone erosions in rheumatoid arthritis patients: a systematic review and meta-analysis. Rheumatology (Oxford). 2011;50(6):1137–47. doi:10.1093/rheumatology/keq437.
5. Geusens P, Chapurlat R, Schett G, et al. High-resolution in vivo imaging of bone and joints: a window to microarchitecture. Nat Rev Rheumatol. 2014;10(5):304–13. doi:10.1038/nrrheum.2014.23.
6. Saraux A, Berthelot JM, Chales G, et al. Ability of the American College of Rheumatology 1987 criteria to predict rheumatoid arthritis in patients with early arthritis and classification of these patients two years later. Arthritis Rheum. 2001;44(11):2485–91.
7. Dohn UM, Ejbjerg BJ, Hasselquist M, et al. Detection of bone erosions in rheumatoid arthritis wrist joints with magnetic resonance imaging, computed tomography and radiography. Arthritis Res Ther. 2008;10(1):R25. doi:10.1186/ar2378.
8. Stach CM, Bauerle M, Englbrecht M, et al. Periarticular bone structure in rheumatoid arthritis patients and healthy individuals assessed by high-resolution computed tomography. Arthritis Rheum. 2010;62(2):330–9. doi:10.1002/art.27252.
9. Finzel S, Ohrndorf S, Englbrecht M, et al. A detailed comparative study of high-resolution ultrasound and micro-computed tomography for detection of arthritic bone erosions. Arthritis Rheum. 2011;63(5):1231–6. doi:10.1002/art.30285.
10. Fouque-Aubert A, Boutroy S, Marotte H, et al. Assessment of hand bone loss in rheumatoid arthritis by high-resolution peripheral quantitative CT. Ann Rheum Dis. 2010;69(9):1671–6. doi:10.1136/ard.2009.114512.
11. Sim J, Wright CC. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Phys Ther. 2005;85(3):257–68.
12. Byrt T, Bishop J, Carlin JB. Bias, prevalence and kappa. J Clin Epidemiol. 1993;46(5):423–9.
13. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
14. Sharp JT, Wolfe F, Lassere M, et al. Variability of precision in scoring radiographic abnormalities in rheumatoid arthritis by experienced readers. J Rheumatol. 2004;31(6):1062–72.
15. Salaffi F, Carotti M. Interobserver variation in quantitative analysis of hand radiographs in rheumatoid arthritis: comparison of 3 different reading procedures. J Rheumatol. 1997;24(10):2055–6.
16. Lee CH, Srikhum W, Burghardt AJ, et al. Correlation of structural abnormalities of the wrist and metacarpophalangeal joints evaluated by high-resolution peripheral quantitative computed tomography, 3 Tesla magnetic resonance imaging and conventional radiographs in rheumatoid arthritis. Int J Rheum Dis. 2014. doi:10.1111/1756-185X.12495.
17. Zebaze RM, Ghasem-Zadeh A, Bohte A, et al. Intracortical remodelling and porosity in the distal radius and post-mortem femurs of women: a cross-sectional study. Lancet. 2010;375(9727):1729–36. doi:10.1016/S0140-6736(10)60320-0.
18. Feehan L, Buie H, Li L, McKay H. A customized protocol to assess bone quality in the metacarpal head, metacarpal shaft and distal radius: a high resolution peripheral quantitative computed tomography precision study. BMC Musculoskelet Disord. 2013;14:367. doi:10.1186/1471-2474-14-367.
This study did not have any funding.
Availability of data and materials
Data are available upon request.
AvT, PG, JvdB, BvR designed the study. AS and MP were responsible for the scoring. AS, AvT and PG analyzed the data. All authors were involved in discussing and interpreting the results, commented on the draft version of the manuscript and approved the final version.
Bert van Rietbergen is a consultant for Scanco Medical AG. The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
For this study cadaveric specimens were used. A handwritten and signed codicil from each donor, posed when still alive and well, is kept at the Department of Anatomy and Embryology, University of Amsterdam, Amsterdam, The Netherlands. This is required by Dutch law for the use of cadavers for scientific research and education.
Method of selection of regions of interest. Method of selection of regions of interest in an MCP and PIP joint on HR-pQCT and μCT. Total scan area for an MCP joint on HR-pQCT was 18.04 mm and for a PIP joint 9.02 mm. Total scan area for an MCP joint on μCT was 15.26 mm and for a PIP joint 9.45 mm. Abbreviations: HR-pQCT; high-resolution peripheral quantitative computed tomography, μCT; micro computed tomography, MCP; metacarpophalangeal, PIP; proximal interphalangeal. (TIF 11020 kb)
Resolution of HR-pQCT and μCT imaging. Resolution of HR-pQCT imaging is 82 μm, while the resolution of μCT imaging is 18 μm. To match both resolutions, two consecutive slices on HR-pQCT correspond with 9 consecutive slices on μCT. Abbreviations: HR-pQCT; high-resolution peripheral quantitative computed tomography, μCT; microCT. (TIF 3952 kb)
Division of a phalangeal base of an MCP joint. Phalangeal base of an MCP joint divided into palmar, ulnar, dorsal and radial quadrants. Abbreviations: MCP; metacarpophalangeal (TIF 9718 kb)
Examples of μCT images. Examples of μCT images that could have contributed to the differences in scoring cortical breaks between Reader 1 and 2. Panel a. Large cortical breaks (arrow) show high agreement. Panel b. Small cortical breaks (arrow) show less agreement. Panel c. An extremely thin cortex (arrow). Panel d. One large break (in brackets) was counted by Reader 1, whereas Reader 2 considered this as several smaller cortical breaks (arrow). Abbreviations: μCT; microCT. (PNG 460 kb)
2 × 2 contingency tables. 2 × 2 contingency tables for Reader 1 (all joints and MCP and PIP separately) and Reader 2 (all joints and MCP and PIP joints separately). (DOCX 17 kb)
About this article
Cite this article
Scharmga, A., Peters, M., van Tubergen, A. et al. Visual detection of cortical breaks in hand joints: reliability and validity of high-resolution peripheral quantitative CT compared to microCT. BMC Musculoskelet Disord 17, 271 (2016) doi:10.1186/s12891-016-1148-y
- Computed tomography
- Rheumatoid arthritis