Visual detection of cortical breaks in hand joints: reliability and validity of high-resolution peripheral quantitative CT compared to microCT
© The Author(s). 2016
Received: 7 April 2016
Accepted: 29 June 2016
Published: 11 July 2016
To study the reliability and validity of high-resolution peripheral quantitative CT (HR-pQCT) with microCT (μCT) as gold standard in the visual detection of cortical breaks in metacarpophalangeal (MCP) and proximal interphalangeal (PIP) joints.
Ten cadaveric fingers (10 MCP and 9 PIP joints) were imaged by HR-pQCT and μCT and visually analyzed by two independent readers. Intra- and interreader reliability were evaluated for the presence (yes/no, kappa statistics) and the total number (intraclass correlation coefficient, ICC) of cortical breaks. Sensitivity, specificity, and positive and negative predictive values (PPV and NPV, respectively) of HR-pQCT in detecting cortical breaks were calculated.
A mean of 149 cortical breaks was identified with HR-pQCT and a mean of 129 with μCT (p < 0.05). Intrareader reliability for the presence of a cortical break per quadrant was 0.52 (95 % CI 0.48–0.56) and 0.71 (95 % CI 0.67–0.75) for HR-pQCT and μCT, respectively, and for the total number of cortical breaks 0.61 (95 % CI 0.49–0.70) and 0.75 (95 % CI 0.68–0.82). Interreader reliability for the presence of a cortical break per quadrant was 0.37 (95 % CI 0.33–0.41) and 0.45 (95 % CI 0.41–0.49) for HR-pQCT and μCT, respectively, and for the number of cortical breaks 0.55 (95 % CI 0.43–0.65) and 0.54 (95 % CI 0.35–0.67). Sensitivity, specificity, PPV and NPV of HR-pQCT were 81.6, 64.0, 81.6, and 64.0 %, respectively.
Cortical breaks were commonly visualized in MCP and PIP joints with HR-pQCT and μCT. Reliability of both HR-pQCT and μCT was fair to moderate. HR-pQCT was highly sensitive to detect cortical breaks with μCT as gold standard.
Keywords: Imaging, Computed tomography, Hand, Bone, Rheumatoid arthritis
Peri-articular cortical breaks are one of the characteristic features of bone involvement in rheumatoid arthritis (RA) and predictors of further radiographic progression [1, 2]. Early detection of cortical breaks is an important indicator for intensifying treatment in order to modify the disease course. In daily clinical practice, conventional radiographs (CR) are considered the gold standard for detection of cortical breaks in the hand joints in rheumatic diseases. CR is widely available, fast to perform, relatively cheap, and extensively validated; however, its sensitivity to detect structural bone changes is low compared to computed tomography (CT), MRI and ultrasound [4–7]. A novel, sensitive imaging technique is high-resolution peripheral quantitative computed tomography (HR-pQCT). HR-pQCT allows analysis of the cortical and trabecular microarchitecture of peripheral bones with an isotropic resolution of 82 micrometers (μm). This technique is now also applied for 3D assessment of the bone microarchitecture in the hand joints [8–10]. A study by Stach et al. has demonstrated that HR-pQCT is more sensitive than CR in detecting cortical breaks in the hand joints in RA and also in healthy controls. However, the resolution of the HR-pQCT images can be of the same order as the thickness of the cortical bone in finger joints. Due to partial volume effects, it is possible that thin cortices are falsely identified as breaks. Therefore, in particular with thin cortices, the reliability, sensitivity and specificity of the measurements might be impaired and depend on the reader’s perception.
The aims of this study were (1) to investigate the intra- and interreader reliability, and (2) to determine the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of HR-pQCT in detecting cortical breaks in hand joints, using μCT images with a much higher resolution (18 μm) as gold standard. We were particularly interested in the methodology of identifying cortical breaks by HR-pQCT, not in the clinical value of these cortical breaks. We hypothesized that HR-pQCT is a reliable and sensitive imaging method for identifying cortical breaks in hand joints compared to μCT.
For this study, we used cadaveric specimens, because μCT imaging can only be performed in vitro. The MCP and PIP joints of ten right index fingers from female human cadavers were imaged by both HR-pQCT and μCT. The donors had donated their bodies by a testament signed during life to the Department of Anatomy and Embryology of the University of Amsterdam, the Netherlands. The fingers were fixed in formalin.
HR-pQCT and μCT image acquisition
HR-pQCT (XtremeCT1, Scanco Medical AG, Switzerland) scans were performed at clinical in vivo settings, i.e., at 60 kVp tube voltage, 900 μA tube current, 100 ms integration time and 82 μm voxel size. μCT (μCT 80, Scanco Medical AG, Switzerland) scans were performed at 70 kVp tube voltage, 114 μA tube current, 300 ms integration time and 18 μm voxel size. On HR-pQCT, the region of interest covered a length of 18.04 mm (220 slices) for the MCP joint and 9.02 mm (110 slices) for the PIP joint.
On μCT, the region of interest covered a length of 15.26 mm (848 slices) for the MCP joint and 9.45 mm (525 slices) for the PIP joint (Additional file 1: Figure S1) (see supplementary data available at BMC Musculoskeletal Disorders online).
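The scan lengths quoted above follow from the slice count multiplied by the voxel size. A minimal Python check, assuming each slice is exactly one isotropic voxel thick (the function name and the dictionary labels are illustrative, not part of the study's software):

```python
def scan_length_mm(n_slices: int, voxel_um: float) -> float:
    """Axial length covered by a stack of slices, in millimetres."""
    return n_slices * voxel_um / 1000.0


# Slice counts and voxel sizes as stated in the acquisition settings.
stacks = {
    "HR-pQCT MCP": (220, 82),
    "HR-pQCT PIP": (110, 82),
    "microCT MCP": (848, 18),
    "microCT PIP": (525, 18),
}

for name, (n_slices, voxel_um) in stacks.items():
    print(f"{name}: {scan_length_mm(n_slices, voxel_um):.2f} mm")
```

The four printed lengths should agree with the 18.04, 9.02, 15.26 and 9.45 mm reported in the text.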
HR-pQCT and μCT image analysis
Scans of HR-pQCT and μCT were exported in Digital Imaging and Communications in Medicine (DICOM) format and analyzed using Osirix (v.5.8.5 64-bit) multiplanar DICOM viewer. Differences in the extent of the scanned areas as well as in joint angles were noticed because the fingers were scanned horizontally on HR-pQCT and vertically on μCT. Corresponding first and last slices of the overlapping region were visually determined to ensure that the same region of interest was used in the detection of cortical breaks on both imaging modalities.
Two trained readers (AS and MP) independently scored the HR-pQCT and μCT images visually for the presence of cortical breaks. Both readers had received extensive training from the Study grouP for xtrEme Computed Tomography in Rheumatoid Arthritis (SPECTRA) and had additional reading experience before the current dataset was read. Readers were aware of the hypothesis of this study. The images were not anonymized. HR-pQCT images were scored first and independently of the μCT images, with at least one day in between, and Reader 1 rescored the images after a two-week interval.
A cortical break was defined as a clear disruption of the cortex, seen on two consecutive slices on two orthogonal planes (on transverse and on sagittal or coronal plane) on HR-pQCT, and similarly, but on nine consecutive slices on μCT to cover the same area as evaluated by HR-pQCT (Additional file 2: Figure S2).
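The slice thresholds can be read as covering roughly the same physical thickness on both modalities: two HR-pQCT slices span 2 × 82 = 164 μm and nine μCT slices span 9 × 18 = 162 μm. The sketch below illustrates the consecutive-slice part of the definition only. It assumes a hypothetical list of per-slice yes/no flags for one plane; the study itself relied on visual reading, and the full definition additionally requires the disruption to be seen in two orthogonal planes.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def meets_slice_criterion(flags, min_consecutive):
    """True if a disruption is seen on at least `min_consecutive` adjacent slices."""
    return longest_run(flags) >= min_consecutive


# Thickness covered by the minimum run on each modality (micrometres).
HRPQCT_MIN_SLICES, HRPQCT_VOXEL_UM = 2, 82
MICROCT_MIN_SLICES, MICROCT_VOXEL_UM = 9, 18
print(HRPQCT_MIN_SLICES * HRPQCT_VOXEL_UM)    # 164
print(MICROCT_MIN_SLICES * MICROCT_VOXEL_UM)  # 162

# Hypothetical flags: eight adjacent slices fall just short of the microCT threshold.
eight = [False] + [True] * 8 + [False]
nine = [False] + [True] * 9 + [False]
print(meets_slice_criterion(eight, MICROCT_MIN_SLICES))  # False
print(meets_slice_criterion(nine, MICROCT_MIN_SLICES))   # True
```

A hard threshold of this kind means that a one-slice difference in what a reader perceives changes the outcome from "no break" to "break".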
To assess the location of the breaks in each joint, the transverse plane was divided into four quadrants: palmar, ulnar, dorsal and radial (Additional file 3: Figure S3). The phalangeal base and metacarpal head of the MCP joints, and proximal phalanx and distal phalanx of the PIP joints were separately assessed. In total, eight quadrants per joint were analyzed: four in the proximal bone and four in the distal bone of the same joint. Each joint was systematically analyzed per quadrant. Quadrants with large discrepancies between the readers (i.e., more than four breaks difference) were re-examined to identify reasons for the discrepancy. In addition, total volumetric bone mineral density (vBMD) of the specimens was calculated using HR-pQCT.
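The re-examination rule (more than four breaks difference between readers in a quadrant) is simple to express in code. The sketch below is illustrative only: the dictionary layout, the key format and the counts are hypothetical and do not come from the study data.

```python
def flag_discrepant_quadrants(counts_reader1, counts_reader2, max_difference=4):
    """Return quadrant keys where the two readers differ by more than `max_difference` breaks."""
    flagged = []
    for key in counts_reader1:
        if abs(counts_reader1[key] - counts_reader2[key]) > max_difference:
            flagged.append(key)
    return flagged


# Hypothetical counts keyed by (joint, bone, quadrant).
reader1 = {
    ("MCP-01", "proximal", "palmar"): 1,
    ("MCP-01", "proximal", "ulnar"): 0,
    ("MCP-01", "proximal", "dorsal"): 2,
    ("MCP-01", "proximal", "radial"): 1,
}
reader2 = {
    ("MCP-01", "proximal", "palmar"): 6,
    ("MCP-01", "proximal", "ulnar"): 4,
    ("MCP-01", "proximal", "dorsal"): 2,
    ("MCP-01", "proximal", "radial"): 0,
}

print(flag_discrepant_quadrants(reader1, reader2))

# Quadrant bookkeeping from the text: 19 joints with 8 quadrants each.
print((10 + 9) * 8)  # 152
```

Only the palmar quadrant is flagged here, because a difference of exactly four breaks does not exceed the threshold.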
Descriptive analyses were done to calculate the total number of cortical breaks scored by the readers per quadrant for each imaging modality.
The difference in the total number of cortical breaks detected with HR-pQCT versus μCT was tested for statistical significance with the Wilcoxon signed-rank test. Intra- and interreader reliability were calculated using Cohen’s Kappa (k) and the intraclass correlation coefficient (ICC) with a two-way random model and absolute agreement. The k value was calculated for the presence (yes/no) of a cortical break per quadrant, and ICC values were calculated for the total number of cortical breaks per quadrant. k and ICC were calculated on the level of all available quadrants. Kappa values were also re-calculated, corrected for potential prevalence and bias within the kappa value (Prevalence-Adjusted Bias-Adjusted Kappa, PABAK) [11, 12]. Reliability was rated according to Landis et al.: <0.00 poor, 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, 0.81–1.00 almost perfect. Sensitivity, specificity, PPV, and NPV of HR-pQCT in the detection of cortical breaks were calculated with μCT as gold standard. The mean value of the two readings of Reader 1 (AS) was used for this purpose. Statistical analyses were performed with SPSS Statistics for Windows version 23.0 (IBM Corp., Armonk, NY).
The mean age of the donors was 85.1 ± 9.6 years; the medical history was unknown. Average vBMD was 203 mgHA/cm3 for MCP joints, 293 mgHA/cm3 for PIP joints and 245 mgHA/cm3 for all joints combined. The scans of ten MCP and nine PIP joints with in total 152 quadrants were available for analysis. One PIP joint could not be evaluated due to a missing μCT scan. Furthermore, Reader 1 considered the quality of the images of the metacarpal head in one MCP joint as too low on μCT due to a protocol error during scanning. Therefore, four quadrants were excluded from the analyses of the μCT images.
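For a yes/no rating by two readers, Cohen's kappa and PABAK can both be computed from the 2 × 2 agreement table. The following is a minimal sketch of the standard formulas, not the SPSS procedure used in the study; the function name and the cell counts are hypothetical and chosen only to show how the two statistics can diverge when one category dominates.

```python
def cohen_kappa_and_pabak(a, b, c, d):
    """Kappa and PABAK for two raters and a binary rating.

    a: both raters say yes      b: rater 1 yes, rater 2 no
    c: rater 1 no, rater 2 yes  d: both raters say no
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from each rater's marginal proportions.
    expected = ((a + b) / n) * ((a + c) / n) + ((c + d) / n) * ((b + d) / n)
    kappa = (observed - expected) / (1 - expected)
    # PABAK replaces chance agreement by 0.5, which gives 2 * observed - 1.
    pabak = 2 * observed - 1
    return kappa, pabak


# Hypothetical balanced table: both statistics coincide.
print(cohen_kappa_and_pabak(40, 10, 10, 40))

# Hypothetical table with very high prevalence of "yes": same observed
# agreement (80 %), but kappa drops below zero while PABAK is unchanged.
print(cohen_kappa_and_pabak(80, 10, 10, 0))
```

When kappa and PABAK are close, as in a balanced table, prevalence and bias effects have little influence on the kappa value.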
Number of cortical breaks scored per imaging modality
Reader 1 (first reading): 139 (7.3 ± 4.1); 142 (7.4 ± 4.0); p = .562
Reader 1 (second reading): 118 (6.2 ± 2.6); 156 (8.2 ± 3.6); p < .001
Mean score Reader 1: 129 (6.7 ± 3.0); 149 (7.8 ± 3.6); p = .018
Reader 2: 151 (7.9 ± 3.3); 241 (12.6 ± 6.3); p < .001
On μCT we defined a cortical break when present on nine consecutive slices. Sometimes Reader 1 observed the cortical break on eight consecutive slices, hence not considering it a break, whereas Reader 2 observed it on nine consecutive slices, thereby fulfilling the criteria for a break.
The smaller the break, the less agreement between the readers. An example of this discrepancy is shown in Additional file 4: Figure S4 panel A and B.
Due to the low bone mineral density and thin cortices, there was low contrast in some cases (example in Additional file 4: Figure S4 panel C).
Reader 1 considered a break as one large break, whereas Reader 2 counted several small cortical breaks inside the same large break (example in Additional file 4: Figure S4 panel D).
Intra- and interreader reliability per imaging modality
Intrareader (reader 1)
HR-pQCT: k 0.52 (95 % CI 0.48 to 0.56); PABAK 0.53 (95 % CI 0.39 to 0.66); ICC 0.61 (95 % CI 0.49 to 0.70)
μCT: k 0.71 (95 % CI 0.67 to 0.75); PABAK 0.72 (95 % CI 0.60 to 0.83); ICC 0.75 (95 % CI 0.68 to 0.82)
Interreader (reader 1 first reading versus reader 2)
HR-pQCT: k 0.37 (95 % CI 0.33 to 0.41); PABAK 0.38 (95 % CI 0.23 to 0.53); ICC 0.55 (95 % CI 0.43 to 0.65)
μCT: k 0.45 (95 % CI 0.41 to 0.49); PABAK 0.47 (95 % CI 0.33 to 0.61); ICC 0.54 (95 % CI 0.35 to 0.67)
Interreader reliability was fair to moderate for the presence of breaks (HR-pQCT: k = 0.37 and μCT: k = 0.45) and for the number of breaks (HR-pQCT: ICC = 0.55 and μCT: ICC = 0.54). The values of PABAK were comparable (Table 2).
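The verbal ratings follow from the Landis and Koch bands given in the statistical methods. A small Python sketch of that mapping, applied to the reliability values reported above (the function name is illustrative, and values falling between two printed bands, such as 0.205, are assigned to the lower band as an assumption):

```python
def landis_koch_label(value):
    """Map a kappa or ICC value to the Landis and Koch agreement category."""
    if value < 0.0:
        return "poor"
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"


# Reliability values as reported in the text.
reported = {
    "intrareader k, HR-pQCT": 0.52,
    "intrareader k, microCT": 0.71,
    "intrareader ICC, HR-pQCT": 0.61,
    "intrareader ICC, microCT": 0.75,
    "interreader k, HR-pQCT": 0.37,
    "interreader k, microCT": 0.45,
    "interreader ICC, HR-pQCT": 0.55,
    "interreader ICC, microCT": 0.54,
}

for name, value in reported.items():
    print(f"{name}: {value:.2f} -> {landis_koch_label(value)}")
```

The intrareader values fall in the moderate and substantial bands, and the interreader values in the fair and moderate bands.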
Sensitivity, specificity, positive predictive value and negative predictive value of HR-pQCT
Total number of breaks on HR-pQCT
Total number of breaks on μCT
Positive predictive value
Negative predictive value
MCP reader 1
MCP reader 2
PIP reader 1
PIP reader 2
All reader 1
All reader 2
This study is the first that reports on aspects of reliability and validity of detecting cortical breaks in hand joints using HR-pQCT with μCT as gold standard. Cortical breaks were found in all joints with both imaging modalities. Intrareader reliability of HR-pQCT and μCT was moderate to substantial, while interreader reliability was fair to moderate. The sensitivity of HR-pQCT in detecting cortical breaks was high (81.6 %).
In our study, only cortical discontinuities meeting the definition of a break, i.e., clearly visible cortical interruptions on at least 2 consecutive HR-pQCT slices (or 9 consecutive μCT slices) in two planes, were scored as a cortical break. They may have different pathological or physiological backgrounds, such as erosions or vascular channels [8, 9]. A formal classification system for defining breaks visualized on HR-pQCT and μCT is lacking. Histological examination is needed to provide more insight into the nature of these cortical breaks and for developing definitions.
In a previous study by Stach et al., an almost perfect and substantial intra- and interreader reliability was reported using HR-pQCT for grading bone lesions and discrimination between healthy individuals and RA patients (k = 0.82 and k = 0.75, respectively), but reliability on the presence and number of cortical breaks was not reported. The precision in scoring abnormalities visualized with several imaging techniques varies widely, even among experienced readers, as has been demonstrated for example for scoring radiographs in RA (ICC ranged from 0.65 to 0.99). In general, lower values for interreader reliability in comparison with intrareader reliability are reported, corresponding to our findings. In our study, the breaks were scored visually, which is reader dependent. An automatic scoring algorithm that detects breaks according to pre-defined definitions could potentially improve reliability by minimizing reader intervention.
We investigated the sensitivity of HR-pQCT in detecting cortical breaks with μCT as the gold standard and found a high sensitivity (81.6 %). Unfortunately, no comparative studies are available. In contrast, two studies used HR-pQCT as the reference method for investigating the sensitivity to detect cortical breaks of other imaging modalities [9, 16]. These studies reported a sensitivity of 85.7 % for MRI, 60.9 % for CR, and 83–100 % for ultrasound with HR-pQCT as the reference method [9, 16]. We found a lower specificity of HR-pQCT in detecting cortical breaks (64 %) in comparison to sensitivity. A possible explanation for this could be a phenomenon attributed to a partial volume effect leading to a reduced cortical signal on HR-pQCT, giving the impression that a cortical break is present, whereas on μCT the cortex is intact. An example of this is shown in Fig. 1, panel D.
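The four accuracy measures are all derived from the same 2 × 2 table of HR-pQCT findings against the μCT reference. The sketch below shows the standard definitions in Python. The cell counts are hypothetical and were chosen for illustration only; the study's actual counts are not given in the text.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2 x 2 table.

    tp: index test positive, reference positive
    fp: index test positive, reference negative
    fn: index test negative, reference positive
    tn: index test negative, reference negative
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts (not study data) with equal false positives and negatives.
metrics = diagnostic_metrics(tp=40, fp=9, fn=9, tn=16)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f} %")
```

Sensitivity and PPV share the numerator and differ only in whether false negatives or false positives enter the denominator, so the two coincide whenever those counts are equal, and the same holds for specificity and NPV. The identical reported values (81.6 % and 64 %) would be consistent with such a table, although this cannot be confirmed from the text.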
There are several limitations of this study. First, we evaluated whether the total number of breaks counted per quadrant corresponded between the two imaging modalities, but did not consider correspondence in exactly the same location. This might have led to an overestimation of the reliability. Second, we used fingers from cadaver specimens with unknown medical history and a relatively high mean age (85.1 years). Due to the old age of the donors and the preservation in formalin, the cortices might have become less mineralized. The average vBMD of the specimens was 245 mgHA/cm3, which is some 20 % lower than the average in the normal population (>300 mgHA/cm3) [10, 18]. This may hamper the scoring of a cortical break on HR-pQCT. It is also possible that thin regions were falsely identified as a cortical break. However, the use of cadaveric specimens was essential, as in vivo human subjects cannot be measured by μCT because of the long scanning time. Third, the cadaveric specimens had slightly different orientations in the HR-pQCT versus the μCT scanner. Despite the careful visual matching of the regions of interest on HR-pQCT and μCT, the angle at which the transverse images were viewed was slightly different in some joints, and a cortical break might therefore have been missed. Fourth, a discrepancy between the readers regarding the number of cortical breaks identified on μCT was noticed. μCT images provide much detail, and in particular very small cortical interruptions were not always picked up by Reader 1. This indicates that, when visually analyzing μCT images, more stringent definitions are necessary than when using HR-pQCT because of the higher resolution.
Cortical breaks were commonly visualized in hand joints with HR-pQCT and μCT. Reliability of both HR-pQCT and μCT was fair to moderate. HR-pQCT was sensitive to detect cortical breaks with μCT as gold standard. In spite of the limitations of our study, including the discrepancy of μCT results between the readers, we have shown that HR-pQCT is highly sensitive to detect cortical breaks with a fair to moderate reliability compared to μCT. Our findings need further evaluation, preferably with focus on histological analyses to clarify the nature of the breaks and to establish more reliable definitions and a classification system for analyzing cortical breaks on high-resolution CT images.
CR, conventional radiographs; CT, computed tomography; DICOM, digital imaging and communications in medicine; HR-pQCT, high-resolution peripheral quantitative computed tomography; ICC, intraclass correlation coefficient; MCP, metacarpophalangeal; PIP, proximal interphalangeal; PPV, positive predictive value; RA, rheumatoid arthritis; k, Cohen’s Kappa; μCT, microCT; μm, micrometer
This study did not have any funding.
Availability of data and materials
Data are available upon request.
AvT, PG, JvdB, BvR designed the study. AS and MP were responsible for the scoring. AS, AvT and PG analyzed the data. All authors were involved in discussing and interpreting the results, commented on the draft version of the manuscript and approved the final version.
Bert van Rietbergen is a consultant for Scanco Medical AG. The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
For this study cadaveric specimens were used. A handwritten and signed codicil from each donor, posed when still alive and well, is kept at the Department of Anatomy and Embryology, University of Amsterdam, Amsterdam, The Netherlands. This is required by Dutch law for the use of cadavers for scientific research and education.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Schett G, Gravallese E. Bone erosion in rheumatoid arthritis: mechanisms, diagnosis and treatment. Nat Rev Rheumatol. 2012;8(11):656–64. https://doi.org/10.1038/nrrheum.2012.153.
2. Geusens P, van den Bergh J. Bone erosions in rheumatoid arthritis. Rheumatology (Oxford). 2014;53(1):4–5. https://doi.org/10.1093/rheumatology/ket358.
3. Sommer OJ, Kladosek A, Weiler V, Czembirek H, Boeck M, Stiskal M. Rheumatoid arthritis: a practical guide to state-of-the-art imaging, image interpretation, and clinical implications. Radiographics. 2005;25(2):381–98. https://doi.org/10.1148/rg.252045111.
4. Baillet A, Gaujoux-Viala C, Mouterde G, et al. Comparison of the efficacy of sonography, magnetic resonance imaging and conventional radiography for the detection of bone erosions in rheumatoid arthritis patients: a systematic review and meta-analysis. Rheumatology (Oxford). 2011;50(6):1137–47. https://doi.org/10.1093/rheumatology/keq437.
5. Geusens P, Chapurlat R, Schett G, et al. High-resolution in vivo imaging of bone and joints: a window to microarchitecture. Nat Rev Rheumatol. 2014;10(5):304–13. https://doi.org/10.1038/nrrheum.2014.23.
6. Saraux A, Berthelot JM, Chales G, et al. Ability of the American College of Rheumatology 1987 criteria to predict rheumatoid arthritis in patients with early arthritis and classification of these patients two years later. Arthritis Rheum. 2001;44(11):2485–91.
7. Dohn UM, Ejbjerg BJ, Hasselquist M, et al. Detection of bone erosions in rheumatoid arthritis wrist joints with magnetic resonance imaging, computed tomography and radiography. Arthritis Res Ther. 2008;10(1):R25. https://doi.org/10.1186/ar2378.
8. Stach CM, Bauerle M, Englbrecht M, et al. Periarticular bone structure in rheumatoid arthritis patients and healthy individuals assessed by high-resolution computed tomography. Arthritis Rheum. 2010;62(2):330–9. https://doi.org/10.1002/art.27252.
9. Finzel S, Ohrndorf S, Englbrecht M, et al. A detailed comparative study of high-resolution ultrasound and micro-computed tomography for detection of arthritic bone erosions. Arthritis Rheum. 2011;63(5):1231–6. https://doi.org/10.1002/art.30285.
10. Fouque-Aubert A, Boutroy S, Marotte H, et al. Assessment of hand bone loss in rheumatoid arthritis by high-resolution peripheral quantitative CT. Ann Rheum Dis. 2010;69(9):1671–6. https://doi.org/10.1136/ard.2009.114512.
11. Sim J, Wright CC. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Phys Ther. 2005;85(3):257–68.
12. Byrt T, Bishop J, Carlin JB. Bias, prevalence and kappa. J Clin Epidemiol. 1993;46(5):423–9.
13. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
14. Sharp JT, Wolfe F, Lassere M, et al. Variability of precision in scoring radiographic abnormalities in rheumatoid arthritis by experienced readers. J Rheumatol. 2004;31(6):1062–72.
15. Salaffi F, Carotti M. Interobserver variation in quantitative analysis of hand radiographs in rheumatoid arthritis: comparison of 3 different reading procedures. J Rheumatol. 1997;24(10):2055–6.
16. Lee CH, Srikhum W, Burghardt AJ, et al. Correlation of structural abnormalities of the wrist and metacarpophalangeal joints evaluated by high-resolution peripheral quantitative computed tomography, 3 Tesla magnetic resonance imaging and conventional radiographs in rheumatoid arthritis. Int J Rheum Dis. 2014. https://doi.org/10.1111/1756-185X.12495.
17. Zebaze RM, Ghasem-Zadeh A, Bohte A, et al. Intracortical remodelling and porosity in the distal radius and post-mortem femurs of women: a cross-sectional study. Lancet. 2010;375(9727):1729–36. https://doi.org/10.1016/S0140-6736(10)60320-0.
18. Feehan L, Buie H, Li L, McKay H. A customized protocol to assess bone quality in the metacarpal head, metacarpal shaft and distal radius: a high resolution peripheral quantitative computed tomography precision study. BMC Musculoskelet Disord. 2013;14:367. https://doi.org/10.1186/1471-2474-14-367.