Volume 22 Supplement 2

All about the hip

Femoral neck fracture: the reliability of radiologic classifications

Abstract

Background

Femoral neck fractures (FNF) are among the most common injuries in the elderly. A valid radiographic classification system is mandatory to guide correct treatment and to facilitate communication among surgeons. This study aims to evaluate the reliability of the 2018 AO/OTA classification, the simplified AO/OTA classification, and the Garden classification.

Methods

Six orthopaedic surgeons, divided into three groups based on trauma experience, evaluated 150 blinded antero-posterior and latero-lateral radiographs of FNF using the Garden classification, the 2018 AO/OTA classification, and the simplified AO/OTA classification. One month later, the radiographs were renumbered and each observer performed a second evaluation. Kappa statistics were used to determine the reliability of the classifications: Cohen's kappa was calculated to determine intra- and interobserver reliability, and Fleiss' kappa was used to determine multi-rater agreement.

Results

The interobserver reliability (k) of the Garden classification ranged from 0.28 to 0.73, with an average of 0.49. The AO/OTA classification showed reliability from 0.20 to 0.42, with an average of 0.30. The simplified AO/OTA classification showed reliability from 0.38 to 0.58, with an average of 0.48.

The intraobserver reliability of the Garden classification ranged from 0.48 to 0.79, with an average of 0.63. The AO/OTA classification showed reliability from 0.20 to 0.64, with an average of 0.50. The simplified AO/OTA classification showed reliability from 0.40 to 0.75, with an average of 0.61.

Conclusion

The revised 2018 AO/OTA classification simplified the previous classification of intracapsular fractures but remains unreliable, with only fair interobserver reliability. The simplified AO/OTA classification shows reliability similar to that of the Garden classification, with moderate interobserver agreement. Surgeons' experience did not appear to improve reliability, and no classification proved superior in terms of reliability.

Background

Proximal femur fractures are among the most common fractures in the elderly, occurring in 18% of women and 6% of men worldwide [1]. They are typically caused by accidental falls in elderly patients with osteoporosis [2]. The incidence of proximal femur fractures has risen worldwide over the last two decades along with the increase in the average age of the population; the global number of hip fractures is expected to grow from 1.26 million in 1990 to 4.5 million by the year 2050 [1].

The incidence of femoral neck fractures (FNF) is approximately equal to that of pertrochanteric fractures; together they account for over 90% of all proximal femur fractures [3].

In Italy, hip fractures in people over 65 years of age increased from 89,601 to 94,525 between 2007 and 2014 [4]. This leads to a growing number of hospital admissions and higher hospitalization costs [5]. Furthermore, hip fractures affect patients' quality of life [6]. It is therefore important to reach a fast and correct diagnosis and to perform adequate, prompt treatment in order to reduce postoperative complications [7] and mortality [8].

The treatment of choice is, in almost all cases, surgical. The choice of a specific treatment option is based on the stability and orientation of the fracture and on patient factors such as age, function, and bone quality [9, 10]. For unstable FNF the treatment of choice is hip replacement (total hip arthroplasty or hemiarthroplasty), whereas for stable FNF the most common treatment is internal fixation with cannulated screws or other hip implants [11].

Radiographic FNF classification helps with clinical decision making, communication, and research on prognosis and treatment [12]. The classifications most commonly used for intracapsular FNF are the Garden classification and the AO/OTA classification, both based on 2-dimensional X-ray images. Garden classified femoral neck fractures into four types based on displacement on the anteroposterior radiograph [13, 14]. A type I fracture is an incomplete or valgus-impacted fracture. A type II fracture is a complete fracture without displacement of the fracture fragments. A type III fracture is a complete fracture with partial displacement of the fracture fragments. A type IV fracture is a complete fracture with total displacement of the fracture fragments, allowing the femoral head to rotate back to an anatomic position [9]. The AO/OTA classification system is organized into hierarchies of severity, with descriptions generally proceeding from simple to multifragmentary fractures [15]. Femoral neck fractures are classified as subcapital with minimal or no displacement (type B1), transcervical (type B2), or displaced subcapital (type B3), and each of these types has a subclassification [16]. In clinical practice, the AO/OTA classification is usually simplified by considering only the three main categories (B1, B2, B3).

The aim of this study is to assess the reliability of these classifications by examining the intra- and interobserver agreement of trauma surgeons, and to determine whether reliability depends on the observers' experience.

Methods

This retrospective study included patients admitted to a single institution for FNF from January 2017 to December 2019.

The inclusion criterion was a femoral neck fracture in a patient aged 18 years or older.

The exclusion criteria were: an incomplete preoperative radiographic series (digital files of the antero-posterior projection of the pelvis and the lateral projection of the hip were required), advanced hip osteoarthritis, previous contralateral femoral neck fracture or contralateral prosthetic replacement, hip dysplasia, and associated pelvic fractures. Pathologic fractures were also excluded.

The final sample consisted of 150 patients, 57 men and 93 women, with an average age of 75.6 years. The right hip was involved in 43% of cases. Of this sample, 49 patients underwent CRIF or ORIF and 101 underwent prosthetic replacement (in 4 cases, computed tomography was used to guide the surgical choice).

All possible patient identification marks were obscured on the radiographs. The radiographs were then numbered and analyzed by six observers: two experienced trauma surgeons (GC and SD), two junior trauma surgeons (GM and MSO) and two orthopaedic trauma residents (AS and MM). All observers were familiar with the classifications analyzed, and all were provided with the classifications' definitions and schemes. Surgeons with different levels of experience were chosen to assess how much experience could affect reproducibility. Radiographs were classified according to the 2018 AO/OTA classification, the simplified 2018 AO/OTA classification (B1, B2, B3 only) and the Garden classification.

Each observer first classified the radiographs in the AP and LL projections, recording the results in a dedicated grid. At the end of the observation, the grid was archived without being shared with the other observers. One month later, the radiographs were renumbered and each observer performed a second evaluation (Figs. 1 and 2).

Fig. 1

Example of blinded radiographs included in the study. (1) Femoral neck fracture with high intra- and interobserver reliability. (2) Femoral neck fracture with low intra- and interobserver reliability

Fig. 2

Example of blinded radiographs included in the study. (1) Femoral neck fracture with high intra- and interobserver reliability. (2) Femoral neck fracture with low intra- and interobserver reliability

Kappa statistics were used to determine the reliability of the classifications. Cohen's kappa was calculated to determine intra- and interobserver reliability. Fleiss' kappa was used to calculate the multi-rater reliability of the more and less experienced trauma surgeons.
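As a concrete illustration of the multi-rater statistic, the following Python sketch computes Fleiss' kappa for a small, invented set of gradings (three hypothetical observers, simplified AO/OTA categories). It is not the authors' analysis code, only a minimal implementation of the standard formula.

```python
from collections import Counter

# Sketch of Fleiss' kappa for agreement among more than two raters.
# The ratings below are invented for illustration, not study data.

def fleiss_kappa(ratings):
    """ratings: one list per subject, containing one label per rater
    (every subject must be rated by the same number of raters)."""
    n = len(ratings)            # number of subjects
    m = len(ratings[0])         # raters per subject
    counts = [Counter(subject) for subject in ratings]
    categories = set().union(*counts)
    # Mean per-subject agreement P_i = (sum_j n_ij^2 - m) / (m(m-1))
    p_bar = sum((sum(c[j] ** 2 for j in categories) - m) / (m * (m - 1))
                for c in counts) / n
    # Chance agreement from overall category proportions p_j
    p_e = sum((sum(c[j] for c in counts) / (n * m)) ** 2 for j in categories)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical simplified AO/OTA gradings of 5 fractures by 3 observers
ratings = [["B1", "B1", "B1"],
           ["B1", "B2", "B2"],
           ["B3", "B3", "B3"],
           ["B2", "B2", "B3"],
           ["B1", "B1", "B2"]]
print(round(fleiss_kappa(ratings), 2))  # → 0.39
```

Unlike Cohen's kappa, which compares exactly two raters, Fleiss' kappa pools all raters in a group, which is why it suits the per-experience-group comparison described above.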

Intra- and interobserver agreement was interpreted using the Landis and Koch criteria: k values of 0.00–0.20 indicate slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; and 0.81–1.00, almost perfect agreement [14].
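For the two-rater case, Cohen's kappa and the Landis and Koch bands can be sketched as follows. The gradings are invented for illustration; this is not the authors' analysis pipeline.

```python
from collections import Counter

# Sketch of Cohen's kappa (two raters) plus the Landis-Koch bands.
# The observer gradings below are invented for illustration.

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies
    expected = sum((freq_a[c] / n) * (freq_b[c] / n)
                   for c in set(freq_a) | set(freq_b))
    return (observed - expected) / (1 - expected)

def landis_koch(k):
    """Map a kappa value to its Landis and Koch agreement category."""
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if k <= upper:
            return label
    return "almost perfect"

# Hypothetical Garden gradings of 10 fractures by two observers
obs1 = ["I", "II", "II", "III", "IV", "I", "II", "III", "IV", "IV"]
obs2 = ["I", "II", "III", "III", "IV", "I", "II", "II", "IV", "IV"]
k = cohen_kappa(obs1, obs2)
print(round(k, 2), landis_koch(k))  # → 0.73 substantial
```

Note how the chance-correction term penalizes classifications with few, unevenly used categories: raw percent agreement of 80% here shrinks to a kappa of 0.73.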

Results

Interobserver reliability

Cohen's kappa values for the interobserver reliability of the AO/OTA, simplified AO/OTA and Garden classifications based on X-rays are reported in Tables 1, 2 and 3. Interobserver reliability for the Garden classification ranged from 0.28 [95% CI 0.17–0.39] to 0.73 [95% CI 0.65–0.82], with an average of 0.49. The AO/OTA classification showed reliability from 0.20 [95% CI 0.11–0.29] to 0.42 [95% CI 0.33–0.52], with an average of 0.30. The simplified AO/OTA classification showed reliability from 0.38 [95% CI 0.26–0.50] to 0.58 [95% CI 0.47–0.69], with an average of 0.48.

Table 1 Inter-observer reliability of Garden Classification. K Value [95% CI] - %Agreement
Table 2 Inter-observer reliability of AO/OTA Classification. K Value [95% CI] - %Agreement
Table 3 Inter-observer reliability of Simplified AO/OTA Classification. K Value [95% CI] - %Agreement

We also analyzed the agreement between observers, dividing them into groups according to their trauma experience: trauma surgeons, young trauma surgeons and residents. There were no significant differences in agreement between the observer groups (Table 4). The results show moderate agreement for both the Garden classification and the simplified AO/OTA classification; the mean k value was lower for the AO/OTA classification, demonstrating only fair agreement.

Table 4 Groups’ Kappa values for inter observer agreement of Classifications. K Value [95% CI] - %Agreement

Intra observer reliability

Cohen's kappa values for the intraobserver reliability of the AO/OTA, simplified AO/OTA and Garden classifications based on X-rays are reported in Table 5. Intraobserver reliability for the Garden classification ranged from 0.48 [95% CI 0.37–0.58] to 0.79 [95% CI 0.71–0.87], with an average of 0.63. The AO/OTA classification showed reliability from 0.20 [95% CI 0.13–0.30] to 0.64 [95% CI 0.56–0.73], with an average of 0.50. The simplified AO/OTA classification showed reliability from 0.40 [95% CI 0.28–0.52] to 0.75 [95% CI 0.66–0.84], with an average of 0.61.

Table 5 Intra-observer reliability of classifications. K Value [95% CI] - %Agreement

Discussion

Successful treatment starts with an adequate classification of the pathology and an accurate evaluation of the patient's clinical condition (age, comorbidities), which together guide surgeons toward the correct management and facilitate communication.

Ideally, a classification system should be easily applicable, highly reliable, comprehensive and highly reproducible; in many cases it should also indicate outcomes. For proximal femur fractures there is still no universally accepted, reliable classification, and this fuels debate about the appropriate treatment options. Any classification system should possess a high degree of interobserver and intraobserver reliability, facilitating communication of the patient's condition and providing clear guidance for treatment [17].

A valid classification allows surgeons to determine the correct treatment and predict outcomes. Femoral neck fractures were first classified by Waldenström in 1924 as "stable" or "unstable". The reliability of this classification has been widely analyzed in the literature; the data show that it is higher than that of the others because it considers only two levels, instead of the four and seven levels of the Garden and AO classifications respectively, reducing possible bias [18, 19].

The Pauwels classification has also been studied in the past; it proved poorly reliable and is therefore no longer used in daily clinical practice [20, 21].

In this paper we studied the interobserver and intraobserver agreement of three different classification systems. Six orthopedic trauma surgeons with different years of experience (two trauma surgeons, two young trauma surgeons and two residents) graded 150 radiographs of proximal femur fractures using the Garden classification and the 2018 AO/OTA classification, both complete and simplified. We decided not to use the Waldenström and Pauwels classifications because they are no longer used in daily clinical practice.

The interobserver reliability of the Garden classification was moderate, as was that of the simplified AO/OTA classification (average k values of 0.49 and 0.48 respectively). Interobserver reliability dropped to fair, with an average k value of 0.30, for the full AO/OTA classification.

In the literature, the interobserver agreement of the Garden classification varies from fair to moderate [21,22,23]. Our results demonstrated higher reliability for the Garden classification than previous studies: Masionis et al., Gaspar et al. and van Embden et al. found k values of 0.33, 0.41 and 0.31 respectively [18, 21, 22].

We found substantial intraobserver reliability for the Garden and simplified AO/OTA classifications (mean k values of 0.63 and 0.61 respectively). Intraobserver reliability dropped to moderate, with a mean k value of 0.50, for the full AO/OTA classification.

For the Garden classification's intraobserver reliability as well, we found higher k values than previous studies: Masionis et al. reported intraobserver reliability from 0.40 to 0.57 [18].

In line with all the studies in the literature, we observed that inter- and intraobserver reliability decrease as a classification becomes more complex; kappa values strongly depend on the number of levels in the classification investigated [18, 20, 24].

To our knowledge, this is the first study in the literature to assess the reliability of the 2018 AO/OTA classification. A recent study analyzed this classification, but only for extracapsular femur fractures (31A): the simplified AO k value was 0.479 and the complete AO k value was 0.376 [24].

Masionis et al. described k values from 0.26 to 0.48 for intraobserver reliability and from 0.11 to 0.43 for interobserver reliability of the previous AO classification [18]; Blundell et al. found that the AO system had fair agreement [25]; Gaspar et al. calculated a k value of 0.17 for interobserver reliability [21].

It is therefore important to note that the radiographic images were graded using the latest version of the AO/OTA classification (2018); despite its complexity, its reliability is higher than that of the previous version. Another strength of our study is that reliability was analyzed according to the observers' experience. In the literature, this analysis has been described only for the Garden classification and for the previous version of the AO classification [12, 18, 20].

Our results are similar to the literature data on reliability when comparing more experienced with less experienced surgeons [18, 20]. Our data support the view that experience does not improve interobserver or intraobserver reliability. This may be due to the learning curve of classifying fractures, which is steepest in the first couple of years of practice and then flattens [12]; the trauma residents taking part in this study already had 3 years of experience in treating this type of fracture.

The authors are aware of the limitations of the present study. First, the number of evaluating surgeons was small. Second, this is a retrospective study and patient outcomes were not evaluated. All the observers work in the same hospital and the same university orthopedic department, which may have made their classifications more uniform. This is not a multicentric study: patients were selected from a single department, and radiographic images were consequently collected using the same protocol. Lastly, we considered only three classifications, excluding others such as the Pauwels and Waldenström classifications, because the three analyzed are the most commonly used in clinical practice.

Conclusion

The latest version of the AO/OTA classification (2018), despite its complexity, has higher reliability than the previous version. Furthermore, our results are similar to the literature data on reliability when comparing more experienced with less experienced surgeons. The Garden and simplified AO/OTA classifications are more reliable than the AO/OTA classification with subgroups; indeed, previous literature also shows that inter- and intraobserver reliability decrease as a classification becomes more complex. This does not mean that these classifications can be considered successful, because their interobserver reliabilities are not high enough, and even trauma experience did not improve them.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

FNF:

Femoral neck fractures

AO:

Arbeitsgemeinschaft für Osteosynthesefragen

OTA:

Orthopaedic Trauma Association

CRIF:

Closed reduction and internal fixation

ORIF:

Open reduction and internal fixation

AO/OTA:

AO Foundation/Orthopaedic Trauma Association

References

  1. Veronese N, Maggi S. Epidemiology and social costs of hip fracture. Injury. 2018;49:1458–60. https://doi.org/10.1016/j.injury.2018.04.015 PMID: 29699731.

  2. Compston JE, McClung MR, Leslie WD. Osteoporosis. Lancet. 2019;393:364–76. https://doi.org/10.1016/S0140-6736(18)32112-3 PMID: 30696576.

  3. Bäcker HC, Wu CH, Maniglio M, Wittekindt S, Hardt S, Perka C. Epidemiology of proximal femoral fractures. J Clin Orthop Trauma. 2021;12:161–5. https://doi.org/10.1016/j.jcot.2020.07.001 PMID: 33716441.

  4. Piscitelli P, Feola M, Rao C, Neglia C, Rizzo E, Vigilanza A, et al. Incidence and costs of hip fractures in elderly Italian population: first regional-based assessment. Arch Osteoporos. 2019;14:81. https://doi.org/10.1007/s11657-019-0619-9 PMID: 31342284.

  5. Williamson S, Landeiro F, McConnell T, Fulford-Smith L, Javaid MK, Judge A, et al. Costs of fragility hip fractures globally: a systematic review and meta-regression analysis. Osteoporos Int. 2017;28:2791–800. https://doi.org/10.1007/s00198-017-4153-6 PMID: 28748387.

  6. Dyer SM, Crotty M, Fairhall N, Magaziner J, Beaupre LA, Cameron ID, et al. A critical review of the long-term disability outcomes following hip fracture. BMC Geriatr. 2016;16:158 PMID: 27590604.

  7. Basilico M, Vitiello R, Oliva MS, Covino M, Greco T, Cianni L, et al. Predictable risk factors for infections in proximal femur fractures. J Biol Regul Homeost Agents. 2020;34(3 Suppl. 2):77–81 PMID: 32856444.

  8. Vicenti G, Bizzoca D, Pascarella R, Delprete F, Chiodini F, Daghino W, et al. Development of the Italian fractures registry (RIFra): a call for action to improve quality and safety. Injury. 2020;(10):052.

  9. Florschutz AV, Langford JR, Haidukewych GJ, Koval KJ. Femoral neck fractures: current management. J Orthop Trauma. 2015;29:121–9 PMID: 25635363.

  10. Vitiello R, Perisano C, Covino M, Perna A, Bianchi A, Oliva MS, et al. Euthyroid sick syndrome in hip fractures: valuation of vitamin D and parathyroid hormone axis. Injury. 2020;51(Suppl 3):S13–6 PMID: 31983423.

  11. Bigoni M, Turati M, Leone G, Caminita AD, D’Angelo F, Munegato D, et al. Internal fixation of intracapsular femoral neck fractures in elderly patients: mortality and reoperation rate. Aging Clin Exp Res. 2020;32:1173–8 PMID: 31175608.

  12. Crijns TJ, Janssen SJ, Davis JT, Ring D, Sanchez HB. Science of variation group. Reliability of the classification of proximal femur fractures: does clinical experience matter? Injury. 2018;49:819–23 PMID: 29549969.

  13. Zlowodzki M, Bhandari M, Keel M, Hanson BP, Schemitsch E. Perception of Garden’s classification for femoral neck fractures: an international survey of 298 orthopaedic trauma surgeons. Arch Orthop Trauma Surg. 2005;125:503–5 PMID: 16075274.

  14. Garden RS. Reduction and fixation of subcapital fractures of the femur. Orthop Clin North Am. 1974;5:683–712.

  15. Meinberg EG, Agel J, Roberts CS, Karam MD, Kellam JF. Fracture and dislocation classification Compendium-2018. J Orthop Trauma. 2018;32(Suppl 1):S1–170 PMID: 29256945.

  16. Caviglia HA, Osorio PQ, Comando D. Classification and diagnosis of intracapsular fractures of the proximal femur. Clin Orthop. 2002;399:17–27 PMID: 12011690.

  17. Fung W, Jonsson A, Buhren V, Bhandari M. Classifying intertrochanteric fractures of the proximal femur: does experience matter? Med Princ Pract Int J Kuwait Univ Health Sci Cent. 2007;16:198–202 PMID: 17409754.

  18. Masionis P, Uvarovas V, Mazarevičius G, Popov K, Venckus Š, Baužys K, et al. The reliability of a Garden, AO and simple II stage classifications for intracapsular hip fractures. Orthop Traumatol Surg Res OTSR. 2019;105:29–33 PMID: 30639032.

  19. Waldenström J. Fractures récentes du col femoral: traitement operatoire ou orthopédique. J Chir. 1924;24:129.

  20. Turgut A, Kumbaracı M, Kalenderer Ö, İlyas G, Bacaksız T, Karapınar L. Is surgeons’ experience important on intra- and inter-observer reliability of classifications used for adult femoral neck fracture? Acta Orthop Traumatol Turc. 2016;50:601–5 PMID: 27889406.

  21. Gašpar D, Crnković T, Durović D, Podsednik D, Slišurić F. AO group, AO subgroup, Garden and Pauwels classification systems of femoral neck fractures: are they reliable and reproducible? Med Glas Off Publ Med Assoc Zenica-Doboj Cant Bosnia Herzeg. 2012;9:243–7 PMID: 22926358.

  22. Van Embden D, Rhemrev SJ, Genelin F, Meylaerts SA, Roukema GR. The reliability of a simplified Garden classification for intracapsular hip fractures. Orthop Traumatol Surg Res OTSR. 2012;98:405–8 PMID: 22560590.

  23. Aggarwal A, Singh M, Aggarwal AN, Bhatt S. Assessment of interobserver variation in Garden classification and management of fresh intracapsular femoral neck fracture in adults. Chin J Traumatol Zhonghua Chuang Shang Za Zhi. 2014;17:99–102 PMID: 24698579.

  24. Chan G, Hughes K, Barakat A, Edres K, da Assuncao R, Page P, et al. Inter- and intra-observer reliability of the new AO/OTA classification of proximal femur fractures. Injury. 2020;10:067.

  25. Blundell CM, Parker MJ, Pryor GA, Hopkinson-Woolley J, Bhonsle SS. Assessment of the AO classification of intracapsular fractures of the proximal femur. J Bone Joint Surg Br. 1998;80:679–83 PMID: 9699837.

Acknowledgements

Not applicable.

About this supplement

This article has been published as part of BMC Musculoskeletal Disorders Volume 22 Supplement 2 2021: All about the hip. The full contents of the supplement are available at https://bmcmusculoskeletdisord.biomedcentral.com/articles/supplements/volume-22-supplement-2.

Funding

Publication costs are funded by the Orthopedic and Traumatology School of Università Cattolica del Sacro Cuore – Roma. The funders did not play any role in the design of the study, the collection, analysis, and interpretation of data, or the writing of the manuscript.

Author information

Authors and Affiliations

Authors

Contributions

GC created the study design, wrote part of the text, edited figures and tables and performed proof-reading. AS, GM and MM wrote part of the text; AS formatted the text. MSO selected the radiographic images from the database. RV contributed to the study design and performed the statistical analysis. SD, GC, MSO, GM, AS and MM analyzed and classified the selected radiographic images. AZ and OP supervised the interpretation of data and revised the study. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Alessandro Smimmo.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no potential conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Cazzato, G., Oliva, M.S., Masci, G. et al. Femoral neck fracture: the reliability of radiologic classifications. BMC Musculoskelet Disord 22, 1063 (2021). https://doi.org/10.1186/s12891-022-05007-3


Keywords

  • Hip fractures
  • Femoral neck fracture
  • Femoral fractures’ classification
  • Reliability