Classification of distal radius fractures in children: good inter- and intraobserver reliability, which improves with clinical experience
© Randsborg and Sivertsen; licensee BioMed Central Ltd. 2012
- Received: 21 November 2011
- Accepted: 23 January 2012
- Published: 23 January 2012
We wanted to test the reliability of a commonly used classification of distal radius fractures in children.
A total of 105 consecutive fractures of the distal radius in children were rated on two occasions three months apart by 3 groups of doctors: 4 junior registrars, 4 senior registrars and 4 orthopedic consultants. The fractures were classified as buckle, greenstick, complete or physeal. Kappa statistics were used to analyze inter- and intraobserver reliability.
The kappa value for interobserver agreement at the first reading was 0.59 for the junior registrars, 0.63 for the senior registrars and 0.66 for the consultants. The mean kappa value for intraobserver reliability was 0.79 for the senior registrars, 0.74 for the consultants and 0.66 for the junior registrars.
We conclude that the classification tested in this study is reliable and reproducible when applied by raters experienced in fracture management. The reliability varies according to the experience of the raters. Experienced raters can verify the classification, and avoid unnecessary follow-up appointments.
- Distal Radius
- Distal Radius Fracture
- Interobserver Reliability
- Intraobserver Reliability
- Kappa Score
Distal radius fractures are the most common fractures in childhood. Most of these fractures are treated conservatively in a plaster cast, and complications are rare. Although these fractures are generally benign, they are monitored differently according to the stability of the fracture and whether the growth plate is injured. A clinically relevant classification must take these factors into account; otherwise it will not help in deciding the correct treatment, follow-up strategy and prognosis. Any fracture classification should also have a substantial degree of both inter- and intraobserver reliability. If not, treatment algorithms will be arbitrary, since different doctors will place the same fracture in different categories. If an unstable fracture is classified in a benign category with little or no follow-up, complications such as malunion can result. Placing stable fractures, such as buckle fractures, in categories for unstable fractures will cause more follow-up than necessary, which is costly for both patients and society. On the other hand, a fracture classification with high reliability will provide effective, predictable and safe treatment algorithms, and it will be possible to draw general conclusions from research based on that system.
Table 1. Reported reliability of different fracture classifications for distal radius fractures in adults
Studies listed: Kreder et al 1996; Ploegmakers et al 2007; Jin et al 2007; Andersen et al 1996; Flinkkilä et al 1998
The aim of this study is to evaluate the interobserver and intraobserver reliability of this commonly used classification of distal radius fractures in children. We also wanted to investigate to what degree experience influences the reliability of the classification. To the best of our knowledge this has not previously been done, although the classification has been used in several publications.
We designed this study to comply with the guidelines for reliability studies of fracture classification systems outlined by Audigé, Bhandari and Kellam. We included the first 105 consecutive distal radius fractures in children below the age of 16 years treated at our institution in 2007. Where indicated, information on follow-ups was retrieved from the electronic journal. The radiographs were identified via our computerized files and checked by the authors. Fractures not involving the metaphysis according to the AO Pediatric Comprehensive Classification of long bone fractures were considered to be diaphyseal fractures and were excluded from the study. No radiograph was excluded due to poor quality, to avoid selection bias. Standard anterior-posterior and lateral radiographs of the distal radius were reviewed independently by 12 different observers: four junior orthopedic registrars with a mean experience in fracture management of 14 months (6-28), four senior orthopedic registrars with a mean experience in fracture management of 41 months (30-49) and four experienced orthopedic surgeons (consultants). In our institution, pediatric distal radius fractures are generally managed by the junior registrars, who are supervised by the senior registrars. The orthopedic consultants are only occasionally involved in the management of these fractures.
Each fracture was classified into one of four possible categories: buckle (or torus), greenstick, complete or physeal fracture. The physeal fractures were not subclassified further. Before rating the radiographs, the observers were given schematics of the different fracture types and a written introduction to the differences between the categories. Further instructions to enhance results were not given. The radiographs were reviewed by the observers on two occasions 3 months apart. The raters were blinded to clinical information regarding the patients. The observers were not given any feedback between the observations, and the order of the fractures was randomly changed before the second rating.
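The kappa statistics used for this design can be illustrated with a minimal Python sketch. It is not the authors' analysis code (the reference list cites R and the irr package). It assumes Cohen's kappa for comparing one rater's two readings and Fleiss' kappa for agreement among several raters. The function names and the toy ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two ratings of the same cases (e.g. one rater, two readings)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    count_first, count_second = Counter(first), Counter(second)
    # Chance agreement from the marginal category frequencies of each reading.
    expected = sum(count_first[c] * count_second[c] for c in count_first) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] lists the category each rater gave case i."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every case needs the same number (>= 2) of raters")
    totals = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        totals.update(counts)
        # Proportion of rater pairs that agree on this case.
        agreeing_pairs = sum(k * (k - 1) for k in counts.values())
        agreement_sum += agreeing_pairs / (n_raters * (n_raters - 1))
    observed = agreement_sum / n_cases
    expected = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data, not taken from the study.
first_reading = ["buckle", "buckle", "greenstick", "complete", "physeal", "buckle"]
second_reading = ["buckle", "greenstick", "greenstick", "complete", "physeal", "buckle"]
panel = [
    ["buckle", "buckle", "buckle"],
    ["greenstick", "greenstick", "greenstick"],
    ["buckle", "buckle", "greenstick"],
    ["complete", "complete", "complete"],
]

intra = cohen_kappa(first_reading, second_reading)
inter = fleiss_kappa(panel)
print(f"intraobserver (Cohen): {intra:.3f}, interobserver (Fleiss): {inter:.3f}")
```

Both functions correct the raw agreement for the agreement expected by chance, which is why kappa is lower than the simple percentage of agreement.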
Interpretation of kappa values according to different authors (Landis and Koch; kappa scale from 0.00 to 1.00 in bands of 0.05)
Reliability of fracture classification of 105 consecutive pediatric distal radius fractures rated by 12 doctors with variable level of experience in fracture management.
Measures: agreement at first reading; two categories at first reading; three categories at first reading; four categories at first reading; inter-observer values at first reading; inter-observer values at second reading; intra-observer agreement (mean kappa); category-specific kappa values at first reading.
Rater groups: all 12 raters; 8 senior raters; 4 junior registrars; 4 senior registrars.
Intraobserver agreement for 12 raters with different experience in fracture management
Columns: number of months in practice; Cohen's kappa value; percentage of agreement.
Raters listed: 1-4 junior registrars; 5-8 senior registrars.
Distribution of fractures
Follow up of 65 buckle fractures
Columns: type of follow up; number of patients; number of clinical follow-ups; number of radiological examinations.
Types of follow up: no follow up scheduled; after 1 week only; after 1 week and at plaster removal; at plaster removal only.
Classification of 105 consecutive distal radius fractures by consensus among 12 raters, including age and gender distribution.
Variables reported: number of patients; number of boys; mean age in years (range).
Mean age in years (range), in table order: 11.0 (1.6 - 15.8); 12.0 (7.3 - 15.4); 13.7 (12.6 - 14.8); 12.2 (6.0 - 15.8); 11.4 (1.6 - 15.8).
The overall interobserver reliability of this fracture classification is better than most agreement values reported for fracture classification systems in adults. According to Landis and Koch, a kappa value of 0.66 is rated as substantial agreement. It is reasonable to believe that the reliability of the fracture classification will improve in the clinical setting, where information about the patient is available.
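The Landis and Koch interpretation mentioned above can be written as a small lookup. The sketch below uses their published bands (below 0 poor, 0.00-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect). Treating each upper bound as inclusive for values that fall between two-decimal bands is an assumption of this sketch, and the function name is hypothetical. The example values are the kappa values reported in the abstract.

```python
def landis_koch_label(kappa):
    """Return the Landis and Koch (1977) strength-of-agreement label for a kappa value."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    # Upper bounds are treated as inclusive (assumption of this sketch).
    for upper, label in (
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ):
        if kappa <= upper:
            return label
    return "almost perfect"


# Interobserver kappa values at the first reading, as reported in the abstract.
reported = {
    "junior registrars": 0.59,
    "senior registrars": 0.63,
    "consultants": 0.66,
}
for group, value in reported.items():
    print(f"{group}: kappa {value:.2f} -> {landis_koch_label(value)}")
```

On this scale the junior registrars' interobserver value falls just inside the moderate band, while the two more experienced groups fall in the substantial band.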
The number of categories will affect the reliability of any classification. This is obvious if we think of a classification with only one category. Adult fracture classifications often have many categories because of the various fracture patterns that can occur in brittle bone, such as intra-articular involvement and comminution. For example, the AO classification of distal radius fractures in adults has 3 types, 9 groups and 27 subgroups, and the reliability has been reported to be less than satisfactory by several authors [6–11] (Table 1). However, intra-articular fractures and severe comminution are rare features of pediatric fractures. It is therefore possible to reduce the number of categories and increase the reliability of the classification without loss of prognostic value. For example, the Gartland classification of supracondylar humerus fractures in children has only three categories, and has one of the highest reported kappa values for interobserver reliability.
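The effect of the number of categories on raw agreement can be shown with a short Python sketch. The ratings are hypothetical toy data, and the way categories are merged here (buckle and greenstick into one "incomplete" group, then everything into a single group) is an assumption for illustration, not necessarily how categories were combined in this study.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of cases on which two raters chose the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    same = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * same / len(rater_a)


def collapse(ratings, mapping):
    """Relabel ratings; categories missing from the mapping are kept unchanged."""
    return [mapping.get(r, r) for r in ratings]


# Hypothetical ratings of eight fractures by two raters.
rater_a = ["buckle", "greenstick", "buckle", "complete",
           "physeal", "greenstick", "buckle", "complete"]
rater_b = ["buckle", "buckle", "buckle", "complete",
           "complete", "greenstick", "greenstick", "complete"]

# Hypothetical merges: three categories, then a single category.
to_three = {"buckle": "incomplete", "greenstick": "incomplete"}
to_one = {c: "fracture" for c in ("buckle", "greenstick", "complete", "physeal")}

four = percent_agreement(rater_a, rater_b)
three = percent_agreement(collapse(rater_a, to_three), collapse(rater_b, to_three))
one = percent_agreement(collapse(rater_a, to_one), collapse(rater_b, to_one))
print(four, three, one)
```

Merging categories can only remove disagreements, never create them, so raw agreement rises as categories are combined and reaches 100% with a single category, at the cost of a classification that no longer distinguishes anything.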
There are very few fracture classifications for pediatric fractures compared to the vast array of classifications that exist for fractures in adults. The Arbeitsgemeinschaft für Osteosynthesefragen (AO) has recently proposed a comprehensive fracture classification system for pediatric fractures. This classification contains categories for fracture types that are unique to pediatric bone, such as bowing fractures and growth plate injuries. However, it does not distinguish between the buckle (torus) and the greenstick fracture of the distal radius. It is generally agreed that these two common pediatric fracture types are different entities which behave differently and need different treatment and follow-up [18, 24, 25, 36, 37]. In addition, the AO group has added ligamentous avulsion injuries of the wrist as a separate category. This is a very rare injury in children and was not identified when the AO classification system for pediatric fractures was validated.
The AO group reported a kappa value of 0.70 for metaphyseal fractures of the distal radius. However, in that study there were only two categories: complete fractures, and buckle and greenstick fractures categorized together. Epiphyseal fractures were analyzed separately. The authors defined the correct classification as that chosen by most raters, and then excluded the epiphyseal fractures when analyzing the reliability for metaphyseal fractures. This raises a few concerns. First, a fracture classification should include all possible fracture categories for the bone in question (here, the distal radius). When confronted with an injured wrist, the clinician does not know whether the physis is involved before the radiological examination, and raters often disagree on whether the fracture involves the physis. This is certainly the case in our study, as is demonstrated in Figure 2. If we excluded all the growth plate injuries as defined by most raters, there would still be raters who would categorize some of the remaining fractures as physeal. Second, buckle and greenstick fractures should be managed differently. If these fractures are placed in the same category, the classification will not offer helpful guidelines to the clinician. Third, it is important that the sample is representative of the study population, since the kappa statistic varies according to the prevalence of the categories under study. When the number of categories is reduced by excluding one type of fracture, the prevalence of the different fracture types in the sample changes compared to that of the population at risk, thereby changing the kappa statistic. It is therefore essential that the included fractures come from an unfiltered consecutive series to make sure the sample is representative of the population. This is especially important when examining the reliability of classification of distal radius fractures in children, since the distribution of categories is highly uneven, with buckle fractures representing the majority of cases.
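The dependence of kappa on category prevalence can be made concrete with a small worked example in Python. Both agreement tables below are hypothetical: two raters agree on 90 of 100 cases in each, but in the second table one category dominates, much as buckle fractures dominate a consecutive series. The function name is an assumption of this sketch.

```python
def kappa_from_table(table):
    """Cohen's kappa from a square agreement table (rows: rater A, columns: rater B)."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Chance agreement depends on how often each rater uses each category.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical tables, each with 90% observed agreement.
balanced = [[45, 5],
            [5, 45]]   # both categories equally common
skewed = [[85, 5],
          [5, 5]]      # one category makes up about 90% of the cases

kappa_balanced = kappa_from_table(balanced)
kappa_skewed = kappa_from_table(skewed)
print(f"balanced: {kappa_balanced:.2f}, skewed: {kappa_skewed:.2f}")
```

With identical observed agreement, kappa is 0.80 for the balanced table and about 0.44 for the skewed one, because chance agreement is much higher when one category dominates. This is why a kappa value is only comparable across studies whose samples have a similar distribution of categories.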
Our results demonstrate that the reliability of the fracture classification depends not only on the number of categories and the prevalence of the categories in the study population, but also on the experience of the raters. Ideally, a classification system should be simple and independent of the experience of the rater. However, the effect of experience on reliability has previously been described for other classification systems [2, 6, 40]. The effect of experience on this particular classification is noteworthy, since these fractures are considered benign and are generally treated by the youngest doctors. The best result at the first reading in our study was achieved by the orthopedic consultants. It is worth noting that two experienced consultants had lower intraobserver agreement than the senior registrars. The senior registrars have several years of experience in fracture management, and are involved in fracture classification on a daily basis. At our institution the consultants are in general not involved in fracture management of the distal radius in children, except occasionally while on call. It seems that both daily fracture management and general experience in orthopedics enhance the reliability.
Stable distal radius fractures in children are extensively monitored with both clinical and radiological follow-ups [18, 41]. In this series of 105 consecutive fractures, 65 fractures were by consensus defined as buckle fractures. These stable fractures were given a total of 72 clinical follow-up examinations and 34 further radiological examinations. These could have been avoided with more focus on the fracture classification and better supervision. The junior registrars had a statistically significantly lower kappa value for interobserver reliability than the more experienced raters. They placed fewer fractures in the buckle group, and rated more fractures as greenstick or physeal injuries. This generated more unnecessary follow-ups, but did not risk any adverse outcome. We conclude that junior registrars overdiagnose, and safeguard themselves by placing more fractures in categories that merit a follow-up. We encourage the junior registrars to ask for a second opinion. This can avoid an appointment in an overbooked fracture clinic; the child can stay in school and the parents do not have to take time off work to take the child to hospital.
Limitation of the study
All raters in this study were selected from one institution. This may limit the generalizability of the results to other institutions, thus reducing the external validity of the study. The type and amount of instruction each rater had received prior to enrollment in the study is unknown. However, at our institution no systematic instruction in classification is given to doctors treating these fractures, and we have no reason to believe that this is different at other institutions. Only one of the consultants was trained in pediatric orthopedics, and thus the findings may not be relevant for institutions where specialists in pediatric orthopedics are involved in outpatient fracture treatment.
We conclude that the classification tested in this study is reliable and reproducible when applied by raters experienced in fracture management. More focus on the different fracture categories and better supervision of our younger colleagues (who treat most of these fractures in the fracture clinic) will reduce the number of fractures that are considered in need of follow-up. This is supported by the results of the study, in which the more experienced doctors tended to classify better and the youngest doctors improved throughout the study period. We recommend this simple four-category classification for future research into the treatment and prognosis of distal radius fractures in children.
We thank statistician Jūratė Šaltytė Benth of HØKH Research Center, Akershus University Hospital, Norway, for help with the statistical analysis.
The study was supported by grants from Aase Bye and Trygve J. B. Hoffs Research Fund. The funding agency was not involved in the design or the conduct of the study; data analysis or interpretation; manuscript preparation or in the decision to submit the manuscript for publication.
- Brudvik C, Hove LM: Childhood fractures in Bergen, Norway: identifying high-risk groups and activities. J Pediatr Orthop. 2003, 23: 629-634. 10.1097/01241398-200309000-00010.
- Sidor ML, Zuckerman JD, Lyon T, Koval K, Cuomo F, Schoenberg N: The Neer classification system for proximal humeral fractures. An assessment of interobserver reliability and intraobserver reproducibility. J Bone Joint Surg Am. 1993, 75: 1745-1750.
- Siebenrock KA, Gerber C: The reproducibility of classification of fractures of the proximal end of the humerus. J Bone Joint Surg Am. 1993, 75: 1751-1755.
- Thomsen NO, Jensen CM, Skovgaard N, Pedersen MS, Pallesen P, Soe-Nielsen NH: Observer variation in the radiographic classification of fractures of the neck of the femur using Garden's system. Int Orthop. 1996, 20: 326-329. 10.1007/s002640050087.
- van ED, Rhemrev SJ, Meylaerts SA, Roukema GR: The comparison of two classifications for trochanteric femur fractures: the AO/ASIF classification and the Jensen classification. Injury. 2010, 41: 377-381. 10.1016/j.injury.2009.10.007.
- Kreder HJ, Hanel DP, McKee M, Jupiter J, McGillivary G, Swiontkowski MF: Consistency of AO fracture classification for the distal radius. J Bone Joint Surg Br. 1996, 78: 726-731.
- Ploegmakers JJ, Mader K, Pennig D, Verheyen CC: Four distal radial fracture classification systems tested amongst a large panel of Dutch trauma surgeons. Injury. 2007, 38: 1268-1272. 10.1016/j.injury.2007.03.032.
- Belloti JC, Tamaoki MJ, Franciozi CE, Santos JB, Balbachevsky D, Chap CE: Are distal radius fracture classifications reproducible? Intra and interobserver agreement. Sao Paulo Med J. 2008, 126: 180-185.
- Andersen DJ, Blair WF, Steyers CM, Adams BD, el-Khouri GY, Brandser EA: Classification of distal radius fractures: an analysis of interobserver reliability and intraobserver reproducibility. J Hand Surg Am. 1996, 21: 574-582. 10.1016/S0363-5023(96)80006-2.
- Flinkkilä T, Nikkola-Sihto A, Kaarela O, Paakko E, Raatikainen T: Poor interobserver reliability of AO classification of fractures of the distal radius. Additional computed tomography is of minor value. J Bone Joint Surg Br. 1998, 80: 670-672. 10.1302/0301-620X.80B4.8511.
- Jin WJ, Jiang LS, Shen L, Lu H, Cui YM, Zhou Q: The interobserver and intraobserver reliability of the Cooney classification of distal radius fractures between experienced orthopaedic surgeons. J Hand Surg Eur Vol. 2007, 32: 509-511. 10.1016/j.jhse.2007.03.002.
- Wilkins KE: Principles of fracture remodeling in children. Injury. 2005, 36 (Suppl 1): A3-11.
- Cannata G, De MF, Mancini F, Ippolito E: Physeal fractures of the distal radius and ulna: long-term prognosis. J Orthop Trauma. 2003, 17: 172-179. 10.1097/00005131-200303000-00002.
- Light TR, Ogden DA, Ogden JA: The anatomy of metaphyseal torus fractures. Clin Orthop Relat Res. 1984, 103-111.
- McLauchlan GJ, Cowan B, Annan IH, Robb JE: Management of completely displaced metaphyseal fractures of the distal radius in children. A prospective, randomised controlled trial. J Bone Joint Surg Br. 2002, 84: 413-417. 10.1302/0301-620X.84B3.11432.
- Miller BS, Taylor B, Widmann RF, Bae DS, Snyder BD, Waters PM: Cast immobilization versus percutaneous pin fixation of displaced distal radius fractures in children: a prospective, randomized study. J Pediatr Orthop. 2005, 25: 490-494. 10.1097/01.bpo.0000158780.52849.39.
- Noonan KJ, Price CT: Forearm and distal radius fractures in children. J Am Acad Orthop Surg. 1998, 6: 146-156.
- Randsborg PH, Sivertsen EA: Distal radius fractures in children: substantial difference in stability between buckle and greenstick fractures. Acta Orthop. 2009, 80: 585-589. 10.3109/17453670903316850.
- Davidson JS, Brown DJ, Barnes SN, Bruce CE: Simple treatment for torus fractures of the distal radius. J Bone Joint Surg Br. 2001, 83: 1173-1175. 10.1302/0301-620X.83B8.11451.
- Plint AC, Perry JJ, Tsang JL: Pediatric wrist buckle fractures. CJEM. 2004, 6: 397-401.
- Solan MC, Rees R, Daly K: Current management of torus fractures of the distal radius. Injury. 2002, 33: 503-505. 10.1016/S0020-1383(01)00198-X.
- Parfitt AM: The two faces of growth: benefits and risks to bone integrity. Osteoporos Int. 1994, 4: 382-398. 10.1007/BF01622201.
- Salter RB: Injuries of the epiphyseal plate. Instr Course Lect. 1992, 41: 351-359.
- Plint AC, Perry JJ, Correll R, Gaboury I, Lawton L: A randomized, controlled trial of removable splinting versus casting for wrist buckle fractures in children. Pediatrics. 2006, 117: 691-697. 10.1542/peds.2005-0801.
- West S, Andrews J, Bebbington A, Ennis O, Alderman P: Buckle fractures of the distal radius are safely treated in a soft bandage: a randomized prospective trial of bandage versus plaster cast. J Pediatr Orthop. 2005, 25: 322-325. 10.1097/01.bpo.0000152909.16045.38.
- Audige L, Bhandari M, Kellam J: How reliable are reliability studies of fracture classifications? A systematic review of their methodologies. Acta Orthop Scand. 2004, 75: 184-194. 10.1080/00016470412331294445.
- Slongo T, Audige L, Schlickewei W, Clavert JM, Hunter J: Development and validation of the AO pediatric comprehensive classification of long bone fractures by the Pediatric Expert Group of the AO Foundation in collaboration with AO Clinical Investigation and Documentation and the International Association for Pediatric Traumatology. J Pediatr Orthop. 2006, 26: 43-49. 10.1097/01.bpo.0000187989.64021.ml.
- R Development Core Team: R: A Language and Environment for Statistical Computing. 2009. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
- Gamer M, Lemon J, Fellows I: irr: Various Coefficients of Interrater Reliability and Agreement. R package version 0.82. 2009. http://cran.r-project.org/package=irr
- Fleiss JL: Measuring nominal scale agreement among many raters. Psychol Bull. 1971, 76: 378-382.
- Cohen J: A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960, 20: 37-46. 10.1177/001316446002000104.
- Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics. 1977, 33: 159-174. 10.2307/2529310.
- Svanholm H, Starklint H, Gundersen HJ, Fabricius J, Barlebo H, Olsen S: Reproducibility of histomorphologic diagnoses with special reference to the kappa statistic. APMIS. 1989, 97: 689-698. 10.1111/j.1699-0463.1989.tb00464.x.
- Barton KL, Kaminsky CK, Green DW, Shean CJ, Kautz SM, Skaggs DL: Reliability of a modified Gartland classification of supracondylar humerus fractures. J Pediatr Orthop. 2001, 21: 27-30. 10.1097/01241398-200101000-00007.
- Slongo TF, Audige L: Fracture and dislocation classification compendium for children: the AO pediatric comprehensive classification of long bone fractures (PCCF). J Orthop Trauma. 2007, 21: S135-S160. 10.1097/00005131-200711101-00020.
- Symons S, Rowsell M, Bhowal B, Dias JJ: Hospital versus home management of children with buckle fractures of the distal radius. A prospective, randomised trial. J Bone Joint Surg Br. 2001, 83: 556-560. 10.1302/0301-620X.83B4.11211.
- Gupta RP, Danielsson LG: Dorsally angulated solitary metaphyseal greenstick fractures in the distal radius: results after immobilization in pronated, neutral, and supinated position. J Pediatr Orthop. 1990, 10: 90-92.
- Slongo T, Audige L, Clavert JM, Lutz N, Frick S, Hunter J: The AO comprehensive classification of pediatric long-bone fractures: a web-based multicenter agreement study. J Pediatr Orthop. 2007, 27: 171-180. 10.1097/01.bpb.0000248569.43251.f9.
- Viera AJ, Garrett JM: Understanding interobserver agreement: the kappa statistic. Fam Med. 2005, 37: 360-363.
- Wiig O, Terjesen T, Svenningsen S: Inter-observer reliability of radiographic classifications and measurements in the assessment of Perthes' disease. Acta Orthop Scand. 2002, 73: 523-530. 10.1080/000164702321022794.
- Green JS, Williams SC, Finlay D, Harper WM: Distal forearm fractures in children: the role of radiographs during follow up. Injury. 1998, 29: 309-312.
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2474/13/6/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.