This study found that five scoring systems used to report outcomes of clubfoot treatment gave widely varying success rates (from 56 to 89% of cases) in a cohort with 3.5–5 years of follow-up. When compared with the standard of clinical assessment, missed referrals ranged from 7.4% (the Bangla and ACT scores) to 22.7% (the Roye score). The measurements assess different aspects of clubfoot correction, from parent-reported outcome measures (the Roye score) to scores that include physical assessment (the Bangla and ACT scores) and single measurements (plantigrade foot and evidence of recurrence). Success on all measures improves with the completion of casting and at least two years of bracing.
Comparison to previous studies
Few studies compare measurement tools in the same patients, which limits direct comparison with our findings. However, the success of treatment in this cohort is similar to that reported in other studies in sub-Saharan Africa (between 63 and 98% of cases) [9]. Non-adherence and surgical intervention, often defined as failure, are reported to vary from 7 to 61% and from 3 to 39.4%, respectively [15]. Ponseti and Laaveg [16] describe a scoring system that rates functional results as satisfactory in 88.5% of feet. Further studies report success of 89.3% using the Ponseti and Laaveg system [17]. That system requires a goniometer, and the tool was therefore not included in the evaluation of this cohort.
Use of outcome measures
The ease of use and the rate of incorrect classification of the tools used to measure success need to be considered when selecting an outcome measure. Single-item scales for the assessment of individual children (such as plantigrade foot or evidence of relapse) require no further calculation and may be easier to use in clinics; however, their simplicity may not allow a full assessment of success. Multi-item scales are difficult to transform into useful statistics without technology and are unlikely to be used routinely in clinics. This study found no clear agreement between the different outcome measurements in use.
All of the assessments used in this study have limitations. The Roye score has been validated in high-income settings. Parents in our study reported difficulty in answering the question “How often does your child have problems finding shoes that he or she likes?” because it was understood to relate to the availability of a variety of shoes. The Bangla score took the longest time to transform for statistical analysis. The acceptability and feasibility of the ACT score need to be studied in future research. The ACT score is likely easy to teach; however, this is unknown because the examiners were physiotherapists, and the time taken for other cadres of health workers to use the ACT tool is also unknown. With regard to the relapse score, Bhaskar et al. (2013) considered ankle dorsiflexion < 15 degrees with the knee in extension to be a grade IA relapse. This may explain why the relapse score is restrictive in defining a good outcome: an evaluation of 85 normal feet in children found that mean ankle dorsiflexion was 12.8 degrees with the knee in extension [18]. Dorsiflexion greater than 15 degrees may therefore be difficult to achieve.
Relationship between the outcome measures and clinical assessment
The Bangla and ACT tools were the most helpful in predicting the need for referral for further intervention (specialist opinion or further manipulation and casting). The five referrals that were missed with the ACT score were children who required review of a mobile curvature of the lateral border of the foot or of supination in swing phase, neither of which is assessed by the score. Despite this, the ACT tool demonstrated the best diagnostic accuracy for the need for referral for further intervention.
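For readers who wish to reproduce this kind of comparison, the sketch below shows one way to tabulate a score's referral flags against clinical assessment and derive sensitivity, specificity and a missed-referral rate. It is a minimal illustration, not the analysis used in this study: the names `ReferralTable` and `tabulate` are hypothetical, the ten children are invented data, and the missed-referral rate is assumed here to be the share of all assessed children whom the score failed to flag, which may differ from the denominator used above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ReferralTable:
    """2x2 table of a score's result against clinical assessment (reference standard)."""

    true_pos: int   # score flags referral and the clinician agrees
    false_pos: int  # score flags referral but the clinician does not
    false_neg: int  # score misses a referral that the clinician makes
    true_neg: int   # neither the score nor the clinician flags referral

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    @property
    def sensitivity(self) -> float:
        # Share of clinically referred children that the score also flags.
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        # Share of children not referred clinically that the score also clears.
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def missed_referral_rate(self) -> float:
        # Assumption: missed referrals expressed as a share of all children assessed.
        return self.false_neg / self.total


def tabulate(score_flags: Sequence[bool], clinical_flags: Sequence[bool]) -> ReferralTable:
    """Count agreement between score-based and clinical referral decisions."""
    if len(score_flags) != len(clinical_flags):
        raise ValueError("both sequences must describe the same children")
    tp = fp = fn = tn = 0
    for score, clinical in zip(score_flags, clinical_flags):
        if score and clinical:
            tp += 1
        elif score:
            fp += 1
        elif clinical:
            fn += 1
        else:
            tn += 1
    return ReferralTable(tp, fp, fn, tn)


# Hypothetical flags for ten children (illustrative only, not study data).
score_flags = [True, True, True, False, False, False, False, False, True, False]
clinical_flags = [True, True, True, True, False, False, False, False, False, False]

table = tabulate(score_flags, clinical_flags)
print(table.sensitivity, table.specificity, table.missed_referral_rate)
```

A score with a low missed-referral rate but low specificity would over-refer, so the counts are best read together rather than singly.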
Strengths and limitations of study
This study reports on five measurements of success in a cohort at 3.5–5 years from initial treatment. Repeat phone calls facilitated assessments when caregivers were initially unavailable. Two independent raters reduced the likelihood of reporting bias, and all outcome measures were verified against the reference standard. The threshold for diagnostic accuracy was based on previous studies and was defined prior to the study. The study also had limitations. No distinction was made between a clubfoot that may not have been fully corrected and a relapsed clubfoot, and all cases with elements of the deformity were classified with the relapse score. This is a potential source of bias that may underestimate the accuracy of the relapse score. The tools were chosen for ease of use in low-resource, high-volume clinics, and not all were initially developed to identify the need for referral for further intervention.
Implications for practice
Task shifting and task sharing between orthopaedic and non-specialised health workers in some clinics mean that outcome measures become even more important as teams expand. As older children are being treated with the principles of the Ponseti method [19], expert guidance on assessment and measurement in these cases is needed. The Roye score is overly optimistic about good outcomes, the Bangla score is restrictive in identifying a good outcome, and the ACT score aligns most closely with clinical examination. However, the Bangla, relapse and ACT scores closely agree on false negatives and have the least chance of missing recurrence; the Bangla score and the relapse score over-estimate referral needs compared with the ACT score.