Developmental dysplasia of the hip (DDH) is a common pediatric orthopedic condition. It represents a broad spectrum of conditions, ranging from congenital dislocation of the hip to occult acetabular dysplasia. Left untreated, DDH can lead to long-term morbidity, including chronic pain, gait abnormalities, and degenerative arthritis. In patients diagnosed early, within the first 3 to 6 months of life, treatment is essentially conservative, involving a dynamic harness, and has been reported to give good clinical and ultrasonographic outcomes [3, 4]. Because early and accurate diagnosis is believed to be the most important factor for satisfactory treatment, hip ultrasonography (US) has for many years been the most commonly used diagnostic tool for DDH in early infancy [5, 6]. The most widely used DDH screening method was developed by Reinhard Graf in the early 1980s [7,8,9]. The Graf classification is based on several thresholds of the α and β angles, as summarized in Supplemental Table S1 (see Additional file 1).
However, there is still controversy concerning the methodology used in infant hip screening programs, such as the optimal screening time and the accuracy of the Graf classification [10,11,12]. One ultrasonographic study found that, among hips classified as Graf type IIa or worse during the first 3 days of life, only 9% remained abnormal and required treatment during the follow-up period. In the selective sonographic assessment of ‘at risk hips’ at 6 weeks, there was still a significant risk of overdiagnosis and overtreatment, with a positive predictive value of 20.5%. The high false-positive rate is also the major concern about universal DDH screening in many countries when costs and efficiency are considered. There are several reasons for the high false-positive rate of the Graf method in early DDH screening. First, the thresholds for the α and β angles ignore significant differences by race, age, sex, and side of the hip. The hip develops rapidly during the first 3 months of infancy, and there are notable differences between boys and girls and between left and right hips. Second, intraobserver and interobserver differences in measuring the α and β angles are of concern [14,15,16,17,18]. Combined with the static thresholds for Graf typing, this measurement variability means that the reported intraobserver and interobserver agreement on Graf types ranges from moderate to substantial and from fair to substantial, respectively [14,15,16, 19, 20]. In the following years, both the Graf method and the technology of US machines improved dramatically. Subtypes of type IIa hips (IIa+ and IIa-) were introduced to distinguish immature hips from suspected pathology in the first 3 months of life, reducing overtreatment. Checklists were also introduced to improve intraobserver and interobserver reproducibility [21, 22]. Nevertheless, the literature on selective vs. universal US screening remains divided regarding effectiveness, cost, and the possibility of overdiagnosis and overtreatment [23, 24]. A Cochrane review in 2013 concluded that neither US strategy had been demonstrated to improve clinical outcomes, including late-diagnosed DDH and surgery. However, an international interdisciplinary consensus published in 2019 reported strong agreement in favor of universal US screening. Different countries appear to hold different views on this issue, and the debate is far from settled.
Many studies have shown that hip characteristics differ among races and between boys and girls, and that the left hip is more commonly affected. The hip is also known to change rapidly in the first 3 months after birth. Yet the Graf method, which applies the same static thresholds to all infants, does not fully account for differences in race, sex, age, and side of the hip.
Z-scores express how many standard deviations (SD) a given measurement lies above (positive values) or below (negative values) the mean of a specific population. Dynamic reference ranges based on Z-scores are widely used for many clinical measurements, especially in fetuses and infants. In pediatric practice, there is the added dimension of somatic growth: a single reference range cannot be applied across children of different races, sizes, sexes, and ages. For these reasons, we wanted to test whether an age-, sex-, and side-specific Z-score-enhanced Graf method could control the high false-positive rate in DDH screening.
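To make the Z-score definition above concrete, the following is a minimal sketch of how a stratum-specific Z-score for an α angle could be computed. It is an illustration only, not the method used in this study: the `REFERENCE` table, its strata keys, and every mean and SD value in it are hypothetical placeholders, as are the names `Reference`, `z_score`, and `alpha_z`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Mean and SD of the alpha angle (degrees) within one stratum."""
    mean: float
    sd: float


# HYPOTHETICAL reference values keyed by (sex, side, age in weeks).
# These numbers are placeholders for illustration, not study data.
REFERENCE = {
    ("female", "left", 4): Reference(mean=60.0, sd=4.0),
    ("male", "left", 4): Reference(mean=62.0, sd=4.0),
}


def z_score(value: float, ref: Reference) -> float:
    """Number of SDs that `value` lies above (+) or below (-) the stratum mean."""
    if ref.sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - ref.mean) / ref.sd


def alpha_z(alpha_deg: float, sex: str, side: str, age_weeks: int) -> float:
    """Z-score of an alpha angle against its age-, sex-, and side-specific stratum."""
    return z_score(alpha_deg, REFERENCE[(sex, side, age_weeks)])


if __name__ == "__main__":
    # The same raw angle yields different Z-scores in different strata.
    print(alpha_z(56.0, "female", "left", 4))
    print(alpha_z(56.0, "male", "left", 4))
```

Under these placeholder values, the same raw α angle maps to a different Z-score depending on the infant's stratum, which is the property a static threshold cannot capture.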