The objective was to update the literature on the relative efficacy of different osteoporosis medications in preventing four types of osteoporosis-related fractures. Based on the combination of effect size and probability of being most efficacious, teriparatide, zoledronic acid and denosumab are consistently ranked highest for reducing non-vertebral and vertebral fractures, the two most common types of fractures.
Etidronate is also ranked high on probability of being most efficacious, but there are reservations about this result. First, etidronate does not have a statistically significant odds ratio versus placebo for non-vertebral fracture, yet it was ranked highest for being efficacious. The higher ranking may be due to a wide confidence interval that extends into a region of low odds ratios, which creates a favourable relative result over that region. This suggests a limitation of this analysis: the ranking may be reliable only when the odds ratios for the different drugs have intervals of similar width. A second caution with the results for etidronate is that the trials were small, resulting in small effect sizes, and were conducted prior to the year 2000. This suggests that there is a lack of current strong evidence for the efficacy of etidronate versus placebo. As a result of these two limitations, this analysis suggests that etidronate should not be considered among the most efficacious drugs based on current evidence.
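The effect of interval width on the ranking can be illustrated with a small simulation. This is a minimal sketch and not the model used in this analysis: the three drugs, their odds ratios and their standard deviations are hypothetical, and the function name `prob_best` is an assumption made for illustration. Each drug's log odds ratio is drawn from a normal distribution, and the probability of being most efficacious is estimated as the share of draws in which that drug has the lowest odds ratio.

```python
import math
import random


def prob_best(drugs, n_draws=20000, seed=0):
    """Estimate each drug's probability of having the lowest odds ratio.

    drugs maps a name to (mean log odds ratio, standard deviation).
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in drugs}
    for _ in range(n_draws):
        # One simulated log odds ratio per drug for this draw.
        draws = {name: rng.gauss(mu, sd) for name, (mu, sd) in drugs.items()}
        # The drug with the lowest odds ratio "wins" this draw.
        wins[min(draws, key=draws.get)] += 1
    return {name: count / n_draws for name, count in wins.items()}


# Hypothetical odds ratios versus placebo; drug C has the weakest point estimate.
narrow = {
    "A": (math.log(0.70), 0.10),
    "B": (math.log(0.75), 0.10),
    "C": (math.log(0.80), 0.10),
}
# Same point estimates, but drug C now has a much wider interval.
wide = dict(narrow, C=(math.log(0.80), 0.60))

p_narrow = prob_best(narrow)
p_wide = prob_best(wide)
print("P(C is best), narrow interval:", p_narrow["C"])
print("P(C is best), wide interval:  ", p_wide["C"])
```

Under these assumed inputs, widening the interval for drug C raises its probability of being ranked best even though its point estimate is unchanged and remains the weakest of the three.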
In addition, the number needed to treat analysis indicates that treating as few as 10 patients with teriparatide, zoledronic acid or denosumab will produce 1 fewer fracture than if the patients were on other drugs.
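For readers unfamiliar with the calculation, a number needed to treat can be derived from an odds ratio and the fracture risk on the comparator. The sketch below uses the standard conversion from odds to risk; the function name and the input values (an odds ratio of 0.5 and a comparator risk of 20%) are hypothetical and are not taken from this analysis.

```python
def nnt_from_odds_ratio(odds_ratio, control_risk):
    """Number needed to treat implied by an odds ratio and a comparator risk."""
    # Convert the comparator risk to odds, apply the odds ratio, convert back.
    treated_odds = odds_ratio * control_risk / (1.0 - control_risk)
    treated_risk = treated_odds / (1.0 + treated_odds)
    absolute_risk_reduction = control_risk - treated_risk
    if absolute_risk_reduction <= 0:
        raise ValueError("no risk reduction, so the NNT is undefined")
    return 1.0 / absolute_risk_reduction


# Hypothetical inputs for illustration only.
nnt = nnt_from_odds_ratio(odds_ratio=0.5, control_risk=0.20)
print(round(nnt, 2))
```

The result depends strongly on the assumed comparator risk, so the same odds ratio gives a larger number needed to treat in a lower-risk population.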
This work updates the most recent ITC analysis of osteoporosis medications, which looked at vertebral, hip and nonvertebral nonhip fractures for five drugs: zoledronic acid, alendronate, ibandronate, risedronate and etidronate. Based on that analysis, zoledronic acid had a 0.79 probability of being the most efficacious for vertebral fractures. In our analysis, teriparatide (0.40) and etidronate (0.40) had the highest probability of being the most efficacious. We included more studies for etidronate, alendronate and risedronate, and added denosumab, raloxifene, strontium and teriparatide. Similarly, the earlier work reported that zoledronic acid had the highest probability of preventing hip fractures, while our analysis indicates that the most efficacious drug is teriparatide (0.44), and that zoledronic acid (0.11), etidronate (0.19), denosumab (0.12) and alendronate (0.10) could also be the most efficacious treatment. Beyond the inclusion of different studies, one key difference was that we analyzed wrist fractures specifically, while the earlier work reported on nonvertebral nonhip fractures. We report that risedronate has a high probability of being most efficacious, similar to the earlier work, but we estimated that teriparatide has the highest probability of preventing wrist fractures (0.44).
The other objective of this analysis was to compare the results across two statistical methods. The first method was a Bayesian ITC analysis in WinBUGS, and the second was a classical Bucher analysis with ITC-specific software. The estimates differed only in the second decimal place when the results were statistically significant. However, there are key differences in the interpretation of the results. Based on the classical analysis, we generated confidence intervals around the odds ratio and provided a test of association. In the Bayesian analysis, we generated a posterior distribution and credible intervals for the true value of the odds ratio. In this analysis these values are similar, indicating that the priors used in the analysis were uninformative.
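The classical Bucher adjusted indirect comparison can be sketched in a few lines. On the log scale, the indirect odds ratio of drug A versus drug B is the difference between their log odds ratios versus the common comparator (placebo), and the variances add. The code below is an illustration under stated assumptions, not the ITC software used here: the function names are hypothetical, the input odds ratios and intervals are invented, and standard errors are recovered from 95% confidence intervals assuming normality on the log scale.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def bucher_indirect(or_a, ci_a, or_b, ci_b):
    """Indirect odds ratio of A versus B through a common comparator.

    or_a and or_b are odds ratios versus the common comparator;
    ci_a and ci_b are their (lower, upper) 95% confidence intervals.
    """
    log_or = math.log(or_a) - math.log(or_b)
    # Variances of the two independent comparisons add.
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    return (
        math.exp(log_or),
        math.exp(log_or - Z_95 * se),
        math.exp(log_or + Z_95 * se),
    )


# Hypothetical inputs: drug A and drug B, each versus placebo.
or_ab, lower_ab, upper_ab = bucher_indirect(0.50, (0.40, 0.625), 0.80, (0.64, 1.00))
print(or_ab, lower_ab, upper_ab)
```

Because the variances add, the indirect interval is always wider than either of the placebo-controlled intervals it is built from, which is one reason indirect comparisons between active drugs often fail to reach significance.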
The analysis is limited in that the results are based on ITC comparisons. However, a recent review of DTC and ITC results reported that, of 44 meta-analyses for which both ITC and DTC studies were available, the DTC estimate was similar to the ITC estimate for the same drugs and outcomes in all but 3 cases. Of the 3 cases where the results were statistically different, 2 had the relative clinical benefit in the same direction, while the third had differences in dosage regimen between the studies. This result was also reported by Bucher in 1997, where the ITC results were similar in direction to the DTC estimates. In addition, Bucher and Song both reported that the magnitude of the ITC results was larger between comparators than in DTC comparisons, and that the level of significance between comparators was lower in ITC than in DTC. In our ITC analysis, non-significant differences were estimated between drugs, but the true effect between drugs may be even smaller.
The other assessment of the strength of evidence in the indirect comparisons, beyond comparing the classical and Bayesian analyses, was to look at heterogeneity within drugs and across drugs. The heterogeneity between comparators and the heterogeneity within one comparator were small, with the exception of alendronate for wrist fractures. This heterogeneity was explained by two studies [28, 29] for wrist fractures. These studies did not contribute to heterogeneity in the meta-analyses of vertebral and non-vertebral fractures. However, one of these two studies was the longest study, with a duration of 4 years in low-risk patients, and the largest study for alendronate, while the other was a small single-centre study with a duration of 4 years in low-risk patients.
The interpretation of the heterogeneity, although not a major feature of this analysis, is an important factor for ITC analysis. Increased heterogeneity can be caused by differences in inclusion criteria or in study design, such as length of follow-up. These are also important factors to consider in the analysis of DTC studies. Three studies assessed the effect of patient characteristics to explain the level of heterogeneity in ITC analysis. In 2 studies [56, 57] no baseline variables were significant, while in the other study the year of the study and baseline risk affected heterogeneity. Both of these factors may also have affected heterogeneity if the studies had been randomized with an active comparator. In our analysis, we may not have had enough power to detect the impact of baseline characteristics because of the low number of studies for each drug. In addition, because of the high heterogeneity in the estimates of odds ratios for wrist fractures, the evidence for wrist fractures should be considered weak.
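Heterogeneity within a comparator is commonly summarized with Cochran's Q and the I² statistic. The text does not state which statistic was used in this analysis, so the following is only a sketch of one common approach, with a hypothetical function name and invented study-level log odds ratios and standard errors.

```python
def cochran_q_and_i2(log_ors, standard_errors):
    """Cochran's Q and I-squared for a set of study-level log odds ratios."""
    # Inverse-variance weights, as in a fixed-effect meta-analysis.
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I-squared is the share of variation beyond chance, floored at zero.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical inputs: three discordant studies, then three identical ones.
q_het, i2_het = cochran_q_and_i2([-1.0, 0.0, -0.2], [0.1, 0.1, 0.1])
q_hom, i2_hom = cochran_q_and_i2([-0.4, -0.4, -0.4], [0.1, 0.1, 0.1])
print(q_het, i2_het, i2_hom)
```

With only a few studies per drug, Q has low power and I² is imprecise, which is consistent with the caution above about detecting the impact of baseline characteristics.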
ITC is becoming a useful tool in the absence of DTC comparisons, and increasing the transparency of ITC analysis builds confidence in the evidence. In a review of 88 ITC analyses, many of the studies could have increased the believability of their results, but the missed elements would also be a concern in DTC analysis. These include incomplete searches or not assessing heterogeneity within a comparator. In 40 of the 88 analyses there were no specific searches for active comparison studies that would allow comparison with the ITC evidence. For osteoporosis, this search was conducted and we found no published meta-analysis of DTC evidence. In the future, stronger evidence may come from head-to-head studies, but this is unlikely because, based on this analysis, differences between comparators are not significant and such studies would require very large sample sizes. Alternatively, the evidence could come from pooling patient-level data to compare the effects directly, but this is unlikely because the data are proprietary, and such an analysis would diminish the benefits of randomization.