Effects of ground and joint reaction force exercise on lumbar spine and femoral neck bone mineral density in postmenopausal women: a meta-analysis of randomized controlled trials

Background Low bone mineral density (BMD) and subsequent fractures are a major public health problem in postmenopausal women. The purpose of this study was to use the aggregate data meta-analytic approach to examine the effects of ground (for example, walking) and/or joint reaction (for example, strength training) exercise on femoral neck (FN) and lumbar spine (LS) BMD in postmenopausal women. Methods The a priori inclusion criteria were: (1) randomized controlled trials, (2) exercise intervention ≥ 24 weeks, (3) comparative control group, (4) postmenopausal women, (5) participants not regularly active, i.e., less than 150 minutes of moderate intensity (3.0 to 5.9 metabolic equivalents) weight bearing endurance activity per week, less than 75 minutes of vigorous intensity (> 6.0 metabolic equivalents) weight bearing endurance activity per week, resistance training < 2 times per week, (6) published and unpublished studies in any language since January 1, 1989, (7) BMD data available at the FN and/or LS. Studies were located by searching six electronic databases, cross-referencing, hand searching and expert review. Dual selection of studies and data abstraction were performed. Hedges' standardized effect size (g) was calculated for each FN and LS BMD result and pooled using random-effects models. Z-score alpha values, 95% confidence intervals (CI) and number-needed-to-treat (NNT) were calculated for pooled results. Heterogeneity was examined using Q and I². Mixed-effects ANOVA and simple meta-regression were used to examine changes in FN and LS BMD according to selected categorical and continuous variables. Statistical significance was set at an alpha value ≤0.05 and a trend at >0.05 to ≤ 0.10.
Results Small, statistically significant exercise minus control group improvements were found for both FN (28 g’s, 1632 participants, g = 0.288, 95% CI = 0.102, 0.474, p = 0.002, Q = 90.5, p < 0.0001, I² = 70.1%, NNT = 6) and LS (28 g’s, 1504 participants, g = 0.179, 95% CI = −0.003, 0.361, p = 0.05, Q = 77.7, p < 0.0001, I² = 65.3%, NNT = 6) BMD. Clinically, it was estimated that the overall changes in FN and LS would reduce the 20-year relative risk of osteoporotic fracture at any site by approximately 11% and 10%, respectively. None of the mixed-effects ANOVA analyses were statistically significant. Statistically significant, or a trend for statistically significant, associations were observed for changes in FN and LS BMD and 20 different predictors. Conclusions The overall findings suggest that exercise may result in clinically relevant benefits to FN and LS BMD in postmenopausal women. Several of the observed associations appear worthy of further investigation in well-designed randomized controlled trials.


Background
Osteoporosis is a major public health problem affecting an estimated 200 million women worldwide [1]. Congruent with osteoporosis is an increased risk for osteoporosis-related fractures, especially in women during the postmenopausal years, generally considered to begin around 50 years of age [2]. Comparatively, the lifetime risk of an osteoporosis-related fracture in women is equivalent to the risk of developing cardiovascular disease [3]. The two most common sites for osteoporosis-related fractures are the hip and the spine, with an estimated worldwide prevalence of 1.1 million and 862,000, respectively, in women 50 years of age and older in the year 2000 [2]. In the United States, the total annual costs associated with osteoporosis-related fractures were more than $19 billion in 2005 with a predicted increase to $25.3 billion in 2025 [4]. The majority of the costs in 2005 were attributed to fractures of the hip (72%) followed by the spine (6%) [4].
Prevention of osteoporosis has focused on maximizing bone mineral density (BMD) during childhood and adolescence and maintaining BMD during adulthood [5,6]. Preventive measures include adequate calcium and vitamin D intake as well as avoiding cigarette smoking and excessive alcohol intake [5,6]. In addition, ground reaction (for example, jogging) and joint reaction (for example, strength training) force exercise has been recommended across the lifespan [5][6][7][8]. However, the results of previous randomized controlled exercise intervention trials have reached conflicting and underwhelming conclusions regarding the effects of ground reaction and/or joint reaction force exercise on BMD at the femoral neck (FN) and lumbar spine (LS) in postmenopausal women. For example, using the vote-counting approach, only 29% of the exercise versus control group differences in FN BMD have been reported as statistically significant and in the direction of benefit while even fewer (11%) have been reported at the LS. Based on these findings, one might reach the general conclusion that ground and joint reaction force exercise have little or no effect on FN and LS BMD. However, reliance on a vote-counting approach based on statistical significance can be extremely misleading since the absence of a statistically significant effect does not mean that an effect is absent [34]. In contrast, meta-analysis allows one to go beyond statistical significance and focus on the magnitude of effect. It is a quantitative approach for combining the results of studies. The strengths of meta-analysis include: (1) increased power, (2) improved estimates of effect size (ES), and (3) the potential to resolve disagreements between studies [35].
A modality-specific, joint reaction force meta-analysis that included studies published up to December 2004 found a statistically significant benefit of 0.006 g/cm² in LS BMD and a non-significant benefit of 0.010 g/cm² in FN BMD as a result of high-intensity resistance exercise in postmenopausal women [49]. Another modality-specific meta-analysis by the same research group which included studies published through December 2006 reported a non-statistically significant benefit in FN and LS BMD in postmenopausal women as a result of walking [50]. These findings suggest that walking, a lower impact, ground reaction force exercise, may have little benefit on FN and LS BMD in postmenopausal women. The same research group also published another meta-analysis that included studies published to 2008 [51]. When limited to randomized controlled trials and a random-effects model, a statistically significant benefit of 0.004 g/cm² was found for FN BMD with no statistically significant benefit observed at the LS as a result of exercise in postmenopausal women [51]. More recently, a Cochrane systematic review by Howe et al. reported a statistically significant exercise minus control group benefit of 0.85% in LS BMD but no significant change in FN BMD (−0.08%) as a result of joint and/or ground reaction force exercise in postmenopausal women [37]. However, this systematic review did not appear to be limited to studies in which participants had been previously participating in exercise levels below that currently recommended for bone health [8]. Consequently, the benefits of exercise could have been underestimated. Another meta-analysis reported a statistically significant benefit of 0.014 g/cm² and 0.012 g/cm², respectively, for both FN and LS BMD in females 60 years of age and older [48]. However, similar to the work of Howe et al. [37], participants did not appear to be limited to those who were participating in exercise levels below that currently recommended for bone health [8]. In addition, all studies were coded by one person, thereby increasing the risk for coding errors [47]. The discrepancy in findings for FN BMD between the Howe et al. [37] and Marques et al. [48] reviews may be explained by the fact that the latter meta-analysis limited studies to those in adults 60 years of age and older. This raises the possibility that older postmenopausal women may have more to gain from a regular exercise program. Finally, because the number of analyses aimed at trying to establish the association between selected covariates and changes in FN and LS BMD was limited for all of the previously described meta-analyses, potentially important covariates could have been missed.
A need exists for an updated and thorough meta-analysis on the effects of different ground and joint reaction force exercises, either alone or in combination, on FN and LS BMD in postmenopausal women not participating in exercise levels currently recommended for bone health [8]. Therefore, the purpose of this study was to use the aggregate data meta-analytic approach to determine the effects of ground and/or joint reaction force exercise on BMD at the FN and LS in postmenopausal women not participating in exercise levels currently recommended for bone health [8].

Study eligibility criteria
The a priori inclusion criteria for this meta-analysis were as follows: (1) randomized controlled trials, (2) exercise intervention ≥ 24 weeks, (3) comparative control group (attention control, non-intervention, etc.), (4) postmenopausal women, as defined by the authors, (5) participants not currently participating in any type of regular joint and/or ground reaction force exercise, as defined by the authors, (6) published and unpublished (master's theses and dissertations) studies in any language since January 1, 1989 and (7) BMD (relative value of bone mineral per measured bone area or volume) assessed at the FN and/or LS using dual-energy x-ray absorptiometry (DEXA) or dual-photon absorptiometry (DPA). Given the heterogeneity of reporting by the authors with respect to previous exercise in participants, we revised our inclusion criteria post hoc so that only participants who performed less than 150 minutes of moderate intensity (3.0 to 5.9 metabolic equivalents) weight bearing endurance activity per week, less than 75 minutes of vigorous intensity (> 6.0 metabolic equivalents) weight bearing endurance activity per week, resistance training <2 times per week, were included [7]. Studies were limited to those in which exercise was performed for at least 6 months since it has been suggested that one can generally expect exercise-induced changes in BMD to occur after approximately this length of time [55]. Resistance training studies were included only if lower body exercises were part of the exercise program. The year 1989 was chosen as the starting point for the inclusion of studies because it appeared to be the first year in which a randomized controlled intervention trial on exercise and BMD in postmenopausal women was conducted [56]. Studies were limited to those in which BMD at the FN and LS were assessed using either DPA or DEXA since they are/have been the most common instruments for assessing BMD in the clinical setting.
Only those groups that met the inclusion criteria were included from each study. Any studies not meeting all of the above criteria were excluded from the meta-analysis.

Data sources
Studies were retrieved using the following six electronic databases: (1) Medline (within EBSCOhost), (2) Embase, (3) Cochrane Central Register of Controlled Trials (CENTRAL), (4) Dissertation Abstracts Online (DAO), (5) CINAHL (within EBSCOhost), and (6) SPORTDiscus (within EBSCOhost). The last search was conducted in August 2011. All electronic searches were conducted by the second author with assistance from a Health Sciences librarian at West Virginia University. While the search strategies used varied according to the different databases searched, three key words, or forms of keywords, germane to all searches were 'exercise', 'bone' and 'randomized'. An example of the search strategy used for one of the electronic database searches (SPORTDiscus) is shown in Additional file 1. In addition to electronic searches, cross-referencing from retrieved studies and previous review articles, both systematic and narrative, was performed. Furthermore, hand searches of selected journals were conducted.

Study selection
All studies were selected by the first two authors, independent of each other. Disagreements regarding the final list of studies to include were resolved by consensus. If consensus could not be reached, the third author acted as an arbitrator. After an initial list of included studies was developed, the third author reviewed the list for completeness. All included studies as well as a list of excluded studies, including reasons for exclusion, were stored in Reference Manager (version 12.0.1) [57].

Data abstraction
Prior to data abstraction, a detailed codebook that could hold more than 245 items per study was developed by all three members of the research team in Microsoft Excel 2007 [58]. The major categories of variables that were coded included: (1) study characteristics, (2) subject characteristics, (3) exercise program characteristics, (4) primary outcomes and (5) secondary outcomes. The primary outcomes for this study were BMD at the FN and LS. Secondary outcomes included other measures of BMD (Ward's triangle, total hip, trochanteric, intertrochanteric, whole body, radius) as well as number of fractures, aerobic fitness, dynamic and static balance, body weight, body mass index (BMI), lean body mass (LBM), fat mass, percent body fat, upper and lower body muscular strength, and calcium and vitamin D intake. Missing primary outcome data were requested from the author(s). Multiple publication bias was avoided by only including data from the most recently published study.
As part of the coding process, the effective load rating for the exercise intervention from each study was calculated using a recently developed, age-adjusted formula [59]. This included the frequency of exercise per week along with the effective load rating, calculated as the product of peak vertical ground reaction force and the rate of force application [59]. Given the multiple types of exercises used in many of the studies, it was not possible to calculate effective load ratings specific to each activity within each study. Therefore, the broad categories recommended by previous work were used [59]. These included numerical effective load ratings equivalent to low (walking, etc.), moderate (tennis, etc.) and high (jumping, etc.) forces [59]. Effective load ratings were also provided for strength training [59]. All studies were coded by the first two authors, independent of each other. They then met and reviewed every entry for accuracy and consistency. Discrepancies were resolved by consensus. If consensus could not be reached, the third author served as an arbitrator.

Risk of bias
The Cochrane Collaboration risk of bias instrument was used to assess bias across five categories: (1) sequence generation, (2) allocation concealment, (3) blinding to group assignment, (4) incomplete outcome data and (5) incomplete outcome reporting [60]. Each item was classified as having either a high, low, or unclear risk of bias [60]. Assessment for risk of bias was limited to the primary outcomes of interest, i.e., FN and LS BMD. Given the objective nature of BMD assessment, all studies were considered to be at a low risk of bias with respect to blinding unless the study reported a reason to conclude otherwise. For incomplete outcome reporting, studies were considered to be at an unclear risk of bias if they did not report a study protocol identification number to confirm assessed outcomes. No study was excluded based on the results of the risk of bias assessment [61]. All assessments were performed by the first two authors, independent of each other. Both authors then met and reviewed every item for agreement. Disagreements were resolved by consensus.

Statistical analysis
Calculation of effect sizes for primary and secondary outcomes from each study
Given the different methods of reporting results for primary outcomes, i.e., FN and LS BMD, the standardized mean difference effect size (g), adjusted for small sample bias, was calculated from each study in order to create a common metric for the pooling of findings [62]. Since all studies were parallel, randomized controlled trials, the g for each outcome from each study was calculated as the difference in change scores between the exercise and control groups divided by the pooled SD of the change scores [62]. For studies in which change outcome SDs for the exercise and control groups were not reported, these were estimated for the exercise and control groups using pre- and post-intervention means and SDs according to the approach of Follmann et al. [63]. For studies that did not allow for such calculations using the aforementioned methods, g was calculated using the reported 95% confidence intervals (95% CIs). After calculating g from each study, its variance was estimated using previously developed procedures [62]. The beneficial effects of exercise on FN and LS BMD were denoted by a positive g.
Secondary outcomes from each study were calculated using either g (Ward's triangle, total hip, trochanteric, whole body, radius, calcaneus, aerobic fitness, dynamic and static balance, upper and lower body muscular strength) or the original metric (body weight in kilograms, BMI in kilograms per meter squared, LBM in kilograms, fat mass in kilograms and percent of body weight, calcium intake in milligrams, vitamin D intake in micrograms).
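The calculation of g from change scores can be illustrated with a minimal sketch. It assumes the commonly used small-sample correction factor J = 1 − 3/(4df − 1) and the usual large-sample variance approximation; the exact procedures of reference [62] may differ in detail. The function name `hedges_g` and all input values are hypothetical and do not come from any included trial.

```python
import math


def hedges_g(change_ex, change_ctl, sd_ex, sd_ctl, n_ex, n_ctl):
    """Return (g, variance of g) for an exercise-minus-control change-score contrast.

    Assumed formulas (not confirmed against the paper's reference [62]):
    pooled SD of change scores, correction J = 1 - 3 / (4 * df - 1).
    """
    df = n_ex + n_ctl - 2
    # Pooled SD of the change scores across both groups.
    sd_pooled = math.sqrt(((n_ex - 1) * sd_ex ** 2 + (n_ctl - 1) * sd_ctl ** 2) / df)
    # Uncorrected standardized mean difference; positive values favor exercise.
    d = (change_ex - change_ctl) / sd_pooled
    # Small-sample bias correction.
    j = 1 - 3 / (4 * df - 1)
    # Large-sample approximation to the variance of d, then rescaled for g.
    var_d = (n_ex + n_ctl) / (n_ex * n_ctl) + d ** 2 / (2 * (n_ex + n_ctl))
    return j * d, j ** 2 * var_d


# Hypothetical BMD changes in g/cm^2 for two groups of 20 participants each.
g, var_g = hedges_g(0.010, -0.005, 0.030, 0.030, 20, 20)
print(f"g = {g:.3f}, variance = {var_g:.4f}")
```

With equal group sizes of 20, the correction shrinks the uncorrected difference of 0.50 SD by about 2%.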

Pooled estimates for FN and LS BMD
Random-effects, method-of-moments models that incorporate heterogeneity into the overall estimate were used to pool results for FN and LS BMD as well as secondary outcomes from each study [64]. Multiple groups from the same study were analyzed independently and were also collapsed so that only one ES represented each outcome from each study. For the one study that included both per-protocol and intention-to-treat analyses, the more conservative intention-to-treat results were used [10]. While the same study assessed LS BMD at both the L1-L4 and L2-L4 sites [10], data are reported using the L1-L4 sites based on the International Society for Clinical Densitometry 2007 Position Stand recommending that L1-L4 be used for LS BMD measurement [65]. A z-score two-tailed alpha value of ≤0.05 was considered to be statistically significant. Alpha values >0.05 but ≤ 0.10 were considered as a trend. To determine the precision of these estimates, two-tailed 95% confidence intervals (CIs) were also calculated. Analysis of secondary outcomes was considered exploratory because they were not part of the inclusion criteria, and thus, may represent a biased sample.
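A minimal sketch of a random-effects, method-of-moments pooling step is given here for illustration. It assumes the DerSimonian and Laird estimator of the between-study variance, which is the usual method-of-moments choice but is not confirmed as the exact procedure of reference [64]. The function name `pool_random_effects` and the three effect sizes are hypothetical.

```python
import math
from statistics import NormalDist


def pool_random_effects(effects, variances):
    """Pool effect sizes with a method-of-moments (DerSimonian-Laird) model."""
    k = len(effects)
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    z = pooled / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed alpha value
    crit = NormalDist().inv_cdf(0.975)      # about 1.96 for a 95% CI
    return {"g": pooled, "se": se, "ci": (pooled - crit * se, pooled + crit * se),
            "z": z, "p": p, "q": q, "tau2": tau2}


# Three hypothetical g's with equal variances (illustration only).
result = pool_random_effects([0.0, 0.3, 0.9], [0.04, 0.04, 0.04])
print(result)
```

Because the heterogeneity estimate is added to every within-study variance, the confidence interval widens as the between-study variance grows, which is the sense in which the model incorporates heterogeneity into the overall estimate.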
In terms of magnitude, values for those outcomes in which g was used may be classified as either trivial (<0.20), small (≥0.20 to <0.50), medium (≥0.50 to <0.80), or large (≥0.80) [66]. A g of 0.20, for example, means that exercise would result in a 0.20 SD benefit over those who did not exercise. Given that the interpretation of g can be difficult with respect to clinical and practical relevance [67], the number needed to treat (NNT) was estimated for FN and LS BMD from pooled g's using procedures described by Kraemer and Kupfer [68]. For continuous data, the event is the increase in BMD of magnitude g. In addition, the NNT was used to provide a gross estimate of the number of US women 50 years of age and older who could achieve benefit in FN and LS BMD by initiating and maintaining a regular exercise program. This estimate was based on US Census Data for the number of women 50 years of age and older in the US (53,410,602) [69] and Healthy People 2020 Objective PA-2.4 for increasing by 10% the number of adults who meet current physical activity guidelines for aerobic and muscle-strengthening activity [70]. Based on the most recently available physical activity estimates for US adult females, this means an increase in physical activity from 14.9% to approximately 16.4%, a 1.49 percentage point increase [71].
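The conversion from a pooled g to an NNT and then to a population estimate can be sketched as follows. The sketch assumes the Kraemer and Kupfer conversion takes the form AUC = Φ(g/√2), success rate difference = 2·AUC − 1 and NNT = 1/success rate difference; this is a reading of reference [68], not a procedure spelled out in the text. The function name `nnt_from_g` is hypothetical, and the function applies only to positive values of g.

```python
from statistics import NormalDist


def nnt_from_g(g):
    """Convert a positive standardized effect size to a number needed to treat.

    Assumed conversion: AUC = Phi(g / sqrt(2)), SRD = 2 * AUC - 1, NNT = 1 / SRD.
    """
    auc = NormalDist().cdf(g / 2 ** 0.5)  # probability of a better outcome with exercise
    srd = 2 * auc - 1                     # success rate difference
    return 1 / srd


# Pooled FN result reported in this meta-analysis.
nnt_fn = nnt_from_g(0.288)

# Population inputs taken from the text.
women_50_plus = 53_410_602
increase_in_active = 0.0149  # 1.49 percentage point increase in physical activity
newly_active = women_50_plus * increase_in_active
benefiting = newly_active / nnt_fn

print(f"NNT = {nnt_fn:.2f}, newly active = {newly_active:,.0f}, benefiting = {benefiting:,.0f}")
```

Under these assumptions the FN value rounds to an NNT of 6, and dividing the roughly 796,000 newly active women by the unrounded NNT gives an estimate near the reported figure; the exact number depends on rounding and on details not stated in the text.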

Stability and validity of changes in g for FN and LS BMD
Heterogeneity of results between studies was examined using Q as well as an extension of the Q statistic, I² [72]. Statistical significance for Q was set at an alpha value of ≤0.10. For I², values of 25% to <50%, 50% to <75%, and ≥75% may be considered to represent small, medium, and large amounts of inconsistency, respectively [72]. To determine treatment effects in a new trial, 95% prediction intervals were also calculated [73,74].
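The relationship between Q and I² can be checked with a short sketch. It assumes the standard definition I² = 100% × (Q − df)/Q, truncated at zero, with df equal to the number of g's minus one; the function name `i_squared` is hypothetical. The Q values are those reported for the 28 g's at each site.

```python
def i_squared(q, df):
    """Percentage of total variation attributable to between-study inconsistency."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# 28 g's per site, so 27 degrees of freedom (assumed).
i2_fn = i_squared(90.5, 27)  # FN BMD, Q = 90.5
i2_ls = i_squared(77.7, 27)  # LS BMD, Q = 77.7

print(f"FN I2 = {i2_fn:.1f}%, LS I2 = {i2_ls:.1f}%")
```

Both values fall in the 50% to <75% band, consistent with the reported 70.1% and 65.3% to within rounding of Q.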
Publication bias was examined using the trim and fill approach of Duval and Tweedie [75]. Potential publication bias was considered noteworthy if a statistically significant finding was no longer significant after imputing potentially missing studies.
In order to examine the effects of each g from each study on the overall findings, results were analyzed with each study deleted from the model once. In addition, standardized residuals ≥ 3.0 were considered as outliers but not arbitrarily deleted from the model. Cumulative meta-analysis, ranked by year, was used to examine the accumulation of evidence over time on FN and LS BMD [76].

Moderator analysis for FN and LS BMD
Between-group differences (Qb) in FN and LS BMD for categorical variables were examined using mixed-effects ANOVA-like models for meta-analysis [77]. This consisted of a random-effects model for combining studies within each subgroup and a fixed-effect model across subgroups [77]. Study-to-study variance (tau-squared) was considered not equal for all subgroups. This value was computed within subgroups but not pooled across subgroups. Planned categorical variables to examine a priori and in which each category had at least 3 g's included: country in which the study was conducted (USA, other), type of control group (non-intervention, other), matching procedures (yes, no), risk of bias assessment (sequence generation, allocation concealment, blinding, incomplete outcome data, outcome reporting bias according to low, high or unclear risk), type of analysis (per-protocol, intention-to-treat), provision of sample size estimates (yes, no), external funding for the study (yes, no), adverse events (yes, no), whether participants were allowed or required to have osteoporosis, whether they were allowed to be current cigarette smokers and/or consume alcohol (yes, no), changes in exercise habits beyond the exercise intervention (increase, decrease, no change), no prior exercise allowed versus some prior exercise but less than that recommended by the American College of Sports Medicine (yes, no) [8], whether calcium and/or vitamin D supplements were given during the study (yes, no), type of exercise (aerobic, strength, both), exercise delivery (supervised, unsupervised, both), type of reaction forces (ground, joint, both) and instrumentation (Hologic, Lunar). The two-tailed alpha value for a statistically significant difference between groups (Qb) was set at p ≤ 0.05 with values >0.05 but ≤0.10 considered as a trend. All moderator analyses were considered exploratory [78].

Meta-regression for FN and LS BMD
Simple mixed-effects, method of moments meta-regression was used to examine the potential association between changes in FN and LS BMD and continuous variables with at least 3 g's [77]. Because of expected missing data for different variables from different studies, only simple meta-regression was planned and performed. Potential predictor variables, established a priori, included year of publication, percentage of dropouts, age in years and years postmenopausal. For exercise training, variables for aerobic-only groups included length (weeks), frequency (days per week), intensity, expressed as a percentage of maximum oxygen consumption (%VO2max), percentage of maximal heart rate (MHR) or heart rate reserve (HRR), duration (minutes per session), minutes of training per week and compliance, defined as the percentage of exercise sessions attended. For strength training only groups, variables included: length (weeks), frequency (days per week), intensity, expressed as a percentage of one-repetition maximum (% 1RM), number of sets, repetitions and exercises, rest between sets (seconds) and compliance (%). For those groups that performed both aerobic and strength training concurrently, variables included: length in weeks, frequency (days per week) and percent compliance. Other potential predictors included: load ratings and baseline BMD as well as changes in aerobic fitness, dynamic and static balance, calcium and vitamin D intake, lower and upper body strength, BMI in kg/m², body weight, LBM, percent body fat and fat mass. The alpha value for a statistically significant association was set at ≤0.05. Alpha values >0.05 but ≤0.10 were considered as a trend for an association. All meta-regression analyses were considered exploratory [78].
The dropout rate ranged from 0% to 43% for the 30 exercise groups for which data were available (mean ± SD = 17 ± 12%, Mdn = 12%) and 0% to 27% for the 24 control groups for which data were available (mean ± SD = 13 ± 7%, Mdn = 15%). Twelve studies (52%) provided one or more of the following reasons for participants dropping out or for the investigative team to drop participants from the study: (1) personal health problems apparently unrelated to the intervention [13,16,17,26,27,29,30,33], (2) time [14,25,30], (3) lack of compliance to the exercise intervention [10,11], (4) personal issues not related to one's health [11,13,26,27,33], (5) lack of interest [26] and (6) moved [30]. Five studies (20%) reported that one or more participants experienced musculoskeletal pain and/or minor musculoskeletal injuries as a result of the exercise intervention [9,18,24,29,30]. For the other studies, complete data were not available regarding any possible pain and/or injuries as a result of the interventions. No serious adverse events were reported.
Characteristics of the exercise programs from each group and each study are described in Additional file 2. As can be seen, the exercise interventions varied widely. Fourteen groups (40%) participated in exercise interventions that focused on joint reaction forces (for example, strength training) while 12 (34%) focused on ground reaction forces (for example, aerobic exercises such as walking and jumping). Another nine groups (26%) included exercises that provided both joint and ground reaction forces. With the exception of four groups (11%) that performed either jumping or agility training, the remaining 31 (89%) focused on aerobic and/or strength training exercises. The load rating for the 28 groups for which data were available for calculation ranged from 9.4 to 340.5 (mean ± SD = 57.3 ± 117.7, Mdn = 10). The length of training across all groups ranged from 24 to 104 weeks (mean ± SD = 50.7 ± 23.3, Mdn = 52). A group summary of the characteristics for those studies that included aerobic and/or strength training is shown in Table 2.

Risk of bias assessment
Risk of bias results are shown in Figure 2. As can be seen, the majority of studies were considered to be at low risk with respect to sequence generation and blinding and unclear risk for allocation concealment and incomplete outcome reporting. Approximately half of the studies were considered to be at either low or unclear risk for incomplete outcome data.

Primary outcomes
FN BMD
Overall, there was a statistically significant benefit of ground and/or joint reaction force exercise on FN BMD (Table 3, Figure 3). In addition, non-overlapping CIs were observed. The NNT was 6 with an estimated 127,968 postmenopausal US women experiencing [...] Figure 5). Moderator analysis for changes in FN BMD is shown in Additional file 3. As can be seen, no statistically significant between-group differences (Qb) were found for those a priori comparisons for which sufficient data were available.
Meta-regression analyses for changes in FN BMD are shown in Additional file 4. As can be seen, there was a statistically significant association between increases in FN BMD and decreased compliance (combined aerobic and strength training groups only), decreases in BMI, decreases in body weight and decreases in percent body fat. A trend for a statistically significant association was observed for increases in FN BMD and increases in intensity (strength only), increased compliance (strength training group only) and increases in static balance.

LS BMD
Overall, there was a statistically significant benefit in LS BMD but slightly overlapping 95% CIs (Table 3, Figure 6). The NNT was 6 with an estimated 80,219 postmenopausal US women maintaining and/or increasing their LS BMD if they began and maintained a regular exercise program. A moderate and statistically significant amount of heterogeneity was observed as well as overlapping prediction intervals. No adjustment for publication bias was needed. With the exception of one study [11], an outlier, results remained statistically significant or there was a trend for statistical significance when each study was deleted from the model once (Figure 7). The difference in g between the largest and smallest values was 0.084 (41%) when each study was deleted. With the one outlier deleted from the model, the alpha value for g increased to 0.12 and heterogeneity, while still statistically significant (Q = 42.2, p = 0.02), was reduced to 48.5%. The benefits in LS BMD remained statistically significant when data were collapsed so that only one g represented each study (g = 0.231, 95% CI = 0.026, 0.435, p = 0.03, Q = 71.1, p < 0.0001, I² = 71.9%). Cumulative meta-analysis, ranked by year, demonstrated that results have been statistically significant, or there has been a trend for statistical significance, since 2009 (Figure 8).
Moderator analysis for changes in LS BMD is shown in Additional file 3. As can be seen, no statistically significant between-group differences (Qb) were found for those a priori comparisons for which sufficient data were available.
Meta-regression analyses for changes in LS BMD are shown in Additional file 4. As shown, there was a statistically significant association between increases in LS BMD and older age, greater number of years postmenopausal, fewer minutes of training per session (aerobic groups only), fewer minutes of training per week, greater intensity of training (strength only), increased compliance (strength only), decreased compliance (combined aerobic and strength training only), increases in static balance, decreases in BMI, body weight and percent body fat. A trend for a statistically significant association was found between increases in LS BMD and smaller increases in aerobic fitness as well as increases in lean body mass.

Secondary outcomes
Changes in secondary outcomes are shown in Table 3. As can be seen there was a statistically significant benefit in BMD at the total hip, trochanteric and intertrochanteric regions. A non-significant and small to nil amount of heterogeneity was observed for all three outcomes. In addition, non-overlapping prediction intervals were observed for the trochanteric region. Furthermore, large, statistically significant improvements as well as statistically significant and large amounts of heterogeneity were found for aerobic fitness, dynamic and static balance. For body composition, a trend for statistically significant increases in LBM along with a statistically significant and moderate amount of heterogeneity was observed. A statistically significant decrease as well as a statistically significant and moderate amount of heterogeneity was also observed for percent body fat. For both upper and lower body strength, large, statistically significant increases were observed as well as large and statistically significant amounts of heterogeneity. Insufficient data were available to examine differences in fractures between the exercise and control groups.

Discussion
The purpose of this study was to use the aggregate data meta-analytic approach to determine the effects of ground and/or joint reaction force exercise on BMD at the FN and LS in postmenopausal women participating in exercise levels below that currently recommended for bone health [8]. The overall results suggest that ground and joint reaction force exercise may result in clinically important benefits in FN and LS BMD, with results more convincing for FN BMD. These findings are similar to those from three [48,51,53] of four [37,48,51,53] previous meta-analyses for FN BMD and four [37,39,48,53] of five [37,39,48,51,53] previous meta-analyses for LS BMD, all of which included both ground and joint reaction force exercises from randomized controlled trials in postmenopausal women. The overall findings of the current meta-analysis were further supported by the robustness of results when data were collapsed so that only one g represented each study as well as when examined for publication bias. When each study was deleted from the model once, results remained statistically significant for FN BMD across all deletions but were no longer statistically significant for LS BMD (p = 0.12) when one study was deleted from the model [11]. From a stability perspective, the statistical significance of findings has been consistent over a longer period of time for BMD at the FN (2000) versus LS (2009). Thus, the changes in BMD appear to be more convincing for FN versus LS BMD. This may have to do with the possibility that the exercise protocols employed were more specific to the FN versus LS.
While random-effects models that incorporate heterogeneity into the analysis were used, it is still important to point out that heterogeneity was observed for both FN and LS BMD. The existence of heterogeneity in meta-analysis is not only common [79], but also important, as there is no need to combine studies exactly alike since their findings, within statistical error, would be the same [80]. In addition, prediction intervals for estimating the expected results of a new trial included zero for both FN and LS BMD. However, these values should not be confused with confidence intervals since prediction intervals are based on a random mean effect while confidence intervals are not [73]. Nevertheless, these prediction intervals may be beneficial for future researchers interested in conducting randomized controlled intervention trials addressing the effects of ground and/or joint reaction force exercise on FN and LS BMD in postmenopausal women.
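As a rough check on the heterogeneity discussed above, I2 can be recovered from Q with the standard relation I2 = max(0, (Q - df) / Q) x 100. The sketch below uses the Q values and number of effect sizes reported in the abstract of this article (Q = 90.5 for FN, Q = 77.7 for LS, 28 g's each). It assumes df equals the number of effect sizes minus one, and the function name is illustrative.

```python
def i_squared(q, k):
    """I^2 as a percentage, from Cochran's Q and the number of effect sizes k.

    Assumes df = k - 1; negative values are truncated to zero.
    """
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

fn_i2 = i_squared(90.5, 28)   # femoral neck
ls_i2 = i_squared(77.7, 28)   # lumbar spine

print(f"FN I2 = {fn_i2:.1f}%, LS I2 = {ls_i2:.1f}%")
```

Because the published Q values are rounded, the recomputed percentages may differ from the reported 70.1% and 65.3% in the last decimal place.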
While the changes in BMD might be considered small at the FN and trivial at the LS, they appear to be clinically important. For example, based on previous prediction models [81], the exercise-induced changes in BMD observed at the FN and LS in the current meta-analysis would reduce the 20-year relative risk of osteoporotic fracture at any site by approximately 11% and 10%, respectively. However, the observed benefits of exercise on FN (g = 0.29) and LS (g = 0.18) BMD in the current meta-analysis were smaller than those previously reported for pharmacologic interventions (alendronate, calcitonin, etidronate, hormone therapy, raloxifene, risedronate) at both the hip (range of g = 0.64 to 5.74) and LS (range of g = 0.90 to 8.90) [82]. The exercise-induced benefits on FN and LS BMD also appear to be similar to or smaller than those observed for calcium and vitamin D supplementation (g for calcium = 0.45 at the hip and 1.57 at the LS; g for vitamin D = 0.47 at the hip and 0.20 at the LS) [82]. However, the use of pharmacological and nutritional interventions should be considered with respect to several factors. These include: (1) the potential adverse effects of pharmacologic agents [83], (2) that participants included in previous pharmacological and nutritional intervention studies had generally lower initial levels of BMD than participants included in the current exercise meta-analysis [83], and (3) that exercise results in numerous other benefits not realized with pharmacologic and nutritional interventions [84], for example, increases in balance and a subsequent reduction in falls [85]. Given the above, the current recommendations of lifestyle changes such as exercise and adequate calcium and vitamin D intake prior to pharmacological intervention appear to be appropriate [6].

Figure 6 Forest plot for changes in LS BMD. Forest plot for point estimate standardized effect size changes (g) in LS BMD. The black squares represent the standardized mean difference (g) while the left and right extremes of the squares represent the corresponding 95% confidence intervals. The middle of the black diamond represents the overall standardized mean difference (g) while the left and right extremes of the diamond represent the corresponding 95% confidence intervals. For subgroup, HRT means hormone replacement therapy.

The focus of the present meta-analysis has been on the use of the traditional alpha value for statistical significance (p ≤ 0.05) and 95% CI. However, it has been suggested that rather than focus on the term statistically significant and alpha value cutpoints, one should report the exact alpha value and use 90% CI to determine clinical relevance within the range of the 90% interval [86]. Using the 90% CI approach, the interval no longer included zero (0) for changes in LS BMD (0.026 to 0.332) and ranged from 0.132 to 0.444 for changes in FN BMD.
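The 90% intervals quoted above can be reproduced approximately from the pooled estimates and 95% CIs given in the abstract of this article (LS: g = 0.179, 95% CI −0.003 to 0.361; FN: g = 0.288, 95% CI 0.102 to 0.474). The sketch below assumes symmetric, normal-theory intervals, so the standard error can be backed out of the 95% CI width. The function name `rescale_ci` is illustrative and not taken from the source.

```python
from statistics import NormalDist

def rescale_ci(estimate, lower, upper, old_level=0.95, new_level=0.90):
    """Convert a symmetric normal-theory CI from one confidence level to another."""
    z_old = NormalDist().inv_cdf(0.5 + old_level / 2)   # about 1.96 for 95%
    z_new = NormalDist().inv_cdf(0.5 + new_level / 2)   # about 1.645 for 90%
    se = (upper - lower) / (2 * z_old)                  # implied standard error
    half_width = z_new * se
    return estimate - half_width, estimate + half_width

ls_90 = rescale_ci(0.179, -0.003, 0.361)   # lumbar spine
fn_90 = rescale_ci(0.288, 0.102, 0.474)    # femoral neck

print(f"LS 90% CI: {ls_90[0]:.3f} to {ls_90[1]:.3f}")
print(f"FN 90% CI: {fn_90[0]:.3f} to {fn_90[1]:.3f}")
```

Narrowing the interval from 95% to 90% moves the lower LS bound just above zero, which is why the LS result appears more favorable under this approach even though the point estimate is unchanged.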
No statistically significant between-group differences were found when mixed-effects ANOVA was conducted for changes in FN and LS BMD partitioned by a large number of categorical variables. However, while no statistically significant between-group differences were noted, changes in FN BMD were smaller for ground (g = 0.088) versus joint (g = 0.420) and combined joint and ground reaction force exercise (g = 0.398).
Several interesting associations were found when simple meta-regression was performed for changes in FN and LS BMD. For ease of reading, statistically significant findings (p < 0.05) as well as trends for statistical significance (>0.05 but ≤ 0.10) are discussed collectively. For both FN and LS BMD, greater increases were associated with both greater intensity and compliance in the strength training (joint-reaction force) groups. These findings suggest that greater loads per repetition as well as greater adherence may provide greater benefit to FN and LS BMD. Greater improvements in both FN and LS BMD were also associated with increases in static balance. These associations may be especially important for reducing the risk of falling as well as subsequent fracture risk. Greater increases in both FN and LS BMD were also associated with decreases in BMI, body weight and percent body fat. In addition, increases in LS BMD were associated with increases in LBM. All of these associations may be reflective of greater exercise effort. The inverse association between increases in both FN and LS BMD and compliance with aerobic and strength training protocols may be nothing more than the play of chance. Alternatively, studies with poorer compliance may have yielded greater benefits in FN and LS BMD because of the greater overall volume of training prescribed. For LS BMD, the positive association between increases in LS BMD and older age as well as a greater number of years postmenopausal may be the result of lower initial levels of BMD. However, we found no association between baseline LS BMD and changes in LS BMD. The negative associations of increases in LS BMD with duration and total minutes of training per week for aerobic exercise studies may help to reinforce the belief that shorter duration activities such as jumping may be more beneficial to LS BMD than activities such as walking [7]. One potential reason for this negative association may be calcium loss from excessive sweating in longer duration and/or higher intensity activities [87,88]. This causes a decrease in serum calcium followed by an increase in serum parathyroid hormone, which then stimulates bone resorption [87,88]. While these findings are interesting, further research is needed before any firm conclusions can be drawn.
In addition to changes in FN and LS BMD, statistically significant improvements were found for several secondary outcomes. These included increases in BMD (total hip, trochanteric, intertrochanteric), aerobic fitness, dynamic and static balance, lean body mass and both upper and lower body strength. Statistically significant decreases in percent body fat were also found. These findings reinforce the many benefits that can be derived from exercise programs [84]. The former notwithstanding, the results for secondary outcomes should be interpreted with caution since they were only included if FN and/or LS BMD data were reported. Consequently, secondary outcomes in meta-analysis may not comprise a representative sample.

Figure 7 Influence analysis for changes in LS BMD. Influence analysis for point estimate standardized effect size changes (g) in LS BMD with each corresponding study deleted from the model once. The black squares represent the standardized mean difference (g) while the left and right extremes of the squares represent the corresponding 95% confidence intervals. The middle of the black diamond represents the overall standardized mean difference (g) while the left and right extremes of the diamond represent the corresponding 95% confidence intervals. Results are ordered from smallest to largest values of g. For subgroup, HRT means hormone replacement therapy.

Figure 8 Cumulative meta-analysis for changes in LS BMD. Cumulative meta-analysis, ordered by year, for point estimate standardized effect size changes (g) in LS BMD. The black squares represent the standardized mean difference (g) while the left and right extremes of the squares represent the corresponding 95% confidence intervals. The results of each corresponding study are pooled with all studies preceding it. The middle of the black diamond represents the overall standardized mean difference (g) while the left and right extremes of the diamond represent the corresponding 95% confidence intervals. For subgroup, HRT means hormone replacement therapy.

A major interest of the investigative team was to examine the dose-response relationship between changes in FN and LS BMD and exercise load ratings in postmenopausal women. While we found no significant association between changes in FN and LS BMD and load ratings, these associations were based on general categorical estimates versus estimates specific to each activity [59]. The decision to use categorical estimates was based on the inability to accurately calculate load ratings for those studies that involved multiple types of activities. In addition, the algorithm used requires further testing, improvement and validation [59]. Future research should also focus on developing formulas for accurately calculating load ratings from data typically provided in randomized controlled intervention trials. Ideally, individual studies should collect and report force data in all exercise interventions. However, the accurate measurement of such may be challenging for some activities [7]. Until additional dose-response research is conducted, it would appear plausible to suggest that postmenopausal women adhere to the exercise guidelines from the American College of Sports Medicine [8]. These include weightbearing endurance activities 3 to 5 times per week as well as resistance exercise 2 to 3 times per week [8]. However, it will be particularly important for future dose-response studies to determine whether increased duration of aerobic exercise diminishes the potential skeletal benefits, as suggested by the current regression analyses.
The results of this meta-analysis should be viewed with respect to several potential limitations. First, because studies are not randomly assigned to covariates, covariate analyses in meta-analysis are considered to be observational in nature. Therefore, the results of moderator and regression analyses conducted in this or any other meta-analysis do not support causal inferences [78]. Second, because a large number of statistical tests were conducted, some statistically significant results could have been nothing more than the play of chance. However, as suggested by Rothman [89], no adjustment was made for multiple tests because of the concern about missing possibly important findings. Third, because of a lack of data, a common occurrence in meta-analysis, the research team was unable to examine several variables, thereby compromising the thoroughness of the study. With the former in mind, it is suggested that future randomized controlled trials addressing the effects of ground and/or joint reaction force exercise on FN and LS BMD in postmenopausal women include information regarding study design (allocation concealment, incomplete outcome data, verification that all outcomes planned to be assessed are reported), participant characteristics (adverse events, whether the participants had osteoporosis, cigarette smoking, alcohol consumption, change in exercise habits outside the intervention) and exercise intervention characteristics (intensity, how exercise was delivered). Fourth, future studies should provide more specific information regarding the exercise cutpoints used to enroll participants. The heterogeneity of reporting found in the current meta-analysis is not surprising. In a systematic review of the different definitions of sedentary for screening participants for entrance into physical activity intervention trials, Bennett et al. [90] found that the definition of sedentary ranged from less than 20 to less than 150 minutes per week of physical activity and that few studies reported the type and intensity of physical activity used to screen participants. While such varied definitions may make it difficult to generalize findings, the current meta-analysis, to the best of the authors' knowledge, is the first one on exercise and BMD in women to limit the inclusion of studies to those in which participants were not currently meeting exercise recommendations for bone health [8]. Fifth, given the potential advantage of high resolution peripheral quantitative computed tomography (HR-pQCT) for detecting microarchitectural changes in bone [91], it would appear plausible to suggest that future exercise intervention studies should use this technology so as to better understand the exercise-induced changes that may occur in bone. Finally, consistent with recommendations from the 2008 Physical Activity Guidelines Report, there continues to be a need for large randomized controlled trials to determine whether fracture incidence is decreased as a result of ground and/or joint reaction force exercise [7].