Quality of life and functionality after total hip arthroplasty: a long-term follow-up study

Background There is a lack of data on the long-term outcome of total hip arthroplasty procedures, as assessed by validated tools. Methods We conducted a follow-up study to evaluate the quality of life and functionality of 250 patients an average of 16 years (range: 11-23 years) after total hip arthroplasty using a validated assessment set including the SF-36 questionnaire, Harris Hip Score, WOMAC score, Functional Comorbidity Index, and a study-specific questionnaire. Multiple stepwise linear and logistic regression models were constructed to evaluate the relationships between several explanatory variables and these functional outcomes. Results The SF-36 physical indexes of these patients compared negatively with the normative values but positively with the results obtained in untreated subjects with severe hip osteoarthritis. Similar results were detected for the Harris Hip Score and WOMAC score. There was a 96% rate of post-surgical satisfaction. Hip functionality and comorbidities were the most important determinants of physical measures on the SF-36. Conclusions Patients who had undergone total hip arthroplasty have impaired long-term self-reported physical quality of life and hip functionality but they still perform physically better than untreated patients with advanced hip osteoarthritis. However, the level of post-surgical satisfaction is high.


Background
Hip osteoarthritis (OA) is a cause of severe pain and disability [1] but can be successfully treated with total hip arthroplasty (THA). Short- and medium-term THA studies report substantial improvements in generic health-related quality of life (HRQoL) [2-6] and hip functionality [4,7] in subjects with OA. Currently, about 20% of THA procedures are performed in people younger than 60 years with variable diagnoses [8], and the general increase in life expectancy is expected to further increase the need for this procedure [9]. These data suggest that greater attention should be paid to the long-term follow-up results of hip replacement surgery. A comprehensive approach requires the combined use of generic and disease-specific patient-oriented validated measures [5], but there is a lack of data on the long-term outcome of THA procedures, as assessed by these validated tools. Even less is known about possible predictors of the long-term outcomes of these procedures. The goals of the present study were: 1) to evaluate with validated instruments whether subjects who had undergone THA more than 11 years earlier had severe functional impairment and/or disability, and 2) to identify possible outcome predictors of long-term HRQoL and hip functionality after THA.

Methods
After approval by the local ethics committee, we enrolled patients who had undergone THA at our institution from 1985 to 1996 and who fulfilled the following inclusion criteria: (1) age less than 70 years at operation, (2) total hip arthroplasty, and (3) primary surgery. On the basis of these criteria we selected 412 subjects. One hundred sixteen (28%) of them had died before our study commenced. Thus, 296 were available for follow-up examination. We were able to collect data on 250 patients (162 females and 88 males) with 330 THA (80 bilateral procedures), who represented 84% of the surviving patients. Forty-six subjects refused to participate in the study because of severe comorbidities or lack of interest. The selection of patients is shown in Figure 1.
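The patient flow described above can be re-derived from the reported counts. The short sketch below only restates the figures given in the text; the variable names are illustrative and not part of the study.

```python
# Cohort flow, using only the counts reported in the text.
selected = 412       # patients meeting the inclusion criteria
deceased = 116       # died before the study commenced
participants = 250   # patients with follow-up data

survivors = selected - deceased       # available for follow-up
declined = survivors - participants   # refused to participate

mortality_pct = 100 * deceased / selected
response_pct = 100 * participants / survivors

print(f"survivors: {survivors}, declined: {declined}")
print(f"mortality: {mortality_pct:.1f}%, response among survivors: {response_pct:.1f}%")
```

The response rate is computed on the surviving patients, not on all 412 selected subjects, which is why it matches the 84% quoted in the text.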
No significant differences were found between the participants and those subjects lost to follow-up with respect to gender (p = 0.83), preoperative diagnosis (p = 0.37), operating surgeon (p = 0.34), or use of cemented/cementless implant (p = 0.55). The only parameter that differentiated between the two groups was the mean age, which was significantly higher in the subjects lost to follow-up (76.8 vs 70.8 years; p = 0.004). The patients' data are shown in Table 1. A direct transgluteal lateral approach was used in all cases. Out of 330 implants, 118 (36%) were cemented and 212 (64%) were cementless THA. Preoperative diagnoses were primary osteoarthritis in 252 hips (76%), osteonecrosis in 32 (10%), posttraumatic arthritis in 24 (7%), developmental dysplasia of the hip in 14 (4%), rheumatoid arthritis in 6 (2%), and residual arthritis from slipped capital femoral epiphysis in two hips (1%). The mean age at follow-up of the eighty patients who received bilateral THA was higher than that of subjects with unilateral THA (73 vs 69.7 years; p = 0.039), but no sex differences were found between these two groups. Out of 250 participants, 189 (76%) agreed to return for a follow-up visit, and 61 (24%) answered our questionnaires through a telephone interview. The mean ± standard deviation (SD) length of follow-up for the participants was 16.1 ± 3.6 years (range 11-23). During the follow-up visits, the patients gave their informed consent and underwent a complete physical examination as well as weight and height measurement. The clinical investigation was carried out by one of the authors, who was not involved in the primary care.
The following patient-oriented instruments were chosen to evaluate the patients: the Italian version of the Short Form-36 Health Survey (SF-36) Questionnaire [10], the Harris Hip Score (HHS) [11], the Italian version of the Western Ontario and McMaster Universities (WOMAC) Questionnaire [12], the Functional Comorbidity Index (FCI) [13], and a study-specific questionnaire dealing with patients' daily life activities, medical history, intensity and frequency of hip pain, possible reoperations, degree of satisfaction with surgery, and willingness to undergo the same operation again. The SF-36 Questionnaire is a generic measure of health status which contains 36 questions measuring the physical, social, and mental components of health. It yields an eight-scale profile of scores (i.e. physical functioning = PF; role physical = RP; bodily pain = BP; general health = GH; vitality = VT; social functioning = SF; role emotional = RE; mental health = MH) as well as summary physical (PCS) and mental (MCS) measures. SF-36 results were compared to the published data [14]. The HHS is a widely used disease-specific outcome measure for THA studies to assess pain and functional status. Sum scores are fitted to a 0-100 scale, with high values indicating less pain or better physical functioning. The WOMAC is a self-administered disease-specific validated outcome measure that evaluates pain (5 items), stiffness (2 items), and physical function (17 items). A total WOMAC summary score is calculated for each individual, adjusted, and reported on a 0-to-100 scale. Lower scores are associated with less pain and stiffness and better function. The FCI is a validated 18-item list of diagnoses designed to assess the burden of comorbidities on physical function. Each item is given 1 point if present, and the final score is the sum of the items. Fifteen randomly selected study subjects completed the questionnaires twice (the second time after a 20-day interval) to assess test-retest reliability.
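The scoring rules for the WOMAC and the FCI can be sketched as follows. This is a minimal illustration, not the authors' scoring procedure: the text does not state how the WOMAC total was adjusted, so the sketch assumes the common five-level (0-4) response format and a simple linear rescaling. The function names are hypothetical.

```python
def womac_normalized(item_scores, max_per_item=4):
    """Rescale a summed WOMAC score to 0-100 (lower = less pain, better function).

    Assumption: each of the 24 items is scored 0..max_per_item
    (0-4 Likert format); the paper does not specify its adjustment.
    """
    if len(item_scores) != 24:
        raise ValueError("WOMAC has 24 items (5 pain, 2 stiffness, 17 function)")
    if any(not 0 <= s <= max_per_item for s in item_scores):
        raise ValueError("item score outside the assumed response range")
    return 100 * sum(item_scores) / (max_per_item * len(item_scores))


def fci_score(diagnoses_present):
    """FCI: one point for each of the 18 listed diagnoses that is present."""
    if len(diagnoses_present) != 18:
        raise ValueError("the FCI lists 18 diagnoses")
    return sum(1 for present in diagnoses_present if present)


# Hypothetical patient: moderate symptoms on every item, three comorbidities.
print(womac_normalized([2] * 24))
print(fci_score([True] * 3 + [False] * 15))
```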
Pearson's product-moment correlation coefficients for the results of the tests ranged from 0.71 to 0.90 for the SF-36 scale scores, and averaged 0.84 and 0.86 for the HHS and WOMAC, respectively, and 0.90 for the FCI. The same outcome set was used for the participants who were interviewed by telephone. In these patients, since range of motion and deformity cannot be assessed by telephone, a modified HHS with a correction factor was adopted [15]. Since our study was carried out on surgically treated patients only without any control group including untreated patients, we compared our results with those obtained by other authors in patients affected by advanced hip osteoarthritis.
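The test-retest check described above relies on Pearson's product-moment correlation between the first and second administration of each questionnaire. A minimal sketch of that coefficient is given below; the two score lists are hypothetical placeholders, not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired samples with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores from two administrations 20 days apart.
first_visit = [10, 25, 40, 55, 70]
second_visit = [12, 22, 43, 50, 75]
print(round(pearson_r(first_visit, second_visit), 2))
```

Note that Pearson's coefficient measures linear association and is insensitive to a systematic shift between the two administrations.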

Statistical analysis
A two-sample t test, ANOVA, and chi-square test were used to test the significance of the cross-sectional differences between groups. A Bonferroni test was used to test the differences between multiple groups. Pearson's correlation coefficient was used to assess the relationships among patient-oriented outcomes. Multiple stepwise linear and logistic regression models were constructed to evaluate the relationships between the explanatory variables and the outcomes with continuous and categorical distributions, respectively. Summary measures and single scale scores of the SF-36, as well as the WOMAC scores and the HHS, were treated as continuous outcome variables. Satisfaction with surgery, willingness to undergo the operation again, and occurrence of reoperation were categorical outcomes. Explanatory variables included in the analysis were: present age (continuous), gender (categorical), age at operation (continuous), bilaterality of the procedure (categorical), length of follow-up (continuous), BMI (continuous), educational level (discrete), FCI (discrete), cigarette smoking (categorical), sport practice (categorical), postoperative employment (i.e. keeping preoperative job/workload; categorical), cemented THA (categorical), and possible reoperations (categorical). The patients' educational level was graded as follows: 1) illiterate, 2) primary school, 3) secondary school, 4) high school, and 5) graduation. Before constructing the models, age-adjusted univariate linear and logistic regression analyses were performed. Explanatory variables were included in our multiple regression models if a trend toward an association (i.e. p ≤ 0.10) with the outcome of interest was found in the univariate analysis.
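The screening rule in the last sentence (carry a variable forward only if its univariate p-value is at most 0.10) amounts to a simple filter. The sketch below assumes the univariate p-values have already been computed; the variable names and p-values shown are hypothetical and do not come from the study.

```python
def screen_candidates(univariate_p, threshold=0.10):
    """Return the variables whose univariate p-value is <= threshold."""
    return [name for name, p in univariate_p.items() if p <= threshold]


# Hypothetical age-adjusted univariate p-values for one outcome.
univariate_p = {
    "fci": 0.001,
    "bmi": 0.080,
    "smoking": 0.450,
    "educational_level": 0.100,
}
print(screen_candidates(univariate_p))
```

The threshold is inclusive, matching the "p ≤ 0.10" criterion stated in the text.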
In the multiple linear regression analysis, total R² for the model and changes in R² for the independent contribution of single factors were calculated to assess the percent of total variance in the outcome accounted for by the whole model and by single explanatory variables, respectively. In multiple logistic regression, log-likelihood tests were obtained to evaluate the independent contribution of single explanatory variables in the fit of the model. A p-value of less than 0.05 was considered significant. SPSS software program (SPSS, Inc., Chicago, IL, USA) was used for the database and statistics.
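The idea of a change in R² can be illustrated with a small ordinary least squares sketch: fit the model with and without one explanatory variable and take the difference in R². This is an illustration only, not a reproduction of the SPSS stepwise procedure used in the study; it computes the plain (unadjusted) R², and all function names and data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """Unadjusted R^2 of an OLS fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def r2_change(base_columns, added_column, y):
    """Increase in R^2 when added_column joins the base model."""
    return r_squared(base_columns + [added_column], y) - r_squared(base_columns, y)


# Hypothetical data: an outcome and two explanatory variables.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
outcome = [3, 3, 7, 7, 11]
print(round(r_squared([x1], outcome), 3), round(r2_change([x1], x2, outcome), 3))
```

The change in R² depends on the order in which variables enter the model, which is why stepwise procedures report it per entry step.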

General health
The subjects' SF-36 scores, stratified into three age groups, are reported in Table 2 in comparison with the age-matched normative data [14]. The SF-36 physical indexes of patients compared negatively with the normative values, mainly in the two youngest age groups. Significantly lower values were observed in the older age groups compared with the youngest age group. Patients with unilateral THA scored better than patients with bilateral THA on the RP (p < 0.001), GH (p < 0.05), SF (p < 0.05), and PCS (p < 0.05) SF-36 scales. The only difference between patients with different preoperative diagnoses was that subjects with osteonecrosis obtained better results than those with OA on the PCS scale (p = 0.022). No significant differences were found between patients who received cementless or cemented THA, or between patients who had and had not undergone revision procedures during the follow-up period.

Disease-specific quality of life
Mean HHS and WOMAC questionnaire results are reported in Table 3, stratified into three age groups in comparison with age-matched normative data when available [16]. Subjects from the study group obtained poorer scores in comparison with the values of healthy subjects, and the

Post-surgical satisfaction and revision rate
Of the 250 responders, 240 (96%) were satisfied with the outcome of their surgery and 242 subjects (96.8%) said that they would undergo the same procedure again. No difference in these outcomes was noted when patients operated on by different surgeons were compared. A preoperative diagnosis of hip dysplasia was associated with a lesser degree of postoperative satisfaction and willingness to undergo the surgery again. Indeed, the satisfaction rate was 66.6% in patients with this diagnosis and 97.5% in those without it (p = 0.001). One hundred eighty-two patients (72.8%) had experienced pain in their operated hip over time, which was described as mild and sporadic in 160 cases (87.9%) and moderate or continuous in 22 cases (12.1%). The intensity of pain on the 10-step visual analogue scale averaged 2.8 ± 2 (range 1-8). The pain appeared at exertion and/or in standing position in 174 (95.6%) and 78 (42.9%) patients, respectively. Seventy-two patients (28.8%) reported current consumption of pain-alleviating medications. Forty-one THA in 36 patients were revised, leading to a reoperation rate of 12.4%. Five patients underwent a bilateral revision procedure. No difference in the reoperation rate was noted between the different operating surgeons or the preoperative diagnoses.
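The percentages in this section use different denominators (all patients, patients reporting pain, or operated hips). The sketch below recomputes them from the reported counts to make the denominators explicit. The per-patient revision figure is derived here by simple division and is not reported in the text.

```python
def pct(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


patients = 250    # responders
hips = 330        # implants in those patients
with_pain = 182   # patients reporting pain in the operated hip

rates = {
    "satisfied": pct(240, patients),
    "would repeat surgery": pct(242, patients),
    "any hip pain": pct(with_pain, patients),
    "mild or sporadic pain": pct(160, with_pain),
    "revised hips": pct(41, hips),                   # the reported 12.4%
    "patients with a revision": pct(36, patients),   # derived, not in the text
}
for label, value in rates.items():
    print(f"{label}: {value}%")
```

The reported reoperation rate of 12.4% is therefore a per-hip figure (41 of 330 implants), not a per-patient one.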

Correlation and regression analysis
There were significant correlations between HHS and PCS (c = 0.69; p < 0.001), PCS and the WOMAC score (c = -0.71; p < 0.001), and HHS and the WOMAC score (c = -0.88; p < 0.001). A weaker but still significant correlation was noted between these three physical indexes and the MCS. Major determinants (i.e. those explaining ≥ 3% of the variance of the outcome of the model) of the SF-36 summary and scale scores are reported in Table 4. The hip functionality assessed by either the WOMAC score or HHS (only the best predictor is reported in the table) was closely related to both physical and mental HRQoL. In our models, the variation in these indexes of hip functionality explained about half of the variance of PCS and PF scale scores. Comorbidities as assessed by FCI were another significant but less important determinant. A higher level of education showed a trend toward a positive association (p = 0.07) with some physical indexes on the SF-36 questionnaire (PCS, PF, and BP). Hip functionality (WOMAC score and HHS) (Table 5) was positively associated with the postoperative resumption of preoperative employment and negatively associated with age and with the number of comorbidities. The older the age at operation, the better the long-term WOMAC score, although this explanatory variable accounted for only a small amount of the variation in this disease-specific index. In our multivariate analysis, neither the bilaterality of the procedure, the use of a cemented or cementless implant, nor possible reoperations were found to be related to the WOMAC scores and HHS. The functionality of the operated hip (WOMAC score and HHS) was a major positive determinant of long-term satisfaction with the surgery and willingness to undergo the surgery again, whereas the number of comorbidities was negatively related to these outcomes. Results of the models including this outcome and its best functional predictor (the WOMAC score) are shown in Table 6.
If the scores of the domains of the WOMAC scale were used separately as single determinants of postoperative satisfaction and willingness to undergo the surgery again, then the most relevant predictor was the pain scale score. Several factors were found to be associated with reoperation during the multivariate regression analysis (Table 6), but they had lesser importance as outcome predictors.

Discussion
The main result of the present study is that patients who had undergone THA a mean of 16 years earlier had poorer long-term HRQoL than age-matched healthy controls [14]. However, their scores on physical SF-36 scales were higher than those previously reported in subjects with advanced hip osteoarthritis (Table 7) [1,17-20]. In the present study, the older the age group, the lower the SF-36 scale scores and summary measures. To the best of our knowledge, no previous studies that used this validated instrument with a comparably long follow-up period have been published. Thus, making exact comparisons with our findings is impossible. Several prospective studies dealing with early results of THA have shown that patients may obtain normal age- and sex-adjusted SF-36 values 3-12 months after surgery [2,3,21], but 12 to 36 months after THA, SF-36 parameters start to decrease over time [20,22,23]. In the long term, Rat et al. [23] reported SF-36 scores similar to ours 10 years after THA. These authors also found that the scores on both physical and mental scales of SF-36 were lower than those for a general population of comparable age. Another long-term study [24] used a different validated questionnaire (i.e. the Nottingham Health Profile) that measures patient evaluation of the functional, social, and emotional impact of chronic disease. This study showed impaired quality of life in patients who had undergone THA 15 years earlier. These patients fared worse than the control group in most areas of perceived health. Moreover, they considered daily function to be affected negatively by health problems as compared with the control subjects. In our patients, the indexes of hip functionality (the WOMAC questionnaire and HHS) also compared negatively with those of healthy controls [16] but positively with those of patients with hip osteoarthritis [1].
Moreover, these scores were equal to or better than the findings of other THA studies with earlier follow-up data [22,25,26]. In our multivariate analyses, the WOMAC score and HHS were essential determinants of the SF-36 PCS and PF scale scores, showing that hip functionality is critical in determining the patient's general functioning. In these models, comorbidities were negatively correlated with SF-36, WOMAC, and HHS results. This result is in keeping with previous studies that used the SF-36 [1,6,23] and WOMAC and HHS [1,21] questionnaires to evaluate either operated or nonoperated subjects. The frequency of subjects who kept their preoperative employment after surgery was similar to other studies [27,28]. Resuming preoperative job or workload was closely associated with better hip functionality, as assessed by the WOMAC and HHS questionnaires. This is in good agreement with the results of Bohm [28], who found better Oxford-12 hip scores among those returning to work after THA. As stated by this author, a better hip functionality is likely to positively impact the ability to return to work, although this relationship may not be causal (i.e. the ability to resume work by itself may positively influence the patient's self-reported functionality). Despite the impairment in the HRQoL, the level of post-surgical satisfaction in our study group was high and the 96% rate of satisfied patients is equal to or superior to the percentages previously reported in studies with shorter follow-up intervals [22,25,29]. This discrepancy between the rate of satisfied patients and HRQoL is not surprising. Indeed, several different factors apart from hip functionality (i.e. patient expectation, pain relief, psychological benefit, and improvement in activities of daily life) can influence the level of post-surgical satisfaction [30].
[Table fragment: Age at operation; -0.14; -0.26 to -0.03; 0.017; 3. C = coefficient; CI = confidence interval; FCI = Functional Comorbidity Index; * = total adjusted R² accounted for by the whole model; ** = only explanatory variables accounting for an R² variation in the outcome ≥ 3 are reported.]
We acknowledge some methodological weaknesses in the present study. Due to its observational and retrospective character, it lacks reliable baseline data and a control group. However, performing a prospective analysis with such a follow-up is very challenging. We could therefore have been subject to variability in the information gathered at the time these patients were treated. Nevertheless, the information obtained from medical records and used in the present analysis mostly consisted of unambiguous personal, demographic, or occupational data. Moreover, the comprehensive assessment by validated patient-oriented tools allowed comparisons with age- and sex-matched norms, thus mitigating the lack of a control population. At the time the patients in our study group underwent their surgery, many of the validated questionnaires used in the present study were not available. This lack of comparable baseline data prevented us from evaluating the postoperative changes in these patients' status. However, this study was only designed to evaluate the influence of a THA on the long-term HRQoL and hip functionality of unselected patients. As stated in large register-based studies [31], the effectiveness of a widely used routine surgical technique (such as THA) can be evaluated better in observational studies than in randomised ones, because patients enrolled in the latter are frequently not representative of the entire cohort of subjects undergoing THA in routine clinical practice, due to the stringent exclusion criteria. Lastly, although multiple attempts were made to trace all the patients for follow-up evaluation, this was impossible due to the long elapsed time and the death of many patients. Nevertheless, we obtained a satisfactory survey rate of more than 80% of the surviving patients, which is superior to other studies with shorter times until follow-up [6].
Comparison of the included patients with those lost to follow-up suggested that the participants were representative of the entire population. The main strengths of our study are the use of validated instruments and the length of follow-up since, to the best of our knowledge, no previous studies that used these validated patient-oriented tools had similarly long postsurgical intervals.

Conclusions
This paper demonstrates that patients who had undergone THA a mean of 16 years earlier have impaired self-reported physical HRQoL and hip functionality, but they still perform physically better than untreated patients with hip osteoarthritis. The hip functionality is a major determinant of physical HRQoL, but other relevant factors, such as the number of comorbidities, can also influence the ability of subjects. Despite the impairment in the HRQoL, the level of post-surgical satisfaction was high in this study group.