Patient-Reported Outcome questionnaires for hip arthroscopy: a systematic review of the psychometric evidence
© Tijssen et al; licensee BioMed Central Ltd. 2011
Received: 12 November 2010
Accepted: 27 May 2011
Published: 27 May 2011
Hip arthroscopies are often used in the treatment of intra-articular hip injuries. Patient-reported outcomes (PRO) are an important parameter in evaluating treatment. It is unclear which PRO questionnaires are specifically available for hip arthroscopy patients. The aim of this systematic review was to investigate which PRO questionnaires are valid and reliable in the evaluation of patients undergoing hip arthroscopy.
A search was conducted in Pubmed, Medline, CINAHL, the Cochrane Library, Pedro, EMBASE and Web of Science from 1931 to October 2010. Studies assessing the quality of PRO questionnaires in the evaluation of patients undergoing hip arthroscopy were included. The quality of the questionnaires was evaluated by the psychometric properties of the outcome measures. The quality of the articles investigating the questionnaires was assessed by the COSMIN list.
Five articles identified three questionnaires: the Modified Harris Hip Score (MHHS), the Nonarthritic Hip Score (NAHS) and the Hip Outcome Score (HOS). The NAHS scored best on content validity, whereas the HOS scored best on agreement, internal consistency, reliability and responsiveness. Overall, the NAHS was the best quality questionnaire, and the articles describing the HOS were of the highest quality.
This systematic review shows that there is no conclusive evidence for the use of a single patient-reported outcome questionnaire in the evaluation of patients undergoing hip arthroscopy. Based on available psychometric evidence we recommend using a combination of the NAHS and the HOS for patients undergoing hip arthroscopy.
Hip arthroscopy is a relatively new procedure in the management of hip disorders [1, 2]. It was first described by Burman in 1931, but did not come into general use until approximately the last two decades. The indications for hip arthroscopy are numerous and include symptomatic labral tears, femoroacetabular impingement (FAI), loose bodies, synovitis, chondral defects and degenerative conditions of the hip [4, 5]. This broad range of indications also implies a broad range of patients [6, 7]. Arthroscopies are performed on adolescents and professional athletes, but also on older populations (<55 years) [2, 7–9]. Exact numbers on incidence and prevalence of these surgical interventions are unknown.
The number of hip arthroscopies is rising because of improvements in surgical technique and a better understanding of the pathology associated with the hip joint. Therefore, the need for outcome-related research is increasing. One important parameter in outcome-related research in all areas of medicine is the patient's perspective. As Patrick et al. described, patient-reported outcomes (PROs) should serve as a gold standard in the assessment of musculoskeletal conditions where the patient's perspective and health-related quality of life are of main interest.
A number of PRO questionnaires have been developed for individuals with hip pathology, especially osteoarthritis [12–14]. The small amount of outcome-related research available for hip arthroscopy uses many of these different questionnaires, but it is unclear whether they are valid and reliable in the assessment of patients undergoing hip arthroscopy. In order to recommend or discard these PRO questionnaires, analysis of their content and psychometric properties is necessary. Thus far, two systematic reviews in this area have been performed [13, 14]. Schenker et al. concluded that the Hip Outcome Score (HOS) was the most reliable and valid measure of self-reported physical function for individuals undergoing hip arthroscopy. It is unclear which methods were used to reach this conclusion and which questionnaires and psychometric evidence were compared. Furthermore, the review only provides evidence for the HOS in pre-operative use. The second study, by Thorborg et al., reviewed all questionnaires assessing hip and groin disability on validity, reliability and responsiveness and concluded that the HOS should be recommended for evaluating patients undergoing hip arthroscopy. This conclusion was based on the number of psychometric properties known for the particular questionnaires involved in the study: more known psychometric properties meant a better quality questionnaire. However, the quality of the studies investigating the psychometric evidence was not assessed, which could lead to bias.
The aim of this systematic review was to investigate which PRO questionnaires are valid and reliable in the evaluation of patients undergoing hip arthroscopy.
A systematic review was performed 1) to identify all PRO questionnaires used in the evaluation of patients undergoing hip arthroscopy, 2) to evaluate the quality of these questionnaires based on their psychometric evidence, and 3) to determine the methodological quality of the studies into the psychometric evidence of these questionnaires.
A health-related PRO questionnaire is a measurement of any aspect of a patient's health status that is directly assessed by the patient, thus without interpretation of the patient's responses by a physician or anyone else.
Psychometric properties are part of psychometrics, which is the discipline concerned with the construction and validation of measurement instruments, such as questionnaires and tests. The psychometric properties used in this study are defined by Terwee et al. and consist of: content validity, internal consistency, criterion validity, construct validity, agreement, reliability, responsiveness, floor and ceiling effects and interpretability.
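To make one of these properties concrete, internal consistency is commonly summarised with Cronbach's alpha. The sketch below is illustrative only: the item scores are invented, the function name `cronbach_alpha` is hypothetical, and none of it is taken from the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 5 respondents answering a 3-item subscale.
scores = [
    [1, 2, 2],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

A value in the range of roughly 0.70 to 0.95 is often cited as adequate, but alpha is only meaningful for a unidimensional (sub)scale, which is why a factor analysis on a sufficiently large sample usually accompanies it.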
A computerized literature search was performed using Pubmed, Medline, CINAHL (via EBSCO), the Cochrane Library, Pedro, EMBASE (via OVID) and Web of Science to identify relevant articles published between January 1931 and 1 October 2010. The search was conducted by two reviewers (NM and MT). The following terms were used:
Hip AND arthroscopy
Hip AND arthroscopy AND questionnaires OR outcome assessment OR self assessment OR outcome
Hip AND rehabilitation OR treatment AND questionnaires OR outcome assessment OR self assessment OR outcome
Terms were searched as key words or 'free-text' terms in all databases except for Pubmed, in which they were searched as MeSH terms. The reference lists of the retrieved articles were searched for more relevant studies. The search was completed with a separate search for the identified questionnaires as well as for the authors of these questionnaires.
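The search strings listed above mix AND and OR without parentheses, so their meaning depends on how each database resolves operator precedence. The sketch below shows one way to write the grouping explicitly. The grouping is an assumption about the intended meaning (all OR alternatives belong together), and the helper names `any_of` and `all_of` are hypothetical.

```python
def any_of(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(parts):
    """Join required parts with AND."""
    return " AND ".join(parts)


# Outcome-related alternatives shared by the second and third search string.
OUTCOME_TERMS = ["questionnaires", "outcome assessment", "self assessment", "outcome"]

queries = [
    all_of(["Hip", "arthroscopy"]),
    all_of(["Hip", "arthroscopy", any_of(OUTCOME_TERMS)]),
    # Assumed grouping: (rehabilitation OR treatment) is one block.
    all_of(["Hip", any_of(["rehabilitation", "treatment"]), any_of(OUTCOME_TERMS)]),
]

for q in queries:
    print(q)
```

Writing the groups out this way makes the same query reproducible across databases that parse unparenthesised Boolean expressions differently.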
Inclusion criteria
1. The article was published in English, French, German or Dutch and was available as a full-text article.
2. The study included a PRO questionnaire specifically used for the evaluation of patients following hip arthroscopy.
3. The main goal of the study was to evaluate the quality of a PRO questionnaire used for the evaluation of patients undergoing hip arthroscopy.
4. The study used new data instead of data extracted from other research (for example systematic reviews).
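The four criteria above can be read as a conjunction: an article is included only if every criterion holds. A minimal sketch of such a screening step follows; the record type `Candidate` and its field names are hypothetical and only mirror the wording of the criteria.

```python
from dataclasses import dataclass

ACCEPTED_LANGUAGES = {"English", "French", "German", "Dutch"}


@dataclass
class Candidate:
    """Hypothetical screening record for one retrieved article."""
    language: str
    full_text_available: bool
    pro_used_for_hip_arthroscopy: bool      # criterion 2
    evaluates_questionnaire_quality: bool   # criterion 3
    reports_new_data: bool                  # criterion 4


def is_eligible(c: Candidate) -> bool:
    """Return True only if all four inclusion criteria are met."""
    return (
        c.language in ACCEPTED_LANGUAGES
        and c.full_text_available
        and c.pro_used_for_hip_arthroscopy
        and c.evaluates_questionnaire_quality
        and c.reports_new_data
    )
```

For example, a systematic review written in English would fail criterion 4 and be excluded even if it meets the other three.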
Two assessment procedures were used to assess the quality of the identified questionnaires and the methodological quality of the articles describing the questionnaires.
Terwee et al. developed quality criteria for good psychometric properties in order to evaluate and compare the quality of PRO questionnaires. The list contains the following items: content validity, internal consistency, criterion validity, construct validity, reproducibility (agreement/reliability), responsiveness, floor and ceiling effects and interpretability. The items are rated as positive (+), intermediate (?), negative (-), or no information available (0). The exact definitions of the psychometric properties and scoring criteria can be found in Additional file 1.
No overall score is calculated, but a conclusion is drawn based on the information of the properties combined with the aim of the questionnaire . This criteria list was used in previous systematic reviews [14, 18]. The reviewers (MT and NM) rated the articles independently in order to avoid systematic errors.
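Because no overall score is calculated, the ratings are best kept as a per-property profile rather than summed. The sketch below shows one possible representation. It assumes "0" denotes "no information available", uses hypothetical names (`Rating`, `make_profile`, `properties_with`), and fills in the HOS ratings only as far as they are described in the results text; properties not mentioned there default to "0" purely for illustration.

```python
from enum import Enum


class Rating(Enum):
    POSITIVE = "+"
    INTERMEDIATE = "?"
    NEGATIVE = "-"
    NO_INFORMATION = "0"


PROPERTIES = [
    "content validity", "internal consistency", "criterion validity",
    "construct validity", "agreement", "reliability", "responsiveness",
    "floor and ceiling effects", "interpretability",
]


def make_profile(ratings):
    """Return a rating for every property; unrated ones default to '0'."""
    return {p: ratings.get(p, Rating.NO_INFORMATION) for p in PROPERTIES}


def properties_with(profile, rating):
    """List the properties that received a given rating."""
    return [p for p, r in profile.items() if r is rating]


# Partial, illustrative profile for the HOS.
hos = make_profile({
    "content validity": Rating.NEGATIVE,
    "internal consistency": Rating.POSITIVE,
    "construct validity": Rating.POSITIVE,
    "agreement": Rating.POSITIVE,
    "reliability": Rating.POSITIVE,
    "responsiveness": Rating.POSITIVE,
})
print(properties_with(hos, Rating.NEGATIVE))
```

Keeping the profile unsummed preserves the point made in the text: a single negative rating on a key property such as content validity can outweigh several positive ratings, depending on the aim of the questionnaire.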
Descriptive data of the 5 selected articles
Time of administration
Christensen et al. (2003): Hip pain >6 months, no abnormalities RX; N = 48/17; 19♂, 29♀/6♂, 11♀; 33y (range 16-45)/32y; Young patients with hip pain pre- and postoperative
Martin et al. (2006): N = 507 (263 operation); 38y (SD 13y, range 13-66); Patients with labral tears (conservative + operative)
Martin et al. (2007): N = 107; 42y (SD 14, median 44.2, range 14-79); Post-operative follow-up 3.1y (SD 0.49, range 2-4.6); Hip arthroscopy patients >2 years
Martin et al. (2008): N = 126; 41y (SD 16, range 13-80); Pre-operative. Post-operative 7 months; Hip arthroscopy patients
Potter et al. (2005): Hip arthroscopy labral tears; N = 33; 34.6y (range 21-56y); Post-operative mean follow-up 25.7 months (range 13-55 months); Hip arthroscopy patients - labral tears
Descriptive data of questionnaires
MHHS: Evaluative Measure pre/post-operative hip pain and function; Pain, function, functional activities; Hip arthroscopy patients
NAHS: Evaluative Measure pre/post-operative hip pain and function; Functional activities, pain, symptoms, sports; 20 - 40 year old patients with hip pain and without radiographic diagnosis
HOS: Evaluative Measure outcome treatment intervention; Functional activities, sports; Subjects with acetabular labral tears with function of wide range of ability
Any disagreement between the two reviewers (NM and MT) was resolved by consensus.
Quality of questionnaires and articles
Quality of the questionnaires based on psychometric properties
Floor and ceiling effects
The MHHS scored high on construct validity because it correlated well with the bodily pain and physical function domains of the Short Form-36 (SF-36). Some information on interpretability is known; however, this information was not comprehensive and therefore this property received an intermediate rating. The NAHS scored high on content validity, but intermediate on internal consistency, construct validity and reproducibility. The internal consistency was checked with a factor analysis, but this was performed with too few subjects. A Pearson correlation coefficient was used to check for reliability instead of an intraclass correlation coefficient (ICC) or Kappa. The correlation between the NAHS and the SF-12 on the physical and emotional domains was good, but not in compliance with the a priori formulated hypothesis, and thus led to an intermediate rating for construct validity. The HOS scored well on internal consistency, construct validity, agreement, reliability and responsiveness. However, because no target population was used, the content validity was rated negative [22–24]. The construct validity was checked with the SF-36 and a rating scale for level of function and surgical outcome. Only the correlation with the SF-36 was used to establish construct validity, which was good [22, 23]. Remarkably, the construct validity of all questionnaires was checked with either the SF-36 or the SF-12 [20–23]. Furthermore, definite information on criterion validity, floor and ceiling effects and interpretability was not available for any of the questionnaires.
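The preference for an ICC over a Pearson coefficient when assessing reliability can be illustrated with a small worked example. The sketch below uses invented test-retest scores in which every retest value is exactly 10 points higher than the first measurement. The ICC variant shown (two-way model, absolute agreement, single measurement) is one common choice and is an assumption here, not necessarily the variant used in the reviewed studies.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation: sensitive to linear association only."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def icc_agreement(x, y):
    """ICC for absolute agreement, single measurement, two occasions."""
    n, k = len(x), 2
    rows = list(zip(x, y))
    grand = mean(list(x) + list(y))
    ss_rows = k * sum((mean(r) - grand) ** 2 for r in rows)       # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in (mean(x), mean(y)))  # between occasions
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical scores: the retest is systematically 10 points higher.
test = [50, 60, 70, 80, 90]
retest = [score + 10 for score in test]

print(pearson_r(test, retest))      # 1.0
print(icc_agreement(test, retest))  # about 0.83
```

The Pearson coefficient is 1.0 because the two sets of scores are perfectly linearly related, whereas the ICC drops to about 0.83 because it penalises the systematic 10-point difference between occasions.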
Scores of articles rated by COSMIN checklist. Columns: measurement properties assessed; generalisability per box. Rows: Christensen et al. (2003); Martin et al. (2006); Martin et al. (2007); Martin et al. (2008); Potter et al. (2005).
This systematic review included five articles on hip arthroscopy using three different questionnaires (NAHS, HOS and MHHS). The MHHS is a modification of the Harris Hip Score, which is an observer-administered score. Potter et al. used it as a self-administered score, deleting the two observer-administered items. Therefore the MHHS was included in this study. In previous studies more questionnaires were used, but these were often developed for osteoarthritis [12, 13, 26, 27]. Furthermore, none of these studies explicitly investigated the quality of the questionnaires used for the evaluation of hip arthroscopy patients [12, 13, 26, 27]. The quality of the questionnaires was assessed by the criteria list of Terwee et al. The methodological quality of the studies into the questionnaires was assessed by the COSMIN list. Based on the quality criteria proposed by Terwee et al., none of the three identified questionnaires was of high quality. Not all measurement properties are equally important for the quality of a questionnaire. Terwee et al. considered content validity to be one of the most important measurement properties and stated that one should consider using a questionnaire only if its content validity is adequate. Based on this parameter the NAHS would be the best quality questionnaire. However, they also showed that the aim of a questionnaire determines which measurement properties matter most. As all three included questionnaires were evaluative, a high level of agreement was important. From that perspective the HOS scored best.
The overall quality of the articles investigating the measurement properties as rated by the COSMIN list was fair to good. Remarkably, in most cases the generalisability per box was better than the quality of the assessed properties per article. Only one article scored excellent on hypothesis testing of the correlation between the MHHS and SF-36. Furthermore, two articles by Martin et al. [22, 23] examining the validity of the HOS had one or more scores that were rated good. When adding all scores, the article by Martin et al. had the highest quality.
The NAHS was developed for a young population with orthopedic, nonarthritic hip pain and not, like the HOS, specifically for patients undergoing hip arthroscopy. Therefore, the NAHS may be a more generalisable questionnaire, but less specific for hip arthroscopy patients. Studies investigating the HOS excluded subjects who could not answer a certain number of questions, which could lead to bias [22, 23]. Furthermore, the HOS has a sports subscale which may fit an athletic population but may not be appropriate for individuals with slight degenerative conditions undergoing hip arthroscopy. These two disadvantages may compromise the reliability and validity of the HOS. Evidence in support of the NAHS as well as the HOS can be found in other systematic reviews [12–14]. Baldwin et al. performed a review concerning the outcomes of hip arthroscopy for the treatment of FAI and concluded that the NAHS was the most suitable scale for evaluating FAI. However, the quality assessment in this article was based on the authors' experience and preference. The HOS was found to be the best in the assessments performed by Schenker et al. and Thorborg et al. Yet, Schenker et al. defined neither the search strategy, nor the identified questionnaires, nor the methods on which they based their quality assessment. Thorborg et al. considered only the number of measurement properties per questionnaire and not the quality of the articles investigating them. Further, they used the criteria stated by Terwee et al. for the evaluation of measurement properties of PRO questionnaires in hip arthroscopy patients, but found different results due to differences in interpretation. This was foreseen by Terwee et al., who stated that the criteria list would at least separate poor from good quality questionnaires. Based on this separation our review found the MHHS to be of moderate quality and the NAHS and HOS to be of better quality. Thorborg et al. rated the HOS as being of good quality.
The COSMIN list we used in this review was recently developed. At present no other checklists for the assessment of articles on the methodological quality of questionnaires are available [15, 16, 19]. There is also no list that scores both the quality of the questionnaires and the quality of the studies investigating the questionnaires. Therefore, a combination of the list by Terwee et al. and the COSMIN list has been recommended in assessing the quality of questionnaires. Using these two lists we concluded that the NAHS is the best quality questionnaire, but the quality of the articles describing the HOS is higher. The quality of a systematic review depends on the quality of the studies included. A limitation of this study is the small number of questionnaires as well as the small number of studies that could be included. More rigorous studies to determine which score is most valid and reliable are necessary to provide a conclusive recommendation.
This systematic review shows that there is no conclusive evidence for the use of a single patient-reported outcome questionnaire in the evaluation of patients undergoing hip arthroscopy. A limitation of this study is the small number of studies that could be included. Based on available psychometric evidence we recommend using a combination of the NAHS and the HOS for patients undergoing hip arthroscopy. In order to provide a conclusive recommendation more studies on the validity and reliability of these questionnaires are warranted.
Acknowledgements and Funding
No acknowledgements or funding.
1. Byrd JW: The role of hip arthroscopy in the athletic hip. Clin Sports Med. 2006, 25 (2): 255-78, viii. 10.1016/j.csm.2005.12.007.
2. Philippon MJ, Stubbs AJ, Schenker ML, Maxwell RB, Ganz R, Leunig M: Arthroscopic management of femoroacetabular impingement: osteoplasty technique and literature review. Am J Sports Med. 2007, 35 (9): 1571-80. 10.1177/0363546507300258.
3. Burman M: Arthroscopy or the direct visualization of joints. J. Bone Joint Surg. 1931, 4: 669-695.
4. Kelly BT, Williams RJ, Philippon MJ: Hip arthroscopy: current indications, treatment options, and management issues. Am J Sports Med. 2003, 31 (6): 1020-37.
5. Byrd JW: Arthroscopy of the hip. Sports Medicine and Arthroscopy Review. 2002, 10 (2): 151-162. 10.1097/00132585-200210020-00007.
6. Allen D, Beaulé PE, Ramadan O, Doucette S: Prevalence of associated deformities and hip pain in patients with cam-type femoroacetabular impingement. J Bone Joint Surg Br. 2009, 91 (5): 589-94. 10.1302/0301-620X.91B5.22028.
7. Narvani AA, Tsiridis E, Kendall S, Chaudhiri R, Thomas P: A preliminary report on prevalence of acetabular labrum tears in sports patients with groin pain. Knee Surg Sports Traumatol Arthrosc. 2003, 11 (6): 403-8. 10.1007/s00167-003-0390-7.
8. Haviv B, O'Donnell J: The incidence of total hip arthroplasty after hip arthroscopy in osteoarthritic patients. Sports Med Arthrosc Rehabil Ther Technol. 2010, 2: 18. 10.1186/1758-2555-2-18.
9. Philippon MJ, Weiss DR, Kuppersmith DA, Briggs KK, Hay CJ: Arthroscopic labral repair and treatment of femoroacetabular impingement in professional hockey players. Am J Sports Med. 2010, 38 (1): 99-104. 10.1177/0363546509346393.
10. McCarthy JC, Jarret BT, Ojeifo O, Lee JA, Bragdon CR: What Factors Influence Long-term Survivorship After Hip Arthroscopy?. Clin Orthop Relat Res. 2010.
11. Patrick DL, Burke LB, Powers JH, Scott JA, Rock EP, Dawisha S, O'neill R, Kennedy DL: Patient-reported outcomes to support medical product labeling claims: FDA perspective. Value Health. 2007, 10 (Suppl 2): S125-37.
12. Baldwin KD, Harrison RA, Namdari S, Nelson CL, Hosalkar HS: Outcomes of hip arthroscopy for treatment of femoroacetabular impingement: a systematic review. Current Orthopaedic Practice. 2009, 20 (6): 669-673. 10.1097/BCO.0b013e3181a9d771.
13. Schenker ML, Martin R, Weiland DE, Philippon MJ: Current trends in hip arthroscopy: a review of injury diagnosis, techniques and outcome scoring. Current Opinion in Orthopaedics. 2005, 16: 89-94.
14. Thorborg K, Roos EM, Bartels EM, Petersen J, Hölmich P: Validity, reliability and responsiveness of patient-reported outcome questionnaires when assessing hip and groin disability: a systematic review. Br J Sports Med. 2010.
15. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC: The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res. 2010, 19 (4): 539-49. 10.1007/s11136-010-9606-8.
16. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC: The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010, 63 (7): 737-45. 10.1016/j.jclinepi.2010.02.006.
17. Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, Bouter LM, de Vet HC: Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007, 60 (1): 34-42. 10.1016/j.jclinepi.2006.03.012.
18. Eechaute C, Vaes P, Van Aerschot L, Duquet W: The clinimetric qualities of patient-assessed instruments for measuring chronic ankle instability: a systematic review. BMC Musculoskelet Disord. 2007, 8: 6. 10.1186/1471-2474-8-6.
19. Mokkink LB, Terwee CB, Gibbons E, Stratford PW, Alonso J, Patrick DL, Knol DL, Bouter LM, de Vet HC: Inter-rater agreement and reliability of the COSMIN (COnsensus-based Standards for the selection of health status Measurement Instruments) Checklist. BMC Med Res Methodol. 2010, 10: 82. 10.1186/1471-2288-10-82.
20. Potter BK, Freedman BA, Andersen RC, Bojescul JA, Kuklo TR, Murphy KP: Correlation of Short Form-36 and disability status with outcomes of arthroscopic acetabular labral debridement. Am J Sports Med. 2005, 33 (6): 864-70. 10.1177/0363546504270567.
21. Christensen CP, Althausen PL, Mittleman MA, Lee JA, McCarthy JC: The nonarthritic hip score: reliable and validated. Clin Orthop Relat Res. 2003, 406: 75-83.
22. Martin RL, Kelly BT, Philippon MJ: Evidence of validity for the hip outcome score. Arthroscopy. 2006, 22 (12): 1304-11. 10.1016/j.arthro.2006.07.027.
23. Martin RL, Philippon MJ: Evidence of validity for the hip outcome score in hip arthroscopy. Arthroscopy. 2007, 23 (8): 822-6. 10.1016/j.arthro.2007.02.004.
24. Martin RL, Philippon MJ: Evidence of reliability and responsiveness for the hip outcome score. Arthroscopy. 2008, 24 (6): 676-82. 10.1016/j.arthro.2007.12.011.
25. Byrd JW, Jones KS: Prospective analysis of hip arthroscopy with 2-year follow-up. Arthroscopy. 2000, 16 (6): 578-87. 10.1053/jars.2000.7683.
26. Ilizaliturri VM, Nossa-Barrerra JM, Acosta-Rodriguez E, Camacho-Galindo J: Arthroscopic treatment of femoroacetabular impingement secondary to paediatric hip disorders. J Bone Joint Surg Br. 2007, 89 (8): 1025-30. 10.1302/0301-620X.89B8.19152.
27. Ilizaliturri VM, Nossa-Barrerra JM, Acosta-Rodriguez E, Camacho-Galindo J: Arthroscopic treatment of cam-type femoroacetabular impingement: preliminary report at 2 years minimum follow-up. J Arthroplasty. 2008, 23 (2): 226-34. 10.1016/j.arth.2007.03.016.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2474/12/117/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.