Accuracy of magnetic resonance studies in the detection of chondral and labral lesions in femoroacetabular impingement: systematic review and meta-analysis

Background Several types of magnetic resonance imaging (MRI) are commonly used in imaging of femoroacetabular impingement (FAI); however, there are as yet no clear protocols or recommendations for each type. The aim of this meta-analysis is to determine the accuracy of conventional magnetic resonance imaging (cMRI), direct magnetic resonance arthrography (dMRA) and indirect magnetic resonance arthrography (iMRA) in the diagnosis of chondral and labral lesions in FAI. Methods A literature search was finalized on the 17th of May 2016 to collect all studies reporting the accuracy of cMRI, dMRA and iMRA in diagnosing chondral and labral lesions associated with FAI using surgical results (arthroscopic or open) as the reference test. Pooled sensitivity and specificity with 95% confidence intervals were calculated for cMRI, dMRA and iMRA using a random-effects meta-analysis. The area under the receiver operating characteristic (ROC) curve (AUC) was also retrieved whenever possible, with AUC taken as equivalent to diagnostic accuracy. Results The search yielded 192 publications, which were reviewed according to the inclusion and exclusion criteria; 21 studies, with a total of 828 cases, fulfilled the eligibility criteria for the qualitative analysis, and 12 of these were included in the quantitative meta-analysis. For labral lesions, the pooled sensitivity, specificity and AUC were 0.864, 0.833 and 0.88 for cMRI and 0.91, 0.58 and 0.92 for dMRA. For chondral lesions, the pooled sensitivity, specificity and AUC were 0.76, 0.72 and 0.75 for cMRI and 0.75, 0.79 and 0.83 for dMRA, while iMRA had a sensitivity of 0.722 and a specificity of 0.917. Conclusions The present meta-analysis showed that diagnostic test accuracy was superior for dMRA compared with cMRI for detection of labral and chondral lesions.
The diagnostic test accuracy was superior for labral lesions compared with chondral lesions in both cMRI and dMRA. Promising results were obtained for iMRA, but further studies are still needed to fully assess its diagnostic accuracy. Electronic supplementary material The online version of this article (doi:10.1186/s12891-017-1443-2) contains supplementary material, which is available to authorized users.


Background
Femoroacetabular impingement (FAI) has become a well-established syndrome with characteristic clinical and radiological findings [1]. The condition involves pathological, repetitive impingement of the surrounding soft tissue structures, mostly the labrum and the adjacent cartilage, leading to their damage and to pain. It has been associated both with specific morphotypes and with extreme or repetitive motion (e.g. kickboxing and soccer) [2][3][4].
Two different types of FAI morphology have been described. The first is cam type morphology, which is characterized by a non-spherical portion of the femoral head (including the pistol-grip deformity, decreased head-neck offset, increased alpha angle, overgrowth of the femoral head epiphysis and subclinical slipped epiphysis). The second is pincer type morphology, which is characterized by anterior overcoverage of the acetabulum (including coxa profunda, acetabular retroversion, and lateral rim lesions). Most symptomatic hips, however, have been reported to have mixed morphology, in which both femoral (cam) and acetabular (pincer) factors are present [1,2,5].
Both morphotypes are highly prevalent in asymptomatic populations, reaching 30% in some studies. This indicates that the presence of this morphology is not always a pathological finding that requires intervention [3, 6-10]. The precise diagnosis of FAI may therefore be difficult, because both clinical examination and plain radiographs have limited reliability in identifying labral and chondral damage [5].
Magnetic resonance imaging (MRI) in general has superior soft tissue contrast and is reliable in assessing the acetabular labrum and articular cartilage of the hip. Accordingly, different scanning protocols have been developed for the evaluation of FAI, including conventional magnetic resonance imaging (cMRI), direct magnetic resonance arthrography (dMRA) and indirect magnetic resonance arthrography (iMRA).
To date, a gold standard has not been well established, as several studies comparing the accuracy of these different protocols obtained variable outcomes [11][12][13][14]. Moreover, there is a debate about whether introduction of contrast material increases the accuracy of cMRI or not. Introduction of contrast material may be done directly by intra-articular injection into the joint as in dMRA or indirectly by intravenous injection as in iMRA [15][16][17][18][19].
More recently, biochemical imaging analysis of the chondral surface has shown good results for the diagnosis of early abnormalities. In the delayed Gadolinium Enhanced Magnetic Resonance Imaging of Cartilage (dGEMRIC) technique, which is a common iMRA protocol, an intravenous dose of gadolinium is given, followed by a short period of exercise and then imaging. Early images can be used to determine cartilage morphology, and delayed images can be obtained to assess biochemical structure [20].
iMRA has some potential advantages over dMRA: it is a simpler and less invasive procedure, it may be better accepted by patients, and it can be easily arranged and performed at any imaging facility [21].
In 2011 Smith et al. [19] performed a meta-analysis of the accuracy of cMRI and dMRA in diagnosing acetabular labral tears, but they included labral tears of all causes rather than only those associated with FAI and, unlike this review, did not include iMRA.
In 2011 Smith et al. [18] performed another meta-analysis, of the accuracy of cMRI, MRA and computed tomography in diagnosing chondral lesions of the hip, but they again included chondral lesions of all causes rather than only those associated with FAI and, unlike this review, did not include iMRA. Their meta-analysis had some heterogeneity because it pooled results from studies using different magnetic resonance (MR) field strengths, and we also noticed that it included a study of the knee rather than the hip [22].
There are still no clear protocols or recommendations for MRI in diagnosing FAI, and this is most evident with regard to the associated chondral lesions. Smith et al. [18,19] performed the searches for their meta-analyses about 6 years ago; we therefore reviewed the current evidence on the accuracy of cMRI, dMRA and iMRA in the detection of chondral and labral lesions in FAI.

Methods
We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [23] statement.
A thorough search of the literature was conducted and was completed on the 17th of May 2016.
The primary database was the Medical Literature Analysis and Retrieval System Online (Medline) (via PubMed), and the additional database was Web of Science; both were searched from their inception to the search date. The search was supplemented by hand-searching of databases and of the references of relevant articles and reviews. The search strategy combined Medical Subject Headings (MeSH) terms and free-text words and is presented in Table 1.

Eligibility criteria
All studies reporting the diagnostic test accuracy (sensitivity/specificity) of cMRI, dMRA and iMRA for the assessment of chondral and labral lesions in FAI, with surgical comparison (open or arthroscopic) as the reference test, were included.

Study identification
All search data were collected, and initial screening of the abstracts was performed by one reviewer (Saied AM) based on the inclusion and exclusion criteria. Full-text documents were obtained for all studies meeting the criteria above, and further analysis was done by two reviewers (Saied AM, Audenaert EA): Audenaert EA checked the data collected by Saied AM, and the two investigators then agreed on the final data. The data extracted included country of study, sample size, mean age, type of magnetic resonance procedure, type of lesion analyzed (acetabular chondral delamination, combined chondral lesions, femoral head chondral lesions, acetabular chondral lesions and labral lesions), and sensitivity and specificity for each type of lesion.
The frequencies of true-positives (TP), true-negatives (TN), false-positives (FP) and false-negatives (FN) of the MRI studies relative to the reference test were also collected to perform the statistical meta-analysis. If these data were insufficient, attempts were made to estimate the values, and if this was not possible the study was excluded from the meta-analysis.
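As a minimal sketch of how the extracted frequencies translate into accuracy measures, the Python snippet below computes sensitivity and specificity from a 2x2 table. The class name `TwoByTwo` and the example counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of an index test (MRI) against the reference test (surgery)."""
    tp: int  # lesion on MRI, lesion at surgery
    fp: int  # lesion on MRI, no lesion at surgery
    fn: int  # no lesion on MRI, lesion at surgery
    tn: int  # no lesion on MRI, no lesion at surgery

    def sensitivity(self) -> float:
        # Proportion of surgically confirmed lesions detected by MRI.
        with_lesion = self.tp + self.fn
        return self.tp / with_lesion if with_lesion else float("nan")

    def specificity(self) -> float:
        # Proportion of surgically normal hips read as normal on MRI.
        without_lesion = self.tn + self.fp
        return self.tn / without_lesion if without_lesion else float("nan")


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=18, fp=2, fn=4, tn=6)
print(f"sensitivity = {example.sensitivity():.3f}")
print(f"specificity = {example.specificity():.3f}")
```

A study that reports only sensitivity, specificity and group sizes can sometimes be converted back to such a table, which is presumably what estimating the values refers to.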
The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria were used to assess the methodological quality of all studies [24]. The QUADAS-2 tool distinguishes between bias and applicability and identifies 4 key domains, supported by signaling questions, to aid in rating risk of bias and concerns about applicability as "high" or "low".
For risk of bias, signaling questions are answered as "yes," "no," or "unclear" and are phrased such that "yes" indicates low risk of bias. Risk of bias is determined as "low," "high," or "unclear". If the answers to all signaling questions for a domain are "yes," then risk of bias can be determined low. If any signaling question is answered "no," potential for bias occurs. The "unclear" category should be used only when insufficient data are reported to permit a judgment.
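The rating rule described above can be sketched in Python as follows. The function name `domain_risk_of_bias` is hypothetical, and mapping any "no" answer directly to "high" is a simplifying assumption: in QUADAS-2 a "no" only flags potential for bias, and the final rating remains a reviewer judgment.

```python
def domain_risk_of_bias(answers):
    """Rate one QUADAS-2 domain from its signaling-question answers.

    `answers` is a list of strings, each "yes", "no" or "unclear".
    """
    allowed = {"yes", "no", "unclear"}
    normalized = [a.strip().lower() for a in answers]
    if not normalized or any(a not in allowed for a in normalized):
        raise ValueError("answers must be a non-empty list of yes/no/unclear")

    if all(a == "yes" for a in normalized):
        return "low"      # every signaling question answered "yes"
    if "no" in normalized:
        return "high"     # assumption: any "no" is rated as high risk
    return "unclear"      # only "yes" and "unclear" answers remain


print(domain_risk_of_bias(["yes", "yes", "unclear"]))
```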
Applicability concerns indicate whether the study matches the review question, so the systematic review question must be stated in terms of patients, index tests, and reference standard. Concerns about applicability are rated as "low," "high," or "unclear". The "unclear" result should be used only when insufficient data are reported [24].
Meta-analysis was done by calculating the pooled sensitivity and specificity with 95% confidence intervals using a random-effects model. Only studies with similar types of MRI, magnetic field strengths and lesion types were pooled. The area under the receiver operating characteristic (ROC) curve (AUC) [25] was also retrieved whenever possible, with AUC representing the diagnostic accuracy. AUC values are graded as the following: All analyses were done with SPSS version 18.0 (SPSS Inc, Chicago, Illinois) and Meta-Disc (Unit of Clinical Biostatistics, Ramón y Cajal Hospital, Madrid, Spain) [26] (Additional files 1, 2, 3 and 4).
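To illustrate what random-effects pooling of a proportion such as sensitivity involves, the following Python sketch applies the DerSimonian-Laird estimator on the logit scale with a 0.5 continuity correction. This is an assumed, commonly used approach and is not necessarily the exact procedure implemented in Meta-Disc; the function names and the example counts are hypothetical.

```python
import math


def logit_with_variance(events, total):
    """Logit of a proportion and its approximate variance (0.5 correction)."""
    e = events + 0.5
    non_e = (total - events) + 0.5
    return math.log(e / non_e), 1.0 / e + 1.0 / non_e


def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, tau^2) for a random-effects model."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures between-study variation around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    df = len(effects) - 1
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


def pooled_proportion(counts, z=1.96):
    """Pool (events, total) pairs; return (estimate, lower, upper, tau^2)."""
    pairs = [logit_with_variance(e, n) for e, n in counts]
    pooled, se, tau2 = dersimonian_laird([p[0] for p in pairs],
                                         [p[1] for p in pairs])

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return expit(pooled), expit(pooled - z * se), expit(pooled + z * se), tau2


# Hypothetical (true positives, lesions at surgery) pairs for three studies.
estimate, lower, upper, tau2 = pooled_proportion([(18, 22), (25, 27), (9, 12)])
print(f"pooled sensitivity = {estimate:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

The same function pools specificity when it is given (true negatives, hips without lesion at surgery) pairs instead.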

Results
The results of the literature search strategy are illustrated in the PRISMA flowchart (Fig. 1). A total of 192 papers were collected, of which 21 studies met the eligibility, inclusion and exclusion criteria and were included in the qualitative analysis using the QUADAS-2 criteria.
Only 12 studies were included in the quantitative meta-analysis. Five studies were excluded because no data were available regarding TP, TN, FP and FN [21,[27][28][29][30] (Table 3). To decrease heterogeneity of the data, another 4 studies were excluded because it was not suitable to pool their results with those of other studies: one study did not report the MR field strength used [17], another used a 1 T scanner [15] and two used a 3 T scanner (one used cMRI [14] and the other used dMRA [31]) (Fig. 1).

Qualitative analysis
The results of the QUADAS-2 assessment showed that all studies had low risk for applicability concerns. There was some variation in the results for the risk of bias, especially for the description of the time between MRI and surgery. Nine studies did not describe the time between MRI and surgery, while it was mentioned in 12 studies; in 5 of these the time interval exceeded 3 months in some cases, while in the other 7 the time interval was below 3 months. Most studies showed that the surgical procedures were done with knowledge of the radiological findings, and sometimes clinical data were available to radiologists when they reviewed the images.
Most studies showed high risk of bias in the identification of cohort recruitment. All studies showed that the patients received both the reference test (surgery) and the index test (MRI) and that the surgery was independent of the MRI. All studies showed that MRI interpretation was done without knowledge of the surgical findings (Table 2).

Study demographics
A total of 828 cases were included. The mean age of the study cohorts, reported in 19 studies, was 34.4 years, with study means ranging from 19 to 43 years. The time from radiological assessment to surgical comparison was documented in 12 studies and ranged from within 3 days to within 6 months [12, 14-16, 21, 27, 28, 32-34].
All sensitivities and specificities of cMRI, dMRA and iMRA were retrieved and organized by study ID and type of lesion analyzed (acetabular chondral delamination, combined chondral lesions, femoral head chondral lesions, acetabular chondral lesions and labral lesions) (Table 3).
Quantitative meta-analysis
Conventional magnetic resonance imaging

A) Labral lesions
Three studies were included. The sensitivity and specificity results for each study are illustrated in Fig. 2. The results showed some variation between studies, and this was reflected in the summary ROC diagram (Fig. 3). The pooled analysis indicated a sensitivity of 0.864 (95% CI: 0.757-0.936), a specificity of 0.833 (95% CI: 0.359-0.996) and an AUC of 0.88 (Table 4).
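The wide interval around the specificity reflects how few hips without a labral lesion were available. As a hedged illustration, the Python sketch below computes an exact (Clopper-Pearson) binomial confidence interval, which is one common choice for small samples; whether the software used in this review computes its intervals this way is an assumption, and the counts (5 of 6) are hypothetical, chosen only because 5/6 is approximately 0.833.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))


def _solve_increasing(func, target, iterations=200):
    """Bisection on [0, 1] for func(p) = target, with func increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for x successes in n trials."""
    if x == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= x | p) = alpha / 2.
        lower = _solve_increasing(lambda p: 1.0 - binom_cdf(x - 1, n, p),
                                  alpha / 2.0)
    if x == n:
        upper = 1.0
    else:
        # Upper bound: P(X <= x | p) = alpha / 2.
        upper = _solve_increasing(lambda p: 1.0 - binom_cdf(x, n, p),
                                  1.0 - alpha / 2.0)
    return lower, upper


# Hypothetical counts: 5 true negatives among 6 hips without a lesion.
low, high = clopper_pearson(5, 6)
print(f"specificity = {5 / 6:.3f} (95% CI: {low:.3f}-{high:.3f})")
```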

B) Chondral lesions
Eight studies were included and 12 data sets were retrieved. The individual sensitivity and specificity results are presented in Fig. 8. The summary ROC diagram (Fig. 9) showed an AUC of 0.83. The pooled results are given in Table 4.

Indirect magnetic resonance arthrography
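Chondral lesions
Two studies were included and 2 data sets were retrieved. The individual sensitivity and specificity results are presented in Fig. 10. The pooled analysis indicated a sensitivity of 0.722 (95% CI: 0.465-0.903) and a specificity of 0.917 (95% CI: 0.615-0.998) (Table 4).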

Discussion
The results of the QUADAS-2 assessment showed that about 45% of studies did not mention the interval between MRI and surgery, and about 23% of studies reported an interval of up to 6 months. This could increase the possibility that the patient's chondral or labral condition changed between the index and reference tests. However, 35% of studies reported an interval of up to 3 months, which is considered an acceptable duration (Table 2). The results for patient selection showed low risk of applicability concerns in all studies, which could be explained by restricting the eligibility criteria of this review to FAI patients. Patient selection nevertheless showed high risk of bias in most studies, as only 2 studies [32,34] reported that a consecutive or random sample of patients was enrolled. All studies, however, avoided a case-control design and avoided inappropriate exclusions (Table 2).
The results of the QUADAS-2 assessment also showed low risk of applicability concerns in all studies with regard to the reference standard and the index test. In all studies the index test was conducted in line with the review question, and the target condition as defined by the reference standard matched the review question (Table 2) [24].
All studies showed that the surgical procedures were done with knowledge of the radiological findings; this explains the high risk of bias with regard to the reference standard. This bias was inevitable because it is difficult to blind surgeons to the MRI findings (Table 2).
There was some heterogeneity between studies, so we tried to decrease it by pooling results only from studies with similar types of MRI, magnetic field strengths and lesion types (Table 4). There was also some variability between studies in assessors, imaging planes, sequences, slice thicknesses and resolution. These factors were too complex to be analyzed as part of this meta-analysis, and this should be considered when interpreting the pooled results.
We believe that MRI is highly dependent on the radiologists who read the images; this factor could also explain the heterogeneity of results even between similar studies. Differences in accuracy between radiologists still need further study, and there were insufficient data to perform a meta-analysis on this point.
Only one study used a 1 T magnet [15]; this low field strength may not be appropriate for the hip because it is a large and deep joint. Three studies used a 3 T magnet. Tian et al. [14] concluded that dMRA at 3.0 T was a more accurate method for diagnosing acetabular labral tears, with significantly greater sensitivity and negative predictive value (NPV) than cMRI, but in their study only 30% of the study population underwent dMRA while 100% underwent cMRI. This was the only study using a 3.0 T magnet for both dMRA and cMRI, which is why it was not possible to perform a meta-analysis of the difference between field strengths. Gonzalez et al. [31] found 87% sensitivity and 77% specificity for the diagnosis of labral lesions using dMRA. For chondral lesions they found lower values in both locations, acetabular and femoral.
These studies showed good results with a high field strength magnet, which was explained by the increased signal-to-noise ratio, which helps in the detailed assessment of intra-articular structures such as the labrum and cartilage [41,42]. More investigation of 3 T imaging is therefore required to determine its accuracy precisely.
dMRA has a number of disadvantages: the injection of gadolinium directly into the joint is an invasive procedure and carries a small risk of joint infection [43]. The use of contrast material also increases both the cost and the duration of a dMRA examination compared with cMRI.

Labral lesions
Eight studies of dMRA and 3 studies of cMRI were included in the meta-analysis (Figs. 2, 3, 6 and 7). The results showed that diagnostic test accuracy was superior for dMRA compared with cMRI for detection of labral lesions (Table 4).
Similar results were obtained in a meta-analysis by Smith et al. [19], which showed that the pooled sensitivity and specificity for diagnosing acetabular labral tears were 66% and 79% for cMRI and 87% and 64% for dMRA. However, that meta-analysis did not analyze iMRA studies and included labral pathology of all causes rather than only that associated with FAI, as in this review.
Sutter et al. [12] showed that dMRA had an advantage over cMRI in the detection of labral tears for one reader, whereas both methods were equivalent for the other reader. These results show that MRI interpretation is operator dependent, which was confirmed in another study: McGuire et al. [13] showed that musculoskeletal (MSK) specialists were more accurate than general radiologists in detecting labral lesions, and that dMRA was more accurate than cMRI in detecting labral lesions for both groups of radiologists. Both papers [12,13] compared the accuracy of dMRA with that of cMRI and concluded that dMRA had higher accuracy in detecting labral lesions, which matches the results of this meta-analysis.
Keeney et al. [44] showed that a negative dMRA result does not exclude important intra-articular pathology that can be identified and managed. For labral lesions they reported a sensitivity of 71% and a specificity of 44%. They stated that the assessment of the specificity of dMRA for labral lesions was limited by the small number of patients without acetabular labral tears in their study.
Reurink et al. [34] showed that the overall sensitivity and specificity for detecting labral lesions were 86% and 75%. They concluded that dMRA has a poor negative predictive value and cannot be used to rule out a labral tear when there is high clinical suspicion of such a tear, which matches the results of this meta-analysis.

Chondral lesions
Eight studies of dMRA and 3 studies of cMRI were included in the meta-analysis (Figs. 4, 5, 8 and 9). The results showed that diagnostic test accuracy was superior for dMRA compared with cMRI for detection of chondral lesions (Table 4).
Smith et al. [18] obtained different results in their meta-analysis: they concluded that accuracy for the diagnosis of hip joint chondral lesions is higher for cMRI than for dMRA. However, that meta-analysis did not analyze iMRA studies and included chondral pathology of all causes rather than only that associated with FAI. It also had some heterogeneity because it pooled results from studies using different MR field strengths. They found that the pooled sensitivity and specificity for diagnosing chondral lesions were 59% and 94% for cMRI and 62% and 86% for dMRA.
Sutter et al. [12] showed that dMRA was superior to cMRI for detecting acetabular cartilage defects but for femoral cartilage lesions, both modalities yielded comparable results. They indicated that both dMRA and cMRI allow identification of the patients with extensive cartilage damage at the acetabular rim. For patients with non-extensive cartilage damage at the acetabular rim dMRA showed increased accuracy compared with cMRI.
McGuire et al. [13] showed that MSK radiologists performed better than community radiologists in terms of overall accuracy. Accuracy rates for MSK radiologists were 79% and 59% for acetabular and femoral chondral lesions, respectively, whereas accuracy rates for community radiologists were 28% and 52%. Accuracy was significantly higher for both groups of radiologists when dMRA rather than cMRI was reviewed, and the authors concluded that dMRA is more sensitive and specific for diagnosing hip joint pathology.
The previous 2 studies compared the accuracy of dMRA with that of cMRI and concluded that dMRA had higher accuracy in detecting chondral lesions, which matches the results of this meta-analysis.
Keeney et al. [44] showed that, for chondral lesions, dMRA had a sensitivity of 47%, a specificity of 89% and an accuracy of 67%. Aprato et al. [33] also showed that the role of dMRA in evaluating chondral lesions is limited: for femoral chondral lesions the sensitivity was 46% and the specificity 81%, and for acetabular cartilage injuries the sensitivity was 69% and the specificity 88%. These data show that a negative dMRA study should not rule out the presence of a chondral lesion if it is clinically suspected.
Four studies [12,13,33,38] assessed the accuracy of dMRA in the detection of acetabular chondral lesions; the pooled sensitivity and specificity were 0.86 and 0.68, respectively, and the AUC was 0.86. Four studies [12,13,33,38] assessed the accuracy of dMRA in the detection of femoral head chondral lesions; the pooled sensitivity and specificity were 0.69 and 0.75, respectively, and the AUC was 0.75. Two studies [36,37] assessed the accuracy of dMRA in the detection of combined chondral lesions; the pooled sensitivity and specificity were 0.51 and 0.88, respectively.
Two studies [12,13] assessed the accuracy of cMRI in the detection of acetabular chondral lesions; the pooled sensitivity and specificity were 0.84 and 0.88, respectively. Two studies [12,13] assessed the accuracy of cMRI in the detection of femoral head chondral lesions; the pooled sensitivity and specificity were 0.73 and 0.85, respectively. This analysis of the different chondral lesions showed similar results except for femoral head chondral lesions, which had the lowest values; this could be explained by the tight congruence of the hip joint and the difficulty of recognizing femoral head cartilage.

iMRA
We performed an analysis of iMRA, but the studies were few in number: only 2 studies of chondral lesions provided suitable data for meta-analysis [21,35]. The pooled analysis indicated a sensitivity of 0.722 (95% CI: 0.465-0.903) and a specificity of 0.917 (95% CI: 0.615-0.998). For chondral lesions, iMRA showed acceptable results similar to those of dMRA, but further studies of this technique are still needed to obtain more reliable data (Table 4).
It was not possible to analyze data on labral tears, as the required values could not be calculated. The two studies that presented data on labral lesions using iMRA reported sensitivity and specificity of (0.89-0.99) in Petchprapa et al. [21] and (1.00-1.00) in Zlatkin et al. [35]; these results are very high and indicate that further studies of iMRA are still needed.