- Open Access
Development of convolutional neural network model for diagnosing meniscus tear using magnetic resonance image
BMC Musculoskeletal Disorders volume 23, Article number: 510 (2022)
Deep learning (DL) is an advanced machine learning approach used in diverse areas, such as image analysis, bioinformatics, and natural language processing. A convolutional neural network (CNN) is a representative DL model that is advantageous for image recognition and classification. In this study, we aimed to develop a CNN to detect meniscal tears and classify tear types using coronal and sagittal magnetic resonance (MR) images of each patient.
We retrospectively collected 599 cases (medial meniscus tear = 384, lateral meniscus tear = 167, and medial and lateral meniscus tear = 48) of knee MR images from patients with meniscal tears and 449 cases of knee MR images from patients without meniscal tears. To develop the DL model for evaluating the presence of meniscal tears, all the collected knee MR images of 1048 cases were used. To develop the DL model for evaluating the type of meniscal tear, 538 cases with meniscal tears (horizontal tear = 268, complex tear = 147, radial tear = 48, and longitudinal tear = 75) and 449 cases without meniscal tears were used. Additionally, a CNN algorithm was used. To measure the model’s performance, 70% of the included data were randomly assigned to the training set, and the remaining 30% were assigned to the test set.
The areas under the curve (AUCs) of our model were 0.889, 0.817, and 0.924 for medial meniscal tears, lateral meniscal tears, and medial and lateral meniscal tears, respectively. The AUCs for horizontal, complex, radial, and longitudinal tears were 0.761, 0.850, 0.601, and 0.858, respectively.
Our study showed that the CNN model has the potential to be used in diagnosing the presence of meniscal tears and differentiating the types of meniscal tears.
A meniscus tear resulting from trauma or degeneration is a common cause of persistent knee pain. It also results in reduced function, a low quality of life, and early osteoarthritis. Accurate detection of meniscal tears is essential for adequate and effective treatment. In addition, depending on the type of meniscal tear, the treatment options range from conservative to surgical [3, 4]. Magnetic resonance imaging (MRI) is the most useful and accurate non-invasive tool for diagnosing meniscal tears. It is typically used as the first method for evaluating suspected meniscal tears and can effectively show the location and type of meniscal tear. However, the diagnostic accuracy of MRI for evaluating the presence and type of meniscal tears differs between clinicians specializing in knee disease and other clinicians. A system that aids in reading knee MRI would therefore be of great help to clinicians managing patients suspected of having a meniscus tear.
Machine learning (ML) refers to computer algorithms that automatically learn from data without requiring explicit programming. ML has enabled breakthroughs in several fields, such as big data analysis, image analysis, natural language processing, and bioinformatics [7,8,9,10,11,12]. In addition, the usefulness of ML in the diagnosis of various musculoskeletal disorders has been demonstrated [13,14,15]. The deep learning (DL) technique is an advanced ML approach. DL involves the construction of artificial neural networks using numerous hidden layers with structures and functions similar to those of the human brain. The DL technique can learn from unstructured and perceptual data, such as images and languages, and can outperform traditional ML techniques. A convolutional neural network (CNN) is a representative DL model that is particularly advantageous in image recognition and classification. Previous studies have shown that a CNN can be useful for determining the presence of meniscal tears in knee MRI images [18,19,20,21]. A CNN model that can differentiate tear location in the anterior horn, body, and posterior horn was recently developed. We hypothesized that a CNN could be useful for classifying tear types (horizontal, complex, radial, and longitudinal tears) in addition to detecting meniscal tears.
In this study, we developed a CNN model to diagnose meniscal tears and classify the types of meniscal tears using knee magnetic resonance (MR) images of each patient, and we evaluated its accuracy.
We retrospectively collected 599 knee MR images from patients with meniscal tears and 449 knee MR images from patients without meniscal tears. All MR images were obtained from a single university hospital from January 2010 to December 2020 (mean age = 38.7 ± 16.5 years; M:F = 729:319). To develop the DL model for evaluating the presence of meniscal tears, all collected knee MR images of the 599 cases with meniscal tears (medial meniscus tear = 384, lateral meniscus tear = 167, medial and lateral meniscus tears = 48) and 449 cases without meniscal tears were used. Tears of the meniscus on MR images were independently assessed by two board-certified orthopedic knee specialists, and the assessment was repeated 2 weeks later. If the two experts disagreed, a third orthopedic knee specialist made the final decision on the grade. Reliabilities for all radiographic parameters were analyzed using intra-class correlation coefficients and were classified as little (correlation coefficient, ≤ 0.25), low (0.26–0.49), moderate (0.50–0.69), high (0.70–0.89), or very high (≥ 0.90). To develop a DL model for evaluating the type of meniscal tear, 538 cases with meniscal tears (horizontal tear = 268, complex tear = 147, radial tear = 48, longitudinal tear = 75) (Fig. 1) and 449 cases without meniscal tears were used. The study protocol was approved by the institutional review board of the university hospital. The Institutional Review Board waived the requirement for written informed consent because this study was performed retrospectively using anonymous data. The Helsinki Declaration was adhered to in this study.
Images used for deep learning (input variables)
All MRI examinations were performed using a 1.5 T MR scanner (Philips Medical Systems, Eindhoven, Netherlands). We used fat-suppressed T2-weighted coronal and sagittal images containing the meniscus (repetition time, 2480–5000 ms; echo time, 19–25 ms; section thickness, 4 mm; NEX, 3.0; 192 × 2; matrix, 192 × 256).
Deep learning model
This study consisted of two main components: 1) determining meniscal tears and 2) classifying tear type. In this study, we trained the model for tear detection and tear type independently.
CNN model for meniscus tear
Coronal and sagittal MR images were used as inputs to determine the presence of meniscal tears, and the features of the coronal and sagittal MR images were extracted using two CNN models. Each CNN model used AlexNet as the backbone, and the input size of each CNN model was s × 224 × 224 × 3. Here, s indicates the number of 2D images (slices) included in the MRI series and 3 indicates the number of RGB color channels. Each CNN model consisted of five convolutional layers and a global average pooling layer. The feature maps generated by the two models were concatenated and passed to the fully connected part of the network, which consisted of two layers. These two layers contained a dropout layer and used a sigmoid function to classify meniscal tears. Figure 2 illustrates the CNN model used to identify meniscal tears. The detailed architecture of the CNN model is shown in Table 1.
CNN model for the type of meniscus tear
Coronal MR images were used as inputs to classify the type of meniscal tear. Our CNN model extracted image features for the tear type using AlexNet as the backbone. The input size of this CNN model was s × 224 × 224 × 3, and the features of the meniscus image were extracted through the five convolutional layers. The extracted feature maps were averaged across the image slices and then passed to the fully connected part of the network, which comprised three layers; the sigmoid function was used as the last activation function. Figure 3 illustrates the CNN model used to determine the type of meniscal tear. The detailed architecture of the CNN model is shown in Table 2.
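The two architectures described above can be sketched in PyTorch as follows. This is only an illustrative reading of the text, not the authors' implementation: the hidden-layer widths, the dropout rate, the single sigmoid output, and the use of slice averaging in the detection model are assumptions, since Tables 1 and 2 are not reproduced here. The convolutional stack follows the standard AlexNet layout with random weights (the study initialized from pretrained AlexNet weights), and inputs are given channels-first (s × 3 × 224 × 224), as PyTorch expects.

```python
import torch
from torch import nn


def alexnet_style_features() -> nn.Sequential:
    """Five convolutional layers in the standard AlexNet layout (random init)."""
    return nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(64, 192, kernel_size=5, padding=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
    )


class SliceEncoder(nn.Module):
    """Encode a stack of s slices into a single 256-dimensional vector."""

    def __init__(self):
        super().__init__()
        self.features = alexnet_style_features()
        self.gap = nn.AdaptiveAvgPool2d(1)  # global average pooling

    def forward(self, slices):  # slices: (s, 3, 224, 224)
        per_slice = self.gap(self.features(slices)).flatten(1)  # (s, 256)
        return per_slice.mean(dim=0, keepdim=True)  # average over slices -> (1, 256)


class TearDetector(nn.Module):
    """Two branches (coronal, sagittal), concatenation, two FC layers, sigmoid."""

    def __init__(self, hidden=128, dropout=0.5):  # hidden and dropout are assumed
        super().__init__()
        self.coronal = SliceEncoder()
        self.sagittal = SliceEncoder()
        self.head = nn.Sequential(
            nn.Linear(512, hidden), nn.ReLU(inplace=True), nn.Dropout(dropout),
            nn.Linear(hidden, 1),
        )

    def forward(self, coronal, sagittal):
        fused = torch.cat([self.coronal(coronal), self.sagittal(sagittal)], dim=1)
        return torch.sigmoid(self.head(fused))  # (1, 1) tear probability


class TearTypeClassifier(nn.Module):
    """Single coronal branch, slice averaging, three FC layers, sigmoid."""

    def __init__(self, hidden=(128, 32)):  # layer widths are assumed
        super().__init__()
        self.encoder = SliceEncoder()
        self.head = nn.Sequential(
            nn.Linear(256, hidden[0]), nn.ReLU(inplace=True),
            nn.Linear(hidden[0], hidden[1]), nn.ReLU(inplace=True),
            nn.Linear(hidden[1], 1),
        )

    def forward(self, coronal):
        return torch.sigmoid(self.head(self.encoder(coronal)))


# Forward pass on random tensors standing in for normalized MR slices.
torch.manual_seed(0)
detector = TearDetector().eval()
type_model = TearTypeClassifier().eval()
with torch.no_grad():
    p_tear = detector(torch.rand(5, 3, 224, 224), torch.rand(7, 3, 224, 224))
    p_type = type_model(torch.rand(5, 3, 224, 224))
print(p_tear.shape, p_type.shape)
```

Because each encoder averages over the slice dimension, the coronal and sagittal series may contain different numbers of slices, which is consistent with the variable s in the stated input size.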
All of our models were implemented in PyTorch version 1.7.0 and were tested on an NVIDIA GeForce RTX 2080TI. All MR images were normalized between 0 and 1 (pixel value/255). We retrained the model using the weight of the pretrained AlexNet model as the initial weight. The batch size and epoch of each model were set to 1 and 100, respectively, and the training model was optimized using the Adam optimizer method.
The MRI data of meniscal tears were categorized as follows: 1) To develop a model to determine the presence of meniscal tears: normal, medial meniscus, lateral meniscus, and medial and lateral meniscal tears. 2) To develop a model to differentiate between the types of meniscal tears: normal, horizontal, complex, radial, and longitudinal.
The details of the dataset configurations are presented in Tables 3 and 4. For each case, 70% of the dataset was randomly selected as the training set, whereas the remaining 30% was assigned to the test set to evaluate the model performance.
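A per-class 70/30 random split of the kind described above might look like the following sketch. The case identifiers, the fixed seed, and the rounding rule are assumptions made for illustration; the class counts are those given for the tear-detection dataset, and the resulting set sizes need not match Tables 3 and 4 exactly.

```python
import random


def split_cases(case_ids, train_fraction=0.7, seed=0):
    """Shuffle the cases of one class and split them into train and test lists."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility (assumption)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Class counts stated for the tear-detection dataset.
class_counts = {"normal": 449, "medial": 384, "lateral": 167, "medial_and_lateral": 48}

splits = {}
for label, n in class_counts.items():
    ids = [f"{label}_{i:03d}" for i in range(n)]  # hypothetical case identifiers
    splits[label] = split_cases(ids)

for label, (train, test) in splits.items():
    print(f"{label}: train={len(train)}, test={len(test)}")
```

Splitting within each class keeps the class proportions of the training and test sets close to those of the full dataset, which matters here because the classes are strongly imbalanced.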
The performance of the model was evaluated in terms of accuracy, precision, recall, sensitivity, specificity, and area under the curve (AUC); recall and sensitivity denote the same quantity and are reported under both names. The 95% confidence interval for the AUC was calculated using the method described by DeLong et al.
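As a minimal sketch, the threshold-based metrics can be computed from the four confusion-matrix counts, and the AUC can be computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The counts below are not taken from the paper's tables; they are one set of integers, assumed for illustration, that happens to reproduce the percentages reported for medial meniscal tears. The scores in the AUC example are made up, and the DeLong confidence interval is not implemented here.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (= recall), and specificity from counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # identical to recall
        "specificity": tn / (tn + fp),
    }


def auc_rank(scores_pos, scores_neg):
    """AUC as P(positive score > negative score); ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Illustrative counts (assumed, not reported): 113 tear cases, 135 normal cases.
metrics = confusion_metrics(tp=94, fp=18, tn=117, fn=19)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")

# Toy scores for three positive and three negative cases.
toy_auc = auc_rank([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(f"toy AUC: {toy_auc:.3f}")
```

Unlike accuracy or precision, the AUC does not depend on a decision threshold, which is why a class can show a moderate AUC together with a low precision when positive cases are rare.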
We evaluated our model performance and compared it with MobileNet. We used the same hyper-parameters for MobileNet and our model. In addition, the fully connected layer of MobileNet was modified, as in our model.
Table 5 shows the performance of the models that were employed to identify the presence of meniscal tears. The AUCs of our model were 0.889, 0.817, and 0.924 for medial meniscal, lateral meniscal, and medial and lateral meniscal tears, respectively, with accuracies of 85.08%, 80.54%, and 91.95%, respectively. Furthermore, the precisions for the medial meniscal, lateral meniscal, and medial and lateral meniscal tears were 83.93%, 62.96%, and 55%, respectively. The sensitivity/specificity for the medial meniscal, lateral meniscal, and medial and lateral meniscal tears were 83.19%/86.67%, 68%/85.19%, and 78.57%/93.33%, respectively. As compared with MobileNet, the proposed model showed improvements in the accuracy, precision, recall, sensitivity, specificity, and AUC of 20.97%, 21.93%, 28.32%, 28.32%, 14.82%, and 0.214, respectively, in identifying medial meniscus tears. Further, for lateral meniscus tears, the metrics improved by 16.22%, 22.96%, 4%, 4%, 20.75%, and 0.143, respectively, for the proposed model. The metrics associated with medial and lateral meniscus tears improved by 16.78%, 34.49%, 21.43%, 21.43%, 16.29%, and 0.273, respectively.
Table 6 presents the performance results for the different types of meniscal tears. The AUCs of our model were 0.761, 0.850, 0.601, and 0.858 for the horizontal, complex, radial, and longitudinal tears, respectively, with accuracies of 72.23%, 91.02%, 72.48%, and 81.53%, respectively. Additionally, the precisions for the horizontal, complex, radial, and longitudinal tears were 59.3%, 81.48%, 15.38%, and 40.54%, respectively. The sensitivity/specificity for the horizontal, complex, radial, and longitudinal tears were 63.75%/74.07%, 68.75%/96.3%, 42.86%/75.56%, and 68.18%/83.7%, respectively. As compared with MobileNet, the accuracy, precision, specificity, and AUC of the proposed model improved by 20.14%, 17.82%, 32.59%, and 0.219, respectively, for horizontal tears. The same metrics for complex tears improved by 26.95%, 49.43%, 35.56%, and 0.091, respectively. For radial tears, the proposed model performed better than MobileNet, with improvements of 27.93%, 66.23%, 4.46%, 4.46%, and 33.34% in terms of accuracy, precision, recall, sensitivity, and specificity, respectively. For longitudinal tears, the proposed model showed improvements of 15.29%, 18.72%, 13.63%, 13.63%, 15.55%, and 0.178 in terms of accuracy, precision, recall, sensitivity, specificity, and AUC, respectively. Figure 4 shows the receiver operating characteristic curves for the test dataset. The assessments of meniscal tears by the two orthopedic surgeons (GBK and OS) showed very high intra- and inter-observer reliabilities (Table 7).
In this study, we developed a CNN model for detecting the presence and type of meniscal tears using MR images as input data.
The AUCs for detecting the presence of tears in the medial meniscus, the lateral meniscus, and both the medial and lateral menisci were 0.889, 0.817, and 0.924, respectively (Fig. 4a). Considering that AUC ≥ 0.9, 0.9 > AUC ≥ 0.8, and 0.8 > AUC ≥ 0.7 are generally regarded as outstanding, excellent, and acceptable, respectively, our model trained using knee MRI as input data could potentially be applied for diagnosing meniscal tears in clinical practice. Regarding the capacity to differentiate the type of meniscal tear, the AUCs were 0.761, 0.850, 0.601, and 0.858 for horizontal, complex, radial, and longitudinal tears, respectively (Fig. 4b). With the exception of radial tears, determination of the other three types of meniscal tears was at least acceptable.
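The interpretation bands quoted above can be applied mechanically to the reported AUCs, as in the short sketch below. The label "below acceptable" for AUC < 0.7 is an assumption added for illustration, as are the dictionary keys.

```python
def auc_band(auc):
    """Map an AUC to the interpretation bands quoted in the text."""
    if auc >= 0.9:
        return "outstanding"
    if auc >= 0.8:
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    return "below acceptable"  # label assumed; the text names no band below 0.7


# AUCs reported in the Results for tear detection and tear-type classification.
reported_aucs = {
    "medial": 0.889,
    "lateral": 0.817,
    "medial_and_lateral": 0.924,
    "horizontal": 0.761,
    "complex": 0.850,
    "radial": 0.601,
    "longitudinal": 0.858,
}

bands = {name: auc_band(auc) for name, auc in reported_aucs.items()}
for name, band in bands.items():
    print(f"{name}: {reported_aucs[name]:.3f} -> {band}")
```

Under these bands, radial tears are the only category that falls below the acceptable range.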
A DL model consists of a multilayer perceptron with multiple hidden layers, or a feedforward neural network. It has a greater ability to learn the characteristics of input data in detail than traditional shallow neural networks. A CNN is a representative DL model. It receives multiple channels of two-dimensional data as input and transforms them repeatedly using convolution and pooling operations. These processes allow the extraction of valuable features from the input data. Therefore, CNNs have been used to recognize image patterns and process image data. Our model recognized the valuable characteristics of knee MR images, identified meniscal tears, and classified the images based on the type of meniscal tear. However, our model has a low capacity for detecting and diagnosing radial meniscal tears. This could be because fewer cases of radial tears than of the other tear types were available to develop the DL model. In addition, the relatively small size of the lesion observed on MRI in radial tears could have contributed to the low AUC.
To the best of our knowledge, four previous studies have evaluated the diagnostic efficacy of DL models for detecting meniscal tears on knee MRI [18,19,20,21]. In 2018, Bien et al. developed a CNN model using 1370 cases of knee MRI (coronal, sagittal, and axial MR images; meniscus tear, 397). The AUC value for determining the presence of meniscal tears was 0.847. In 2020, Fritz et al. used 18,520 MR images for the training set, 1000 MR images for the validation set, and 1000 MR images for the test set. They developed a deep CNN (DCNN) consisting of two 3D convolutional blocks (coronal and sagittal) to determine the presence of meniscal tears. The AUC value for diagnosing medial meniscal tears was 0.882, that for lateral meniscal tears was 0.781, and that for overall meniscal tears was 0.961. Moreover, Rizk et al. used coronal and sagittal knee MR images from 11,353 examinations. The AUC value for diagnosing medial meniscal tears was 0.93 and that for lateral meniscal tears was 0.84. Most recently, in 2021, Tack et al. used 2399 sagittal 3-dimensional MRI scans from the publicly available database of the Osteoarthritis Initiative. The AUC values for medial meniscal tears in the anterior horn, body, and posterior horn were 0.94, 0.93, and 0.93, respectively, whereas those for lateral meniscal tears were 0.96, 0.94, and 0.91, respectively. Recent studies have reported an enhancement in the accuracy of DL models for diagnosing meniscal tears [20, 21]. This can be attributed to the large number of MRI scans used for training. However, previous studies did not diagnose the type of meniscal tear. Therefore, our study is the first to develop a DL model to classify the types of meniscal tears based on knee MRI. Table 8 summarizes related work on meniscal tears.
In conclusion, using coronal and sagittal knee MR images, we developed a CNN model to diagnose the presence of meniscal tears and differentiate the types of meniscal tears. The diagnostic accuracy is generally acceptable. Although our CNN model is limited by its low accuracy for diagnosing radial tears, we believe that our study is meaningful because it is the first to distinguish the types of meniscal tears and to show that a CNN model can both detect the presence of meniscal tears and differentiate their types. In the future, diagnostic accuracy should be increased by using a larger amount of knee MRI data.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.
MRI: Magnetic resonance imaging
CNN: Convolutional neural network
AUC: Area under the curve
Hong SY, Han W, Jang J, et al. Prognostic factors of mid- to long-term clinical outcomes after arthroscopic partial meniscectomy for medial meniscal tears. Clin Orthop Surg. 2021;13:e82.
Navarro RA, Adams AL, Lin CC, et al. Does knee arthroscopy for treatment of meniscal damage with osteoarthritis delay knee replacement compared to physical therapy alone? Clin Orthop Surg. 2020;12:304–11.
Dawson LJ, Howe TE, Syme G, Chimimba LA, Roche JJW. Surgical versus conservative interventions for treating meniscal tears of the knee in adults. Cochrane Database Syst Rev. 2017;2017:CD011411.
Mordecai SC, Al-Hadithy N, Ware HE, Gupte CM. Treatment of meniscal tears: an evidence based approach. World J Orthop. 2014;5:233–41.
Lefevre N, Naouri JF, Herman S, Gerometta A, Klouche S, Bohu Y. A current review of the Meniscus imaging: proposition of a useful tool for its radiologic analysis. Radiol Res Pract. 2016;2016:8329296.
Deo RC. Machine learning in medicine. Circulation. 2015;132:1920–30.
Agn M, Munck Af Rosenschöld P, Puonti O, et al. A modality-adaptive method for segmenting brain tumors and organs-at-risk in radiation therapy planning. Med Image Anal. 2019;54:220–37.
Aubert B, Vazquez C, Cresson T, Parent S, de Guise JA. Toward automated 3D spine reconstruction from Biplanar radiographs using CNN for statistical spine model fitting. IEEE Trans Med Imaging. 2019;38:2796–806.
Chen Y, Li D, Zhang X, Jin J, Shen Y. Computer aided diagnosis of thyroid nodules based on the devised small-datasets multi-view ensemble learning. Med Image Anal. 2021;67:101819.
Ge R, Yang G, Chen Y, et al. K-net: integrate left ventricle segmentation and direct quantification of paired Echo sequence. IEEE Trans Med Imaging. 2020;39:1690–702.
Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2:230–43.
Luo L, Yu L, Chen H, et al. Deep mining external imperfect data for chest X-ray disease screening. IEEE Trans Med Imaging. 2020;39:3583–94.
Bowes MA, Kacena K, Alabas OA, et al. Machine-learning, MRI bone shape and important clinical outcomes in osteoarthritis: data from the osteoarthritis initiative. Ann Rheum Dis. 2020;80:502–8.
Kwon SB, Ku Y, Han HU, Lee MC, Kim HC, Ro DH. A machine learning-based diagnostic model associated with knee osteoarthritis severity. Sci Rep. 2020;10:15743.
Shim JG, Kim DW, Ryu KH, et al. Application of machine learning approaches for osteoporosis risk prediction in postmenopausal women. Arch Osteoporos. 2020;15:169.
Abiodun OI, Jantan A, Omolara AE, Dada KV, Mohamed NA, Arshad H. State-of-the-art in artificial neural network applications: a survey. Heliyon. 2018;4:e00938.
Yamashita R, Nishio M, Do RKG, Togashi K. Convolutional neural networks: an overview and application in radiology. Insights Imaging. 2018;9:611–29.
Bien N, Rajpurkar P, Ball RL, et al. Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet. PLoS Med. 2018;15:e1002699.
Fritz B, Marbach G, Civardi F, et al. Deep convolutional neural network-based detection of meniscus tears: comparison with radiologists and surgery as standard of reference. Skelet Radiol. 2020;49:1207–17.
Rizk B, Brat H, Zille P, et al. Meniscal lesion detection and characterization in adult knee MRI: a deep learning model approach with external validation. Phys Med. 2021;83:64–71.
Tack A, Shestakov A, Lüdke D, Zachow S. A multi-task deep learning method for detection of meniscal tears in MRI data from the osteoarthritis initiative database. Front Bioeng Biotechnol. 2021;9:747217.
Munro BH. Statistical methods for health care research. Philadelphia: Lippincott Williams and Wilkins; 2015.
Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Adv Neural Inf Proces Syst. 2012;25:1097–105.
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–45.
Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2018. p. 4510–20.
Mandrekar JN. Receiver operating characteristic curve in diagnostic test assessment. J Thorac Oncol. 2010;5:1315–6.
This research was supported by the National Research Foundation of Korea Grant funded by the Korean government, No. NRF2021R1A2C1013073. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2021R1A6A1A03040177).
Ethics approval and consent to participate
Our study protocol was approved by the institutional review board of Yeungnam university hospital.
Consent for publication
The requirement for written informed consent to publish this report was waived owing to the retrospective nature of this study.
The authors reported no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Shin, H., Choi, G.S., Shon, OJ. et al. Development of convolutional neural network model for diagnosing meniscus tear using magnetic resonance image. BMC Musculoskelet Disord 23, 510 (2022). https://doi.org/10.1186/s12891-022-05468-6
- Deep learning
- Convolutional neural network
- Magnetic resonance imaging
- Meniscus tear