This article has Open Peer Review reports available.
Measuring the morphological characteristics of thoracolumbar fascia in ultrasound images: an inter-rater reliability study
© The Author(s). 2018
Received: 11 September 2017
Accepted: 15 May 2018
Published: 1 June 2018
Chronic lower back pain is still regarded as a poorly understood multifactorial condition. Recently, the thoracolumbar fascia complex has been found to be a contributing factor. Ultrasound imaging has shown that people with chronic lower back pain demonstrate both a significant decrease in shear strain and a 25% increase in thickness of the thoracolumbar fascia. There are sparse data on whether medical practitioners agree on the level of disorganisation in ultrasound images of thoracolumbar fascia. The purpose of this study was to establish the inter-rater reliability of the rating of architectural disorganisation of thoracolumbar fascia on a scale from ‘very disorganised’ to ‘very organised’.
An exploratory analysis was performed using a fully crossed design of inter-rater reliability. Thirty observers were recruited, consisting of 21 medical doctors, 7 physiotherapists and 2 radiologists, with an average of 13.03 ± 9.6 years of clinical experience. All 30 observers independently rated the architectural disorganisation of the thoracolumbar fascia in 30 ultrasound scans, on a Likert-type scale with rankings from 1 = very disorganised to 10 = very organised. Internal consistency was assessed using Cronbach’s alpha. Krippendorff’s alpha was used to calculate the overall inter-rater reliability.
Krippendorff’s alpha was 0.61, indicating a modest degree of agreement between observers on the different morphologies of thoracolumbar fascia. Cronbach’s alpha (0.98) indicated a high degree of consistency between observers. Experience in ultrasound image analysis did not affect consistency between observers (Cronbach’s alpha for experienced and inexperienced raters: 0.95 and 0.96, respectively).
Medical practitioners agree on morphological features such as levels of organisation and disorganisation in ultrasound images of thoracolumbar fascia, regardless of experience. Further analysis by an expert panel is required to develop specific classification criteria for thoracolumbar fascia.
A growing body of evidence supports the notion that the thoracolumbar fascia, an anatomical structure consisting of layers of dense connective tissue in the lumbar area of the trunk, is clinically important in people with chronic lower back pain [1–8]. The thoracolumbar fascia has been shown to play an important role in force transmission between lower limbs and trunk in both ex-vivo cadaver studies [9, 10] and in-vivo research during walking [11, 12]. Subcutaneous fascial bands have been found to mechanically link the skin, subcutaneous layers and deeper muscles. The differences in morphological characteristics of subcutaneous fascial planes may reflect how mechanical forces are distributed across various tissues. However, it is not clear whether medical practitioners are able to agree on these different morphological features in ultrasound images of thoracolumbar fascia.
The architecture of the thoracolumbar fascia is complex: it consists of layers of dense collagenous connective tissue interspersed with loose connective tissue, which allows the dense layers to slide and hence play a role in trunk mobility. The thoracolumbar fascia is continuous with the aponeuroses of major trunk muscles, which are instrumental in movement and vertebral control [8, 9]. It has been hypothesised that fibrosis, densification and thickening in the thoracolumbar fascia may be the result of an inflammatory response or soft tissue injury [1, 14–17]. For instance, a recent animal study demonstrated that an induced soft tissue injury in the lumbar region, when combined with movement restriction, led to fibrosis and significant thickening of the thoracolumbar fascia. An earlier pioneering ultrasound-based human study concluded that the thoracolumbar fascia in people with chronic lower back pain demonstrated 25% greater thickness compared to a matched control group. A follow-up investigation found that thoracolumbar fascia shear strain during passive trunk flexion was reduced by 56% in people with chronic lower back pain. In both aforementioned studies, Langevin’s research team found significant differences not only in fascial thickness and echogenicity, but also in disorganisation of the architecture of the connective tissues of people with chronic lower back pain. Even though the clinical relevance of fascial tissues has been established, to date no classification of thoracolumbar fascia has been developed. In order to develop a classification system, a level of inter-observer reliability of the different types of architecture of thoracolumbar fascia needs to be established.
The aim of this study was to determine the inter-rater reliability for the rating of morphological characteristics of thoracolumbar fascia in ultrasound images, on a Likert-type scale, by a range of clinicians.
The study was approved by the University of Kent’s Ethics Committee and conducted in compliance with the Helsinki Declaration. Informed consent was obtained from all participants.
Table 1 Characteristics of raters (N = 30)
Years of clinical experience: 13.03 (± SD 9.6)
USI training & experience (N = 30): Trained & experienced; Untrained & inexperienced
Frequency of USI usage: n = 12 (40%)
Ultrasound image data acquisition
Each ultrasound image was obtained using B-mode imaging with a MyLabGold25 semi-portable ultrasound scanner (Esaote, Rimini, Italy). A 4 cm, 18 MHz linear array transducer (Esaote LA435) was used for all images.
Selection of ultrasound images for reliability study
Inter-observer reliability rating protocol
In inter-observer reliability studies, it is vital that raters apply coding to data they understand. For this reason, a 20-min presentation about the thoracolumbar fascia was delivered; this facilitated anatomical orientation and exposed the participants to a representative range of ultrasound images prior to rating. To avoid bias, participants were not given examples of actual ratings, only of the range of images they would be rating (see Fig. 1 for anatomical orientation and region of interest). Scans were projected on a standard sized screen (133 × 100 cm).
Table 1 shows that 57% of observers had no training or experience in ultrasound imaging, and 40% had experience ranging from monthly to daily evaluations of ultrasound images; 1 participant did not respond to this question. No observers had experience in evaluating ultrasound images of thoracolumbar fascia.
Participants were instructed to rate the region of interest (ROI in Fig. 1), which included the thoracolumbar fascia (* thoracolumbar fascia in Fig. 1) and the subcutaneous zone (*SZ in Fig. 1), on a Likert-type scale. A Likert scale with rating points from 1 to 10 was used; point 1 was labelled ‘very disorganised’ and point 10 ‘very organised’, while the intermediate points were numbered but remained unlabelled. Participants were familiarised with the definition of thoracolumbar fascia organisation and disorganisation. For instance, ‘very organised’ was defined as ‘to be able to draw a rectangular shaped box around the hyperechoic area of thoracolumbar fascia’ (see Fig. 1).
Participants viewed scans sequentially in a time frame of 30 s to 1 min. They were able to modify responses, request to re-assess a scan, and make written comments about their decisions. Participants could not discuss ratings with each other, in order to avoid bias. All responses were anonymised prior to analysis.
Internal consistency among observers was assessed with Cronbach’s alpha, calculated from the raw scores of all 899 decisions and from the raw scores divided into 4 sub-groups [24, 25]. The Cronbach’s alpha coefficient was calculated using SPSS (version 21) statistical software. Standard error of measurement (SEM) was calculated as the square root of error variance in accordance with de Vet’s guidelines. Krippendorff’s alpha for ordinal measures was used to assess inter-observer agreement [23, 27] and was calculated using a custom-designed online calculator. As Likert scales are an ordinal measurement, the median and interquartile range were calculated for the total of scans, as well as for each scan individually [29, 30].
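The two coefficients named above can be illustrated with a minimal Python sketch. This is not the authors' workflow (they report using SPSS and an online calculator); it assumes a ratings matrix with one row per scan and one column per rater, with NaN marking a missing rating. The function names, the listwise deletion of incomplete scans for Cronbach's alpha, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np

def cronbach_alpha(ratings):
    """Cronbach's alpha with scans as cases (rows) and raters as items (columns).

    Assumption: scans with any missing rating are dropped (listwise deletion).
    """
    x = np.asarray(ratings, dtype=float)
    x = x[~np.isnan(x).any(axis=1)]
    k = x.shape[1]                                # number of raters
    item_var = x.var(axis=0, ddof=1).sum()        # sum of per-rater variances
    total_var = x.sum(axis=1).var(ddof=1)         # variance of per-scan totals
    return (k / (k - 1)) * (1 - item_var / total_var)

def krippendorff_alpha_ordinal(ratings, categories):
    """Krippendorff's alpha with the ordinal distance metric.

    ratings: rows = scans (units), columns = raters; NaN = missing rating.
    categories: ordered list of all scale points, e.g. range(1, 11).
    """
    x = np.asarray(ratings, dtype=float)
    cats = list(categories)
    index = {c: i for i, c in enumerate(cats)}
    q = len(cats)
    coincidence = np.zeros((q, q))
    for row in x:
        values = row[~np.isnan(row)]
        m = len(values)
        if m < 2:                                 # unpairable unit, skipped
            continue
        counts = np.zeros(q)
        for v in values:
            counts[index[int(v)]] += 1
        # ordered pairs of ratings given by different raters to this scan
        pairs = np.outer(counts, counts) - np.diag(counts)
        coincidence += pairs / (m - 1)
    n_c = coincidence.sum(axis=1)                 # marginal totals per category
    n = n_c.sum()
    delta2 = np.zeros((q, q))                     # squared ordinal distances
    for c in range(q):
        for k in range(c + 1, q):
            d = n_c[c:k + 1].sum() - (n_c[c] + n_c[k]) / 2
            delta2[c, k] = delta2[k, c] = d ** 2
    observed = (coincidence * delta2).sum()
    expected = (np.outer(n_c, n_c) * delta2).sum() / (n - 1)
    return 1 - observed / expected
```

Under these assumptions, both functions return 1.0 when every rater gives the same score to each scan, and Krippendorff's alpha falls below zero when raters disagree systematically.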
Participant ratings of scans were categorised into four groups [30–32]. Group 1 (very disorganised) consisted of all scans with a median rating of 1 to 3. Group 2 (somewhat disorganised) consisted of all median ratings from 4 to 5. Group 3 (somewhat organised) consisted of all median ratings from 6 to 7. Group 4 (very organised) consisted of all median ratings from 8 to 10 (Fig. 2). The Cronbach’s alpha and Krippendorff’s alpha were calculated using the original raw scores from individual raters for each scan.
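The grouping rule can be sketched in a few lines of Python. The function name `summarise_scan` and the label dictionary are hypothetical. The text does not say how a half-point median (for example 3.5, which can occur with an even number of raters) was assigned, so the sketch assumes such values fall into the higher group.

```python
import numpy as np

GROUP_LABELS = {
    1: "very disorganised",
    2: "somewhat disorganised",
    3: "somewhat organised",
    4: "very organised",
}

def summarise_scan(scan_ratings):
    """Return (median, interquartile range, group) for one scan's ratings.

    NaN marks a missing rating and is ignored.
    Assumption: half-point medians (e.g. 3.5) go to the higher group.
    """
    r = np.asarray(scan_ratings, dtype=float)
    median = float(np.nanmedian(r))
    q1, q3 = np.nanpercentile(r, [25, 75])
    if median <= 3:
        group = 1
    elif median <= 5:
        group = 2
    elif median <= 7:
        group = 3
    else:
        group = 4
    return median, float(q3 - q1), group
```

For example, `summarise_scan([8, 9, 10])` places the scan in group 4, which `GROUP_LABELS` maps to ‘very organised’.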
Results of descriptive analysis
Table 2 Inter-rater reliability scores for all data and sub-groups (Landis and Koch criteria)
Results of inter-rater reliability analysis
All participants assessed all scans, except one participant who did not complete one rating. The Cronbach’s alpha was 0.98, which is considered excellent according to the Landis and Koch criteria. Observers without ultrasound imaging experience scored a Cronbach’s alpha of 0.96 and observers with ultrasound imaging experience scored a Cronbach’s alpha of 0.95, both in the excellent range. Scores for the 4 sub-groups are reported in Table 2. The Krippendorff’s alpha for ordinal measures was 0.61, with an error variance of 0.63, indicating a modest degree of agreement.
In this study we found that medical practitioners agree on different morphological features in ultrasound images of thoracolumbar fascia, such as levels of organisation and disorganisation. This agreement is independent of experience in ultrasound image rating. We found that the knowledge gap between musculoskeletal (MSK)-trained radiologists, MSK-trained medical doctors and physiotherapists on the one hand, and clinicians untrained and inexperienced in MSK ultrasound on the other, did not affect the inter-observer agreement.
It is important to establish internal consistency before images can be used for research or clinical evaluation to ensure validity. The measurement error was smaller in both groups of disorganised scans and larger in the more organised groups. This could be an indication that it may be easier to interpret disorganisation or irregular shapes than organisation or regular shapes. The modest Krippendorff’s alpha for the ratings suggests that a minimal amount of measurement error was introduced by the independent observers, and therefore statistical power for subsequent analyses is not substantially reduced.
In this cohort, the differences in ultrasound experience do not appear to affect consistency. We did not observe any raters who systematically under- or over-rated the images. Novice raters have demonstrated good to excellent reliability in measuring abdominal and lumbar muscle thickness obtained by ultrasound scans [34, 35]. However, a straightforward comparison between quantitative measures of lumbar and abdominal muscle tissue, commonly found in the literature on rehabilitation of lower back pain, and this study’s qualitative ratings of subcutaneous connective tissue requires caution. Substantial observer variability can occur, even at the expert level of image interpretation. Interestingly, in this study, experienced radiologists agreed with the interpretation of clinicians relatively inexperienced in the reading of ultrasound images. The American College of Radiology Imaging Network (ACRIN) has highlighted that, in order to improve research into the interpretation of medical images, observers in reliability studies should ideally reflect a broad range of experience to provide a sufficient level of generalisability.
In multi-reader medical image interpretation, the phenomenon of ‘groupthink’ has been identified, where the opinion of novice raters might be influenced by senior or experienced raters. In order to avoid a situation of potential pseudo-consensus, all raters viewed the scans independently without discussing decisions with each other.
This study has a number of limitations. First, it involved a small cohort of both observers and scans. The results are encouraging and should be validated in a larger cohort. Secondly, the study relied on static ultrasound images. Future studies may consider functional and dynamic measurements. Finally, we did not determine the frequency with which raters interpret the same image differently. This needs to be taken into account in future studies.
Medical practitioners agree on morphological features such as levels of organisation and disorganisation in ultrasound images of thoracolumbar fascia, regardless of experience. These findings will be useful for the establishment of a clinical diagnostic scale and the further development of using ultrasound as a decision-making tool for researchers and clinicians.
The authors thank Karthik Muthumayandi for assistance with development of electronic data collection, and Dr. Samantha L Winter for helpful discussions during manuscript preparation.
Availability of data and materials
The datasets analysed during the current study are available in the KAR repository, http://kar.kent.ac.uk/66942/
KDC conceived the study, participated in study design, collected the data, analysed the data and drafted the manuscript. KH participated in study concept and design, reviewed the manuscript. JWD participated in study design, analysis, interpretation and manuscript preparation. LP participated in study design and manuscript preparation. All authors read and approved the final manuscript.
Ethics approval and consent to participate
This study was approved by the University of Kent’s Research and Ethics Committee (Prop. 163–2013). Informed consent was obtained from all participants.
Consent for publication
Consent for publication was sought from all participants whose images are contained in this manuscript.
The authors declare they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Langevin HM, Sherman KJ. Pathophysiological model for chronic low back pain integrating connective tissue and nervous system mechanisms. Med Hypotheses. 2007;68:74–80.
- Yahia LH, Rhalmi S, Newman N, Isle M. Sensory innervation of human thoracolumbar fascia. Acta Orthop Scand. 1992;63:195–7.
- Taguchi T, Tesarz J, Mense S. The Thoracolumbar Fascia as a Source of Low Back Pain. In: Huijing PA, Hollander P, Findley TW, Schleip R, editors. Fascia Research II - Basic Science and Implications for Conventional and Complementary Healthcare. Munich: Elsevier GmbH; 2009. p. 251.
- Langevin HM, Stevens-Tuttle D, Fox JR, Badger GJ, Bouffard NA, Krag MH, Wu J, Henry SM. Ultrasound evidence of altered lumbar connective tissue structure in human subjects with chronic low back pain. BMC Musculoskelet Disord. 2009;10:151.
- Tesarz J, Hoheisel U, Wiedenhöfer B, Mense S. Sensory innervation of the thoracolumbar fascia in rats and humans. Neuroscience. 2011;194:302–8.
- Hoheisel U, Taguchi T, Treede RD, Mense S. Nociceptive input from the rat thoracolumbar fascia to lumbar dorsal horn neurones. Eur J Pain. 2011;15:810–5.
- Wilke J, Schleip R, Klingler W, Stecco C. The lumbodorsal fascia as a potential source of low back pain: a narrative review. Biomed Res Int. 2017;2017. https://doi.org/10.1155/2017/5349620.
- Willard FH, Vleeming A, Schuenke MD, Danneels L, Schleip R. The thoracolumbar fascia: anatomy, function and clinical considerations. J Anat. 2012;221:507–36.
- Barker PJ, Hapuarachchi KS, Ross JA, Sambaiew E, Ranger TA, Briggs CA. Anatomy and biomechanics of gluteus maximus and the thoracolumbar fascia at the sacroiliac joint. Clin Anat. 2014;27:234–40.
- Macintosh JE, Bogduk N, Gracovetsky S. The biomechanics of the thoracolumbar fascia. Clin Biomech. 1987;2:78–83.
- Vleeming A, Pool-Goudzwaard AL, Stoeckart R, van Wingerden JP, Snijders CJ. The posterior layer of the thoracolumbar fascia. Its function in load transfer from spine to legs. Spine (Phila Pa 1976). 1995;20:753–8.
- Carvalhais VO, Do C, Ocarino J de M, Araujo VL, Souza TR, PLP S, Fonseca ST. Myofascial force transmission between the latissimus dorsi and gluteus maximus muscles: an in vivo experiment. J Biomech. 2013;46:1003–7.
- Li W, Ahn AC, Weitz D, Mahadevan L, Barnett R, Zhang M. Subcutaneous fascial bands—a qualitative and morphometric analysis. PLoS One. 2011;6:e23987.
- Pavan PG, Stecco A, Stern R, Stecco C. Painful connections: densification versus fibrosis of fascia. Curr Pain Headache Rep. 2014;18:441.
- Diviti S, Gupta N, Hooda K, Sharma K, Lo L. Morel-lavallee lesions-review of pathophysiology, clinical findings, imaging findings and management. J Clin Diagnostic Res. 2017;11:TE01–4.
- Corey SM, Vizzard MA, Bouffard NA, Badger GJ, Langevin HM. Stretching of the back improves gait, mechanical sensitivity and connective tissue inflammation in a rodent model. PLoS One. 2012;7:e29831.
- Schilder A, Hoheisel U, Magerl W, Benrath J, Klein T, Treede R-D. Sensory findings after stimulation of the thoracolumbar fascia with hypertonic saline suggest its contribution to low back pain. Pain. 2014;155:222–31.
- Bishop JH, Fox JR, Maple R, Loretan C, Badger GJ, Henry SM, Vizzard MA, Langevin HM. Ultrasound evaluation of the combined effects of thoracolumbar fascia injury and movement restriction in a porcine model. PLoS One. 2016;11:e0147393.
- Langevin HM, Fox JR, Koptiuch C, Badger GJ, Greenan-Naumann AC, Bouffard NA, Konofagou EE, Lee W-N, Triano JJ, Henry SM. Reduced thoracolumbar fascia shear strain in human chronic low back pain. BMC Musculoskelet Disord. 2011;12:203.
- Klingler W, Velders M, Hoppe K, Pedro M, Schleip R. Clinical relevance of fascial tissue and dysfunctions. Curr Pain Headache Rep. 2014;18:439.
- Stokes M, Hides MJ, Elliott J, Kiesel MSK, Hodges CP, Hons B. Rehabilitative ultrasound imaging of the posterior paraspinal muscles. J Orthop Sport Phys Ther. 2007;37:581–95.
- Kremkau F. Diagnostic ultrasound: principles and instruments. 7th ed. St. Louis, Mo: Saunders; 2006.
- Krippendorff K. Reliability in content analysis: some common misconceptions and recommendations. Hum Commun Res. 2004;30:411–33.
- Tavakol M, Dennick R. Making sense of Cronbach’s alpha. Int J Med Educ. 2011;2:53–5.
- Cronbach LJ, Shavelson RJ. My current thoughts on coefficient alpha and successor procedures. Educ Psychol Meas. 2004;64:391–418.
- de Vet HCW, Terwee CB, Knol DL, Bouter LM. When to use agreement versus reliability measures. J Clin Epidemiol. 2006;59:1033–9.
- Hayes AF, Krippendorff K. Answering the call for a standard reliability measure for coding data. Commun Methods Meas. 2007;1:77–89.
- Freelon D. ReCal OIR: ordinal, interval, and ratio intercoder reliability as a web service. Int J Internet Sci. 2013;8:10–6.
- Jamieson S. Likert scales: how to (ab)use them. Med Educ. 2004;38:1217–8.
- Norman G. Likert scales, levels of measurement and the “laws” of statistics. Adv Heal Sci Educ. 2010;15:625–32.
- LaValley MP, Felson DT. Statistical presentation and analysis of ordered categorical outcome data in rheumatology journals. Arthritis Rheum. 2002;47:255–9.
- Hallgren KA. Computing inter-rater reliability for observational data: an overview and tutorial. Tutor Quant Methods Psychol. 2012;8:23–34.
- Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.
- Teyhen DS, George SZ, Dugan JL, Williamson J, Neilson BD, Childs JD. Inter-rater reliability of ultrasound imaging of the trunk musculature among novice raters. J Ultrasound Med. 2011;30:347–56.
- Wilson A, Hides JA, Blizzard L, Callisaya M, Cooper A, Srikanth VK. Measuring ultrasound images of abdominal and lumbar multifidus muscles in older adults: a reliability study. Man Ther. 2016;23:114–9.
- Bankier AA, Levine D, Halpern EF, Kressel HY. Consensus interpretation in imaging research: is there a better way? Radiology. 2010;257:14–7.
- Obuchowski NA. How many observers are needed in clinical studies of medical imaging? Am J Roentgenol. 2004;182(April):867–9.