Reliability of a human pose tracking algorithm for measuring upper limb joints: comparison with photography-based goniometry
BMC Musculoskeletal Disorders volume 23, Article number: 877 (2022)
Range of motion (ROM) measurements are essential for diagnosing and evaluating upper extremity conditions. Clinical goniometry is the most commonly used method, but it is time-consuming and skill-demanding. Recent advances in human tracking algorithms suggest the potential for automatic angle measurement from RGB images, which provides an attractive alternative for at-distance measurement. However, the reliability of this method has not been fully established. The purpose of this study was to evaluate whether the results of the algorithm are as reliable as those of human raters for upper limb movements.
Thirty healthy young adults (20 males, 10 females) participated in this study. Participants were asked to perform a 6-motion task including movements of the shoulder, elbow and wrist. Images of the movements were captured by commercial digital cameras. Each movement was measured by a pose tracking algorithm (OpenPose) and compared with the surgeons' measurements. The mean differences between the two measurements were compared. Pearson correlation coefficients were used to determine the relationship between them. Reliability was investigated using intra-class correlation coefficients.
Comparing the algorithm-based method with manual measurement, the mean differences were less than 3 degrees in 5 motions (shoulder abduction: 0.51; shoulder elevation: 2.87; elbow flexion: 0.38; elbow extension: 0.65; wrist extension: 0.78), the exception being wrist flexion. All the intra-class correlation coefficients were larger than 0.60. The Pearson coefficients also showed high correlations between the two measurements (p < 0.001).
Our results indicate that pose estimation is a reliable method for measuring shoulder and elbow angles, supporting the use of RGB images for measuring joint ROM. They also suggest that patients could assess their own ROM from photos taken with a digital camera.
This study was registered in the Clinical Trials Center of The First Affiliated Hospital, Sun Yat-sen University (2021–387).
Joint range of motion (ROM) is a measure of interest in clinical practice, as it is important for the diagnosis, functional assessment and treatment evaluation of the upper extremity. Measurement of ROM is reportedly required in more than 80% of commonly used function assessment scales for the shoulder and elbow. Conventionally, ROM is measured by manual goniometry. The goniometer is low-cost and portable, but its reliability depends heavily on the rater's experience. Moreover, the procedure is demanding and time-consuming, which may reduce the efficiency of medical care.
In addition, with the rapid development of telemedicine, how to determine joint movement at a distance has piqued the interest of many researchers [4,5,6,7,8,9]. A typical motion capture system can provide accurate kinematic measurements [10, 11] but requires a large space for data collection, which makes it costly, not portable, and thus impractical for home use. Advances in smartphone technology, specifically built-in sensors and high-resolution cameras, provide a potential platform for joint measurement. The number of mobile applications used for clinical assessment has increased considerably in recent years. These applications fall into two main groups: those using the embedded inclinometer and those using images taken by the phone camera. Mitchell et al. evaluated the reliability of two applications, one from each group, in the measurement of shoulder rotation, and indicated that both methods had acceptable reliability compared with standard goniometry. Subsequent studies confirmed this finding; the mean difference between these methods and manual goniometry ranged from 0.2 to 6.4 (inclinometer-based) [5,6,7, 14] and from 0.1 to 11.9 (photograph-based) [15, 16]. According to previous studies, the inclinometer-based method can provide more consistent results and detect smaller changes [17, 18]. Morey et al. reported a minimal detectable change for the digital inclinometer in shoulder measurement ranging from 4 to 9 degrees. However, the measuring process of this method is relatively complicated. Participants need to attach the instrument at specific positions [5, 7, 18], and changing the position can lead to measurement errors. Compared with the inclinometer-based method, the photograph-based method offers an easier procedure to follow and a contactless measuring process. In addition, doctors can check from the photos whether the measurement was performed correctly, which is extremely important for patient self-assessment.
However, the existing applications still need raters to mark the photos, which means they cannot actually reduce the therapist's workload or the subjectivity of the results.
Therefore, an objective, accurate and automatic method is desired. Recent advances in human tracking algorithms offer a new option for this task. This kind of algorithm can detect the coordinates of a set of joint points from images. From the positions of these points (shoulder, elbow, etc.), the pose of a person can be described and the joint angles can be calculated, which provides an attractive alternative for at-distance measurement [11, 20]. In this study, we employed OpenPose, one of the most widely used methods, proposed by Cao et al., to estimate joint positions from RGB images. Previous articles have evaluated the reliability of OpenPose-based systems in gait analysis and Parkinson's disease rating [23, 24]. Ota et al. also compared OpenPose and VICON (a 3D motion capture system) in measuring lower limb joint angles and found significant associations between the two methods. However, the utility of OpenPose in the assessment of upper limb angles remains unclear. Herein we constructed a measuring setup based on this algorithm, using RGB images to measure upper limb movements. This study evaluates the reliability of OpenPose for clinical measurement by comparing its results with photography-based goniometry.
Materials and methods
Thirty healthy young adults (20 males, 10 females, 22–35 years old) with no reported medical history or impairment of the upper limbs participated in this study. This study was approved by the institutional review board of our institution (2021–387). The sample size was estimated with PASS software (version 15.0) using an equivalence test for the difference between two means. With a type I error (α) of 0.05, a power (1-β) of 0.95, an equivalence limit of 10 degrees, and a standard deviation of 10, a minimum of 27 samples was required. All subjects were given full explanations of the motion tasks. After that, written consent was obtained for the use of their images for research purposes.
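As a rough cross-check of the sample size, the calculation can be sketched with the normal approximation for a two one-sided tests (TOST) equivalence design. This is only an approximation under stated assumptions: two independent groups of equal size, a true difference of zero, and a known standard deviation. The exact design and the algorithm used by PASS are not described in the text, and the function name `tost_n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def tost_n_per_group(alpha, power, margin, sd, true_diff=0.0):
    """Approximate per-group n for a two-sample TOST equivalence test.

    Normal approximation; assumes equal group sizes and a known SD.
    The beta/2 term applies when the true difference is zero (assumed here).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha)
    beta = 1.0 - power
    if true_diff == 0:
        z_beta = z.inv_cdf(1.0 - beta / 2.0)
    else:
        z_beta = z.inv_cdf(1.0 - beta)
    n = 2.0 * (sd * (z_alpha + z_beta) / (margin - abs(true_diff))) ** 2
    return math.ceil(n)


n = tost_n_per_group(alpha=0.05, power=0.95, margin=10.0, sd=10.0)
print(n)  # 26 under the normal approximation
```

With the stated inputs this approximation gives 26, one fewer than the 27 reported from PASS. A calculation based on the t distribution generally yields a slightly larger n than the normal approximation, which may account for the gap.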
Since many factors, such as the distance to the cameras and the angle and height of the cameras, could affect the measurement results, we constructed a standardized measurement environment for this study, shown in Supplementary Fig. 1A. Three commercial digital cameras with 2560 × 1920 resolution and a 79-degree field of view (Hikvision DS-2CD3T56FWDV2-I3) were positioned around the field (one in front and two at the sides). The height of the cameras was 1.5 m and the distance between each camera and the subject was 3 m. To ensure consistent participant placement, feet markers were placed 3 m away from the cameras. The environment was illuminated by normal white light from LED sources. The background was a white wall without decoration.
Motion tasks and parameters extracting
We designed a 6-task procedure including shoulder abduction, shoulder elevation, elbow flexion, elbow extension, wrist flexion and wrist extension (shown in Supplementary Fig. 1B). All the motion tasks were completed in the environment described above. To control the impact of rotation, all the angles of interest in our design were fully presented in either the sagittal or the coronal view. Participants were asked to stand in the field and perform the motion tasks one after another. To ensure that they performed the motions as instructed, we set a screen in front of the participants showing written and video instructions. Their own motions were also displayed on that screen in real time. All photographs were taken from the anterior side, except for elbow flexion, which was photographed from the lateral side (one image for each side).
The landmarks of each joint were estimated by the OpenPose Human Pose Estimation library (version 1.5.0). The coordinates of the joint landmarks were then extracted, and skeleton models were rebuilt accordingly. Then, the joint angle was calculated from the corresponding coordinates using the following formula.
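The formula itself is not reproduced in this text. A common way to obtain a joint angle from three 2D keypoints is to take the angle between the two segment vectors that meet at the joint; the sketch below assumes that approach and is not necessarily the authors' exact formula. The function name `joint_angle_deg` and the pixel coordinates are hypothetical.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle ABC in degrees, with b the joint (vertex) and a, c adjacent keypoints.

    Points are (x, y) pixel coordinates. Returns a value in [0, 180].
    """
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    if (ux == 0 and uy == 0) or (vx == 0 and vy == 0):
        raise ValueError("adjacent keypoint coincides with the joint")
    dot = ux * vx + uy * vy
    cross = ux * vy - uy * vx
    # atan2(|cross|, dot) is numerically more stable than acos near 0 and 180 degrees
    return math.degrees(math.atan2(abs(cross), dot))


# Hypothetical pixel coordinates for shoulder, elbow and wrist
shoulder, elbow, wrist = (100.0, 100.0), (100.0, 200.0), (200.0, 200.0)
print(joint_angle_deg(shoulder, elbow, wrist))  # 90.0
```

This returns the included angle between the two segments. Whether a clinical angle such as elbow flexion is reported as this value or as its supplement (180 minus the included angle) depends on the measurement convention, which is assumed rather than stated here.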
Digital photography-based measurement
After the automatic measurement, photography-based measurements were conducted using the same images. The joint angles were measured independently by two hand surgeons, who applied screen goniometer software to the images displayed on a computer screen. Screen goniometry was chosen mainly to ensure that the posture presented to the measurement system and to the human raters was identical; the validity of this method has been previously confirmed [25, 26]. To minimize the uncertainty of manual assessment, the images were reassessed by the same surgeons after an interval of one week. The landmarks included the centers of the shoulder, elbow and wrist, the axes along the center of the upper arm and forearm, and the central axis along the metacarpals. During the measurement, surgeons were free to locate the landmarks after reading the instructions. They were not allowed to see the results of the automatic measurement or the other observer's report.
Data processing and statistical analysis
The mean value of the four manual measurements (2 researchers × 2 rounds) was taken as the standard result for comparison. All measurements are presented as mean ± standard deviation (SD). The deviation between the automatic assessment and the standard results, together with its 95% confidence interval (CI), was calculated to assess accuracy. The intra-class correlation coefficient (ICC) between the standard and the proposed measurement was also calculated to assess agreement. Next, the results were analyzed using Bland and Altman analysis. The upper limits of agreement (LOA) were used as reference values to judge whether the proposed measurement could be a reliable method for upper limb ROM. Because the output of OpenPose, like that of other deep learning models, is computed by a fixed series of formulas, analyzing the same image twice would give identical results; the repeatability of the automatic method was therefore not assessed. In comparison, the repeatability of manual measurement was evaluated by comparing the test–retest results. In addition, to confirm the reliability, linear regression analyses were conducted to compare the manual and system measurement data. R-square was calculated to evaluate the correlation between the methods.
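The Bland and Altman analysis mentioned above can be sketched as follows: the bias is the mean of the paired differences, and the 95% limits of agreement are the bias ± 1.96 sample standard deviations of those differences. The function name `bland_altman` is hypothetical and the angles are made-up values for illustration, not study data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b, z=1.96):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired samples of equal length (n >= 2)")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Made-up angles in degrees, for illustration only (not study data)
system = [90.0, 120.0, 151.0, 88.0, 133.0]
manual = [88.0, 121.0, 149.0, 90.0, 130.0]
bias, lower, upper = bland_altman(system, manual)
print(f"bias={bias:.2f}, LoA=({lower:.2f}, {upper:.2f})")
```

A positive bias here would mean the first method reads higher on average; the width of the interval between the two limits indicates how far individual paired readings can be expected to disagree.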
Statistical analysis was performed with SPSS 22.0 (Armonk, NY: IBM Corp) and R software 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria). Results with p < 0.05 were considered statistically significant. ICC values were interpreted as follows: < 0.20: unacceptable; 0.20–0.40: questionable; 0.41–0.60: good; 0.61–0.80: very good; 0.81–1.00: excellent. For the correlation coefficient, 1 indicates a total positive linear correlation, 0 no linear correlation, and -1 a total negative linear correlation.
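To make the ICC and its interpretation bands concrete, the sketch below computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement), from a subjects × raters table and maps the value to the bands listed above. The ICC form is an assumption, since the text does not state which form was used in SPSS or R. The function names are hypothetical, and values falling between the listed bands (for example 0.405) are assigned to the lower band as a further assumption.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` is a list of rows, one per subject, each holding one value
    per rater or method.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete n x k table with n, k >= 2")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def interpret_icc(icc):
    """Map an ICC value to the interpretation bands listed in the text."""
    if icc < 0.20:
        return "unacceptable"
    if icc <= 0.40:
        return "questionable"
    if icc <= 0.60:
        return "good"
    if icc <= 0.80:
        return "very good"
    return "excellent"


# Made-up paired angles (manual, system) in degrees, not study data
pairs = [[88.0, 90.0], [121.0, 120.0], [149.0, 151.0], [90.0, 88.0], [130.0, 133.0]]
value = icc_2_1(pairs)
print(round(value, 3), interpret_icc(value))
```

Absolute agreement is used in this sketch because a constant offset between the system and the observers would matter clinically; a consistency-type ICC would ignore such an offset.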
The measurements of the shoulder, elbow and wrist made by the two observers and by the human tracking algorithm are summarized in Table 1 and Fig. 1. An example of the automatic measurement results is shown in Supplementary Fig. 2.
The poses of participants were successfully estimated in all but two images. Both failures were caused by person detection errors: the pictures included more than one person, and the angle calculation was performed on the wrong target. The success rate was 99.44% (358/360).
Difference between observers
The results of the inter- and intra-observer comparisons are presented in Tables 2 and 3. There was excellent agreement between observers, with mean differences ranging from 0.08 to 4.33 and ICC values ranging from 0.897 to 0.951. The intra-observer comparison also indicated good consistency: the mean differences between test and re-test measurements were less than 5 degrees.
Difference between observer and machine
As shown in Table 2, the observer-system differences were comparable to the inter- and intra-observer differences. The largest difference was found in wrist flexion (8.96 ± 12.71; 95% CI: -12.24–5.68). In the other 5 motions, the 95% confidence intervals of the mean differences between manual and automatic assessment were less than 5 degrees. Similarly, the Bland–Altman plots indicated acceptable agreement for the shoulder and elbow motions. In comparison, the conformity for wrist motions was relatively poor (Fig. 2), as the credible intervals were more than 10 degrees. The consistency was then further evaluated by ICC values. The results suggested good to excellent agreement (ICC > 0.60) in all motions (Table 3). The lowest consistency was found in shoulder elevation and wrist extension (ICC = 0.620), while the best was found in elbow extension (ICC = 0.831). Additionally, linear correlations between system and observer measurements were demonstrated (R ranged from 0.45 to 0.71, p < 0.001; Fig. 3).
The range of motion (ROM) of the upper limb is an important clinical parameter in various functional evaluations before and after treatment. Conventionally, ROM is assessed manually using a standard goniometer. This procedure is time-consuming and requires expertise. In addition, financial and geographic factors and busy schedules can prevent patients from visiting the clinic. Telemedicine has therefore become popular as a method of patient evaluation. Photographs are easily obtained and shared in daily life, and extracting movement parameters from remote photographs has the potential to decrease the cost of physical evaluation. Human pose tracking algorithms can automatically calculate joint angles from RGB images and provide a new option for remote evaluation. However, the reliability of this method must be established before it is used in clinical settings.
This study sought to evaluate the reliability of an automatic goniometry method. The testing environment used in this study could also be set up in a patient's home. In our analysis, we found that the algorithm-based method has acceptable reliability compared with human observers. The results indicate that the differences between the proposed method and the average value of the observers are less than 5 degrees in shoulder and elbow motions, comparable to the inter- and intra-observer differences. Compared with previous reports, these differences are notably smaller than those of visual estimation [15, 29] and comparable to those of inertial sensors and depth cameras. Therefore, the proposed method may have high accuracy and reliability in measuring the ROM of the shoulder and elbow.
In this study, the greatest observer-machine difference was found in wrist flexion, with a mean value of 8.96 degrees. This reliability is still competitive compared with other image-based applications [6, 32]. Nevertheless, as seen in the Bland–Altman plots, the angle was over-estimated by the system in most cases. We therefore speculate that this might be a systematic error that could be corrected when a larger sample is available.
It is difficult for participants to keep their posture still during measurement, as previous studies have indicated [33, 34]. Several methods have been employed in the literature to minimize this problem. Cook et al. used a wooden triangle with fixed internal angles to support the joints of interest during assessment. Chang et al. adopted a glass plate as a hand support to reduce movement during the 3D scanning process. More commonly, studies use a 3D motion capture system to collect data simultaneously [37,38,39] and thus minimize the differences caused by involuntary posture changes. Our study compared the results of the automatic system and the human observers by having each measure the same image independently. In this way, we could determine the actual differences between the two methods without the influence of inconsistent motions. The concept of obtaining joint ROM from photographs is not new. Previous studies have indicated that it is accurate and reliable compared with conventional clinical goniometry [25, 26]. Additionally, the results of image-based goniometry can be more consistent than those of the conventional approach in some cases. The present study also supports the value of screen goniometry as a reliable alternative for measurement, with slight inter- and intra-observer differences.
Our study has several limitations. Firstly, although the comparison between the OpenPose algorithm and human observers showed clinical reliability, future validity studies using a motion capture system as the standard method are still needed to clarify the accuracy. Secondly, the participants were limited to young, healthy persons and did not include elderly people or patients, which makes the results statistically less robust and limits the generalizability of the proposed method. Thirdly, motions with rotation were not assessed because it is hard to estimate 3D motions from 2D images. Although this may be an inevitable technical limitation, it will be the aim of our future studies. In addition, a measured joint angle may include the movements of several joints (for example, the shoulder angle includes movement of the scapula, thorax, and thoracic spine), which leads to measurement inaccuracy, but we believe the result is still good enough for a telemedicine system. Another drawback is that the accuracy of our method depends to some extent on the compliance and cooperation of participants. If the subject does not properly understand the instructions, the results can deviate.
This study demonstrates a reliable method for measuring joint ROM of the upper limb using RGB photographs. We have shown the reliability of the proposed method by comparing it with photography-based goniometry. Our results indicate that this human pose tracking algorithm could serve as an alternative to conventional goniometry. Its use may benefit remote evaluation, as users can obtain reliable kinematic parameters by themselves without traveling to clinical centers. Further studies with a larger sample, including patients or older adults with movement disorders, and covering more motions would be of interest.
Availability of data and materials
The data analyzed during this study are not publicly available because they contain identifying information (the accuracy of the algorithm would be affected if the eyes or facial region were blurred) but are available from the corresponding author on reasonable request.
ROM: Range of motion
LOA: Limits of agreement
ICC: Intra-class correlation coefficient
Suk M, Hanson B, Norvell D, Helfet D. Musculoskeletal Outcomes Measures and Instruments. Vol 1. Dübendorf, Switzerland: AO Foundation Publishing; 2009. 84–386.
Boone DC, Azen SP, Lin CM, Spence C, Baron C, Lee L. Reliability of goniometric measurements. Phys Ther. 1978;58(11):1355–60.
Bovens AM, van Baak MA, Vrencken JG, Wijnen JA, Verstappen FT. Variability and reliability of joint measurements. Am J Sports Med. 1990;18(1):58–63.
Naeemabadi M, Dinesen B, Andersen OK, Madsen NK, Simonsen OH, Hansen J. Developing a telerehabilitation programme for postoperative recovery from knee surgery: specifications and requirements. BMJ Health Care Inform. 2019;26(1):e000022.
Behnoush B, Tavakoli N, Bazmi E, Nateghi Fard F, Pourgharib Shahi MH, Okazi A, Mokhtari T. Smartphone and universal goniometer for measurement of elbow joint motions: a comparative study. Asian J Sports Med. 2016;7(2):e30668.
Lendner N, Wells E, Lavi I, Kwok YY, Ho PC, Wollstein R. Utility of the iPhone 4 Gyroscope Application in the Measurement of Wrist Motion. Hand (New York, NY). 2019;14(3):352–6.
Vauclair F, Aljurayyan A, Abduljabbar FH, Barimani B, Goetti P, Houghton F, Harvey EJ, Rouleau DM. The smartphone inclinometer: a new tool to determine elbow range of motion? Eur J Orthop Surg Traumatol. 2018;28(3):415–21.
Meislin MA, Wagner ER, Shin AY. A comparison of elbow range of motion measurements: smartphone-based digital photography versus goniometric measurements. J Hand Surg Am. 2016;41(4):510-515.e511.
Buvik A, Bugge E, Knutsen G, Småbrekke A, Wilsgaard T. Quality of care for remote orthopaedic consultations using telemedicine: a randomised controlled trial. BMC Health Serv Res. 2016;16(1):483.
Kim SH, Kwon OY, Park KN, Jeon IC, Weon JH. Lower extremity strength and the range of motion in relation to squat depth. J Hum Kinet. 2015;45:59–69.
Ota M, Tateuchi H, Hashiguchi T, Kato T, Ogino Y, Yamagata M, Ichihashi N. Verification of reliability and validity of motion analysis systems during bilateral squat using human pose tracking algorithm. Gait Posture. 2020;80:62–7.
Buechi R, Faes L, Bachmann LM, Thiel MA, Bodmer NS, Schmid MK, Job O, Lienhard KR. Evidence assessing the diagnostic performance of medical smartphone apps: a systematic review and exploratory meta-analysis. BMJ Open. 2017;7(12):e018280.
Mitchell K, Gutierrez SB, Sutton S, Morton S, Morgenthaler A. Reliability and validity of goniometric iPhone applications for the assessment of active shoulder external rotation. Physiother Theory Pract. 2014;30(7):521–5.
Kim TS, Park DD, Lee YB, Han DG, Shim JS, Lee YJ, Kim PC. A study on the measurement of wrist motion range using the iPhone 4 gyroscope application. Ann Plast Surg. 2014;73(2):215–8.
Hayes K, Walton JR, Szomor ZR, Murrell GA. Reliability of five methods for assessing shoulder range of motion. Aust J Physiother. 2001;47(4):289–94.
Scott KL, Skotak CM, Renfree KJ. Remote Assessment of wrist range of motion: inter- and intra-observer agreement of provider estimation and direct measurement with photographs and tracings. J Hand Surg. 2019;44(11):954–65.
Kolber MJ, Vega F, Widmayer K, Cheng MS. The reliability and minimal detectable change of shoulder mobility measurements using a digital inclinometer. Physiother Theory Pract. 2011;27(2):176–84.
Boissy P, Diop-Fallou S, Lebel K, Bernier M, Balg F, Tousignant-Laflamme Y. Trueness and minimal detectable change of smartphone inclinometer measurements of shoulder range of motion. Telemed J E Health. 2017;23(6):503–6.
Ma M, Proffitt R, Skubic M. Validation of a Kinect V2 based rehabilitation game. PLoS One. 2018;13(8):e0202338.
Zago M, Luzzago M, Marangoni T, De Cecco M, Tarabini M, Galli M. 3D Tracking of human motion using visual skeletonization and stereoscopic vision. Front Bioeng Biotechnol. 2020;8:181.
Cao Z, Hidalgo G, Simon T, Wei S-E, Sheikh Y. OpenPose: realtime multi-person 2D pose estimation using Part Affinity Fields. IEEE Trans Pattern Anal Mach Intell. 2019;43(1):172–86.
Ota M, Tateuchi H, Hashiguchi T, Ichihashi N. Verification of validity of gait analysis systems during treadmill walking and running using human pose tracking algorithm. Gait Posture. 2021;85:290–7.
Park KW, Lee EJ, Lee JS, Jeong J, Choi N, Jo S, Jung M, Do JY, Kang DW, Lee JG, et al. Machine learning-based automatic rating for cardinal symptoms of Parkinson disease. Neurology. 2021;96(13):e1761–9.
Sato K, Nagashima Y, Mano T, Iwata A, Toda T. Quantifying normal and parkinsonian gait features from home movies: Practical application of a deep learning-based 2D pose estimator. PLoS One. 2019;14(11):e0223549.
Armstrong AD, MacDermid JC, Chinchalkar S, Stevens RS, King GJ. Reliability of range-of-motion measurement in the elbow and forearm. J Shoulder Elbow Surg. 1998;7(6):573–80.
Blonna D, Zarkadas PC, Fitzsimmons JS, O’Driscoll SW. Validation of a photography-based goniometry method for measuring joint range of motion. J Shoulder Elbow Surg. 2012;21(1):29–35.
Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–10.
Jacklin PB, Roberts JA, Wallace P, Haines A, Harrison R, Barber JA, Thompson SG, Lewis L, Currell R, Parker S, et al. Virtual outreach: economic evaluation of joint teleconsultations for patients referred by their general practitioner for a specialist opinion. BMJ. 2003;327(7406):84.
Terwee CB, de Winter AF, Scholten RJ, Jans MP, Devillé W, van Schaardenburg D, Bouter LM. Interobserver reproducibility of the visual estimation of range of motion of the shoulder. Arch Phys Med Rehabil. 2005;86(7):1356–61.
Beshara P, Chen JF, Read AC, Lagadec P, Wang T, Walsh WR. The reliability and validity of wearable inertial sensors coupled with the Microsoft Kinect to measure shoulder range-of-motion. Sensors (Basel, Switzerland). 2020;20(24):7238.
Lee SH, Yoon C, Chung SG, Kim HC, Kwak Y, Park HW, Kim K. Measurement of shoulder range of motion in patients with adhesive capsulitis using a Kinect. PLoS One. 2015;10(6):e0129398.
Ienaga N, Fujita K, Koyama T, Sasaki T, Sugiura Y, Saito H. Development and user evaluation of a smartphone-based system to assess range of motion of wrist joint. J Hand Surg Glob Online. 2020;2(6):339–42.
Yu F, Zeng L, Pan D, Sui X, Tang J. Evaluating the accuracy of hand models obtained from two 3D scanning techniques. Sci Rep. 2020;10(1):11875.
Li Z, Chang CC, Dempsey PG, Cai X. Refraction effect analysis of using a hand-held laser scanner with glass support for 3D anthropometric measurement of the hand: strategy comparison and application. J Measurement. 2008;41(8):851–61.
Cook JR, Baker NA, Cham R, Hale E, Redfern MS. Measurements of wrist and finger postures: a comparison of goniometric and motion capture techniques. J Appl Biomech. 2007;23(1):70–8.
Chang C-C, Li Z, Cai X, Dempsey P. Error control and calibration in three-dimensional anthropometric measurement of the hand by laser scanning with glass support. J Measurement. 2007;40(1):21–7.
Zulkarnain RF, Kim GY, Adikrishna A, Hong HP, Kim YJ, Jeon IH. Digital data acquisition of shoulder range of motion and arm motion smoothness using Kinect v2. J Shoulder Elbow Surg. 2017;26(5):895–901.
Wilson JD, Khan-Perez J, Marley D, Buttress S, Walton M, Li B, Roy B. Can shoulder range of movement be measured accurately using the Microsoft Kinect sensor plus Medical Interactive Recovery Assistant (MIRA) software? J Shoulder Elbow Surg. 2017;26(12):e382–9.
Chu H, Joo S, Kim J, Kim JK, Kim C, Seo J, Kang DG, Lee HS, Sung KK, Lee S. Validity and reliability of POM-Checker in measuring shoulder range of motion: protocol for a single center comparative study. Medicine. 2018;97(25):e11082.
We want to thank Prof. Bai Leng, Prof. Zhenguo Lao and Jianwen Zheng for technical help, writing assistance, and general support.
Ethics approval and consent to participate
The study was approved by the Ethics Committee of the First Affiliated Hospital of Sun Yat-sen University (reference number: 2021–387). All participants provided informed consent prior to participation in this study and to the publication of identifying information/images in an online open-access publication. We confirm that all methods were performed in accordance with the relevant guidelines and regulations.
Consent for publication
The authors declared no competing interests.
Cite this article
Fan, J., Gu, F., Lv, L. et al. Reliability of a human pose tracking algorithm for measuring upper limb joints: comparison with photography-based goniometry. BMC Musculoskelet Disord 23, 877 (2022). https://doi.org/10.1186/s12891-022-05826-4
- Pose estimation
- Range of motion
- Automatic measurement