 Research article
 Open Access
Test-retest reliability and measurement errors of grip strength test in patients with traumatic injuries in the upper extremity: a cross-sectional study
BMC Musculoskeletal Disorders volume 20, Article number: 256 (2019)
Abstract
Background
Grip strength (GS) test is an essential aspect of clinical practice with patients with upper extremity injuries. The random error of GS test was hypothesized to be proportional to the level of GS. The purpose of the current study was to estimate a precise range for the measurement error of GS in patients following traumatic injuries in the upper extremity.
Methods
Following traumatic injuries in the upper extremity, 109 participants completed GS tests twice, one weekend apart. Bland-Altman plot analysis was adopted to estimate the precise limits of agreement with 95% confidence interval (CI).
Results
The mean of three consecutive trials had a higher intraclass correlation coefficient of 0.974 (95% CI = 0.963, 0.982) than those of one trial and the mean of the first two trials in injured upper extremities. When GS was ≤20 kg, the upper limit of agreement with 95% CI was estimated as (0.41 × average GS + 1.24), while the lower limit was estimated as (−0.41 × average GS − 0.39). A table of one-to-one matches between averaged GS ≤ 20 kg and transformed ranges of random errors with 95% certainty was created; the standard error of measurement and minimal detectable change with 95% certainty of GS test were 1.8 and 4.9 kg, respectively. When GS was > 20 kg, the width of agreement with 95% CI ranged from −4.9 to 5.3 kg, and the standard error of measurement and minimal detectable change with 95% certainty were 1.8 and 5.1 kg, respectively.
Conclusion
The one-to-one match table can be considered a practical tool for judging whether a change in GS score is real or due to random error when GS is ≤20 kg.
Background
In clinical practice, therapists attend to several aspects of care for patients with upper extremity injuries, including pain, scarring, swelling, range of motion of involved and adjacent joints, sensibility, muscle strength, and fine motor abilities. Among these aspects, grip strength (GS) is an essential indicator of hand function because it is a basic requirement for the performance of sports, daily activities, and work tasks [1,2,3]. GS also reflects general health status and, more specifically, is negatively associated with cardiovascular mortality, myocardial infarction, and stroke [4]. Therefore, reliable GS measures are important for evaluating the severity of a disability and for monitoring clinical progress.
The intraclass correlation coefficient (ICC) is traditionally used to estimate the agreement between two repeated administrations [5, 6]. Previous studies of the measurement properties of GS showed that hand dynamometers have satisfactory test-retest reliability in upper extremities with physical dysfunction [7,8,9]. To determine whether a patient's change on a specific measurement is real or due to random error, the minimal detectable change with 95% certainty (MDC_{95}) is used to estimate the size of random errors [10,11,12]. Therefore, by knowing the MDC_{95} of GS in patients with upper extremity injuries, clinicians can determine whether a change in GS score is likely to reflect real improvement or random measurement error. Schreuders et al. [8] estimated the test-retest reliability of the GS test in patients with hand injuries and reported an ICC of 0.97, with an MDC_{95} of 61 N (≈ 6.22 kg). This means that differences between two consecutive measurements greater than 61 N can be interpreted as real changes in GS with 95% certainty. Nevertheless, in clinical practice, many patients with upper extremity injuries at the subacute stage have a very low GS of only a few kilograms. In our view, an MDC_{95} of 61 N is too large for such patients, because it is implausible that patients with extremely poor GS would show random errors of that size. Although clinicians can be highly confident that a change is real when it exceeds this large MDC_{95}, applying it to patients with low GS carries a high risk of false-negative interpretations.
The Bland-Altman plot complements the role of the ICC and MDC in determining the test-retest reliability of measurement tools. The plot, usually presented as the differences between two measurements against the mean of the two measurements, can reveal the 95% limits of agreement (LoA_{95}), which is the width of the differences with 95% certainty. The LoA_{95} defines a range within which most differences will lie, and a narrow LoA_{95} indicates that the scores of the two measurements are close together [13]. Whether and how the differences are related to the mean can be identified through statistical analysis and visual inspection [13]. In a study evaluating the test-retest reliability of the Jamar Dynamometer in a healthy population of 76 participants, the Bland-Altman plot seemed to indicate that the differences were proportional to the mean [14]. In addition, we found similar scatter in the Bland-Altman plot of another study with 19 healthy participants [15]. Therefore, we hypothesized that the random errors between two administrations are also proportional to GS in patients with upper extremity injuries. If so, it is not appropriate to use the MDC_{95} alone to determine whether a change in GS score is real or due to random error in patients whose GS ranges from several kilograms to tens of kilograms.
The purpose of this study was to estimate the test-retest reliability and the precise range of measurement errors of the GS test in patients with upper extremity injuries using Bland-Altman plot analysis, to help clinical practitioners determine whether patients' changes in GS indicate real progress or are due to random errors.
Methods
Design
This research was a clinical measurement and cross-sectional study. Participants received GS tests twice: the first test was administered on a Friday and the second on the following Monday.
Participants
Patients with upper extremity dysfunction due to traumatic occupational injuries were recruited in a rehabilitation center. All patients were receiving inpatient rehabilitation services in the rehabilitation center when they were recruited. The following inclusion criteria were applied: (1) aged 18 years or above; (2) having a traumatic injury in unilateral upper extremity; (3) being capable of being evaluated for GS, confirmed by an occupational therapist experienced in hand therapy; (4) remaining dysfunction in injured upper extremities; and (5) having good compliance with occupational therapists’ daily treatment instructions. The following exclusion criteria were applied: (1) having concurrent injuries in any other parts of the body; (2) experiencing pain when performing maximal isometric GS (visual analogue scale > 3); and (3) not being able to attend the second GS test.
All participants signed an informed consent form in accordance with the Declaration of Helsinki, and the study was approved by the medical ethics committee of the rehabilitation center.
Procedures
In this rehabilitation center, all patients receive rehabilitation services five days per week, from Monday to Friday. During weekends, they usually go home or stay in wards and do not receive any formal rehabilitation services from clinical practitioners. The aim of this study was to estimate the test-retest reliability and the range of measurement errors of the GS test. To avoid any bias from interventions, we arranged the first test on Friday and the second test on the following Monday. Because no effective interventions were delivered in the short interval between the two tests, we assumed that none of the participants would have experienced a real change in GS. After signing the consent form, demographic data including gender, marital status, age, height, body weight, and dominant hand were collected from each participant. In addition, injured sides, injury sites, and the number of days since injury were confirmed.
GS test
Prior to starting the first test, participants were instructed to sit on a chair and maintain the posture recommended by the American Society of Hand Therapy [16] and Roberts et al. [17]. The participants sat with their feet flat on the floor, the shoulder adducted at 0 degrees, the elbow flexed at 90 degrees, the forearm in a neutral position, and the wrist extended to 30 degrees. The dynamometer used in this study was a calibrated Jamar Hydraulic Hand Dynamometer (model SH5001, Saehan Corp, Masan, Korea), which is the most commonly used type and showed excellent reliability for the measurement of GS in previous studies [17]. Verbal instructions and a demonstration of how to perform the GS test were provided to each participant prior to the test. Once everything was ready, the participants were instructed to exert maximum grip at the second handle position and to maintain the contraction for five seconds. Three consecutive trials were performed with both injured and healthy upper extremities, with a 15 s rest period between trials to prevent muscle fatigue. All participants started the test with their healthy hands. The value at which the needle of the dynamometer stopped was recorded for each trial. The second test followed the above procedures and used the same dynamometer for all patients. In the current study, the same occupational therapist experienced in hand therapy was responsible for all participants' GS tests.
Statistical analysis
Descriptive statistics were computed to illustrate participants' demographic characteristics. Both the one-sample Kolmogorov-Smirnov test and histogram plots were applied to check the normality of continuous variables. We used the data of the first trial, the mean of the first two trials (mean_{2}), and the mean of the three trials (mean_{3}) to estimate the test-retest reliability and the measurement error of GS of injured and healthy upper extremities. ICC_{2,1} as well as their 95% confidence intervals (CI) were calculated [5]. An ICC value higher than 0.9 was considered excellent. In addition, a paired t-test was applied to verify whether there was any systematic bias between the first and second tests. The MDC_{95} and standard error of measurement (SEM) were calculated using the following formulas [18]:
SEM = SD × √(1 − ICC)
MDC_{95} = 1.96 × √2 × SEM
where SD is the standard deviation of the GS scores.
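The two quantities can be sketched in a few lines of Python. This is a minimal illustration that assumes the conventional forms SEM = SD × √(1 − ICC) and MDC_{95} = 1.96 × √2 × SEM; the function names are hypothetical and the study's own analysis was run in SPSS.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)

def mdc95_from_sem(sem):
    """Minimal detectable change with 95% certainty: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem

# Consistency check against the rounded SEM reported for mean_3 (1.8 kg).
print(f"MDC95 = {mdc95_from_sem(1.8):.1f} kg")  # prints "MDC95 = 5.0 kg"
```

With the rounded SEM of 1.8 kg, the sketch returns about 5.0 kg, which agrees with the MDC_{95} reported for mean_{3} in injured upper extremities.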
To verify whether there were any other relationships between GS and measurement errors, Bland-Altman plots were created based on the values of mean_{3}. A systematic error is confirmed if the 95% CI for the mean value of differences does not include 0. The LoA_{95} was calculated by using the Bland-Altman plots, which present the scatter of differences between the first and second tests (y-axis) against the average of the first and second GS tests (average GS) (x-axis) [19]. If the differences are normally distributed and do not show any association with the average GS, the limits of the LoA_{95} are computed as
LoA_{95} = mean_{difference} ± 1.96 × SD_{difference}
where mean_{difference} is the mean of differences between the two tests, and SD_{difference} is the standard deviation of the differences. This implies that 95% of the differences will lie between the upper and lower limits.
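As a minimal sketch of the constant-width limits of agreement described above (mean difference ± 1.96 × SD of the differences), the following Python function takes paired scores from two sessions. The function name and the example scores are hypothetical and are not study data.

```python
from statistics import mean, stdev

def bland_altman_limits(test1, test2):
    """Return (mean difference, lower limit, upper limit) for paired scores."""
    diffs = [b - a for a, b in zip(test1, test2)]  # second test minus first test
    mean_diff = mean(diffs)
    half_width = 1.96 * stdev(diffs)               # sample SD of the differences
    return mean_diff, mean_diff - half_width, mean_diff + half_width

# Hypothetical paired GS scores in kg, for illustration only.
first = [10.0, 22.5, 31.0, 8.5, 17.0]
second = [11.0, 21.5, 33.0, 8.0, 18.5]
mean_diff, lower, upper = bland_altman_limits(first, second)
print(f"mean difference {mean_diff:.2f} kg, LoA95 {lower:.2f} to {upper:.2f} kg")
```

These limits are only appropriate when the spread of the differences does not depend on the average score, which is the assumption examined in the next step of the analysis.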
In injured upper extremities, the Spearman's correlation coefficient ρ between the observed differences, which were not normally distributed, and the average GS was 0.118 (p = 0.310). Therefore, residuals were defined as the differences between the observed differences and the mean of differences. It was observed that the absolute values of residuals (R), which were the distances between the observed differences and mean_{difference}, tended to increase as the average GS increased in upper extremities with poor GS. However, in upper extremities with high GS, this trend was not distinct. To identify the most appropriate cutoff point on the average GS to separate the above two conditions, the Spearman's correlation coefficient ρ between R and the average GS was calculated for the cases with average GS lower than each possible cutoff point. Spearman's ρ was used because R was not normally distributed. The cutoff was defined as the point where the relationship between R and the average GS had the highest Spearman's correlation coefficient. The Bland-Altman plots were then constructed again for the two conditions according to Bland and Altman's recommendations [13]. First, we regressed R on the average GS to derive the predicted absolute residual
R_{predicted} = c_{0} + c_{1} × average GS
where c_{0} and c_{1} are the intercept and slope of the regression.
Second, the LoA_{95} was calculated using the following formula:
LoA_{95} = mean_{difference} ± 2.46 × R_{predicted}
where 2.46 = 1.96 × √(π/2).
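The two steps can be sketched as follows, assuming the regression approach described by Bland and Altman [13]: the absolute residuals are regressed on the average score, and the fitted values are multiplied by 1.96 × √(π/2) ≈ 2.46, because the mean absolute value of a normal variable equals its standard deviation times √(2/π). The function names `fit_line` and `regression_based_limits` are hypothetical, and the example data are invented.

```python
import math
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def regression_based_limits(test1, test2):
    """Return a function mapping an average score to (lower, upper) 95% limits."""
    diffs = [b - a for a, b in zip(test1, test2)]
    avgs = [(a + b) / 2 for a, b in zip(test1, test2)]
    mean_diff = mean(diffs)
    abs_resid = [abs(d - mean_diff) for d in diffs]  # R in the text
    c0, c1 = fit_line(avgs, abs_resid)               # R regressed on the average
    k = 1.96 * math.sqrt(math.pi / 2)                # approximately 2.46

    def limits(avg_gs):
        half_width = k * (c0 + c1 * avg_gs)
        return mean_diff - half_width, mean_diff + half_width

    return limits

# Hypothetical paired scores in kg whose spread grows with the average.
limits = regression_based_limits([2.0, 2.0, 6.0, 6.0], [4.0, 0.0, 10.0, 2.0])
for avg in (2, 4, 8):
    lo, hi = limits(avg)
    print(f"average {avg} kg: {lo:.2f} to {hi:.2f} kg")
```

The resulting limits widen with the average score, which gives the horn-shaped band discussed later in the article.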
Once the upper and lower limits of the LoA_{95} were calculated, one-to-one matches between integer GS scores and transformed ranges of random errors with 95% certainty were created for convenience in clinical application. The transformed lower and upper limits of the ranges of random errors were calculated using the integer GS scores plus the upper and lower limits of LoA_{95}, respectively.
All statistical analyses were performed with IBM SPSS Statistics 20. The level of significance was set at p < 0.05 for all statistical analyses.
Results
Sample characteristics
A total of 111 patients participated in the current study between February and September 2017, of whom two did not attend the second test. Therefore, 109 patients were included in the statistical analysis, and their median age was 37 years. In terms of the types of injuries, 52 (47.7%) patients had fractures, 18 (16.5%) had finger replantation, 14 (12.8%) had finger amputation, 14 (12.8%) had complex injuries involving tendons, and 11 (10.1%) had complex injuries involving nerves. The median time since injury was 133 days. Characteristics of the participants are presented in full in Table 1.
The test-retest reliability of GS test
The indices of test-retest reliability of the GS test in injured and healthy upper extremities based on the data of the first trial, mean_{2}, and mean_{3} are presented in Table 2. No significant differences were observed between the first and second test scores; the mean differences ranged from 0.1 (95% CI = −0.7, 0.8) kg to 0.7 (95% CI = 0, 1.4) kg. The test-retest reliabilities of GS in injured and healthy upper extremities were excellent, with high ICCs ranging from 0.936 (95% CI = 0.908, 0.956) to 0.974 (95% CI = 0.963, 0.982). Although mean_{3} had the highest ICC of 0.956 (95% CI = 0.936, 0.970) in healthy upper extremities, it did not differ significantly from those of mean_{2} and the first trial, 0.949 (95% CI = 0.926, 0.965) and 0.936 (95% CI = 0.908, 0.956), respectively. In contrast, in injured upper extremities, mean_{3} had the highest ICC of 0.974 (95% CI = 0.963, 0.982), which was significantly higher than that of the first trial, 0.945 (95% CI = 0.920, 0.962). However, the ICC of mean_{3} was not significantly different from that of mean_{2}, 0.970 (95% CI = 0.956, 0.979). Mean_{3} had the lowest SEM and MDC_{95}, 1.8 kg and 5.0 kg, respectively, compared with mean_{2} and the first trial in injured upper extremities (Table 2).
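As an illustration of how an ICC_{2,1} of the kind reported in Table 2 can be obtained, the following minimal Python sketch computes it from two-way ANOVA mean squares (two-way random effects, absolute agreement, single measurement). The analyses in this study were run in SPSS; the function name `icc_2_1` and the example scores are hypothetical and are not study data.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table with one row per participant and one column per session."""
    n = len(scores)          # participants
    k = len(scores[0])       # sessions (2 for a test-retest design)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants, sessions, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical test-retest GS scores in kg (first test, second test).
example = [[10.0, 11.0], [22.5, 21.5], [31.0, 33.0], [8.5, 8.0], [17.0, 18.5]]
print(f"ICC(2,1) = {icc_2_1(example):.3f}")
```

Because this form penalizes a systematic shift between sessions as well as random scatter, it is a reasonable companion to the paired t-test for systematic bias.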
The Bland-Altman plot analysis based on mean_{3}
Bland-Altman plot analysis was conducted for further examination of the differences. In healthy upper extremities, the Bland-Altman plot showed no systematic trend (Fig. 1 a). The mean difference between the second and first tests was 0.2 (95% CI = −0.4, 0.7) kg. The LoA_{95} ranged from −5.6 to 5.9 kg, and 101 (92.7%) cases fell within the 95% limits of agreement. In injured upper extremities, the mean difference between the second and first tests was 0.4 (95% CI = −0.1, 0.8) kg. The LoA_{95} ranged from −4.6 to 5.3 kg (Fig. 1 b), and 97 (89.0%) cases fell within the 95% limits of agreement. Visual inspection suggested that the absolute value of residuals was proportional to the average GS in injured upper extremities with poor GS. However, in injured upper extremities with high GS, this trend was not distinct.
The Spearman's correlation coefficients between the absolute values of residuals and average GS scores lower than possible cutoffs were calculated (Fig. 2). The results indicated that 20 kg was the most appropriate cutoff, with the highest correlation coefficient (Spearman's ρ = 0.566, p < 0.001). The Spearman's ρ between the absolute values of residuals and the average GS was 0.566 (95% CI = 0.406, 0.896, p < 0.001) in injured upper extremities with GS ≤ 20 kg, but −0.003 (95% CI = −0.352, 0.355, p = 0.987) in injured upper extremities with GS > 20 kg. Findings for the test-retest reliability and measurement errors based on mean_{3} ≤ 20 kg and > 20 kg in injured upper extremities are presented in Table 2. For injured upper extremities with GS ≤ 20 kg (n = 76), the SEM and MDC_{95} were 1.8 kg and 4.9 kg, respectively, while the SEM and MDC_{95} were 1.8 kg and 5.1 kg, respectively, in injured upper extremities with GS > 20 kg (n = 33).
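The cutoff search described above can be sketched as a scan over candidate cutoffs, keeping the one at which Spearman's ρ between the absolute residuals and the average GS is highest among cases at or below the cutoff. This is an illustrative reimplementation, not the authors' SPSS procedure; the helper names and the `min_n` guard against very small subsets are assumptions introduced here.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5

def best_cutoff(avg_gs, abs_resid, candidates, min_n=10):
    """Return (cutoff, rho) with the highest rho among cases with avg_gs <= cutoff."""
    best = None
    for c in candidates:
        pairs = [(a, r) for a, r in zip(avg_gs, abs_resid) if a <= c]
        if len(pairs) < min_n:  # assumed guard: skip subsets too small to trust
            continue
        xs, ys = zip(*pairs)
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue            # rho is undefined when a variable is constant
        rho = spearman_rho(xs, ys)
        if best is None or rho > best[1]:
            best = (c, rho)
    return best
```

A call such as `best_cutoff(avg_gs, abs_resid, candidates=range(5, 41))` would return the selected cutoff together with its ρ; with small samples, the chosen cutoff can be sensitive to the choice of `min_n`.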
As shown in Fig. 3, for injured upper extremities with average GS ≤ 20 kg, the Bland-Altman plot showed a narrow LoA_{95} in upper extremities with poor GS, but the LoA_{95} width increased as the average GS increased. The mean difference between the second and first tests was 0.43 (95% CI = −0.15, 1.00) kg. The estimated limits of agreement were:
Upper limit = 0.41 × average GS + 1.24
Lower limit = −0.41 × average GS − 0.39
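To make the estimated limits concrete, the short Python sketch below evaluates the two equations reported in this article (upper limit = 0.41 × average GS + 1.24; lower limit = −0.41 × average GS − 0.39) at a few average GS values. The function names are hypothetical. The sketch applies the limits directly to the difference between two tests at their average, which is not the same as the transformed ranges tabulated in Table 3.

```python
def reported_limits(avg_gs):
    """Lower and upper 95% limits (kg) for injured limbs with average GS <= 20 kg."""
    if not 0 <= avg_gs <= 20:
        raise ValueError("equations were estimated for average GS <= 20 kg only")
    upper = 0.41 * avg_gs + 1.24
    lower = -0.41 * avg_gs - 0.39
    return lower, upper

def exceeds_random_error(first, second):
    """True if (second - first) falls outside the limits at the pair's average."""
    lower, upper = reported_limits((first + second) / 2)
    diff = second - first
    return diff < lower or diff > upper

for gs in (5, 10, 20):
    lo, hi = reported_limits(gs)
    print(f"average GS {gs:>2} kg: {lo:.2f} to {hi:.2f} kg")
```

At an average GS of 10 kg, for example, the limits evaluate to −4.49 and 5.34 kg, and at 20 kg to −8.59 and 9.44 kg.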
One-to-one matches between GS scores ≤20 kg and ranges of random errors with 95% certainty were created and are presented in Table 3.
In injured upper extremities with GS scores > 20 kg, the Bland-Altman plot based on mean_{3} showed no systematic difference between the first and second tests and no distinct trend between residuals and average GS scores. The mean difference between the second and first tests was 0.2 (95% CI = −0.7, 1.1) kg, and the LoA_{95} ranged from −4.9 to 5.3 kg. In total, 104 (95.4%) cases fell within the LoA_{95} in the new Bland-Altman plot analyses.
Discussion
The inter-rater reliability of the GS test has been shown to be excellent in a previous study [20]. The current study estimated the test-retest reliability of the GS test based on results of the first trial, the mean of the first two trials, and the mean of three consecutive trials, and calculated the SEM as well as the MDC. In the current study, Bland-Altman plot analysis was adopted to explore the relationship between measurement errors and GS in healthy and injured upper extremities.
In healthy upper extremities, our findings were consistent with previous studies that evaluated the test-retest reliability of handheld dynamometers [21,22,23,24,25]. Our study confirmed that the GS test using the Jamar Dynamometer had excellent reliability and was not affected by a practice effect. In addition, although mean_{3} had the highest ICC, it was not significantly different from those of the first trial and mean_{2}. This indicated that these three methods had comparable reliability and supported the one-trial protocol for assessing GS in healthy upper extremities [26]. Visual inspection of the Bland-Altman plot for healthy upper extremities did not show signs of any systematic bias in the relationship between differences and GS scores. Therefore, the MDC_{95} can be considered an appropriate criterion for determining whether changes in GS of healthy upper extremities are real or due to random error.
However, in injured upper extremities, mean_{3} had a significantly higher ICC than that of the first trial, but its difference from mean_{2} was non-significant. The SEM and MDC_{95} of mean_{3} were also the lowest in injured upper extremities. In particular, the paired t-test for the first trial of injured upper extremities showed a p-value close to the significance threshold, and the lower limit of the 95% CI for the mean difference was zero. This suggests that there might be a systematic bias that influenced the reliability. Kennedy et al. [27] found that one trial and mean_{3} had comparable test-retest reliability based on a sample of 25 participants with rheumatoid arthritis. However, we recruited 109 participants in the current study, which resulted in narrow 95% CIs for the ICC. As a result, although the ICC values of the first trial (0.945) and mean_{3} (0.974) did not differ greatly in magnitude, the difference was statistically significant. Accordingly, we considered that the first trial, mean_{2}, and mean_{3} all had excellent test-retest reliability, among which the mean_{3} method was the most reliable. Therefore, we support the use of the mean_{3} method to test patients' GS in clinical practice, as recommended by the American Society of Hand Therapy [16], even though it entails extra time for the test.
The Bland-Altman plot is a graphical method to identify any relationship between the differences and averages of scores on two tests [13]. In our study, a trend was observed whereby the difference was proportional to the average of two GS tests in injured upper extremities with poor GS. However, the width of the differences in injured upper extremities with high GS was stable. We used the Spearman's correlation coefficient to identify the cutoff point at which the relationship between the absolute values of residuals and average GS was strongest. Our findings showed that 20 kg was the most appropriate cutoff point to separate injured upper extremities into the above two conditions. Additionally, we estimated the width of LoA_{95} for injured upper extremities with GS ≤ 20 kg according to the recommendations of Bland and Altman [13]. The graph of LoA_{95} looked like a "horn," which indicated that the measurement error increased as GS increased when GS was ≤20 kg. To take the example of a GS score of 5 kg, the width of the measurement error was −3.5 to 3.4 kg according to the horn-like LoA_{95}. However, according to the MDC_{95}, the width of the measurement error was −4.9 to 4.9 kg. Therefore, when the MDC_{95} is used to determine whether a patient's change is beyond the threshold of random error, the possibility of a false-negative interpretation is increased.
To simplify clinical application of the equations we proposed, we transformed them into a table giving one-to-one matches between GS scores and ranges of random errors with 95% certainty. To apply this table, clinicians first find the patient's current level of GS in the left column and then read off the lower and upper limits of the corresponding range of random errors. Specifically, a patient's change may not be real if the result of the second test lies within the corresponding range. In addition, upon combining the Bland-Altman plots for GS ≤ 20 kg and GS > 20 kg, a higher percentage of cases was found to fall within the 95% limits of agreement.
Our study may be the first with a large sample size to construct Bland-Altman plots separately for poor and high GS scores. In the past, many authors adopted Bland-Altman plots to analyze the reliability of the GS test, but they did not conduct the analysis presented in our study. The most important reason might be their small sample sizes [15, 28, 29]. Scatter plots of small samples usually do not reveal obvious relationships, and statistical analyses may easily yield non-significant results. Therefore, such analyses were not pursued in previous studies. In our study, the Spearman's ρ between the absolute values of residuals and the average of the first and second tests was 0.566 in 76 patients with GS ≤ 20 kg, and post hoc analysis showed a statistical power of 0.999, indicating that the sample was more than adequate to detect this relationship. In addition, previous researchers commonly evaluated the reliability of devices for the GS test in healthy populations with high GS scores [14]. In contrast, the GS scores of injured upper extremities in our study ranged from several kilograms to tens of kilograms, which covered the full range of GS scores. This was another reason why we could identify the relationship between measurement errors and GS scores.
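The article does not state how the post hoc power was computed. As a rough plausibility check only, the sketch below uses the Fisher z approximation for a correlation coefficient, which is an assumption introduced here; it was derived for Pearson's r and is only approximate for Spearman's ρ. The function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def correlation_power(r, n, z_crit=1.96):
    """Approximate two-sided power (alpha = 0.05) to detect correlation r with n pairs."""
    z_effect = math.atanh(r) * math.sqrt(n - 3)  # Fisher z divided by its standard error
    return normal_cdf(z_effect - z_crit) + normal_cdf(-z_effect - z_crit)

power = correlation_power(0.566, 76)
print(f"approximate power = {power:.4f}")
```

Under this approximation the power for ρ = 0.566 and n = 76 exceeds 0.999, which is in line with the value reported above.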
This study also had some limitations. Firstly, the participants received rehabilitation services on weekdays. We assumed that they did not undergo any real change over the weekend because they did not receive any formal interventions during this period. However, two confounding factors might have influenced the results of the current study: the lasting effect of interventions received during weekdays and additional exercises done by the participants during the weekend. Both could improve participants' GS scores and increase the extent of disagreement between the two tests. This may be the reason why the paired t-test for the first trial of injured upper extremities showed a p-value close to the significance level. Secondly, only 33 participants had a GS score > 20 kg. Therefore, we cannot conclude with strong confidence whether the appropriate cutoff point would change if more participants with GS > 20 kg were recruited. Thirdly, to avoid any learning effect, some researchers employed a warm-up practice prior to the GS test in addition to verbal instructions and demonstration [30, 31]. In the current study, we provided verbal instructions and demonstration only, which may have had a negative influence on the reliability of the GS test. Lastly, we only sampled participants who had traumatic injuries and only used one commercial handheld dynamometer to estimate the measurement error. Therefore, we cannot be certain that our results can be generalized to other disorders or to other devices used to assess GS.
Conclusions
In summary, the GS test was found to have excellent test-retest reliability in healthy and injured upper extremities. We recommend that clinical practitioners use mean_{3} for the GS test, particularly in cases with injured upper extremities. When the GS is ≤20 kg, clinicians can use the one-to-one match table to judge whether a change in GS is real or due to random errors.
Abbreviations
R: Absolute values of residuals
CI: Confidence intervals
GS: Grip strength
ICC: Intraclass correlation coefficient
LoA_{95}: 95% limits of agreement
MDC_{95}: Minimal detectable change with 95% certainty
Mean_{2}: Mean of the first two trials
Mean_{3}: Mean of the three trials
SD: Standard deviation
SEM: Standard error of measurement
References
1. Beumer A, Lindau TR. Grip strength ratio: a grip strength measurement that correlates well with DASH score in different hand/wrist conditions. BMC Musculoskelet Disord. 2014;15:336.
2. Chang JH, Wu M, Lee CL, Guo YL, Chiu HY. Correlation of return to work outcomes and hand impairment measures among workers with traumatic hand injury. J Occup Rehabil. 2011;21(1):9–16.
3. Bruyns CN, Jaquet JB, Schreuders TA, Kalmijn S, Kuypers PD, Hovius SE. Predictors for return to work in patients with median and ulnar nerve injuries. J Hand Surg Am. 2003;28(1):28–34.
4. Leong DP, Teo KK, Rangarajan S, Lopez-Jaramillo P, Avezum A Jr, Orlandini A, et al. Prognostic value of grip strength: findings from the prospective urban rural epidemiology (PURE) study. Lancet. 2015;386(9990):266–73.
5. Shrout PE. Measurement reliability and agreement in psychiatry. Stat Methods Med Res. 1998;7(3):301–17.
6. Lexell JE, Downham DY. How to assess the reliability of measurements in rehabilitation. Am J Phys Med Rehabil. 2005;84(9):719–23.
7. Brown A, Cramer LD, Schmidt J, Ware L, Mackenzie E. Validity and reliability of the Dexter hand evaluation and therapy system in hand-injured patients. J Hand Ther. 2000;13(1):37–45.
8. Schreuders TA, Roebroeck ME, Goumans J, van Nieuwenhuijzen JF, Stijnen TH, Stam HJ. Measurement error in grip and pinch force measurements in patients with hand injuries. Phys Ther. 2003;83(9):806–15.
9. Clifford MS, Hamer P, Phillips M, Wood FM, Edgar DW. Grip strength dynamometry: reliability and validity for adults with upper limb burns. Burns. 2013;39(7):1430–6.
10. Stratford PW, Binkley J, Solomon P, Finch E, Gill C, Moreland J. Defining the minimum level of detectable change for the Roland-Morris questionnaire. Phys Ther. 1996;76(4):359–65.
11. Stratford PW, Binkley JM. Applying the results of self-report measures to individual patients: an example using the Roland-Morris questionnaire. J Orthop Sports Phys Ther. 1999;29(4):232–9.
12. Portney LG, Watkins MP. Foundations of clinical research: Pearson new international edition: applications to practice. 3rd ed. New Jersey: Pearson Education; 2009.
13. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8(2):135–60.
14. Hogrel JY. Grip strength measured by high precision dynamometry in healthy subjects from 5 to 80 years. BMC Musculoskelet Disord. 2015;16:139.
15. Savva C, Karagiannis C, Rushton A. Test-retest reliability of grip strength measurement in full elbow extension to evaluate maximum grip strength. J Hand Surg Eur Vol. 2013;38(2):183–6.
16. Fess EE. Grip strength. In: Casanova JS, editor. Clinical assessment recommendations. Chicago: American Society of Hand Therapists; 1992. p. 41–6.
17. Roberts HC, Denison HJ, Martin HJ, Patel HP, Syddall H, Cooper C, et al. A review of the measurement of grip strength in clinical and epidemiological studies: towards a standardised approach. Age Ageing. 2011;40(4):423–9.
18. Haley SM, Fragala-Pinkham MA. Interpreting change scores of tests and measures used in physical therapy. Phys Ther. 2006;86(5):735–43.
19. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;327(8476):307–10.
20. Bodilsen AC, Juul-Larsen HG, Petersen J, Beyer N, Andersen O, Bandholm T. Feasibility and inter-rater reliability of physical performance measures in acutely admitted older medical patients. PLoS One. 2015;10(2):e0118248.
21. Nitschke JE, McMeeken JM, Burry HC, Matyas TA. When is a change a genuine change? A clinically meaningful interpretation of grip strength measurements in healthy and disabled women. J Hand Ther. 1999;12(1):25–30.
22. Wang CY, Chen LY. Grip strength in older adults: test-retest reliability and cutoff for subjective weakness of using the hands in heavy tasks. Arch Phys Med Rehabil. 2010;91(11):1747–51.
23. Vermeulen J, Neyens JC, Spreeuwenberg MD, van Rossum E, Hewson DJ, de Witte LP. Measuring grip strength in older adults: comparing the grip-ball with the Jamar dynamometer. J Geriatr Phys Ther. 2015;38(3):148–53.
24. Beaudart C, Rolland Y, Cruz-Jentoft AJ, Bauer JM, Sieber C, Cooper C, et al. Assessment of muscle function and physical performance in daily clinical practice. Calcif Tissue Int. 2019. https://doi.org/10.1007/s00223-019-00545-w.
25. Bohannon RW. Test-retest reliability of measurements of hand-grip strength obtained by dynamometry from older adults: a systematic review of research in the PubMed database. J Frailty Aging. 2017;6(2):83–7.
26. Hollak N, Soer R, van der Woude LH, Reneman MF. Towards a comprehensive functional capacity evaluation for hand function. Appl Ergon. 2014;45(3):686–92.
27. Kennedy D, Jerosch-Herold C, Hickson M. The reliability of one vs. three trials of pain-free grip strength in subjects with rheumatoid arthritis. J Hand Ther. 2010;23(4):384–90.
28. Savva C, Giakas G, Efstathiou M, Karagiannis C. Test-retest reliability of handgrip strength measurement using a hydraulic hand dynamometer in patients with cervical radiculopathy. J Manip Physiol Ther. 2014;37(3):206–10.
29. Stockton KA, Wrigley TV, Mengersen KA, Kandiah DA, Paratz JD, Bennell KL. Test-retest reliability of hand-held dynamometry and functional tests in systemic lupus erythematosus. Lupus. 2011;20(2):144–50.
30. Nascimento LR, Polese JC, Faria CDCM, Teixeira-Salmela LF. Isometric hand grip strength correlated with isokinetic data of the shoulder stabilizers in individuals with chronic stroke. J Bodyw Mov Ther. 2012;16(3):275–80.
31. Faria CDCM, Aguiar L, Larar E, Souza L, Martins J, Teixeira-Salmela L. Dynamometry for the assessment of grip, pinch, and trunk strength in subjects with chronic stroke: reliability and various sources of outcome values. Int J Phys Med Rehabil. 2013;1(8):1–5.
Acknowledgements
We thank all patients who participated in this study.
Funding
This research was supported by the National Natural Science Foundation of China (grant number 61761166002) and the Fundamental Research Funds for the Central Universities.
Availability of data and materials
The datasets used and analyzed in the current study are available from the corresponding author (NW) upon reasonable request.
Author information
Contributions
Study conception and design as well as data collection were performed by BZ, ST and NW. The statistical analysis and the interpretation of the results were completed by BZ and NW. The first draft of the manuscript was composed by BZ. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
The current study was approved by the Medical Ethics Committee of Shanghai Yangzhi Rehabilitation Hospital (reference No. 17YZ101). All patients signed an informed consent form prior to participating in the study.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Bai, Z., Shu, T. & Niu, W. Test-retest reliability and measurement errors of grip strength test in patients with traumatic injuries in the upper extremity: a cross-sectional study. BMC Musculoskelet Disord 20, 256 (2019). https://doi.org/10.1186/s12891-019-2623-z
Keywords
 Grip strength
 Test-retest reliability
 Measurement error
 Minimal detectable change
 Bland-Altman plot
 Upper extremity