Hand-held dynamometry in patients with haematological malignancies: Measurement error in the clinical assessment of knee extension strength
BMC Musculoskeletal Disorders volume 10, Article number: 31 (2009)
Hand-held dynamometry is a portable and inexpensive method to quantify muscle strength. To determine if muscle strength has changed, an examiner must know what part of the difference between a patient's pre-treatment and post-treatment measurements is attributable to real change, and what part is due to measurement error. This study aimed to determine the relative and absolute reliability of intra- and inter-observer strength measurements with a hand-held dynamometer (HHD).
Two observers performed maximum voluntary peak torque measurements (MVPT) for isometric knee extension in 24 patients with haematological malignancies. For each patient, the measurements were carried out on the same day. The main outcome measures were the intraclass correlation coefficient (ICC ± 95%CI), the standard error of measurement (SEM), the smallest detectable difference (SDD), the relative values as % of the grand mean of the SEM and SDD, and the limits of agreement for the intra- and inter-observer '3 repetition average' and the 'highest value of 3 MVPT' knee extension strength measures.
The intra-observer ICCs were 0.94 for the average of 3 MVPT (95%CI: 0.86–0.97) and 0.86 for the highest value of 3 MVPT (95%CI: 0.71–0.94). The ICCs for the inter-observer measurements were 0.89 for the average of 3 MVPT (95%CI: 0.75–0.95) and 0.77 for the highest value of 3 MVPT (95%CI: 0.54–0.90). The SEMs for the intra-observer measurements were 6.22 Nm (3.98% of the grand mean (GM)) and 9.83 Nm (5.88% of GM). For the inter-observer measurements, the SEMs were 9.65 Nm (6.65% of GM) and 11.41 Nm (6.73% of GM). The SDDs for the generated parameters varied from 17.23 Nm (11.04% of GM) to 27.26 Nm (17.09% of GM) for intra-observer measurements, and 26.76 Nm (16.77% of GM) to 31.62 Nm (18.66% of GM) for inter-observer measurements, with similar results for the limits of agreement.
The results indicate that there is acceptable relative reliability for evaluating knee strength with a HHD, while the measurement error observed was modest. The HHD may be useful in detecting changes in knee extension strength at the individual patient level.
Intensive medical treatment regimens can significantly improve survival in patients with haematological malignancies [1–3]. The cancer therapy itself, including chemotherapy or radiotherapy, damages healthy cells throughout the body, resulting in side-effects including nausea, emesis, decreased nutritional intake and anaemia. Higher fatigue levels that are associated with decreased levels of activity and lengthened bed rest contribute to muscular catabolism and atrophy. As a result, functional limitations and muscle weakness may persist even well beyond the period of active treatment [5–7].
Patients with haematological malignancies may benefit from physical exercise programs in terms of maintenance or even improvement in physical activity levels, fitness levels [8, 9], and muscular strength [5, 10, 11]. Assessment of muscle strength is an important part of the management of cancer patients, particularly in determining the response to a muscular strength training program [12–14]. It is thus important to be able to accurately quantify the muscle strength of patients who are recovering from intensive medical treatment.
Muscular strength can be assessed both in research settings and in clinical practice settings by means of isokinetic and hand-held dynamometers (HHD). One of the advantages of using isokinetic dynamometers in patients with chronic diseases is the ability to assess muscle strength dynamically through a range of movements at various velocities, which may more accurately reflect functional performance [15, 16]. However, isokinetic strength testing protocols may be too time consuming in typical clinical settings, and the size of the equipment can also be problematic (i.e., lack of portability). Clinically, HHD represents a simple, portable and relatively inexpensive alternative to isokinetic machines for assessing muscle strength. Moreover, hand-held dynamometers provide quantification of muscle strength, and are more sensitive to change in muscle strength than simple manual muscle tests [16, 17].
Evidence of the validity of HHD has been provided in several studies, including a comparison of HHD with isokinetic strength measurements to assess lower limb strength in the elderly (r = 0.91), a comparison of HHD and manual muscle testing (r = 0.77), and of HHD and the Timed-Up-and-Go-test (r = 0.64 to -0.94). Nollet et al. also provided evidence for the validity of a HHD in lower strength ranges in patients with post-polio syndrome.
To be clinically meaningful, however, the muscle strength assessment procedure must be reliable enough to evaluate outcomes of a therapeutic intervention. Reliability can be reported in relative or absolute terms. Relative reliability statistics indicate the degree of association between 2 or more measures (e.g., intraclass correlation coefficients or ICCs), but they do not provide clinical guidance for assessing real changes at an individual patient level [23, 24]. The relative reliability of hand-held dynamometers for knee extension has been examined in numerous populations. ICCs of 0.75 or higher have been reported in studies of healthy young and elderly adults [25, 26], community-dwelling elderly fallers [19, 27], people with acquired brain injury, elderly after hip fracture and elective hip and knee arthroplasty [29, 30], adults with cerebral palsy, and patients with chronic obstructive pulmonary disease (COPD).
Absolute reliability reflects the magnitude of the differences between two measures. Examples of these statistics are the standard error of measurement (SEM), the corresponding 95% confidence interval, the smallest detectable difference (SDD), and the limits of agreement (LA). To be clinically useful, an assessment with an HHD must have only a small amount of measurement error in detecting real change over time. A retest difference in a patient with a value smaller than the SEM is likely to be the result of 'measurement noise' and is unlikely to be detected reliably in practice; a difference greater than the SDD is likely to be a real difference with 95% certainty. The absolute reliability of HHD has been reported by several authors [16, 26, 27, 31, 33, 34]. However, measures of reliability are specific to the populations and testing procedures used. This implies that the findings of previous studies may not be applicable to patients with haematological malignancies. Disease- and treatment-related symptoms, including de-conditioning, muscle weakness, and fatigue may affect not only the reliability, but also the safety of performing HHD [16, 22]. Therefore, the investigation of the measurement error of an HHD in patients with haematological malignancies is warranted.
In daily physiotherapy or rehabilitation practice, strength measurements for the same patient are often performed by several examiners. However, the measurement error associated with the assessment of strength by one observer (intra-observer reliability) may be different from that associated with the assessment of strength by several observers (inter-observer reliability). For this reason, it is important to determine both the intra- and inter-observer reliability of the measurements obtained with a HHD. This study aimed to determine the relative and absolute reliability (measurement error) of intra- and inter-observer strength measurements with a HHD in a sample of patients with haematological malignancies.
The study sample included patients with a diagnosis of haematological cancer who had completed treatment with high-dose chemotherapy in the Departments of Oncology and Haematology of the University Hospital Zurich. Patients were excluded if they were experiencing the direct side-effects of high-dose chemotherapy (e.g. fever, haemoglobin level < 10 g/dl, emesis, dyspnoea, ≤ 36 of 52 points on the Functional Assessment for Cancer Therapy-Anemia (FACT-An) scale [36, 37]), or had gait abnormalities, known impairment of the lower limbs, severe graft versus host disease (GVHD) except for grade I not requiring treatment, painful joints, unstable osteolyses of the vertebrae, chronic low back pain, lesions of the central or peripheral nervous system, uncontrolled cardiovascular disease, thyroid disease, or diabetes.
Forty-nine patients were initially invited to participate in the study. Five of these patients (10.2%) were subsequently excluded due to low haemoglobin values and/or severe fatigue, 2 patients (4%) were excluded due to knee pain at the time of measurement, and 12 patients (24%) were not interested in participating. Of the remaining 30 patients, 14 had leukaemia treated with induction chemotherapy followed by peripheral blood stem-cell transplantation, 11 had non-Hodgkin lymphoma treated with high-dose chemotherapy alone (n = 10) or high-dose chemotherapy followed by autologous stem-cell transplantation (n = 1), and 4 had multiple myeloma/plasmacytoma treated with high-dose chemotherapy alone (n = 1) or high-dose chemotherapy followed by 2 cycles of autologous stem-cell transplantation (n = 3). All participating patients were in a physically stable condition and provided written informed consent. The ethics committee of the Canton of Zurich approved the study.
Blood values (haemoglobin in g/dl) were determined at the time of an outpatient visit to the hospital. Self-reported fatigue was measured with the German-language version of the FACT-An scale. The FACT-An scale includes 13 items relating to both the symptoms and consequences of cancer fatigue, and is highly reliable. Haemoglobin values and self-reported fatigue were assessed because both of these variables (i.e. low haemoglobin levels and high fatigue levels) can have adverse effects on physical performance over time. The patient's height was assessed to the nearest 0.5 cm with a wall-fixed tape measure. Weight was assessed to the nearest 0.5 kg with a weighing scale (SECA©, Model 791).
Isometric muscle strength assessment
The maximum voluntary peak torque (MVPT) was assessed with the CompuFet HHD. The CompuFet is a portable force evaluation and testing system (weight 0.45 kg), designed by Hoggan Health Industries Inc. (USA). The HHD offers a high or a low threshold for the minimal force at which recording starts. With the high threshold, recording of the test data begins at 13.6 Newton. The display shows peak force read-outs in 4.4 Newton increments. The CompuFet HHD has a test range from 3.6 to 440 Newton.
Standardization of the measurement protocol
The MVPT for knee extension was tested at a knee angle of 25 degrees. An angle of 25 degrees was selected to correspond to the knee angle at which the force production is of crucial importance in walking, as has been shown in biomechanical analyses of this activity [33, 40, 41]. Patients were positioned sitting upright, with no back support, and with the hips in 90 degrees flexion. The patient stabilized the trunk by grasping the table. The thigh of the patient was stabilized by the examiner's hand. Thus, the examiner ensured that sufficient counterforce was produced by the thigh, so that the lower limb could not pivot down during the break test with the knee near full extension. In this way, the examiner could ensure that the knee extension was really "broken". The joint angles were defined according to the American Academy of Orthopaedic Surgeons (AAOS) system. The HHD was positioned perpendicular to the tibia, at 80% of the shank length (between the marks at the lower edge of the 'lateral epicondylus' and the lower edge of the 'lateral malleolus'), distal to the knee. The knee joint centre and the 80% shank length were marked with a dot on the patient's skin. The position of the patient, the examiner, and the HHD were standardized (Figure 1).
The test was performed as a 'break test'. The break technique requires the examiner to overpower a maximal effort by the patient, thereby producing a measurement of eccentric muscle strength. The 'make technique', by contrast, requires the patient to exert a maximal isometric contraction while the examiner holds the dynamometer in a fixed position. Both methods produce strength measurements that have excellent reliability (ICCs of 0.90 or higher for both), although the break technique produces higher values. The patient's forced exertion was standardized according to Caldwell, with a build-up phase of 2 seconds and steady maximal force exertion over 3 seconds, after which the examiner breaks through the forced exertion of the patient [33, 45]. The patient was encouraged by means of standardized verbal instructions during the tests. The break test requires sufficient force from the examiner. For this study, MVPT for isometric extension strength measurements was expressed in Newton-meters (Nm). A concave interchangeable patch attachment for curved surfaces was used to avoid pain at the tibia during the assessment. The average of 3 peak torque measurements and the highest value of 3 peak torque measurements were used as outcomes. The knee extension score was estimated from the force signal, multiplied by the measured lever arm between the HHD device and the knee joint.
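As a rough illustration of the conversion described above, the following Python sketch multiplies dynamometer force readings by a lever arm to obtain torque, and then derives the two outcome measures (3-repetition average and highest of 3). The function names and the three force readings are hypothetical; the 0.34 m lever arm is only an example value, taken from the average 80% shank length mentioned in the Discussion.

```python
def torque_nm(force_newton, lever_arm_m):
    """Torque in Nm as force (N) times lever arm (m)."""
    return force_newton * lever_arm_m


def mvpt_outcomes(forces_newton, lever_arm_m):
    """Return (average, highest) torque over the repetitions of one session."""
    torques = [torque_nm(f, lever_arm_m) for f in forces_newton]
    average = sum(torques) / len(torques)
    highest = max(torques)
    return average, highest


# Hypothetical peak-force readings (N) from three repetitions of one patient.
average, highest = mvpt_outcomes([400.0, 420.0, 410.0], lever_arm_m=0.34)
print(f"average of 3: {average:.1f} Nm, highest of 3: {highest:.1f} Nm")
```

The lever arm would in practice be measured per patient, between the marked knee joint centre and the 80% shank-length mark.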
The test procedure started with a familiarization session of three knee extension repetitions of the dominant limb, which was defined as the preferred limb for kicking. The rest interval between the test repetitions was 30 seconds. The reliability study started with one examiner (intra-observer reliability) performing two measurement sessions of three repetitions each. Subsequently, the second examiner (inter-observer reliability) performed a third measurement session of three repetitions. Thus, a total of nine repetitions on the dominant limb were performed by each patient. The measurement sessions, including the training session, were separated by 60 minutes. After one hour, no real change in muscle strength in patients is expected, so any observed differences were expected to be due to measurement error. In addition, this interval between sessions is long enough to avoid muscular fatigue effects.
The reliability of the knee extension measurements was evaluated at the Institute of Physical Medicine at the University Hospital Zurich by two examiners, both female students (examiner 1: 80 kg, 1.64 m, and examiner 2: 53 kg, 1.62 m) from the Institute of Human Movement Sciences and Sport of the ETH, Zurich. As neither examiner had previous experience with manual muscle testing, they underwent training sessions to learn the requisite manual muscle testing skills. They practised the manual muscle testing skills on fellow students, and on 3 patients with haematological malignancies. The students practised the muscle strength measurements during 8 sessions of 1.5 hours each, totalling approximately 12 hours. During this training they were supervised by a senior physical therapist (RHK) with experience in manual muscle testing.
Normality of the data was tested with the Kolmogorov-Smirnov test. A two-way mixed model (ICC3.1 and ICC3.3) and a two-way random effect model (ICC2.1 and ICC2.3) were used for the intra- and inter-observer reliability estimation, respectively. An ICC > 0.75 was defined as acceptable reliability. The SEM was calculated from the average known standard deviation (SD) and the relative reliability coefficient (ICC) of the measurement used for our sample: SEM = SD × √(1 − ICC) [49, 50]. The corresponding 95% confidence interval (95%CI), in which the true score (drawn from the normally distributed population) is expected to fall, was ± 1.96 × SEM [33, 51, 52]. The broader the limits of the 95% confidence interval, the less confident the estimation of the true score and, as a consequence, the less confident the detection of real change due to intervention. This knowledge about the standard error of the measurement is necessary before one can say that a change has occurred [50, 53].
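To make the two ICC models more concrete, the sketch below computes single-measure coefficients from a two-way ANOVA decomposition, using the commonly cited Shrout and Fleiss forms: the consistency form for the two-way mixed model (written ICC3.1 above) and the absolute-agreement form for the two-way random model (ICC2.1). It is not the SPSS procedure used in the study; the function name and the ratings matrix are hypothetical, and the averaged forms (ICC3.3, ICC2.3) are not shown.

```python
def icc_single_measure(scores):
    """scores: one row per patient, one column per session or observer.

    Returns (icc_2_1, icc_3_1) from two-way ANOVA mean squares.
    """
    n = len(scores)        # number of patients (rows)
    k = len(scores[0])     # number of sessions or observers (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-patient mean square
    ms_cols = ss_cols / (k - 1)                 # between-observer mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    icc_3_1 = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return icc_2_1, icc_3_1


# Hypothetical ratings: 6 subjects (rows) scored by 4 raters (columns).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_2_1, icc_3_1 = icc_single_measure(ratings)
print(f"ICC(2,1) = {icc_2_1:.2f}, ICC(3,1) = {icc_3_1:.2f}")
```

In this illustrative data the raters differ systematically in level, which lowers the absolute-agreement coefficient much more than the consistency coefficient.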
Moreover, when analyzing a difference between two consecutive observations, one must consider the standard error of the observed score for both the first (SEM(first measurement session)) and the second (SEM(second measurement session)) observations. The SDD is defined as the measure of statistically significant change between two independently obtained measurements. Given a probability value of α = 0.05 as indication for statistical significance, the SDD is estimated as 1.96 × √(SEM(first strength assessment)² + SEM(second strength assessment)²). Assuming that the standard error of the measurement of the observed score of the first and second observations are equal, the SDD is 1.96 × √2 × SEM. For a statistically significant change between two separate observations to be detected, this change must be at least the SDD of the measurement procedure. The SEM and SDDs were expressed as absolute values and in relative values as % of the grand mean.
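The SEM and SDD formulas given in this section can be sketched in a few lines of Python. The function names are hypothetical. As a plausibility check, the session 1 SD of 25.39 Nm and the ICC of 0.94 reported in the Discussion give an SEM close to the reported 6.22 Nm; whether the authors used exactly this SD is an assumption. The grand mean in the last call is a made-up value.

```python
import math


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def sdd(sem_first, sem_second=None):
    """Smallest detectable difference at alpha = 0.05.

    With one argument, both sessions are assumed to share the same SEM,
    which reduces to 1.96 x sqrt(2) x SEM.
    """
    if sem_second is None:
        sem_second = sem_first
    return 1.96 * math.sqrt(sem_first ** 2 + sem_second ** 2)


def percent_of_grand_mean(value, grand_mean):
    """Express an absolute SEM or SDD as a percentage of the grand mean."""
    return 100.0 * value / grand_mean


sem_value = sem(sd=25.39, icc=0.94)   # about 6.22 Nm
sdd_value = sdd(6.22)                 # about 17.2 Nm
print(f"SEM = {sem_value:.2f} Nm, SDD = {sdd_value:.2f} Nm")
print(f"SDD as % of a hypothetical grand mean of 150 Nm: "
      f"{percent_of_grand_mean(sdd_value, 150.0):.1f}%")
```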
The limits of agreement (LA) were calculated from the difference-against-mean plot (LA = mean difference ± 1.96 × SD of the differences), as proposed by Bland and Altman. The Bland and Altman plots graphically display between-measurement differences, thus allowing direct insight into the variability of the measurement under study.
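A minimal sketch of the Bland and Altman limits of agreement follows, assuming paired torque values from two sessions and the usual definition (mean of the paired differences ± 1.96 times their sample SD). The function name and the torque values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)          # mean difference between sessions
    spread = stdev(diffs)       # sample SD of the differences
    return bias - 1.96 * spread, bias, bias + 1.96 * spread


# Hypothetical torque values (Nm) for four patients in two sessions.
session_1 = [100.0, 110.0, 120.0, 130.0]
session_2 = [98.0, 112.0, 119.0, 131.0]
lower, bias, upper = limits_of_agreement(session_1, session_2)
print(f"bias = {bias:.2f} Nm, limits of agreement = {lower:.2f} to {upper:.2f} Nm")
```

In a Bland and Altman plot, each difference is drawn against the mean of the two paired values, with horizontal lines at the bias and at the two limits.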
A repeated measures ANOVA was carried out to test for learning effects within the three MVPT strength measurements. The differences between means of the intra-observer and inter-observer measurements (p < 0.05) were calculated with a paired t-test. Sociodemographic differences between patients included and excluded from the study were calculated by means of a Student's t-test. All statistical analyses were performed using SPSS® 15 for Windows (SPSS, Inc.).
The HHD assessments were tolerated by all 30 patients. Of these 30 patients, 6 (1 woman and 5 men) were excluded from the analyses because they did not perform the knee extension measurements according to the standardized procedures and because they exceeded the torque limit of 218 Nm. These 6 patients were significantly younger, taller and heavier (p < 0.05) than the 24 patients included in the analysis (Table 1).
For the remaining 24 patients, all results of the muscle strength measurements and the difference in muscle strength measurements between intra-session 1, intra-session 2, and the inter-session were normally distributed. The relative reliability of the HHD, expressed as ICCs with 95%CIs, was acceptable, with ICCs ranging from 0.77 to 0.94 across the inter-observer and intra-observer measurement sessions (Tables 2 and 3).
The absolute reliability statistics (the SEM, the SDD, and the relative values of the SEM and SDD as % of the grand mean) are presented in Tables 2 and 3. The 95% limits of agreement according to the method of Bland and Altman are presented in Figures 2, 3, 4, 5. The ANOVA for repeated measures yielded no significant changes (p > 0.05) between the three MVPT strength measurements, indicating that there were no learning effects from the first to the third measurements in the intra- and inter-observer sessions. There were no significant differences in muscle strength between the intra- and inter-observer sessions for the average of 3 MVPT measurements (Tables 4 and 5) or for the highest value of 3 MVPT measurements (Tables 6 and 7).
This study evaluated the relative and absolute reliability of a strength assessment protocol using an HHD among a sample of haematological cancer patients recovering from high-dose treatment. We used the ICC (with accompanying 95%CI) to estimate relative reliability. Relative reliability is highly dependent on the variability observed in the patient sample, and relates to the ability to classify patients' strength measurements in the same rank. Thus, relative reliability is most relevant for assessing instruments that are to be used for discriminative purposes. Guyatt et al. demonstrated that discriminative instruments require a high level of relative reliability. That is, the measurement error should be small in comparison to the variability between the observers. In other words, if the difference between the observers is large, a certain amount of measurement error is acceptable [23, 24].
However, if the aim is to measure change in health status, which is often the case in clinical practice, absolute reliability is more relevant [23, 24]. Absolute reliability describes the agreement between repeated measurements and is concerned with measurement error [23, 24]. For an evaluative instrument, it is not the variability between the observers that is of primary concern, but rather measurement error [23, 24]. The measurement error should be smaller than the changes that the observer wishes to detect [23, 58]. We calculated the SEM, the SDD and the limits of agreement to estimate absolute reliability.
To be of practical use, the results should be interpreted as follows: the intra-observer average of 3 MVPT 'knee strength' assessments provided acceptable relative reliability (ICC3.3 = 0.94). The reliability of this parameter is affected by the variance of the assessments from 'intra session 1', which was 644.65 Nm² (calculated as the square of the standard deviation [25.39 Nm]), and of the assessments from 'intra session 2', which was 847.39 Nm² (SD 29.11 Nm) (see the distribution from Bland and Altman in Figure 2). When taking the measurement error into account, a difference equal to or greater than the SDD of 17.23 Nm between two measurements should be used as the threshold for a true clinical change in knee extension. The results of the other examination models in this study, namely the inter-observer reliability for the average of 3 MVPT measurements (ICC2.3), the intra-observer reliability for the highest value of 3 MVPT measurements (ICC3.1), and the inter-observer reliability for the highest value of 3 MVPT measurements (ICC2.1), should be interpreted in the same way (see Tables 2 and 3, and Figures 3, 4, 5). Thus, when evaluating knee strength measurements (e.g. after a muscle strength program), it is recommended to use the 3-repetition average strength measurement by one or more examiners.
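The interpretation rule described above can be written as a small decision helper. This is only a sketch: the function name, the category labels, and the example changes are hypothetical, while the default SEM (6.22 Nm) and SDD (17.23 Nm) are the intra-observer values reported for the average of 3 MVPT.

```python
def classify_change(change_nm, sem_nm=6.22, sdd_nm=17.23):
    """Classify a retest difference against the SEM and the SDD."""
    magnitude = abs(change_nm)
    if magnitude >= sdd_nm:
        return "real change"          # at or beyond the SDD threshold
    if magnitude < sem_nm:
        return "measurement noise"    # smaller than the SEM
    return "uncertain"                # between SEM and SDD


# Hypothetical pre/post differences (Nm) for three patients.
for change in (4.0, 12.0, 21.5):
    print(f"{change:5.1f} Nm -> {classify_change(change)}")
```

For measurements taken by different examiners, the inter-observer SEM and SDD would be passed in instead of the defaults.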
We performed intra- and inter-observer re-test measurements on the same day. However, no learning effect was found in the present study between the first and the third strength measurement. This is probably due to the familiarization session. Although the highest value is probably a more valid measurement for assessing muscle strength (even though it is less reliable), the average of three MVPT strength measurements can be used in determining whether a result is a real change or is within the range of measurement error.
The protocol used for assessing isometric knee strength in this study had acceptable re-test reliability, as evidenced by ICCs equal to or greater than 0.75. The ICCs in the current study are similar to test-retest reliability coefficients reported in other, related studies [16, 19, 25–31].
The measurement error of HHD for knee extension strength in haematological patients can be compared to that observed in other studies. In a study in orthopaedic knee patients, the intra-observer assessment of the SDD was 21.5 Nm for the single value, and 13.8 Nm for the average value. For inter-observer assessment, the SDD was 28.2 Nm for the single value and 18.7 Nm for the average value. However, one should keep in mind that the authors used the 'make' method to assess knee extension strength.
To compare our absolute reliability results for knee extension strength with those observed in COPD patients, we estimated the SEM from their results. The SEM was estimated from the ICC and the total variance, using the formula SEM = SD × √(1 − ICC). An SDD (= SEM × 1.96 × √2) of approximately 49 Nm for knee extension was calculated from their study results (ICC 0.87, SD 14.5, strength value originally expressed in kg, converted to Nm and corrected to an average lever arm of 34 cm, which was the average 80% shank length of the included and excluded participants in our study; see Table 2). An important difference from our measurement protocol was that the measurements in that study were performed with the knee in 90 degrees of flexion.
From the study of Taylor et al. among patients with cerebral palsy, we were able to calculate an SDD of approximately 43 Nm (ICC 0.81, SD 10.7, strength value originally expressed in kg, converted to N and corrected to an average lever arm of 34 cm for Nm).
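The conversion behind these two comparison values can be sketched as follows. The sketch assumes that the reported SDs (14.5 and 10.7) are in kilograms-force, that 1 kg corresponds to 9.81 N, and that the 0.34 m lever arm applies to both studies; the function name is hypothetical. Under these assumptions the results land close to the approximate SDDs quoted above, and small deviations may reflect rounding or a slightly different conversion factor.

```python
import math

G = 9.81  # assumed conversion factor from kilograms-force to newtons


def sdd_from_kg(sd_kg, icc, lever_arm_m=0.34):
    """Estimate an SDD in Nm from an SD reported in kg and an ICC."""
    sd_nm = sd_kg * G * lever_arm_m           # kg -> N -> Nm
    sem_nm = sd_nm * math.sqrt(1.0 - icc)     # SEM = SD x sqrt(1 - ICC)
    return 1.96 * math.sqrt(2.0) * sem_nm     # SDD = 1.96 x sqrt(2) x SEM


sdd_copd = sdd_from_kg(sd_kg=14.5, icc=0.87)
sdd_cp = sdd_from_kg(sd_kg=10.7, icc=0.81)
print(f"COPD study: SDD of roughly {sdd_copd:.1f} Nm")
print(f"Cerebral palsy study: SDD of roughly {sdd_cp:.1f} Nm")
```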
Excellent SEMs in knee arthroplasty patients were described by Gagnon et al. The average SEM from 3 trials was 1.84 Nm (SDD 5.10 Nm). However, in this latter study a chair-fixed device was used, so the results are not fully comparable with those of hand-held dynamometry. In contrast to chair-fixed dynamometry, the reliability of strength measurements in HHD is influenced by the experience of the examiners, the amount of strength that examiners are able to resist, and the standardization of measurements.
Currently, there is no criterion for the SDD of hand-held dynamometry. Therefore, the SDD in knee extension strength was compared to studies that obtained quadriceps strength measures after a resistive strength exercise program. A relatively small improvement of 18 Nm (95%CI 7–30 Nm, GM 144, SD 45 Nm) was found in patients with COPD. Conversely, we estimated a mean change of 29.92 Nm (95%CI 24 Nm to 35 Nm) from the results of a study of breast cancer patients. Although muscle strength in this study was assessed with an eight repetition maximum, which is not fully comparable to HHD, the findings indicated that cancer patients may benefit from muscle strength training during chemotherapy treatment. Taken together, if obtained by the same observer, the SDD threshold of 17 Nm (see Table 2) that corresponds to the average of 3 MVPT strength measurements will probably be surpassed.
For the average inter-examiner MVPT measurements with the HHD, it is questionable whether the threshold of 26 Nm (see Table 2) will be surpassed in all haematological patients after a resistive strength training program. This is probably the case only in patients who recover steadily from the side effects of the medical treatment, and who are good responders to resistive strength training.
Several limitations of the current study should be mentioned. First, the resultant moment at the knee joint and the moment by the dynamometer are different. When measuring isometric strength, one should keep in mind that the differences between the measured and the resultant joint moments might influence the estimation of muscle torque parameters. Although the test protocol can be standardized to a reasonable degree, the deformation of the soft tissue of the leg, especially at the thigh, where the muscle mass is considerable, plays an important role in changing the alignment of the HHD axis of rotation and the axis of the knee joint. Therefore, future studies need to examine the 'real' joint angles of hand-held dynamometry measurements.
Second, the measurements in this study were performed by female examiners without prior experience in muscle strength assessment with HHD. This may have influenced the upper boundary of the muscle strength assessments. Knee extension strength measurements performed by stronger examiners with experience in hand-held dynamometry may result in measurement values that are higher than 218 Nm. Moreover, the use of an isokinetic dynamometer has been recommended if the muscle strength of the patients exceeds the strength of the examiners. In several studies, isokinetic dynamometers yielded reproducible measurements with low measurement error [21, 61–63]. However, isokinetic dynamometers also have several disadvantages. They require a good deal of space, and are costly, hampering their widespread use in clinical settings. The reliability of a HHD measurement may depend on the strength and the body mass of the examiner. The female examiners in this study were of varying weight. Examiner 2 achieved the highest (mean) MVPT measurements.
Third, the point in time at which the assessments took place varied considerably (see Table 1), and therefore some patients may have had the possibility to recover more from the side-effects of high-dose chemotherapy than others. This may have influenced the inter-subject variability, which in turn increases relative reliability (ICCs). However, this inter-subject variability does not affect absolute reliability (SEM, SDD). It is also possible that the patients in our study were healthier than other haematological cancer patients at the same stage of recovery. The primary reason that 12 patients did not participate was that they felt too fatigued or too weak to do so.
Fourth, although we could not detect a learning effect between the MVPT measurements, one should keep in mind that the results of this reliability study are based on an intra-day reliability assessment. A more complete picture of the reliability would require a between-day reliability study to allow the corresponding variations to affect (or not) the measures. Learning effects for strength measurements can potentially be of more concern for between-day than for within-day measurements [64, 65]. In addition, if truly maximal exertions of muscle strength are desired, visual feedback should be employed during the measurements. A factor that may also influence the reliability of strength measurements is the circadian rhythm. A time-of-day effect for leg and back strength measurements was reported in one study in which maximum strength values increased consistently during daytime. Gauthier et al. reported similar findings for elbow flexion torque and body temperature, which varied concomitantly during the day. One should keep in mind that circadian rhythm disruption is hypothesized as a mechanism underlying fatigue in cancer patients. Fatigue is one of the most prevalent symptoms that cancer patients experience and it has a considerable effect on physical performance. Therefore, fatigue may also influence the reliability of the measurements in cancer patients.
Fifth, at the end-phase of the training period, the upper limit for the examiners' torque was fixed at 218 Nm, because the weakest examiner was able to break through the knee extension movement of the 3 pre-test patients at 218 Nm, but not higher. Thus, only haematological cancer patients with knee extension measurements lower than this value were included in the analysis.
Finally, this study had a relatively small sample size. Although the sample size was adequate for studies of this nature, a larger study might narrow the confidence intervals around the reliability coefficients (without necessarily affecting the reliability estimates themselves).
Clinical implications for the use of a HHD in patients with haematological malignancies
In this reliability study both participating assessors were students of the Institute of Human Movement Sciences and Sport. They underwent training sessions to learn the requisite manual muscle testing skills during 8 sessions of 1.5 hours each. The data for the average intra-examiner MVPT measurements in 24 patients with haematological malignancies yielded acceptable results for relative (ICC 0.94) and absolute reliability (SDD 17 Nm).
The conflicting finding on inter-examiner reliability, where the experience of the assessing examiners seemingly plays an important role, has important clinical implications. If more than one examiner is to evaluate the muscle strength of a haematological patient, then it is important that all examiners concerned apply the tests reliably and consistently. If this cannot be achieved, then the resulting data will be of little use in a clinical setting. Clinicians specialized in the treatment of chronic diseases, and with comparable levels of practical experience with an HHD can, however, use the average MVPT value for intra-examiner measurements in their everyday practice with confidence. The HHD may be used in patients with haematological malignancies who have recovered from the direct side-effects of their medical treatment and who are in a stable physical condition to: 1) compare muscle strength with normative reference values (e.g. for discriminative purpose); or 2) evaluate the effect of a resistive exercise training in an individual patient (e.g. measure change in health status over time).
The results of this study indicate that there is acceptable relative reliability for evaluating knee strength with an HHD, while the observed measurement error is modest. The HHD may be useful in detecting changes in knee extension strength at the individual patient level.
COPD: Chronic obstructive pulmonary disease
FACT-An: Functional Assessment of Cancer Therapy-Anaemia
GvHD: Graft versus host disease
ICC: Intraclass correlation coefficient
LOA: Limits of agreement
MVPT: Maximum voluntary peak torque
SDD: Smallest detectable difference
SEM: Standard error of measurement
95% CI: 95% Confidence interval.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2474/10/31/prepub
We thank Pamela Hofer, MSc and Michèle Hubli, MSc for performing the strength measurements.
Thank you to Leanne Pobjoy for her assistance in preparing the manuscript.
We also thank the physicians from the Departments of Oncology and Haematology, University Hospital Zurich for referring their patients, and especially the patients for their willingness to take part in the study.
The authors declare that they have no competing interests.
RHK is the guarantor of the study. He designed the study and was the main writer of the manuscript. GA and EDB designed and wrote the study, and critically revised the study for its content. DU initiated and monitored the study. NKA supervised and critically revised the study for its content. All authors read and approved the final manuscript.
Knols, R.H., Aufdemkampe, G., de Bruin, E.D. et al. Hand-held dynamometry in patients with haematological malignancies: Measurement error in the clinical assessment of knee extension strength. BMC Musculoskelet Disord 10, 31 (2009). https://doi.org/10.1186/1471-2474-10-31
- Muscle Strength
- Knee Extension
- Relative Reliability
- Knee Extension Strength
- Small Detectable Difference