The responsiveness of sensibility and strength tests in patients undergoing carpal tunnel decompression

Background Several clinical measures of sensory and motor function are used alongside patient-rated questionnaires to assess outcomes of carpal tunnel decompression. However, there is a lack of evidence regarding which clinical tests are most responsive to clinically important change over time. Methods In a prospective cohort study 63 patients undergoing carpal tunnel decompression were assessed using standardised clinician-derived and patient-reported outcomes before surgery and at 4 and 8 months follow-up. Clinical sensory assessments included: touch threshold with monofilaments (WEST), shape-texture identification (STI™ test), static two-point discrimination (Mackinnon-Dellon Disk-Criminator) and the locognosia test. Motor assessments included: grip and tripod pinch strength using a digital grip analyser (MIE), and manual muscle testing of abductor pollicis brevis and opponens pollicis using the Rotterdam Intrinsic Handheld Myometer (RIHM). The Boston Carpal Tunnel Questionnaire (BCTQ) was used as a patient-rated outcome measure. Results Relative responsiveness at 4 months was highest for the BCTQ symptom severity scale with moderate to large effect sizes (ES = -1.43) followed by the BCTQ function scale (ES = -0.71). The WEST and STI™ were the most responsive sensory tests at 4 months showing moderate effect sizes (WEST ES = 0.55, STI ES = 0.52). Grip and pinch strength had a relatively higher responsiveness compared to thenar muscle strength but effect sizes for all motor tests were very small (ES ≤0.10) or negative, indicating a decline compared to baseline in some patients. Conclusions For clinical assessment of sensibility, touch threshold assessed by monofilaments (WEST) and tactile gnosis measured with the STI™ test are the most responsive tests and are recommended for future studies.
The use of handheld myometry (RIHM) for manual muscle testing, despite more specifically targeting thenar muscles, was less responsive than grip or tripod pinch testing using the digital grip analyser (MIE). When assessing power and pinch strength the effect of other concomitant conditions such as degenerative joint disease on strength needs to be considered.


Background
Carpal tunnel syndrome (CTS) is an entrapment neuropathy of the median nerve at the wrist causing numbness, tingling and pain in the palm, thumb, index and middle fingers as well as weakness of the thenar muscles. CTS is an important contributor to impaired hand function, work disability and increased dependence in adults [1]. It is estimated that one third of patients diagnosed with CTS will require surgery to decompress the carpal tunnel [2]. Surgical intervention has been found to be effective in reducing disability and improving health-related quality of life [3,4] and is cost-effective [5] in patients with moderate to severe pathology. Several systematic reviews on the effectiveness of surgical [6,7] and non-surgical treatments [8,9] have been published. However, the wide range of outcomes assessed has impeded the pooling of results for meta-analysis [10] and there is a lack of consensus on which tests should be used in trials of treatment effectiveness for CTS. In a qualitative study on what patients consider important criteria for judging the success of surgery [11] the following outcomes were identified: symptom resolution, specifically relief of hand numbness, hand pain and nocturnal waking, improved muscle function, return to work and resumption of everyday activities. A range of patient-rated outcome measures and clinically derived assessments are available to measure these domains and a combination of both patient-reported and clinician-derived outcomes is advocated, ensuring that patients' perspectives on their health status are included in evaluations of effectiveness.
The Boston Carpal Tunnel Questionnaire (BCTQ) [12] is a disease-specific, patient-reported questionnaire which has extensive research underpinning its validity, reliability and responsiveness in patients with CTS [13]. It has been widely used as a primary outcome measure in trials and has shown superior responsiveness compared to other region-specific or generic patient-rated questionnaires such as the DASH or SF-36 [14,15]. However the answer to the question of which physical measures of hand sensibility and hand strength should be used in clinical trials in CTS has continued to elude us. As a result clinicians and researchers continue to use different tests to assess sensibility and strength making comparison across centres difficult. Moreover the use of multiple tests of sensory and motor function may unnecessarily duplicate information whilst increasing assessor burden.
Several tests of sensibility are available and studies on their validity, reliability and responsiveness in peripheral nerve injuries have been published [16], but little is known about their relative responsiveness in patients undergoing surgery for carpal tunnel syndrome. A systematic review of the outcomes assessed in 28 randomised controlled trials comparing open with endoscopic carpal tunnel release [17] found that sensory function was assessed by a range of clinical performance tests in 15/28 studies and motor function in 24/28 studies. A subsequent statistical review of those trials assessing grip, pinch and manual muscle strength [18] found that a wide range of strength tests were used. Although grip strength assessed by hydraulic dynamometry was most commonly used it was not the most responsive indicator of change. This may in part be explained by the fact that power grip does not specifically target the thenar muscles and may also be affected in the short-term by pain and tenderness over the surgical scar. The authors recommended further investigation into different methods of assessing power, pinch grip and manual muscle testing including the use of handheld myometry for manual muscle testing of thenar muscles and over a longer follow-up period than 12 weeks. This prospective observational study was designed to investigate the relative responsiveness of several clinically derived tests of motor and sensory function in patients undergoing carpal tunnel decompression over a follow-up period of 8 months after surgery. The tests which best capture change can then be recommended as part of a core set of outcome measures for use in clinical practice and future trials.

Methods
We conducted a longitudinal cohort study using repeated measures of clinician-derived and patient-reported outcomes. The study received full approval from the Norfolk Research Ethics Committee (Ref 09/H0310/2) and the local NHS Trust Research and Development Department. All participants gave written informed consent.
Patients were recruited from a single centre, the Norfolk and Norwich University Hospital, covering the operating lists of four consultant orthopaedic or plastic surgeons. Patients were identified from the Day Procedure Unit's surgical waiting lists, which are made up of patients who have been referred either directly by their general practitioner or seen by an orthopaedic or plastic surgeon and listed for carpal tunnel decompression. CTS was diagnosed by signs, symptoms and clinical history, and nerve conduction studies (NCS) were only carried out in those patients in whom the diagnosis by clinical presentation was uncertain. As not all patients undergo NCS, and due to ethical constraints on accessing such data, it was not possible to classify CTS severity.
Administrative staff at the Day Procedure Unit were asked to mail participant information sheets, consent forms and pre-assessment questionnaires with a self-addressed envelope to all patients listed for carpal tunnel surgery between April 2009 and April 2010. Patients were given 1 week in which to decide whether to take part and return their signed consent form and screening questionnaire by mail. Inclusion criteria were a confirmed diagnosis of CTS through a clear clinical history with or without neurophysiological examination, listed for surgical decompression with a date of surgery at least 2 weeks away, aged 18 or over and able to give fully informed consent.
Patients returning a signed consent form were invited to attend for their presurgical assessment at the Clinical Trials Unit at the University of East Anglia.
Clinician-derived measures of sensory and motor function which have been standardised on populations with peripheral nerve trauma and/or compression were used. Clinical assessments were carried out by two qualified occupational therapists specialising in hand therapy and experienced in their use. A standardised protocol was followed for each test. The order of testing was randomised to control for possible order effects. Four sensory function tests were selected based on a review of evidence regarding their validity and reliability in peripheral nerve injuries (Jerosch-Herold 2003).
1) Touch threshold was measured using the Weinstein Enhanced Sensory Test (WEST) (Bioinstruments, Connecticut, USA) which has been demonstrated to have high validity and excellent inter- and intra-tester reliability [19]. The WEST monofilaments have improved tip geometry reducing slippage [20] and are supplied with guaranteed calibration. An ascending method of threshold testing was used starting with the lightest filament and randomly interspersing 3 stimuli with 2 'shams'. Detection of at least 1 out of 3 stimuli was used to determine the lowest threshold and recorded on an ordinal scale as follows: 0.07 gm = 4; 0.2 gm = 3; 2.0 gm = 2; 4.0 gm = 1; 200 gm = 0. The tips of the thumb and index finger were tested and a mean score calculated.
2) Static two-point discrimination (2PD) was measured using the Dellon-Mackinnon Disk-Criminator™ (AliMed, MA, USA) following Moberg's protocol [21]. Starting with a calliper distance of 5 mm, a few random applications of 1 or 2 points were used to determine whether patients could discriminate correctly, and the distance was then increased or decreased depending on responses. The final threshold was determined as the smallest distance at which at least 7 out of 10 applications were correctly identified. The tips of the thumb and index finger were tested and a mean value calculated.
3) Locognosia was assessed using a standardised area localisation test [22]. Using a hand map in which the fingertips are divided into four quadrants and consecutively numbered, patients were asked to identify the exact quadrant in which they felt a stimulus using the heaviest monofilament on the WEST (200 gms). Each zone is stimulated twice in a pre-randomised order and 1 point is given for each correctly identified digit and quadrant, respectively. Only the median nerve innervated area was tested with a maximum possible score of 56 points.
4) Tactile gnosis was assessed using the Shape Texture Identification (STI™) Test (Ossur, Sweden) according to a standardised protocol [23]. Using three shapes and three textures of decreasing size fixed onto disks, patients are required to use their index fingertip to correctly identify each shape and texture. A maximum of 6 points can be scored.
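The ordinal scoring of the touch threshold test described in 1) can be illustrated with a minimal sketch. It assumes that the recorded score is that of the lightest filament for which at least 1 of the 3 stimuli was detected, and that the thumb and index scores are averaged. The function names, the input layout and the handling of a patient who detects no filament at all are assumptions made for illustration and are not taken from the study protocol.

```python
# Filament forces (in grams) in ascending order, paired with the ordinal
# score quoted in the text: 0.07 gm = 4 ... 200 gm = 0.
WEST_SCALE = [(0.07, 4), (0.2, 3), (2.0, 2), (4.0, 1), (200.0, 0)]


def west_score(detections):
    """Return the ordinal score for one fingertip.

    `detections` maps a filament force to the number of the 3 real
    stimuli that the patient detected with that filament (hypothetical
    input layout). Filaments are checked from lightest to heaviest.
    """
    for force, score in WEST_SCALE:
        if detections.get(force, 0) >= 1:  # at least 1 of 3 stimuli felt
            return score
    # Assumption: no filament detected is scored 0, like the heaviest one.
    return 0


def west_mean_score(thumb, index):
    """Mean of the thumb and index fingertip scores."""
    return (west_score(thumb) + west_score(index)) / 2


# Hypothetical patient: thumb first detects 0.2 gm, index first detects 4.0 gm.
thumb = {0.07: 0, 0.2: 2}
index = {0.07: 0, 0.2: 0, 2.0: 0, 4.0: 1}
print(west_mean_score(thumb, index))
```

A higher mean score corresponds to a lighter detected filament, so an increase over time indicates improvement.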
The choice of motor tests was based on a systematic review of clinical trials and statistical review of responsiveness previously undertaken [18].
Functional grip and tip pinch strength were assessed using the MIE digital grip analyser (MIE Medical Research Ltd, Leeds, UK). This instrument has lightweight padded handles attached to a strain gauge tension dynamometer and was chosen for its ability to register even very weak grip and for its greater handle comfort compared to the hydraulic dynamometer, which requires a visual reading from a scale and has been shown to have flooring effects [24]. A standardised protocol was used to measure power grip and tip pinch using the positioning recommended by the Clinical Assessment Recommendations of the American Society of Hand Therapists [25]. The mean of three trials was recorded.
Individual muscle testing of Opponens Pollicis (OP) and Abductor Pollicis Brevis (APB) was performed using the Rotterdam Intrinsic Handheld Myometer (RIHM) (Erasmus Medical Centre, Rotterdam). The RIHM is a digital handheld myometer which allows individual muscle strength to be measured and quantified in Newtons of force. It has not been used with CTS patients; however, it is validated for use in peripheral nerve injuries [26,27] and has been shown to have excellent test-retest reliability. It is much more sensitive to small changes compared to the ordinal Oxford scale for manual muscle testing but a grade 3 or more must be achieved on the Oxford scale in order to complete the RIHM testing. A standardised protocol was followed and the average of three trials was recorded. Prior to testing motor function all patients were asked whether they had pain on gripping or pinching at the base of their thumb and this was recorded.
Data on clinical presentation of CTS were collected using the clinical history questionnaire developed by Bland [28] which was incorporated into a screening questionnaire sent as part of the initial invitation. It has been shown to have an overall sensitivity of 79% when compared to nerve conduction results as gold standard [28].
Patient-rated symptom severity and functional status were assessed using the disease-specific Boston Carpal Tunnel Questionnaire [29]. It is made up of two scales: the symptom severity scale (SSS) which contains 11 questions and the functional status scale (FSS) which has 8 questions. Patients were asked to rate the severity or difficulty from 1 to 5 and an average total score was calculated for each subscale, where a higher score indicates worse symptoms and function, respectively. Patients received the questionnaire by mail and were asked to bring it completed to their baseline assessment.
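As an illustration of the subscale scoring described above, the following minimal sketch averages the item ratings of one subscale. The function name, the validation rules and the example responses are hypothetical; the handling of missing items is not described in the text and is therefore not modelled here.

```python
SSS_ITEMS = 11  # symptom severity scale, as stated in the text
FSS_ITEMS = 8   # functional status scale, as stated in the text


def bctq_subscale_score(responses, n_items):
    """Average of the item ratings (each from 1 to 5) for one subscale.

    A higher score indicates worse symptoms or function.
    """
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("each rating must lie between 1 and 5")
    return sum(responses) / n_items


# Hypothetical responses for one patient.
sss = bctq_subscale_score([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 3], SSS_ITEMS)
fss = bctq_subscale_score([2, 2, 2, 2, 2, 2, 2, 2], FSS_ITEMS)
print(sss, fss)
```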
The same battery of tests was used for all follow-up assessments. Additionally we used a subjective global rating scale at the 4 and 8 month follow-up assessments. Participants were asked, prior to completing their objective assessments, whether they felt that overall their hand had improved, stayed the same or deteriorated since surgery. This was used as an external criterion to determine clinically important change and test for differences between those improved and those who remained the same or got worse. Such anchor-based approaches are considered important in externally validating whether the change observed in outcome measures relates to the patient's perceived improvement and is of clinical importance [30].

Statistical methods
Descriptive statistics were used to summarise the characteristics of the cohort and the outcomes at baseline, 4 and 8 months. Responsiveness for each outcome variable over time was quantified using the Effect Size (mean change divided by the baseline standard deviation) and the Standardised Response Mean (mean change divided by the standard deviation of change) (SRM). An effect size of <0.3 is considered small, 0.5 is moderate and >0.8 large [31].
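The two responsiveness statistics defined above can be sketched as follows. This is a minimal illustration that assumes paired, complete baseline and follow-up scores and the sample standard deviation (n - 1 denominator); neither detail is specified in the text, and the data are hypothetical.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Follow-up minus baseline for each patient (paired data)."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must be paired")
    return [f - b for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """ES: mean change divided by the baseline standard deviation."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardised_response_mean(baseline, follow_up):
    """SRM: mean change divided by the standard deviation of change."""
    change = change_scores(baseline, follow_up)
    return mean(change) / stdev(change)


# Hypothetical scores for five patients.
baseline = [2, 3, 4, 5, 6]
follow_up = [3, 5, 4, 7, 6]
print(round(effect_size(baseline, follow_up), 3))
print(round(standardised_response_mean(baseline, follow_up), 3))
```

The sign of both statistics follows the direction of change, so a measure on which lower values are better yields negative ES and SRM when patients improve.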
The change in outcome was also considered by Patient Global Assessment. A two-sample t-test (with 95% confidence interval) was used to test for a mean difference in change between those reporting an improvement and those not.
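The group comparison can be sketched as below. The text does not state whether equal variances were assumed, so this sketch uses the Welch (unequal variance) form as an assumption and returns only the mean difference, the t statistic and the degrees of freedom. A p-value and confidence interval additionally require the t distribution, for example from a statistics package. The change scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(improved, not_improved):
    """Welch two-sample t statistic for a difference in mean change.

    Returns (mean difference, t statistic, degrees of freedom).
    """
    n1, n2 = len(improved), len(not_improved)
    v1, v2 = variance(improved) / n1, variance(not_improved) / n2
    diff = mean(improved) - mean(not_improved)
    t = diff / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, t, df


# Hypothetical change scores in two groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
diff, t, df = welch_t(group_a, group_b)
print(diff, round(t, 3), round(df, 2))
```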

Results
Between April 2009 and April 2010 a total of 267 patients listed for carpal tunnel decompression were invited to participate in the study. 81 (30.3%) patients consented of whom 67 attended a baseline assessment. Four patients subsequently postponed or cancelled their surgery thus becoming ineligible for the study and these were excluded from the baseline analysis (see Figure 1). Table 1 summarises the sociodemographic and clinical characteristics of the 63 patients assessed at baseline. There was an almost equal distribution of gender and the mean age of the sample was 60.4 years. 38 patients had idiopathic CTS.
Summary statistics for the clinical sensory and motor tests and self-reported symptom and function scale at baseline, 4 months and 8 months follow-up are presented in Table 2. The proportion of patients scoring normal and below normal results for each of the 4 sensory tests and at each timepoint is presented in Figure 2. Two-point discrimination was normal in more than 70% of patients pre-operatively whereas touch threshold, locognosia and tactile gnosis were the tests showing the largest proportion of patients with pre-operative sensory deficits. By 8 months tactile gnosis was still below normal for 30% of patients. Table 3 presents the relative responsiveness of the four sensory and four motor tests and the symptom and function subscale of the BCTQ. ES and SRM for 2PD were negative as a reduction in threshold, measured in mm, indicates an improvement. For abduction and opposition measured by RIHM a negative ES or SRM indicates a decline in strength.
Moderate effect sizes were observed for the WEST and STI tests at 4 and 8 months. All four motor tests had either small or negative effect sizes. The BCTQ SSS was the most responsive test with a large ES, followed by the FSS, over a 4-month follow-up. Table 4 presents the change from baseline to 4 and 8 months for those patients who had self-reported an improvement and those who had remained the same or become worse. Ten patients considered that they had either not improved or got worse by 4 months and two patients went on to have revision surgery. There were statistically significant differences between the improved and not improved groups at 4 months in the BCTQ SSS and FSS. A highly significant difference was also seen between these groups in grip at 4 months, with the not improved group showing a mean change in strength of -17.33 Newtons from baseline. No significant differences were observed between improved and not improved for any outcomes by 8 months.

Discussion
This study is the first to present a comparison of the relative responsiveness of several clinical sensory and motor tests in a cohort of patients who have undergone open carpal tunnel decompression and over an 8 month follow-up period.
Responsiveness statistics such as ES or SRM give a standardised score which is unit free and allows comparison between different measurement scales. ES and SRM can be interpreted using Cohen's criteria [31] whereby the larger the effect size the greater the change or response to treatment. However, caution is needed in interpreting these values. Both ES and SRM use a standard deviation as denominator, hence large variance in a study sample leads to a larger measure of dispersion (standard deviation) and can therefore result in small effect sizes even when the change is clinically important. Both ES and SRM are also dependent on the intervention and on how much change is expected in a patient's health status. Responsiveness statistics are therefore context-specific and there are no agreed criteria for what is a responsive measure [32]. Comparing the responsiveness between tests in the same group of patients all undergoing surgery for CTS is thus more appropriate than comparing between studies, as it is based on the interpretation of the relative magnitude of ES or SRM rather than the absolute values.
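The dependence of the effect size on sample variance can be made concrete with a small worked example. The numbers are hypothetical and chosen only to show the arithmetic: the same mean change produces a moderate ES in a homogeneous sample and a very small ES in a heterogeneous one.

```python
def effect_size(mean_change, baseline_sd):
    """ES as defined in the statistical methods: mean change / baseline SD."""
    return mean_change / baseline_sd


# Hypothetical: an identical mean gain of 10 Newtons in two samples that
# differ only in how widely baseline strength is spread.
homogeneous = effect_size(10.0, 20.0)     # baseline SD of 20 N
heterogeneous = effect_size(10.0, 100.0)  # baseline SD of 100 N
print(homogeneous, heterogeneous)
```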
Of the four clinical sensory tests the most responsive tests at 4 months from baseline were touch threshold assessed by WEST (ES = 0.55, SRM = 0.59) and tactile gnosis assessed with the STI test (ES = 0.53, SRM = 0.66) which showed moderate effect sizes. This was followed by locognosia (ES = 0.29, SRM = 0.37) and 2PD (ES = -0.22, SRM = -0.57) with small effect sizes. The responsiveness statistics increased slightly by 8 months indicating that sensibility continued to improve between 4 and 8 months but also remained abnormal for touch threshold, locognosia and tactile gnosis for more than 30% of patients. Notable is that 2PD was normal (<5 mm) in 46 of 63 patients at baseline (mean at baseline = 4.3 mm) increasing only by 1 patient at 4 and 8 months.
A number of authors have reported that 2PD often remains normal in CTS when other sensibility indices show abnormal results [33] and that touch threshold is more responsive than 2PD [34]. The responsiveness for sensibility tests in CTS has been reported for touch threshold [14]. In a sample of 22 patients followed for 3 months after surgery a small change was observed (SRM = -0.30) in touch threshold assessed by monofilaments. Appleby et al [35] investigated change over 3 months after surgery in 29 patients. Although they did not report responsiveness statistics these could be calculated from the data presented. Touch threshold had a moderate ES (0.67). Our study is the first to include additional measures of spatial discrimination such as the locognosia test and STI test, which were moderately responsive. Both tests have been shown to have excellent discriminative validity and responsiveness in peripheral nerve injuries [22,23,36]. The results of these tests at 8 months after surgery also highlight that for a proportion of patients localisation and shape/texture identification remained impaired.
The effect sizes and standardised response means for all four motor tests were either very small, close to zero or had decreased at 4 months and 8 months. Grip and pinch strength assessed with the MIE had a relatively higher responsiveness compared to thumb abduction and opposition assessed by RIHM. Our study is the first to report on the responsiveness of the RIHM in patients with CTS. This handheld digital myometer targets the thenar muscles, specifically abduction and opposition of the thumb which are solely reliant on median nerve innervation. It combines individual manual muscle testing with myometry thus allowing strength to be measured objectively and on a continuous scale (Newtons of force) rather than subjectively grading by using the ordinal Oxford scale.
We hypothesised that this method of testing thumb opposition and abduction using the RIHM would be more responsive than power or pinch strength; however, our results do not support this. Power and pinch strength measured by the MIE were more responsive than the RIHM but the question remains whether both measures are required. It has been argued before that pinch strength is a more precise measure of motor impairment in CTS as it relies to a greater extent on the median nerve innervated thenar muscles, whereas in power grip weakness can be masked by the synergistic action of long flexors [18]. However, the ES and SRM for grip and pinch at 4 months were very similar and by 8 months marginally higher for power grip than tip pinch. There are several plausible explanations for these findings. Power grip can decrease in the short term after surgery due to pillar pain and scar tenderness over the carpus, recovering by 8 months and even improving due to compensatory use of long flexors and the increased use of the hand in functional activities. Another factor is that other variables such as age, dominance and gender can account for large variations in strength. The variance in strength for all 4 measures in our study was large, as evidenced by the wide standard deviations, and as these are used as the denominator when calculating ES or SRM this can result in small responsiveness statistics. A further possibility is that other comorbidities such as degenerative joint disease, especially carpometacarpal joint osteoarthritis, can significantly reduce pinch strength. In our study 17 patients (27%) reported that they had pain at the base of the thumb on forceful gripping or pinching. Furthermore, a systematic review of strength tests found that grip and pinch strength values prior to surgery were often within or close to normative values and therefore little scope remains for further improvement after surgery in some patients [18], a so-called ceiling effect.
Although the responsiveness of patient-rated outcome assessed by BCTQ was not the primary focus of this study, by 4 months the BCTQ SSS showed the largest effect sizes followed by the BCTQ FSS. These findings concur with other studies that have reported large ES or SRMs for the symptom severity scale and moderate to large ES for the functional status scale at 3 months post surgery [12,14,15,37-39]. The BCTQ is a disease-specific outcome measure which addresses typical symptoms such as pain, tingling and nocturnal waking, which are relieved or improved upon surgical decompression. It is therefore not surprising that this measure is more responsive than clinical tests of sensory and motor function. It is interesting, though, that the responsiveness statistics for the BCTQ subscales did not increase at 8 months, suggesting that the large improvement in symptom relief and functional status occurs rapidly within the first 3-4 months after surgery, whilst tactile sensibility, especially spatial discrimination, takes much longer to recover after surgical decompression, hence the increasing effect sizes from 4 to 8 months.
The BCTQ was also the outcome measure which best differentiated between the group of patients who had improved and those unchanged or worse. The mean change at 4 months from baseline in symptom severity was -1.25 points in the improved group and -0.51 points in the same/worse group. For the functional status scale a change of -0.69 points was observed in the improved group and 0.08 points in the same/worse group. Atroshi [38] reported smaller values for minimally clinically important difference (MCID) for symptom severity (0.8 points) and for the function scale (0.5 points) using patient satisfaction as the criterion in patients undergoing surgery. Our findings also support the notion that patients underwent 'clinically important change' as a result of surgery. They also provide an external criterion for interpreting the change scores in the clinical tests and for determining the magnitude of change which can be deemed clinically important. For example, for the STI test the change from baseline in the improved group was 1.09 points as opposed to 0.20 in the same or worse group.
A potential limitation is that we did not exclude patients with other conditions. Our study sample included 4 patients who reported having had a previous stroke, 13 who indicated having arthritis and 12 who were diabetic. These conditions may account for weakness and/or sensory impairment from other aetiologies which may not respond to surgery. However, they are also typical of the wide range of patients undergoing surgery for CTS and therefore enhance the generalisability of our results to other surgical cohorts. A further limitation is that we were not able to objectively verify whether all patients had a median nerve pathology as not all patients have nerve conduction tests prior to surgery. Finally, our sample size is relatively small although larger than other cohorts for which responsiveness of some clinical measures has been published.

Conclusions
We conducted the first longitudinal cohort study to examine the relative responsiveness of several clinical tests of motor and sensory function. Several of the clinical measures of sensibility showed good sensitivity to change, especially the touch threshold (WEST) and tactile gnosis (STI test). We recommend the use of these two sensory tests in the assessment of outcome in future trials of interventions for CTS particularly in patients with sensory deficits. Exploring the relationship between changes in clinical sensory function and sensory parameters from nerve conduction studies would also warrant further investigation.
Clinical tests of motor function, which included the assessment of power and pinch grip with the MIE digital grip analyser and handheld myometry for thenar muscles, showed very small changes with ES of 0.10 and below. Power grip was marginally more responsive than tip pinch and more responsive than abductor pollicis brevis and opponens pollicis testing with the RIHM. Despite the fact that the RIHM more specifically targets the median nerve innervated thenar muscles, its low responsiveness suggests that it does not offer any benefits over the more commonly used and widely available power and pinch strength tests. Our study shows that some patients have considerable impairments in sensibility and strength before and after surgery which are not adequately captured by self-report alone and warrant the additional use of these clinical objective tests. They should be considered for inclusion as secondary outcome measures in future trials.

Funding
The NIHR funded this work under a Career Development Fellowship (CJH).