Preliminary study of the Southampton Hand Assessment Procedure for Children and its reliability

Background: The Southampton Hand Assessment Procedure (SHAP) is currently used in the adult population for evaluating the functionality of impaired or prosthetic hands. The SHAP cannot be used for children because the objects used to perform SHAP tasks are relatively large and its clinimetric properties in children are unknown. The aims of this study were to adapt the SHAP for use in children (SHAP-C), to determine norm values for the SHAP-C, and to analyze the reliability of the SHAP-C.
Methods: The SHAP-C was adapted based on the SHAP protocol. Some objects were downsized, and the tasks were timed by the rater instead of the participant. Intra- and inter-rater reliability were assessed in 24 children (mean age 5 y/o, SD 0.54) with unimpaired hands. The repeatability coefficients (RCs) were calculated. An RC ≤ 75% of the mean SHAP-C task values was considered good reliability.
Results: Participants were able to perform all SHAP-C tasks. The means of the SHAP-C tasks ranged from 0.75 to 1.21 seconds for abstract objects and from 0.64 to 19.13 seconds for activities of daily living. The RCs of a single assessor did not exceed 75% in 17/26 SHAP-C tasks, displaying a relatively good intra-rater reliability, whereas the RCs for the inter-rater reliability exceeded 75% in 22/26 SHAP-C tasks, thus displaying poor reliability.
Conclusion: In this first study that adjusted the SHAP for pediatric use, we found that all SHAP-C objects and tasks could be performed by children. The intra-rater reliability was better than the inter-rater reliability. Although the SHAP-C appears to be a promising instrument, the protocol requires further modifications to provide reliable measurements in children.


Background
The Southampton Hand Assessment Procedure (SHAP) is an instrument for measuring the functionality of normal, impaired, and prosthetic hands [1]. Currently, clinicians and researchers prefer to use the SHAP [2][3][4][5][6][7][8][9] because it provides a comprehensive overview of the functionality of prehensile grips (spherical, tripod, power, lateral, tip, and extension) and a general functionality score. SHAP scores are calculated based on the execution times of its tasks. SHAP tasks are designed to evaluate unilateral hand function of adults [1]. Although assessing hand function is as important in the pediatric population as in the adult population, no version of the SHAP currently exists for children.
Several instruments are available for evaluating hand functioning of children with different hand impairments, such as spastic hand due to cerebral palsy (CP; 20.8 per 10,000 births) [10], upper limb reduction deficiencies (ULRD; 5.0 per 10,000 births) [11], or traumatic injuries of the hand (41% of childhood injuries) [12]. For instance, the Melbourne Assessment of Unilateral Upper Limb Function (Melbourne assessment) and the Quality of Upper Extremity Skills Test (QUEST) [13][14][15] are used in children with different types of CP. The Assessment of Capacity for Myoelectric Control (ACMC) evaluates functioning with a prosthesis, the University of New Brunswick Test (UNB) focuses on bimanual functioning, and the Assisting Hand Assessment (AHA) evaluates the role of the impaired hand as assisting hand for the unimpaired hand [16][17][18]. These measurement instruments are recommended for clinical use. However, some measurement instruments require extended training of the assessor (e.g., ACMC), which may limit clinical applicability [19][20][21]. The instruments focusing on evaluating bilateral functioning [16,22,23] cannot assess the capabilities of the affected hand alone (e.g., UNB). Some instruments focus on evaluating the functionality of prosthetic hands (e.g., ACMC and others; see the literature reviews) [19,20,24,25] or of hands subject to the specific effects of conditions such as CP or ULRD (e.g., Melbourne assessment, AHA) [14,16]. Furthermore, the outcomes of the existing instruments do not describe the functionality of different hand grips. With high incidences and various impairments of the hand, healthcare professionals need an accurate instrument that does not require formal training and can be used for multiple hand impairments. Such a broadly applicable instrument would enable comparison of the functionality scores of different impairments of pediatric hands with regard to unimpaired hand functioning.
The SHAP is an instrument that has web-based training [26,27] and makes comparisons between scores of unimpaired, impaired, and prosthetic hands possible. The SHAP provides scores for the functionality that are calculated relative to norms [1,28]. The SHAP reliability and norm values were determined using unimpaired young adults that were considered to have optimal hand functionality [1]. Thus, SHAP functionality scores of any type of hand impairment are relative to optimal hand functionality (of unimpaired young adults). However, the SHAP has not been used in children thus far, and the relevant clinimetric properties have not yet been established in children. To use the SHAP for children (SHAP-C), several steps are required: a. Adjustment of the objects used to perform the tasks and the SHAP protocol for a specific age group and size of the impaired/unimpaired hand or prosthesis, as some of the SHAP objects are relatively heavy and large for a child's hand or prosthesis [16]; b. Testing of the reliability in unimpaired children and determination of the norm values for unimpaired children; c. Testing of SHAP-C validity; and d. Testing of the reliability in children with prosthetic hands and other hand impairments because the SHAP was originally designed to evaluate unimpaired, impaired, and prosthetic hands.
This study focused on the first steps, adjusting the SHAP, providing norm values, and testing reliability in children with unimpaired upper limbs.
The aims of the study were as follows: (1) to modify the objects and protocol of SHAP for children's hands or prostheses, (2) to provide norm data about the means of SHAP-C tasks for 4- to 6-y/o boys and girls with unimpaired hands, and (3) to assess the inter- and intra-rater reliability of the SHAP-C in these children.

SHAP-C
The SHAP consists of 26 tasks: 12 tasks with abstract objects and 14 tasks concerning activities of daily living (ADL, Table 1). The time needed to complete each task is recorded in seconds. Using z-score transformations of task-times and the Euclidean distance, six prehensile patterns and a general index of function (IOF) are computed. All six scores of prehensile patterns form the functionality profile (FP). The prehensile patterns and IOF are calculated relative to the predetermined norms and represent the functionality scores of hand grips (Spherical, Tripod, Power, Lateral, Tip, Extension) [1]. The FP and IOF scores range from 1 to 100 (100 is normal functionality). Scores higher than 100 are possible if the assessed person performs better compared with the normative data. The normative data for the tasks in adults range from a mean performance time of 1.58 seconds to 1.84 seconds in abstract objects tasks and from 3.12 seconds to 6.77 seconds in ADL tasks. The normative data for the prehensile patterns and the IOF are not available in the literature because of the intellectual property rights of parties commercializing SHAP. The test-retest reliability of SHAP has been tested in unimpaired young adults using analysis of variance (ANOVA) [1]. The ANOVA F-values of SHAP tasks and the FP and IOF functionality scores do not exceed F critical = 3.28, indicating there is no difference between the replicates and demonstrating, thus, good reliability [1,28].
In the process of establishing the SHAP-C, we focused on keeping the alterations of the original SHAP to a minimum. Therefore, a systematic approach was used for designing the SHAP-C. (1) First, several objects were downsized (Table 1) to allow grasping with both pediatric unimpaired and prosthetic hands, as the SHAP was designed for prosthetic hands as well (the maximum opening of the prosthetic hand, i.e., the distance from thumb to index finger, is 5 cm [myoelectric prosthesis, Electrohand 2000]). Object sizes and the original SHAP protocol [27] were tested in a pilot study on eight unimpaired children (4-7 y/o, three girls and five boys). The children were recruited from a local school. They performed with a normal hand or with a myoelectric prosthesis adapted for use in unimpaired children (a prosthetic simulator, Figure 1).

Table 1 (recovered rows):
Lighter tray (length = 42 cm, width = 26 cm, weight = 558 g) | The unit kit was placed with the shorter side facing the participant.
L + Tp | Rotate a key 90° | Rotate the key from a vertical position 90° to a white mark using a lateral grip. | - | -
L + Tp | Open/close a zip | Open and close a zipper. | An extension to the zipper's pull-tab (paperclip) | The assessor held the pull-tab for easier grasping. b
P | Rotate a screw 90° | The screwdriver is placed on the form-board on the side of the assessed hand. The screw is clipped on the exterior of SHAP unit on the side of the assessed hand. Both hands can be used to guide the screwdriver to the screw, but only the assessed hand is turning the screwdriver.
b Adjustments performed for the prosthetic hands (based on the pilot study); for non-disabled hands, the adjustment was unnecessary.

(2) After that pilot study, it was decided that the assessor, rather than the child as in the standard SHAP protocol, would time the tasks because the children often forgot to start and stop the timer. Timing was started at the moment of opening the hand to grasp the object and stopped when the object was released. Furthermore, all of the objects (including the resized objects) and the changed SHAP protocol were tested in three other children (5 y/o), using the myoelectric simulator, to evaluate the feasibility of the SHAP-C protocol in pediatric prosthetic hands as well. (3) Based on the observations from the children using the prosthetic simulator, the starting position of a few objects was slightly changed to facilitate the gripping of the objects in prosthesis users (Table 1).

Participants' norm values and reliability study
The children were recruited from two local primary schools on a voluntary basis. Children were unimpaired, right-handed and were included if they were four to six years old (y/o) (primary school starts at 4 y/o in the Netherlands), free of upper limb musculoskeletal or neurologic disorders, had normal/corrected to normal sight and were not familiar with the SHAP. As we would like the SHAP-C also to be available for pediatric prosthesis users in the future, we included 4- to 6-y/o children. This age group was defined according to the size (opening width) and functional abilities of a generally used prosthesis hand (Otto Bock, Electrohand 2000), appropriateness of the SHAP-C tasks in children, and ability to receive and follow task instructions.
Study approval was granted by the Medical Ethical Committee (NL35268.042.11). A parent (or guardian) provided written informed consent and filled in a short questionnaire about age, gender, and hand dominance of their child. All of the children received a gift toy (value ± 5 Euro) at the completion of the measurements.

Procedure
A repeated-measures study was set up to evaluate intra- and inter-rater reliability of the SHAP-C. The children were assigned to perform the tasks in every session with the same hand, either the dominant (right) or the non-dominant hand. Performance with both dominant and non-dominant hands was needed to obtain a better representation of the task means. Measurements were performed in a quiet classroom at the primary school. The child and two assessors were present. First, the child was seated comfortably on a chair, and, when needed, height was adjusted to allow 90° elbow flexion when the hand rested on the table. Each SHAP-C task was first demonstrated by the assessor. The tasks had to be executed as accurately and as quickly as possible. Children started to open the hand when near the object. For each object, a start position and an end position were specified with molds on a board (form-board) that was lying on the table in front of the child. Before executing the abstract objects tasks, the corresponding mold on the form-board was aligned to meet the middle line of the participant to standardize testing in both conditions with the dominant or non-dominant hand. No prior practice was allowed, and a task was repeated when the child did not complete it according to the exact requirements (appropriate grip, object location) [27].
Three assessors collected the measurements. The assessors were instructed by a detailed SHAP protocol, which was accompanied by videos demonstrating the tasks and time measurement [26,27]. They read and understood the modifications of the SHAP-C and practiced SHAP-C during the pilot study. Assessor 1 and assessor 3 had previous experience in applying the SHAP, which was not the case for assessor 2. A verbal signal to start the task was given to the child. The tasks were executed in random order to avoid any sequence effects. In total, four measurement sessions of the SHAP-C were collected during four consecutive days. Children participated in one SHAP-C session per day with approximately 24 h between sessions. The SHAP-C results of the four sessions were used to determine the norm values.
Intra- and inter-rater reliability
Assessor 1 measured the task times of two SHAP-C sessions (day 1 and day 2). Assessor 2 measured the times of one SHAP-C session on day 3. Assessor 3 measured the times of one SHAP-C session on day 4.

Statistical analysis
In this study, the task times, denoting performance times in each SHAP-C task, were used for the analyses.

Norm values
First, the task means of the four sessions were calculated per participant. Second, to determine the norm values, the means and the standard deviations of each SHAP-C task were calculated based on the means of the four sessions. Independent samples t-tests were used to determine differences between boys and girls. The test results of the 'equal variances not assumed' row were reported when the homogeneity of variances assumption was violated. In addition, we tested the differences between performance with dominant and non-dominant hands with a t-test.
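The two-step norm calculation described above can be sketched as follows. This is a minimal illustration in Python, not the SPSS procedure used in the study; the function names `norm_values` and `welch_t` and all task times are hypothetical. The `welch_t` helper computes only the t statistic that corresponds to an 'equal variances not assumed' comparison and does not produce a P-value.

```python
from math import sqrt
from statistics import mean, stdev, variance

def norm_values(session_times):
    """Return (mean, SD) of one task across children.

    session_times maps a child identifier to that child's task times (s),
    one value per session. Step 1 averages within each child; step 2
    summarizes those per-child means across the group.
    """
    per_child = [mean(times) for times in session_times.values()]
    return mean(per_child), stdev(per_child)

def welch_t(group_a, group_b):
    """t statistic for two independent groups without assuming equal variances."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Hypothetical times (seconds) for one task over four sessions
example = {
    "child_a": [1.0, 0.9, 1.1, 1.0],
    "child_b": [1.2, 1.2, 1.0, 1.0],
    "child_c": [0.8, 0.8, 0.8, 0.8],
}
task_mean, task_sd = norm_values(example)
print(f"norm mean = {task_mean:.2f} s, SD = {task_sd:.2f} s")
```

Averaging within each child first keeps every child's weight equal in the group mean, which matters if a session is missing for some children.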

Intra-rater reliability
The paired samples t-test was used to analyze the differences between the task times of the first and second session of assessor 1. A repeatability coefficient (RC) was determined for each SHAP-C task [29,30]. The RC is defined as the value below which the absolute differences between repeated measurements are expected to lie with a 95% probability and is calculated as $\mathrm{RC} = 1.96 \times \sqrt{2} \times s$ (s, within-subject standard deviation) [29,30]. The relative RCs, that is, the RCs expressed as a percentage of the task mean, were also calculated and constituted the primary outcome measure.
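As an illustration of the RC and relative RC for two sessions by one assessor, the sketch below assumes that the within-subject standard deviation is estimated from the paired differences d as sqrt(sum(d^2) / (2n)), a common estimator for two measurements per subject. The study does not state which estimator was used, so this is an assumption; the function name `repeatability` and the task times are hypothetical.

```python
import math

def repeatability(first, second):
    """Return (RC in seconds, relative RC in %) for paired task times."""
    n = len(first)
    # Assumed estimator of the within-subject SD for two measurements per child
    s_w = math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)) / (2 * n))
    rc = 1.96 * math.sqrt(2) * s_w
    # Relative RC: RC as a percentage of the mean over both sessions
    grand_mean = (sum(first) + sum(second)) / (2 * n)
    return rc, 100 * rc / grand_mean

# Hypothetical task times (seconds) for four children
session1 = [1.0, 1.2, 0.9, 1.1]
session2 = [0.9, 1.0, 1.0, 1.2]
rc, rrc = repeatability(session1, session2)
print(f"RC = {rc:.2f} s, relative RC = {rrc:.1f}%")
```

With these made-up values the RC is about 0.26 seconds, roughly 25% of the task mean.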

Inter-rater reliability
Repeated-measures ANOVA was used to analyze the differences between the task times of the second SHAP-C session (assessor 1), the third session (assessor 2), and the fourth session (assessor 3). When sphericity was violated, Greenhouse-Geisser correction for the degrees of freedom was applied. Bonferroni correction was applied for the post-hoc test. For each SHAP-C task, the agreement between the assessors was determined by calculating the RC and the relative RC.
We considered values of ≤ 75% for relative RC as clinically acceptable values for variation of task times from the mean denoting acceptable reliability. Statistical significance for analyses was P ≤ 0.05 (two-sided) and analyses were performed using SPSS Statistics for Windows, version 20.0 (IBM Corp., 2011, Armonk, NY, www.spss.com).
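For three assessors, the same RC idea can be sketched by pooling the within-child variance across assessors and then applying the 75% criterion. The pooled estimator below is an assumption, since the study does not report how s was obtained for three raters. The function names and times are hypothetical.

```python
import math

def within_subject_sd(rows):
    """rows holds one list per child, with one task time (s) per assessor."""
    k = len(rows[0])
    ss = 0.0
    for row in rows:
        child_mean = sum(row) / k
        ss += sum((x - child_mean) ** 2 for x in row)
    # Pooled within-child variance: squared deviations / (n * (k - 1))
    return math.sqrt(ss / (len(rows) * (k - 1)))

def relative_rc(rows):
    """RC as a percentage of the mean over all children and assessors."""
    rc = 1.96 * math.sqrt(2) * within_subject_sd(rows)
    grand_mean = sum(sum(row) for row in rows) / (len(rows) * len(rows[0]))
    return 100 * rc / grand_mean

def is_acceptable(rrc, cutoff=75.0):
    """Apply the study's criterion: relative RC <= 75% is acceptable."""
    return rrc <= cutoff

# Hypothetical times (seconds): three children, three assessors each
times = [[1.0, 0.8, 1.2], [0.9, 0.9, 0.9], [1.1, 0.7, 1.2]]
rrc = relative_rc(times)
print(f"relative RC = {rrc:.1f}%, acceptable: {is_acceptable(rrc)}")
```

For two measurements per child this estimator reduces to sqrt(sum(d^2) / (2n)), so it is consistent with the usual paired-session calculation.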

Results
In total, 24 children participated, and 54% were boys. The mean age was 5 y/o (SD = 0.54), and the dominant:non-dominant hand ratio was 8:5 for boys and 5:6 for girls.

SHAP-C feasibility and task means
All children were able to grip the resized objects with their hand. The means for the abstract objects varied between 0.75 and 1.21 seconds, and the means for ADL tasks varied per task, with the highest mean of 19.13 seconds for the undo buttons task (Table 2). Girls were slower than boys in five SHAP-C tasks: light extension (P = 0.006), heavy lateral (P = 0.012), heavy extension (P = 0.018), pour water from jug (P = 0.044) and open/close a zipper (P = 0.007, Table 2). Participants performing with the dominant hand were faster in the heavy extension, food cutting and page turning tasks (P-values < 0.01).

Intra-rater reliability

Abstract objects
The mean task times of assessor 1 in session 2 were significantly lower than the times in session 1 for light lateral (P = 0.044) and for heavy power (P = 0.049; Table 3).
Task RCs varied from 0.51 to 0.92 seconds in the abstract objects tasks. Relative to the mean of the first two sessions, values ≤ 75% were observed for the relative RCs in 7/12 abstract objects tasks. Light power, light tip, light extension, heavy tripod, and heavy power displayed relative RCs > 75%.

ADL tasks
The t-test indicated significantly lower means in session 2 compared with session 1 in the tasks: food cutting (P = 0.023), page turning (P < 0.01), pouring water from jug (P < 0.01), pouring water from carton (P < 0.01), and rotating a door handle (P = 0.030, Table 3).
The relative RCs of 10 out of the 14 ADL tasks were lower than 75% of the task means. In the undo buttons, food cutting, rotate a key 90°, and open/close a zipper tasks, the relative RCs exceeded 75% of the task mean (Table 3).
Inter-rater reliability

Abstract objects tasks
No significant differences in the time means of the three assessors were found, except for the light extension task (P = 0.002) (Table 3). In the post hoc analysis, the mean of assessor 2 was significantly lower than those of assessor 1 and assessor 3 (P = 0.016 and P = 0.010, respectively).
The RC values ranged from 0.58 to 1.20 seconds in the abstract object tasks (Table 3). The relative RC values between the three raters were all > 75%, except for the heavy sphere task (relative RC = 74.3%).

ADL tasks
The assessors differed significantly in the task times in 6 out of 14 ADL tasks: undo buttons (P = 0.044), food cutting (P = 0.005), page turning (P = 0.008), move a full jar (P = 0.021), move a tray (P = 0.026), and rotate a screw 90° (P = 0.003) (Table 3). Post hoc analyses revealed that assessor 2 recorded lower means than both assessor 1 (P = 0.006) and assessor 3 (P = 0.038) for rotating a screw 90°, lower means than assessor 1 in food cutting (P = 0.014) and in moving a full jar (P = 0.001), and differed from assessor 3 in moving a tray (P = 0.054, mean assessor 2 > mean assessor 3). Between assessor 1 and assessor 3, significant differences were observed in undoing buttons (P = 0.034) and in page turning (P = 0.003); assessor 3 recorded lower means. In all ADLs, the RCs of the task times were ≤ 6.07 seconds, except for the undo buttons task (RC = 24.1 seconds). The relative RCs were > 75% in the majority of ADLs. For four tasks, the relative RCs were ≤ 75%: pick up coins, pour water from jug, pour water from carton, and move a tray.

Table 3 notes. Abbreviations and notations: SHAP-C, Southampton Hand Assessment Procedure for Children; AO, abstract objects; S, session; A, assessor; SD, standard deviation; RC, repeatability coefficient; RRC, relative repeatability coefficient; P, significance value. a Paired t-test for the results of the first and the second session of assessor 1. b Repeated-measures ANOVA for the results of assessor 1 (second session), assessor 2, and assessor 3. *Significance P < 0.05.

Discussion
This is the first study to adapt the SHAP for pediatric use and to assess the reliability of this adapted version, the SHAP-C. Children were able to perform all SHAP-C tasks using the corresponding objects (including the downsized objects). The task means were significantly different in 7/26 tasks when a single assessor tested twice and in 7/26 tasks when three different assessors tested (P-values < 0.05). The intra-rater reliability of the SHAP-C was relatively better compared with the inter-rater reliability. Variation values within the same assessor, the RCs, had percentages < 75% in 17 out of 26 SHAP-C tasks (7/12 abstract objects tasks and 10/14 ADL tasks), indicating a relatively good repeatability of the procedure within the same assessor, at least in ADL tasks. The time scores per task varied largely between the three assessors. In 22/26 SHAP-C tasks, the RCs were higher than 75% of the task mean, thus revealing poor SHAP-C repeatability between assessors. The small differences in task means on a group level indicate that the SHAP-C can be used for group comparisons. However, in clinical practice on an individual level, the SHAP-C may be used when one assessor is engaged but with considerable within-subject variation (Table 3). The SHAP-C should be used with caution when more assessors are engaged. Further adjustments are required to provide clinicians with a reliable SHAP-C.
In the current study, the mean values for the abstract objects of the SHAP-C tasks are much lower than the means of the SHAP tasks in adults (overall means SHAP-C < 1.21 seconds vs. means SHAP > 1.58 seconds) [28]. This discrepancy is most likely because the SHAP-C tasks were timed differently than the SHAP tasks (timing by the assessor vs. self-timing). Compared with the SHAP, the SHAP-C means do not include the times of two phases: (1) from stopwatch activation to reaching the object and (2) from release of the object to stopping the stopwatch. On the other hand, the SHAP-C means in more complex ADL tasks were overall higher than those of SHAP in adults (e.g., pick up coins, undo buttons, or food cutting, means SHAP-C = 5.64-19.13 seconds vs. means SHAP = 3.12-6.77 seconds) [28]. This finding of children being slower than adults in executing complex tasks is in line with the reports in literature explaining age-related differences in (neuro) motor development (e.g., maturation of neural cortex gradually over time) [31,32]. Nevertheless, our means for the SHAP-C tasks represent the first estimations of norm values. Because of the observed variability in task times (Tables 2 and 3), a larger sample is required to determine the norms once the SHAP-C protocol is more definitive.
Bland and Altman recommended the use of RC to determine consistency in outcomes of a measurement instrument [29,33]. The precision of an RC over the Pearson's correlation coefficient and intra-class correlation has been highlighted [29,34]. There are, however, no standardized rules for interpretation of RCs. The suggested approach is that the lower the RCs are, the better the repeatability of the instrument is. The comparison of the RC to the minimum clinically important difference/change (MCID) would indicate good reliability of the instrument if the RC < MCID and vice-versa [34]. In the absence of MCID values for the SHAP or SHAP-C, we chose to represent the RCs in percentages relative to the task means as used by others [35,36]. The relative RCs quantify the degree of agreement between different or single assessors and facilitate the interpretation of RCs. The cut-off point for the relative RC (75%) was chosen arbitrarily; higher (80%) and lower (50%) cut-off points have been reported previously [35,36]. Thus, one may shift the cut-off value and interpret the RCs found in this study accordingly. Using a cut-off point of 80% for our relative RCs, for example, would not have changed the current results because all non-reliable tasks had relative RCs higher than 80%.
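The effect of shifting the cut-off can be checked with a few lines of code. The sketch below uses made-up relative RCs for five unnamed tasks, not the values of Table 3, and the helper `count_reliable` is hypothetical. It only shows how the number of tasks labeled reliable responds to the 50%, 75%, and 80% cut-offs mentioned above.

```python
def count_reliable(relative_rcs, cutoff):
    """Count tasks whose relative RC (%) is at or below the cut-off."""
    return sum(1 for value in relative_rcs.values() if value <= cutoff)

# Hypothetical relative RCs (%), for illustration only
hypothetical = {
    "task_a": 45.0,
    "task_b": 62.0,
    "task_c": 74.3,
    "task_d": 91.0,
    "task_e": 130.0,
}
counts = {cutoff: count_reliable(hypothetical, cutoff) for cutoff in (50, 75, 80)}
print(counts)
```

In this made-up set no task falls between 75% and 80%, so moving the cut-off from 75% to 80% leaves the count unchanged, which is the same pattern the text reports for the study data.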

Intra-rater reliability
The majority of tasks were reliable (relative RCs < 75% in 17/26 tasks) for assessor 1. In comparison with the intra-rater reliability of the adult version of the SHAP, we found approximately the same number of less replicable tasks. In the SHAP, seven tasks have been found to be less reliable (light power, light tip, heavy extension, page turning, pour water from carton, rotate a key 90°, rotate a screw 90°), but not to a significant extent [28]. In the SHAP-C, nine tasks were significantly less reliable (light power, light tip, light extension, heavy tripod, heavy power, undo buttons, food cutting, rotate a key 90°, and open/close a zip). The difference in less-reliable tasks between adults and children may be due to age differences in motor abilities with the upper limb [31,32]. For instance, rotating a screw 90° requires fine-motor skills. Adults have acquired hand motor skills to differing extents, which may explain their variability when rotating a screw, whereas the tested children did not vary in this task.
Interestingly, five of the SHAP-C tasks with relative RCs > 75% were abstract object tasks. In the context of SHAP-C tasks being timed by the assessor, a possible explanation might be the variation in the assessor's reaction time, especially in rapidly executed tasks involving abstract objects (< 1.2 seconds). The literature reports a response time of 0.18-0.20 seconds after visual stimuli and that many factors account for reaction response: practice, gender, age, fatigue, distraction, and even breathing cycle [37]. Practice might have had a role. The first measurement of the same assessor most likely served as practice and led to lower scores (faster performance) in the second measurement. We cannot exclude the possibility that learning effects of SHAP-C tasks within a child occurred, but distinguishing learning effects from the reaction time of the assessor is not possible in this case.
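A rough sense of scale for this explanation can be obtained by expressing the reported reaction time as a share of the abstract-object task means (0.75 to 1.21 seconds). The sketch below is illustrative arithmetic only. It assumes a single reaction-time delay per measurement, whereas in practice delays at the start and stop of timing may add up or partly cancel. The function name `reaction_share` is hypothetical.

```python
def reaction_share(reaction_time_s, task_mean_s):
    """Reaction time as a percentage of the mean task time."""
    return 100 * reaction_time_s / task_mean_s

# Reported reaction times (0.18-0.20 s) against the slowest and fastest
# abstract-object task means (1.21 s and 0.75 s)
low = reaction_share(0.18, 1.21)
high = reaction_share(0.20, 0.75)
print(f"reaction time is about {low:.0f}% to {high:.0f}% of the task mean")
```

Under this assumption a single reaction delay amounts to roughly 15% to 27% of the mean time of an abstract-object task, so modest variation in the assessor's reaction could plausibly account for a noticeable part of the relative RC in these short tasks.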
An alternative, more objective method for SHAP-C data collection would be to use a different timing system. Possibly, a system that recognizes a certain opening of the thumb-index finger angle or the lifting of the hand from the table, in combination with sensors able to detect movement of position of the objects, would time the performance more accurately. Solutions for recording performance accurately can be extended to computerized systems able to depict the hand positions and objects' shapes [38]. The inclusion of kinematic measurements would provide information about the movement time and quality of the movement. Each abstract object task may also be executed repeatedly in a certain amount of time (e.g., 10 seconds) and the number of execution times rated accordingly as in the pegboard hand-dexterity test, for example [39]. However, the mentioned solutions would increase the assessment time, costs, and dimensions of the SHAP-C kit, which is beyond the SHAP/SHAP-C purpose. With the disadvantage of increasing the time needed to determine the functionality scores, the simplest and most inexpensive approach would be to videotape the performance. Afterward, the task times could be accurately evaluated from the recordings, as has been previously accomplished for pediatric functionality tests [22,23]. Although we avoided introduction of procedures that increase data-collection or analysis time, our results suggest that these types of changes might be necessary after all, as the influence of the assessor would be diminished considerably.

Inter-rater reliability
Clinically, the RCs of 0.58-1.20 seconds observed in abstract object tasks would be a negligible variation, but relative to the task means, this variation was rather large (≥ 75%). In this case, again, the reaction time might have had an influence on the abstract object task times [37]. Moreover, the practice experience was different across our assessors. Two of the assessors had extended experience with applying the SHAP. Assessor 2, on the other hand, had no previous experience. Given that the means of assessor 2 differed in several tasks (Table 3), the assessors may require a longer training period prior to applying the SHAP-C. The training might include studying instructional movies centralized on an online database as in the case of the SHAP [26]. Creating a benchmark test to evaluate assessors' instructional and data-collection skills after the training would ensure a level of proficiency when applying the SHAP-C.
Furthermore, distraction, another risk-factor for variation in performance [37], might have affected our participants. Engaging 5-y/o children in performing tasks requires good motivational techniques. Our assessors used intrinsic motivation by stimulating a playful atmosphere [40] and extrinsic motivation by rewarding performance with positive reinforcement, candy, allowing the child to color, draw or offering an animal sticker. However, the motivation of some children varied during different tasks and sessions, especially in ADLs, causing delays in performance, and thus the variability in task times. One study referred to SHAP tasks as being unattractive for children [16]. If this is the case, then substituting current SHAP-C tasks with tasks simulating activities of child play [16] and using colorful objects may improve motivation and reduce distraction. Furthermore, the necessity of providing clear instructions and using good motivational techniques in children has been emphasized in the literature about other measurement instruments for pediatric hand functionality [41]. The flow theory provides some suggestions on how to stimulate intrinsic motivation in children: use age-appropriate tasks, promote a 'fun' environment, provide the possibility for the children to control some of the tasks (e.g., allowing them to choose an object/ task that they want to continue with), set clear and achievable goals for the tasks, and avoid negative feedback (oral or non-verbal) [42].
The observed variability of the SHAP-C means may be partly explained by the variability in (neuro) motor development in preschool children up to adolescence [43][44][45]. Therefore, the scores on functionality assessments have to be interpreted bearing in mind this variability in children [46]. Importantly, the timed performances in children require a standardized test, well-trained assessors, and norms for different age categories [47].
Summarizing the steps to be considered for improving reliability of SHAP-C, future research has to identify an appropriate data-collection method that will diminish the assessor's influence. In addition, researchers have to consider providing information to the assessors about techniques to improve motivation of children, either in the form of training or by including motivational techniques in the instructions for the SHAP-C protocol.
A limitation of this study is the relatively small sample size. Although the SHAP reliability was also determined with 24 participants [1], we feel that for pediatric populations, a larger sample size is necessary to determine the SHAP-C norms and reliability. The study design for the inter-rater reliability is limited by the fact that data were collected on separate days. A more adequate design would involve simultaneous timing of the participant's performance by the three assessors, but having one assessor give the task instructions could bias the measurements of the other two assessors. In addition, having three assessors in the same room would be overwhelming for a child. Another alternative would be to measure the participants on the same day, three times with different assessors. However, this approach was not possible because of the limited availability of the children during school hours. More importantly, fatigue and disinterest may occur if children are requested to repeat 26 tasks three times in one day. Videotaping the performance might avoid the need for three consecutive sessions and limit measurements to a single session. In addition, the order of assessors was the same per participant and measurement day. Therefore, the task means might have been affected by the order of the assessors and/or by the measurement day. For practical reasons, we could not randomize assessors per measurement day, but further studies should consider randomizing the assessors.
A future approach for assessing inter-rater reliability of the SHAP-C in children may consider the following: (1) allowing each child to perform the SHAP-C once with a randomly assigned assessor and (2) live broadcasting the performance of the participants to the other assessors that will measure simultaneously the performance. This way, the children will not be solicited more than once and by more than one assessor, and the possible bias of rating performance of the participants that received instructions from another assessor will be evenly distributed throughout the data.
Another limitation of this study is the inability to estimate the norms for the prehensile patterns of FP and IOF that are of interest to clinicians. Based on the means and standard deviations in our study, the estimates of norms for prehensile patterns of FP and the IOF could have been calculated, but the formulas for such calculations [1,26,28,48] are clear neither to us nor to our statistician. Not having the exact procedure for determining the norm values that are needed for the calculation of FP z-scores and IOF z-scores made the calculation of FP and IOF impossible for the SHAP-C data. Explanations regarding the formulas were denied to us because of the holders' exclusive rights on the SHAP (intellectual property).
The sizes of the objects were not systematically evaluated. Therefore, research is also needed to determine the appropriate size of the objects for the hands of older children (> 6 y/o), for larger prosthetic hands with an opening width > 5 cm or for spastic hands with an opening < 5 cm. In addition, clinimetric properties should be studied in older children because of changes in performance with age [49]. The reliability of the SHAP-C should be evaluated in different impaired hands and in prosthetic users because the SHAP was also designed for use with such patients [1]. The evaluation of learning effects of the SHAP-C in prosthetic users would be valuable for clinicians repeatedly using the SHAP-C.

Conclusions
Adjusting SHAP objects to allow grasping with normal and prosthetic hands in 4- to 6-y/o children was performed successfully. Participants were able to perform all of the SHAP-C tasks with means from 0.64 to 19.13 seconds for the tasks. The intra-rater reliability was relatively good in comparison with the inter-rater reliability. However, more adjustments of the protocol are needed to ensure the reliability of the SHAP-C, to improve the motivation of children, to minimize the assessor influence, and to determine the norms.