Introduction

Quantitative assessment of hand function in people with tetraplegia is important not only for day-to-day clinical practice but also for evaluating emerging therapies. Despite the large number of hand assessments,1 none of them are ideal.2 Most were designed for patients with injuries other than tetraplegia,3, 4, 5, 6, 7, 8 and those that were designed for tetraplegia evaluate the success of tendon transfers, functional electrical stimulation or neuroprosthetic implantations.8, 9, 10, 11 Consequently, such tests have floor effects and little sensitivity to change when used in the majority of patients. The few assessments that are potentially appropriate include timed and/or bilateral hand tasks.3, 6, 8, 12 Inclusion of such tasks creates a problem for clinical practice and research. For example, scores recorded as units of time are problematic for statistical analysis if patients are unable to complete tasks. Assessments involving both hands introduce confounders if patients have asymmetrical hand function or if clinical trials involve therapy for just one hand. A potentially suitable unilateral hand assessment was recently developed, but the scoring is heavily weighted on the type of hand grasp used by individuals, which is arguably of little importance provided individuals can use their hands effectively.13 Therefore, the purpose of this study was to devise a simple, sensitive and reliable instrument to quantify unilateral hand function in people with tetraplegia, which has enough scope to cater to those with initially poor hand function. The instrument was named the AuSpinal.

Methods and Results

There were five phases in this study. For simplicity, the methods and results of each phase will be presented together. All descriptive data are expressed as means and standard deviations (s.d.) unless otherwise stated. The authors certify that all applicable institutional and governmental regulations with regard to the ethical use of human volunteers were followed during the course of this research.

Phase 1: Development of the AuSpinal

Methods

The tasks of 20 existing hand assessments were reviewed and the seven most appropriate tasks for people with tetraplegia selected. A scoring system was then devised. To assess the face validity of the tasks and scoring system, both were presented to people with tetraplegia and therapists experienced in handling spinal cord injury (SCI), using an unstructured interview format. The goal of the interviews was to ensure that the tasks and scoring system were relevant for people with tetraplegia, and to adjust both as necessary, until a consensus was reached.14 Interviews with people with tetraplegia were conducted on a one-to-one basis by one of the investigators. In addition, therapists experienced in the hand management of people with SCI and representing all SCI units in Australia met through teleconferences and at a national scientific meeting to discuss and refine the selected tasks and scoring system. Some tasks were modified to reflect real-life situations more accurately. For example, a coin task was modified to include manipulating the coin from a hip bag strapped around participants’ waists.

Results

The final version of the AuSpinal consisted of seven tasks (see Appendix). Four of the seven tasks were based on elements of the Sollerman Hand Function Test7 and involved manipulating a key, coin, telephone and metal nut. Two tasks were modified from the Rehabilitation Engineering Laboratory Hand Function Test For Functional Electrical Stimulation Assisted Grasping,11 and included manipulating a can of soft drink and a credit card. The last task was modified from the Upper Extremity Function Test,4 but instead of manipulating small ball bearings, it involved manipulating a small, chocolate-covered candy mimicking a pill. Administration of the AuSpinal takes approximately 15 min per hand.

To establish a scoring paradigm, each task was divided into 3–6 subcomponents on the basis of an analysis of the critical steps that are involved in successful task performance. A unique aspect of the scoring system was that all decisions were dichotomized. For example, scoring for one of the subcomponents of the key task simply reflected whether the participant could, or could not, insert the key in the lock. The subcomponents were given different weightings using a theoretical approach to score allocation.14 The scores for the subcomponents were summed to obtain a total score for each task. The total scores differed between tasks, reflecting their number of subcomponents (determined by the task analysis). The task scores were then summed to give a total AuSpinal score, with a maximum possible score of 86.
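The scoring paradigm described above can be sketched in a few lines of Python. This is only an illustration of dichotomized, weighted scoring: the subcomponent names and weights below are hypothetical placeholders (the actual subcomponents and weightings are those given in the Appendix), and the function names are assumptions introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subcomponent:
    name: str
    weight: int  # points awarded when the rater answers "yes" (hypothetical value)


# Hypothetical subcomponents and weights for the key task, for illustration only.
KEY_TASK = (
    Subcomponent("picks up key", 3),
    Subcomponent("inserts key in lock", 4),
    Subcomponent("turns key", 5),
)


def task_score(subcomponents, achieved):
    """Sum the weights of the subcomponents rated 'yes' (one yes/no per subcomponent)."""
    if len(subcomponents) != len(achieved):
        raise ValueError("one yes/no decision is needed per subcomponent")
    return sum(s.weight for s, ok in zip(subcomponents, achieved) if ok)


def total_score(task_scores):
    """Sum the task scores to give the total score for one hand."""
    return sum(task_scores)
```

Because each decision is a simple yes or no, the rater never has to grade the quality of a grasp; the weighting alone determines how much each step contributes to the task score.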

Phase 2: Test-retest and intrarater reliability (from videos)

Methods

Test-retest and intrarater reliability were assessed by asking therapists to rate performance after watching videos on two separate occasions. Eight people with tetraplegia were included in the videos. The median (interquartile range) time since injury was 4 years (3–12 years). All participants had bilateral motor complete lesions of C6 or C7, according to the International Standards for Neurological Classifications of SCI. Videos of both hands were made while participants completed the AuSpinal. The performances were not scripted or practiced. Ten performances reflecting a range of abilities were selected for each of the seven tasks. Two of these 10 performances (that is, 14 videos in total) were digitally manipulated to create an identical but mirrored version (that is, the right hand appeared like the left hand and vice-versa). This resulted in a compilation of 12 performances for each of the 7 tasks, making a total of 84 (12 × 7 tasks) videos. The digitally mirrored videos were randomly dispersed throughout and not disclosed to therapists. Labview Software (National Instruments, Austin, TX, USA) was used, enabling therapists to electronically rate each video.

A total of 17 therapists independently rated the 84 videos on one occasion and 13 therapists rated the same 84 videos on two occasions, separated by approximately 2 weeks. The therapists had varying amounts of SCI experience, ranging from 6 months to 25 years. None of them had used the AuSpinal before participation, but all were provided with written instructions about testing and scoring. The order of the videos for each task was randomized, but the order of the tasks was not. For example, all key tasks were randomized, but presented first and one after another. Therapists were able to view the videos for each task as often as necessary before rating them, but were unable to freely move backwards and forwards between tasks or revise a rating once nominated. Therapists’ scores for each task and their total scores obtained during the first session were compared with the equivalent scores obtained during the second session to determine test-retest reliability. Therapists’ scores for digitally mirrored videos were compared with each other to assess intrarater reliability. The reliability of the different data sets was determined using typical errors,15 mean change scores, pairwise comparisons and intraclass correlations (and corresponding 95% CI).
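As an illustration of two of the reliability statistics named above, the following Python sketch computes the mean change score and the typical error for paired ratings. It assumes the common definition of typical error as the standard deviation of the difference scores divided by the square root of 2; whether the authors used exactly this form is an assumption, and the function names are introduced for this sketch only.

```python
import math
import statistics


def _differences(first, second):
    """Second-session minus first-session score for each paired rating."""
    if len(first) != len(second):
        raise ValueError("both sessions must contain the same number of ratings")
    return [b - a for a, b in zip(first, second)]


def mean_change(first, second):
    """Mean test-retest difference (systematic change between sessions)."""
    return statistics.mean(_differences(first, second))


def typical_error(first, second):
    """Typical error, assumed here to be SD of the differences divided by sqrt(2)."""
    return statistics.stdev(_differences(first, second)) / math.sqrt(2)
```

A mean change close to zero suggests little systematic drift between sessions, whereas the typical error describes the random variation expected in a single rating.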

Results

The test-retest reliability for each of the tasks rated across the two occasions is shown in Table 1. The intraclass correlation coefficients ranged from 0.79 to 0.98 (95% CI ranged from 0.72 to 0.96). The mean difference for each task was 0.5 points or less (95% CI ranged from −0.4 to 0.7). There were, however, small systematic test-retest differences for the two tasks presented first in the software package (that is, the key and coin tasks). The mean (95% CI) test-retest difference for the total score was 1.5 points (−0.1 to 3.1). The intrarater reliability from the comparison of digitally mirrored videos is shown in Table 2. The mean difference between ratings of digitally mirrored videos ranged from −0.2 to 0.3 points (95% CI ranged from −0.5 to 0.9). These differences were slightly less than the test-retest variability.

Table 1 Results from phase 2: Test-retest reliability
Table 2 Results from phase 2: Intra-rater reliability

Phase 3: Interrater reliability and internal consistency (from real-life performances)

Methods

Interrater reliability and internal consistency of scores were assessed by asking six therapists to simultaneously watch and rate real-life AuSpinal assessments administered by one of the investigators. This was carried out on four participants (eight hands) with bilateral C6 or C7 motor complete lesions and a median (interquartile range) time since injury of 7 years (3–12 years). The six therapists had varying levels of experience dealing with SCI, ranging from 6 months to 15 years. Therapists’ scores were recorded in paper format and without consultation. Therapists’ scores for each task and their total scores were compared to determine interrater reliability using pairwise comparisons. Therapists’ scores of each of the seven tasks for the four participants (eight hands) were analyzed using Cronbach's α. This is a test of internal consistency and was used to determine whether the tasks were assessing similar or different domains.
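For readers unfamiliar with Cronbach's α, the sketch below shows the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of the summed scores), where k is the number of tasks. It is a minimal illustration rather than the authors' analysis code; the function name and the row-per-hand data layout are assumptions.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per hand and one column per task."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two tasks are needed")
    if any(len(row) != k for row in rows):
        raise ValueError("every row must contain a score for every task")
    # Sample variance of each task (column) across hands.
    item_variances = [statistics.variance(column) for column in zip(*rows)]
    # Sample variance of the summed score across hands.
    total_variance = statistics.variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("summed scores do not vary, so alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Values approaching 1 indicate that the tasks rise and fall together across hands, which is why a very high α can also hint that some tasks are redundant.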

Results

The interrater reliability indicated by the pairwise comparisons for each of the tasks is shown in Table 3. Three of the seven therapists had perfect concordance across total scores for all eight hands. The means (s.d.) of the differences across all pairwise comparisons in total scores were small. Differences between therapists were dependent on the overall performance of the participant. That is, there was better concordance between therapists in participants with good hand function than in participants with poor hand function. For example, the coefficient of therapists’ concordance in participants who scored above 60/86 was less than 1.2%. This equates to a one- to two-point difference in those scoring between 60/86 and 86/86. The coefficient of therapists’ concordance in participants who scored less than 60/86 was approximately 15%, with the greatest variance being for the two participants with the poorest hand function. The overall Cronbach's α-value was 0.93, indicating a very high concordance with tasks assessing a common domain. This result suggests that all seven tasks of the AuSpinal target a common element of unilateral hand function.

Table 3 Results from phase 3: Inter-rater reliability

Phase 4: Validity: range of scores (from cross-sectional analysis)

Methods

The aim of the fourth phase was to explore the relative difficulty of the seven AuSpinal tasks by examining the range of scores in a diverse convenience sample. In total, 26 participants (50 hands) were recruited; some were undergoing initial rehabilitation (n=6) and others were living in the community many years after injury (n=20). The median (interquartile range) time since injury and age were 8 years (1–17) and 44 years (37–57), respectively. Participants had ASIA Impairment Scale A (n=7), B (n=6), C (n=6) or D (n=7) lesions.

Results

The median score for each task of the AuSpinal ranged from 9 to 13 (see Figures 1 and 2). The corresponding interquartile ranges extended from 6 to 14. The key and can tasks showed the greatest spread of scores. The candy and phone tasks showed the least spread, with clumping around top scores.
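The summary statistics reported here (median and interquartile range per task) can be reproduced with the Python standard library, as sketched below. Quartile estimates depend on the interpolation method; this sketch assumes the "inclusive" method, which may differ from the one used by the authors' statistical software, and the function name is hypothetical.

```python
import statistics


def median_and_iqr(scores):
    """Return (median, lower quartile, upper quartile) for one task's scores."""
    if len(scores) < 2:
        raise ValueError("at least two scores are needed to estimate quartiles")
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return statistics.median(scores), q1, q3
```

A task whose upper quartile sits at the maximum possible task score shows the kind of clumping around top scores described above.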

Figure 1

Items of the AuSpinal, including an Australian 20 cent piece coin (diameter=28.52 mm; thickness=2.5 mm; weight=11.3 g), credit card, key, desk telephone, can of drink (375 ml), chocolate-coated candy (1.3 g) and nut (5/8th inch diameter). The stopwatch is used to limit the time spent attempting a task. It is not used to record the time taken to complete a task.

Figure 2

Results from phase 4. Median (interquartile range) total scores for each task of the AuSpinal in 50 hands (26 participants). The total possible score for each task is indicated in brackets.

Phase 5: Validity: sensitivity to change over time (from longitudinal analysis)

Methods

The fifth phase of the study involved repeat assessments of a subset of eight participants (16 hands) to examine change over time. The participants were admitted to an in-patient SCI rehabilitation unit after injury. The AuSpinal was administered on admission to, and discharge from the unit, with a median time (interquartile range) between assessments of 14 weeks (12–15). The median (interquartile range) time since injury was 54 days (42–83). Participants had ASIA A (n=1) or ASIA D (n=7) lesions.

Results

The AuSpinal scores at admission and discharge are shown in Figure 3. There was an obvious and marked change over time in 8 of 16 hands, with little change in the remaining 8 hands. There was a ceiling effect, with 7 of 16 hands attaining top scores by discharge.
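The two quantities described in this paragraph, change per hand and the number of hands at ceiling, amount to simple arithmetic on paired total scores. The sketch below assumes one admission and one discharge total per hand and uses the maximum total of 86 stated earlier; the function names are introduced for illustration only.

```python
MAX_TOTAL = 86  # maximum possible total AuSpinal score, as stated in the text


def change_scores(admission, discharge):
    """Discharge minus admission total score for each hand."""
    if len(admission) != len(discharge):
        raise ValueError("each hand needs both an admission and a discharge score")
    return [d - a for a, d in zip(admission, discharge)]


def hands_at_ceiling(discharge, max_total=MAX_TOTAL):
    """Count the hands that reached the maximum total score at discharge."""
    return sum(1 for score in discharge if score >= max_total)
```

A hand that already scores near 86 at admission has little room to show improvement, which is how a ceiling effect limits sensitivity to change.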

Figure 3

Results from phase 5. The total AuSpinal score at admission and discharge for the 16 hands (eight participants). Each solid line indicates the change in AuSpinal score for one hand.

Discussion

Hand function in people with tetraplegia is often asymmetric. The AuSpinal was therefore developed specifically to quantify unilateral hand function in people with tetraplegia. The seven tasks were selected from an array of existing hand assessments and modified to ensure they were appropriate for this population and were sensitive to change in people with poor hand function. Developing a sensitive unilateral test was considered important for future clinical trials designed to determine the effectiveness of different conservative approaches to hand management, especially for trials using a within-participant design, in which one hand of each participant functions as a control for the other treated hand.

The results of this study indicate that the AuSpinal has good to excellent test-retest, interrater and intrarater reliability. This reliability was reflected by the high degree of concordance between different therapists’ ratings of the same task during both video and real-life assessments, and by the high degree of concordance between the same therapists’ ratings of digitally mirrored videos. The data indicate that an increase of one or more points on a task is likely to reflect a real change in performance. Good concordance was obtained between therapists across multiple centers, which may reflect the forced dichotomized decision-making protocol (that is, yes or no) for each of the task subcomponents. This removes ambiguities and, unlike a similar hand test,13 places less emphasis on the type of grasp adopted. The concordance of the total AuSpinal scores between different therapists was better for individuals with good hand function than for individuals with poor hand function. This may reflect some ambiguities in the original testing instructions provided to therapists. For example, not all therapists dealt in the same way with patients dropping or placing items. In addition, some therapists were stricter than others when scoring tasks that required participants to hold an item vertically. These issues have been addressed by minor language changes in the revised instructions (see Appendix A). Concordance between therapists may be improved with formal training. However, taken together, our findings suggest that the AuSpinal is robust when repeated across centers and by different therapists.

There was a small systematic increase of 0.5 and 0.4 points for the ratings of the key and coin tasks by the same therapists, respectively, as scored from the videos on the two separate occasions. It is unclear whether differences as small as these are clinically important, although it would seem unlikely. These differences may have been in part due to therapists’ unfamiliarity with the computer-rating system. The key and coin tasks were presented to therapists first and it is possible that at the beginning of the first session they rated these two tasks differently, compared with subsequent tasks, and differently than when viewed on the second occasion. Future studies could guard against this order effect by randomizing the order in which tasks are presented and by providing training in the computer-rating system.

The concurrent viewing of eight real-life performances by seven therapists was included to mimic clinical practice. Unlike the video assessments, therapists could view each performance only once in real time. This may have limited the ability of therapists to provide a score in situations in which they had not observed a critical feature during the one-off performance. However, the reliability of real-life assessments was found to be similar to that of video assessments. That is, there was a high level of agreement between therapists’ ratings from both video and real-life observations. This fidelity suggests that the scoring system is robust, a critical feature for longitudinal studies, in which hand assessments are commonly determined by different therapists.

The seven tasks of the AuSpinal measure the same domain of hand function as reflected in the high Cronbach α-coefficient. This is not unexpected and suggests that some tasks may be redundant. That is, a shorter version incorporating fewer tasks may yield the same information as the current version incorporating seven tasks. This issue is currently being investigated. The psychometric properties of the AuSpinal also require further investigation. The preliminary cross-sectional (phase 4) and longitudinal (phase 5) data suggest that the AuSpinal has a ceiling effect and requires the addition of some more difficult tasks to cater to those with better hand function. For example, 7 of 50 hands received a maximal score in the cross-sectional study. These patients represented a sample of convenience, but probably provide a reasonable estimate of the population at large. The ceiling effect was not totally unexpected because, initially, the emphasis was on designing an assessment tool appropriate for patients with typical motor complete C6 and C7 lesions. The scope was expanded as the project progressed. The obvious solution is to add a few difficult tasks from one of the currently available hand assessments. We are currently exploring this option in a cohort of 68 recently injured patients, with the hope of adding one or two tasks modified from the ARAT5 to the AuSpinal.

Conclusion

The AuSpinal is a good, simple and quick measure of hand function. The results of this study indicate that the AuSpinal has face validity and good test-retest, intrarater and interrater reliability. It caters better to patients with limited hand function, but with the addition of harder tasks may prove to be useful for all people with tetraplegia.