INTRODUCTION

Groin pain is common in sport, affecting up to 50 % of football and ice hockey players during a season.1–3 Low hip adduction strength increases the risk of adductor-related injuries in male football players,4,5 whereas low eccentric hip adduction strength relative to abduction strength is associated with adductor muscle strains in ice hockey players.6 In addition, a decrease in hip adduction strength precedes groin pain,7 and is also observed in football players with prior-season longstanding groin pain >6 weeks.8 Therefore, researchers have proposed evaluation of hip adduction and abduction strength, including the adduction/abduction ratio, as a clinical screening tool to detect early groin pain, aid secondary prevention strategies,9 and guide readiness to play following groin injuries.10–12

Simple and reliable methods to measure unilateral hip adduction and abduction strength using hand-held dynamometry were described in 2010 by Thorborg et al.13 Since then, several additional studies have confirmed these findings, highlighting the clinical usefulness of handheld dynamometry.14–17 One of the advantages of handheld dynamometry is its ease of use without the need for a comprehensive setup. However, for valid and reliable measures, the tester must possess enough strength relative to the force output of the patient or athlete to be able to fixate the dynamometer.18 If this is not the case, the dynamometer will move, indicating that concentric rather than maximal isometric strength is obtained, leading to a lower strength output.19 Thorborg et al.18 showed that inadequate upper body strength of the tester led to lower measured maximal isometric strength compared with a tester with high upper body strength. One way to overcome this is to externally fixate the handheld dynamometer against a wall14 or with a rigid belt,15,20 which has been applied successfully in several studies. Such a setup may not always be feasible, which calls for alternative ways of estimating hip adduction and abduction strength in strong athletes without the need for external fixation.

Healthcare professionals working with athletes have widely adopted the bilateral hip adduction squeeze test in various forms,21 which provides a gross measure of hip adduction strength and function.11,22,23 The long lever hip adduction squeeze test elicits high hip adduction torque,24 and is thus considered appropriate to measure maximal strength17 and to stress the groin tissue.25 The long lever hip adduction squeeze test has previously been applied to obtain strength profiles of both elite football26 and ice hockey players,11 and to aid secondary prevention of groin injuries.9 Previous studies have only established intra-tester reliability (ICC: 0.92; SEM %: 4.3).17 Similarly, bilateral hip abduction strength can be measured using the long lever press test, a novel test which has been utilized in elite ice hockey players.11 However, determination of intra- and inter-tester reliability is lacking in the literature.

Assessment of rate of force development (RFD), the ability to rapidly produce force over a short period of time,27 may provide value in the management of patients with hip and groin pain beyond measures of maximal strength. In patients who have undergone hip arthroscopy for femoroacetabular impingement syndrome, hip flexion rate of torque development was lower in the operated compared to the non-operated hip despite maximal hip flexion strength being normalized,23 a tendency which is also present in other types of musculoskeletal pain conditions.28 Therefore, assessment of hip adductor and abductor RFD may aid return-to-play decisions, as sports activities such as kicking, skating, and change of direction rely on rapid force development.29–33 Good test-retest reliability of bilateral hip adduction and abduction RFD has been observed using the GroinBar testing system (Vald Performance, Albion, Australia) and a user-independent portable dynamometer.34,35 Both studies applied a short lever arm by placing the force pads between the knees rather than the ankles, thereby limiting the torque production across the hip joint17 and compromising detection of groin pain.25

The aim of the present study was to assess intra- and inter-tester reliability of maximal and explosive strength during the long lever hip adduction squeeze test and the long lever hip abduction press test in healthy adults using a hand-held dynamometer.

METHODS

Study design and subjects

Intra- and inter-tester reliability of maximal isometric strength and rate of force development over 0-100 ms (RFD100) and 0-200 ms (RFD200) was assessed with a hand-held dynamometer during the long lever hip adduction squeeze test17 and a novel long lever bilateral hip abduction press test using a rigid belt.11 All subjects were included by convenience sampling from two different settings. Subjects for the intra-tester part were included from the Physiotherapy Department at Metropolitan University College, Denmark, whereas subjects for the inter-tester part were included from sub-elite sports clubs in the Capital Region of Denmark (Figure 1). All subjects were between 18 and 40 years old, and subjects included from sports clubs for the inter-tester reliability part had to be injury-free at the time of testing. Exclusion criteria for all subjects were any current pain in the hip and groin region, knee, or low back considered to influence their ability to exert a maximal and rapid muscle contraction.

Figure 1. The flow of participants.

The reporting adheres to the Proposed Guidelines for Reporting Reliability and Agreement Studies (GRRAS)36 and approval by the Ethics Committee of the Capital Region, Denmark (16041360) was obtained prior to commencement. All subjects gave their written informed consent in accordance with the Declaration of Helsinki.

Testers

All three testers (one for the intra-tester part and two for the inter-tester part) were final-year physiotherapy students. The tester involved in the intra-tester part had half a year of previous experience with handheld dynamometers, but not specifically with the tests applied in this study, whereas the testers for the inter-tester part had no experience with handheld dynamometry before being involved in this study. All testers received 1-2 hours of practice supervised by LI and KT, followed by self-practice for a maximum of one week. LI and KT approved all testers prior to data collection.

Data Collection

The force signal for bilateral hip adduction and abduction was recorded with a hand-held dynamometer (HHD) at a sampling frequency of 100 Hz (Hoggan microFET2, Hoggan Scientific L.L.C, Salt Lake City UT, USA). From this signal, maximal isometric strength and rate of force development for 0-100 ms and 0-200 ms (detailed below) were determined. The test procedure for the long lever squeeze test followed a standardized reliable set-up (ICC2.1: 0.90-0.97).17 With the subject in supine position and hips and knees straight (0º flexion), the dynamometer was placed 5 cm proximal to the most prominent point on the medial malleolus of the dominant leg. The participant’s legs were slightly abducted, corresponding to the length of the tester’s forearm and the dynamometer. The forearm and the HHD were placed between the ankles, and the subject was subsequently instructed to perform a bilateral hip adduction squeeze. The bilateral hip abduction press test was conducted with the subject in supine position. The HHD was placed 5 cm proximal to the most prominent point of the lateral malleolus and fixated with a rigid belt around the legs. The participant’s legs were slightly abducted, corresponding to the position of the hip adduction squeeze test. The subject was instructed to press against the rigid belt placed around the ankles and the HHD by performing a bilateral hip abduction. During both tests, subjects were instructed to push or press as “fast and hard” as possible, and to keep pushing or pressing until instructed to relax (approximately 3-4 s).27

Both tests consisted of two submaximal trials at 50% and 100% of self-perceived maximum effort, followed by three valid trials, separated by one-minute rest. After each trial subjects were asked to score pain in the hip and groin on an 11-point Numerical Rating Scale (NRS 0-10). The test was terminated in case of pain >3. For the inter-tester part, the two testers were blinded to strength measures obtained by the other tester.

The force data were transmitted from the HHD to a commercial software program (TBS, Hoggan Scientific L.L.C., Salt Lake City, USA) and extracted to a custom-made spreadsheet (Microsoft Excel, USA) for analyses.14 Force was recorded in Newton (N). RFD100 (0-100 ms) and RFD200 (0-200 ms) were calculated as the mean change in N per second in each time interval, with the onset of force (t=0 ms) set at 6.7 N above baseline force.27 Maximal isometric strength was determined as the peak value.
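The calculation described above can be sketched as follows. This is a minimal illustration and not the custom-made spreadsheet used in the study: the function name, the way baseline force is estimated (mean of the first samples), and the rule that onset is the first sample at or above baseline + 6.7 N are assumptions made for the example. The 100 Hz sampling frequency and the 6.7 N threshold are taken from the text.

```python
def rfd_from_trace(force_n, fs_hz=100.0, baseline_samples=10, onset_threshold_n=6.7):
    """Return (peak_force, rfd100, rfd200) for one trial.

    force_n: force samples in Newton, recorded at fs_hz.
    Baseline is ASSUMED to be the mean of the first `baseline_samples` samples.
    """
    baseline = sum(force_n[:baseline_samples]) / baseline_samples
    onset_level = baseline + onset_threshold_n

    # Onset (t = 0 ms): first sample at or above baseline + threshold (assumption).
    onset = next((i for i, f in enumerate(force_n) if f >= onset_level), None)
    if onset is None:
        raise ValueError("force never exceeded the onset threshold")

    def mean_slope(window_ms):
        # Mean change in force per second from onset to onset + window.
        n_samples = int(round(window_ms / 1000.0 * fs_hz))
        end = onset + n_samples
        if end >= len(force_n):
            raise ValueError("trace too short for the requested window")
        return (force_n[end] - force_n[onset]) / (window_ms / 1000.0)

    return max(force_n), mean_slope(100), mean_slope(200)


# Synthetic trace: 100 ms of rest, a 1000 N/s ramp to 300 N, then a plateau.
trace = [0.0] * 10 + [10.0 * k for k in range(1, 31)] + [300.0] * 20
peak, rfd100, rfd200 = rfd_from_trace(trace)
```

At 100 Hz the 0-100 ms window spans only ten sample intervals, so a one-sample shift in the detected onset can change the early-phase slope noticeably in a real recording.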

The sequence of measurements and testers were randomized, and an identical sequence was used during the retest session. Subjects rested 15 minutes between the test-retest sessions.

Statistical methods

Systematic bias between sessions (intra-tester) and between testers (inter-tester) was assessed as differences in mean values using paired t-tests with a significance level set at p<0.05. Relative reliability was assessed as the Intraclass Correlation Coefficient (ICC) with a two-way random effects model and absolute agreement definition using the “irr” package in R. The absolute agreement definition was chosen because the resulting ICC value reflects any systematic variation between testers/sessions, which was relevant because systematic bias was detected.36 The relative reliability was interpreted as poor (ICC<0.50), moderate (0.50≤ICC≤0.75), and good (ICC>0.75).37 Absolute reliability was expressed as 1) the standard error of measurement (SEM), calculated as $\mathrm{SEM} = SD_{\mathrm{pooled}} \times \sqrt{1 - \mathrm{ICC}}$,38 and 2) SEM%, calculated as $\mathrm{SEM\%} = (\mathrm{SEM} / \mathrm{mean}_{\mathrm{pooled}}) \times 100$.39 Minimal detectable change % (MDC%) was calculated using SEM%, both at the individual level ($\mathrm{MDC}_{\mathrm{ind}}\% = \mathrm{SEM\%} \times 1.96 \times \sqrt{2}$) and at the group level ($\mathrm{MDC}_{\mathrm{group}}\% = \mathrm{SEM\%} \times 1.96 \times \sqrt{2} / \sqrt{n}$, where n is the sample size).38,40 Bland-Altman plots were constructed using the “BlandAltmanLeh” package in R. All statistical analyses were performed in R Studio (v. 3.6.1).
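To make the reliability statistics above concrete, the following sketch computes a single-measure, two-way random effects, absolute agreement ICC from the standard ANOVA mean squares, together with SEM, SEM% and MDC%. It is an illustrative Python re-implementation and not the R code used in the study; all function names are hypothetical, and the pooled SD and pooled mean are taken as inputs because the text does not specify how they were pooled.

```python
from math import sqrt
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1), absolute agreement; `ratings` holds one row per subject."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between testers/sessions
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    # Systematic tester/session differences (msc) enter the denominator,
    # so a constant offset between testers lowers the coefficient.
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def sem(sd_pooled, icc):
    return sd_pooled * sqrt(1 - icc)


def sem_percent(sem_value, mean_pooled):
    return sem_value / mean_pooled * 100


def mdc_ind_percent(sem_pct):
    return sem_pct * 1.96 * sqrt(2)


def mdc_group_percent(sem_pct, n):
    return mdc_ind_percent(sem_pct) / sqrt(n)


# Toy data (not study data): tester 2 always reads one unit higher than tester 1.
offset_icc = icc_2_1([[1, 2], [2, 3], [3, 4]])
perfect_icc = icc_2_1([[1, 1], [2, 2], [3, 3]])
```

In the toy data the two testers rank the subjects identically, yet the constant one-unit offset pulls the coefficient well below 1, which is the behaviour the absolute agreement definition is meant to capture. Applying `mdc_ind_percent` and `mdc_group_percent` to the rounded SEM% values in the tables reproduces the tabulated MDC values to within rounding.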

RESULTS

Participants

Fifty-four subjects were recruited. Four subjects were excluded due to technical errors in the software program used for data collection, and one subject was excluded due to pain during testing that affected performance. Therefore, a total of 49 subjects were included. Twenty subjects (males: 10; females: 10, mean age ± SD: 25.5 ± 4.2 years, mean body mass ± SD: 75 ± 12.3 kg, mean height ± SD: 177.7 ± 12.8 cm) were included for intra-tester reliability. Due to loss of data, twenty subjects were included for abduction testing, whereas 19 subjects were included for adduction testing. Twenty-nine male subjects (mean age ± SD: 23.5 ± 5.5 years, mean body mass ± SD: 89 ± 11.0 kg, mean height ± SD: 187.5 ± 12.5 cm) were included for inter-tester reliability (Figure 1).

Intra-tester reliability

No systematic bias between sessions was observed (p≥0.62). Good intra-tester reliability was observed for peak force and RFD200 for both hip adduction squeeze and hip abduction press (ICC 0.83-0.97), whereas only hip abduction press showed good reliability for RFD100 (ICC 0.76) (Table 1). The absolute intra-tester reliability (SEM %) for peak force, RFD100, and RFD200 ranged from 3.9-7.7 %, 16.9-25.2 %, and 10.5-11.8 %, respectively (Table 1). MDCind % for peak force, RFD100, and RFD200 ranged from 10.9-21.2 %, 47.0-69.9 %, and 29.0-32.7 %, respectively (Table 1). Bland-Altman plots are depicted in Figure 2.

Table 1. Intra-tester reliability of peak force and rate of force development for hip adduction (n=19) and hip abduction (n=20)

| Measure | Hip action | Session 1 mean (SD) | Session 2 mean (SD) | Difference session 1 - session 2, mean [CI 95 %] | Paired t-test p-value | ICC (2.1)* [CI 95 %] | SEM | SEM % | MDCind (%) | MDCgroup (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| MVC (N) | HADD | 175.3 (48.3) | 173.5 (47.3) | 1.8 [-7.3; 11.0] | 0.68 | 0.92 [0.81; 0.97] | 13.3 | 7.7 | 21.2 | 4.9 |
| MVC (N) | HABD | 156.2 (33.7) | 156.4 (37.8) | -0.2 [-4.6; 4.2] | 0.91 | 0.97 [0.92; 0.99] | 6.1 | 3.9 | 10.9 | 2.5 |
| RFD 0-100 (N/s) | HADD | 700.0 (300.7) | 708.0 (282.6) | -7.9 [-132.6; 116.7] | 0.89 | 0.62 [0.23; 0.84] | 177.5 | 25.2 | 69.9 | 16.0 |
| RFD 0-100 (N/s) | HABD | 844.1 (327.4) | 838.7 (257.9) | 5.3 [-91.3; 102.0] | 0.91 | 0.76 [0.49; 0.90] | 142.5 | 16.9 | 47.0 | 10.8 |
| RFD 0-200 (N/s) | HADD | 535.2 (179.3) | 526.8 (159.1) | 8.5 [-36.1; 53.1] | 0.69 | 0.86 [0.67; 0.94] | 62.6 | 11.8 | 32.7 | 7.5 |
| RFD 0-200 (N/s) | HABD | 574.0 (161.3) | 581.6 (134.8) | -7.6 [-48.8; 33.6] | 0.70 | 0.83 [0.62; 0.93] | 60.5 | 10.5 | 29.0 | 6.7 |

MVC (Maximal voluntary contraction); N (Newton); N/s (Newton/second); ICC (Intraclass Correlation Coefficient); SEM (Standard Error of Measurement); MDCind (Minimal Detectable Change on an individual level); MDCgroup (Minimal Detectable Change on a group level); SD (Standard Deviation); HABD (hip abduction); HADD (hip adduction); RFD (Rate of force development); *ICC with absolute agreement definition between sessions.

Figure 2. Bland-Altman plots for intra-tester reliability.

Inter-tester reliability

Systematic bias was observed between testers for all tests except the adduction squeeze MVC test (Table 2). Good inter-tester reliability was observed for peak force for both hip adduction squeeze and hip abduction press (ICC 0.91-0.93). For rate of force development, all measures showed moderate reliability (ICC 0.50-0.75) (Table 2). The absolute inter-tester reliability (SEM %) for peak force and RFD measures ranged from 5.7-6.2 % and 10.8-18.3 %, respectively, while MDCind % ranged from 15.8-17.1 % and 30.0-50.8 % for peak force and RFD measures, respectively (Table 2). Bland-Altman plots are depicted in Figure 3.

Table 2. Inter-tester reliability of peak force and rate of force development for hip adduction and hip abduction (n=29)

| Measure | Hip action | Tester 1 mean (SD) | Tester 2 mean (SD) | Difference Tester 1 - Tester 2, mean [CI 95 %] | Paired t-test p-value | ICC (2.1)* [CI 95 %] | SEM | SEM % | MDCind (%) | MDCgroup (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| MVC (N) | HADD | 278.3 (57.6) | 274.2 (62.3) | 4.1 [-4.7; 12.9] | 0.35 | 0.93 [0.85; 0.96] | 15.7 | 5.7 | 15.8 | 2.9 |
| MVC (N) | HABD | 190.1 (38.1) | 178.9 (36.8) | 11.2 [6.4; 16.1] | <0.01 | 0.91 [0.56; 0.97] | 11.4 | 6.2 | 17.1 | 3.2 |
| RFD 0-100 (N/s) | HADD | 1511.2 (360.7) | 1245.3 (305.7) | 265.8 [159.1; 372.6] | <0.01 | 0.50 [-0.01; 0.77] | 252.8 | 18.3 | 50.8 | 9.4 |
| RFD 0-100 (N/s) | HABD | 1187.7 (308.6) | 1055.6 (283.0) | 132.1 [44.5; 219.6] | <0.01 | 0.64 [0.30; 0.82] | 180.6 | 16.1 | 44.6 | 8.3 |
| RFD 0-200 (N/s) | HADD | 1005.3 (206.2) | 900.8 (195.7) | 104.5 [62.4; 146.6] | <0.01 | 0.75 [0.19; 0.91] | 103.0 | 10.8 | 30.0 | 5.6 |
| RFD 0-200 (N/s) | HABD | 759.9 (167.1) | 704.5 (162.2) | 50.4 [7.1; 93.6] | 0.02 | 0.73 [0.48; 0.87] | 85.8 | 11.8 | 32.6 | 6.1 |

MVC (Maximal voluntary contraction); N (Newton); N/s (Newton/second); ICC (Intraclass Correlation Coefficient); SEM (Standard Error of Measurement); MDCind (Minimal Detectable Change on an individual level); MDCgroup (Minimal Detectable Change on a group level); SD (Standard Deviation); HABD (hip abduction); HADD (hip adduction); RFD (Rate of force development); *ICC with two-way random effects and absolute agreement definition.

Figure 3. Bland-Altman plots for inter-tester reliability.

DISCUSSION

This study introduces a new and reliable way to measure maximal and explosive bilateral hip abduction strength with a hand-held dynamometer in a test setup unaffected by the tester’s strength. The current findings show that maximal isometric strength can be reliably measured within and between testers with low measurement error in both tests, while late-phase RFD (0-200 ms) for both tests also showed good intra-tester reliability. All remaining RFD measures showed moderate reliability but with high imprecision, based on confidence intervals crossing the threshold of <0.50 that signifies poor reliability and on wide Limits of Agreement. This suggests that measures of maximal isometric hip adduction and abduction strength can be obtained by different testers, whereas late-phase RFD should only be obtained by the same tester, as the confidence interval for inter-tester reliability is too wide to ensure reliable measurements. Early-phase RFD shows large measurement error regardless of whether it is obtained by the same or different testers, and thus provides little to no utility.

Maximal isometric strength

In elite sport settings there is often a need to efficiently measure athletes’ lower body strength in a short time frame, such as during periodic testing.11,41 This can be achieved by using handheld dynamometers, which easily accommodate various testing setups, such as unilateral hip adduction (ICC: 0.93) and abduction (ICC: 0.97).13 In strong athletes, external fixation may be needed, depending on the tester’s upper body strength, to obtain valid measures,15,18 but this is not always feasible due to time constraints when large cohorts need testing. An alternative to unilateral testing of the hip is the long lever hip adduction squeeze test and the bilateral hip abduction press test. Both tests are quick and can easily be applied even in strong athletes without the need for an external fixation setup, thus providing a feasible way of measuring hip adduction and abduction strength in athletes. The current findings for intra-tester reliability of the long lever squeeze test are consistent with previous findings reported by Light et al. (ICC: 0.92; SEM %: 4.3).17 Other hip squeeze test variations have shown similar reliability using either a sphygmometer (ICC: 0.81-0.94; SEM %: 1.60-3.27)42,43 or the Groinbar (ICC: 0.85-0.94; SEM %: 8.2).34,44 The inter-tester reliability of the long lever squeeze test has not previously been examined, but the current findings are consistent with those reported when using other variations of the squeeze test (short lever squeeze test, ICC: 0.91-0.92)45,46 and peak force measured unilaterally (ICC: 0.92-0.94).15,47 This study is the first to examine intra- and inter-tester reliability of the long lever hip abduction press test. The current data are comparable to previous literature reporting on reliability for hip abduction strength testing, using a bilateral short-lever test in the Groinbar (ICC: 0.82)34 or performed in a user-independent portable device (ICC: 0.91).35

Rate of Force development

The present study is the first to establish reliability data on RFD measures in the long-lever squeeze test and the novel hip abduction press test. These data indicate that late-phase RFD (0-200 ms) showed good intra-tester reliability for both tests, while the remaining RFD measures were less reliable, with mostly moderate ICC values and wide confidence intervals. Two previous studies have reported reliability for RFD measured during bilateral hip adduction squeeze and abduction press tests; both studies applied a short lever position, precluding direct comparison with the present study. Desmyttere et al. found moderate to good intra-tester reliability of peak RFD using a 200 ms moving average during bilateral short lever adduction and abduction testing using the Groinbar (ICC 0.81 [95 % CI: 0.65-0.90] and 0.68 [95 % CI: 0.42-0.83]).34 In contrast to the present study, good reliability has been reported for early-phase RFD (0-100 ms) using a user-independent device to measure bilateral hip adduction and abduction strength in the short lever position, while early-phase RFD has also shown good reliability in both abduction and adduction when performed unilaterally.14 These discrepancies in measurement properties may be explained by different set-ups, but suggest that, if explosive bilateral hip adduction and abduction strength in the long-lever position is of interest across athletes or over time, it should be obtained by the same tester and only late-phase RFD should be considered. The lower reliability for early-phase RFD in our study could be explained by the timeframe of 0-100 ms being too short to coordinate a simultaneous fast contraction with both legs.

Application to clinical practice

The procedure examined in this study is feasible for clinical practice as a quick and simple method for testing maximal isometric strength and late-phase explosive strength in strong athletes without potential bias related to tester size or strength.15,18 In research settings, evaluation of changes is often of interest at a group level. The MDCgroup % reported in the present study is calculated based on 20 participants for the intra-tester part and 29 participants for the inter-tester part. Using an equivalent sample size, changes at group level exceeding 5 % for peak force and 7.5 % for late-phase RFD (0-200 ms) can be detected with 95 % certainty,40 when using both a single- and multiple-tester setup. When applied to individual patients or athletes in the clinical setting, changes would have to exceed ~10-20 % for peak force and ~30 % for late-phase RFD (0-200 ms) to be detected with 95 % certainty. This is further indicated by the width of the Limits of Agreement in the Bland-Altman plots, suggesting that high uncertainty between two measures is expected. Although these numbers are high, they should be considered in relation to findings in injured athletes. As an example, prior-season groin pain lasting more than six weeks is related to an average 19 % decrease in peak strength during the long lever squeeze test in currently uninjured players.8 For explosive muscle strength, even larger deficits may be expected; Nunes et al. observed a decrease in explosive strength of 33 % in the hip abductors in females with patellofemoral pain.48 However, explosive hip adduction and abduction strength and its relation to groin pain have not yet been established.
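The individual-level threshold discussed above amounts to a simple comparison, sketched here in Python. The function names and the athlete values are hypothetical and serve only as an illustration; the 21.2 % threshold is the intra-tester MDCind % for the hip adduction squeeze reported in Table 1.

```python
def percent_change(baseline, follow_up):
    """Relative change from baseline, in percent (negative means a decrease)."""
    return (follow_up - baseline) / baseline * 100


def exceeds_mdc(baseline, follow_up, mdc_percent):
    """True if the absolute percent change is larger than the MDC threshold."""
    return abs(percent_change(baseline, follow_up)) > mdc_percent


# Hypothetical athlete: squeeze peak force measured twice by the same tester.
MDC_IND_HADD_INTRA = 21.2  # percent, from Table 1

small_drop = exceeds_mdc(280.0, 230.0, MDC_IND_HADD_INTRA)  # about -17.9 %
large_drop = exceeds_mdc(280.0, 210.0, MDC_IND_HADD_INTRA)  # exactly -25 %
```

Under these assumed values, the first decrease stays within measurement error for a single athlete, whereas the second exceeds it.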

Methodological considerations and Limitations

A limitation of the present study is the systematic bias observed in the inter-tester data, which could have been affected by dynamometer placement, instruction, or encouragement during testing. However, it might be possible to minimize these differences by further standardization of testing procedures and more focus on tester calibration. Since the adductor muscle force angle depends on the tester’s forearm length, this may also contribute to the systematic bias, thus influencing inter-tester reliability. However, the authors did not collect data related to forearm length, so it cannot be concluded whether this affected the reliability in the present study. A further limitation is that only healthy subjects were included. Thus, future studies should include athletes with groin pain to understand whether groin pain affects reliability.

CONCLUSION

Assessment of maximal isometric strength in the hip adduction squeeze and abduction press tests showed good intra- and inter-tester reliability, whereas only 0-200 ms rate of force development demonstrated good intra-tester reliability in both tests. Therefore, rate of force development should preferably be assessed by the same tester, while this is less important for isometric peak force.


Conflicts of interest

The authors report no conflicts of interest.