Design of the study
The study was a parallel-group, single-blinded, randomized controlled trial. The intervention period was 20 weeks, with pre-intervention baseline measurements and post-intervention follow-up measurements. A detailed protocol paper has been published previously. The trial was conducted within the RDAF from November 2013 to April 2014. All participants volunteered and gave their written informed consent before participation. The trial was approved by the local Ethics Committee of Southern Denmark (S-20120121) and qualified for registration in ClinicalTrials.gov (NCT01926262).
Participants and randomization
Fifty military helicopter pilots and fifty-eight crew-members from two different RDAF squadrons were invited to participate in the study. Participants were informed about the project at briefings, by email, and by telephone. Thirty-one pilots (2 female and 29 male) and thirty-eight crew-members (all male) agreed to participate. Inclusion criteria comprised: 1) profession as a helicopter pilot or crew-member (technician, systems operator, tactical helicopter observer and/or navigator), 2) maintaining operational flight status at enrollment, and 3) operational flying within the previous 6 months. The single exclusion criterion was participation in a training intervention during the previous 12 months. Participant flow is depicted in Fig. 1. Participants were assigned a random identification number at enrollment by an authorized person with no relation to the study. After pre-intervention assessments, participants were randomized 1:1 to either an exercise-training-group (ETG) or a reference-group (REF). Participants were stratified according to the following nested criteria to ensure comparability between the ETG and REF: 1) squadron (722 squadron or 724 squadron), 2) profession (pilot or crew-member), 3) age (< or ≥ 40 years of age), and 4) flying experience (< or ≥ 2500 h). The random identification numbers within each stratum were drawn from an opaque, tossed bag. Allocation alternated: the first number in the first stratum was allocated to either the ETG or REF depending on the flip of a coin, the first number in the second stratum was allocated to the opposite group from the last number in the previous stratum, and so forth. The randomization procedure was carried out by a blinded custodian (last author) using the random identification numbers assigned to the participants. The data analysts and the statistician were blinded to the group allocation of the participants.
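The stratified, alternating allocation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually run by the custodian: the field names (`id`, `squadron`, `profession`, `age`, `flying_hours`), the function names, and the order in which strata are processed are hypothetical. The sketch also reads "alternately" as meaning that allocation alternates for every number drawn, within and across strata.

```python
import random
from collections import defaultdict


def stratum_key(p):
    """Nested stratification criteria: squadron, profession, age, flying experience."""
    return (p["squadron"], p["profession"], p["age"] >= 40, p["flying_hours"] >= 2500)


def allocate(participants, rng):
    """Return {id: 'ETG' or 'REF'} using alternating allocation across strata."""
    strata = defaultdict(list)
    for p in participants:
        strata[stratum_key(p)].append(p["id"])

    # Coin flip decides the group of the first number in the first stratum.
    group = rng.choice(["ETG", "REF"])
    allocation = {}
    # Assumption: strata are processed in sorted key order (not stated in the text).
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)  # stands in for drawing numbers from an opaque, tossed bag
        for pid in ids:
            allocation[pid] = group
            # Alternate, so the next stratum starts with the opposite group
            # from the last number allocated in this stratum.
            group = "REF" if group == "ETG" else "ETG"
    return allocation


# Hypothetical participants, for illustration only.
demo = [
    {"id": 101, "squadron": 722, "profession": "pilot", "age": 35, "flying_hours": 1800},
    {"id": 102, "squadron": 722, "profession": "pilot", "age": 37, "flying_hours": 2100},
    {"id": 103, "squadron": 724, "profession": "crew-member", "age": 45, "flying_hours": 3000},
    {"id": 104, "squadron": 724, "profession": "crew-member", "age": 48, "flying_hours": 2600},
]
print(allocate(demo, random.Random(2013)))
```

Because the alternation carries over from one stratum to the next, the two groups can differ in size by at most one participant overall, which is consistent with the 1:1 allocation ratio.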
All methods have been previously described and will therefore only be explained briefly. Anthropometric measurements included height, seated height, weight, and body mass index (BMI) (Composition Analyzer, Tanita Corporation of America, USA).
Participants’ pressure-pain-threshold (PPT) was measured in the neck and shoulder muscles. PPT was measured bilaterally for the trapezius muscle (TRA) (20% medially to half the distance between the lateral edge of the acromion and the seventh cervical vertebra), the upper neck muscles (UNE) (2 cm laterally to the vertical line of the axis, level with the 4th cervical vertebra) [17, 18], and the anterior tibialis muscle (TIA) (as the point of reference). A handheld electronic pressure algometer was used (Type II Algometer, Somedic Production AB, Sweden). The algometer was pistol-shaped with a pressure-sensitive strain gauge at the tip. The contact area was 1 cm². Compression pressure was applied perpendicularly to the skin at a rate of 20 kPa/s. A digital display on the pressure algometer was used to keep the rate of pressure stable. Measurements were performed three times in a fixed order: 1) right TRA, 2) left TRA, 3) right UNE, 4) left UNE, and 5) right TIA. A rest period of approximately 1 min was given between measurements conducted on the same PPT point. Participants were given a handheld control switch and were instructed to press the switch immediately when the sensation of “pressure” changed to “pain”. When the switch was pushed, the compression was stopped and the pressure was released. A low level of pain sensitivity therefore equals a high PPT value, and a high level of pain sensitivity equals a low PPT value. The maximum applied pressure was recorded before resetting the algometer. A maximal pressure of 1000 kPa was allowed for the TRA and TIA and 700 kPa for the UNE. The algometer was calibrated before each test. Measurements were conducted by an experienced sports scientist.
An online questionnaire was administered to participants pre- and post-intervention. The questionnaire was kept confidential by using the assigned identification numbers. A modified version of the validated Nordic Musculoskeletal Questionnaire was used to assess the prevalence and intensity of musculoskeletal symptoms in the following body regions: neck and shoulders, upper back, elbows, low back, wrists/hands, hips/thighs, knees, and ankles/feet according to: 1) the number of days with pain or complaints in (body region) during the previous 12 months (possible answers: 0 days, 1–7 days, 8–30 days, 31–90 days, > 90 days, or every day), 2) inability to perform daily working tasks due to complaints in (body region) during the previous 3 months (possible answers: yes or no), 3) intensity of (body region) pain during the previous 3 months, assessed on a scale from 1 to 10 (10 = worst possible pain imaginable) to which 0 = no pain was added, resulting overall in an 11-point numeric box scale, and 4) intensity of (body region) pain during the previous 7 days, rated on the same box scale. All questions were accompanied by chart illustrations of the body region in focus. Furthermore, participants were asked about their total flying hours in fixed-wing aircraft, in rotary-wing aircraft, and with NVG. Lastly, the questionnaire included a number of health- and work-related questions.
Participants randomized to the ETG received an additional questionnaire regarding: 1) their motivation to train, 2) expectations, 3) training adherence, and 4) adverse training effects. Training adherence was measured by asking: “You were instructed to train 3x20 min a week. How did you succeed?” Possible answers were: “I trained regularly 2-3 times a week”; “I trained regularly 1-2 times a week”; “I trained irregularly, but at least 4 times a month (approximately once a week)”; “I trained irregularly but at least 2-3 times a month”; “I trained some, but stopped training after a while”; and “I did not use the training offer”.
Participants in the ETG received 20 weeks of strength, endurance, and coordination training targeting the neck and shoulder muscles. Training was divided into three 20-min sessions per week. The training program was evidence-based and designed by an interdisciplinary team of sports exercise training specialists, physiotherapists, doctors, and chiropractors. The training program was composed of ten training exercises divided into three categories: 1) two warm-up exercises, 2) six neck exercises, and 3) two shoulder exercises. All exercises have been described previously and training videos are available online.
Every training session was initiated with one or two warm-up exercises recruiting the deep cervical muscle groups. Exercises included cervical flexion from a supine position and cervical rotation from an erect position. The warm-up exercises were performed using 3 sets of 15 repetitions. Intensity was increased as participants progressed. These two exercises also aimed to warm up the neck before engaging in more strenuous training exercises. The warm-up exercises were followed by training exercises for the larger neck muscles, including cervical extension, cervical flexion (straight forward and at oblique angles), and lateral flexion. Finally, participants performed two exercises for the shoulder girdle: shrugs and reverse flies. Neck and shoulder exercises were performed using elastic training bands for resistance (Thera-Band®, The Hygenic Corporation, USA). The training program was designed to be progressive using an undulating design, with 2–4 sets and a training intensity of 12–20 repetitions depending on the week of training. The training equipment was lightweight to allow for easy transportation when participants traveled between Air Force Bases. Participants received a training bag including a head harness (Neck Flex, Gonzo Companies, USA); six color-coded levels of resistance bands (red, green, blue, black, silver, and gold), exercise handles, and a door anchor (Thera-Band®, The Hygenic Corporation, USA); and a training manual that described all training exercises in detail. In addition, participants were given online access to a training homepage with supplementary training information and training videos for each exercise. All participants received a personal training diary that described when to perform the various training exercises. Training was to be performed within working hours or, if preferred, during leisure time.
Training was based on self-management education because of participants’ dynamic work schedules and frequent travel between Air Force Bases. At the beginning of the intervention, all participants received an individual or group introduction to the training program. The introduction included: 1) a detailed description of the training program and diary, 2) an introduction to all training exercises and adjustments to ensure high-quality exercise performance, and 3) practical information regarding supervision during the intervention period. Participants received at least one follow-up visit during the intervention period to make sure that training exercises were performed correctly and with progression. To motivate participants in the ETG to train, motivational posters were hung on the walls in the rooms of the two squadrons and tweets were posted on the training homepage. Thus, the training was not regularly supervised but was self-administered with the above support actions.
A pre-intervention power analysis was performed on the single primary outcome variable: self-reported intensity of neck pain during the previous 3 months. Pain intensity was rated on an 11-point numeric box scale. The analysis showed that we would need to include 54 participants (27 experimental subjects and 27 control subjects) in this study. The analysis was based on the finding that a change of 1 on an 11-point numeric box scale is considered the minimum clinically significant difference in pain. We also used results on pain intensity from a previous study among military pilots, which found the response within this subject group to be normally distributed with a standard deviation in neck pain intensity of 1.5 for the previous 3 months. With power set at 0.8 and a type I error probability of 0.05, we would be able to detect a true difference in mean neck pain response between experimental and control subjects of ± 1.2 on an 11-point numeric box scale. Allowing for a 10% loss to follow-up, the total number of participants required was 64. The null hypothesis (no difference between experimental and control subjects) was to be rejected if the between-group difference in intensity of neck pain during the previous 3 months was significant (p < 0.05). The relationship between intensity of neck and shoulder pain during the previous 3 months and pilots’ and crew-members’ age, height, sitting height, weight, BMI, flying hours in fixed-wing aircraft, flying hours in rotary-wing aircraft, and flying hours with NVG was analyzed by multiple regression. Two statistical analyses were performed: 1) an intention-to-treat analysis comparing participants in the ETG and REF as originally allocated after randomization, and 2) a per-protocol analysis including only participants in the ETG who adhered regularly to the exercise intervention.
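As a rough check on the power analysis above, the detectable difference for two equal groups can be approximated with the standard normal-approximation formula, delta = (z_{1-alpha/2} + z_{power}) × SD × sqrt(2/n). The sketch below is an assumption about the calculation, not the authors' actual method: the text does not state which software or formula was used, the function names are hypothetical, and an exact t-based calculation gives slightly different numbers than this approximation.

```python
from math import ceil, sqrt
from statistics import NormalDist


def detectable_difference(sd, n_per_group, alpha=0.05, power=0.80):
    """Smallest true mean difference detectable with the given power (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


def required_n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Participants needed per group to detect a difference of `delta` (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Values taken from the text: SD = 1.5, 27 participants per group, difference of 1.2.
print(detectable_difference(sd=1.5, n_per_group=27))
print(required_n_per_group(sd=1.5, delta=1.2))
```

With the values reported in the text, the approximation yields a detectable difference slightly below the stated 1.2 and a per-group sample size slightly below 27, which is the expected direction of the gap between a normal approximation and a t-based calculation.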
Regular training adherence was defined as training 1–3 times a week during the 20-week intervention period (≥ 33.3% of the total number of training sessions). Missing values were imputed by carrying the last observation forward or backward. If observations were missing at both baseline and follow-up, the population mean was imputed at baseline and the group mean (ETG or REF) at follow-up. Between-group differences at baseline were analyzed using Student’s t-test. The same analysis was performed at follow-up, together with an analysis of delta values (calculated by subtracting the pre-intervention from the post-intervention values). Within-group changes were analyzed by a paired t-test. The level of statistical significance was p < 0.05. Results are presented as sample means and standard deviations (mean ± SD) unless otherwise specified. Statistical analyses were performed in Stata Statistics/Data Analysis version 13.0 (StataCorp LP, USA).
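The imputation rules and the delta values described above can be illustrated with a short Python sketch. It is not the Stata code used in the study. The record layout (`group`, `pre`, `post`, with `None` marking a missing value) and the function name are hypothetical, and the sketch assumes that the population and group means are computed from the observed (non-missing) values only, which the text does not specify.

```python
from statistics import mean


def impute_and_delta(records):
    """Apply the imputation rules, then add delta = post - pre to each record."""
    # Assumption: means are taken over observed values only.
    population_mean_pre = mean(r["pre"] for r in records if r["pre"] is not None)
    group_mean_post = {
        g: mean(r["post"] for r in records if r["group"] == g and r["post"] is not None)
        for g in {r["group"] for r in records}
    }

    completed = []
    for r in records:
        pre, post = r["pre"], r["post"]
        if pre is None and post is None:
            # Both missing: population mean at baseline, group mean at follow-up.
            pre, post = population_mean_pre, group_mean_post[r["group"]]
        elif post is None:
            post = pre  # last observation carried forward
        elif pre is None:
            pre = post  # observation carried backward
        completed.append({**r, "pre": pre, "post": post, "delta": post - pre})
    return completed


# Hypothetical pain-intensity scores on the 11-point scale, for illustration only.
example = [
    {"group": "ETG", "pre": 4, "post": 2},
    {"group": "ETG", "pre": 3, "post": None},
    {"group": "REF", "pre": None, "post": 5},
    {"group": "REF", "pre": 2, "post": 3},
    {"group": "ETG", "pre": None, "post": None},
]
for row in impute_and_delta(example):
    print(row)
```

Note that carrying an observation forward or backward always produces a delta of zero for that participant, so this form of imputation is conservative with respect to detecting a change.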