Between January 2001 and June 2003, 103 general practitioners (GPs) recruited workers with a new episode of shoulder pain in three geographic areas in the Netherlands (Amsterdam, Groningen and Maastricht). Workers were selected if they were 18 years of age or older, had a paid job (all kinds of workers, on either a permanent or a temporary contract), and had not consulted their GP or received any form of treatment for the afflicted shoulder in the preceding 3 months. Sufficient knowledge of the Dutch language was required to complete written questionnaires. Exclusion criteria were serious physical or psychiatric conditions (i.e. fractures or dislocation in the shoulder region; rheumatic disease; neoplasm; neurological or vascular disorders; dementia). There was no restriction with respect to type of work or occupation, or whether or not the patient was on sick leave at the time of consultation.
Management of shoulder pain
All workers received standardised treatment according to the 1999 version of the Dutch guidelines for shoulder complaints issued by the Dutch College of General Practitioners[14, 15] which consists of information on the prognosis of shoulder pain, advice regarding provoking activities, and stepwise treatment consisting of paracetamol, NSAIDs, corticosteroid injection or referral for physiotherapy. The GP made the decision regarding the content of treatment based on duration and severity of pain and disability. The participating general practitioners were educated and trained to apply treatment according to this guideline.
Some GPs may have advised participants regarding return to work, but in the Netherlands this is primarily the task of an occupational physician.
Within a few days after consultation all workers received a questionnaire by post, which took approximately 40 minutes to complete. The questionnaire contained questions on sociodemographic variables, disease characteristics, physical activity, physical work load, psychosocial work environment, and psychological factors. Disease characteristics included pain intensity (0–10 point rating scale), shoulder pain-related disability, pain onset, duration of symptoms, previous episodes of shoulder pain (i.e. having had at least one week of shoulder pain in the past), sick leave (due to shoulder pain) in the two months prior to consultation, and comorbidity. Physical activity was measured with a single question (less/equally/more active than others). A physical examination was carried out by trained research assistants. The assistants were trained thoroughly, and consistency was checked during several meetings. In some centres the research assistant was either a physiotherapist or another assistant with experience in examining patients with musculoskeletal pain.
Physical work load was measured with a self-constructed scale of 7 questions (yes/no) concerning pushing and pulling, lifting weights, working with vibrating tools, lifting weight on one shoulder, working with hands above shoulder level, repetitive movements, and sitting in the same position for a long period of time. Each item was scored positive if the participant performed the activity on at least two days a week. Factor analysis showed that the first five items reflected one dimension (total score 0–5, Cronbach's α = 0.74). Repetitive movements and sitting were analysed as separate items.
The psychosocial work environment was assessed with five dimensions of the Job Content Questionnaire (JCQ), which measures all dimensions of the widely used Demand-Control-Support model. Workers rated several aspects of their work on a four-point scale (totally disagree, disagree, agree, totally agree). The JCQ consists of the dimensions quantitative job demands (5 questions, sum score 5–20); skill discretion (5 questions, sum score 5–20); decision authority (3 questions, sum score 3–12); co-worker support (4 questions, sum score 4–16) and supervisor support (4 questions, sum score 4–16), as proposed by Karasek et al. [16, 17] and clinimetrically evaluated by De Jonge et al.
Previous research had shown that psychosocial factors may be important in the transition from acute to more chronic pain problems. One of the objectives of our cohort study was to test this hypothesis. The biopsychosocial model, and in particular the fear-avoidance beliefs model, was used as a theoretical framework when selecting our prognostic factors. All factors mentioned below are elements of these theoretical models. The following psychological variables were measured: pain coping, anxiety, depression, somatization, distress, fear-avoidance beliefs, kinesiophobia, and beliefs regarding the causes of shoulder pain. Pain coping was assessed with the 43-item Pain Coping and Cognition List (PCCL), consisting of the subdomains catastrophizing (1–6 points), coping with pain (1–6 points), internal locus of control (1–6 points) and external locus of control (1–6 points). Anxiety (0–24 points), depression (0–12 points), somatization (0–32 points), and distress (0–32 points) were measured with the 50-item Four-Dimensional Symptom Questionnaire (4DSQ). Fear-avoidance beliefs were assessed using the 4-item physical activity subscale of the Fear Avoidance Beliefs Questionnaire (FABQ; 0–24). Kinesiophobia was measured using two items of the Tampa Scale for Kinesiophobia (TSK; 0–12). Participants were also asked about their beliefs regarding the possible cause of shoulder pain: unexpected movement; strain during unusual activities; overuse or strain during regular activities; trauma; sports injury; or unclear (yes/no). Finally, our baseline questionnaire included a general one-item question regarding the presence of psychological problems: "Do you have any psychological complaints, such as distress, depression, or anxiety?" (yes/no).
Function of the shoulder joint and cervicothoracic spine was tested during a physical examination. For the glenohumeral joint, active and passive abduction, passive exorotation (external rotation), and shoulder impingement were tested. Two alternative functional tests, HIB (Hand-in-back) and HIN (Hand-in-neck) [25, 26], measured on a 7-point scale (score 0 = very poor range of motion, score 7 = full range of motion), were performed as well. The assistant estimated the range of motion in degrees (°).
During all mobility tests self-reported pain was assessed on a 4-point scale (0 = no pain; 3 = severe pain). A factor analysis on the results of a physical examination in a similar population of patients with shoulder pain resulted in four factors: shoulder mobility, shoulder pain, neck mobility, and neck pain.
The factor 'shoulder mobility' consisted of 6 mobility tests: HIB, HIN, active abduction, passive abduction, external rotation, and impingement. For calculation of the sum score (0–18 points), variables were recoded into a 4-point scale, with 0 reflecting full range of motion and 3 reflecting very poor range of motion. HIB/HIN scores were recoded as: score 7 = 0; score 5 and 6 = 1; score 3 and 4 = 2; score 1 and 2 = 3. Abduction (active and passive) was recoded as 170–180° = 0; 140–170° = 1; 90–140° = 2; 0–90° = 3. External rotation was recoded as >80° = 0; 70–80° = 1; 50–70° = 2; <50° = 3. During the impingement test pain was measured (0 = no pain; 3 = severe pain). The factor 'shoulder pain' (0–18 points) consisted of the sum of the pain scores during the mobility tests.
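The recoding of the shoulder mobility tests described above can be sketched in Python as follows. This is an illustration only, not the code used in the study. The function names are hypothetical, and two points the text leaves open are filled by assumption: a value that falls exactly on a shared boundary (e.g. 170° or 70°) is assigned to the better category, and HIB/HIN values outside the listed scores 1 to 7 are rejected.

```python
def recode_hib_hin(score: int) -> int:
    """Recode a HIB or HIN score (1-7) to 0 (full range) .. 3 (very poor)."""
    mapping = {7: 0, 6: 1, 5: 1, 4: 2, 3: 2, 2: 3, 1: 3}
    if score not in mapping:
        # Only scores 1-7 appear in the recoding rule; others are rejected here.
        raise ValueError(f"HIB/HIN score out of range: {score}")
    return mapping[score]


def recode_abduction(degrees: float) -> int:
    """Recode active or passive abduction in degrees to 0..3."""
    if not 0 <= degrees <= 180:
        raise ValueError(f"abduction out of range: {degrees}")
    # Assumption: shared boundaries (170, 140, 90) go to the better category.
    if degrees >= 170:
        return 0
    if degrees >= 140:
        return 1
    if degrees >= 90:
        return 2
    return 3


def recode_external_rotation(degrees: float) -> int:
    """Recode external rotation in degrees to 0..3."""
    if degrees > 80:
        return 0
    if degrees >= 70:  # assumption: exactly 70 degrees scores 1
        return 1
    if degrees >= 50:
        return 2
    return 3


def shoulder_mobility_score(hib, hin, active_abd, passive_abd,
                            ext_rot, impingement_pain):
    """Sum score (0-18) over the six recoded mobility tests."""
    if impingement_pain not in (0, 1, 2, 3):
        raise ValueError(f"impingement pain out of range: {impingement_pain}")
    return (recode_hib_hin(hib) + recode_hib_hin(hin)
            + recode_abduction(active_abd) + recode_abduction(passive_abd)
            + recode_external_rotation(ext_rot) + impingement_pain)
```

For example, a worker with HIB 5, HIN 3, active abduction 150°, passive abduction 175°, external rotation 60° and impingement pain 1 would receive 1 + 2 + 1 + 0 + 2 + 1 = 7 points under these assumptions.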
The factor 'neck mobility' (0–4 points) consisted of rotation of the cervicothoracic spine in neutral, flexed, and extended position and lateral bending. Each of these range of motion tests was scored as 1 (decreased range of motion) or 0 (no decreased range of motion). The factor 'neck pain' (0–18 points) consisted of the sum of the pain scores during flexion and extension of the neck, rotation in a neutral, flexed and extended position, and lateral bending.
The outcome was measured by postal questionnaires at 6 weeks, 3 months and 6 months. Our primary outcome measure was sick leave due to shoulder pain (yes = ≥ 1 day, no = 0 days). Secondary outcome measures were patient-perceived recovery, shoulder disability, measured with the 16-item shoulder disability questionnaire (SDQ; 0–100), shoulder pain (0–10 numeric rating scale), and severity of the main complaint (0–10 numeric rating scale). We studied the relationship between our primary and secondary outcome measures to determine if workers reporting sick leave during follow-up showed higher levels of pain and disability.
Missing values of patient characteristics were imputed (approximately 2% of all required values). Imputation was based on the correlation between the variable with missing values and the other patient characteristics. Univariable logistic regression analyses were performed for all potential prognostic indicators with our primary outcome measure, i.e. sick leave during 6 months following first consultation. Variables that were associated with the outcome at a p-value ≤ 0.20 were selected for the backward selection in the multivariable analysis and checked for collinearity. If the correlation between two potential predictors was larger than 0.5, we included in our multivariable analysis the predictor that was considered to be most relevant to the general practitioner and was easy to measure. We adopted a hierarchical approach to variable selection in which easily obtainable predictors were included first. Therefore, variables were selected in blocks of increasing effort to obtain during consultation: 1) socio-demographic factors and disease characteristics; 2) physical factors; 3) psychosocial work environment; 4) psychological factors; 5) physical examination. Variables with the lowest predictive value were deleted from the model until further elimination of a variable resulted in a statistically significant lower model fit estimated with the likelihood ratio test (p ≤ 0.20).
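The collinearity check described above can be illustrated with a short Python sketch. It is not the study's code: the function names are hypothetical, the correlation measure is assumed to be Pearson's r, and the 0.5 threshold is applied to the absolute value of r, which the text does not state explicitly. The choice of which predictor of a flagged pair to keep was a matter of judgement and is not automated here.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(data, threshold=0.5):
    """Return (name_a, name_b, r) for each pair with |r| above the threshold.

    `data` maps a predictor name to its list of observed values.
    """
    flagged = []
    for a, b in combinations(sorted(data), 2):
        r = pearson_r(data[a], data[b])
        if abs(r) > threshold:  # assumption: the sign of r is ignored
            flagged.append((a, b, r))
    return flagged


# Hypothetical values for three candidate predictors in five workers.
example = {
    "pain": [2, 4, 6, 8, 10],
    "disability": [1, 2, 3, 4, 5],
    "age": [40, 30, 50, 30, 40],
}
pairs = collinear_pairs(example)
```

In this made-up example only the pair (disability, pain) is flagged, so one of the two would be dropped before the multivariable analysis.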
Prediction models usually provide too extreme estimates when no correction is applied in the development phase. Therefore, we used bootstrap samples to estimate a 'shrinkage factor' (between 0 and 1). The regression coefficients were subsequently multiplied by this shrinkage factor to protect the model against overfitting and overoptimism. Bootstrap samples were drawn with replacement (100 replications) from the full data set. The backward selection of variables and model fitting was repeated within each bootstrap sample. Bootstrapping techniques were also used to study the internal validity of the final prediction model [31, 32]. The model's performance obtained after bootstrapping can be considered as the performance that can be expected in similar future patients. All analyses were performed using S-plus 6.1 (Insightful Corp., Seattle, WA, USA).
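Two mechanical steps of this procedure, drawing bootstrap samples with replacement and multiplying the coefficients by the shrinkage factor, can be sketched in Python as below. This is a hedged illustration and not the S-plus code used in the study. The function names, the seed and the coefficient values are hypothetical. Estimating the shrinkage factor itself (refitting the model with backward selection in each bootstrap sample) is deliberately left out, as is any re-estimation of the intercept, which the text does not describe.

```python
import random


def bootstrap_indices(n, replications=100, seed=0):
    """Draw `replications` bootstrap samples of row indices, with replacement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [rng.choices(range(n), k=n) for _ in range(replications)]


def apply_shrinkage(coefficients, shrinkage_factor):
    """Multiply every regression coefficient by the shrinkage factor."""
    if not 0 < shrinkage_factor <= 1:
        raise ValueError("shrinkage factor is expected to lie between 0 and 1")
    return {name: b * shrinkage_factor for name, b in coefficients.items()}


# Hypothetical coefficients and shrinkage factor, for illustration only.
samples = bootstrap_indices(n=10, replications=100, seed=1)
shrunk = apply_shrinkage({"x1": 0.8, "x2": -0.5}, shrinkage_factor=0.9)
```

Each element of `samples` is a list of row indices of the same size as the original data set, in which some workers appear more than once and others not at all.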
Evaluation of the model
The reliability of the multivariable model was determined with the Hosmer-Lemeshow goodness-of-fit statistic. Calibration of the model predictions, which is related to reliability, was assessed by plotting the predicted individual probability against the observed sick leave. For this, workers were grouped into quintiles according to their predicted probability for sick leave according to the model. The prevalence of the endpoint within each quintile represents the observed probability. The area under the receiver-operating characteristic curve (ROC) was used to assess the performance of the model in terms of accuracy of correct prediction. The ROC-curve is a plot of the true positive rate (sensitivity) against the false-positive rate (1-specificity) of the model. The curve illustrates the ability of the model to discriminate between workers with and without sick leave at subsequent cut-off points along the range of the predicted probabilities. An area under the curve (AUC) of 0.5 indicates no discrimination above chance, whereas an AUC of 1.0 indicates perfect discrimination.
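The calibration-by-quintile and AUC calculations can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: the function names and the example data are hypothetical, quintiles are formed by sorting workers on predicted probability and cutting into five groups of near-equal size, and the AUC is computed through its equivalent pairwise definition (the proportion of worker pairs, one with and one without sick leave, in which the worker with sick leave has the higher predicted probability, counting ties as one half).

```python
def calibration_by_quintile(y_true, y_prob, groups=5):
    """Return (mean predicted probability, observed prevalence) per group."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    n = len(order)
    table = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:  # can happen when there are fewer workers than groups
            continue
        mean_pred = sum(y_prob[i] for i in idx) / len(idx)
        observed = sum(y_true[i] for i in idx) / len(idx)
        table.append((mean_pred, observed))
    return table


def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison of predictions."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and observed sick leave (1 = yes).
probs = [0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
sick = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
table = calibration_by_quintile(sick, probs)
area = auc(sick, probs)
```

In this made-up example 23 of the 24 pairs are ordered correctly, giving an AUC of 23/24 (about 0.96), and the observed prevalence rises from 0 in the lowest quintile to 1 in the highest.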
Prediction of an individual patient's risk
We developed a clinical prediction rule for sick leave during 6 months following first consultation, to provide general practitioners and occupational health care providers with an estimate of the absolute risk of sick leave for individual workers. Since we used logistic regression, the probability (P) of sick leave was predicted with P = 1/[1 + exp(−(a0 + b1x1 + ... + bjxj))], where a0 is the intercept and b1 to bj are the regression coefficients of the predictors x1 to xj. The status of a patient for any dummy or binary variable included in the prediction rule can be either 0 or 1, while for a (semi-)continuous variable it takes the actual observed value.
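As a minimal sketch, the logistic formula above translates into Python as shown below. The function name, the predictor names and all numerical values are hypothetical and are not the coefficients of the study's model.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """P = 1 / (1 + exp(-(a0 + b1*x1 + ... + bj*xj)))."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical model: one binary and one continuous predictor.
coefs = {"x1": 0.5, "x2": 1.0}
p_even = predicted_probability(-1.0, coefs, {"x1": 2, "x2": 0})
p_high = predicted_probability(-1.0, coefs, {"x1": 2, "x2": 1})
```

With these illustrative numbers the linear predictor is 0 in the first case, so the predicted probability is 0.5. In the second case it is 1, giving a probability of about 0.73.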
To facilitate the calculation of an individual worker's risk, we developed a score chart. We multiplied the regression coefficients by 4 and rounded them to the nearest integer to form the scores for each of the predictors. The scores of the predictors on which a worker scores positively are added to calculate the 'Total score'. This total score corresponds to a risk of sick leave during follow-up.
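The construction of the score chart can be sketched in Python as follows. This is an illustration, not the study's code: the function names, predictor names and coefficients are hypothetical, all predictors are treated as binary, and ties in rounding follow Python's built-in `round`, since the text does not say how exact halves were handled.

```python
def build_score_chart(coefficients, multiplier=4):
    """Multiply each coefficient by 4 and round to the nearest integer."""
    # Note: round() sends exact halves to the nearest even integer;
    # the source does not specify a tie rule.
    return {name: round(b * multiplier) for name, b in coefficients.items()}


def total_score(chart, positive_predictors):
    """Add the scores of the predictors on which the worker scores positively."""
    return sum(chart[name] for name in positive_predictors)


# Hypothetical coefficients for three binary predictors.
chart = build_score_chart({
    "sick_leave_before": 0.9,
    "high_pain": 0.4,
    "overuse": -0.3,
})
score = total_score(chart, ["sick_leave_before", "high_pain"])
```

Here the chart assigns 4, 2 and -1 points to the three predictors, and a worker who is positive on the first two receives a total score of 6.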
The study was approved by the Medical Ethics Committee of the VU University Medical Center, Amsterdam, the Netherlands.