Analyzing musculoskeletal neck pain, measured as present pain and periods of pain, with three different regression models: a cohort study

Background In the literature there are discussions on the choice of outcome and the need for more longitudinal studies of musculoskeletal disorders. The general aim of this longitudinal study was to analyze musculoskeletal neck pain in a group of young adults. Specific aims were to determine whether psychosocial factors, computer use, high work/study demands, and lifestyle are long-term or short-term factors for musculoskeletal neck pain, and whether these factors are important for developing or ongoing musculoskeletal neck pain. Methods Three regression models were used to analyze the different outcomes. Pain at present was analyzed with a marginal logistic model; for number of years with pain a Poisson regression model was used; and for developing and ongoing pain a logistic model was used. Presented results are odds ratios and proportion ratios (logistic models) and rate ratios (Poisson model). The material consisted of web-based questionnaires answered by 1204 Swedish university students from a prospective cohort recruited in 2002. Results Perceived stress was a risk factor for pain at present (PR = 1.6), for developing pain (PR = 1.7) and for number of years with pain (RR = 1.3). High work/study demands were associated with pain at present (PR = 1.6), and with number of years with pain when the demands negatively affect home life (RR = 1.3). Computer use pattern (number of times/week with a computer session ≥ 4 h, without break) was a risk factor for developing pain (PR = 1.7), but was also associated with pain at present (PR = 1.4) and number of years with pain (RR = 1.2). Among lifestyle factors, smoking (PR = 1.8) was found to be associated with pain at present. The difference between men and women in prevalence of musculoskeletal pain was confirmed in this study. It was smallest for the outcome ongoing pain (PR = 1.4) compared to pain at present (PR = 2.4) and developing pain (PR = 2.5).
Conclusion By using different regression models, different aspects of the neck pain pattern could be addressed and the risk factors' impact on the pain pattern was identified. Short-term risk factors were perceived stress, high work/study demands and computer use pattern (break pattern). Those were also long-term risk factors. For developing pain, perceived stress and computer use pattern were risk factors.


Background
The literature on musculoskeletal disorders contains ongoing discussions concerning both the relevance of the measured outcomes [1][2][3][4] and the lack of longitudinal studies [5,6]. Many previous studies of upper body musculoskeletal disorders (MSD) focus on work-related MSD. One group of interest is computer users, among whom several factors have been identified as being associated with MSD: physical/ergonomic factors, working technique, work organization, hours spent typing, psychosocial factors, and gender [7][8][9].
Some studies distinguish between specific and nonspecific MSD. Two recent studies indicated that certain physical risk factors were more strongly associated with specific disorders than with nonspecific pain in the upper limbs [12,13]. Another study, analyzing nonspecific regional MSD, identified psychosocial risk factors at work [1]. One recent study concluded that chronic and widespread musculoskeletal symptoms in neck or upper extremities were the main risk factors for self-reported generally reduced productivity due to musculoskeletal symptoms [15]. Several motivating reasons for further research are mentioned in the literature described above. More longitudinal and large-scale studies would permit conclusions about temporal relationships and are still requested in systematic reviews [5,16].
In the present study, three different regression models were used to analyze the longitudinal data, in order to address both the issue of better understanding of how risk factors affect musculoskeletal neck pain, and the issue of using relevant outcomes. Similar models have been proposed earlier, but either were not studied in combination or were not thoroughly discussed in terms of their benefits and their interpretation [14,17].
Musculoskeletal pain is not a clear event that happens at one time-point, as is, for example, death due to cancer or a first-time stroke. The pain comes and goes, and this could be called recurrent pain [18]. The pain could also be long-lasting, even if the intensity sometimes varies, and this could be called persistent pain [18]. In the present study the course of pain could have either of these qualities. The term 'ongoing pain' is here defined as pain that was present during one year or part of that year, and then present also during the following year or part of it. That is, ongoing pain could be either recurrent or persistent pain, but which of these is present cannot be determined from the data in the present study. 'Developing pain' is here defined as when responders had no pain or only experienced pain periods lasting less than 8 days the year preceding baseline, and then at one-year follow-up had one or more periods of pain lasting for at least 8 days. Note that this is not necessarily newly developed pain.
In this paper the theoretical framework is based on the balance theory [19,20], in which health is supposed to be negatively affected by imbalance between various factors (at work and in leisure time). The demand-control-social support model [21] is a less extensive model, which is here seen as included in a part of the balance theory. Research supports the relationships between psychosocial factors and musculoskeletal neck pain [5,16]. Note that in a group of university students the lines between work and leisure time are not as clear as among other groups in working life.
Risk factors can have a short-term influence or a long-term one. For the purposes of the present study, short-term influence refers to the situation where the exposure and current pain are close together in time. Using different regression models and different outcome variables, the present paper tries to classify factors as short-term or long-term.
The general aim of this paper was to analyze musculoskeletal pain in the neck or upper back, in a group of young adults. The explanatory variables considered were psychosocial factors, computer use, demands, and lifestyle. Specific aims were:
1. To determine which factors are long-term and short-term risk/protective factors for musculoskeletal pain.
2. To investigate whether the same factors are involved in the development of musculoskeletal pain as in ongoing musculoskeletal pain.

Material
The material in this study is based on a cohort, recruited in 2002, of university students enrolled in medical and IT-related studies. Baseline and yearly follow-up data were collected with an internet-based questionnaire; the baseline response rate was 70 percent. For a more extensive description of the study base see [22]. The present study uses data from the baseline, as well as the one-year and two-year follow-ups, among the 1204 respondents (628 women, 576 men) to the baseline questionnaire.
All subjects received written information concerning the study and their right to refuse to participate. Subjects agreed to participate by sending their approval by e-mail. This procedure was approved by the ethics committee of the Medical Faculty at the University of Gothenburg.
The outcome in this paper concerns pain in the neck or upper back, according to the phrasing of the question in the questionnaire (Table 1). This is consistent with the consensus definition of the neck region according to the Neck Pain Task Force [23]. For simplicity, we will in the following text use 'neck pain' to refer to pain in these regions.
Three outcome variables were used: pain at present, a period of pain, and the number of years with pain (Table 1).
For the outcomes 'pain at present' and 'a period of pain', there were no missing values at baseline; however, 77 respondents failed to answer either question at both follow-ups, and 242 answered the one-year follow-up but not the two-year follow-up. Hence, the one-year data included 1127 respondents and the two-year data included 885 respondents.
The explanatory variables used in this study were sorted into six blocks: background, lifestyle, demands, psychosocial factors, computer use and health. Some questions were combined into new variables, for example high work/study demands (Table 2), due to multicollinearity problems.
The explanatory variable 'perceived stress' is based on the question validated by Elo [24].

Statistical methods
For the outcomes 'pain at present' and 'a period of pain', raw baseline prevalences were calculated. For the outcome 'number of years with pain', the proportion in each category was presented.
Three methods for regression analysis were used [14,25,26]: a marginal logistic regression model with an outcome of 'pain at present'; a Poisson regression model with an outcome of 'number of years with pain'; and a logistic Markov transitional model with an outcome of 'a period of pain'. The models are further explained below. In the first step of the analysis, each of the three models included one explanatory variable at a time together with gender (a simple regression model). In the second step of the analysis, all those explanatory variables with P-values ≤ 0.2 were included in a multiple regression model. Gender was included as above. This limit, rather than P ≤ 0.05, was chosen as the inclusion criterion for explanatory variables in order not to miss an explanatory variable with a possible association with musculoskeletal pain. We wanted to make sure to exclude only those variables that did not seem to be associated with musculoskeletal pain, or that were estimated with such uncertainty that no useful information was obtained. All analyses were performed using the GENMOD procedure in the SAS statistical package (version 9.1, SAS Institute, Cary, NC, USA). Statistical significance was set at P ≤ 0.05. In all regression models, Wald type 3 P-values were used to assess the effect of each factor on the outcome variables. All P-values are two-sided.
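The two-step selection described above can be sketched as follows. This is a minimal illustration, not the SAS code used in the study: the function name `select_for_multiple_model` and the p-values in `example_p` are hypothetical, and the only elements taken from the text are the P ≤ 0.2 screening limit and the rule that gender is always included.

```python
def select_for_multiple_model(simple_p_values, threshold=0.2,
                              always_include=("gender",)):
    """Return the variables that enter the multiple regression model.

    simple_p_values maps each explanatory variable to its P-value from the
    simple (one variable plus gender) model. Variables in always_include
    are kept regardless of their P-value.
    """
    selected = list(always_include)
    for name, p_value in simple_p_values.items():
        if name not in selected and p_value <= threshold:
            selected.append(name)
    return selected


# Hypothetical step-1 P-values, for illustration only (not study results).
example_p = {"perceived_stress": 0.001, "smoking": 0.15, "overweight": 0.60}
chosen = select_for_multiple_model(example_p)
print(chosen)
```

Note that the screening limit (0.2) only decides which variables enter the multiple model; statistical significance within that model is still judged at P ≤ 0.05.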
For the binary outcomes 'pain at present' and 'a period of pain', odds ratios (OR) were obtained both from the simple regression models and from the multiple regression models. The OR and corresponding p-values derived from logistic regressions were used to test the association between different explanatory variables and the outcomes [27]. In addition, proportion ratios (PR) were calculated. Here we use PR to denote the ratio between the proportions of those with the outcome, comparing the two exposure groups of interest. Other terms used in the literature for PR include risk ratio (RR), probability ratio (PR) and prevalence proportion ratio (PPR) [28][29][30]. The proportion ratios were calculated in order to estimate the magnitude of the exposure effect. The calculations were based on the parameter estimates in the simple logistic regression (as suggested in [27,31]). These calculations produce one estimate of PR for each possible combination of the other explanatory variables included in the regression, and PR were therefore not calculated for the multiple regression models. Here we calculated one PR for women and one PR for men, and the adjusted PR is the mean of these two PRs [25]. There is no straightforward procedure known to the author for providing an appropriate CI for the indirect PR, and this is outside the scope of this paper. Therefore, in the results section, estimates of OR, 95% CI and their p-values are presented in combination with the estimates of the gender-adjusted PR.
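As a rough illustration of the gender-adjusted PR described above, the sketch below converts the parameter estimates of a simple logistic model (intercept, gender, one binary exposure) into predicted proportions, forms one PR for men and one for women, and averages them. The coding of gender as 0/1, the function names and the example coefficients are assumptions made for this sketch; the exact computation in [25,27,31] may differ in detail.

```python
import math


def expit(z):
    """Inverse logit: convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def gender_adjusted_pr(b0, b_gender, b_exposure):
    """Mean of the proportion ratios for men (gender=0) and women (gender=1).

    Assumes the model logit(p) = b0 + b_gender*gender + b_exposure*exposure
    with a binary exposure (1 = exposed, 0 = unexposed).
    """
    ratios = []
    for gender in (0, 1):
        baseline = b0 + b_gender * gender
        p_exposed = expit(baseline + b_exposure)
        p_unexposed = expit(baseline)
        ratios.append(p_exposed / p_unexposed)
    return sum(ratios) / len(ratios)


# Hypothetical coefficients, for illustration only (not study estimates).
pr = gender_adjusted_pr(b0=-2.0, b_gender=1.0, b_exposure=0.7)
odds_ratio = math.exp(0.7)
print(f"PR = {pr:.2f}, OR = {odds_ratio:.2f}")
```

For a harmful exposure the PR obtained this way is always closer to 1 than the OR, which is one reason to report both.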

Marginal model for binary outcome (short-term effect)
The binary outcome 'pain at present' was modelled with a marginal logistic model [32]. The aim was to investigate whether the risk factors had a short-term effect (outcome and exposure occurring close together in time). A marginal model takes into account the repeated measurement structure of the data by modelling the correlation structure; it is an appropriate choice of model when focusing on population averages [32]. For each person, three observations of 'pain at present' were included in the analysis, one for each year. The explanatory variables were chosen from the same year as the outcome, as the intent was to determine short-term risk factors. If the explanatory variable varied over the three years, and the outcome also varied with the same pattern, the results were interpreted as an indication of a short-term effect. If the explanatory variable varied over the three years, and the outcome variable also varied but with a different pattern, then no association could be found and this was interpreted as a lack of short-term effect. If both the explanatory variable and the outcome were stable over time then no discrimination could be made between short-term or long-term effects.
The response variable Y (here 'pain at present') was assumed to follow a Bernoulli distribution with parameter p, where $p_{it} = P(Y_{it} = 1)$ and
$$\operatorname{logit}(p_{it}) = \log\frac{p_{it}}{1 - p_{it}} = \beta_0 + \beta_1 x_{1,it} + \beta_2 x_{2,it} + \dots + \beta_k x_{k,it},$$
where $x_1, x_2, \dots, x_k$ are the explanatory variables, index i refers to the individual, and index t refers to the time point (0, 1, 2). The odds ratio for the effect of $x_j$ is $\exp(\beta_j)$. The quasi-likelihood equations, which are known as generalized estimating equations (GEE), are used to take into account the repeated structure. The SAS procedure PROC GENMOD (link = logit, distribution = binomial, working correlation structure = exchangeable) was used. The GEE method gives consistent parameter estimates even if the correlation structure is misspecified (see [32], p. 468). However, it should be noted that the empirically-based standard errors, which are used when making inferences about parameters with Wald statistics and asymptotic normality of estimators, tend to underestimate the true errors unless the sample size is quite large ([32], p. 467-468). Analysis was performed with both the exchangeable and the unstructured working correlation structure. The unstructured working correlation structure, which is more flexible than the exchangeable, was chosen as it gave slightly smaller P-values.
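To make the link between a coefficient and the reported odds ratio concrete, the following sketch converts a coefficient estimate and its standard error into an OR with a Wald-type 95% confidence interval. The function name and the example values are hypothetical; 1.96 is the usual normal approximation for a two-sided 95% interval, and the standard error is assumed to be the empirical (robust) one from the GEE fit.

```python
import math


def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Return (OR, lower, upper) for a logistic-model coefficient.

    The interval is computed on the log-odds scale and then exponentiated.
    """
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return math.exp(beta), lower, upper


# Hypothetical estimate, for illustration only (not a study result).
or_hat, ci_low, ci_high = odds_ratio_with_ci(beta=0.5, standard_error=0.2)
print(f"OR = {or_hat:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Because the interval is symmetric on the log scale, it is asymmetric around the OR itself.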

Poisson model for counts (long-term effect)
A Poisson model was used to analyze 'number of years with pain'. Thus, the longitudinal data has been summarized into one value for each person. For the explanatory variables the baseline values were used. The explanatory variable then precedes all the yearly outcomes. The purpose of the analysis was to identify possible long-term risk factors.
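The summarising step described above, in which each person's yearly binary outcomes are collapsed into a single count, might look like the sketch below. The data are hypothetical and complete responses for all three years are assumed; how respondents with missing follow-ups were handled is not specified here.

```python
from collections import Counter


def years_with_pain(yearly_reports):
    """Sum the binary yearly outcomes (1 = a period of pain that year)."""
    return sum(yearly_reports)


def category_proportions(counts, max_years=3):
    """Proportion of respondents with 0, 1, ..., max_years years of pain."""
    tally = Counter(counts)
    n = len(counts)
    return {k: tally.get(k, 0) / n for k in range(max_years + 1)}


# Hypothetical respondents: (baseline, one-year, two-year) pain indicators.
responses = [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 1), (0, 0, 0)]
counts = [years_with_pain(r) for r in responses]
proportions = category_proportions(counts)
print(counts, proportions)
```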
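The observations, Y (here 'number of years with pain'), were assumed to follow a Poisson distribution with expected value μ, where
$$\log(\mu_i) = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \dots + \beta_k x_{k,i},$$
where $x_1, x_2, \dots, x_k$ are explanatory variables and index i refers to the individual. The rate ratio for the effect of $x_j$ is $\exp(\beta_j)$.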
A model allowing for over-dispersion was used; that is, a Poisson distribution where the variance was larger than the expected value. The PROC GENMOD procedure of SAS was used (link = log, distribution = Poisson, scale = Pearson). The analysis indicated some over-dispersion (over-dispersion parameter = 1.1), and by allowing for this over-dispersion, the standard errors were not too small.
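A Pearson-based over-dispersion adjustment of this kind can be sketched as follows: the Pearson chi-square statistic divided by the residual degrees of freedom estimates the dispersion, and the model-based standard errors are multiplied by its square root. The function names and data are hypothetical, and whether the reported value of 1.1 refers to the dispersion or to its square root is not stated in the text.

```python
import math


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    observed: observed counts; fitted: fitted Poisson means (all > 0);
    n_params: number of estimated regression parameters.
    """
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / (len(observed) - n_params)


def scaled_standard_error(model_based_se, dispersion):
    """Inflate a Poisson model-based standard error for over-dispersion."""
    return model_based_se * math.sqrt(dispersion)


# Hypothetical counts and fitted means, for illustration only.
observed = [0, 1, 2, 3]
fitted = [1.0, 1.0, 2.0, 2.0]
phi = pearson_dispersion(observed, fitted, n_params=1)
print(phi, scaled_standard_error(0.10, phi))
```

A dispersion estimate above 1 widens the confidence intervals, while the point estimates of the rate ratios are unchanged.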
The response variable is based on the sum of the binary outcome for the three years. The sum of binary variables follows a Poisson distribution if the binary variables are independent, but here, in fact, the outcomes from one person are likely to be correlated. Hence, the true effect of exposure might be underestimated (as was observed in [33]), and this must be remembered when interpreting the results.

Markov transitional model for binary outcome (developing and ongoing pain)
The binary outcome 'a period of pain' was modelled with a Markov transitional model. The aim of this analysis was to identify those factors that might have a long-term influence, by investigating whether an exposure at baseline was associated with the development or recurrence of pain during the following year. The response variable 'a period of pain' was assumed to follow a Bernoulli distribution, with parameter p ($p = P(Y_{it} = 1)$). First, the development of pain was studied, by only considering those who were pain-free at baseline ($Y_0 = 0$) and modelling the probability of having pain at follow-up:
$$\operatorname{logit} P(Y_1 = 1 \mid Y_0 = 0) = \beta_0 + \beta_1 x_{1,0} + \beta_2 x_{2,0} + \dots + \beta_k x_{k,0},$$
where $x_{j,t}$ refers to explanatory variable j at time t.
Next, the recurrence of pain was studied by considering those who had pain at baseline ($Y_0 = 1$), and modelling the probability of having pain at follow-up:
$$\operatorname{logit} P(Y_1 = 1 \mid Y_0 = 1) = \beta_0 + \beta_1 x_{1,0} + \beta_2 x_{2,0} + \dots + \beta_k x_{k,0},$$
where the coefficients are estimated separately for this subgroup.
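The conditioning on baseline state that underlies the two transitional analyses can be illustrated without any regression at all: split the respondents by their baseline outcome and compute the proportion with pain at follow-up in each group. The sketch below uses hypothetical (baseline, follow-up) pairs and a hypothetical function name; the study itself fitted logistic models within each group.

```python
def transition_proportions(pairs):
    """Return (developing, ongoing) proportions from (y0, y1) pairs.

    developing: share with pain at follow-up among those pain-free at baseline.
    ongoing: share with pain at follow-up among those with pain at baseline.
    A proportion is None when its baseline group is empty.
    """
    totals = {0: 0, 1: 0}
    with_pain_at_follow_up = {0: 0, 1: 0}
    for baseline, follow_up in pairs:
        totals[baseline] += 1
        with_pain_at_follow_up[baseline] += follow_up

    def share(state):
        if totals[state] == 0:
            return None
        return with_pain_at_follow_up[state] / totals[state]

    return share(0), share(1)


# Hypothetical respondents, for illustration only.
pairs = [(0, 0)] * 3 + [(0, 1)] + [(1, 1)] * 2 + [(1, 0)] * 2
developing, ongoing = transition_proportions(pairs)
print(developing, ongoing)
```

Fitting a separate logistic model within each baseline group extends these raw proportions by letting them depend on the baseline explanatory variables.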

Descriptive
There were some differences between men and women in the distribution of the explanatory variables (Table 3). All analyses were therefore adjusted for gender (men, women).
The variables which differed over the three years were work/study time, physical activity (to some extent), high work/study demands, computer use pattern and perceived stress; it was thus possible to test these variables for short-term effect on musculoskeletal pain (Table 3).
The prevalence of 'a period of pain' at baseline, 23 percent, seems to be in the same range as the prevalence of upper back and neck pain after work among the younger section of the Swedish workforce (16-29 years) in 2003 [34]. The prevalence of pain at present was just slightly lower, 20 percent. Among the respondents, 15 percent developed pain between baseline and the one-year follow-up, and 52 percent had ongoing pain. Over the whole three-year period, a little more than half of the respondents, 61 percent, did not report any year with a period of pain, while 20 percent reported one year with a pain period, 11 percent reported two years with a pain period, and 8 percent reported a pain period in all three years.

Regression analysis
The first step in the regression analyses was to include one explanatory variable at a time together with gender (results presented under OR and PR or RR in Tables 4, 5 and 6). The second step was a multiple regression where gender, together with explanatory variables with p ≤ 0.2 in step 1, was included.
In the multiple regression model for 'pain at present', gender (women compared to men) was a statistically significant explanatory variable for neck pain (Table 4). The risk factors were smoking, high work/study demands, computer use pattern and perceived stress; high home life demands was a risk factor in the simple model, but was not statistically significant in the multiple regression (Table 4). Breakfast regularly, as a protective factor, was only close to statistically significant in the multiple model, though statistically significant in the simple model of pain at present (Table 4).
In the multiple regression model for 'number of years with pain' the statistically significant explanatory variables were gender (women compared to men), computer use pattern, high work/study demands and perceived stress (Table 5). In the simple models smoking, high home life demands and asthma were risk factors and breakfast regularly was a health factor, but none of these were statistically significant in the multiple regression model (Table 5). It should be noted that the rate ratios from the Poisson model for 'number of years with pain' are interpreted as the ratios of expected numbers of years with pain, and are not comparable in magnitude with the corresponding odds ratios.
In the multiple model for developing musculoskeletal pain during the last year ('a period of pain') the statistically significant explanatory variables were gender (women compared to men), computer use pattern and perceived stress (Table 6). Asthma also was a risk factor for developing pain in the simple model, but only close to statistically significant in the multiple model (Table 6).
In the multiple model for ongoing musculoskeletal pain during the last year ('a period of pain') none of the factors were statistically significant, but perceived stress was close (Table 7). In the simple model the explanatory variables gender (women compared to men) and perceived stress were statistically significant (Table 7).
From the simple regression models, the prevalence of neck pain among women was 28 percent (pain at present), 22 percent (developing pain) and 56 percent (ongoing pain), and the prevalence of neck pain among men was 12 percent (pain at present), 9 percent (developing pain) and 41 percent (ongoing pain).
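The prevalences above can be turned directly into women-to-men proportion ratios, as in the short sketch below. The dictionary layout is only an illustration. Because the inputs are percentages rounded to whole numbers, the results can differ slightly from the PRs reported in the paper, which presumably were computed from unrounded estimates.

```python
# Prevalence in percent (women, men), as quoted in the text above.
prevalence_percent = {
    "pain at present": (28, 12),
    "developing pain": (22, 9),
    "ongoing pain": (56, 41),
}

# Proportion ratio: prevalence among women divided by prevalence among men.
proportion_ratios = {
    outcome: women / men
    for outcome, (women, men) in prevalence_percent.items()
}

for outcome, ratio in proportion_ratios.items():
    print(f"{outcome}: PR = {ratio:.2f}")
```

The ordering of the ratios shows the pattern discussed in the text: the gender difference is smallest for ongoing pain.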

Principal findings
Pain was more prevalent among women than men for all outcome measurements, except for ongoing pain where results were indistinct. Perceived stress was a risk factor regarding developing pain, and was both a short-term and a long-term risk factor. Moreover, the results showed that high work/study demands were a short-term and long-term risk factor for neck pain. Computer use pattern was a risk factor for developing pain, but also both a short-term and a long-term risk factor. The above findings, regarding type of factor and direction of association, are consistent with a systematic review concerning neck pain [5] and with results in more recent studies [3,8,11,35]. Smoking was a risk factor for pain at present. Less certain results regarding possible risk factors were that smoking was, in the simple model, associated with number of years with pain. High home life demands were, in the simple models, associated with pain at present and with number of years with pain. Asthma was, in the simple models, associated with developing pain and number of years with pain. In an earlier study, an association between asthma and low back pain was shown [36].
[Table footnote] All odds ratios are adjusted for gender (using men as the reference category). For the adjusted odds ratios both p-values and 95% confidence intervals (CI) are presented. The total number of respondents varied between N = 737-885, because of incomplete data. Estimates were not calculated for explanatory variables with five or fewer exposed cases.
[Table footnote] All rate ratios are adjusted for gender (using men as the reference category). For the adjusted odds ratios both p-values and 95% confidence intervals (CI) are presented. The total number of respondents varied between N = 677-801, because of incomplete data.
[Table footnote] All odds ratios are adjusted for gender (using men as the reference category). For the adjusted odds ratios both p-values and 95% confidence intervals (CI) are presented. The total number of respondents varied between N = 267-326, because of incomplete data. Estimates were not calculated for explanatory variables with five or fewer exposed cases.
Results for ongoing pain were more uncertain, possibly due to the lower number of observations, as only those reporting a period of pain in the baseline questionnaire were included. For ongoing pain, perceived stress was a risk factor in the simple model, but only close to statistical significance in the multiple model. In the simple model gender was associated with ongoing pain, but in the multiple model the result was inconclusive. Note though that for all explanatory variables in the multiple regression the estimated ORs were of the same size as in the simple regressions. The only exception was for gender, where the OR for women compared to men decreased slightly in the multiple regression compared to the simple regression.
In all analyses, work/study time and physical activity (measured as hours per week) seemed to have very small or no effect (0.98 ≤ OR ≤ 1.0, 0.99 ≤ PR ≤ 1.0, 0.99 ≤ RR ≤ 1.0) and were not statistically significant in any analyses. This was also the case for overweight (0.87 ≤ OR ≤ 1.0, 0.82 ≤ PR ≤ 1.1, RR = 0.95).
The proportion of women developing pain was more than twice the proportion of men who developed pain (PR = 2.5). For ongoing pain, the corresponding proportion ratio was 1.4, meaning that the proportion with neck pain among women was in this case much closer to the proportion among men. This is consistent with recent systematic reviews that reported that women in the general working population were slightly more likely to report neck pain compared with men, but that the evidence is inconsistent regarding the role of gender in recovery from neck pain [16,37]. It would be interesting to investigate, in further studies, whether the differences between women and men in musculoskeletal pain mainly occur in the development of pain. A hypothesis worth testing would be that there is still some unexplained difference between men and women in the development of pain, while differences between men and women who already have pain can to a larger extent be explained by exposure to known risk factors. Suggested reasons in earlier studies behind the gender differences in MSD include the over-representation of women in sedentary and repetitive work, and the persistent gender imbalance in domestic work [38]. As the present study group consisted of young university students, mainly without children, the above explanation does not seem to apply here.

Strengths and weaknesses of the study
One of the limitations of the present study is that the outcome variables do not incorporate the disability or burden of the pain. This study should be seen as a first step towards stronger analysis of musculoskeletal outcome by using longitudinal data.
One of the aims of this study was to classify factors as having long-term or short-term effects. In order to allow interpretation of the explanatory variables as long-term or short-term effects, we investigated the stability of these variables over time. High work demands, computer use pattern, and perceived stress varied over the years, and could therefore be investigated regarding short-term effects in the model for pain at present and regarding long-term effects in the model for number of years with pain, the model for development of pain, and the model for ongoing pain. Factors that were not changeable (gender), or that could vary but did not in the present material (breakfast regularly, smoking) could not be investigated concerning short-term effects.
[Table footnote] All odds ratios are adjusted for gender (using men as the reference category). For the adjusted odds ratios both p-values and 95% confidence intervals (CI) are presented. The total number of respondents varied between N = 214-264, because of incomplete data. Estimates were not calculated for explanatory variables with fewer than five exposed cases.
The type of self-reported neck pain outcome used in the present study is noted to be sensitive to effects of seasonal variation at different follow-ups [39]. In the present study the time of the year for the data collection was the same to avoid this seasonal effect.
As in many studies, both the outcome and the exposure were self-reported. This could result in recall bias; that is, a tendency for individuals who have a high level of pain to over-report their exposure, and the reverse tendency among the pain-free individuals. Interpretation of effects in the analysis of the marginal model should be made with caution, as the outcome and exposure were measured at the same time. However, as the questions about exposure referred to either the previous week or to an undefined time period previous to the measurement, it is possible to interpret the results as indications of temporal effect.
Due to the use of different regression models and different outcomes, the integrated conclusions drawn from these analyses are more informative than the conclusions from a single one of these regressions. The different outcome variables used here represent temporary or ongoing musculoskeletal pain.

Meaning of the study: possible implications
The most consistent risk factors were perceived stress, high work/study demands and computer use pattern. In the framework of the balance theory these could be interpreted as factors measuring some aspects of lack of balance between load and resources or possibility for recovery. Computer use pattern to some degree measures the constant low-intensity physical exposure of computer work without breaks for recovery. High work/study demands, when negatively affecting home life, could represent demands too high to allow for sufficient recovery after work. Perceived stress could be interpreted as the person's own perception of imbalance between loads and resources. This interpretation of the most important risk factors suggests that preventive as well as curative work related to musculoskeletal neck pain should focus on the balance between load and resources/recovery.

Future research
These results should be confirmed in further longitudinal studies, where it would be desirable to investigate an outcome that includes the dimension of disability or burden and also to perform intervention studies to better understand causal effects.

Conclusion
Perceived stress, high work/study demands and computer use pattern (break pattern) were short-term risk factors as well as long-term risk factors for musculoskeletal neck pain. Perceived stress and computer use pattern were involved in developing neck pain. By the use of different regression models, the different aspects of the neck pain pattern could be addressed and the risk factors' impact on the pain pattern was identified.