Patients were recruited from a secondary-sector specialized outpatient back pain clinic over a two-year period from August 2005 to August 2007. This clinic receives referrals from primary care physicians or primary care chiropractors when at least four weeks of treatment in primary care by a family physician, chiropractor, physical therapist, or a combination thereof has not resulted in satisfactory improvement. At the back clinic, all patients underwent extensive examination and diagnostic procedures, received information about self-care for back pain, and attended group exercises twice a week for four weeks before being offered inclusion in this trial. To be included in the project, participants
had LBP with or without leg pain > 8 weeks
had average pain > 3 during the past two weeks on the 11-point numeric rating scale
had completed four weeks of treatment in the primary sector by a family physician, chiropractor, physical therapist, or a combination thereof
had concluded all examinations, individual and group treatment at the back clinic with at least a 75% attendance rate
were able to read and understand Danish
Exclusion criteria were
For eligible participants interested in participating in this trial, baseline data were collected one week after the end of the four-week group exercise program.
Randomization and interventions
Randomization was carried out by a project secretary after collection of the baseline data. Participants drew a sealed opaque envelope containing information about treatment allocation. Envelopes were arranged in clusters of 15 to secure an even spread over the three groups; randomization was therefore an ongoing process over the two years of recruitment.
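The clustered envelope procedure can be illustrated with a short simulation. This is a minimal sketch, not the procedure used in the trial: it assumes each cluster of 15 envelopes held five allocations per group, and it replaces the physical envelope draw with a seeded shuffle. The function names are hypothetical.

```python
import random

GROUPS = ["A", "B", "C"]  # supervised NW, unsupervised NW, advice to remain active
BLOCK_SIZE = 15           # envelopes per cluster, as described in the text


def make_block(rng):
    """Return one shuffled cluster of 15 allocations (assumed 5 per group)."""
    block = GROUPS * (BLOCK_SIZE // len(GROUPS))
    rng.shuffle(block)  # stands in for participants drawing envelopes
    return block


def allocation_sequence(n_participants, seed=None):
    """Concatenate shuffled clusters until every participant has an envelope."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        sequence.extend(make_block(rng))
    return sequence[:n_participants]


seq = allocation_sequence(150, seed=2005)
print({g: seq.count(g) for g in GROUPS})  # {'A': 50, 'B': 50, 'C': 50}
```

Because every complete cluster is balanced, the group sizes can never differ by more than five at any point during recruitment under this assumption.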
Group A received instruction and performed NW in groups of 6-8 twice a week for 8 weeks under the supervision of a specially trained NW instructor. NW commenced as soon as a group was filled, so different groups performed this intervention at different time points over the two years. There were three different scenic routes, each between three and four kilometres long. To determine the desired walking intensity in the supervised NW group, we placed accelerometers on the NW instructors for the first couple of sessions at the very beginning of the trial. By comparing and averaging all the values recorded from the instructors while they performed NW with the first participants, we derived an intensity interval that was used as a reference. Each session lasted around 45 minutes, and although participants were encouraged to walk at the predetermined intensity, not all were able to comply. Participants were therefore allowed to walk at different speeds: faster participants could walk ahead and, upon completion of the route, turn around, meet, and "pick up" the slower group in order to complete the session with them. Consequently, the dose and frequency were equal for all participants, but the intensity varied somewhat.
Group B was instructed in NW once by the same specially trained instructors in a single one-hour session. Afterwards they were left to perform NW as much as they wanted to at home on their own for the next 8 weeks.
Group C was given information about active living and exercise and about maintaining, by remaining active, the level of daily function they had achieved during the four-week period at the back pain clinic.
NW poles were provided free of charge to everyone included in the project. Participants randomized to group C received their poles as a gift after the 8-week intervention period but received no instruction in NW.
Primary outcome measures were
Low Back Pain Rating Scale (LBPRS) assesses the dimensions of pain, disability, and physical impairment for patients with LBP. The pain index uses three 11-box numeric rating scales (pain now, worst pain, and average pain during the last two weeks) for back and leg pain separately. The response scores are summed, giving a scale range of 0-60 points. The disability index comprises 15 items; the possible answers to each question were "yes", "can be a problem", or "no", scored as 0, 1, and 2 respectively, giving a range of 0-30 points.
Patient Specific Function Scale (PSFS) assesses functional limitations in a variety of clinical presentations. Patients are asked to identify three important activities that they are having difficulty with or are unable to perform because of their problem. In addition to specifying the activities, patients are asked to rate the current level of difficulty associated with each activity on an 11-box numeric rating scale.
Information on primary outcome measures was collected 11, 26, and 52 weeks after randomization.
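The LBPRS scoring described above can be sketched as follows. This is an illustrative sketch only: the function and variable names are hypothetical, and the answer-to-score mapping simply follows the order given in the text ("yes" = 0, "can be a problem" = 1, "no" = 2).

```python
PAIN_ITEMS = ("now", "worst", "average")  # three 11-box scales per region
DISABILITY_POINTS = {"yes": 0, "can be a problem": 1, "no": 2}


def pain_index(back, leg):
    """Sum six 0-10 ratings (three for back, three for leg): range 0-60."""
    total = 0
    for region in (back, leg):
        for item in PAIN_ITEMS:
            rating = region[item]
            if not 0 <= rating <= 10:
                raise ValueError(f"rating out of range: {rating}")
            total += rating
    return total


def disability_index(answers):
    """Sum 15 answers, each scored 0, 1, or 2: range 0-30."""
    if len(answers) != 15:
        raise ValueError("the disability index has 15 items")
    return sum(DISABILITY_POINTS[a] for a in answers)


# Made-up responses for one hypothetical participant
back = {"now": 4, "worst": 7, "average": 5}
leg = {"now": 2, "worst": 3, "average": 2}
answers = ["yes"] * 10 + ["can be a problem"] * 3 + ["no"] * 2

print(pain_index(back, leg))      # 23
print(disability_index(answers))  # 7
```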
Secondary outcome measures were
EQ-5D is a standardized 5-item generic measure of health-related quality of life. Domains of mobility, self-care, usual activities, pain/discomfort, and anxiety are assessed using a three-point response scale.
Medication use, other treatment for LBP, and time off work were recorded using questionnaires at different time points.
Expectations of treatment were also recorded: all participants rated their expectations of each of the three groups on a five-point Likert scale with response options ranging from 1 (very good) to 5 (poor).
Participants in the supervised NW and unsupervised NW groups wore Actigraph GT 256 accelerometers (MTI) during the fourth and fifth weeks of the eight-week intervention period to gauge whether there were differences in their physical activity levels. Participants were sent an MTI by mail and were asked to wear it every day for two weeks and then return it in a pre-paid return envelope. An MTI is strapped around the waist near the pelvis, as tight and as close to the skin as possible. It measures physical activity by registering the vertical movement of the centre of gravity, thereby registering both physical activity and, from the amplitude, its intensity. The MTI registers physical activity in counts per minute, which makes it possible to monitor activity precisely during the day, and MTIs have been shown to reliably measure daily physical activity in different populations [27, 28]. When analyzing data from MTIs, it is not possible to see exactly what kind of activities the user has been performing, and only activities involving vertical motion are registered; the MTI does not register, for example, bicycling or swimming.
Based on the primary outcome measures, a sample size of 130 participants would provide 80% power to detect a difference of eight units on the LBPRS (SD = 13) between the primary and each of the comparator groups, assuming an alpha of 0.05 and a two-sided test. This change has been shown to be the minimal clinically important difference (MCID) for patients undergoing the standard treatment at the back center, and data from this previous study were used as the basis for the power calculations. To allow for a 20% dropout, the sample size was increased to 150.
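The order of magnitude of this sample size can be checked with the standard normal-approximation formula for comparing two means, n = 2(z_{1-alpha/2} + z_{power})^2 SD^2 / delta^2 per group. The sketch below is an assumption about the type of calculation, not the authors' documented method; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided
    comparison of two means with a common standard deviation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


n = n_per_group(delta=8, sd=13)
print(n, 3 * n)  # 42 126
```

Under these assumptions the approximation gives 42 participants per group, or 126 across three groups, which is close to the 130 stated above. A calculation based on the t distribution typically gives a slightly larger figure.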
For the primary and secondary endpoints, the focus was on two pair-wise comparisons: between supervised and unsupervised NW, and between supervised NW and the advice to remain active. This was accomplished through an ANCOVA with the change from baseline as the dependent variable, treatment group (three levels) as a categorical variable, and adjustment for baseline level as a continuous variable. Standard errors were estimated via the sandwich approach, which is a useful method for obtaining robust standard errors in complex models. The pair-wise comparisons were performed with a Wald test. The within-group treatment effect was evaluated using a paired t-test.
Exploratory analysis of other variables (being on sick leave, medication use, and receiving concurrent treatment) followed the same approach. Exploratory analysis of dichotomous variables was performed using logistic regression. Finally, in the exploratory analysis the primary endpoint was re-evaluated. We defined an outcome as successful if the change was equal to or greater than the MCID, and counted the number of participants in each group who achieved the MCID for the LBPRS pain and disability dimensions. The influence of expectations on the outcome was analyzed in a regression model adjusting for individual baseline level. All analyses were performed using Stata version 10 statistical software and were based on the intention-to-treat analysis set.
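The responder count described above amounts to comparing each participant's improvement with the MCID. The following is a minimal sketch with made-up records. It assumes the eight-unit threshold from the sample size calculation applies to the score being analyzed and that improvement is baseline minus follow-up; the record layout and function name are hypothetical.

```python
from collections import Counter

MCID = 8  # LBPRS units; assumed to apply to the score analyzed here


def responders_per_group(records, mcid=MCID):
    """Return {group: (responders, participants)}, where a responder is a
    participant whose improvement (baseline - follow_up) is >= mcid."""
    responders = Counter()
    totals = Counter()
    for r in records:
        totals[r["group"]] += 1
        if r["baseline"] - r["follow_up"] >= mcid:
            responders[r["group"]] += 1
    return {g: (responders[g], totals[g]) for g in sorted(totals)}


# Made-up data for illustration only
records = [
    {"group": "A", "baseline": 30, "follow_up": 20},  # improved by 10
    {"group": "A", "baseline": 25, "follow_up": 22},  # improved by 3
    {"group": "B", "baseline": 28, "follow_up": 20},  # improved by 8
    {"group": "C", "baseline": 18, "follow_up": 19},  # worsened by 1
]
print(responders_per_group(records))
# {'A': (1, 2), 'B': (1, 1), 'C': (0, 1)}
```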
Data from the accelerometers were extracted and downloaded to a computer. These files were analyzed using Excel, and only data from participants contributing data for seven days or more were used. Between-group comparisons were described as the mean and SD of the total average activity intensity per hour, a comparison of high and low scores between groups, and the mean time spent at each intensity level during the day. The activity levels in the two groups were then compared with the intent of showing possible differences in the level of physical activity between the group that performed supervised NW and the group that performed NW on their own at home.
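The seven-day inclusion rule and the per-group mean and SD can be sketched in a few lines. The analysis in the study was carried out in Excel, so this is only an illustration of the same logic; the data layout (one average intensity value per recorded day) and all names are hypothetical.

```python
from statistics import mean, stdev

MIN_DAYS = 7  # participants with fewer recorded days are excluded


def group_summary(participants, min_days=MIN_DAYS):
    """Return {group: (mean, sd)} of each participant's average hourly
    activity intensity, using only participants with >= min_days of data."""
    per_group = {}
    for p in participants:
        days = p["daily_counts_per_hour"]  # one average value per recorded day
        if len(days) >= min_days:
            per_group.setdefault(p["group"], []).append(mean(days))
    return {
        g: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for g, values in per_group.items()
    }


# Made-up data for illustration only
participants = [
    {"group": "supervised", "daily_counts_per_hour": [100] * 7},
    {"group": "supervised", "daily_counts_per_hour": [120] * 8},
    {"group": "supervised", "daily_counts_per_hour": [300] * 5},  # excluded
    {"group": "unsupervised", "daily_counts_per_hour": [90] * 10},
]
print(group_summary(participants))
```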
Statistical significance was accepted at p < 0.05. P-values > 0.05 but < 0.1 are referred to as borderline significant.
The study was approved by the regional ethics committee for Funen and Vejle Counties (approval # VF 2005005).