A nomogram for predicting overall survival in patients with Ewing sarcoma: a SEER-based study

Background: Ewing sarcoma, the second most frequent bone tumor in children and adolescents, often presents with localized disease or metastasis-related symptoms. In this study, we aimed to construct and validate a nomogram to predict 3- and 5-year overall survival (OS) in patients with Ewing sarcoma, based on the Surveillance, Epidemiology, and End Results (SEER) database.
Methods: Demographic and clinicopathological characteristics of patients with Ewing sarcoma diagnosed between 2010 and 2015 were extracted from the SEER database. Univariate and multivariate Cox analyses were carried out to identify independent prognostic factors, which were then used to construct a nomogram. Finally, the c-index and calibration curves were used to validate the nomogram.
Results: A total of 578 patients were included in the analysis. Univariate Cox analysis showed that age, 7th AJCC stage, 7th AJCC T stage, 7th AJCC N stage, 7th AJCC M stage, and metastasis to the lung, liver and bone were significant factors. Multivariate Cox analysis confirmed age, N stage and bone metastasis as independent variables. A nomogram was then constructed from these independent variables to predict 3- and 5-year OS. The c-indexes (0.757 in the training set and 0.697 in the validation set) and calibration curves close to the ideal curves indicated that the nomogram predicted OS accurately.
Conclusions: The individualized nomogram demonstrated a good ability in prognostic prediction for patients with Ewing sarcoma.


Background
Ewing sarcoma, the second most frequent bone tumor in children and adolescents, often presents with localized disease or metastasis-related symptoms [1]. Several clinical parameters influence the survival of patients with Ewing sarcoma. Age, tumor stage, tumor location, metastatic disease, chemotherapy and surgery have all been found to affect overall survival (OS) in these patients [2][3][4][5][6][7]. The aim of this study was to integrate prognostic parameters into a single analysis and predict outcomes in patients with Ewing sarcoma.
As a statistical prognostic model, the nomogram is a graphical tool in which each variable is assigned points, so that the probability of a given event can be estimated more easily than with traditional evaluation standards [8]. In recent years, with the growing need for individualized medicine, this model has been widely applied in a great variety of tumors [9][10][11][12]. Consequently, in the present study, we extracted data on patients with Ewing sarcoma from the Surveillance, Epidemiology, and End Results (SEER) database to construct and validate a nomogram for predicting OS.

Study population
The SEER database provides demographic and clinicopathological information on patients in the United States. Data for this study were obtained from the SEER 18 Regs dataset (1973-2015). Patients were included in the analysis if they met the following criteria: Ewing sarcoma (histological code 9260/3) diagnosed from 2010 to 2015, and only one primary tumor in "bones and joints". Cases diagnosed by death certificate or autopsy were excluded from our analysis. The variables in our research included age, race, sex, 7th AJCC stage, 7th AJCC T stage, 7th AJCC N stage, 7th AJCC M stage, and metastasis to the lung, brain, liver and bone. Overall survival was defined as the period from diagnosis to death or to the time of the last follow-up.

Nomogram construction
The patients were divided into a training set (n = 406) and a validation set (n = 172) using the caret (Classification and Regression Training) package in R version 3.6.1. The nomogram was constructed from the analysis of the training set. To identify factors related to prognosis, univariate and multivariate Cox proportional hazards regression analyses were performed. The results of the multivariate Cox regression analysis were then used to formulate the nomogram with the rms package in R version 3.6.1.
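The split described above can be illustrated with a minimal sketch. The exact caret call and split ratio are not reported; the set sizes (406 of 578) are merely consistent with a roughly 70/30 partition, so the fraction of 0.7, the stratification by event status, and the function name `stratified_split` are all assumptions made for illustration. The sketch uses plain Python rather than R so that it stays self-contained.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices into training and validation sets within each label.

    Hypothetical helper: the 0.7 fraction and the stratification by
    label are assumptions, not details reported in the study.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train, valid = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Number of cases from this stratum that go to the training set.
        k = round(train_fraction * len(indices))
        train.extend(indices[:k])
        valid.extend(indices[k:])
    return sorted(train), sorted(valid)

# Toy labels (1 = death observed, 0 = censored); not study data.
toy_status = [0] * 60 + [1] * 40
train_idx, valid_idx = stratified_split(toy_status)
```

Stratifying keeps the proportion of events similar in both sets, which is one plausible reason the two sets could end up with comparable characteristics.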

Nomogram validation
Two criteria, the concordance index (c-index) and the calibration curve, were used to validate the nomogram model in both the training and the validation sets. The c-index, which ranges between 0 and 1, assesses the performance of the model: the larger the c-index (> 0.70), the better the model performs [10]. Calibration curves lying close to the ideal curve were taken to indicate accurate prediction by the nomogram [11].
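To make the c-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the rms implementation used in the study; the function name `concordance_index` is hypothetical, and pairs with tied survival times are simply skipped here, which is a simplifying assumption.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event
    before patient j's follow-up time. The pair is concordant when the
    patient who died earlier has the higher predicted risk; tied risk
    scores count as half. Tied times are ignored in this sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (not study data): higher risk score, earlier death.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 1, 0]
perfect = concordance_index(toy_times, toy_events, [4, 3, 2, 1])
uninformative = concordance_index(toy_times, toy_events, [1, 1, 1, 1])
```

A model that ranks every comparable pair correctly scores 1, while a model that assigns the same risk to everyone scores 0.5.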

Statistical analysis
All statistical analyses were performed in R 3.6.1 (http://www.R-project.org). P < 0.05 was considered statistically significant. The nomogram construction was based on Cox proportional hazards regression models. OS curves were plotted by the Kaplan-Meier method using the survival and survminer packages in R 3.6.1.
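As an illustration of the Kaplan-Meier method named above, the sketch below computes the product-limit estimate of the survival function. It only illustrates the estimator and is not the survival/survminer code used in the study; the function name `kaplan_meier` and the toy data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` holds 1 for an observed death and 0 for a censored case.
    At each event time t, survival is multiplied by (1 - d / n), where
    d is the number of deaths at t and n the number still at risk.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Toy data (not study data): the patient at time 2 is censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve.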

Demographics and clinicopathological characteristics of the training and validation sets
A total of 578 patients with Ewing sarcoma diagnosed from 2010 to 2015 in the SEER database were included in the present study. Age was converted into a categorical variable with three groups (≤18, 19-27 and ≥28 years) using X-tile (Fig. 1). As shown in Table 1, the demographic and clinicopathological characteristics of the training and validation sets were similar.

Identification of prognostic factors in the training set
Univariate Cox analysis was carried out to assess the effect of demographic and clinicopathological characteristics on survival outcomes. As shown in Table 2, age, 7th AJCC stage, T stage, M stage, N stage, and metastasis to the liver, lung and bone were risk factors in patients with Ewing sarcoma. Multivariate Cox analysis was then performed and suggested that age, N stage and bone metastasis were independent prognostic factors for OS. In addition, Kaplan-Meier curve analysis was used to verify the prognostic ability of these factors (Fig. 2a-j), indicating that longer OS was related to younger age (P < 0.0001), lower tumor stage (P < 0.0001), lower M stage (P < 0.0001), lower T stage (P < 0.0001), lower N stage (P < 0.0001), no metastasis to the bone and liver (P < 0.0001), no metastasis to the brain (P = 0.041), and no metastasis to the lung (P = 0.00033). Sex (P = 0.9) and race (P = 0.26) had no significant impact on OS (Additional file 1).

Construction of the nomogram in the training set
To provide a quantitative approach to predicting 3- and 5-year OS, a nomogram that included all the independent clinicopathological risk factors was formulated (Fig. 3). The points assigned to each item in the nomogram are added up to give a total score. As shown in Fig. 3, age contributed most to prognosis, followed by bone metastasis and N stage.
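The way a Cox-based nomogram turns model effects into points can be sketched as follows. This is a simplified illustration, not the rms procedure behind Fig. 3: it assumes that every hazard ratio is expressed against a reference level worth 0 points, that all hazard ratios are at least 1, and that the largest effect is scaled to 100 points. The hazard ratios used are those reported in the Discussion for age and bone metastasis; N stage is left out because its hazard ratio is not given in the text, so the totals here are not the totals of the published nomogram. The names `nomogram_points` and `reported_hr` are hypothetical.

```python
import math

def nomogram_points(hazard_ratios):
    """Scale log hazard ratios so that the largest effect is 100 points.

    Assumes each hazard ratio is relative to a reference level worth
    0 points and that every hazard ratio is >= 1.
    """
    log_hr = {level: math.log(hr) for level, hr in hazard_ratios.items()}
    largest = max(log_hr.values())
    return {level: 100 * beta / largest for level, beta in log_hr.items()}

# Hazard ratios quoted in the Discussion (N stage is not reported there).
reported_hr = {
    "age 19-27": 1.928,
    "age >=28": 5.324,
    "bone metastasis": 3.476,
}
points = nomogram_points(reported_hr)

# Partial score for a hypothetical patient aged >=28 with bone metastasis.
partial_total = points["age >=28"] + points["bone metastasis"]
```

Under these assumptions, the ordering of the points mirrors the ordering of the hazard ratios, which matches the statement that age contributed most, followed by bone metastasis.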

Validation of predictive accuracy of the nomogram in the training and validation sets
To validate the predictive accuracy of the nomogram, the c-index and calibration curves were used to evaluate the model. The c-index was 0.757 in the training set and 0.697 in the validation set, which suggested good accuracy of the model. Next, calibration curves were generated with the rms, foreign and survival packages in R 3.6.1, and close agreement between the ideal curves and the calibration curves was observed in both the training and validation sets (Fig. 4a-d). These results indicated good discrimination and calibration of the nomogram model.

Discussion
In this study, we built an individualized nomogram, which integrated routinely available information such as age, N stage, and metastasis to bone, to predict OS in a large cohort of patients with Ewing sarcoma. C-indexes and calibration curves were used in the validation.
The Kaplan-Meier curves showed that gender had no significant impact on OS, which was consistent with the findings of Bosma et al., Friedman et al. and Ren et al. [3,4,7], who found that female patients with Ewing sarcoma had similar OS to male patients. In developing the nomogram, we found that age played a pivotal role in the total points. Patients aged 28 years or older had a high risk and a shorter OS (19-27 years: hazard ratio (HR) = 1.928, 95% confidence interval (CI) = 1.200-3.097; ≥28 years: HR = 5.324, 95% CI = 3.434-8.254). This finding was consistent with the results of other studies [3][4][5][6], except that of Bosma et al., whose systematic review concluded that the level of evidence for an association between age and OS was inconclusive.
In childhood and adolescence, the tumor metastasizes to the liver, bone and lung at an early stage [13,14]. Despite timely treatment, patients with metastasis usually have a poor OS [15][16][17]. Accordingly, the multivariate Cox analysis revealed that metastasis to the bone was another independent factor for OS, and patients with bone metastasis in our analysis had shorter survival than those without (bone metastasis: HR = 3.476, 95% CI = 2.271-5.320). In addition, N stage (AJCC, 7th ed.) was taken into account when we constructed the nomogram. Nomogram models have been found to have better prognostic performance than staging systems alone in several tumors [10,11]. In patients with intrahepatic cholangiocarcinoma who underwent partial hepatectomy, Wang et al. [10] included both laboratory indices and demographic data in the construction of a nomogram and found that it predicted OS more accurately than several staging systems. Wang et al. [11] combined a staging system with patients' demographic information to develop a nomogram and reached a similar conclusion. Taken together, a nomogram combining demographics with a staging system predicted OS more accurately.
This study had some limitations. First, although c-indexes and calibration curves were applied to validate the nomogram, the present research lacked external validation. More work should be done to strengthen the validity of the model. Second, treatment data were not collected, so the predicted OS may be imprecise, because survival is partly affected by treatment [18]. However, not including surgery, chemotherapy and radiotherapy in the nomogram could make the model more applicable to patients initially presenting to the clinic who are awaiting evaluation by oncologists. Third, the model was constructed from a retrospective cohort, which means that inherent biases were unavoidable. Further prospective studies are therefore required for validation.

Conclusions
The nomogram demonstrated a good ability in prognostic prediction for patients with Ewing sarcoma.