



REVIEW ARTICLE 

Year : 2021 | Volume : 4 | Issue : 1 | Page : 13-21

Basics of survival statistics for oncologists
Anurag Mehta^{1}, Anurag Sharma^{2}
^{1} Department of Laboratory and Transfusion Services, Rajiv Gandhi Cancer Institute and Research Center, Delhi, India ^{2} Department of Research, Rajiv Gandhi Cancer Institute and Research Center, Delhi, India
Date of Submission  08-May-2021 
Date of Acceptance  01-Jun-2021 
Date of Web Publication  31-Jul-2021 
Correspondence Address: Dr. Anurag Sharma, Department of Research, Rajiv Gandhi Cancer Institute and Research Center, Delhi, India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jco.jco_8_21
In clinical practice, survival curves show the fraction of patients who have experienced the outcome of interest over time. Because survival data contain “censored” observations, in which a patient is lost to follow-up before experiencing the outcome, a sensible survival curve cannot be computed by simple division. This article describes important aspects of survival analysis: censoring and several survival estimation techniques that are simple to calculate and understand and that offer better visualization of statistical significance.

Keywords: Censoring, statistical significance, survival analysis, survival curve
How to cite this article: Mehta A, Sharma A. Basics of survival statistics for oncologists. J Curr Oncol 2021;4:13-21 
Introduction   
Survival is defined as the state of continuing to exist or live, customarily in spite of an ordeal, accident, or difficult circumstances,^{[1]} or the act of living longer than another person or thing.^{[2]} In medical science, survival is defined as the period of time that a patient lives after being diagnosed with a specific disease.^{[3]} These survival statistics help doctors in estimating the prognosis of a patient and in evaluating treatment options. Medical data comprising patients’ survival are known as survival data.
Survival analysis is the branch of medical statistics that deals with statistical methods for analyzing survival data derived from laboratory studies of animals, from clinical and epidemiologic studies of humans, and from other applications in medicine, public health, social science, and engineering.^{[4]} Generally, survival analysis is defined as a collection of statistical methods for analyzing data where the variable of interest is the “time until the outcome of interest occurs.”
Survival analysis is one of the primary statistical methods for data analysis where the outcome variable of interest is the time to the occurrence of an event. In medical statistics, this event may be death, disease recurrence, disease occurrence, or any designated outcome of interest (e.g., relapse, breast retraction) that may happen to a participant during the period of observation or study.^{[5],[6],[7],[8],[9],[10],[11]} It is also called time-to-event analysis or failure-time analysis.^{[12]}
Basic Concept Behind Survival Analysis   
In medical science, the use of survival analysis has grown rapidly, as it helps doctors and researchers in estimating the prognosis of patients. It also helps them in evaluating treatment options and achieving the desired results. Furthermore, survival data are typically skewed and consist of both early and late events. These features of survival data make the use of survival analysis necessary.^{[13]}
In oncology, overall survival is traditionally considered the gold standard among efficacy end points. However, it has been observed that the benefit in overall survival varies from patient to patient. Moreover, the clinical impact of cancer drugs is characterized by multiple outcome measures. Thus, other promising alternatives to overall survival should also be considered, which makes survival analysis an integral part of study for oncologists. This article provides various important aspects and methods of survival analysis and is a step toward helping oncologists understand its basic concepts.^{[14]}
Survival Time   
Survival time is usually defined from the beginning of follow-up of an individual until the occurrence of the outcome of interest. By time, we mean the years, months, weeks, or days from the beginning of follow-up until the event occurs; for example, survival time can refer to the duration from the date of diagnosis to the date of death if one is interested in the overall survival of patients.^{[15]}
The time origin of survival time should be specified clearly so that participants are, as much as possible, on an equal footing. For example, if one is interested in studying the survival time of patients with cancer, the time origin could be taken as the date of diagnosis of that cancer. Similarly, the end point or outcome of interest should be clearly specified so that the times considered are well defined.^{[16]} In the example above, the end point could be defined as death due to the cancer studied.
Descriptive Statistics of Survival Time   
The estimates of the survival function provide descriptive statistics of survival, for example, the median survival time and the mean survival time. Mean survival time is defined as the average length of time from the date of diagnosis of a disease for which patients diagnosed with that disease remain free of the outcome.^{[17]} In oncology, average survival is one way to see how well patients have responded to a new treatment. However, average survival is not a good indicator of response, as it is affected by extreme values. Therefore, another descriptive statistic, the median survival, is used to summarize survival time.
Median survival time is defined as the length of time from the date of diagnosis of a disease at which half of the patients diagnosed with that disease have not yet experienced the outcome.^{[17]} It is a commonly used statistic in survival analysis. Median survival has an advantage over mean survival, as survival times may be unknown (censored) and tend to have a skewed distribution. However, in the case of high censoring, the median survival time is not reached, in which case it, too, is not a good indicator of the survival estimates.
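The pull of extreme values on the mean, and the robustness of the median, can be seen in a small sketch (the survival times below are hypothetical):

```python
# Hypothetical survival times (months): one long-term survivor pulls the
# mean upward, while the median is unaffected by the extreme value.
import statistics

survival_months = [4, 5, 6, 7, 8, 9, 60]

mean_survival = statistics.mean(survival_months)      # ≈ 14.14 months
median_survival = statistics.median(survival_months)  # 7 months
```

Here the single long-term survivor inflates the mean to roughly twice the median, illustrating why the median is the preferred summary for skewed survival times.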
In cancer survival analysis, median survival time alone may also be a poor indicator of an intervention’s benefit: there may be little difference in the median survival times of the intervention and nonintervention groups, yet the advantage of the intervention becomes visible over long follow-up, with the intervention group showing more long-term survivors than the nonintervention group. In that case, the intervention is useful in prolonging the life of the patients, even though this was not evident from the median survival time. For the same reason, different statistical tests, viz., the log-rank test, the Breslow (generalized Wilcoxon) test, and the Tarone–Ware test, are used to compare the survival of groups based on the survivorship pattern in the data.
Another statistic used in survival analysis is the median follow-up, defined as the median duration between study entry and the time when the data on the outcome are collected. This concept is widely used in cancer survival analysis, as cancer studies must have long follow-up durations to record enough events to disclose meaningful patterns in the data. However, for a study involving a very aggressive cancer, a short follow-up may also be appropriate.
Therefore, in survival analysis, survival times are summarized through graphical or numerical summaries over the whole group of individuals rather than through simple descriptive statistics. In general, survival data are conveniently presented by estimating the hazard function and the survival function. These methods are known as nonparametric methods, as they do not require any assumptions regarding the distribution of the survival time.
Intent-to-Treat Principle   
The intent-to-treat (ITT) principle is an important principle in survival analysis: it requires all participants to be included in the analysis regardless of compliance (drop in and drop out) with the assigned group or treatment, as excluding these patients might have significant implications for the results of the study. ITT allows the researcher to draw unbiased conclusions regarding the effectiveness of a treatment.^{[18]}
ITT ignores withdrawal, protocol deviations, noncompliance, and anything else that happens after the start of the study. It is often regarded more as a complete strategy for design and analysis than as an approach to analysis alone.^{[19],[20]} It preserves the sample size of the study, because excluding patients from the final analysis reduces the sample size and might thereby reduce the statistical power of the results.^{[21]}
However, ITT has been criticized as being too cautious and susceptible to type II error. Moreover, it can introduce heterogeneity, as it mixes dropout, noncompliant, and compliant subjects together in the analysis.^{[20],[22]}
Survival Analysis in Oncology   
Survival analysis has found increasing use over the last 15–20 years, especially in oncology.^{[23]} Correct application and presentation of survival analysis are critically important in oncology.^{[24],[25]} It helps in evaluating the effectiveness of the care provided to patients with cancer. It is also used to analyze the time to progression, relapse, or death of patients or to compare treatments in a clinical trial. Examples of survival analysis questions in oncology include “the recurrence of cancer after the completion of treatment” and “how long patients survive after being diagnosed with cancer.”
In oncology, survival analysis is used to calculate various types of survival rates as discussed in [Table 1]^{[17]}:
Censoring   
In survival analysis, some observations may have incomplete information, because the outcome of interest may not be experienced by some patients before the completion of the study. In these scenarios, the survival time is said to be a censored survival time, as the patient’s actual survival time is longer than the observed one. There are different types of censoring, as defined below^{[9]}:
Right censoring: this censoring occurs when a patient or participant leaves the study before experiencing the outcome of interest or when the event has not occurred by the completion of the study. For example, consider a clinical trial studying the effect of a treatment on the survival of patients over 5 years. Patients who have not died by the completion of the study are considered censored observations. If another patient leaves the study at some time point t_{e}, then the event might have occurred in (t_{e}, ∞). Censoring may occur in the following forms:
 Loss to follow-up (LFU): the patient does not visit again or may have moved elsewhere.
 Drop out: the treatment may have side effects severe enough that it has to be stopped, or the patient may decline to continue the treatment.
 Completion of study: in some cases, the study may end at a predefined time point. Such censoring is known as administrative censoring.
 Competing risks: the outcome is observed due to another event (e.g., death by some other reason).
Left censoring: this is the type of censoring where the patient has already experienced the outcome of interest before the start of the study. For example, Rodrigues et al. studied the time to first use of marijuana in boys, who were asked when they had first used the drug.^{[26]} An answer such as “I have used marijuana but I do not remember the exact time of my first use” is an example of a left-censored observation.
Interval censoring: this censoring occurs when the only available information is that the event occurred during some interval. Interval censoring generally arises when subjects in a trial or longitudinal study have periodic follow-ups and the outcome is known only to happen within a given interval of time (l_{i}, u_{i}], where l_{i} and u_{i} are the left and right end points of the censoring interval.
In survival data with censoring, applying standard statistical methods is not appropriate; however, discarding the censored observations is not a good choice either. Therefore, various techniques have been developed to analyze censored survival data. This article presents three statistical techniques for estimating survival: the Kaplan–Meier (K–M) method, the weighted K–M method, and an interval-censored survival method.
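Before any of these techniques can be applied, the censored data must be encoded. A minimal sketch of the usual encoding (patient labels, times, and reasons below are hypothetical) pairs each observed duration with an event indicator:

```python
# A minimal sketch of how right-censored survival data are typically encoded:
# one duration per patient plus an event indicator (1 = event observed,
# 0 = censored). Patient labels, times, and reasons are hypothetical.
patients = [
    ("P1", 6, 1),   # died at month 6 (event)
    ("P2", 6, 0),   # lost to follow-up at month 6 (censored)
    ("P3", 10, 1),  # died at month 10 (event)
    ("P4", 12, 0),  # event-free at study end (administrative censoring)
]

durations = [t for _, t, _ in patients]  # [6, 6, 10, 12]
events = [e for _, _, e in patients]     # [1, 0, 1, 0]
```

Both censored patients keep their observed durations in the data set; the indicator is what tells the estimator not to treat those times as events.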
Kaplan–Meier Method   
The K–M method is one of the oldest and most straightforward nonparametric methods.^{[27]} In this method, the survival function is computed using a product-limit formula. The K–M method is one of the most commonly used methods to analyze survival data.
Mathematically, let n be the total number of participants monitored in the study and t_{1}, t_{2}, ..., t_{n} the observed times. The survival times of some of these patients may have been censored. Therefore, we assume that the number of observed outcomes is r, where r ≤ n, and that t_{(1)} ≤ t_{(2)} ≤ ... ≤ t_{(r)} are the patients’ ordered event times. Now, the number of patients at risk just before t_{(j)} (including those who experience the event at this time) is n_{j}, and the number of those who experience the outcome at t_{(j)} is d_{j}. Therefore, for any time t, the K–M estimator is as follows:

S(t) = ∏_{j: t_{(j)} ≤ t} (n_{j} − d_{j})/n_{j}
Case 1   
Consider the treated group from Table 1.1 of Cox and Oakes (Freireich et al.)^{[9]}:
6, 6, 6, 6+, 7, 9+, 10, 10+, 11+, 13, 16, 17+, 19+, 20+, 22, 23, 25+, 32+, 32+, 34+, 35+,
where times with + are right censored.
For these data, survival times are estimated using the K–M product-limit formula. Results are shown in [Table 2]:
The median survival time is read from the K–M curve by drawing a straight line from the survival probability of 0.5 on the y-axis parallel to the x-axis; the time point at which this line cuts the curve is the median survival time. In this example, although more than 50% of the observations were censored, the estimated survival probability falls just below 0.5 at t = 23, so the median survival time is 23.
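The product-limit calculation above can be sketched in a few lines of code. Applied to the treated-group data from Case 1 (with “+” times encoded as event = 0), it reproduces the familiar estimates for these classic data:

```python
# A pure-Python sketch of the Kaplan-Meier product-limit estimator, applied
# to the treated group from Case 1 ("+" times are right censored, event = 0).
def kaplan_meier(durations, events):
    """Return {event time t_(j): S(t_(j))} via S(t) = prod (n_j - d_j)/n_j."""
    surv, s = {}, 1.0
    for t in sorted({u for u, e in zip(durations, events) if e == 1}):
        n_j = sum(1 for u in durations if u >= t)   # at risk just before t
        d_j = sum(1 for u, e in zip(durations, events) if u == t and e == 1)
        s *= (n_j - d_j) / n_j
        surv[t] = s
    return surv

durations = [6, 6, 6, 6, 7, 9, 10, 10, 11, 13, 16, 17, 19, 20, 22, 23, 25, 32, 32, 34, 35]
events    = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]

surv = kaplan_meier(durations, events)
# surv[6] ≈ 0.857, surv[13] ≈ 0.690, surv[23] ≈ 0.448
```

Note that censored observations never produce a drop in the curve; they only shrink the risk set n_{j} for later event times.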
Case 2   
Consider another study where the outcome of interest is the recurrence of cancer. Forty-four head and neck cancer patients diagnosed from February 2018 to September 2019 and treated at the Rajiv Gandhi Cancer Institute and Research Center (RGCIRC) were followed, with a median follow-up of 15 months. Recurrence-free survival (RFS) was defined as the duration from the date of last treatment to the date of recurrence/last follow-up/last contact. At the end of the study, only nine patients had experienced the outcome of interest. [Figure 1] shows the K–M curve for the patients.  Figure 1: Kaplan–Meier curve showing the survival of patients in remission
The median follow-up of the patients was 15 months. Thus, it is appropriate to report the 1-year RFS for this data set, which was 82.1%; the 2-year RFS was 54.8%. However, of the 44 patients, 35 did not experience the event (recurrence), which amounts to 79.55% censoring. Hence, these estimates may be highly biased owing to the high level of censoring. Furthermore, consider the flat line from the 9th month to the 24th month, which implies that the survival probability was unchanged between these two time points; this is not true, as patients are continuously at risk rather than at risk only at specific times. The table below the K–M curve in [Figure 1] shows the patients at risk at different time points. Of the 44 patients, 2 progressed within 1 month of treatment, whereas 1 patient was LFU; thus, the number at risk is shown as 41 at the zero time point in the table in [Figure 1]. From the first to the fifth month, eight patients progressed or were LFU; thus, the table in [Figure 1] shows 33 patients at risk at the fifth month. As time increases, the number of patients at risk decreases, which can result in a large number of censored observations, as is evident in the flat line from the 9th to the 24th month of the survival curve.
Also, as time progresses, the number of patients at risk falls, which increases the size of each step. This is evident in [Figure 2], where a recurrence in the 24th month caused a drastic fall in survival probability from 82.1% to 54.8% due to just one event.  Figure 2: Kaplan–Meier curve showing the survival of head and neck cancer patients in remission
Moreover, as the last patient at risk also progressed, the K–M survival probability falls to zero, which is misleading, as the true survival probability will never fall to zero in practice.

So, in the case of heavy censoring, the K–M estimate is not reliable, and it overestimates the survival probabilities.^{[27]} The K–M survival curve also fails to give reliable estimates at the end points.^{[28]}

For example, a study may be terminated with a large number of censored observations, which could be due to LFU, withdrawal, or an outcome other than the event of interest.

Such a high number of censored observations reduces the number of participants at risk at successive time points. As a result, the survival estimates obtained by K–M are no longer reliable. High levels of censoring can also signal problems in the study, among them a quick end to the study (by which most of the patients have not experienced the outcome by the end of the study) and a censoring pattern that excludes many patients or subjects during a specific period. Hence, a large number of censored observations makes the survival estimates erroneous and higher than their true values. Unfortunately, no suitable test determines the validity of the censoring assumption; this is simply a judgment made by the researchers.
Apart from the biased results in high censored data, the K–M method has other major drawbacks that have been discussed above. They can be summarized as follows:
The drop at each event draws unnecessary attention to those particular “danger times,” with the K–M estimate of survival remaining unchanged until the next event is encountered. In reality, patients are not at risk only at specific times; instead, they are in constant danger of failure, with the risk of failure possibly changing with time.
As time progresses, there are fewer remaining patients at risk. This has two direct effects on the K–M curve: (i) the interval between failures grows, and (ii) the effect of each individual failure on the size of the step increases. Thus, the impact of a single failure is unjustifiably magnified if it occurs at a later time.
If the last remaining patient at risk fails, the K–M survival estimate falls to zero at that time, whereas the true S(t) will never reach zero in any physically sensible model.
Hence, there is a need for alternatives to the K–M method. Various methods have been proposed as alternatives; two of them are the weighted K–M method and Turnbull’s algorithm for interval-censored survival data. These two methods are chosen because they take censoring into account and deal with scenarios where the exact time of the event is not known.
Weighted Kaplan–Meier Method   
To modify the K–M estimates, a new method, the weighted K–M, was presented by Jan et al. and Jan.^{[29],[30]} They revealed that in the case of highly censored observations (27% in their article), K–M estimates contain error, and the survival estimates will be higher than the actual values. Shafiq et al. and Huang also presented other methods to resolve the problem of unreliable K–M estimates.^{[31],[32]}
To calculate the weighted K–M survival estimates, a method presented by Jan et al. and Jan is used.^{[29],[30]} They proved that in the case of a considerable proportion of censored observations, the K–M estimation might give unreliable and inefficient results. As in K–M, let us assume that
c_{j} = number of censored patients at t_{(j)} and
w_{j} = weights of censored observations that are defined as follows:
If t_{(j)} is an event time, w_{j} = 1, and if t_{(j)} is a censored time, 0 < w_{j} < 1. The weighted K–M estimate S^{*}(t) is then obtained by incorporating these weights into the product-limit formula, so that S^{*}(t) solves the problem of overestimation (which existed in the K–M estimates) through proper weighting.
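As an illustration only (the exact weighting scheme of Jan et al. is not reproduced here), the sketch below treats each censored observation at t_{(j)} as a fractional event with weight (1 − w_{j}); this pulls the estimate below the plain K–M curve and shows how down-weighting censored observations counteracts overestimation:

```python
# Illustrative sketch only: not the exact Jan et al. formula. Each censored
# observation at t_(j) is treated as a fractional event with weight (1 - w),
# which lowers the estimate relative to the plain K-M curve.
def weighted_km(durations, events, w=0.5):
    """Weighted product-limit sketch; w in (0, 1) down-weights censored cases."""
    surv, s = {}, 1.0
    for t in sorted(set(durations)):
        n_j = sum(1 for u in durations if u >= t)
        d_j = sum(1 for u, e in zip(durations, events) if u == t and e == 1)
        c_j = sum(1 for u, e in zip(durations, events) if u == t and e == 0)
        s *= (n_j - d_j - (1 - w) * c_j) / n_j   # censored cases count partially
        surv[t] = s
    return surv

# hypothetical data: two events, two censored observations
s_weighted = weighted_km([5, 5, 8, 10], [1, 0, 1, 0], w=0.5)  # s_weighted[5] = 0.625
```

With w = 1 for every time point, the factors reduce to (n_{j} − d_{j})/n_{j} and the plain K–M estimate is recovered; smaller w gives the censored observations more influence on the drop.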
Example   
Ramadurai et al. reviewed different procedures proposed for estimating the survival function. Their results revealed that the weighted K–M is a worthy alternative for estimating the survival probability: in the case of heavy censoring, the 5-year K–M survival rate (95.37%) overestimated the survival probability as compared with the weighted K–M estimate (64.80%).^{[32]}
Interval-Censored Survival Data   
In oncology practice, an important end point is the time to recurrence/progression of cancer, commonly known as “RFS.” The exact time of recurrence is generally not known, but it can be narrowed down to a specific time interval between two follow-ups of the patient. Thus, it is assumed that T lies within an interval [T_{1}, T_{2}], that is, T_{1} ≤ T ≤ T_{2}; such data are interval-censored. However, this interval-censoring mechanism is often ignored by taking the recurrence time as the recorded date of recurrence and subsequently applying methods for right-censored data,^{[33],[34],[35],[36],[37]} even though the recurrence might have occurred at any point between two follow-ups of the patient. Furthermore, statisticians and researchers apply traditional survival analysis methods because they are well known and because few statistical software packages are known to analyze interval-censored data.
Another example of interval-censored data is the reappearance of oral lesions in oral care following transplantation. In such cases also, the exact time of the appearance of the lesion is not known but can be narrowed to a time interval between two follow-ups of the patient. However, commonly used methods of survival analysis overestimate the survival function, which can lead to inaccurate results and conclusions.^{[38],[39],[40]} Turnbull’s algorithm is an important survival estimation method that takes interval-censored data into consideration.^{[41]}
Various methods have been proposed to analyze interval-censored data. For example, Peto proposed a method for the estimation of the cumulative distribution function from interval-censored survival data.^{[42]} This methodology is akin to the life-table technique and to the algorithm presented to estimate survival.^{[43]} Semiparametric approaches based on the Cox proportional hazard (Cox PH) model have also been proposed for interval-censored survival data.^{[44],[45],[46],[47],[48],[49]} Moreover, a large number of parametric models can also be used for the estimation of the distribution of survival time in the presence of interval censoring.^{[50],[51],[52]} In a comprehensive review, Gómez et al. presented the most commonly used parametric, nonparametric, and semiparametric estimation methods for interval-censored data.^{[53],[54]} Rodrigues et al. presented an application of interval-censored methodology to a data set on boys’ first use of marijuana.^{[26]}
Turnbull’s Algorithm   
Turnbull’s algorithm is an important survival estimation method that takes interval-censored data into consideration. The most common approach to analyzing interval-censored survival data is to use nonparametric estimation methods for the survival function. This approach makes no assumptions of the Cox PH model, and, therefore, the estimated curves are easy to interpret in the same manner as K–M curves in the case of right censoring.
Here, we present a product-limit estimator for the estimation of the survival function in the case of interval-censored data. This estimator, suggested by Turnbull,^{[54]} is based on an iterative method and has no closed form.
For the construction of the estimator, let 0 = T_{0} < T_{1} < T_{2} < ... < T_{m} be a grid of times that comprises all interval end points l_{i} and u_{i} for i = 1, ..., n. For the ith observation, α_{ij} is defined as the weight that indicates whether the event that occurred during the interval (l_{i}, u_{i}] might have occurred at time T_{j}. This weight is defined as α_{ij} = 1 if T_{j} ∈ (l_{i}, u_{i}], and α_{ij} = 0 otherwise.
Turnbull’s algorithm is applied as follows:
Step 1: the probability that the event occurs at time T_{j} is computed from the current survival estimate as p_{j} = S(T_{j−1}) − S(T_{j})
Step 2: the number of events occurring at T_{j} is estimated as d_{j} = Σ_{i=1}^{n} [α_{ij}p_{j} / Σ_{k=1}^{m} α_{ik}p_{k}]
Step 3: the estimated number of patients at risk at time point T_{j} is computed as Y_{j} = Σ_{k=j}^{m} d_{k}
Step 4: calculate the product-limit estimator by using the quantities determined in steps 2 and 3.
Step 5: stop the process when the updated survival estimate is sufficiently close to the previous one for all T_{j}. Otherwise, repeat the process using the updated estimate of the survival function.
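The steps above can be sketched as a self-consistency (EM) iteration. The intervals in the example are hypothetical; an event known only to lie in (l, u] is written (l, u), and (l, ∞) encodes a right-censored observation:

```python
# A minimal sketch of Turnbull's self-consistency (EM) iteration for
# interval-censored data. The example intervals are hypothetical.
def turnbull(intervals, tol=1e-8, max_iter=1000):
    inf = float("inf")
    # candidate event times T_j: all finite interval end points
    times = sorted({b for pair in intervals for b in pair if b != inf})
    m, n = len(times), len(intervals)
    # alpha[i][j] = 1 if T_j could be the event time for observation i
    alpha = [[1 if l < t <= u else 0 for t in times] for l, u in intervals]
    p = [1.0 / m] * m                        # initial probabilities (Step 1)
    for _ in range(max_iter):
        d = [0.0] * m                        # expected events at T_j (Step 2)
        for a in alpha:
            denom = sum(aj * pj for aj, pj in zip(a, p))
            for j in range(m):
                d[j] += a[j] * p[j] / denom
        new_p = [dj / n for dj in d]         # updated estimate (Steps 3-4)
        done = max(abs(x - y) for x, y in zip(new_p, p)) < tol
        p = new_p
        if done:                             # converged (Step 5)
            break
    surv, cum = {}, 0.0                      # S(T_j) = 1 - cumulative mass
    for t, pj in zip(times, p):
        cum += pj
        surv[t] = 1.0 - cum
    return surv

observed = [(0, 4), (2, 6), (4, 8), (6, float("inf"))]
surv = turnbull(observed)                    # non-increasing step function
```

Each pass redistributes every observation's unit of probability mass over the candidate times its interval allows, in proportion to the current estimate, until the estimate stops changing.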
Many statistical software packages are available to analyze interval-censored data; such functions can be found in R,^{[55]} SAS, and STATA. For more details, see the stintreg command in STATA,^{[56]} survreg() from the survival package^{[57]} and the Icens package^{[58]} in R, and the LIFETEST procedure in SAS.^{[59]}
Example   
Rücker and Messerer (1988) illustrated Turnbull’s algorithm using R.^{[39]} They considered the time until cosmetic deterioration for breast cancer patients^{[44]} and aimed to compare the effect of radiation plus chemotherapy versus radiation therapy alone. The event of interest in their study was the appearance of breast retraction. Patients who did not experience the event by the end of the study were treated as right censored, whereas patients who experienced breast retraction were treated as interval censored. Turnbull’s algorithm estimated breast retraction-free survival at t = 40 months as 11.06% for radiation plus chemotherapy against 47.37% for radiation therapy alone.
Cox Proportional Hazard Model   
In practice, the exact form of the underlying survival distribution is usually unknown, and as a result, we may not be able to find an appropriate parametric model. Therefore, the use of parametric methods in identifying significant prognostic factors is somewhat limited.
Under such circumstances, we need another method to model survival data in the presence of censoring. A very popular model that works well in survival analysis with these problems is the Cox PH model. The Cox PH model is used for the analysis of survival data in the presence of covariates or prognostic factors. The most notable feature of this model is that it makes no assumption about the underlying survival distribution; it assumes only that the hazard is a function of the independent covariates acting on an unspecified baseline hazard.
Basically, two main types of regression models have been developed for time-to-event survival data. The first models the hazard function in patient groups relative to a baseline population by means of a multiplicative effect on the hazard scale. The multiplicative factor is constant over time, so the model forces the hazards in the different patient groups to be proportional; this is the Cox PH model.^{[60]} The second models the survival time directly, with covariates assumed to act multiplicatively on the time scale. This accelerated failure time model is the other type of regression model: a class of linear regression models in which the response variable is the logarithm, or a known monotone transformation, of the failure time.^{[61]}
Assumptions of the Cox Proportional Hazard Model   
The Cox PH model is also called a semiparametric regression model: it makes no assumption about the shape of the hazard function, but it does specify how the covariates affect the hazard. The model is nonparametric to the extent that no assumption is made about the form of the baseline hazard. The first and foremost assumption is that of noninformative censoring: the study must ensure that the mechanisms giving rise to the censoring of an individual are not related to the probability of the event occurring. Violation of this assumption can invalidate just about any sort of survival analysis, from K–M estimation to the Cox model.
The second assumption of the Cox model is that of proportional hazards (PH): the survival curves for two strata must have hazard functions that are proportional over time.
A key reason why the Cox model is so popular is that, even though the baseline hazard is not specified, reasonably good estimates of the regression coefficients, the hazard ratios of interest, and adjusted survival curves can be obtained for a wide variety of data situations. The Cox PH model is a “robust” model: the results obtained using the Cox model will closely approximate those obtained with the correct parametric model. Although there are many parametric methods for assessing goodness of fit, one may never be completely certain that a particular parametric model is appropriate; the Cox model, by contrast, gives reliable results, making it a “safe” choice of model.
One should note that the hazard function h(t, X) and its corresponding survival curves S(t, X) can be estimated for the Cox model even though the baseline hazard function is not specified.
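This estimability rests on the PH relation itself: given any estimate of the baseline survival S_{0}(t), the survival curve for a covariate pattern X follows as S(t, X) = S_{0}(t)^{exp(βX)}. A small sketch with hypothetical numbers:

```python
# A small sketch of the proportional-hazards relationship (hypothetical
# numbers): if h(t, X) = h0(t) * exp(beta * X), then cumulative hazards
# scale the same way, so S(t, X) = S0(t) ** exp(beta * X).
import math

beta = 0.7                     # hypothetical log hazard ratio for X = 1 vs X = 0
hazard_ratio = math.exp(beta)  # ≈ 2.01: about double the baseline hazard

baseline_surv = {12: 0.80, 24: 0.60, 36: 0.45}  # hypothetical S0(t), in months

adjusted_surv = {t: s ** hazard_ratio for t, s in baseline_surv.items()}
# e.g. adjusted_surv[12] ≈ 0.64: uniformly lower survival for the X = 1 group
```

Because exp(βX) does not depend on t, the adjusted curve is obtained from the baseline curve by a single power transformation, which is why adjusted survival curves come essentially for free once β and S_{0}(t) are estimated.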
Another important factor in the Cox model’s popularity is that it is preferred when survival time information is available and when censoring exists, because it makes full use of both.
Though PH models are popular for the analysis of survival data, the proportionality assumption of these models is seldom met. There are very few distributions for which parametric forms of PH models are defined, such as the exponential, Gompertz, and Weibull distributions. Also, the hazard functions of some distributions are very difficult to obtain compared with their survival functions. So, parametric PH models can be used with relatively few probability distributions.
Conclusion   
Survival analysis is the statistical method used to analyze data when one is interested in the time until the occurrence of an event or outcome of interest. This article discussed three survival estimation techniques: the K–M method, the weighted K–M method, and Turnbull’s algorithm for interval-censored survival data. The K–M method is one of the oldest and most straightforward nonparametric methods. However, in the case of high censoring, K–M is severely affected by the censoring assumption, which biases the estimates; high censoring levels thus affect the accuracy and reliability of the estimates obtained by K–M. In such cases, the weighted K–M method is an ideal alternative: it uses appropriate weights to reduce the bias at censored time points and thus resolves the issue of overestimation. The other method discussed is Turnbull’s algorithm, which is used to estimate survival from interval-censored data, where the exact time of the outcome is not known but the interval in which it falls is. We hope the analyses presented in this article will help researchers and doctors better understand the implications of applying conventional survival analysis methods versus appropriate methods when analyzing different kinds of survival data.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References   
1.  Definition of survival [online]. Oxford University Press. Available at: https://www.lexico.com/definition/wake. Accessed 14 February 2021. 
2.  Survival.2021. In MerriamWebster.com. Available at: https://www.merriamwebster.com/dictionary/survival. Accessed 14 February 2021. 
3.  Survival.2021. In medicaldictionary.com. Available at: https://medicaldictionary.thefreedictionary.com/survival. Accessed 14 February 2021. 
4.  Lee S, Lim H. Review of statistical methods for survival analysis using genomic data. Genomics Inform 2019;17:e41. 
5.  Cox DR, Snell EJ. A general definition of residuals. J R Stat Soc Ser B 1968;30:248-65. 
6.  Crowley J, Hu M. Covariance analysis of heart transplant survival data. J Am Stat Assoc 1977;72:27-36. 
7.  Kalbfleisch JD, Prentice RL. Survival models and data analysis. New York: John Wiley; 1980. 
8.  Lawless JF. Statistical models and methods for lifetime data. Vol 362. New York: John Wiley & Sons;2011. 
9.  Cox DR, Oakes D. Analysis of survival data. Vol 21. Boca Raton, FL: CRC Press;1984. 
10.  Makridakis S, Hibon M. ARMA models and the Box–Jenkins methodology. J Forecast 1997;16:14763. 
11.  Hair JF, Black WC, Babin BJ, Anderson RE, Tatham RL. Multivariate data analysis. 6th ed. New Jersey: Pearson Prentice Hall; 2006. 
12.  Hosmer DW Jr, Lemeshow S. Applied survival analysis: Time-to-event. Vol 317. New York: Wiley-Interscience;1999. 
13.  Wienke A. Frailty models in survival analysis. Boca Raton, FL: Chapman & Hall/CRC;2011. 
14.  Salas-Vega S, Iliopoulos O, Mossialos E. Assessment of overall survival, quality of life, and safety benefits associated with new cancer medicines. JAMA Oncol 2017;3:382-90. 
15.  Llobera J, Esteva M, Rifa J, Benito E, Terrasa J, Rojas C, et al. Terminal cancer: Duration and prediction of survival time. Eur J Cancer 2000;36:2036-43. 
16.  Clark TG, Bradburn MJ, Love SB, Altman DG. Survival analysis part I: Basic concepts and first analyses. Br J Cancer 2003;89:232-8. 
17.  Cancer.gov. Available at: https://www.cancer.gov/publications/dictionaries/cancerterms/. Accessed 14 February 2021. 
18.  McCoy CE. Understanding the intention-to-treat principle in randomized controlled trials. West J Emerg Med 2017;18:1075-8. 
19.  Lewis JA, Machin D. Intention to treat – who should use ITT? Br J Cancer 1993;68:647-50. 
20.  Hollis S, Campbell F. What is meant by intention to treat analysis? Survey of published randomised controlled trials. BMJ 1999;319:670-4. 
21.  Wertz RT. Intention to treat: Once randomized, always analyzed. Clin Aphasiol 1995;23:57-64. 
22.  Fergusson D, Aaron SD, Guyatt G, Hébert P. Post-randomisation exclusions: The intention to treat principle and excluding patients from analysis. BMJ 2002;325:652-4. 
23.  Altman DG, De Stavola BL, Love SB, Stepniewska KA. Review of survival analyses published in cancer journals. Br J Cancer 1995;72:511-8. 
24.  Damuzzo V, Agnoletto L, Leonardi L, Chiumente M, Mengato D, Messori A. Analysis of survival curves: Statistical methods accounting for the presence of long-term survivors. Front Oncol 2019;9:453. 
25.  Bell Gorrod H, Kearns B, Stevens J, Thokala P, Labeit A, Latimer N, et al. A review of survival analysis methods used in NICE technology appraisals of cancer treatments: Consistency, limitations, and areas for improvement. Med Decis Making 2019;39:899-909. 
26.  Rodrigues AS, Bhering FL, Pereira CAB, Polpo A. Bayesian estimation of component reliability in coherent systems. IEEE Access 2018;6:18520-35. 
27.  Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457-81. 
28.  Murray S. Using weighted Kaplan–Meier statistics in nonparametric comparisons of paired censored survival outcomes. Biometrics 2001;57:361-8. 
29.  Jan B, Shah SWA, Shah S, Qadir MF. Weighted Kaplan–Meier estimation of survival function in heavy censoring. Pakistan J Stat Ser 2005;21:55. 
30.  Jan B. Improved inferences in the context of survival/failure time [dissertation]. Pakistan: Peshawar University;2004. 
31.  Huang ML. A weighted estimation method for survival function. Appl Math Sci 2008;2:753-62. 
32.  Shafiq M, Shah S, Alamgir M. Modified weighted Kaplan–Meier estimator. Pak J Stat Oper Res 2007;3:39-44. 
33.  Ramadurai M, Ponnuraja C. Nonparametric estimation of the survival probability of children affected by TB meningitis. Res World 2011;2:216. 
34.  Liedtke C, Mazouni C, Hess KR, André F, Tordai A, Mejia JA, et al. Response to neoadjuvant therapy and long-term survival in patients with triple-negative breast cancer. J Clin Oncol 2008;26:1275-81. 
35.  Mitsudomi T, Morita S, Yatabe Y, Negoro S, Okamoto I, Tsurutani J, et al; West Japan Oncology Group. Gefitinib versus cisplatin plus docetaxel in patients with non-small-cell lung cancer harbouring mutations of the epidermal growth factor receptor (WJTOG3405): An open label, randomised phase 3 trial. Lancet Oncol 2010;11:121-8. 
36.  da Costa AA, Valadares CV, Baiocchi G, Mantoan H, Saito A, Sanches S, et al. Neoadjuvant chemotherapy followed by interval debulking surgery and the risk of platinum resistance in epithelial ovarian cancer. Ann Surg Oncol 2015;22(3 suppl):S971-8. 
37.  Del Carmen MG, Supko JG, Horick NK, Rauh-Hain JA, Clark RM, Campos SM, et al. Phase 1 and 2 study of carboplatin and pralatrexate in patients with recurrent, platinum-sensitive ovarian, fallopian tube, or primary peritoneal cancer. Cancer 2016;122:3297-306. 
38.  Bahnassy AA, El-Sayed M, Ali NM, Khorshid O, Hussein MM, Yousef HF, et al. Aberrant expression of miRNAs predicts recurrence and survival in stage-II colorectal cancer patients from Egypt. Appl Cancer Res 2017;37:113. 
39.  Rücker G, Messerer D. Remission duration: An example of interval-censored observations. Stat Med 1988;7:1139-45. 
40.  Law CG, Brookmeyer R. Effects of midpoint imputation on the analysis of doubly censored data. Stat Med 1992;11:1569-78. 
41.  Odell PM, Anderson KM, D’Agostino RB. Maximum likelihood estimation for interval-censored data using a Weibull-based accelerated failure time model. Biometrics 1992;48:951-9. 
42.  Turnbull BW. Nonparametric estimation of a survivorship function with doubly censored data. J Am Stat Assoc 1974;69:169-73. 
43.  Peto R. Experimental survival curves for interval-censored data. Appl Stat 1973;22:86-91. 
44.  Turnbull BW. The empirical distribution function with arbitrarily grouped, censored and truncated data. J R Stat Soc 1976;38:290-5. 
45.  Finkelstein DM, Wolfe RA. A semiparametric model for regression analysis of interval-censored failure time data. Biometrics 1985;41:933-45. 
46.  Finkelstein DM. A proportional hazards model for interval-censored failure time data. Biometrics 1986;42:845-54. 
47.  Goetghebeur E, Ryan L. Semiparametric regression analysis of interval-censored data. Biometrics 2000;56:1139-44. 
48.  Betensky RA, Rabinowitz D, Tsiatis AA. Computationally simple accelerated failure time regression for interval censored data. Biometrika 2001;88:703-11. 
49.  Lesaffre E, Komárek A, Declerck D. An overview of methods for interval-censored data with an emphasis on applications in dentistry. Stat Methods Med Res 2005;14:539-52. 
50.  Zhang M, Davidian M. “Smooth” semiparametric regression analysis for arbitrarily censored time-to-event data. Biometrics 2008;64:567-76. 
51.  Sparling YH, Younes N, Lachin JM, Bautista OM. Parametric survival models for interval-censored data with time-dependent covariates. Biostatistics 2006;7:599-614. 
52.  Lindsey JC, Ryan LM. Tutorial in biostatistics methods for interval-censored data. Stat Med 1998;17:219-38. 
53.  Achcar JA, Tomazella VLD, Saito MY. Lifetime interval-censored data: A Bayesian approach. J Appl Stat Sci 2007;16:77-89. 
54.  Gómez G, Calle ML, Oller R, Langohr K. Tutorial on methods for interval-censored data and their implementation in R. Stat Model 2009;9:259-97. 
55.  R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing;2018. 
56.  StataCorp. Stata statistical software: Release 15. College Station, TX: StataCorp LLC;2017. 
57.  Therneau TM, Lumley T. Package ‘survival’. R Top Doc 2015;128:28-33. 
58.  Gentleman R, Vandal A. Icens: NPMLE for censored and truncated data. R package version 1.52.0; 2018. 
59.  Cary, NC: SAS Institute;2014. 
60.  Cox DR. Regression model and life tables (with discussion). J R Stat Soc B 1972;34:187-220. 
61.  Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. 1st ed. Hoboken, NJ: John Wiley & Sons;1980. 
[Figure 1], [Figure 2]
[Table 1], [Table 2]
