Survival data are widely used in oncology clinical trials. The standard methods, such as the log-rank test and the Cox regression model, rely on the proportional hazards assumption. However, survival data with non-proportional hazards (NPH) are also common; in that setting these methods lose power and may conceal the true treatment effect. Therefore, at the trial design stage, we need to plan a test of the proportional hazards assumption and pre-specify the analysis method for each possible test result. This paper introduces widely used methods for testing proportional hazards and summarizes the applicable conditions, advantages, and disadvantages of analysis methods for survival data with non-proportional hazards. When non-proportional hazards occur, the appropriate method should be chosen case by case, and the results should be interpreted with caution.
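As a rough illustration of the log-rank test named above, the following is a minimal pure-Python sketch of the standard two-sample version. The function name `logrank_test` and its argument layout are hypothetical choices made for this example, not taken from the paper; the statistic is compared with a chi-square distribution with one degree of freedom.

```python
import math

def logrank_test(times, events, groups):
    """Two-sample log-rank test (illustrative sketch, hypothetical helper).

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if censored
    groups : 0 or 1, the treatment arm of each subject
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    # Distinct times at which at least one event occurred
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0  # sum over event times of (observed - expected) in group 1
    variance = 0.0       # sum of hypergeometric variances
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)  # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Upper tail of a chi-square(1) variable: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Toy data (made up for illustration): arm 1 fails early, arm 0 fails late
chi2, p = logrank_test([1, 2, 3, 10, 11, 12], [1, 1, 1, 1, 1, 1], [1, 1, 1, 0, 0, 0])
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Because the statistic sums observed-minus-expected events with equal weight over all event times, early and late differences of opposite sign (for example, crossing survival curves) can cancel, which is one way to see why the test loses power under NPH.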
Objective: To summarize and explore the application of machine learning models to survival data with non-proportional hazards (NPH), and to provide a methodological reference for large-scale, high-dimensional survival data. Methods: First, the concept of NPH and the related testing methods were outlined. Then, the advantages and disadvantages of machine learning-based methods for NPH survival analysis were summarized from the relevant literature. Finally, a case study on real-world clinical data applied two ensemble machine learning models and two deep learning models to survival data with NPH: the risk of death within 30 days among stroke patients in the ICU. Results: Eight commonly used machine learning-based methods for NPH survival analysis were identified, comprising five traditional machine learning models (e.g., random survival forest) and three deep learning models based on artificial neural networks (e.g., DeepHit). In the case study, the random survival forest model performed best (C-index = 0.773, IBS = 0.151), and permutation importance identified age as the most important feature affecting the risk of death in stroke patients. Conclusion: Survival big data exhibiting NPH are common in the era of precision medicine, and machine learning-based survival analysis can be used when the survival data are more complex and the analytical demands are higher.
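To make the reported C-index more concrete, here is a minimal pure-Python sketch of Harrell's concordance index. The abstract does not state which C-index variant the study used, so Harrell's definition is an assumption here, and the function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch, hypothetical helper).

    A pair (i, j) is comparable when subject i had an observed event and
    a strictly shorter follow-up time than subject j. The pair is
    concordant when the model gives i the higher risk score; tied
    scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable

# Toy data (made up for illustration): higher score means higher predicted risk
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.7, 0.4, 0.1]
print(concordance_index(times, events, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.773 indicates that the model ordered roughly three quarters of comparable patient pairs correctly.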