  • 1. Institute of Medical Information, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100020, P. R. China;
  • 2. Shenzhen Health Development Research and Data Management Center, Shenzhen 518028, P. R. China;
  • 3. National Population Health Data Center, Chinese Academy of Medical Sciences, Beijing 100730, P. R. China;
HE Xiaofeng, Email: 0912hexf@163.com; LI Jiao, Email: li.jiao@imicams.ac.cn

Objective To summarize and explore how machine learning models are applied to survival data with non-proportional hazards (NPH), and to provide a methodological reference for analyzing large-scale, high-dimensional survival data.

Methods First, the concept of NPH and the methods used to test for it were outlined. Next, the advantages and disadvantages of machine learning-based NPH survival analysis methods were summarized from the relevant literature. Finally, a case study on real-world clinical data applied two ensemble machine learning models and two deep learning models to survival data with NPH: the risk of death within 30 days among stroke patients in the ICU.

Results Eight commonly used machine learning-based methods for NPH survival analysis were identified: five traditional machine learning models (e.g., random survival forest) and three deep learning models based on artificial neural networks (e.g., DeepHit). In the case study, the random survival forest model performed best (concordance index [C-index]=0.773, integrated Brier score [IBS]=0.151), and a permutation importance-based algorithm identified age as the most important feature affecting the risk of death in stroke patients.

Conclusion In the era of precision medicine, large-scale survival data commonly exhibit NPH. Machine learning-based survival analysis can be used when the survival data are more complex and the analytical requirements are more demanding.
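For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' implementation; the function name `harrell_c_index` and the toy data in the usage note are hypothetical and serve only to illustrate how the metric is defined. A pair of patients is treated as comparable when the one with the shorter follow-up time had an observed event, and the pair is concordant when that patient also received the higher predicted risk.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up time per subject
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is only usable if the earlier time is an observed event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data (not from the study): four subjects, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.7, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.773 indicates that the model ordered roughly three quarters of comparable patient pairs correctly. This pairwise sketch is quadratic in the number of subjects; established survival analysis libraries provide faster implementations.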

Citation: CHEN Haoran, LIU Xiayang, WANG Min, YANG Lin, WANG Jiayang, SUN Haixia, DUAN Yongheng, WU Xusheng, SHANG Li, QIAN Qing, HE Xiaofeng, LI Jiao. Application of machine learning models for survival data with non-proportional hazard and case study. Chinese Journal of Evidence-Based Medicine, 2024, 24(9): 1108-1116. doi: 10.7507/1672-2531.202401190
