  • 1. Department of Clinical Research, The First Affiliated Hospital of Jinan University, Guangzhou 510630, P.R. China;
  • 2. School of Public Health, Shaanxi University of Chinese Medicine, Xianyang 712046, P.R. China;
LYU Jun, Email: lyujun2020@jnu.edu.cn

Objective To verify the influence of different variable selection methods on the performance of clinical prediction models. Methods Three sample sets (an acute myocardial infarction group, a sepsis group, and a cerebral hemorrhage group) were extracted from the MIMIC database. Six existing variable selection methods were applied to each: direct entry into Cox regression, forward stepwise selection, backward stepwise selection, LASSO, ridge regression, and a random-forest-based variable importance algorithm. The optimal variable set chosen by each method was used to construct a model. Model performance was assessed with the C-index, the area under the ROC curve (AUC), and calibration curves, and the results were compared within and between groups. Results The variables selected, and their number, differed across the six methods; however, neither the within-group nor the between-group comparisons showed that any one method significantly improved model performance. Conclusions Before using a variable selection method to build a clinical prediction model, researchers should first clarify the research purpose and determine the type of data, and then combine medical knowledge to select a method that suits the data type and achieves the research purpose.
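The C-index named in the Methods can be illustrated with a minimal sketch of Harrell's concordance index. This is not the authors' code: the function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied observed times are simply skipped, which is a simplification of how statistical packages usually handle ties.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the subject
    with the higher predicted risk has the shorter observed time.

    times       -- observed follow-up times
    events      -- 1 if the event occurred, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, one censored.
times = [5, 3, 8, 6]
events = [1, 0, 1, 1]
scores_model_a = [0.9, 0.2, 0.1, 0.5]
scores_model_b = [0.1, 0.2, 0.9, 0.5]

c_a = harrell_c_index(times, events, scores_model_a)
c_b = harrell_c_index(times, events, scores_model_b)
print(c_a, c_b)
```

Computing the index for models built on different variable sets, as in the two score lists here, is one way to make the within-group comparison described in the abstract; a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.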

Citation: ZHENG Shuai, HUANG Tao, YANG Rui, LI Li, QIAO Mengmeng, CHEN Chong, LYU Jun. Validation of multivariate selection method in clinical prediction models: based on MIMIC database. Chinese Journal of Evidence-Based Medicine, 2021, 21(12): 1463-1467. doi: 10.7507/1672-2531.202107175
