• 1. School of Mathematics, Southwest Jiaotong University, Chengdu, Sichuan 610097, P. R. China;
  • 2. Department of Neurology, the Third People’s Hospital of Chengdu & the Affiliated Hospital of Southwest Jiaotong University, Chengdu, Sichuan 610041, P. R. China;
  • 3. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 610097, P. R. China;
XIONG Yao, Email: 350960109@qq.com

Objective  To evaluate the predictive performance of three machine learning methods, namely support vector machine (SVM), K-nearest neighbor (KNN) and decision tree, for the daily number of new patients with ischemic stroke in Chengdu. Methods  The daily numbers of new ischemic stroke patients from January 1st, 2019 to March 28th, 2021 were extracted from the Third People’s Hospital of Chengdu. Meteorological and air quality data for Chengdu over the same period were obtained from the China Weather Network. Correlation analyses, multinomial logistic regression, and principal component analysis were used to explore the factors influencing the level of the daily number of new ischemic stroke patients in this hospital. Then, using R 4.1.2 software, the data were randomly divided in a ratio of 7∶3 (70% into the training set and 30% into the validation set), which were used respectively to train and validate the three machine learning methods (SVM, KNN and decision tree), with a logistic regression model as the benchmark. The F1 score, the area under the receiver operating characteristic curve (AUC) and the accuracy of each model were calculated. The data division, training and validation were repeated three times, and the average F1 scores, AUCs and accuracies over the three repetitions were used to compare the predictive performance of the four models. Results  Ranked by accuracy from high to low, the four models were SVM (88.9%), logistic regression model (87.5%), decision tree (85.9%), and KNN (85.1%); ranked by F1 score, they were SVM (66.9%), KNN (62.7%), decision tree (59.1%), and logistic regression model (57.7%); ranked by AUC, they were SVM (88.5%), logistic regression model (87.7%), KNN (84.7%), and decision tree (71.5%). Conclusion  SVM predicted better than the traditional logistic regression model and the other two machine learning models.
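The evaluation protocol described in the Methods (a random 7∶3 split, repeated three times, with accuracy, F1 score and AUC averaged over the repetitions) can be illustrated with a minimal sketch. This is not the authors' code: the study used R 4.1.2, whereas the sketch below is plain Python, uses synthetic data, assumes a binary outcome for simplicity, and evaluates only a small hand-written KNN scorer. All function names (`split_70_30`, `knn_score`, `repeated_evaluation`, `make_synthetic_rows`) and parameter values such as `k=5` are hypothetical.

```python
import random

def split_70_30(rows, rng):
    """Shuffle rows and return (train, validation) in a 7:3 ratio."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): 2TP / (2TP + FP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, scores):
    """AUC as the chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def knn_score(train, x, k=5):
    """Fraction of positive labels among the k nearest training rows."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    return sum(label for _, label in nearest) / len(nearest)

def repeated_evaluation(rows, repeats=3, seed=0):
    """Repeat split/train/validate and average the three metrics."""
    rng = random.Random(seed)
    totals = {"accuracy": 0.0, "f1": 0.0, "auc": 0.0}
    for _ in range(repeats):
        train, valid = split_70_30(rows, rng)
        y_true = [label for _, label in valid]
        scores = [knn_score(train, x) for x, _ in valid]
        y_pred = [1 if s >= 0.5 else 0 for s in scores]
        totals["accuracy"] += accuracy(y_true, y_pred)
        totals["f1"] += f1_score(y_true, y_pred)
        totals["auc"] += auc(y_true, scores)
    return {name: total / repeats for name, total in totals.items()}

def make_synthetic_rows(n=200, seed=1):
    """Synthetic (features, label) rows; NOT the study's data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        features = (rng.gauss(2.0 * label, 1.0), rng.gauss(-1.0 * label, 1.0))
        rows.append((features, label))
    return rows

if __name__ == "__main__":
    print(repeated_evaluation(make_synthetic_rows()))
```

Averaging over several random splits, as the study does, reduces the dependence of the reported metrics on any single partition. Reporting F1 alongside accuracy matters when one outcome level is much rarer than the others, which may explain why the model rankings in the Results differ by metric.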

Citation: WANG Mingxu, ZHANG Haitao, XIONG Yao, WANG Zi, YAO Yi, CHEN Yuhao, WANG Lu. Evaluation of daily number of new ischemic stroke cases in a hospital in Chengdu based on machine learning and meteorological factors. West China Medical Journal, 2023, 38(2): 233-239. doi: 10.7507/1002-0179.202205042
