Data integrity, accuracy, and traceability are key elements of high-quality clinical research, and they are also weak links in efforts to promote clinical research transparency. How to improve data quality has become a major concern for all clinical research stakeholders. In this article, we dissected and analyzed the data generation and capture process in clinical research and identified a key step toward improving data quality: promoting electronic source data, and in particular breaking the barrier between electronic health records and clinical research systems. We also summarized experience with this issue in China and overseas and propose a solution suited to China for improving data quality in clinical research: strengthening source data management by building a clinical research source data platform and adopting a common source data management process in hospitals.
As an important auxiliary tool in pharmacoeconomic evaluation, budget impact analysis can effectively measure the affordability of a medical insurance fund, and it plays a significant role in medical insurance access negotiations, adjustment of the medical insurance reimbursement directory, and the setting of payment prices. The quality of the data used in budget impact analysis has a great impact on the analysis results and on scientific decision-making. When existing data cannot meet the requirements of the analysis, the Delphi method can be carried out with suitable software to ensure data accuracy. Infopoll is a powerful, easy-to-use application for designing consultation questionnaires; it provides multiple question types and answer formats, as well as detailed statistical charts for analyzing results. This paper introduces how to obtain data for budget impact analysis with the Delphi method using Infopoll software, and analyzes the main results in detail.
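As an illustration of how expert estimates collected in a Delphi round might be summarized before they feed a budget impact analysis, the following is a minimal Python sketch. It is not the Infopoll workflow described in the paper: the function name `summarize_round`, the consensus rule (interquartile range at most 25% of the median), and the example estimates are all assumptions made for illustration.

```python
from statistics import median, quantiles


def summarize_round(estimates, max_relative_iqr=0.25):
    """Summarize one Delphi round for a single questionnaire item.

    The consensus rule (IQR / |median| <= max_relative_iqr) is a
    hypothetical criterion, not one taken from the paper.
    """
    if len(estimates) < 2:
        raise ValueError("need at least two expert estimates")
    q1, _, q3 = quantiles(estimates, n=4)  # quartile cut points
    mid = median(estimates)
    iqr = q3 - q1
    consensus = mid != 0 and iqr / abs(mid) <= max_relative_iqr
    return {"median": mid, "iqr": iqr, "consensus": consensus}


# Hypothetical expert estimates of a market share (%), for illustration only.
round_1 = [10, 25, 15, 30, 12]
round_2 = [14, 16, 15, 17, 15]

for label, estimates in (("round 1", round_1), ("round 2", round_2)):
    print(label, summarize_round(estimates))
```

In practice, the median of a round that reaches consensus would be the value carried into the analysis, while items without consensus would be returned to the experts with feedback for another round.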
Epigenetics refers to the modifying effects of external and internal environmental factors on genes without alteration of the DNA sequence, leading to changes in gene expression or function and thereby affecting various phenotypes or disease outcomes. In recent years, epigenetics has attracted increasing attention. Among epigenetic mechanisms, DNA methylation has been shown to be closely related to human development and to the development of disease. The high-dimensional omics data generated by genome-wide methylation detection can comprehensively reflect global and local epigenetic modifications at the genome level, and their analysis has become one of the main research topics in this field. Based on genome-wide methylation chip data, this paper summarizes the quality control process for such omics data, common statistical methods and approaches for epigenomic association analysis, and visualization of the main results using SAS JMP Genomics 10 software, so as to provide a reference for similar studies.
Objective: To construct a demand model for electronic medical record (EMR) data quality across the lifecycle of machine learning (ML)-based disease risk prediction, in order to guide the implementation of EMR data quality assessment. Methods: Referring to the lifecycle of ML-based predictive models, we explored the requirements for EMR data quality. First, we summarized the key data activities involved in each task of ML-based disease risk prediction through a literature review. Second, we mapped the data activities in each task to the associated data quality requirements. Finally, we clustered those requirements into four dimensions. Results: We constructed a three-layer ring structure to represent the demand model for EMR data quality in ML-based disease risk prediction research. The inner layer shows the seven main tasks in ML-based predictive modeling: data collection, data preprocessing, feature representation, feature selection and extraction, model training, model evaluation and optimization, and model deployment. The middle layer shows the key data activities in each task, and the outer layer represents four dimensions of data quality requirements: operability, completeness, accuracy, and timeliness. Conclusion: The proposed model can guide real-world EMR data governance, improve its quality management, and promote the generation of real-world evidence.
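The three-layer structure described above can be sketched as a small data structure. This is a minimal Python illustration, not the authors' implementation: the seven tasks and four dimensions are taken from the abstract, while the class `DemandModel`, its methods, and the example activity are hypothetical. The abstract does not enumerate the middle-layer activities, so they are left for the user to supply.

```python
from dataclasses import dataclass, field

# Inner layer: the seven tasks named in the abstract.
TASKS = (
    "data collection",
    "data preprocessing",
    "feature representation",
    "feature selection and extraction",
    "model training",
    "model evaluation and optimization",
    "model deployment",
)

# Outer layer: the four data quality dimensions named in the abstract.
DIMENSIONS = ("operability", "completeness", "accuracy", "timeliness")


@dataclass
class DemandModel:
    # Middle layer: task -> {activity: required dimensions}.
    # Activities are not listed in the abstract, so the mapping starts empty.
    activities: dict = field(default_factory=lambda: {t: {} for t in TASKS})

    def add_activity(self, task, activity, dimensions):
        """Register a key data activity and the dimensions it requires."""
        if task not in self.activities:
            raise ValueError(f"unknown task: {task}")
        unknown = set(dimensions) - set(DIMENSIONS)
        if unknown:
            raise ValueError(f"unknown dimensions: {sorted(unknown)}")
        self.activities[task][activity] = tuple(dimensions)

    def tasks_requiring(self, dimension):
        """Return the tasks with at least one activity needing `dimension`."""
        return [
            t for t in TASKS
            if any(dimension in dims for dims in self.activities[t].values())
        ]


model = DemandModel()
# Hypothetical activity and mapping, for illustration only.
model.add_activity("data preprocessing", "missing value handling", ["completeness"])
print(model.tasks_requiring("completeness"))
```

A structure like this could let a data quality assessment start from a given task and look up which dimensions need to be checked, which is the direction of use the abstract suggests.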