• 1. School of Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, P. R. China;
XING Suxia, Email: xingsuxia@163.com

Automatic generation of medical image reports faces several challenges, including the diversity of disease types and a lack of professionalism and fluency in the generated descriptions. To address these issues, this paper proposes a memory-driven method for multimodal medical imaging report generation (mMIRmd). First, a hierarchical vision transformer using shifted windows (Swin-Transformer) extracts multi-perspective visual features from the patient's medical images, and bidirectional encoder representations from transformers (BERT) extracts semantic features from the textual medical history. The visual and semantic features are then fused to improve the model's ability to recognize different disease types. In addition, a word vector dictionary pre-trained on medical text is used to encode the labels of the visual features, making the generated reports more professional. Finally, a memory-driven module is introduced in the decoder to handle long-distance dependencies in medical image data. The method is validated on the chest X-ray dataset collected at Indiana University (IU X-Ray) and on the medical information mart for intensive care chest X-ray dataset (MIMIC-CXR) released by the Massachusetts Institute of Technology and Massachusetts General Hospital. Experimental results indicate that the proposed method focuses better on the affected areas, improves the accuracy and fluency of report generation, and can assist radiologists in quickly completing medical image reports.
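The abstract does not state how the visual and semantic features are integrated. As a rough illustration only, the sketch below assumes the simplest option, concatenating the two L2-normalized feature vectors. The function names `l2_normalize` and `fuse_features` and the toy vectors are hypothetical stand-ins and are not taken from the paper; in practice the inputs would be embeddings produced by Swin-Transformer and BERT.

```python
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = sum(x * x for x in vec) ** 0.5
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_features(visual: List[float], semantic: List[float]) -> List[float]:
    """Hypothetical fusion step: normalize each modality, then concatenate.

    Concatenation is an assumption made for illustration; the paper may use
    a different fusion scheme.
    """
    return l2_normalize(visual) + l2_normalize(semantic)


# Toy stand-ins for an image embedding and a medical-history text embedding.
visual_features = [3.0, 4.0]
semantic_features = [0.0, 2.0, 0.0]

fused = fuse_features(visual_features, semantic_features)
print(len(fused))  # length is the sum of the two input lengths
```

Normalizing each modality before concatenation keeps one feature set from dominating the other purely because of its scale, which is one common reason to prefer it over raw concatenation.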

Citation: XING Suxia, FANG Junze, JU Zihan, GUO Zheng, WANG Yu. Research on automatic generation of multimodal medical image reports based on memory driven. Journal of Biomedical Engineering, 2024, 41(1): 60-69. doi: 10.7507/1001-5515.202304001
