Citation: Fei LI, Yiqiang CHEN, Yang GU, et al., “Extracting Integrated Features of Electronic Medical Records Big Data for Mortality and Phenotype Prediction,” Chinese Journal of Electronics, vol. 33, no. 3, pp. 761–777, 2024. doi: 10.23919/cje.2023.00.181