ICU readmission is associated with longer hospitalization, mortality, and other adverse outcomes. Early recognition of patients at risk of ICU readmission can help prevent deterioration and lower treatment costs. With the abundance of Electronic Health Records (EHR), it has become popular to build clinical decision tools by applying machine learning techniques to large-scale healthcare data. To this end, we designed data-driven predictive models to estimate the risk of Intensive Care Unit (ICU) readmission. The discharge summary of each hospital admission was represented using natural language processing algorithms. The Unified Medical Language System (UMLS) was further used to standardize inconsistent terminology across discharge summaries. Five machine learning classifiers, including naïve Bayes, support vector machine, logistic regression, and gradient boosting decision tree, were combined with two feature representations, Bag-of-Words and Bag-of-CUIs, to construct predictive configurations. The best configuration yielded a competitive AUC of 0.748. High-contribution words and medical terms were further investigated to ensure that they were clinically meaningful. A comparative study of the two feature representations is also discussed. Our work suggests that natural language processing can automatically extract meaningful information from discharge summaries and warn clinicians of unplanned 30-day readmission upon discharge.
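The contrast between the two feature representations, and the AUC metric used to compare configurations, can be illustrated with a minimal sketch. It is not the authors' pipeline: the term-to-concept table `TOY_CONCEPTS`, its placeholder concept IDs, and the function names are hypothetical stand-ins for a real UMLS lookup, and the greedy phrase matcher is only an assumption about how terms might be mapped.

```python
from collections import Counter

# Hypothetical term-to-concept table standing in for a UMLS lookup.
# The concept IDs are placeholders, not real UMLS CUIs.
TOY_CONCEPTS = {
    "htn": "CUI_HYPERTENSION",
    "hypertension": "CUI_HYPERTENSION",
    "chf": "CUI_HEART_FAILURE",
    "heart failure": "CUI_HEART_FAILURE",
}


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())


def bag_of_cuis(text, concepts=TOY_CONCEPTS):
    """Count concept IDs, preferring a two-token phrase match over a single token."""
    tokens = text.lower().split()
    counts = Counter()
    i = 0
    while i < len(tokens):
        bigram = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and bigram in concepts:
            counts[concepts[bigram]] += 1
            i += 2
        elif tokens[i] in concepts:
            counts[concepts[tokens[i]]] += 1
            i += 1
        else:
            i += 1  # token has no concept in the toy table; skip it
    return counts


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


note_a = "history of htn and chf"
note_b = "history of hypertension and heart failure"
```

In this toy setting, `note_a` and `note_b` produce different Bag-of-Words counts but identical Bag-of-CUIs counts, which is the kind of standardization the abstract attributes to UMLS. `roc_auc` scores a classifier by how often a readmitted case receives a higher risk score than a non-readmitted one.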
Published in | Biomedical Statistics and Informatics (Volume 4, Issue 3)
DOI | 10.11648/j.bsi.20190403.11
Page(s) | 22-26
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright | Copyright © The Author(s), 2019. Published by Science Publishing Group
Keywords | ICU Readmission, Machine Learning, Natural Language Processing, Unified Medical Language System (UMLS)
APA Style
Zhiheng Li, Xinyue Xing, Bingzhang Lu, Ying Zhao, Zhixiang Li. (2019). Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning. Biomedical Statistics and Informatics, 4(3), 22-26. https://doi.org/10.11648/j.bsi.20190403.11
ACS Style
Zhiheng Li; Xinyue Xing; Bingzhang Lu; Ying Zhao; Zhixiang Li. Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning. Biomed. Stat. Inform. 2019, 4(3), 22-26. doi: 10.11648/j.bsi.20190403.11
AMA Style
Zhiheng Li, Xinyue Xing, Bingzhang Lu, Ying Zhao, Zhixiang Li. Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning. Biomed Stat Inform. 2019;4(3):22-26. doi: 10.11648/j.bsi.20190403.11
BibTeX
@article{10.11648/j.bsi.20190403.11,
  author = {Zhiheng Li and Xinyue Xing and Bingzhang Lu and Ying Zhao and Zhixiang Li},
  title = {Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning},
  journal = {Biomedical Statistics and Informatics},
  volume = {4},
  number = {3},
  pages = {22-26},
  doi = {10.11648/j.bsi.20190403.11},
  url = {https://doi.org/10.11648/j.bsi.20190403.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.bsi.20190403.11},
  year = {2019}
}
RIS
TY  - JOUR
T1  - Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning
AU  - Zhiheng Li
AU  - Xinyue Xing
AU  - Bingzhang Lu
AU  - Ying Zhao
AU  - Zhixiang Li
Y1  - 2019/11/22
PY  - 2019
N1  - https://doi.org/10.11648/j.bsi.20190403.11
DO  - 10.11648/j.bsi.20190403.11
T2  - Biomedical Statistics and Informatics
JF  - Biomedical Statistics and Informatics
JO  - Biomedical Statistics and Informatics
SP  - 22
EP  - 26
PB  - Science Publishing Group
SN  - 2578-8728
UR  - https://doi.org/10.11648/j.bsi.20190403.11
VL  - 4
IS  - 3
ER  -