An empirical evaluation of deep learning for ICD-9 code assignment using MIMIC-III clinical notes


Background and Objective: Code assignment is of paramount importance in many levels in modern hospitals, from ensuring accurate billing process to creating a valid record of patient care history. However, the coding process is tedious and subjective, and it requires medical coders with extensive training. This study aims to evaluate the performance of deep-learning-based systems to automatically map clinical notes to ICD-9 medical codes. Methods: The evaluations of this research are focused on end-to-end learning methods without manually defined rules. Traditional machine learning algorithms, as well as state-of-the-art deep learning methods such as Recurrent Neural Networks and Convolution Neural Networks, were applied to the Medical Information Mart for Intensive Care (MIMIC-III) dataset. An extensive number of experiments was applied to different settings of the tested algorithm. Results: Findings showed that the deep learning-based methods outperformed other conventional machine learning methods. From our assessment, the best models could predict the top 10 ICD-9 codes with 0.6957 F1 and 0.8967 accuracy and could estimate the top 10 ICD-9 categories with 0.7233 F1 and 0.8588 accuracy. Our implementation also outperformed existing work under certain evaluation metrics. Conclusion: A set of standard metrics was utilized in assessing the performance of ICD-9 code assignment on MIMIC-III dataset. All the developed evaluation tools and resources are available online, which can be used as a baseline for further research.

Comput. Methods Programs Biomed.