Automated Medical Coding using BERT: Benchmarking Deep Learning in the Face of Subjective Labels

Mehmet Seflek, Wesam Elshamy, Abboud Chaballout, Ali Madani

Abstract: Documenting patients' interactions with health providers and institutions requires summarizing highly complex data. Medical coding reduces the dimensionality of this problem to a set of manually assigned codes that are used to bill, track patient health, and summarize a patient encounter. Incorrect coding, however, can lead to significant financial, legal, and health costs to clinics and patients. To address this, we build several deep learning models, including state-of-the-art BERT models adapted through transfer learning, to predict medical codes on a novel dataset of 39,000 patient encounters. We also show through several labeling experiments that model performance is robust to subjectivity in the labels, and we find that our models outperform a clinic's coding when judged against charts corrected and relabeled by an expert.