Distracted Multi-task Learning: Addressing Negative Transfer with Fine-tuning on EHR Time-series Data

Matthew McDermott, Bret Nestor, Wancong Zhang, Peter Szolovits, Anna Goldenberg, Marzyeh Ghassemi

Abstract: Representation learning is a commonly touted goal in machine learning for healthcare, and for good reason. If we could learn a numerical encoding of clinical data that reflects underlying physiological similarity, this would have significant benefits in both research and application. However, many works pursuing representation learning systems evaluate only according to traditional, single-task performance metrics, and fail to assess whether the representations they produce actually contain generalizable signals capturing this underlying notion of similarity. In this work, we design an evaluation procedure specifically for representation learning systems, and use it to analyze the value of large-scale multi-task representation learners. We find mixed results: multi-task representations are commonly helpful across a battery of prediction tasks and models, yet ensemble performance is often improved by removing tasks from the training ensemble, and the learned representations demonstrate no ability to cluster.