Enhancing Collaborative Medical Outcomes through Private Synthetic Hypercube Augmentation: PriSHA

Shinpei Nakamura Sakai, Dennis Shung, Jasjeet S Sekhon

Abstract: Effective collaboration across medical institutions is difficult, primarily because patient privacy must be maintained. Machine learning models in healthcare need access to extensive, high-quality data to generalize well and remain robust. Yet medical institutions are typically restricted to data within their own networks, which limits the scope and diversity of the information available to them. This limitation becomes particularly acute when a model encounters patient cases with rare or unique characteristics, which can lead to distribution shifts in the data. To address these challenges, we introduce Private Synthetic Hypercube Augmentation (PriSHA), a framework designed to enhance existing clinical foundation models. We leverage generative models to produce synthetic data from diverse sources and use it to augment these models while adhering to strict privacy standards. This approach aims to broaden the dataset's scope and improve model performance without compromising patient confidentiality. To the best of our knowledge, ours is the first framework to address distribution shifts through privacy-preserving synthetic tabular data augmentation.