A1136
Title: Leveraging GANs with Influence Function for Class Imbalance Mitigation in Regression tasks
Authors: Akash Sharma - University College London (United Kingdom) [presenting]
Abstract: Class imbalance is a critical challenge in machine learning, particularly in regression and classification tasks, where underrepresented classes can lead to biased models and suboptimal performance. This research proposes a novel data augmentation strategy that uses generative adversarial networks (GANs) and influence-function-based scoring to balance class distributions and improve model accuracy. A GAN is first trained on the real training data to generate synthetic samples, which serve as candidate augmentations. An influence-function scoring mechanism then identifies the subset of synthetic samples that, when combined with the real data, maximizes predictive performance. The method quantifies the marginal impact of each synthetic sample on test error using first-order approximations and iteratively refines the augmented set through a greedy add/drop procedure. Experimental results demonstrate that the approach significantly reduces test error and improves model generalization compared to random augmentation. The proposed methodology provides an automated and efficient framework for leveraging synthetic data in imbalanced learning scenarios, offering a scalable way to make machine learning models more robust.
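The scoring and greedy add/drop steps described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes a ridge-regression model (chosen because its Hessian is available in closed form), a held-out validation set standing in for test error, and pre-generated synthetic samples in place of GAN output. All names and parameters (`fit_ridge`, `influence_scores`, `greedy_select`, `lam`, `rounds`, `k`) are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-2):
    """Minimize (1/2n)*||X@theta - y||^2 + (lam/2)*||theta||^2.
    Returns the fitted parameters and the Hessian of the objective."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)
    theta = np.linalg.solve(H, X.T @ y / n)
    return theta, H

def influence_scores(theta, H, X_cand, y_cand, X_val, y_val):
    """First-order estimate of the change in validation loss when each
    candidate is upweighted: -grad_val^T H^{-1} grad_candidate.
    A negative score predicts that adding the sample lowers the loss."""
    g_val = X_val.T @ (X_val @ theta - y_val) / len(X_val)
    G = X_cand * (X_cand @ theta - y_cand)[:, None]  # per-sample gradients
    return -G @ np.linalg.solve(H, g_val)

def greedy_select(X_tr, y_tr, X_syn, y_syn, X_val, y_val, rounds=5, k=5):
    """Greedy add/drop over synthetic samples (assumed procedure):
    each round refits on real + chosen data, drops chosen samples with a
    positive score, and adds up to k unchosen samples with the most
    negative scores. Returns a boolean mask over the synthetic samples."""
    chosen = np.zeros(len(X_syn), dtype=bool)
    for _ in range(rounds):
        X_aug = np.vstack([X_tr, X_syn[chosen]])
        y_aug = np.concatenate([y_tr, y_syn[chosen]])
        theta, H = fit_ridge(X_aug, y_aug)
        s = influence_scores(theta, H, X_syn, y_syn, X_val, y_val)
        drop = chosen & (s > 0)
        cand = np.where(~chosen & (s < 0))[0]
        add = cand[np.argsort(s[cand])[:k]]
        if not drop.any() and len(add) == 0:
            break  # no change predicted to help; stop early
        chosen[drop] = False
        chosen[add] = True
    return chosen

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.array([1.0, -2.0, 0.5])
    X_tr = rng.normal(size=(40, 3)); y_tr = X_tr @ w + 0.1 * rng.normal(size=40)
    X_val = rng.normal(size=(40, 3)); y_val = X_val @ w
    # Stand-in for GAN output: synthetic inputs with noisy targets.
    X_syn = rng.normal(size=(30, 3)); y_syn = X_syn @ w + rng.normal(size=30)
    mask = greedy_select(X_tr, y_tr, X_syn, y_syn, X_val, y_val)
    print("selected synthetic samples:", int(mask.sum()), "of", len(mask))
```

For models without a closed-form Hessian, the inverse-Hessian-vector product would typically be approximated (for example with conjugate gradients) rather than solved exactly as it is here.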