View Submission - COMPSTAT2022
A0645
Title: Data inspection via challenging decision boundaries' rigidity
Authors: Anthea Merida - Ecole Normale Superieure Paris Saclay (France) [presenting]
Argyris Kalogeratos - Centre Borelli Ecole Normale Superieure Paris Saclay (France)
Mathilde Mougeot - Universite Paris Diderot (France)
Abstract: How smooth must decision boundaries be in order to fit a given dataset well? Answering this question can be useful when analyzing a dataset: it can provide insight into the dataset itself, and it can narrow the scope of the subsequent model selection procedure for the task at hand. To answer it, we propose quantifying how much given 'rigid' decision boundaries (produced by an algorithm that naturally finds such solutions) should be relaxed to achieve a performance improvement. The procedure starts with the rigid decision boundaries of a seed Decision Tree (DT), which is used to initialize a Neural DT (NDT). The latter is a Neural Network that is built from a DT and whose activation functions' smoothness is controlled by a hyperparameter. The boundaries are challenged by relaxing them progressively, through smoothing the NDT's activation functions and further training. During the procedure, the NDT's performance and its decision agreement with the seed DT are measured. These two measures, along with the value of the smoothness parameter, are shown to help the user determine how expressive their model should be, before exploring it further via model selection. The validity of this approach is demonstrated with experiments on simulated and benchmark datasets.
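
As a rough illustration of the agreement measure, the sketch below relaxes the splits of a small hand-built tree and counts how often the relaxed tree still predicts the same class as the rigid one. It is not the authors' NDT: the tree, the grid of points, the sigmoid gate, and the `sharpness` parameter (high values mean rigid boundaries, low values mean smooth ones) are all assumptions made for illustration, and the further training step described in the abstract is omitted.

```python
import math

# Hypothetical seed tree (not from the abstract). Internal nodes send a point
# left when x[feature] <= threshold; leaves hold class probabilities [p0, p1].
SEED_TREE = {
    "feature": 0, "threshold": 0.5,
    "left": {"probs": [1.0, 0.0]},
    "right": {
        "feature": 1, "threshold": 0.5,
        "left": {"probs": [0.0, 1.0]},
        "right": {"probs": [1.0, 0.0]},
    },
}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hard_predict(node, x):
    """Class predicted by the rigid tree: follow exactly one branch per split."""
    while "probs" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["probs"].index(max(node["probs"]))


def soft_probs(node, x, sharpness):
    """Class probabilities of the relaxed tree: each split blends both
    subtrees, weighted by a sigmoid gate (an assumed form of smoothing)."""
    if "probs" in node:
        return node["probs"]
    gate = sigmoid(sharpness * (x[node["feature"]] - node["threshold"]))
    left = soft_probs(node["left"], x, sharpness)
    right = soft_probs(node["right"], x, sharpness)
    return [(1.0 - gate) * l + gate * r for l, r in zip(left, right)]


def agreement(tree, points, sharpness):
    """Fraction of points on which the relaxed and rigid trees agree."""
    same = 0
    for x in points:
        probs = soft_probs(tree, x, sharpness)
        same += probs.index(max(probs)) == hard_predict(tree, x)
    return same / len(points)


# A 10 x 10 grid on the unit square; no grid point lies on a threshold.
GRID = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]

for sharpness in (1000.0, 10.0, 1.0):
    print(sharpness, agreement(SEED_TREE, GRID, sharpness))
```

On this grid, a sharpness of 1000 reproduces the rigid tree (agreement 1.0), while a sharpness of 1 smooths the gates so much that the relaxed tree always predicts class 0 and agreement falls to 0.75. In the procedure the abstract describes, the relaxed model is also trained further at each step, so its agreement would be tracked together with its performance rather than on its own.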