View Submission - EcoSta2021
A0209
Title: Detecting voice spoofing attacks with residual network and max filter map with Grad-CAM activation
Authors: Il-Youp Kwak - Chung-Ang University (Korea, South) [presenting]
Abstract: The 2019 automatic speaker verification spoofing and countermeasures challenge (ASVspoof) competition aims to facilitate the design of highly accurate voice spoofing attack detection systems. However, the competition does not emphasize model complexity or latency requirements, even though such constraints are strict and integral to real-world deployment. Hence, most of the top-performing solutions from the competition use an ensemble approach, combining multiple complex deep learning models to maximize detection accuracy. This kind of approach sits uneasily with real-world deployment constraints. To design a lightweight system, we combine the notions of skip connection (from ResNet) and max filter map (from Light CNN), and evaluate its accuracy using the ASVspoof 2019 dataset. By optimizing a well-known signal processing feature called the constant Q transform (CQT), our single model achieved a spoofing attack detection equal error rate (EER) of 0.16%, outperforming the top ensemble system from the competition, which achieved an EER of 0.39%. Furthermore, we applied Grad-CAM to better explain our deep learning models on sound data.
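For readers unfamiliar with the metric behind the 0.16% and 0.39% figures, the following is a minimal sketch of how an equal error rate can be approximated from detection scores. It is not the ASVspoof evaluation code or the authors' implementation. The function name `equal_error_rate`, the toy scores, and the convention that a higher score means "more likely bona fide" are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means the utterance is more likely bona fide,
    so an utterance is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: bona fide utterances scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed utterances scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # The EER is the operating point where the two error rates meet;
        # on finite data we take the threshold where they are closest.
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical, not from the ASVspoof 2019 dataset).
bonafide = [0.9, 0.8, 0.7, 0.35]
spoof = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(bonafide, spoof)
print(f"EER = {eer:.2%}")  # prints "EER = 25.00%"
```

In this toy example one bona fide utterance and one spoofed utterance overlap, so at the threshold 0.6 both error rates are 1 in 4. A lower EER, as reported for the single model in the abstract, means the two score distributions overlap less.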