Toward Deep Learning ECAPA-TDNN Model Enhancement for Speaker Recognition

Freha Mezzoudj, Chahreddine Medjahed, Ahmed Slimani

Abstract

The goal of artificial intelligence (AI) is to build intelligent machines or models that can learn, reason, solve problems, comprehend language, and recognize patterns. A biometric system uses a person's physiological or behavioural features to recognize them. Applications such as smartphones, border control, banking, and workplace access make extensive use of these systems for identity management, security, and authentication. A uni-biometric person recognition system, which relies on a single trait, is an essential component of most biometric systems. Focusing on the voice, we propose automatic person recognition systems that combine Deep Learning (DL) and Machine Learning (ML) approaches, with the aim of keeping them simple and efficient. We pursue two strategies. First, we adapted ECAPA-TDNN, a pretrained acoustic deep neural network, to speaker recognition using transfer learning. Second, we used the transfer-learned ECAPA-TDNN as a feature extractor for speech signals and combined it with a range of ML algorithms as classifiers. We trained and tested the systems on spoken acoustic signals gathered in real-world environments. The classification results indicate that the proposed methods achieve promising accuracy: the overall accuracy reached 100% at the frame level for a couple of the hybrid DL-ML models. The architecture established in this study, a DL feature extractor followed by an ML classifier, provides a foundation for promising behavioural biometric systems.
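The second strategy, a fixed DL feature extractor feeding a classical ML classifier, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `fake_embedding` is a hypothetical stand-in that produces synthetic vectors in place of real ECAPA-TDNN embeddings, and the nearest-centroid cosine classifier is a substitute chosen for brevity, since the abstract does not name the ML classifiers that were actually used.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fit_centroids(embeddings, labels):
    """Average the embeddings of each speaker into one centroid per label."""
    sums, counts = {}, {}
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + x for s, x in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def predict(centroids, vec):
    """Return the speaker whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda lab: cosine(centroids[lab], vec))


def fake_embedding(speaker_id, rng, dim=16, noise=0.1):
    """Hypothetical stand-in for the DL feature extractor (assumption):
    a speaker-specific direction plus Gaussian noise, not a real embedding."""
    base = [1.0 if i == speaker_id else 0.0 for i in range(dim)]
    return [b + rng.gauss(0.0, noise) for b in base]


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
speakers = [0, 1, 2]
train = [(fake_embedding(s, rng), s) for s in speakers for _ in range(5)]
test = [(fake_embedding(s, rng), s) for s in speakers for _ in range(5)]

# "Training" the classical classifier only touches the embeddings;
# the feature extractor itself stays frozen.
centroids = fit_centroids([v for v, _ in train], [s for _, s in train])
accuracy = sum(predict(centroids, v) == s for v, s in test) / len(test)
print(f"accuracy on synthetic embeddings: {accuracy:.2f}")
```

In a real pipeline the synthetic vectors would be replaced by embeddings computed from recorded utterances, and the centroid rule could be swapped for any classifier that accepts fixed-length feature vectors. The accuracy printed here reflects only the synthetic data and says nothing about the results reported in the study.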
