Improving Accuracy and Efficiency of Speaker Identification Using K-means and MFCC Algorithms in Noisy Environments
DOI:
https://doi.org/10.63318/
Keywords:
K-means Algorithm, Speaker identification, Artificial Intelligence, Signal-to-noise ratio (SNR), Mel-frequency cepstral coefficients (MFCC), Mean Squared Error (MSE)
Abstract
Speaker identification is a central challenge in audio processing, with important applications in security and authentication systems. Current efforts focus on developing fast, efficient AI-based techniques that identify speakers from voice features such as pitch and frequency. A speaker recognition system consists of two main stages: feature extraction and feature matching. This research presents a model that aims to improve speaker recognition accuracy using MFCC features and the K-means algorithm. The results show that the K-means algorithm reduced the error rate from 20% to 0.85%, while the MFCC features achieved accuracy between 80% and 99.15%. Recognition time also decreased from 0.4092 seconds to 0.0438 seconds, improving the system's efficiency. The system's performance in noisy environments was evaluated using the signal-to-noise ratio (SNR), and the mean squared error (MSE) metric was used to assess the reliability of, and confidence in, the recognition results. These findings indicate the effectiveness of the proposed approach and the system's potential for use in voice-controlled systems and personal assistants.
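The two stages and the two metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each speaker is modelled by a K-means codebook, that an utterance is assigned to the speaker whose codebook gives the lowest MSE, and that random vectors can stand in for MFCC frames (no audio is processed here). All function names, the codebook size, and the 10 dB noise level are hypothetical choices made for illustration.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled so the result has the requested SNR (dB)."""
    noise = rng.standard_normal(signal.shape)
    signal_power = np.mean(signal ** 2)
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    noise *= np.sqrt(target_noise_power / np.mean(noise ** 2))
    return signal + noise

def measure_snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    noise = noisy - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

def train_codebook(features, k, rng, n_iter=20):
    """Plain Lloyd's K-means: returns k centroids for one speaker's frames."""
    centroids = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def codebook_mse(features, codebook):
    """Mean squared distance from each frame to its nearest centroid."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).mean()

def identify(features, codebooks):
    """Return the speaker whose codebook yields the lowest MSE."""
    scores = {name: codebook_mse(features, cb) for name, cb in codebooks.items()}
    return min(scores, key=scores.get)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 13-dimensional frames standing in for MFCC features (assumption).
    train_a = rng.standard_normal((200, 13)) - 3.0
    train_b = rng.standard_normal((200, 13)) + 3.0
    codebooks = {
        "speaker_a": train_codebook(train_a, k=4, rng=rng),
        "speaker_b": train_codebook(train_b, k=4, rng=rng),
    }
    # Unseen frames from speaker A, corrupted at 10 dB SNR. In a real system
    # the noise would be added to the waveform before feature extraction.
    clean = rng.standard_normal((50, 13)) - 3.0
    noisy = add_noise_at_snr(clean, snr_db=10.0, rng=rng)
    print("measured SNR (dB):", measure_snr_db(clean, noisy))
    print("identified:", identify(noisy, codebooks))
```

In practice the synthetic frames would be replaced by MFCC vectors computed from recorded speech, and the codebook size and noise levels would be tuned on held-out data.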
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.