Unveiling Sounds: Harnessing ANN And Mel Spectrograms For Audio Signals Classification
Abstract
This study focuses on audio classification using a combination of Artificial Neural Networks (ANN) and mel spectrogram representations. The approach extracts distinctive features from audio signals using mel-frequency cepstral coefficients (MFCCs) and converts them into spectrogram representations. These mel spectrograms are then used as input to an ANN architecture, allowing the model to independently discern and learn hierarchical features for effective audio classification. The research highlights the synergistic relationship between ANNs and mel spectrogram features, optimizing hyperparameters and leveraging transfer learning to enhance the model's performance. During the evaluation phase, rigorous testing on benchmark datasets demonstrates the efficiency of the proposed approach in achieving accurate and generalized audio classification across diverse sound categories. Moreover, the hybrid nature of this technique ensures scalability and adaptability, making it suitable for addressing the complexities inherent in various audio classification tasks. In essence, this research underscores the promising prospects of integrating ANNs with mel spectrogram representations, heralding advances in audio processing technologies and their myriad applications.
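The pipeline the abstract describes — mel-scale feature extraction followed by a feed-forward ANN classifier — can be sketched minimally as follows. This is an illustrative sketch, not the paper's implementation: the filterbank parameters, network sizes, and the synthetic sine-wave input are all assumptions, and the ANN weights are left random (untrained) purely to show the data flow.

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    """Build a triangular mel filterbank of shape (n_mels, n_fft//2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)   # HTK formula
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):                      # rising slope
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):                      # falling slope
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=256, n_mels=40):
    """Frame the signal, take the power spectrum, project onto mel filters,
    and log-compress: the spectrogram representation fed to the ANN."""
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * np.hanning(n_fft)
        frames.append(np.abs(np.fft.rfft(frame)) ** 2)
    power = np.array(frames).T                     # (n_fft//2+1, n_frames)
    mel = mel_filterbank(n_mels, n_fft, sr) @ power
    return np.log(mel + 1e-10)                     # (n_mels, n_frames)

def ann_forward(features, n_classes=10, hidden=64, seed=0):
    """One-hidden-layer ANN forward pass on flattened mel features.
    Weights are random here for illustration only; a real classifier
    would learn them via backpropagation on labeled audio."""
    rng = np.random.default_rng(seed)
    x = features.flatten()
    w1 = rng.normal(0.0, 0.01, (hidden, x.size))
    w2 = rng.normal(0.0, 0.01, (n_classes, hidden))
    h = np.maximum(w1 @ x, 0.0)                    # ReLU hidden layer
    logits = w2 @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()                             # softmax class probabilities

# Demo on a synthetic one-second 440 Hz tone (hypothetical input)
sr = 16000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440.0 * t)
mel = log_mel_spectrogram(audio, sr=sr)
probs = ann_forward(mel)
```

In practice the feature-extraction stage would typically be delegated to an audio library and the ANN trained end-to-end, but the sketch shows the two stages the abstract pairs: a fixed mel-scale front end producing a 2-D spectrogram, and a learnable network mapping it to class probabilities.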