SigurdST/emotion_recognition


Emotion Recognition from Speech using Mel Spectrograms and CNNs

Overview

This project develops a speech emotion recognition system using Mel spectrograms and Convolutional Neural Networks (CNNs). The dataset is the Acted Emotional Speech Dynamic Database (AESDD), whose audio files are labelled with one of five emotions: angry, disgust, fear, happy, and sad.

Objectives

  1. Transform raw audio files into numerical representations using Mel Spectrograms.
  2. Train CNNs to classify emotions from these spectrograms.
  3. Address challenges such as:
    • Variable input shapes.
    • Small dataset size.
  4. Compare performance across three setups: the original variable-shape spectrograms trained with a batch size of 1, spectrograms resized to a common shape, and an augmented dataset.
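To illustrate why input shapes vary, here is a minimal sketch that estimates the shape of a Mel spectrogram from a clip's duration. The sample rate (22050 Hz), hop length (512), and number of Mel bands (128) are librosa's defaults, used here as assumptions; the notebook may use different settings. The function names are hypothetical.

```python
def frame_count(n_samples: int, hop_length: int = 512) -> int:
    """Frames produced by a centered (padded) STFT: 1 + floor(n_samples / hop_length)."""
    return 1 + n_samples // hop_length


def spectrogram_shape(duration_s: float,
                      sample_rate: int = 22050,   # assumed, librosa default
                      n_mels: int = 128,          # assumed, librosa default
                      hop_length: int = 512) -> tuple:
    """Return (n_mels, n_frames) for a clip of the given duration."""
    n_samples = int(round(duration_s * sample_rate))
    return (n_mels, frame_count(n_samples, hop_length))


# Two clips of different length give spectrograms of different width,
# so they cannot be stacked into one batch tensor without padding or resizing.
short_clip = spectrogram_shape(2.0)   # (128, 87)
long_clip = spectrogram_shape(3.5)    # (128, 151)
```

The height (number of Mel bands) is fixed, while the width grows with clip length. That is why the comparison above covers training one sample at a time versus resizing every spectrogram to a common shape.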

Key Findings

  1. The best results were obtained using the original dataset with a batch size of 1, achieving an accuracy of 74.38%.
  2. Resized spectrograms resulted in lower accuracy, likely due to loss of crucial information during resizing.
  3. Training on artificially augmented data reached accuracy similar to the original dataset, although the augmentation itself may discard some information.
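The information loss attributed to resizing can be seen in a small, dependency-free sketch. It resizes the time axis of a spectrogram (a list of rows, one per Mel band) with linear interpolation. The interpolation method and the helper name `resize_time_axis` are assumptions for illustration; the resizing actually used in the notebook may differ.

```python
def resize_time_axis(spec, target_frames):
    """Linearly interpolate each row of `spec` to `target_frames` columns."""
    if target_frames < 1:
        raise ValueError("target_frames must be at least 1")
    resized = []
    for row in spec:
        src = len(row)
        if src == 1 or target_frames == 1:
            # Nothing to interpolate between: repeat the first value.
            resized.append([float(row[0])] * target_frames)
            continue
        new_row = []
        for j in range(target_frames):
            # Map output column j to a fractional position in the source row.
            pos = j * (src - 1) / (target_frames - 1)
            lo = int(pos)
            hi = min(lo + 1, src - 1)
            frac = pos - lo
            new_row.append(row[lo] * (1 - frac) + row[hi] * frac)
        resized.append(new_row)
    return resized


# A band with two short energy peaks, shrunk from 5 frames to 3:
band = [[0, 10, 0, 10, 0]]
shrunk = resize_time_axis(band, 3)   # [[0.0, 0.0, 0.0]]
```

In this toy case both peaks fall between the sampled positions and vanish entirely, which is one way in which shrinking a long clip to a fixed width could discard short-lived acoustic cues.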

Quick Start

  1. Clone the repository:

    git clone https://github.com/SigurdST/emotion_recognition.git
    cd emotion_recognition
  2. Open notebook.ipynb for the full code and processing steps, and REPORT.md for detailed explanations, results, and insights.
