Emotion Recognition from Facial Images Using Vision Transformer (ViT)

This project implements an emotion recognition system using a Vision Transformer (ViT) model. It supports:

- Emotion detection from image files
- Real-time facial emotion detection using a webcam

The model is trained to classify seven human emotions and can run on either GPU or CPU.

Emotion Classes

The model classifies faces into the following seven categories:

- Angry
- Disgust
- Fear
- Happy
- Sad
- Surprise
- Neutral
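
A classifier outputs a class index, so some mapping from index to emotion name is needed. The sketch below assumes the indices follow the order of the list above; this README does not state the order used by the trained checkpoint, and `EMOTIONS` and `label_for` are illustrative names, not taken from main.py.

```python
# Assumed label order: it mirrors the list in this README, but the index
# order used by the trained checkpoint is not documented here.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]


def label_for(class_index: int) -> str:
    """Return the emotion name for a predicted class index (0 to 6)."""
    if not 0 <= class_index < len(EMOTIONS):
        raise ValueError(f"class index out of range: {class_index}")
    return EMOTIONS[class_index]


print(label_for(3))  # "Happy" under the assumed ordering
```

If the training script used a different order, only the `EMOTIONS` list would need to change.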

Prerequisites

Ensure you have Python 3.8+ installed.

Option 1: Install using pip manually

pip install torch torchvision timm numpy opencv-python pillow gdown

Option 2: Install from `requirements.txt`

pip install -r requirements.txt

Project Structure

emotion-recognition/
├── main.py                 # Main application (image & webcam inference)
├── download_model.py       # Downloads pretrained model from Google Drive
├── emotionvit_batch_8.pth  # Model file (downloaded after running script)
├── requirements.txt        # All dependencies listed here
├── LICENSE                 # License information (Academic use only)
├── README.md               # Project documentation
└── test_images/            # Folder containing sample images for quick testing

Step-by-Step Setup & Execution

1. Clone the Repository

git clone https://github.com/yourusername/emotion-recognition.git
cd emotion-recognition

2. Install Dependencies

pip install -r requirements.txt

3. Download the Pretrained Model

The pretrained model (about 900 MB) is hosted on Google Drive.

Run this command to download it automatically:

python download_model.py

This will download emotionvit_batch_8.pth into your project directory.
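
Because the file is large, an interrupted download can leave a partial file behind. A minimal sketch of a pre-flight check, assuming the checkpoint sits in the project directory; `find_model` and its `min_bytes` threshold are hypothetical helpers and are not part of download_model.py or main.py.

```python
from pathlib import Path

MODEL_FILENAME = "emotionvit_batch_8.pth"  # file name given in this README


def find_model(project_dir: str = ".", min_bytes: int = 1) -> Path:
    """Return the checkpoint path, or raise if it is missing or too small."""
    path = Path(project_dir) / MODEL_FILENAME
    if not path.is_file():
        raise FileNotFoundError(
            f"Model file not found at '{path}'. "
            "Please run `python download_model.py`"
        )
    # A very small file usually means the download was cut short.
    if path.stat().st_size < min_bytes:
        raise ValueError(f"'{path}' looks incomplete; try downloading it again")
    return path
```

Calling `find_model(min_bytes=100 * 1024 * 1024)` would reject anything under 100 MiB; that threshold is an arbitrary illustration, not a documented size.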

4. Run the Program

Launch the main application:

python main.py

You will be prompted to choose a mode:

Enter 'file' for image upload or 'webcam' to use webcam:
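
For readers adapting the prompt, here is one way such a mode selection could be validated. This is a sketch only: `parse_mode` and `ask_mode` are hypothetical names, and whether main.py trims whitespace, ignores case, or re-prompts on bad input is an assumption.

```python
VALID_MODES = ("file", "webcam")


def parse_mode(raw: str) -> str:
    """Normalize the answer and reject anything other than the two modes."""
    mode = raw.strip().strip("'\"").lower()
    if mode not in VALID_MODES:
        raise ValueError(f"expected 'file' or 'webcam', got {raw!r}")
    return mode


def ask_mode(read=input) -> str:
    """Keep prompting until a valid mode is entered."""
    prompt = "Enter 'file' for image upload or 'webcam' to use webcam: "
    while True:
        try:
            return parse_mode(read(prompt))
        except ValueError as err:
            print(err)
```

Passing the reader function as a parameter keeps the loop easy to exercise without a terminal.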

Using Test Images

To try the model without supplying your own images, use the sample images in the test_images/ folder included with the project. They show a range of facial expressions.

Example:

Enter image file path: test_images/happy_face.jpg

Usage Modes

Option A: Image File Inference

- Choose file mode.
- Provide the image path when prompted.

Example:

Enter 'file' for image upload or 'webcam' to use webcam: file
Enter image file path: test_images/happy_face.jpg

The script will:

- Predict the dominant emotion
- Display all class probabilities
- Show prediction confidence
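
The three outputs listed above can be produced from a single mapping of emotion names to probabilities. The sketch below shows one possible text layout; `format_report` is a hypothetical helper, the exact format printed by main.py is not documented here, and the numbers in the usage line are made up.

```python
from typing import Dict


def format_report(probabilities: Dict[str, float]) -> str:
    """Build a report: top emotion, its confidence, then every class."""
    if not probabilities:
        raise ValueError("no probabilities given")
    top = max(probabilities, key=probabilities.get)
    lines = [f"Predicted emotion: {top} ({probabilities[top]:.1%} confidence)"]
    # List all classes from most to least likely.
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    for label, p in ranked:
        lines.append(f"  {label:<9}{p:6.1%}")
    return "\n".join(lines)


print(format_report({"Happy": 0.9, "Sad": 0.1}))  # illustrative values only
```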

Option B: Webcam Inference

- Choose webcam mode.
- Your webcam will open automatically.

Example:

Enter 'file' for image upload or 'webcam' to use webcam: webcam

During webcam mode:

- Faces are detected in real time.
- Each detected face is annotated with:
  - Predicted emotion
  - A confidence bar

Press q to quit.
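
A webcam loop of this kind might look roughly like the following. It is a sketch under assumptions, not the code in main.py: `predict` is a hypothetical callable that takes a BGR face crop and returns `(label, confidence)`, the detector parameters are common defaults, and the drawing layout is invented for illustration.

```python
def bar_width(confidence: float, max_width: int = 100) -> int:
    """Pixel width of a confidence bar for a probability in [0, 1]."""
    clamped = max(0.0, min(1.0, float(confidence)))
    return int(round(clamped * int(max_width)))


def run_webcam(predict, camera_index: int = 0) -> None:
    """Detect faces frame by frame and annotate them until 'q' is pressed."""
    import cv2  # imported here so bar_width can be used without OpenCV

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    capture = cv2.VideoCapture(camera_index)
    if not capture.isOpened():
        raise RuntimeError("could not open webcam")
    green = (0, 255, 0)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(
                gray, scaleFactor=1.1, minNeighbors=5
            )
            for (x, y, w, h) in faces:
                label, confidence = predict(frame[y:y + h, x:x + w])
                cv2.rectangle(frame, (x, y), (x + w, y + h), green, 2)
                cv2.putText(frame, label, (x, y - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.8, green, 2)
                # Filled bar under the face box, scaled to the box width.
                cv2.rectangle(frame, (x, y + h + 5),
                              (x + bar_width(confidence, w), y + h + 15),
                              green, -1)
            cv2.imshow("Emotion recognition", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Releasing the capture in a `finally` block frees the camera even if prediction raises, which helps avoid the "webcam in use" situation on the next run.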

How It Works

- The model is a pretrained Vision Transformer (ViT) loaded with timm.
- It is fine-tuned for 7-class facial emotion classification.
- OpenCV Haar cascades detect faces in real time.
- Each detected face region is resized and normalized before being passed to the model.
- Softmax is applied to the model output to produce emotion probabilities.
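
The final softmax step turns the model's seven raw scores into probabilities that sum to 1. A dependency-free illustration follows; the project itself most likely relies on the PyTorch implementation, and the logits in the usage line are made up.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([0.2, -1.0, 0.1, 3.5, 0.0, 0.7, 1.1])  # made-up logits
print(max(range(len(probs)), key=probs.__getitem__))  # index of the top class
```

The index with the highest probability is the predicted class, and that probability is the reported confidence.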

Troubleshooting

Model file not found

If you get:

Model file not found at 'emotionvit_batch_8.pth'.
Please run `python download_model.py`

Run:

python download_model.py

Webcam not opening

- Ensure no other application is using the webcam.
- Restart your computer if needed.

License

This project is licensed for research and academic use only.
Commercial use, redistribution, or modification is not permitted without prior written consent.
See the LICENSE file for more information.

Contact

Questions, issues, or suggestions? Open an issue or fork the repo and submit a pull request.
