This project implements an emotion recognition system using a Vision Transformer (ViT) model. It supports:
- Emotion detection from image files
- Real-time facial emotion detection using a webcam
The model is trained to classify seven human emotions and can run on either GPU or CPU.
The model classifies faces into the following seven categories:
- Angry
- Disgust
- Fear
- Happy
- Sad
- Surprise
- Neutral
Ensure you have Python 3.8+ installed.
Install the dependencies directly:
pip install torch torchvision timm numpy opencv-python pillow gdown
Or install them from the requirements file:
pip install -r requirements.txt
emotion-recognition/
├── main.py # Main application (image & webcam inference)
├── download_model.py # Downloads pretrained model from Google Drive
├── emotionvit_batch_8.pth # Model file (downloaded after running script)
├── requirements.txt # All dependencies listed here
├── LICENSE # License information (Academic use only)
├── README.md # Project documentation
└── test_images/ # Folder containing sample images for quick testing
git clone https://github.com/yourusername/emotion-recognition.git
cd emotion-recognition
pip install -r requirements.txt
The pretrained model (~900MB) is hosted on Google Drive.
Run this command to download it automatically:
python download_model.py
This will download `emotionvit_batch_8.pth` into your project directory.
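For readers curious what such a download step might look like, here is a minimal sketch using `gdown`. It is not the contents of `download_model.py`: the `DRIVE_FILE_ID` value is a placeholder (the real file ID is not given in this README), and the helper names `drive_url` and `download_model` are hypothetical.

```python
from pathlib import Path

MODEL_FILENAME = "emotionvit_batch_8.pth"
# Placeholder: the real Google Drive file ID is not stated in this README.
DRIVE_FILE_ID = "YOUR_FILE_ID"


def drive_url(file_id: str) -> str:
    """Build a direct-download URL for a Google Drive file ID."""
    return f"https://drive.google.com/uc?id={file_id}"


def download_model(dest: Path = Path(MODEL_FILENAME)) -> Path:
    """Download the checkpoint unless it is already on disk."""
    if dest.exists():
        print(f"{dest} already exists, skipping download.")
        return dest
    # Imported here so the helpers above can be used without gdown installed.
    import gdown

    gdown.download(drive_url(DRIVE_FILE_ID), str(dest), quiet=False)
    return dest
```

Skipping the download when the file already exists avoids fetching roughly 900MB a second time.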
Launch the main application:
python main.py
You will be prompted to choose a mode:
Enter 'file' for image upload or 'webcam' to use webcam:
To quickly try out the model without supplying your own images, a folder named `test_images/` is included in the project. It contains sample images representing different facial expressions.
Example:
Enter image file path: test_images/happy_face.jpg
- Choose `file` mode.
- Provide the image path when prompted.
Example:
Enter 'file' for image upload or 'webcam' to use webcam: file
Enter image file path: test_images/happy_face.jpg
The script will:
- Predict the dominant emotion
- Display all class probabilities
- Show prediction confidence
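The three outputs listed above can all be derived from the model's raw scores (logits). The sketch below uses only the standard library to show the arithmetic. The label order is an assumption taken from the list in this README; the index order actually used by `main.py` may differ, and `summarize` is a hypothetical helper name.

```python
import math

# Assumed label order (copied from the README list, not confirmed from main.py).
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits):
    """Return (dominant emotion, its confidence, all class probabilities)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs[best], dict(zip(EMOTIONS, probs))


label, confidence, all_probs = summarize([0.1, -1.2, 0.3, 4.0, 0.5, 1.1, 0.9])
print(f"Predicted: {label} ({confidence:.1%})")
for name, p in all_probs.items():
    print(f"  {name:<9}{p:.3f}")
```

The confidence shown for a prediction is simply the probability of the top-scoring class.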
- Choose `webcam` mode.
- Your webcam will open automatically.
Example:
Enter 'file' for image upload or 'webcam' to use webcam: webcam
During webcam mode:
- Faces will be detected in real-time.
- Each detected face will display:
  - Predicted emotion
  - A confidence bar
Press `q` to quit.
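A webcam loop of this kind can be sketched with OpenCV as follows. This is an illustration, not the code in `main.py`: `run_webcam`, `bar_width`, and the `predict` callback (which takes a cropped BGR face and returns a label and a confidence between 0 and 1) are hypothetical names, and the drawing layout is an assumption.

```python
def bar_width(confidence: float, max_width: int = 100) -> int:
    """Pixel width of a confidence bar, with confidence clamped to [0, 1]."""
    clamped = min(max(confidence, 0.0), 1.0)
    return int(round(clamped * max_width))


def run_webcam(predict, camera_index: int = 0) -> None:
    """Detect faces per frame, draw label and confidence bar, quit on 'q'."""
    import cv2  # imported here so bar_width works without OpenCV installed

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise RuntimeError("Could not open the webcam.")
    green = (0, 255, 0)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            for box in faces:
                x, y, w, h = (int(v) for v in box)
                label, conf = predict(frame[y:y + h, x:x + w])
                cv2.rectangle(frame, (x, y), (x + w, y + h), green, 2)
                cv2.putText(frame, f"{label} {conf:.0%}", (x, y - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.7, green, 2)
                # Filled bar under the face box, scaled to the box width.
                cv2.rectangle(frame, (x, y + h + 5),
                              (x + bar_width(conf, w), y + h + 15), green, -1)
            cv2.imshow("Emotion recognition", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Releasing the capture in a `finally` block frees the camera even if the prediction callback raises.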
- Uses `timm` to load a pretrained Vision Transformer (ViT).
- Fine-tuned for 7-class facial emotion classification.
- Uses OpenCV Haar cascades for real-time face detection.
- Detected face regions are resized and normalized before being passed into the model.
- Softmax is applied to generate emotion probabilities.
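The steps above could be wired together roughly as shown below. Several details are assumptions, not facts from this project: the architecture name `vit_base_patch16_224`, the 224x224 input size, the mean/std values, and that the `.pth` file holds a plain `state_dict`. `load_model` and `predict_face` are hypothetical helper names.

```python
IMAGE_SIZE = 224            # assumed input resolution
MEAN = (0.5, 0.5, 0.5)      # assumed normalization mean
STD = (0.5, 0.5, 0.5)       # assumed normalization std


def load_model(weights_path, model_name="vit_base_patch16_224", num_classes=7):
    """Create a ViT with a 7-class head and load fine-tuned weights."""
    import timm
    import torch

    # Use the GPU when available, otherwise fall back to the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = timm.create_model(model_name, pretrained=False, num_classes=num_classes)
    # Assumes the checkpoint is a state_dict rather than a pickled model object.
    state = torch.load(weights_path, map_location=device)
    model.load_state_dict(state)
    model.to(device).eval()
    return model, device


def predict_face(model, device, face_image):
    """Return class probabilities for one face given as an RGB PIL image."""
    import torch
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),
        transforms.ToTensor(),
        transforms.Normalize(MEAN, STD),
    ])
    batch = preprocess(face_image).unsqueeze(0).to(device)  # add batch dimension
    with torch.no_grad():
        probs = torch.softmax(model(batch), dim=1)[0]
    return probs.tolist()
```

Passing `map_location=device` lets a checkpoint saved on a GPU machine load on a CPU-only one.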
If you see this error:
Model file not found at 'emotionvit_batch_8.pth'.
Please run `python download_model.py`
the model has not been downloaded yet. Run:
python download_model.py
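A startup check that produces a message like the one above might look like this. It is a sketch only; `require_model` is a hypothetical helper, and the real check in `main.py` may be written differently.

```python
from pathlib import Path


def require_model(path="emotionvit_batch_8.pth") -> Path:
    """Return the model path, or fail early with a hint on how to fetch it."""
    model_path = Path(path)
    if not model_path.is_file():
        raise FileNotFoundError(
            f"Model file not found at '{model_path}'.\n"
            "Please run `python download_model.py`"
        )
    return model_path
```

Checking before any model code runs turns a later, more obscure loading failure into an immediate and actionable message.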
If the webcam does not open:
- Ensure no other application is using the webcam.
- Restart your computer if needed.
This project is licensed for research and academic use only.
Commercial use, redistribution, or modification is not permitted without prior written consent.
See the LICENSE file for more information.
Questions, issues, or suggestions? Open an issue or fork the repo and submit a pull request.