Automatically generate captions for your photos using OpenAI's GPT 4.1 Mini model.
Important
This code was almost entirely LLM-generated. It is not intended to be used as a reference for good engineering practices. I have reviewed it for accuracy and functionality, but not with regard for maintainability. It runs well on my machine and is quite useful, but no guarantees are made about it working in any other context. It is provided as-is.
These are both real-world examples. Neither image had any metadata associated with it. Both captions are accurate.
- Install dependencies:
  pnpm install
  brew install exiftool
- Create a .env file with your OpenAI API key:
  cp .env.example .env
  # Edit .env and add your OPENAI_API_KEY
- Export your JPEG photos to the input directory
- Run the caption script:
  pnpm caption
The script will:
- Find all .jpg files in the input folder
- Resize images to ≤1248px while maintaining aspect ratio
- Generate captions using GPT 4.1 Mini
- Save the captions to the image metadata
- Move processed images to the output folder
- Create a log file in the logs folder
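The resize step can be sketched as a small dimension calculation. This is only an illustration, not the script's actual code: the helper name fitWithin is hypothetical, and treating the 1248px limit as a cap on the longest edge is an assumption.

```typescript
// Hypothetical helper. The 1248px limit comes from the list above; applying it
// to the longest edge (and never upscaling) is an assumption.
const MAX_EDGE = 1248;

interface Size {
  width: number;
  height: number;
}

function fitWithin(size: Size, maxEdge: number = MAX_EDGE): Size {
  const longest = Math.max(size.width, size.height);
  // Images already within the limit are returned unchanged.
  if (longest <= maxEdge) return { width: size.width, height: size.height };
  // Scale both edges by the same factor so the aspect ratio is preserved.
  return {
    width: Math.round((size.width * maxEdge) / longest),
    height: Math.round((size.height * maxEdge) / longest),
  };
}

console.log(fitWithin({ width: 6000, height: 4000 })); // { width: 1248, height: 832 }
```

Only the copy sent to the model would be resized this way; the original file is what ends up in the output folder.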
flowchart TD
subgraph GenerateCaption["Generate Caption"]
Img["Original image"]
Img-->Resize["Resize for LLM"]
Resize--Resized Image-->OpenAI["Generate caption with OpenAI"]
end
GenerateCaption
Input(["Input photos"]) --> GenerateCaption
Input -- Move to 'output' --> OutputImage["Original photos with captions"]
OpenAI -- Generated Caption --> SetExif["Insert caption as EXIF metadata"]
SetExif --> OutputImage
Resized images are used only for the caption request. The output folder contains your original, full-resolution images with the captions added to their metadata.
Note
The cost comes out to an average of $0.00052941 per image. For $1 you can add captions to ~1,900 photos.
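To budget for a larger library, the quoted average can be turned into a quick estimate. This is a rough sketch: the function names are hypothetical, and real costs will vary with image content and caption length.

```typescript
// Average cost per image as quoted above, in USD.
const COST_PER_IMAGE_USD = 0.00052941;

// Estimated total cost for a batch of images.
function estimateCostUsd(imageCount: number): number {
  return imageCount * COST_PER_IMAGE_USD;
}

// Number of whole images that fit in a given budget.
function imagesPerBudget(budgetUsd: number): number {
  return Math.floor(budgetUsd / COST_PER_IMAGE_USD);
}

console.log(imagesPerBudget(1)); // 1888
console.log(estimateCostUsd(10000).toFixed(2)); // 5.29
```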
- OPENAI_API_KEY: Your OpenAI API key (required)
- CONCURRENCY: Number of images to process in parallel (default: 4)
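One way these two settings could be read and validated is sketched below. The loadConfig function and CaptionConfig type are hypothetical names for illustration; the script's real handling may differ.

```typescript
// Hypothetical config shape; only the two variables documented above are read.
interface CaptionConfig {
  apiKey: string;
  concurrency: number;
}

const DEFAULT_CONCURRENCY = 4;

function loadConfig(env: Record<string, string | undefined>): CaptionConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) throw new Error("OPENAI_API_KEY is required");

  // Fall back to the documented default when CONCURRENCY is unset or empty.
  const raw = env["CONCURRENCY"];
  const concurrency =
    raw === undefined || raw === "" ? DEFAULT_CONCURRENCY : Number(raw);
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error(`CONCURRENCY must be a positive integer, got "${raw}"`);
  }
  return { apiKey, concurrency };
}

console.log(loadConfig({ OPENAI_API_KEY: "sk-example" }).concurrency); // 4
```

In a Node.js script this would typically be called as loadConfig(process.env) after the .env file has been loaded.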
Tip
If you have a higher-tier account and hundreds or thousands of images, set the CONCURRENCY environment variable to a higher value. With a Tier 5 API account you should be able to use a concurrency value of at least 50 without issue.
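To make the effect of CONCURRENCY concrete, here is a minimal sketch of a bounded worker pool. It is an illustration of the general technique, not the script's implementation; mapWithConcurrency and fakeCaption are hypothetical names.

```typescript
// Runs `worker` over `items` with at most `concurrency` calls in flight.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const runnerCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}

// Stand-in for the real caption request.
async function fakeCaption(file: string): Promise<string> {
  return `caption for ${file}`;
}

mapWithConcurrency(["a.jpg", "b.jpg", "c.jpg"], 2, fakeCaption).then((captions) =>
  console.log(captions)
);
```

With this shape, a higher concurrency value simply starts more runners. If any worker rejects, the returned promise rejects too, so a real script would likely catch errors per image and record them in the log.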
- Node.js 22+
- pnpm
- exiftool (for reading/writing image metadata)
- Original images are preserved with metadata intact
- Only JPEG files are processed
- Existing captions or keywords in the image metadata are used to inform the caption generation
- Images are processed concurrently (default 4 workers)
- Captions are written to the Caption-Abstract (IPTC), Description (XMP), and ImageDescription (EXIF) metadata fields
- Processing logs are saved with timestamps and statistics
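As an illustration of the metadata step, the three caption fields could be written with a single exiftool call. The sketch below only builds the argument list. The function name buildExiftoolArgs is hypothetical, and the group prefixes (IPTC, XMP-dc, EXIF) are an assumption about how the tags are addressed.

```typescript
// Hypothetical helper: builds exiftool arguments that set the same caption on
// all three fields. The group prefixes are an assumption, not taken from the script.
function buildExiftoolArgs(filePath: string, caption: string): string[] {
  return [
    `-IPTC:Caption-Abstract=${caption}`,
    `-XMP-dc:Description=${caption}`,
    `-EXIF:ImageDescription=${caption}`,
    filePath,
  ];
}

console.log(buildExiftoolArgs("output/photo.jpg", "A dog running on a beach"));
```

Passing the result as an argument array, for example to execFile("exiftool", args) from node:child_process, avoids shell quoting problems with captions that contain spaces or quotes.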
Copyright © 2025 Corey Ward. Available under the MIT License.