Various image processing scripts.
Updated Mar 30, 2025 - Python
This project implements a generative AI pipeline for extracting and rating features from images. It combines Florence-2, a vision-language model, with a fine-tuned version of Mistral-v3, a large language model.
Florence-2 quick test
This project focuses on human and cat detection in video footage.
ecko-cli is a simple CLI tool that processes the images in a directory, generates a caption for each one, and saves the captions as text files. It can also create a JSONL file from the images in the directory you specify. Images are captioned using the Microsoft Florence-2-large model and ONNX.
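The workflow this description outlines (one caption text file per image, plus a JSONL index) can be sketched roughly as follows. This is not ecko-cli's actual code: `caption_directory`, the `captioner` callback, the default file name `captions.jsonl`, and the JSONL field names `file_name` and `text` are all assumptions for illustration, and the Florence-2 call is replaced by a pluggable function.

```python
import json
from pathlib import Path
from typing import Callable

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def caption_directory(
    image_dir: Path,
    captioner: Callable[[Path], str],
    jsonl_name: str = "captions.jsonl",
) -> int:
    """Write a .txt caption beside each image and one JSONL record per image.

    `captioner` is a hypothetical stand-in for the model call.
    Returns the number of images processed.
    """
    images = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with open(image_dir / jsonl_name, "w", encoding="utf-8") as jsonl:
        for image_path in images:
            caption = captioner(image_path).strip()
            # photo.png -> photo.txt in the same directory
            image_path.with_suffix(".txt").write_text(caption + "\n", encoding="utf-8")
            record = {"file_name": image_path.name, "text": caption}
            jsonl.write(json.dumps(record) + "\n")
    return len(images)
```

Passing the captioner as a callback keeps the file handling testable without downloading a model; a real captioner would wrap whatever inference backend is in use.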
Video Synopsis: Intelligent Video Object Summarization using Florence/OWL-ViT and SAM. It uses OWL-ViT or Florence 2 for object detection, SAM for segmentation, and a custom video synopsis algorithm to produce optimized outputs.
This application uses Microsoft's Florence-2 vision-language model to generate detailed captions for images. The model interprets visual content and describes it in natural language.
A sample of Florence-2, Microsoft's lightweight VLM, running on Colaboratory
Notebooks to segment wave contour using RunPod
Code for the Hugging Face Space Florence-2 demo
Image analysis with the Florence model
The Power of Florence-2 with OpenVINO & FiftyOne: Real-World Applications in Image Analysis
Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and more features
Counting tomatoes using the phrase grounding task from Florence-2
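Counting with phrase grounding amounts to counting the boxes returned for the prompted phrase. Below is a minimal sketch, assuming the parsed output has the shape shown on the Florence-2 model card for the `<CAPTION_TO_PHRASE_GROUNDING>` task (a dict holding parallel `bboxes` and `labels` lists). The helper `count_grounded` and the sample result are hypothetical and are not taken from the tomato-counting repository.

```python
from typing import Any

TASK = "<CAPTION_TO_PHRASE_GROUNDING>"


def count_grounded(result: dict[str, Any], phrase: str) -> int:
    """Count boxes whose label matches `phrase`, ignoring case."""
    grounded = result.get(TASK, {})
    labels = grounded.get("labels", [])
    bboxes = grounded.get("bboxes", [])
    wanted = phrase.strip().lower()
    # zip stops at the shorter list, so a label without a box is not counted
    return sum(
        1 for label, _box in zip(labels, bboxes)
        if label.strip().lower() == wanted
    )


# Hand-written sample result, not real model output.
sample = {
    TASK: {
        "bboxes": [[10, 20, 50, 60], [70, 25, 110, 65], [0, 0, 200, 150]],
        "labels": ["tomato", "tomato", "plant"],
    }
}
print(count_grounded(sample, "tomato"))  # prints 2
```

Overlapping detections of the same object would inflate the count, so a real pipeline would likely add a deduplication step such as non-maximum suppression.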
ONNX deployment of the Florence-2 multimodal vision model
Provides a Next.js/shadcn UI for use with the Florence-2 vision model.
Comprehensive Smart Assignment Assessment Tool Using Large Language Models - 24-25J-295
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses Florence2 and SAM2 to detect and segment specific objects or activities in a video based on textual descriptions.
TextSnap: Demo for Florence 2 model used in OCR tasks to extract and visualize text from images.