Stars
[ICLR'23 Spotlight & ECCV'24 & IJCV'24] MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D (ECCV 2020)
This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Open-source and strong foundation image recognition models.
Speech To Speech: an effort for an open-sourced and modular GPT-4o
A modular graph-based Retrieval-Augmented Generation (RAG) system
[AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Generator for OCR, text-detection, and font-classification datasets
Large Scale Chinese Corpus for NLP
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Official implementation of the paper 'CLIP-DINOiser: Teaching CLIP a few DINO tricks'.
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.
Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
⛹️ PyTorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-ID / vehicle re-ID baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.