Official code for "MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction"

Python 72 2 Updated Mar 28, 2025

[ICCV 2025] Official PyTorch code for the paper "DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution"

Python 71 2 Updated Jul 11, 2025
Python 58 6 Updated Jul 10, 2025

[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Python 2,846 181 Updated May 15, 2025

[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Python 1,181 93 Updated Jul 4, 2025

Open-source simulator for autonomous driving research.

C++ 12,730 4,115 Updated Jul 16, 2025

Official code for the paper: Depth Anything At Any Condition

Python 246 17 Updated Jul 10, 2025

[ICCV 2023] MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond.

Python 282 12 Updated Jun 5, 2024

[CVPR 2025 Oral & Award Candidate] Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Python 615 31 Updated Jun 28, 2025

[ICCV 2025] DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation

Python 72 2 Updated Jul 13, 2025

The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs

Python 87 Updated Jul 1, 2025

Unified Vision-Language-Action Model

Python 136 4 Updated Jul 3, 2025

Dingo: A Comprehensive AI Data Quality Evaluation Tool

JavaScript 290 31 Updated Jul 14, 2025

Adding Scene-Centric Forecasting Control to Occupancy World Model

Python 11 Updated Jun 19, 2025

SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.

Python 5,148 437 Updated May 12, 2025

[ICLR 2023 Oral] Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model

Python 1,275 96 Updated Apr 25, 2024

Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.

Python 384 37 Updated Jul 15, 2025

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 9,839 943 Updated Jul 14, 2025
HTML 7 Updated May 28, 2025

[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

Python 349 12 Updated Jun 26, 2025

MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 1,197 55 Updated Jun 13, 2025
Python 1,278 50 Updated Jul 11, 2025

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 6,081 451 Updated Aug 7, 2024

ICCV 2025 | Nexus: Decoupled Diffusion Sparks Adaptive Scene Generation

Python 81 8 Updated Jul 2, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,322 53 Updated Jun 14, 2025

A Paper List for Humanoid Robot Learning.

630 37 Updated Jul 11, 2025

Enhancing Representations through Heterogeneous Self-Supervised Learning (TPAMI 2025)

Python 14 Updated May 2, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 22,577 1,526 Updated Jun 26, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 13,026 1,609 Updated Jul 4, 2025

A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.

Python 1,519 64 Updated Jul 15, 2025