LMD0311

Highlights

  • Pro

Organizations

@H-EmbodVis

LLaVA-VLA: A Simple Yet Powerful Vision-Language-Action Model [Actively Maintained🔥]

Python 109 2 Updated Jul 24, 2025
Python 88 4 Updated Jul 13, 2025

A Unified Driving World Model for Future Generation and Perception

Python 110 4 Updated Jul 22, 2025

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Jupyter Notebook 267 14 Updated Jun 26, 2025

Pusa: Thousands Timesteps Video Diffusion Model

Python 523 38 Updated Jul 26, 2025

[ICCV 2025] Aether: Geometric-Aware Unified World Modeling

Python 416 4 Updated Jul 7, 2025

Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"

Python 65 1 Updated Jul 18, 2025

Code of π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Python 844 21 Updated Jul 18, 2025

[ICCV 2025] HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation

Python 130 5 Updated Jul 14, 2025

[ICML 2025] Official PyTorch Implementation of "History-Guided Video Diffusion"

Python 416 18 Updated Jul 1, 2025

RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter. 🎉🎉🎉

Python 472 36 Updated Jul 25, 2025

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Python 2,477 574 Updated Jul 24, 2025

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 813 37 Updated May 17, 2025

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Python 1,386 96 Updated Jun 5, 2025

Towards a Generative 3D World Engine for Embodied Intelligence

Python 265 13 Updated Jul 21, 2025

The true story of the heartache and darkness behind the development of the Noah Pangu large model.

11,284 1,385 Updated Jul 9, 2025

[ICCV 2025] LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Python 258 16 Updated Jul 15, 2025

Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching

Python 201 3 Updated Jul 19, 2025

The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.

Jupyter Notebook 7,371 1,400 Updated Jul 25, 2025

[ICCV 2025 ⭐highlight⭐] Implementation of VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory

Python 222 11 Updated Jul 25, 2025

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Python 656 19 Updated Jul 16, 2025
Python 669 28 Updated Jul 7, 2025

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.

Python 2,740 221 Updated Jul 7, 2025

A lightweight LMM-based Document Parsing Model

Python 5,237 321 Updated Jul 24, 2025
SAS 50 4 Updated Jun 10, 2025

A Native Multimodal LLM for 3D Generation and Understanding

Python 463 25 Updated Jul 4, 2025

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).

Python 10,195 879 Updated May 30, 2025

Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models

Python 137 6 Updated May 26, 2025

Official Implementation: Training-Free Efficient Video Generation via Dynamic Token Carving

Python 221 10 Updated Jun 29, 2025

Interactive visualizations of the geometric intuition behind diffusion models.

Svelte 804 32 Updated Jun 17, 2025