HKUST
- Clear Water Bay, Hong Kong
- (UTC +08:00)
- https://www.linkedin.com/in/jqtnpu
- https://orcid.org/0009-0003-1251-0825
- https://scholar.google.com.hk/citations?user=dl5CsIUAAAAJ&hl=en
- https://jqtangust.github.io
- https://jqt.me
- https://huggingface.co/Jiaqi-hkust
Stars
This is the project for the paper at ICCV 2025
LongLive: Real-time Interactive Long Video Generation
This is a PyTorch project for the paper Universal Adaptive Data Augmentation (IJCAI 2023).
The code for the TPAMI paper "Text-Guided Human Image Manipulation via Image-Text Shared Space"
The source code of the paper "Push the Limit of Scene Text Recognition Using Character and Text Length Guided Text Super-resolution"
This is the source code for the CVPR paper "Low-Light Image Enhancement via Structure Modeling and Guidance"
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.
[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"
Awesome papers involving LLMs in Social Science.
[ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization"
Awesome Unified Multimodal Models
[CVPR 2025 Highlight] Official Implementation of SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity
Chillobre / Course-Material
Forked from npu-cs/Course-Material
Course guide for the computer science major at Northwestern Polytechnical University (NPU) | npu-cs/Course-Material
🔥 🔥 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies
[ICRA 2023] Efficient Implicit Neural Reconstruction Using LiDAR
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
[ICCV 2025] VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓
[ECCV 2024] Official Implementation of An Incremental Unified Framework for Small Defect Inspection
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Aligning pretrained language models with instruction data generated by themselves.
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
The official GitHub page for the survey paper "A Survey on Data Augmentation in Large Model Era"