Stars
🎓 Automatically update recommendation papers daily using GitHub Actions (updated every 12 hours)
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" CVPR 2024
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
🎓 Path to a free self-taught education in Computer Science!
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
The official repo for "DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models".
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Q-Insight is open-sourced at https://github.com/bytedance/Q-Insight. This repository will not receive further updates.
A comprehensive collection of IQA papers
Evaluation and Tracking for LLM Experiments and AI Agents
1 minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)
Official inference framework for 1-bit LLMs
State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
Fast, accurate, lightweight Python library for state-of-the-art embeddings
Port of OpenAI's Whisper model in C/C++
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
Development repository for the Triton language and compiler
Enforce the output format (JSON Schema, regex, etc.) of a language model
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.
Large-scale LLM inference engine
Learn how to design systems at scale and prepare for system design interviews
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
🔥🔥🔥 Latest papers, code, and datasets on Vid-LLMs.
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions