Showing 1–50 of 937 results for author: Liu, G

Searching in archive cs.
  1. arXiv:2507.17731  [pdf, ps, other]

    cs.LG cs.AI

    Flow Matching Meets Biology and Life Science: A Survey

    Authors: Zihao Li, Zhichen Zeng, Xiao Lin, Feihao Fang, Yanru Qu, Zhe Xu, Zhining Liu, Xuying Ning, Tianxin Wei, Ge Liu, Hanghang Tong, Jingrui He

    Abstract: Over the past decade, advances in generative modeling, such as generative adversarial networks, masked autoencoders, and diffusion models, have significantly transformed biological research and discovery, enabling breakthroughs in molecule design, protein generation, drug discovery, and beyond. At the same time, biological applications have served as valuable testbeds for evaluating the capabiliti…

    Submitted 23 July, 2025; originally announced July 2025.

    Comments: Preprint, 27 pages

  2. arXiv:2507.15230  [pdf, ps, other]

    cs.DC cs.GR

    GALE: Leveraging Heterogeneous Systems for Efficient Unstructured Mesh Data Analysis

    Authors: Guoxi Liu, Thomas Randall, Rong Ge, Federico Iuricich

    Abstract: Unstructured meshes present challenges in scientific data analysis due to irregular distribution and complex connectivity. Computing and storing connectivity information is a major bottleneck for visualization algorithms, affecting both time and memory performance. Recent task-parallel data structures address this by precomputing connectivity information at runtime while the analysis algorithm exe…

    Submitted 21 July, 2025; originally announced July 2025.

  3. arXiv:2507.13410  [pdf, ps, other]

    cs.CL cs.AI

    Causal Language Control in Multilingual Transformers via Sparse Feature Steering

    Authors: Cheng-Ting Chou, George Liu, Jessica Sun, Cole Blondin, Kevin Zhu, Vasu Sharma, Sean O'Brien

    Abstract: Deterministically controlling the target generation language of large multilingual language models (LLMs) remains a fundamental challenge, particularly in zero-shot settings where neither explicit language prompts nor fine-tuning are available. In this work, we investigate whether sparse autoencoder (SAE) features, previously shown to correlate with interpretable model behaviors, can be leveraged…

    Submitted 17 July, 2025; originally announced July 2025.

  4. arXiv:2507.12898  [pdf, ps, other]

    cs.LG cs.AI cs.CV cs.RO

    Generalist Bimanual Manipulation via Foundation Video Diffusion Models

    Authors: Yao Feng, Hengkai Tan, Xinyi Mao, Guodong Liu, Shuhe Huang, Chendong Xiang, Hang Su, Jun Zhu

    Abstract: Bimanual robotic manipulation, which involves the coordinated control of two robotic arms, is foundational for solving challenging tasks. Despite recent progress in general-purpose manipulation, data scarcity and embodiment heterogeneity remain serious obstacles to further scaling up in bimanual settings. In this paper, we introduce VIdeo Diffusion for Action Reasoning (VIDAR), a two-stage framewo…

    Submitted 17 July, 2025; originally announced July 2025.

  5. arXiv:2507.12768  [pdf, ps, other]

    cs.CV cs.LG cs.RO

    AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation

    Authors: Hengkai Tan, Yao Feng, Xinyi Mao, Shuhe Huang, Guodong Liu, Zhongkai Hao, Hang Su, Jun Zhu

    Abstract: Vision-language-action (VLA) models have shown promise on task-conditioned control in complex settings such as bimanual manipulation. However, the heavy reliance on task-specific human demonstrations limits their generalization and incurs high data acquisition costs. In this work, we present a new notion of task-agnostic action paradigm that decouples action execution from task-specific conditioni…

    Submitted 16 July, 2025; originally announced July 2025.

  6. arXiv:2507.12507  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training

    Authors: Mingjie Liu, Shizhe Diao, Jian Hu, Ximing Lu, Xin Dong, Hao Zhang, Alexander Bukharin, Shaokun Zhang, Jiaqi Zeng, Makesh Narsimhan Sreedhar, Gerald Shen, David Mosallanezhad, Di Zhang, Jonas Yang, June Yang, Oleksii Kuchaiev, Guilin Liu, Zhiding Yu, Pavlo Molchanov, Yejin Choi, Jan Kautz, Yi Dong

    Abstract: Recent advancements in reasoning-focused language models such as OpenAI's O1 and DeepSeek-R1 have shown that scaling test-time computation, through chain-of-thought reasoning and iterative exploration, can yield substantial improvements on complex tasks like mathematics and code generation. These breakthroughs have been driven by large-scale reinforcement learning (RL), particularly when combined wi…

    Submitted 16 July, 2025; originally announced July 2025.

    Comments: 14 pages, 7 figures

  7. arXiv:2507.11942  [pdf, ps, other]

    cs.CL

    DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression

    Authors: Yi Zhao, Zuchao Li, Hai Zhao, Baoyuan Qi, Guoming Liu

    Abstract: Task-agnostic prompt compression leverages the redundancy in natural language to reduce computational overhead and enhance information density within prompts, especially in long-context scenarios. Existing methods predominantly rely on information entropy as the metric to compress lexical units, aiming to achieve minimal information loss. However, these approaches overlook two critical aspects: (i…

    Submitted 16 July, 2025; originally announced July 2025.

    Comments: ACL 2025

  8. arXiv:2507.11273  [pdf, ps, other]

    cs.CL

    KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding

    Authors: Luohe Shi, Zuchao Li, Lefei Zhang, Guoming Liu, Baoyuan Qi, Hai Zhao

    Abstract: Large language models (LLMs) based on Transformer Decoders have become the preferred choice for conversational generative AI. Despite the overall superiority of the Decoder architecture, the gradually increasing Key-Value (KV) cache during inference has emerged as a primary efficiency bottleneck, both in aspects of memory consumption and data transfer bandwidth limitations. To address these challe…

    Submitted 15 July, 2025; originally announced July 2025.

    Comments: To be published in The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)

  9. arXiv:2507.10290  [pdf, ps, other]

    cs.RO

    TOP: Trajectory Optimization via Parallel Optimization towards Constant Time Complexity

    Authors: Jiajun Yu, Nanhe Chen, Guodong Liu, Chao Xu, Fei Gao, Yanjun Cao

    Abstract: Optimization has been widely used to generate smooth trajectories for motion planning. However, existing trajectory optimization methods show weakness when dealing with large-scale long trajectories. Recent advances in parallel computing have accelerated optimization in some fields, but how to efficiently solve trajectory optimization via parallelism remains an open question. In this paper, we pro…

    Submitted 16 July, 2025; v1 submitted 14 July, 2025; originally announced July 2025.

    Comments: 8 pages, submitted to RA-L

  10. arXiv:2507.09505  [pdf, ps, other]

    cs.RO

    TruckV2X: A Truck-Centered Perception Dataset

    Authors: Tenghui Xie, Zhiying Song, Fuxi Wen, Jun Li, Guangzhao Liu, Zijian Zhao

    Abstract: Autonomous trucking offers significant benefits, such as improved safety and reduced costs, but faces unique perception challenges due to trucks' large size and dynamic trailer movements. These challenges include extensive blind spots and occlusions that hinder the truck's perception and the capabilities of other road users. To address these limitations, cooperative perception emerges as a promisi…

    Submitted 13 July, 2025; originally announced July 2025.

  11. arXiv:2507.07032  [pdf, ps, other]

    cs.LG cs.AI q-bio.QM

    PLAME: Leveraging Pretrained Language Models to Generate Enhanced Protein Multiple Sequence Alignments

    Authors: Hanqun Cao, Xinyi Zhou, Zijun Gao, Chenyu Wang, Xin Gao, Zhi Zhang, Chunbin Gu, Ge Liu, Pheng-Ann Heng

    Abstract: Protein structure prediction is essential for drug discovery and understanding biological functions. While recent advancements like AlphaFold have achieved remarkable accuracy, most folding models rely heavily on multiple sequence alignments (MSAs) to boost prediction performance. This dependency limits their effectiveness on low-homology proteins and orphan proteins, where MSA information is spar…

    Submitted 17 June, 2025; originally announced July 2025.

  12. arXiv:2507.06517  [pdf, ps, other]

    cs.CL

    SpindleKV: A Novel KV Cache Reduction Method Balancing Both Shallow and Deep Layers

    Authors: Zicong Tang, Shi Luohe, Zuchao Li, Baoyuan Qi, Guoming Liu, Lefei Zhang, Ping Wang

    Abstract: Large Language Models (LLMs) have achieved impressive accomplishments in recent years. However, the increasing memory consumption of KV cache has posed a significant challenge to the inference system. Eviction methods have revealed the inherent redundancy within the KV cache, demonstrating its potential for reduction, particularly in deeper layers. However, KV cache reduction for shallower lay…

    Submitted 8 July, 2025; originally announced July 2025.

    Comments: Accepted by ACL 2025 main

  13. arXiv:2507.06261  [pdf, ps, other]

    cs.CL cs.AI

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Authors: Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, Krishna Haridasan, Ahmed Omran, Nikunj Saunshi, Dara Bahri, Gaurav Mishra, Eric Chu, et al. (3284 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal unde…

    Submitted 22 July, 2025; v1 submitted 7 July, 2025; originally announced July 2025.

    Comments: 72 pages, 17 figures

  14. arXiv:2507.05268  [pdf, ps, other]

    q-bio.NC cs.CV eess.SY

    Cross-Subject DD: A Cross-Subject Brain-Computer Interface Algorithm

    Authors: Xiaoyuan Li, Xinru Xue, Bohan Zhang, Ye Sun, Shoushuo Xi, Gang Liu

    Abstract: Brain-computer interface (BCI) based on motor imagery (MI) enables direct control of external devices by decoding the electroencephalogram (EEG) generated in the brain during imagined movements. However, due to inter-individual variability in brain activity, existing BCI models exhibit poor adaptability across subjects, thereby limiting their generalizability and widespread application. To address…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: 20 pages, 9 figures

  15. arXiv:2507.04620  [pdf, ps, other]

    cs.RO

    IDAGC: Adaptive Generalized Human-Robot Collaboration via Human Intent Estimation and Multimodal Policy Learning

    Authors: Haotian Liu, Yuchuang Tong, Guanchen Liu, Zhaojie Ju, Zhengtao Zhang

    Abstract: In Human-Robot Collaboration (HRC), which encompasses physical interaction and remote cooperation, accurate estimation of human intentions and seamless switching of collaboration modes to adjust robot behavior remain paramount challenges. To address these issues, we propose an Intent-Driven Adaptive Generalized Collaboration (IDAGC) framework that leverages multimodal data and human intent estimat…

    Submitted 6 July, 2025; originally announced July 2025.

    Comments: Accepted by IROS 2025

  16. arXiv:2507.04227  [pdf, ps, other]

    cs.CR cs.AI

    Hijacking JARVIS: Benchmarking Mobile GUI Agents against Unprivileged Third Parties

    Authors: Guohong Liu, Jialei Ye, Jiacheng Liu, Yuanchun Li, Wei Liu, Pengzhi Gao, Jian Luan, Yunxin Liu

    Abstract: Mobile GUI agents are designed to autonomously execute diverse device-control tasks by interpreting and interacting with mobile screens. Despite notable advancements, their resilience in real-world scenarios where screen content may be partially manipulated by untrustworthy third parties remains largely unexplored. Owing to their black-box and autonomous nature, these agents are vulnerable to mani…

    Submitted 5 July, 2025; originally announced July 2025.

  17. arXiv:2507.01951  [pdf, ps, other]

    cs.LG cs.CL

    Test-Time Scaling with Reflective Generative Model

    Authors: Zixiao Wang, Yuxin Wang, Xiaorui Wang, Mengting Xing, Jie Gao, Jianjun Xu, Guangcan Liu, Chenhui Jin, Zhuo Wang, Shengzhuo Zhang, Hongtao Xie

    Abstract: We introduce our first reflective generative model MetaStone-S1, which obtains OpenAI o3-mini's performance via the new Reflective Generative Form. The new form focuses on high-quality reasoning trajectory selection and contains two novelties: 1) A unified interface for policy and process reward model: we share the backbone network and use task-specific heads for reasoning trajectory predicting an…

    Submitted 9 July, 2025; v1 submitted 2 July, 2025; originally announced July 2025.

  18. arXiv:2507.01926  [pdf, ps, other]

    cs.CV

    IC-Custom: Diverse Image Customization via In-Context Learning

    Authors: Yaowei Li, Xiaoyu Li, Zhaoyang Zhang, Yuxuan Bian, Gan Liu, Xinyuan Li, Jiale Xu, Wenbo Hu, Yating Liu, Lingen Li, Jing Cai, Yuexian Zou, Yancheng He, Ying Shan

    Abstract: Image customization, a crucial technique for industrial media production, aims to generate content that is consistent with reference images. However, current approaches conventionally separate image customization into position-aware and position-free customization paradigms and lack a universal framework for diverse customization, limiting their applications across various scenarios. To overcome t…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: Project page: https://liyaowei-stu.github.io/project/IC_Custom

  19. arXiv:2507.00985  [pdf, ps, other]

    cs.CL

    Discourse Heuristics For Paradoxically Moral Self-Correction

    Authors: Guangliang Liu, Zimo Qi, Xitong Zhang, Kristen Marie Johnson

    Abstract: Moral self-correction has emerged as a promising approach for aligning the output of Large Language Models (LLMs) with human moral values. However, moral self-correction techniques are subject to two primary paradoxes. First, despite empirical and theoretical evidence to support the effectiveness of self-correction, this LLM capability only operates at a superficial level. Second, while LLMs posse…

    Submitted 1 July, 2025; originally announced July 2025.

  20. arXiv:2507.00601  [pdf]

    cs.CL

    Transferable Modeling Strategies for Low-Resource LLM Tasks: A Prompt and Alignment-Based Approach

    Authors: Shuangquan Lyu, Yingnan Deng, Guiran Liu, Zhen Qi, Ruotong Wang

    Abstract: This paper addresses the limited transfer and adaptation capabilities of large language models in low-resource language scenarios. It proposes a unified framework that combines a knowledge transfer module with parameter-efficient fine-tuning strategies. The method introduces knowledge alignment loss and soft prompt tuning to guide the model in effectively absorbing the structural features of targe…

    Submitted 2 July, 2025; v1 submitted 1 July, 2025; originally announced July 2025.

  21. arXiv:2507.00485  [pdf, ps, other]

    cs.LG cs.AI

    PNAct: Crafting Backdoor Attacks in Safe Reinforcement Learning

    Authors: Weiran Guo, Guanjun Liu, Ziyuan Zhou, Ling Wang

    Abstract: Reinforcement Learning (RL) is widely used in tasks where agents interact with an environment to maximize rewards. Building on this foundation, Safe Reinforcement Learning (Safe RL) incorporates a cost metric alongside the reward metric, ensuring that agents adhere to safety constraints during decision-making. In this paper, we identify that Safe RL is vulnerable to backdoor attacks, which can man…

    Submitted 1 July, 2025; originally announced July 2025.

  22. arXiv:2506.22866  [pdf, ps, other]

    cs.CV cs.AI

    Region-Aware CAM: High-Resolution Weakly-Supervised Defect Segmentation via Salient Region Perception

    Authors: Hang-Cheng Dong, Lu Zou, Bingguo Liu, Dong Ye, Guodong Liu

    Abstract: Surface defect detection plays a critical role in industrial quality inspection. Recent advances in artificial intelligence have significantly enhanced the automation level of detection processes. However, conventional semantic segmentation and object detection models heavily rely on large-scale annotated datasets, which conflicts with the practical requirements of defect detection tasks. This pap…

    Submitted 28 June, 2025; originally announced June 2025.

  23. arXiv:2506.22637  [pdf, ps, other]

    cs.CV

    CaO$_2$: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation

    Authors: Haoxuan Wang, Zhenghao Zhao, Junyi Wu, Yuzhang Shang, Gaowen Liu, Yan Yan

    Abstract: The recent introduction of diffusion models in dataset distillation has shown promising potential in creating compact surrogate datasets for large, high-resolution target datasets, offering improved efficiency and performance over traditional bi-level/uni-level optimization methods. However, current diffusion-based dataset distillation approaches overlook the evaluation process and exhibit two cri…

    Submitted 8 July, 2025; v1 submitted 27 June, 2025; originally announced June 2025.

    Comments: ICCV 2025. Code is available at https://github.com/hatchetProject/CaO2

  24. arXiv:2506.22565  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Adjoint Schrödinger Bridge Sampler

    Authors: Guan-Horng Liu, Jaemoo Choi, Yongxin Chen, Benjamin Kurt Miller, Ricky T. Q. Chen

    Abstract: Computational methods for learning to sample from the Boltzmann distribution -- where the target distribution is known only up to an unnormalized energy function -- have advanced significantly recently. Due to the lack of explicit target samples, however, prior diffusion-based methods, known as diffusion samplers, often require importance-weighted estimation or complicated learning processes. Both…

    Submitted 27 June, 2025; originally announced June 2025.

  25. arXiv:2506.18165  [pdf, ps, other]

    cs.LG cs.AI

    Non-equilibrium Annealed Adjoint Sampler

    Authors: Jaemoo Choi, Yongxin Chen, Molei Tao, Guan-Horng Liu

    Abstract: Recently, there has been significant progress in learning-based diffusion samplers, which aim to sample from a given unnormalized density. These methods typically follow one of two paradigms: (i) formulating sampling as an unbiased stochastic optimal control (SOC) problem using a canonical reference process, or (ii) refining annealed path measures through importance-weighted sampling. Although ann…

    Submitted 25 June, 2025; v1 submitted 22 June, 2025; originally announced June 2025.

    Comments: 21 pages, 7 figures

  26. arXiv:2506.18088  [pdf, ps, other]

    cs.RO cs.AI cs.CL cs.CV cs.MA

    RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation

    Authors: Tianxing Chen, Zanxin Chen, Baijun Chen, Zijian Cai, Yibin Liu, Qiwei Liang, Zixuan Li, Xianliang Lin, Yiheng Ge, Zhenyu Gu, Weiliang Deng, Yubin Guo, Tian Nian, Xuanbing Xie, Qiangyu Chen, Kailun Su, Tianling Xu, Guodong Liu, Mengkang Hu, Huan-ang Gao, Kaixuan Wang, Zhixuan Liang, Yusen Qin, Xiaokang Yang, Ping Luo, et al. (1 additional author not shown)

    Abstract: Simulation-based data synthesis has emerged as a powerful paradigm for enhancing real-world robotic manipulation. However, existing synthetic datasets remain insufficient for robust bimanual manipulation due to two challenges: (1) the lack of an efficient, scalable data generation method for novel tasks, and (2) oversimplified simulation environments that fail to capture real-world complexity. We…

    Submitted 22 June, 2025; originally announced June 2025.

    Comments: Project Page: https://robotwin-platform.github.io/

  27. arXiv:2506.16979  [pdf, ps, other]

    cs.CG cs.DS

    Minimum-Weight Half-Plane Hitting Set

    Authors: Gang Liu, Haitao Wang

    Abstract: Given a set $P$ of $n$ weighted points and a set $H$ of $n$ half-planes in the plane, the hitting set problem is to compute a subset $P'$ of points from $P$ such that each half-plane contains at least one point from $P'$ and the total weight of the points in $P'$ is minimized. The previous best algorithm solves the problem in $O(n^{7/2}\log^2 n)$ time. In this paper, we present a new algorithm wit…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: To appear in CCCG 2025. arXiv admin note: text overlap with arXiv:2407.00329, arXiv:2501.02195

  28. arXiv:2506.15910  [pdf, ps, other]

    eess.SY cs.DC cs.NI

    Autonomous Trajectory Optimization for UAVs in Disaster Zone Using Henry Gas Optimization Scheme

    Authors: Zakria Qadir, Muhammad Bilal, Guoqiang Liu, Xiaolong Xu

    Abstract: Unmanned aerial vehicles (UAVs) in a disaster-prone environment play an important role in assisting the rescue services and providing internet connectivity with the outside world. However, in such a complex environment, the selection of the optimum UAV trajectory is of utmost importance. UAV trajectory optimization deals with finding the shortest path in the minimal possible time. In this pap…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: 12 pages, 9 figures

    ACM Class: C.2; I.6

  29. arXiv:2506.15803  [pdf, ps, other]

    physics.med-ph cs.AI

    Unsupervised deep learning model for fast energy layer pre-selection of delivery-efficient proton arc therapy plan optimization of nasopharyngeal carcinoma

    Authors: Bohan Yang, Gang Liu, Rirao Dao, Yujia Qian, Ke Shi, Anke Tang, Yong Luo, Jingnan Liu

    Abstract: Objective. Proton arc therapy (PAT) is an emerging and promising modality in radiotherapy, offering several advantages over conventional intensity-modulated proton therapy (IMPT). However, identifying the optimal energy layer (EL) sequence remains computationally intensive due to the large number of possible energy layer transitions. This study proposes an unsupervised deep learning framework for f…

    Submitted 18 June, 2025; originally announced June 2025.

  30. arXiv:2506.15728  [pdf]

    q-bio.QM cs.CV q-bio.BM

    Smartphone-integrated RPA-CRISPR-Cas12a Detection System with Microneedle Sampling for Point-of-Care Diagnosis of Potato Late Blight in Early Stage

    Authors: Jiangnan Zhao, Hanbo Xu, Cifu Xu, Wenlong Yin, Laixin Luo, Gang Liu, Yan Wang

    Abstract: Potato late blight, caused by the oomycete pathogen Phytophthora infestans, is one of the most devastating diseases affecting potato crops in history. Although conventional detection methods of plant diseases such as PCR and LAMP are highly sensitive and specific, they rely on bulky and expensive laboratory equipment and involve complex operations, making them impracticable for point-of-care d…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 32 pages, 7 figures, 1 table

  31. arXiv:2506.15050  [pdf, ps, other]

    cs.AI

    Truncated Proximal Policy Optimization

    Authors: Tiantian Fan, Lingjun Liu, Yu Yue, Jiaze Chen, Chengyi Wang, Qiying Yu, Chi Zhang, Zhiqi Lin, Ruofei Zhu, Yufeng Yuan, Xiaochen Zuo, Bole Ma, Mofan Zhang, Gaohong Liu, Ru Zhang, Haotian Zhou, Cong Xie, Ruidong Zhu, Zhi Zhang, Xin Liu, Mingxuan Wang, Lin Yan, Yonghui Wu

    Abstract: Recently, test-time scaling Large Language Models (LLMs) have demonstrated exceptional reasoning capabilities across scientific and professional tasks by generating long chains-of-thought (CoT). As a crucial component for developing these reasoning models, reinforcement learning (RL), exemplified by Proximal Policy Optimization (PPO) and its variants, allows models to learn through trial and error…

    Submitted 17 June, 2025; originally announced June 2025.

  32. arXiv:2506.14541  [pdf, ps, other]

    cs.CV

    Exploring Diffusion with Test-Time Training on Efficient Image Restoration

    Authors: Rongchang Lu, Tianduo Luo, Yunzhi Jiang, Conghan Yue, Pei Yang, Guibao Liu, Changyang Gu

    Abstract: Image restoration faces challenges including ineffective feature fusion, computational bottlenecks and inefficient diffusion processes. To address these, we propose DiffRWKVIR, a novel framework unifying Test-Time Training (TTT) with efficient diffusion. Our approach introduces three key innovations: (1) Omni-Scale 2D State Evolution extends RWKV's location-dependent parameterization to hierarchic…

    Submitted 22 June, 2025; v1 submitted 17 June, 2025; originally announced June 2025.

    ACM Class: I.4.9

  33. arXiv:2506.11116  [pdf, ps, other]

    cs.CL cs.AI

    Infinity Instruct: Scaling Instruction Selection and Synthesis to Enhance Language Models

    Authors: Jijie Li, Li Du, Hanyu Zhao, Bo-wen Zhang, Liangdong Wang, Boyan Gao, Guang Liu, Yonghua Lin

    Abstract: Large Language Models (LLMs) demonstrate strong performance in real-world applications, yet existing open-source instruction datasets often concentrate on narrow domains, such as mathematics or coding, limiting generalization and widening the gap with proprietary models. To bridge this gap, we introduce Infinity-Instruct, a high-quality instruction dataset designed to enhance both foundational and…

    Submitted 9 June, 2025; originally announced June 2025.

  34. arXiv:2506.10895  [pdf, ps, other]

    cs.CV cs.AI

    AIR: Zero-shot Generative Model Adaptation with Iterative Refinement

    Authors: Guimeng Liu, Milad Abdollahzadeh, Ngai-Man Cheung

    Abstract: Zero-shot generative model adaptation (ZSGM) aims to adapt a pre-trained generator to a target domain using only text guidance and without any samples from the target domain. Central to recent ZSGM approaches are directional losses, which use the text guidance in the form of aligning the image offset with text offset in the embedding space of a vision-language model like CLIP. This is similar to the…

    Submitted 12 June, 2025; originally announced June 2025.

  35. arXiv:2506.10168  [pdf, ps, other]

    stat.ML cs.LG

    Momentum Multi-Marginal Schrödinger Bridge Matching

    Authors: Panagiotis Theodoropoulos, Augustinos D. Saravanos, Evangelos A. Theodorou, Guan-Horng Liu

    Abstract: Understanding complex systems by inferring trajectories from sparse sample snapshots is a fundamental challenge in a wide range of domains, e.g., single-cell biology, meteorology, and economics. Despite advancements in Bridge and Flow matching frameworks, current methodologies rely on pairwise interpolation between adjacent snapshots. This hinders their ability to capture long-range temporal depen…

    Submitted 11 June, 2025; originally announced June 2025.

  36. arXiv:2506.09070  [pdf, ps, other]

    cs.GR cs.AI

    STREAMINGGS: Voxel-Based Streaming 3D Gaussian Splatting with Memory Optimization and Architectural Support

    Authors: Chenqi Zhang, Yu Feng, Jieru Zhao, Guangda Liu, Wenchao Ding, Chentao Wu, Minyi Guo

    Abstract: 3D Gaussian Splatting (3DGS) has gained popularity for its efficiency and sparse Gaussian-based representation. However, 3DGS struggles to meet the real-time requirement of 90 frames per second (FPS) on resource-constrained mobile devices, achieving only 2 to 9 FPS. Existing accelerators focus on compute efficiency but overlook memory efficiency, leading to redundant DRAM traffic. We introduce STRE…

    Submitted 9 June, 2025; originally announced June 2025.

  37. arXiv:2506.07778  [pdf, ps, other]

    cs.CV

    Language-Vision Planner and Executor for Text-to-Visual Reasoning

    Authors: Yichang Xu, Gaowen Liu, Ramana Rao Kompella, Sihao Hu, Tiansheng Huang, Fatih Ilhan, Selim Furkan Tekin, Zachary Yahn, Ling Liu

    Abstract: The advancement in large language models (LLMs) and large vision models has fueled the rapid progress in multi-modal visual-text reasoning capabilities. However, existing vision-language models (VLMs) to date suffer from generalization performance. Inspired by recent development in LLMs for visual reasoning, this paper presents VLAgent, an AI system that can create a step-by-step visual reasoning…

    Submitted 9 June, 2025; originally announced June 2025.

  38. arXiv:2506.07639  [pdf, ps, other]

    cs.RO

    Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse

    Authors: Zhekai Duan, Yuan Zhang, Shikai Geng, Gaowen Liu, Joschka Boedecker, Chris Xiaoxuan Lu

    Abstract: Embodied Chain-of-Thought (ECoT) reasoning enhances vision-language-action (VLA) models by improving performance and interpretability through intermediate reasoning steps. However, its sequential autoregressive token generation introduces significant inference latency, limiting real-time deployment. We propose Fast ECoT, an inference-time acceleration method that exploits the structured and repeti…

    Submitted 9 June, 2025; originally announced June 2025.

  39. arXiv:2506.07548  [pdf, ps, other]

    cs.AI cs.RO

    Curriculum Learning With Counterfactual Group Relative Policy Advantage For Multi-Agent Reinforcement Learning

    Authors: Weiqiang Jin, Hongyang Du, Guizhong Liu, Dong In Kim

    Abstract: Multi-agent reinforcement learning (MARL) has achieved strong performance in cooperative adversarial tasks. However, most existing methods typically train agents against fixed opponent strategies and rely on such meta-static difficulty conditions, which limits their adaptability to changing environments and often leads to suboptimal policies. Inspired by the success of curriculum learning (CL) in…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 16 pages; 12 figures

  40. arXiv:2506.07463  [pdf, ps, other]

    cs.CL cs.AI

    CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models

    Authors: Guang Liu, Liangdong Wang, Jijie Li, Yang Yu, Yao Xu, Jiabei Chen, Yu Bai, Feng Liao, Yonghua Lin

    Abstract: We introduce CCI4.0, a large-scale bilingual pre-training dataset engineered for superior data quality and diverse human-like reasoning trajectory. CCI4.0 occupies roughly $35$ TB of disk space and comprises two sub-datasets: CCI4.0-M2-Base and CCI4.0-M2-CoT. CCI4.0-M2-Base combines a $5.2$ TB carefully curated Chinese web corpus, a $22.5$ TB English subset from Nemotron-CC, and diverse sources fr…

    Submitted 9 June, 2025; originally announced June 2025.

  41. arXiv:2506.07459  [pdf, ps, other]

    cs.LG q-bio.QM

    ProteinZero: Self-Improving Protein Generation via Online Reinforcement Learning

    Authors: Ziwen Wang, Jiajun Fan, Ruihan Guo, Thao Nguyen, Heng Ji, Ge Liu

    Abstract: Protein generative models have shown remarkable promise in protein design but still face limitations in success rate, due to the scarcity of high-quality protein datasets for supervised pretraining. We present ProteinZero, a novel framework that enables scalable, automated, and continuous self-improvement of the inverse folding model through online reinforcement learning. To achieve computationall…

    Submitted 10 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  42. arXiv:2506.07413  [pdf, ps, other]

    cs.LG cs.CV

    Variational Supervised Contrastive Learning

    Authors: Ziwen Wang, Jiajun Fan, Thao Nguyen, Heng Ji, Ge Liu

    Abstract: Contrastive learning has proven to be highly efficient and adaptable in shaping representation spaces across diverse modalities by pulling similar samples together and pushing dissimilar ones apart. However, two key limitations persist: (1) Without explicit regulation of the embedding distribution, semantically related instances can inadvertently be pushed apart unless complementary signals guide…

    Submitted 26 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  43. arXiv:2506.06800  [pdf, other]

    cs.CL

    On the Adaptive Psychological Persuasion of Large Language Models

    Authors: Tianjie Ju, Yujia Chen, Hao Fei, Mong-Li Lee, Wynne Hsu, Pengzhou Cheng, Zongru Wu, Zhuosheng Zhang, Gongshen Liu

    Abstract: Previous work has showcased the intriguing capabilities of Large Language Models (LLMs) in instruction-following and rhetorical fluency. However, systematic exploration of their dual capabilities to autonomously persuade and resist persuasion, particularly in contexts involving psychological rhetoric, remains unexplored. In this paper, we first evaluate four commonly adopted LLMs by tasking them t…

    Submitted 7 June, 2025; originally announced June 2025.

    Comments: Work in progress

  44. arXiv:2506.06205  [pdf, other]

    cs.RO cs.AI

    Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning

    Authors: Sheng Chen, Peiyu He, Jiaxin Hu, Ziyang Liu, Yansheng Wang, Tao Xu, Chi Zhang, Chongchong Zhang, Chao An, Shiyu Cai, Duo Cao, Kangping Chen, Shuai Chu, Tianwei Chu, Mingdi Dan, Min Du, Weiwei Fang, Pengyou Fu, Junkai Hu, Xiaowei Jiang, Zhaodi Jiang, Fuxuan Li, Jun Li, Minghui Li, Mingyao Li, et al. (46 additional authors not shown)

    Abstract: Modern robot navigation systems encounter difficulties in diverse and complex indoor environments. Traditional approaches rely on multiple modules with small models or rule-based systems and thus lack adaptability to new environments. To address this, we developed Astra, a comprehensive dual-model architecture, Astra-Global and Astra-Local, for mobile robot navigation. Astra-Global, a multimodal L…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: Astra Technical Report

  45. arXiv:2506.05566  [pdf, ps, other]

    cs.AR cs.AI

    ScaleRTL: Scaling LLMs with Reasoning Data and Test-Time Compute for Accurate RTL Code Generation

    Authors: Chenhui Deng, Yun-Da Tsai, Guan-Ting Liu, Zhongzhi Yu, Haoxing Ren

    Abstract: Recent advances in large language models (LLMs) have enabled near-human performance on software coding benchmarks, but their effectiveness in RTL code generation remains limited due to the scarcity of high-quality training data. While prior efforts have fine-tuned LLMs for RTL tasks, they do not fundamentally overcome the data bottleneck and lack support for test-time scaling due to their non-reas…

    Submitted 15 July, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

    Comments: Accepted to MLCAD 2025

  46. arXiv:2506.04975  [pdf, ps, other]

    cs.CY

    Evaluating Prompt-Driven Chinese Large Language Models: The Influence of Persona Assignment on Stereotypes and Safeguards

    Authors: Geng Liu, Li Feng, Carlo Alberto Bono, Songbo Yang, Mengxiao Zhu, Francesco Pierri

    Abstract: Recent research has highlighted that assigning specific personas to large language models (LLMs) can significantly increase harmful content generation. Yet, limited attention has been given to persona-driven toxicity in non-Western contexts, particularly in Chinese-based LLMs. In this paper, we perform a large-scale, systematic analysis of how persona assignment influences refusal behavior and res…

    Submitted 5 June, 2025; originally announced June 2025.

  47. Automated Mechanism to Support Trade Transactions in Smart Contracts with Upgrade and Repair

    Authors: Christian Gang Liu, Peter Bodorik, Dawn Jutla

    Abstract: In our previous research, we addressed the problem of automated transformation of models, represented using the business process model and notation (BPMN) standard, into the methods of a smart contract. The transformation supports BPMN models that contain complex multi-step activities that are supported using our concept of multi-step nested trade transactions, wherein the transactional properties…

    Submitted 4 June, 2025; originally announced June 2025.

    Journal ref: Elsevier Journal of Blockchain: Research and Applications, 2025

  48. arXiv:2506.03117  [pdf, ps, other]

    cs.CV

    Targeted Forgetting of Image Subgroups in CLIP Models

    Authors: Zeliang Zhang, Gaowen Liu, Charles Fleming, Ramana Rao Kompella, Chenliang Xu

    Abstract: Foundation models (FMs) such as CLIP have demonstrated impressive zero-shot performance across various tasks by leveraging large-scale, unsupervised pre-training. However, they often inherit harmful or unwanted knowledge from noisy internet-sourced datasets, compromising their reliability in real-world applications. Existing model unlearning methods either rely on access to pre-trained datasets or…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: 12 figures, 5 pages. The project page is https://zhangaipi.github.io/forget_clip/

  49. Transforming Automatically BPMN Models to Smart Contracts with Nested Collaborative Transactions (TABS+)

    Authors: Christian Gang Liu, Peter Bodorik, Dawn Jutla

    Abstract: Development of blockchain smart contracts is more difficult than mainstream software development because the underlying blockchain infrastructure poses additional complexity. To ease the developer's task of writing smart contract, as other research efforts, we also use Business Process Model and Notation BPMN modeling to describe application requirements for trade of goods and services and then tr…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Preprint. arXiv admin note: substantial text overlap with arXiv:2505.24309

    Journal ref: Distributed Ledger Technologies: Research and Practice, Volume 3, Issue 3, Article No. 21, Pages 1-37, 2024

  50. arXiv:2506.02175  [pdf, ps, other]

    cs.CL

    AI Debate Aids Assessment of Controversial Claims

    Authors: Salman Rahman, Sheriff Issaka, Ashima Suvarna, Genglin Liu, James Shiffer, Jaeyoung Lee, Md Rizwan Parvez, Hamid Palangi, Shi Feng, Nanyun Peng, Yejin Choi, Julian Michael, Liwei Jiang, Saadia Gabriel

    Abstract: As AI grows more powerful, it will increasingly shape how we understand the world. But with this influence comes the risk of amplifying misinformation and deepening social divides-especially on consequential topics like public health where factual accuracy directly impacts well-being. Scalable Oversight aims to ensure AI truthfulness by enabling humans to supervise systems that may exceed human ca…

    Submitted 2 June, 2025; originally announced June 2025.