Showing 1–50 of 2,213 results for author: Liu, T

Searching in archive cs.
  1. arXiv:2507.17388  [pdf, ps, other]

    cs.CV eess.IV

    EndoGen: Conditional Autoregressive Endoscopic Video Generation

    Authors: Xinyu Liu, Hengyu Liu, Cheng Wang, Tianming Liu, Yixuan Yuan

    Abstract: Endoscopic video generation is crucial for advancing medical imaging and enhancing diagnostic capabilities. However, prior efforts in this field have either focused on static images, lacking the dynamic context required for practical applications, or have relied on unconditional generation that fails to provide meaningful references for clinicians. Therefore, in this paper, we propose the first co…

    Submitted 23 July, 2025; originally announced July 2025.

    Comments: MICCAI 2025

  2. arXiv:2507.15836  [pdf, ps, other]

    cs.LG cs.CR

    Optimizing Canaries for Privacy Auditing with Metagradient Descent

    Authors: Matteo Boglioni, Terrance Liu, Andrew Ilyas, Zhiwei Steven Wu

    Abstract: In this work we study black-box privacy auditing, where the goal is to lower bound the privacy parameter of a differentially private learning algorithm using only the algorithm's outputs (i.e., final trained model). For DP-SGD (the most successful method for training differentially private deep learning models), the canonical approach to auditing uses membership inference -- an auditor comes with a smal…

    Submitted 21 July, 2025; originally announced July 2025.

  3. arXiv:2507.14686  [pdf, ps, other]

    cs.CV

    From Semantics, Scene to Instance-awareness: Distilling Foundation Model for Open-vocabulary Situation Recognition

    Authors: Chen Cai, Tianyi Liu, Jianjun Gao, Wenyang Liu, Kejun Wu, Ruoyu Wang, Yi Wang, Soo Chin Liew

    Abstract: Recent Multimodal Large Language Models (MLLMs) exhibit strong zero-shot abilities but struggle with complex Grounded Situation Recognition (GSR) and are resource-intensive for edge device deployment. Meanwhile, conventional GSR models often lack generalization ability, falling short in recognizing unseen and rare situations. In this paper, we exploit transferring knowledge from a teacher MLLM to…

    Submitted 19 July, 2025; originally announced July 2025.

  4. arXiv:2507.12590  [pdf, ps, other]

    cs.CV cs.LG

    Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows

    Authors: Judy Long, Tao Liu, Sean Alexander Woznicki, Miljana Marković, Oskar Marko, Molly Sears

    Abstract: Crop mapping involves identifying and classifying crop types using spatial data, primarily derived from remote sensing imagery. This study presents the first comprehensive review of large-scale, pixel-wise crop mapping workflows, encompassing both conventional supervised methods and emerging transfer learning approaches. To identify the optimal supervised crop mapping workflows, we conducted syste…

    Submitted 16 July, 2025; originally announced July 2025.

    Comments: A review article. 41 pages, 22 figures. Preprint

  5. arXiv:2507.10951  [pdf, ps, other]

    cs.NE cs.AI q-bio.NC

    Biological Processing Units: Leveraging an Insect Connectome to Pioneer Biofidelic Neural Architectures

    Authors: Siyu Yu, Zihan Qin, Tingshan Liu, Beiya Xu, R. Jacob Vogelstein, Jason Brown, Joshua T. Vogelstein

    Abstract: The complete connectome of the Drosophila larva brain offers a unique opportunity to investigate whether biologically evolved circuits can support artificial intelligence. We convert this wiring diagram into a Biological Processing Unit (BPU), a fixed recurrent network derived directly from synaptic connectivity. Despite its modest size (3,000 neurons and 65,000 weights between them), the unmodifie…

    Submitted 14 July, 2025; originally announced July 2025.

    Comments: Accepted to AGI 2025

  6. arXiv:2507.10722  [pdf, ps, other]

    q-bio.NC cs.NE

    Bridging Brains and Machines: A Unified Frontier in Neuroscience, Artificial Intelligence, and Neuromorphic Systems

    Authors: Sohan Shankar, Yi Pan, Hanqi Jiang, Zhengliang Liu, Mohammad R. Darbandi, Agustin Lorenzo, Junhao Chen, Md Mehedi Hasan, Arif Hassan Zidan, Eliana Gelman, Joshua A. Konfrst, Jillian Y. Russell, Katelyn Fernandes, Tianze Yang, Yiwei Li, Huaqin Zhao, Afrar Jahin, Triparna Ganguly, Shair Dinesha, Yifan Zhou, Zihao Wu, Xinliang Li, Lokesh Adusumilli, Aziza Hussein, Sagar Nookarapu, et al. (20 additional authors not shown)

    Abstract: This position and survey paper identifies the emerging convergence of neuroscience, artificial general intelligence (AGI), and neuromorphic computing toward a unified research paradigm. Using a framework grounded in brain physiology, we highlight how synaptic plasticity, sparse spike-based communication, and multimodal association provide design principles for next-generation AGI systems that pote…

    Submitted 14 July, 2025; originally announced July 2025.

  7. arXiv:2507.10502  [pdf]

    cs.LG cs.AI

    Benchmarking and Evaluation of AI Models in Biology: Outcomes and Recommendations from the CZI Virtual Cells Workshop

    Authors: Elizabeth Fahsbender, Alma Andersson, Jeremy Ash, Polina Binder, Daniel Burkhardt, Benjamin Chang, Georg K. Gerber, Anthony Gitter, Patrick Godau, Ankit Gupta, Genevieve Haliburton, Siyu He, Trey Ideker, Ivana Jelic, Aly Khan, Yang-Joon Kim, Aditi Krishnapriyan, Jon M. Laurent, Tianyu Liu, Emma Lundberg, Shalin B. Mehta, Rob Moccia, Angela Oliveira Pisco, Katherine S. Pollard, Suresh Ramani, et al. (10 additional authors not shown)

    Abstract: Artificial intelligence holds immense promise for transforming biology, yet a lack of standardized, cross-domain benchmarks undermines our ability to build robust, trustworthy models. Here, we present insights from a recent workshop that convened machine learning and computational biology experts across imaging, transcriptomics, proteomics, and genomics to tackle this gap. We identify major techn…

    Submitted 15 July, 2025; v1 submitted 14 July, 2025; originally announced July 2025.

  8. arXiv:2507.10097  [pdf, ps, other]

    cs.IR

    User Long-Term Multi-Interest Retrieval Model for Recommendation

    Authors: Yue Meng, Cheng Guo, Xiaohui Hu, Honghu Deng, Yi Cao, Tong Liu, Bo Zheng

    Abstract: User behavior sequence modeling, which captures user interest from rich historical interactions, is pivotal for industrial recommendation systems. Despite breakthroughs in ranking-stage models capable of leveraging ultra-long behavior sequences with length scaling up to thousands, existing retrieval models remain constrained to sequences of hundreds of behaviors due to two main challenges. One is…

    Submitted 14 July, 2025; originally announced July 2025.

  9. arXiv:2507.08839  [pdf, ps, other]

    cs.LG cs.AI eess.IV

    Domain-Adaptive Diagnosis of Lewy Body Disease with Transferability Aware Transformer

    Authors: Xiaowei Yu, Jing Zhang, Tong Chen, Yan Zhuang, Minheng Chen, Chao Cao, Yanjun Lyu, Lu Zhang, Li Su, Tianming Liu, Dajiang Zhu

    Abstract: Lewy Body Disease (LBD) is a common yet understudied form of dementia that imposes a significant burden on public health. It shares clinical similarities with Alzheimer's disease (AD), as both progress through stages of normal cognition, mild cognitive impairment, and dementia. A major obstacle in LBD diagnosis is data scarcity, which limits the effectiveness of deep learning. In contrast, AD data…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: MICCAI 2025

  10. arXiv:2507.08741  [pdf, ps, other]

    cs.CV

    HieraRS: A Hierarchical Segmentation Paradigm for Remote Sensing Enabling Multi-Granularity Interpretation and Cross-Domain Transfer

    Authors: Tianlong Ai, Tianzhu Liu, Haochen Jiang, Yanfeng Gu

    Abstract: Hierarchical land cover and land use (LCLU) classification aims to assign pixel-wise labels with multiple levels of semantic granularity to remote sensing (RS) imagery. However, existing deep learning-based methods face two major challenges: 1) They predominantly adopt a flat classification paradigm, which limits their ability to generate end-to-end multi-granularity hierarchical predictions align…

    Submitted 11 July, 2025; originally announced July 2025.

    Comments: 17 pages, 11 figures

  11. arXiv:2507.08719  [pdf, ps, other]

    cs.CL cs.AI cs.SE

    Multilingual Multimodal Software Developer for Code Generation

    Authors: Linzheng Chai, Jian Yang, Shukai Liu, Wei Zhang, Liran Wang, Ke Jin, Tao Sun, Congnan Liu, Chenchen Zhang, Hualei Zhu, Jiaheng Liu, Xianjie Wu, Ge Zhang, Tianyu Liu, Zhoujun Li

    Abstract: The rapid advancement of Large Language Models (LLMs) has significantly improved code generation, yet most models remain text-only, neglecting crucial visual aids like diagrams and flowcharts used in real-world software development. To bridge this gap, we introduce MM-Coder, a Multilingual Multimodal software developer. MM-Coder integrates visual design inputs -- Unified Modeling Language (UML) diagr…

    Submitted 11 July, 2025; originally announced July 2025.

    Comments: Preprint

  12. arXiv:2507.07781  [pdf, ps, other]

    cs.CV cs.RO

    SURPRISE3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes

    Authors: Jiaxin Huang, Ziwen Li, Hanlve Zhang, Runnan Chen, Xiao He, Yandong Guo, Wenping Wang, Tongliang Liu, Mingming Gong

    Abstract: The integration of language and 3D perception is critical for embodied AI and robotic systems to perceive, understand, and interact with the physical world. Spatial reasoning, a key capability for understanding spatial relationships between objects, remains underexplored in current 3D vision-language research. Existing datasets often mix semantic cues (e.g., object name) with spatial context, lead…

    Submitted 10 July, 2025; originally announced July 2025.

  13. arXiv:2507.06503  [pdf, ps, other]

    cs.IR

    USD: A User-Intent-Driven Sampling and Dual-Debiasing Framework for Large-Scale Homepage Recommendations

    Authors: Jiaqi Zheng, Cheng Guo, Yi Cao, Chaoqun Hou, Tong Liu, Bo Zheng

    Abstract: Large-scale homepage recommendations face critical challenges from pseudo-negative samples caused by exposure bias, where non-clicks may indicate inattention rather than disinterest. Existing work lacks thorough analysis of invalid exposures and typically addresses isolated aspects (e.g., sampling strategies), overlooking the critical impact of pseudo-positive samples - such as homepage clicks mer…

    Submitted 10 July, 2025; v1 submitted 8 July, 2025; originally announced July 2025.

  14. arXiv:2507.06261  [pdf, ps, other]

    cs.CL cs.AI

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Authors: Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, Krishna Haridasan, Ahmed Omran, Nikunj Saunshi, Dara Bahri, Gaurav Mishra, Eric Chu, et al. (3284 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal unde…

    Submitted 22 July, 2025; v1 submitted 7 July, 2025; originally announced July 2025.

    Comments: 72 pages, 17 figures

  15. arXiv:2507.06203  [pdf, ps, other]

    cs.CL

    A Survey on Latent Reasoning

    Authors: Rui-Jie Zhu, Tianhao Peng, Tianhao Cheng, Xingwei Qu, Jinfa Huang, Dawei Zhu, Hao Wang, Kaiwen Xue, Xuanliang Zhang, Yong Shan, Tianle Cai, Taylor Kergan, Assel Kembay, Andrew Smith, Chenghua Lin, Binh Nguyen, Yuqi Pan, Yuhong Chou, Zefan Cai, Zhenhe Wu, Yongchi Zhao, Tianyu Liu, Jian Yang, Wangchunshu Zhou, Chujie Zheng, et al. (8 additional authors not shown)

    Abstract: Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, especially when guided by explicit chain-of-thought (CoT) reasoning that verbalizes intermediate steps. While CoT improves both interpretability and accuracy, its dependence on natural language reasoning limits the model's expressive bandwidth. Latent reasoning tackles this bottleneck by performing multi-step inferen…

    Submitted 10 July, 2025; v1 submitted 8 July, 2025; originally announced July 2025.

  16. arXiv:2507.05216  [pdf, ps, other]

    cs.LG cs.CY stat.AP stat.ML

    Bridging Prediction and Intervention Problems in Social Systems

    Authors: Lydia T. Liu, Inioluwa Deborah Raji, Angela Zhou, Luke Guerdan, Jessica Hullman, Daniel Malinsky, Bryan Wilder, Simone Zhang, Hammaad Adam, Amanda Coston, Ben Laufer, Ezinne Nwankwo, Michael Zanger-Tishler, Eli Ben-Michael, Solon Barocas, Avi Feller, Marissa Gerchick, Talia Gillis, Shion Guha, Daniel Ho, Lily Hu, Kosuke Imai, Sayash Kapoor, Joshua Loftus, Razieh Nabi, et al. (10 additional authors not shown)

    Abstract: Many automated decision systems (ADS) are designed to solve prediction problems -- where the goal is to learn patterns from a sample of the population and apply them to individuals from the same population. In reality, these prediction systems operationalize holistic policy interventions in deployment. Once deployed, ADS can shape impacted population outcomes through an effective policy change in…

    Submitted 7 July, 2025; originally announced July 2025.

  17. arXiv:2507.04820  [pdf, ps, other]

    cs.IR

    Harnessing Pairwise Ranking Prompting Through Sample-Efficient Ranking Distillation

    Authors: Junru Wu, Le Yan, Zhen Qin, Honglei Zhuang, Paul Suganthan G. C., Tianqi Liu, Zhe Dong, Xuanhui Wang, Harrie Oosterhuis

    Abstract: While Pairwise Ranking Prompting (PRP) with Large Language Models (LLMs) is one of the most effective zero-shot document ranking methods, it has a quadratic computational complexity with respect to the number of documents to be ranked, as it requires an enumeration over all possible document pairs. Consequently, the outstanding ranking performance of PRP has remained unreachable for most real-worl…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: ReNeuIR 2025 (at SIGIR 2025) - 4th Workshop on Reaching Efficiency in Neural Information Retrieval, July 17, 2025, Padua, Italy

  18. arXiv:2507.04306  [pdf, ps, other]

    cs.CV

    Exploring Remote Physiological Signal Measurement under Dynamic Lighting Conditions at Night: Dataset, Experiment, and Analysis

    Authors: Zhipeng Li, Kegang Wang, Hanguang Xiao, Xingyue Liu, Feizhong Zhou, Jiaxin Jiang, Tianqi Liu

    Abstract: Remote photoplethysmography (rPPG) is a non-contact technique for measuring human physiological signals. Due to its convenience and non-invasiveness, it has demonstrated broad application potential in areas such as health monitoring and emotion recognition. In recent years, the release of numerous public datasets has significantly advanced the performance of rPPG algorithms under ideal lighting co…

    Submitted 6 July, 2025; originally announced July 2025.

  19. arXiv:2507.04119  [pdf, ps, other]

    cs.LG cs.AI cs.CR cs.CV

    When Data-Free Knowledge Distillation Meets Non-Transferable Teacher: Escaping Out-of-Distribution Trap is All You Need

    Authors: Ziming Hong, Runnan Chen, Zengmao Wang, Bo Han, Bo Du, Tongliang Liu

    Abstract: Data-free knowledge distillation (DFKD) transfers knowledge from a teacher to a student without access to the real in-distribution (ID) data. Its common solution is to use a generator to synthesize fake data and use them as a substitute for real ID data. However, existing works typically assume teachers are trustworthy, leaving the robustness and security of DFKD from untrusted teachers largely unexp…

    Submitted 5 July, 2025; originally announced July 2025.

    Comments: Accepted by ICML 2025

  20. arXiv:2507.03698  [pdf, ps, other]

    cs.CV

    SAMed-2: Selective Memory Enhanced Medical Segment Anything Model

    Authors: Zhiling Yan, Sifan Song, Dingjie Song, Yiwei Li, Rong Zhou, Weixiang Sun, Zhennong Chen, Sekeun Kim, Hui Ren, Tianming Liu, Quanzheng Li, Xiang Li, Lifang He, Lichao Sun

    Abstract: Recent "segment anything" efforts show promise by learning from large-scale data, but adapting such models directly to medical images remains challenging due to the complexity of medical data, noisy annotations, and continual learning requirements across diverse modalities and anatomical structures. In this work, we propose SAMed-2, a new foundation model for medical image segmentation built upon…

    Submitted 4 July, 2025; originally announced July 2025.

    Comments: Accepted by MICCAI 2025

  21. arXiv:2507.03027  [pdf, ps, other]

    cs.CL

    The Book of Life approach: Enabling richness and scale for life course research

    Authors: Mark D. Verhagen, Benedikt Stroebl, Tiffany Liu, Lydia T. Liu, Matthew J. Salganik

    Abstract: For over a century, life course researchers have faced a choice between two dominant methodological approaches: qualitative methods that analyze rich data but are constrained to small samples, and quantitative survey-based methods that study larger populations but sacrifice data richness for scale. Two recent technological developments now enable us to imagine a hybrid approach that combines some…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: 25 pages, 4 figures

  22. arXiv:2507.01449  [pdf, ps, other]

    cs.CL

    LogitSpec: Accelerating Retrieval-based Speculative Decoding via Next Next Token Speculation

    Authors: Tianyu Liu, Qitan Lv, Hao Li, Xing Gao, Xiao Sun

    Abstract: Speculative decoding (SD), where a small draft model is employed to propose draft tokens in advance and then the target model validates them in parallel, has emerged as a promising technique for LLM inference acceleration. Many endeavors to improve SD are to eliminate the need for a draft model and generate draft tokens in a retrieval-based manner in order to further alleviate the drafting overhea…

    Submitted 2 July, 2025; originally announced July 2025.

  23. arXiv:2507.01381  [pdf, ps, other]

    cs.LG cs.AI

    Distributional Soft Actor-Critic with Diffusion Policy

    Authors: Tong Liu, Yinuo Wang, Xujie Song, Wenjun Zou, Liangfa Chen, Likun Wang, Bin Shuai, Jingliang Duan, Shengbo Eben Li

    Abstract: Reinforcement learning has been proven to be highly effective in handling complex control tasks. Traditional methods typically use unimodal distributions, such as Gaussian distributions, to model the output of value distributions. However, unimodal distributions often cause bias in value function estimation, leading to poor algorithm performance. This paper proposes a distributional rei…

    Submitted 10 July, 2025; v1 submitted 2 July, 2025; originally announced July 2025.

    Comments: Accepted IEEE ITSC 2025

  24. arXiv:2506.23717  [pdf, ps, other]

    cs.NE cs.AI cs.CV cs.LG

    Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation

    Authors: Xingting Yao, Qinghao Hu, Fei Zhou, Tielong Liu, Gang Li, Peisong Wang, Jian Cheng

    Abstract: Multi-bit spiking neural networks (SNNs) have recently become a research hotspot, pursuing energy-efficient and highly accurate AI. However, with more bits involved, the associated memory and computation demands escalate to the point where the performance improvements become disproportionate. Based on the insight that different layers demonstrate different importance and extra bits could be wast…

    Submitted 30 June, 2025; originally announced June 2025.

  25. arXiv:2506.23590  [pdf, ps, other]

    cs.CV

    CAI: Caption-Sensitive Attention Intervention for Mitigating Object Hallucination in Large Vision-Language Models

    Authors: Qiming Li, Zekai Ye, Xiaocheng Feng, Weihong Zhong, Libo Qin, Ruihan Chen, Baohang Li, Kui Jiang, Yaowei Wang, Ting Liu, Bing Qin

    Abstract: Although Large Vision-Language Models (LVLMs) have demonstrated powerful capabilities in interpreting visual information, they frequently produce content that deviates from visual information, leading to object hallucination. To tackle this, recent works mostly depend on expensive manual annotations and training cost, or significantly increase inference time. In this work, we observe that LVLMs' a…

    Submitted 30 June, 2025; originally announced June 2025.

  26. arXiv:2506.23482  [pdf, ps, other]

    cs.CV

    MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting

    Authors: Jun Huang, Ting Liu, Yihang Wu, Xiaochao Qu, Luoqi Liu, Xiaolin Hu

    Abstract: Advancements in generative models have enabled image inpainting models to generate content within specific regions of an image based on provided prompts and masks. However, existing inpainting methods often suffer from problems such as semantic misalignment, structural distortion, and style inconsistency. In this work, we present MTADiffusion, a Mask-Text Alignment diffusion model designed for obj…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: CVPR 2025

  27. arXiv:2506.23434  [pdf, ps, other]

    cs.CV cs.RO

    Towards foundational LiDAR world models with efficient latent flow matching

    Authors: Tianran Liu, Shengwen Zhao, Nicholas Rhinehart

    Abstract: LiDAR-based world models offer more structured and geometry-aware representations than their image-based counterparts. However, existing LiDAR world models are narrowly trained; each model excels only in the domain for which it was built. Can we develop LiDAR world models that exhibit strong transferability across multiple domains? We conduct the first systematic domain transfer study across three…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: 25 pages, 13 figures

  28. arXiv:2506.23353  [pdf, ps, other]

    cs.CV eess.IV

    Layer Decomposition and Morphological Reconstruction for Task-Oriented Infrared Image Enhancement

    Authors: Siyuan Chai, Xiaodong Guo, Tong Liu

    Abstract: Infrared images help improve the perception capabilities of autonomous driving in complex weather conditions such as fog, rain, and low light. However, infrared images often suffer from low contrast, especially in non-heat-emitting targets like bicycles, which significantly affects the performance of downstream high-level vision tasks. Furthermore, achieving contrast enhancement without amplifying…

    Submitted 29 June, 2025; originally announced June 2025.

  29. arXiv:2506.23351  [pdf, ps, other]

    cs.RO cs.AI cs.LG cs.MA

    Benchmarking Generalizable Bimanual Manipulation: RoboTwin Dual-Arm Collaboration Challenge at CVPR 2025 MEIS Workshop

    Authors: Tianxing Chen, Kaixuan Wang, Zhaohui Yang, Yuhao Zhang, Zanxin Chen, Baijun Chen, Wanxi Dong, Ziyuan Liu, Dong Chen, Tianshuo Yang, Haibao Yu, Xiaokang Yang, Yusen Qin, Zhiqiang Xie, Yao Mu, Ping Luo, Tian Nian, Weiliang Deng, Yiheng Ge, Yibin Liu, Zixuan Li, Dehui Wang, Zhixuan Liang, Haohui Xie, Rijie Zeng, et al. (74 additional authors not shown)

    Abstract: Embodied Artificial Intelligence (Embodied AI) is an emerging frontier in robotics, driven by the need for autonomous systems that can perceive, reason, and act in complex physical environments. While single-arm systems have shown strong task performance, collaborative dual-arm systems are essential for handling more intricate tasks involving rigid, deformable, and tactile-sensitive objects. To ad…

    Submitted 2 July, 2025; v1 submitted 29 June, 2025; originally announced June 2025.

    Comments: Challenge Webpage: https://robotwin-benchmark.github.io/cvpr-2025-challenge/

  30. arXiv:2506.23219  [pdf, ps, other]

    cs.CV cs.AI cs.CL

    UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding

    Authors: Jie Feng, Shengyuan Wang, Tianhui Liu, Yanxin Xi, Yong Li

    Abstract: Urban research involves a wide range of scenarios and tasks that require the understanding of multi-modal data. Current methods often focus on specific data types and lack a unified framework in the urban field for processing them comprehensively. The recent success of multi-modal large language models (MLLMs) presents a promising opportunity to overcome this limitation. In this paper, we introduce…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: Accepted by ICCV 2025

  31. arXiv:2506.22694  [pdf, ps, other]

    cs.CL

    VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs

    Authors: Raghavv Goel, Sudhanshu Agrawal, Mukul Gagrani, Junyoung Park, Yifan Zao, He Zhang, Tian Liu, Yiping Yang, Xin Yuan, Jiuyan Lu, Chris Lott, Mingu Lee

    Abstract: In this paper, we introduce a simple training-free technique to improve the performance of drafter-based speculative decoding (SpD) methods that incorporate a language modeling head (LM head) during the drafting process. Drafter-based speculative decoding leverages one or more smaller language models, a.k.a. drafters or draft models, to sample a draft sequence or tree consisting of multiple tokens, f…

    Submitted 3 July, 2025; v1 submitted 27 June, 2025; originally announced June 2025.

    Comments: 8 pages, 4 figures, 5 tables, accepted at ICML 2025 workshop on Efficient Systems for Foundational Models

  32. arXiv:2506.22299  [pdf, other]

    cs.LG cs.AI

    CoATA: Effective Co-Augmentation of Topology and Attribute for Graph Neural Networks

    Authors: Tao Liu, Longlong Lin, Yunfeng Yu, Xi Ou, Youan Zhang, Zhiqiu Ye, Tao Jia

    Abstract: Graph Neural Networks (GNNs) have garnered substantial attention due to their remarkable capability in learning graph representations. However, real-world graphs often exhibit substantial noise and incompleteness, which severely degrades the performance of GNNs. Existing methods typically address this issue through single-dimensional augmentation, focusing either on refining topology structures or…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: ICMR

    ACM Class: I.2

  33. arXiv:2506.21215  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    Unveiling Causal Reasoning in Large Language Models: Reality or Mirage?

    Authors: Haoang Chi, He Li, Wenjing Yang, Feng Liu, Long Lan, Xiaoguang Ren, Tongliang Liu, Bo Han

    Abstract: Causal reasoning capability is critical in advancing large language models (LLMs) toward strong artificial intelligence. While versatile LLMs appear to have demonstrated capabilities in understanding contextual causality and providing responses that obey the laws of causality, it remains unclear whether they perform genuine causal reasoning akin to humans. However, current evidence indicates the c…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 24 pages, accepted at NeurIPS 2024

    Journal ref: Advances in Neural Information Processing Systems, 2024, 37: 96640-96670

  34. arXiv:2506.20977  [pdf, ps, other]

    cs.CV cs.AI

    From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging

    Authors: Tao Liu, Dafeng Zhang, Gengchen Li, Shizhuo Liu, Yongqi Song, Senmao Li, Shiqi Yang, Boqian Li, Kai Wang, Yaxing Wang

    Abstract: Face aging has become a crucial task in computer vision, with applications ranging from entertainment to healthcare. However, existing methods struggle with achieving a realistic and seamless transformation across the entire lifespan, especially when handling large age gaps or extreme head poses. The core challenge lies in balancing age accuracy and identity preservation--what we refer to as the A…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: 30 pages, 12 figures

  35. arXiv:2506.20485  [pdf, ps, other]

    cs.RO

    EANS: Reducing Energy Consumption for UAV with an Environmental Adaptive Navigation Strategy

    Authors: Tian Liu, Han Liu, Boyang Li, Long Chen, Kai Huang

    Abstract: Unmanned Aerial Vehicles (UAVs) are limited by onboard energy. Refinement of the navigation strategy directly affects both the flight velocity and the trajectory based on the adjustment of key parameters in the UAV pipeline, thus reducing energy consumption. However, existing techniques tend to adopt static and conservative strategies in dynamic scenarios, leading to inefficient energy reduct…

    Submitted 25 June, 2025; originally announced June 2025.

  36. arXiv:2506.19425  [pdf, ps, other]

    cs.SE

    What Makes the Best Decomposition? Investigating Binary Decomposition Under FCG Variance

    Authors: Ang Jia, He Jiang, Zhilei Ren, Xiaochen Li, Ming Fan, Ting Liu

    Abstract: Binary decomposition, which decomposes binary files into modules, plays a critical role in binary reuse detection. Existing binary decomposition works either apply anchor-based methods by extending anchor functions to generate modules, or apply clustering-based methods by using clustering algorithms to group binary functions, all of which rely on the assumption that reused code shares similar function call relation…

    Submitted 24 June, 2025; originally announced June 2025.

  37. arXiv:2506.19019  [pdf, ps, other]

    cs.DC cs.AI

    Survey of HPC in US Research Institutions

    Authors: Peng Shu, Junhao Chen, Zhengliang Liu, Huaqin Zhao, Xinliang Li, Tianming Liu

    Abstract: The rapid growth of AI, data-intensive science, and digital twin technologies has driven an unprecedented demand for high-performance computing (HPC) across the research ecosystem. While national laboratories and industrial hyperscalers have invested heavily in exascale and GPU-centric architectures, university-operated HPC systems remain comparatively under-resourced. This survey presents a compr…

    Submitted 23 June, 2025; originally announced June 2025.

  38. arXiv:2506.18946  [pdf, ps, other]

    cs.CV cs.AI

    DiffRIS: Enhancing Referring Remote Sensing Image Segmentation with Pre-trained Text-to-Image Diffusion Models

    Authors: Zhe Dong, Yuzhe Sun, Tianzhu Liu, Yanfeng Gu

    Abstract: Referring remote sensing image segmentation (RRSIS) enables the precise delineation of regions within remote sensing imagery through natural language descriptions, serving critical applications in disaster response, urban development, and environmental monitoring. Despite recent advances, current approaches face significant challenges in processing aerial imagery due to complex object characterist…

    Submitted 22 June, 2025; originally announced June 2025.

  39. arXiv:2506.18398  [pdf, ps, other]

    cs.SE

    RPHunter: Unveiling Rug Pull Schemes in Crypto Token via Code-and-Transaction Fusion Analysis

    Authors: Hao Wu, Haijun Wang, Shangwang Li, Yin Wu, Ming Fan, Wuxia Jin, Ting Liu

    Abstract: Rug pull scams have emerged as a persistent threat to cryptocurrency, causing significant financial losses. A typical scenario involves scammers deploying honeypot contracts to attract investments, restricting token sales, and draining the funds, which leaves investors with worthless tokens. Current methods either rely on predefined patterns to detect code risks or utilize statistical transaction…

    Submitted 8 July, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

  40. arXiv:2506.17869  [pdf, ps, other]

    cs.CV cs.RO

    Cross-modal State Space Modeling for Real-time RGB-thermal Wild Scene Semantic Segmentation

    Authors: Xiaodong Guo, Zi'ang Lin, Luwen Hu, Zhihong Deng, Tong Liu, Wujie Zhou

    Abstract: The integration of RGB and thermal data can significantly improve semantic segmentation performance in wild environments for field robots. Nevertheless, multi-source data processing (e.g. Transformer-based approaches) imposes significant computational overhead, presenting challenges for resource-constrained systems. To resolve this critical limitation, we introduced CM-SSM, an efficient RGB-therma…

    Submitted 21 June, 2025; originally announced June 2025.

  41. arXiv:2506.17540  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    MTSIC: Multi-stage Transformer-based GAN for Spectral Infrared Image Colorization

    Authors: Tingting Liu, Yuan Liu, Jinhui Tang, Liyin Yuan, Chengyu Liu, Chunlai Li, Xiubao Sui, Qian Chen

    Abstract: Thermal infrared (TIR) images, acquired through thermal radiation imaging, are unaffected by variations in lighting conditions and atmospheric haze. However, TIR images inherently lack color and texture information, limiting downstream tasks and potentially causing visual fatigue. Existing colorization methods primarily rely on single-band images with limited spectral information and insufficient…

    Submitted 20 June, 2025; originally announced June 2025.

  42. arXiv:2506.16504  [pdf, ps, other]

    cs.CV cs.AI

    Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details

    Authors: Zeqiang Lai, Yunfei Zhao, Haolin Liu, Zibo Zhao, Qingxiang Lin, Huiwen Shi, Xianghui Yang, Mingxin Yang, Shuhui Yang, Yifei Feng, Sheng Zhang, Xin Huang, Di Luo, Fan Yang, Fang Yang, Lifu Wang, Sicong Liu, Yixuan Tang, Yulin Cai, Zebin He, Tian Liu, Yuhong Liu, Jie Jiang, Linus, Jingwei Huang, et al. (1 additional author not shown)

    Abstract: In this report, we present Hunyuan3D 2.5, a robust suite of 3D diffusion models aimed at generating high-fidelity and detailed textured 3D assets. Hunyuan3D 2.5 follows the two-stage pipeline of its previous version, Hunyuan3D 2.0, while demonstrating substantial advancements in both shape and texture generation. In terms of shape generation, we introduce a new shape foundation model -- LATTICE, which…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: Technical report

  43. arXiv:2506.16395  [pdf, ps, other]

    cs.CL

    OJBench: A Competition Level Code Benchmark For Large Language Models

    Authors: Zhexu Wang, Yiping Liu, Yejie Wang, Wenyang He, Bofei Gao, Muxi Diao, Yanxu Chen, Kelin Fu, Flood Sung, Zhilin Yang, Tianyu Liu, Weiran Xu

    Abstract: Recent advancements in large language models (LLMs) have demonstrated significant progress in math and code reasoning capabilities. However, existing code benchmarks are limited in their ability to evaluate the full spectrum of these capabilities, particularly at the competitive level. To bridge this gap, we introduce OJBench, a novel and challenging benchmark designed to assess the competitive-lev…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: 9 pages, 5 figures

  44. arXiv:2506.16211  [pdf, ps, other]

    cs.RO

    ControlVLA: Few-shot Object-centric Adaptation for Pre-trained Vision-Language-Action Models

    Authors: Puhao Li, Yingying Wu, Ziheng Xi, Wanlin Li, Yuzhe Huang, Zhiyuan Zhang, Yinghan Chen, Jianan Wang, Song-Chun Zhu, Tengyu Liu, Siyuan Huang

    Abstract: Learning real-world robotic manipulation is challenging, particularly when limited demonstrations are available. Existing methods for few-shot manipulation often rely on simulation-augmented data or pre-built modules like grasping and pose estimation, which struggle with sim-to-real gaps and lack extensibility. While large-scale imitation pre-training shows promise, adapting these general-purpose…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: Website: https://controlvla.github.io

  45. arXiv:2506.15790  [pdf, ps, other]

    cs.CR cs.SE

    ETrace: Event-Driven Vulnerability Detection in Smart Contracts via LLM-Based Trace Analysis

    Authors: Chenyang Peng, Haijun Wang, Yin Wu, Hao Wu, Ming Fan, Yitao Zhao, Ting Liu

    Abstract: With the advancing application of blockchain technology in various fields, ensuring the security and stability of smart contracts has emerged as a critical challenge. Current security analysis methodologies in vulnerability detection can be categorized into static analysis and dynamic analysis methods. However, these existing traditional vulnerability detection methods predominantly rely on analyzing…

    Submitted 8 July, 2025; v1 submitted 18 June, 2025; originally announced June 2025.

    Comments: 4 pages, 1 figure. Submitted to the 16th Asia-Pacific Symposium on Internetware (Internetware 2025)

    MSC Class: 68N01 ACM Class: D.2.0

  46. arXiv:2506.15647  [pdf, ps, other]

    cs.AI

    Exploring and Exploiting the Inherent Efficiency within Large Reasoning Models for Self-Guided Efficiency Enhancement

    Authors: Weixiang Zhao, Jiahe Guo, Yang Deng, Xingyu Sui, Yulin Hu, Yanyan Zhao, Wanxiang Che, Bing Qin, Tat-Seng Chua, Ting Liu

    Abstract: Recent advancements in large reasoning models (LRMs) have significantly enhanced language models' capabilities in complex problem-solving by emulating human-like deliberative thinking. However, these models often exhibit overthinking (i.e., the generation of unnecessarily verbose and redundant content), which hinders efficiency and inflates inference cost. In this work, we explore the representati…

    Submitted 18 June, 2025; originally announced June 2025.

  47. arXiv:2506.15442  [pdf, ps, other]

    cs.CV cs.AI

    Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material

    Authors: Team Hunyuan3D, Shuhui Yang, Mingxin Yang, Yifei Feng, Xin Huang, Sheng Zhang, Zebin He, Di Luo, Haolin Liu, Yunfei Zhao, Qingxiang Lin, Zeqiang Lai, Xianghui Yang, Huiwen Shi, Zibo Zhao, Bowen Zhang, Hongyu Yan, Lifu Wang, Sicong Liu, Jihong Zhang, Meng Chen, Liang Dong, Yiwen Jia, Yulin Cai, Jiaao Yu, et al. (28 additional authors not shown)

    Abstract: 3D AI-generated content (AIGC) is a passionate field that has significantly accelerated the creation of 3D models in gaming, film, and design. Despite the development of several groundbreaking models that have revolutionized 3D generation, the field remains largely accessible only to researchers, developers, and designers due to the complexities involved in collecting, processing, and training 3D…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: Github link: https://github.com/Tencent-Hunyuan/Hunyuan3D-2.1

  48. arXiv:2506.15349  [pdf, ps, other]

    cs.LG cs.CR

    Enhancing One-run Privacy Auditing with Quantile Regression-Based Membership Inference

    Authors: Terrance Liu, Matteo Boglioni, Yiwei Fu, Shengyuan Hu, Pratiksha Thaker, Zhiwei Steven Wu

    Abstract: Differential privacy (DP) auditing aims to provide empirical lower bounds on the privacy guarantees of DP mechanisms like DP-SGD. While some existing techniques require many training runs that are prohibitively costly, recent work introduces one-run auditing approaches that effectively audit DP-SGD in white-box settings while still being computationally efficient. However, in the more practical bl…

    Submitted 18 June, 2025; originally announced June 2025.

  49. arXiv:2506.15066  [pdf, ps, other]

    cs.AR cs.MA

    ChatModel: Automating Reference Model Design and Verification with LLMs

    Authors: Jianmin Ye, Tianyang Liu, Qi Tian, Shengchu Su, Zhe Jiang, Xi Wang

    Abstract: As the complexity of integrated circuit designs continues to escalate, functional verification becomes increasingly challenging. Reference models, critical for accelerating the verification process, are themselves becoming more intricate and time-consuming to develop. Despite the promise shown by large language models (LLMs) in code programming, effectively generating complex reference models…

    Submitted 24 June, 2025; v1 submitted 17 June, 2025; originally announced June 2025.

  50. arXiv:2506.14965  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

    Authors: Zhoujun Cheng, Shibo Hao, Tianyang Liu, Fan Zhou, Yutao Xie, Feng Yao, Yuexin Bian, Yonghao Zhuang, Nilabjo Dey, Yuheng Zha, Yi Gu, Kun Zhou, Yuqi Wang, Yuan Li, Richard Fan, Jianshu She, Chengqian Gao, Abulhair Saparov, Haonan Li, Taylor W. Killian, Mikhail Yurochkin, Zhengzhong Liu, Eric P. Xing, Zhiting Hu

    Abstract: Reinforcement learning (RL) has emerged as a promising approach to improve large language model (LLM) reasoning, yet most open efforts focus narrowly on math and code, limiting our understanding of its broader applicability to general reasoning. A key challenge lies in the lack of reliable, scalable RL reward signals across diverse reasoning domains. We introduce Guru, a curated RL reasoning corpu…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 38 pages, 9 figures. Under review