
Showing 1–50 of 242 results for author: Cheng, D

Searching in archive cs.
  1. arXiv:2507.17186  [pdf, ps, other]

    cs.CL

    FinGAIA: An End-to-End Benchmark for Evaluating AI Agents in Finance

    Authors: Lingfeng Zeng, Fangqi Lou, Zixuan Wang, Jiajie Xu, Jinyi Niu, Mengping Li, Yifan Dong, Qi Qi, Wei Zhang, Ziwei Yang, Jun Han, Ruilun Feng, Ruiqi Hu, Lejie Zhang, Zhengbo Feng, Yicheng Ren, Xin Guo, Zhaowei Liu, Dongpo Cheng, Weige Cai, Liwen Zhang

    Abstract: The booming development of AI agents presents unprecedented opportunities for automating complex tasks across various domains. However, their multi-step, multi-tool collaboration capabilities in the financial sector remain underexplored. This paper introduces FinGAIA, an end-to-end benchmark designed to evaluate the practical abilities of AI agents in the financial domain. FinGAIA comprises 407 me…

    Submitted 23 July, 2025; originally announced July 2025.

  2. arXiv:2507.09471  [pdf, ps, other]

    cs.CV

    CKAA: Cross-subspace Knowledge Alignment and Aggregation for Robust Continual Learning

    Authors: Lingfeng He, De Cheng, Zhiheng Ma, Huaijie Wang, Dingwen Zhang, Nannan Wang, Xinbo Gao

    Abstract: Continual Learning (CL) empowers AI models to continuously learn from sequential task streams. Recently, parameter-efficient fine-tuning (PEFT)-based CL methods have garnered increasing attention due to their superior performance. They typically allocate a unique sub-module for learning each task, with a task recognizer to select the appropriate sub-modules for testing images. However, due to the…

    Submitted 12 July, 2025; originally announced July 2025.

  3. arXiv:2507.06261  [pdf, ps, other]

    cs.CL cs.AI

    Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

    Authors: Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, Luke Marris, Sam Petulla, Colin Gaffney, Asaf Aharoni, Nathan Lintz, Tiago Cardal Pais, Henrik Jacobsson, Idan Szpektor, Nan-Jiang Jiang, Krishna Haridasan, Ahmed Omran, Nikunj Saunshi, Dara Bahri, Gaurav Mishra, Eric Chu , et al. (3284 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal unde…

    Submitted 22 July, 2025; v1 submitted 7 July, 2025; originally announced July 2025.

    Comments: 72 pages, 17 figures

  4. arXiv:2507.03898  [pdf, ps, other]

    cs.CV

    Deconfounding Causal Inference through Two-Branch Framework with Early-Forking for Sensor-Based Cross-Domain Activity Recognition

    Authors: Di Xiong, Lei Zhang, Shuoyuan Wang, Dongzhou Cheng, Wenbo Huang

    Abstract: Recently, domain generalization (DG) has emerged as a promising solution to mitigate the distribution-shift issue in sensor-based human activity recognition (HAR) scenarios. However, most existing DG-based works have merely focused on modeling statistical dependence between sensor data and activity labels, neglecting the importance of the intrinsic causal mechanism. Intuitively, every sensor input can be v…

    Submitted 5 July, 2025; originally announced July 2025.

    Comments: Accepted by Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)

    Journal ref: Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 9, 2, Article 56 (June 2025)

  5. arXiv:2507.02288  [pdf, ps, other]

    cs.CV cs.LG

    Prompt Disentanglement via Language Guidance and Representation Alignment for Domain Generalization

    Authors: De Cheng, Zhipeng Xu, Xinyang Jiang, Dongsheng Li, Nannan Wang, Xinbo Gao

    Abstract: Domain Generalization (DG) seeks to develop a versatile model capable of performing effectively on unseen target domains. Notably, recent advances in pre-trained Visual Foundation Models (VFMs), such as CLIP, have demonstrated considerable potential in enhancing the generalization capabilities of deep learning models. Despite the increasing attention toward VFM-based domain prompt tuning within DG…

    Submitted 2 July, 2025; originally announced July 2025.

  6. arXiv:2507.01881  [pdf]

    eess.IV cs.CV cs.LG

    A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs

    Authors: Niccolò McConnell, Pardeep Vasudev, Daisuke Yamada, Daryl Cheng, Mehran Azimbagirad, John McCabe, Shahab Aslani, Ahmed H. Shahin, Yukun Zhou, The SUMMIT Consortium, Andre Altmann, Yipeng Hu, Paul Taylor, Sam M. Janes, Daniel C. Alexander, Joseph Jacob

    Abstract: Low-dose computed tomography (LDCT) imaging employed in lung cancer screening (LCS) programs is increasing in uptake worldwide. LCS programs herald a generational opportunity to simultaneously detect cancer and non-cancer-related early-stage lung disease. Yet these efforts are hampered by a shortage of radiologists to interpret scans at scale. Here, we present TANGERINE, a computationally frugal,…

    Submitted 15 July, 2025; v1 submitted 2 July, 2025; originally announced July 2025.

  7. MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism

    Authors: Zheng Zhang, Donglin Yang, Yaqi Xia, Liang Ding, Dacheng Tao, Xiaobo Zhou, Dazhao Cheng

    Abstract: Recently, Mixture-of-Experts (MoE) has become one of the most popular techniques to scale pre-trained models to extraordinarily large sizes. Dynamic activation of experts allows for conditional computation, increasing the number of parameters of neural networks, which is critical for absorbing the vast amounts of knowledge available in many deep learning areas. However, despite the existing system…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: 11 pages, accepted at IPDPS 2023

    Journal ref: 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 167-177. IEEE, 2023

  8. MCFuser: High-Performance and Rapid Fusion of Memory-Bound Compute-Intensive Operators

    Authors: Zheng Zhang, Donglin Yang, Xiaobo Zhou, Dazhao Cheng

    Abstract: Operator fusion, a key technique to improve data locality and alleviate GPU memory bandwidth pressure, often fails to extend to the fusion of multiple compute-intensive operators due to saturated computation throughput. However, the dynamicity of tensor dimension sizes could potentially lead to these operators becoming memory-bound, necessitating the generation of fused kernels, a task hindered by…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: 12 pages, accepted at SC 2024

    Journal ref: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 2024

  9. arXiv:2506.14758  [pdf, ps, other]

    cs.CL

    Reasoning with Exploration: An Entropy Perspective

    Authors: Daixuan Cheng, Shaohan Huang, Xuekai Zhu, Bo Dai, Wayne Xin Zhao, Zhenliang Zhang, Furu Wei

    Abstract: Balancing exploration and exploitation is a central goal in reinforcement learning (RL). Despite recent advances in enhancing language model (LM) reasoning, most methods lean toward exploitation, and increasingly encounter performance plateaus. In this work, we revisit entropy -- a signal of exploration in RL -- and examine its relationship to exploratory reasoning in LMs. Through empirical analys…

    Submitted 17 June, 2025; originally announced June 2025.

  10. arXiv:2506.13066  [pdf, ps, other]

    cs.CL

    FinLMM-R1: Enhancing Financial Reasoning in LMM through Scalable Data and Reward Design

    Authors: Kai Lan, Jiayong Zhu, Jiangtong Li, Dawei Cheng, Guang Chen, Changjun Jiang

    Abstract: Large Multimodal Models (LMMs) demonstrate significant cross-modal reasoning capabilities. However, financial applications face challenges due to the lack of high-quality multimodal reasoning datasets and the inefficiency of existing training paradigms for reasoning enhancement. To address these issues, we propose an integrated framework, FinLMM-R1, combining an automated and scalable pipeline for…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: 26 pages, 16 figures

  11. arXiv:2506.13055  [pdf, ps, other]

    cs.CL

    CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model

    Authors: Jiangtong Li, Yiyun Zhu, Dawei Cheng, Zhijun Ding, Changjun Jiang

    Abstract: Multimodal Large Language Models (MLLMs) have rapidly evolved with the growth of Large Language Models (LLMs) and are now applied in various fields. In finance, the integration of diverse modalities such as text, charts, and tables is crucial for accurate and efficient decision-making. Therefore, an effective evaluation system that incorporates these data types is essential for advancing financial…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: 22 pages, 9 figures

  12. arXiv:2506.12351  [pdf, ps, other]

    cs.CV

    EKPC: Elastic Knowledge Preservation and Compensation for Class-Incremental Learning

    Authors: Huaijie Wang, De Cheng, Lingfeng He, Yan Li, Jie Li, Nannan Wang, Xinbo Gao

    Abstract: Class-Incremental Learning (CIL) aims to enable AI models to continuously learn from sequentially arriving data of different classes over time while retaining previously acquired knowledge. Recently, Parameter-Efficient Fine-Tuning (PEFT) methods, like prompt pool-based approaches and adapter tuning, have shown great attraction in CIL. However, these methods either introduce additional parameters…

    Submitted 14 June, 2025; originally announced June 2025.

  13. arXiv:2506.12037  [pdf, ps, other]

    cs.LG cs.AI

    How to Train a Model on a Cheap Cluster with Low Cost using Block Coordinate Descent

    Authors: Zeyu Liu, Yunquan Zhang, Boyang Zhang, Guoyong Jiang, Daning Cheng

    Abstract: Training large language models typically demands extensive GPU memory and substantial financial investment, which poses a barrier for many small- to medium-sized teams. In this paper, we present a full-parameter pre-training framework based on block coordinate descent (BCD), augmented with engineering optimizations, to efficiently train large models on affordable RTX 4090 GPU clusters. BCD ensures…

    Submitted 22 May, 2025; originally announced June 2025.

    Comments: under review

  14. arXiv:2506.10407  [pdf, ps, other]

    eess.SY cs.AI cs.CV

    Semi-Tensor-Product Based Convolutional Neural Networks

    Authors: Daizhan Cheng

    Abstract: The semi-tensor product (STP) of vectors is a generalization of the conventional inner product of vectors, which allows the factor vectors to be of different dimensions. This paper proposes a domain-based convolutional product (CP). Combining the domain-based CP with the STP of vectors, a new CP is proposed. Since there is no zero or any other padding, it avoids the junk information caused by padding. Using i…

    Submitted 12 June, 2025; originally announced June 2025.

  15. arXiv:2505.19762  [pdf, other]

    cs.AI

    Language Model-Enhanced Message Passing for Heterophilic Graph Learning

    Authors: Wenjun Wang, Dawei Cheng

    Abstract: Traditional graph neural networks (GNNs), which rely on homophily-driven message passing, struggle with heterophilic graphs where connected nodes exhibit dissimilar features and different labels. While existing methods address heterophily through graph structure refinement or adaptation of neighbor aggregation functions, they often overlook the semantic potential of node text, rely on suboptimal m…

    Submitted 26 May, 2025; originally announced May 2025.

  16. arXiv:2505.18697  [pdf, ps, other]

    cs.LG cs.AI

    Can LLMs Alleviate Catastrophic Forgetting in Graph Continual Learning? A Systematic Study

    Authors: Ziyang Cheng, Zhixun Li, Yuhan Li, Yixin Song, Kangyi Zhao, Dawei Cheng, Jia Li, Jeffrey Xu Yu

    Abstract: Nowadays, real-world data, including graph-structured data, often arrives in a streaming manner, which means that learning systems need to continuously acquire new knowledge without forgetting previously learned information. Although substantial existing works attempt to address catastrophic forgetting in graph machine learning, they are all based on training from scratch with streaming data. With…

    Submitted 24 May, 2025; originally announced May 2025.

  17. arXiv:2505.16708  [pdf, ps, other]

    cs.IR

    A Novel Generative Model with Causality Constraint for Mitigating Biases in Recommender Systems

    Authors: Jianfeng Deng, Qingfeng Chen, Debo Cheng, Jiuyong Li, Lin Liu, Shichao Zhang

    Abstract: Accurately predicting counterfactual user feedback is essential for building effective recommender systems. However, latent confounding bias can obscure the true causal relationship between user feedback and item exposure, ultimately degrading recommendation performance. Existing causal debiasing approaches often rely on strong assumptions, such as the availability of instrumental variables (IVs) o…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 11 pages

  18. arXiv:2505.13997  [pdf, ps, other]

    cs.CV

    StPR: Spatiotemporal Preservation and Routing for Exemplar-Free Video Class-Incremental Learning

    Authors: Huaijie Wang, De Cheng, Guozhang Li, Zhipeng Xu, Lingfeng He, Jie Li, Nannan Wang, Xinbo Gao

    Abstract: Video Class-Incremental Learning (VCIL) seeks to develop models that continuously learn new action categories over time without forgetting previously acquired knowledge. Unlike traditional Class-Incremental Learning (CIL), VCIL introduces the added complexity of spatiotemporal structures, making it particularly challenging to mitigate catastrophic forgetting while effectively capturing both frame-…

    Submitted 20 May, 2025; originally announced May 2025.

  19. arXiv:2504.19244  [pdf, other]

    cs.CV

    Semantic-Aligned Learning with Collaborative Refinement for Unsupervised VI-ReID

    Authors: De Cheng, Lingfeng He, Nannan Wang, Dingwen Zhang, Xinbo Gao

    Abstract: Unsupervised visible-infrared person re-identification (USL-VI-ReID) seeks to match pedestrian images of the same individual across different modalities without human annotations for model learning. Previous methods unify pseudo-labels of cross-modality images through label association algorithms and then design contrastive learning framework for global feature learning. However, these methods ove…

    Submitted 5 May, 2025; v1 submitted 27 April, 2025; originally announced April 2025.

    Comments: Accepted by IJCV 2025

  20. arXiv:2504.14514  [pdf, ps, other]

    cs.LG cs.AI eess.SY

    On Dimension-Free Transformer: An Application of STP to AI

    Authors: Daizhan Cheng

    Abstract: The matrix expressions for all parts of a transformer are first described. Based on the semi-tensor product (STP) of matrices, hypervectors are reconsidered and the linear transformation over hypervectors is constructed using projection. Its properties and calculation formulas are obtained. Using the projection-based transformation of hypervectors (PBTH), the framework of dimension-free transform…

    Submitted 20 April, 2025; originally announced April 2025.

  21. arXiv:2504.12601  [pdf, ps, other]

    cs.LG math.OC math.PR

    Stochastic Gradient Descent in Non-Convex Problems: Asymptotic Convergence with Relaxed Step-Size via Stopping Time Methods

    Authors: Ruinan Jin, Difei Cheng, Hong Qiao, Xin Shi, Shaodong Liu, Bo Zhang

    Abstract: Stochastic Gradient Descent (SGD) is widely used in machine learning research. Previous convergence analyses of SGD under the vanishing step-size setting typically require Robbins-Monro conditions. However, in practice, a wider variety of step-size schemes are frequently employed, yet existing convergence results remain limited and often rely on strong assumptions. This paper bridges this gap by i…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 42 pages

    MSC Class: 40G15 ACM Class: G.1.0

  22. arXiv:2504.12332  [pdf, other]

    cs.CL cs.CY

    Can the capability of Large Language Models be described by human ability? A Meta Study

    Authors: Mingrui Zan, Yunquan Zhang, Boyang Zhang, Fangming Liu, Daning Cheng

    Abstract: Users of Large Language Models (LLMs) often perceive these models as intelligent entities with human-like capabilities. However, the extent to which LLMs' capabilities truly approximate human abilities remains a topic of debate. In this paper, to characterize the capabilities of LLMs in relation to human capabilities, we collected performance data from over 80 models across 37 evaluation benchmark…

    Submitted 13 April, 2025; originally announced April 2025.

  23. arXiv:2503.19786  [pdf, other]

    cs.CL cs.AI

    Gemma 3 Technical Report

    Authors: Gemma Team, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, Tatiana Matejovicova, Alexandre Ramé, Morgane Rivière, Louis Rouillard, Thomas Mesnard, Geoffrey Cideron, Jean-bastien Grill, Sabela Ramos, Edouard Yvinec, Michelle Casbon, Etienne Pot, Ivo Penchev, Gaël Liu, Francesco Visin, Kathleen Kenealy, Lucas Beyer, Xiaohai Zhai, Anton Tsitsulin , et al. (191 additional authors not shown)

    Abstract: We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, a wider coverage of languages and longer context - at least 128K tokens. We also change the architecture of the model to reduce the KV-cache memory that tends to explode with long context. This is achie…

    Submitted 25 March, 2025; originally announced March 2025.

  24. arXiv:2503.12803  [pdf, ps, other]

    cs.CL cs.LG

    Leveraging Deep Neural Networks for Aspect-Based Sentiment Classification

    Authors: Chen Li, Debo Cheng, Yasuhiko Morimoto

    Abstract: Aspect-based sentiment analysis seeks to determine sentiment with a high level of detail. While graph convolutional networks (GCNs) are commonly used for extracting sentiment features, their straightforward use in syntactic feature extraction can lead to a loss of crucial information. This paper presents a novel edge-enhanced GCN, called EEGCN, which improves performance by preserving feature inte…

    Submitted 17 March, 2025; originally announced March 2025.

  25. arXiv:2503.06084  [pdf, other]

    cs.CV

    Exploring Interpretability for Visual Prompt Tuning with Hierarchical Concepts

    Authors: Yubin Wang, Xinyang Jiang, De Cheng, Xiangqian Zhao, Zilong Wang, Dongsheng Li, Cairong Zhao

    Abstract: Visual prompt tuning offers significant advantages for adapting pre-trained visual foundation models to specific tasks. However, current research provides limited insight into the interpretability of this approach, which is essential for enhancing AI reliability and enabling AI-driven knowledge discovery. In this paper, rather than learning abstract prompt embeddings, we propose the first framewor…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: 10 pages, 9 figures

  26. arXiv:2503.04548  [pdf, other]

    cs.CL

    An Empirical Study on Eliciting and Improving R1-like Reasoning Models

    Authors: Zhipeng Chen, Yingqian Min, Beichen Zhang, Jie Chen, Jinhao Jiang, Daixuan Cheng, Wayne Xin Zhao, Zheng Liu, Xu Miao, Yang Lu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen

    Abstract: In this report, we present the third technical report on the development of slow-thinking models as part of the STILL project. As the technical pathway becomes clearer, scaling RL training has become a central technique for implementing such reasoning models. We systematically experiment with and document the effects of various factors influencing RL training, conducting experiments on both base m…

    Submitted 6 March, 2025; originally announced March 2025.

    Comments: Technical Report on Slow Thinking with LLMs: Part III

  27. Effective High-order Graph Representation Learning for Credit Card Fraud Detection

    Authors: Yao Zou, Dawei Cheng

    Abstract: Credit card fraud imposes significant costs on both cardholders and issuing banks. Fraudsters often disguise their crimes, such as using legitimate transactions through several benign users to bypass anti-fraud detection. Existing graph neural network (GNN) models struggle with learning features of camouflaged, indirect multi-hop transactions due to their inherent over-smoothing issues in deep mul…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: 9 pages, 5 figures, accepted at IJCAI 2024

    MSC Class: 68T07; 91B06 ACM Class: I.2.6; H.2.8

    Journal ref: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI 2024), pages 7581-7589

  28. arXiv:2502.18834  [pdf, other]

    cs.CE cs.LG

    FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting

    Authors: Yifan Hu, Yuante Li, Peiyuan Liu, Yuxia Zhu, Naiqi Li, Tao Dai, Shu-tao Xia, Dawei Cheng, Changjun Jiang

    Abstract: Financial time series (FinTS) record the behavior of human-brain-augmented decision-making, capturing valuable historical information that can be leveraged for profitable investment strategies. Not surprisingly, this area has attracted considerable attention from researchers, who have proposed a wide range of methods based on various backbones. However, the evaluation of the area often exhibits th…

    Submitted 26 February, 2025; originally announced February 2025.

  29. arXiv:2502.15802  [pdf, other]

    cs.LG cs.AI cs.IT

    A General Error-Theoretical Analysis Framework for Constructing Compression Strategies

    Authors: Boyang Zhang, Daning Cheng, Yunquan Zhang, Meiqi Tu, Fangmin Liu, Jiake Tian

    Abstract: The exponential growth in parameter size and computational complexity of deep models poses significant challenges for efficient deployment. The core problem of existing compression methods is that different layers of the model have significant differences in their tolerance to compression levels. For instance, the first layer of a model can typically sustain a higher compression level compared to…

    Submitted 19 February, 2025; originally announced February 2025.

    Comments: Under Review

  30. arXiv:2502.15686  [pdf]

    cs.DB cs.CL

    V-SQL: A View-based Two-stage Text-to-SQL Framework

    Authors: Zeshun You, Jiebin Yao, Dong Cheng, Zhiwei Wen, Zhiliang Lu, Xianyi Shen

    Abstract: The text-to-SQL task aims to convert natural language into Structured Query Language (SQL) without bias. Recently, text-to-SQL methods based on large language models (LLMs) have garnered significant attention. The core of mainstream text-to-SQL frameworks is schema linking, which aligns user queries with relevant tables and columns in the database. Previous methods focused on schema linking while…

    Submitted 16 December, 2024; originally announced February 2025.

    Comments: 10 pages, 5 figures

  31. arXiv:2502.13581  [pdf, ps, other]

    cs.IR cs.LG

    ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

    Authors: Yupeng Hou, Jianmo Ni, Zhankui He, Noveen Sachdeva, Wang-Cheng Kang, Ed H. Chi, Julian McAuley, Derek Zhiyuan Cheng

    Abstract: Generative recommendation (GR) is an emerging paradigm where user actions are tokenized into discrete token patterns and autoregressively generated as predictions. However, existing GR models tokenize each action independently, assigning the same fixed tokens to identical actions across all sequences without considering contextual relationships. This lack of context-awareness can lead to suboptima…

    Submitted 6 June, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: ICML 2025 (Spotlight)

  32. arXiv:2502.11133  [pdf, other]

    cs.LG cs.MA

    MasRouter: Learning to Route LLMs for Multi-Agent Systems

    Authors: Yanwei Yue, Guibin Zhang, Boyang Liu, Guancheng Wan, Kun Wang, Dawei Cheng, Yiyan Qi

    Abstract: Multi-agent systems (MAS) powered by Large Language Models (LLMs) have been demonstrated to push the boundaries of LLM capabilities, yet they often incur significant costs and face challenges in dynamic LLM selection. Current LLM routing methods effectively reduce overhead in single-agent scenarios by customizing LLM selection for each query, but they overlook the critical decisions regarding coll…

    Submitted 16 February, 2025; originally announced February 2025.

  33. arXiv:2502.06707  [pdf, other]

    cs.CE

    FinMamba: Market-Aware Graph Enhanced Multi-Level Mamba for Stock Movement Prediction

    Authors: Yifan Hu, Peiyuan Liu, Yuante Li, Dawei Cheng, Naiqi Li, Tao Dai, Jigang Bao, Shu-Tao Xia

    Abstract: Recently, combining stock features with inter-stock correlations has become a common and effective approach for stock movement prediction. However, financial data presents significant challenges due to its low signal-to-noise ratio and the dynamic complexity of the market, which give rise to two key limitations in existing methods. First, the relationships between stocks are highly influenced by m…

    Submitted 10 February, 2025; originally announced February 2025.

  34. arXiv:2502.05540  [pdf, ps, other]

    cs.CV

    Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector

    Authors: Qirui Wu, Shizhou Zhang, De Cheng, Yinghui Xing, Di Xu, Peng Wang, Yanning Zhang

    Abstract: Catastrophic forgetting is a critical challenge for incremental object detection (IOD). Most existing methods treat the detector monolithically, relying on instance replay or knowledge distillation without analyzing component-specific forgetting. Through dissection of Faster R-CNN, we reveal a key insight: Catastrophic forgetting is predominantly localized to the RoI Head classifier, while regres…

    Submitted 29 May, 2025; v1 submitted 8 February, 2025; originally announced February 2025.

    Comments: Accepted in ICML2025

  35. arXiv:2502.00386  [pdf, other]

    cs.CV

    Efficient Adaptive Label Refinement for Label Noise Learning

    Authors: Wenzhen Zhang, Debo Cheng, Guangquan Lu, Bo Zhou, Jiaye Li, Shichao Zhang

    Abstract: Deep neural networks are highly susceptible to overfitting noisy labels, which leads to degraded performance. Existing methods address this issue by employing manually defined criteria, aiming to achieve optimal partitioning in each iteration to avoid fitting noisy labels while thoroughly learning clean samples. However, this often results in overly complex and difficult-to-train models. To addres…

    Submitted 1 February, 2025; originally announced February 2025.

  36. arXiv:2501.13041  [pdf, other]

    cs.LG

    TimeFilter: Patch-Specific Spatial-Temporal Graph Filtration for Time Series Forecasting

    Authors: Yifan Hu, Guibin Zhang, Peiyuan Liu, Disen Lan, Naiqi Li, Dawei Cheng, Tao Dai, Shu-Tao Xia, Shirui Pan

    Abstract: Time series forecasting methods generally fall into two main categories: Channel Independent (CI) and Channel Dependent (CD) strategies. While CI overlooks important covariate relationships, CD captures all dependencies without distinction, introducing noise and reducing generalization. Recent advances in Channel Clustering (CC) aim to refine dependency modeling by grouping channels with similar c…

    Submitted 20 May, 2025; v1 submitted 22 January, 2025; originally announced January 2025.

  37. arXiv:2501.12642  [pdf]

    cs.CY

    Training Data Attribution (TDA): Examining Its Adoption & Use Cases

    Authors: Deric Cheng, Juhan Bae, Justin Bullock, David Kristofferson

    Abstract: This report investigates Training Data Attribution (TDA) and its potential importance to and tractability for reducing extreme risks from AI. First, we discuss the plausibility and amount of effort it would take to bring existing TDA research efforts from their current state, to an efficient and accurate tool for TDA inference that can be run on frontier-scale LLMs. Next, we discuss the numerous r…

    Submitted 22 January, 2025; originally announced January 2025.

  38. arXiv:2501.05884  [pdf, other]

    cs.CV

    Text-to-Edit: Controllable End-to-End Video Ad Creation via Multimodal LLMs

    Authors: Dabing Cheng, Haosen Zhan, Xingchen Zhao, Guisheng Liu, Zemin Li, Jinghui Xie, Zhao Song, Weiguo Feng, Bingyue Peng

    Abstract: The exponential growth of short-video content has ignited a surge in the necessity for efficient, automated solutions to video editing, with challenges arising from the need to understand videos and tailor the editing according to user requirements. Addressing this need, we propose an innovative end-to-end foundational framework, ultimately actualizing precise control over the final video content…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: 16 pages, conference

  39. arXiv:2501.05281  [pdf, other]

    cs.CV cs.LG

    Comparison Study: Glacier Calving Front Delineation in Synthetic Aperture Radar Images With Deep Learning

    Authors: Nora Gourmelon, Konrad Heidler, Erik Loebel, Daniel Cheng, Julian Klink, Anda Dong, Fei Wu, Noah Maul, Moritz Koch, Marcel Dreier, Dakota Pyles, Thorsten Seehaus, Matthias Braun, Andreas Maier, Vincent Christlein

    Abstract: Calving front position variation of marine-terminating glaciers is an indicator of ice mass loss and a crucial parameter in numerical glacier models. Deep Learning (DL) systems can automatically extract this position from Synthetic Aperture Radar (SAR) imagery, enabling continuous, weather- and illumination-independent, large-scale monitoring. This study presents the first comparison of DL systems…

    Submitted 9 January, 2025; originally announced January 2025.

    ACM Class: I.4.6; I.5.4

  40. arXiv:2412.20563  [pdf, other]

    cs.CL

    Counterfactual Samples Constructing and Training for Commonsense Statements Estimation

    Authors: Chong Liu, Zaiwen Feng, Lin Liu, Zhenyun Deng, Jiuyong Li, Ruifang Zhai, Debo Cheng, Li Qin

    Abstract: Plausibility Estimation (PE) plays a crucial role in enabling language models to objectively comprehend the real world. While large language models (LLMs) demonstrate remarkable capabilities in PE tasks, they sometimes produce trivial commonsense errors due to the complexity of commonsense knowledge. They lack two key traits of an ideal PE model: a) Language-explainable: relying on critical word se…

    Submitted 29 December, 2024; originally announced December 2024.

    Comments: 14 pages, 4 figures

  41. arXiv:2412.19507  [pdf, other]

    cs.AI

    Hybrid Local Causal Discovery

    Authors: Zhaolong Ling, Honghui Peng, Yiwen Zhang, Debo Cheng, Xingyu Wu, Peng Zhou, Kui Yu

    Abstract: Local causal discovery aims to learn and distinguish the direct causes and effects of a target variable from observed data. Existing constraint-based local causal discovery methods use AND or OR rules in constructing the local causal skeleton, but using either rule alone is prone to produce cascading errors in the learned local causal skeleton, and thus impacting the inference of local causal rela…

    Submitted 12 May, 2025; v1 submitted 27 December, 2024; originally announced December 2024.

    Comments: This paper has been accepted for publication in the Proceedings of the 34th International Joint Conference on Artificial Intelligence (IJCAI 2025)

  42. Semi-supervised Credit Card Fraud Detection via Attribute-Driven Graph Representation

    Authors: Sheng Xiang, Mingzhi Zhu, Dawei Cheng, Enxia Li, Ruihui Zhao, Yi Ouyang, Ling Chen, Yefeng Zheng

    Abstract: Credit card fraud incurs a considerable cost for both cardholders and issuing banks. Contemporary methods apply machine learning-based classifiers to detect fraudulent behavior from labeled transaction records. But labeled data are usually a small proportion of billions of real transactions due to expensive labeling costs, which implies that they do not fully exploit many natural features from unla… ▽ More

    Submitted 24 December, 2024; originally announced December 2024.

    Comments: 9 pages, 5 figures, AAAI 2023, code: https://github.com/AI4Risk/antifraud

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 12. 2023

  43. arXiv:2412.17213  [pdf, other]

    cs.CR

    Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool

    Authors: Jiangtong Li, Dungy Liu, Dawei Cheng, Changchun Jiang

    Abstract: Graph Neural Networks (GNNs) have achieved significant success in various real-world applications, including social networks, finance systems, and traffic management. Recent research highlights their vulnerability to backdoor attacks in node classification, where GNNs trained on a poisoned graph misclassify a test node only when specific triggers are attached. These stu… ▽ More

    Submitted 22 December, 2024; originally announced December 2024.

    Comments: 13 pages, 5 figures

  44. arXiv:2412.14689  [pdf, other]

    cs.CL cs.AI cs.LG

    How to Synthesize Text Data without Model Collapse?

    Authors: Xuekai Zhu, Daixuan Cheng, Hengli Li, Kaiyan Zhang, Ermo Hua, Xingtai Lv, Ning Ding, Zhouhan Lin, Zilong Zheng, Bowen Zhou

    Abstract: Model collapse in synthetic data indicates that iterative training on self-generated data leads to a gradual decline in performance. With the proliferation of AI models, synthetic data will fundamentally reshape the web data ecosystem. Future GPT-$\{n\}$ models will inevitably be trained on a blend of synthetic and human-produced data. In this paper, we focus on two questions: what is the impact o… ▽ More

    Submitted 28 May, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: Accepted at ICML 2025

  45. arXiv:2412.08810  [pdf, other]

    cs.DB cs.AI

    Efficient Dynamic Attributed Graph Generation

    Authors: Fan Li, Xiaoyang Wang, Dawei Cheng, Cong Chen, Ying Zhang, Xuemin Lin

    Abstract: Data generation is a fundamental research problem in data management due to its diverse use cases, ranging from testing database engines to data-specific applications. However, real-world entities often involve complex interactions that cannot be effectively modeled by traditional tabular data. Therefore, graph data generation has attracted increasing attention recently. Although various graph gen… ▽ More

    Submitted 11 December, 2024; originally announced December 2024.

    Comments: 14 pages, 10 figures. Accepted by IEEE ICDE 2025

  46. arXiv:2412.07605  [pdf, other]

    cs.LG

    Fast Track to Winning Tickets: Repowering One-Shot Pruning for Graph Neural Networks

    Authors: Yanwei Yue, Guibin Zhang, Haoran Yang, Dawei Cheng

    Abstract: Graph Neural Networks (GNNs) demonstrate superior performance in various graph learning tasks, yet their wider real-world application is hindered by the computational overhead when applied to large-scale graphs. To address the issue, the Graph Lottery Hypothesis (GLT) has been proposed, advocating the identification of subgraphs and subnetworks, \textit{i.e.}, winning tickets, without compromising… ▽ More

    Submitted 10 December, 2024; originally announced December 2024.

    Comments: AAAI 2025

  47. arXiv:2412.06868  [pdf, other]

    cs.CV cs.AI

    Compression for Better: A General and Stable Lossless Compression Framework

    Authors: Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, Wenguang Chen

    Abstract: This work focuses on how to achieve stable and lossless model compression, aiming to reduce model complexity and enhance efficiency without sacrificing performance due to compression errors. A key challenge is effectively leveraging compression errors and defining the boundaries for lossless compression to minimize model loss, i.e., compression for better. Currently, there is no systematic approach to de… ▽ More

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: Under Review

  48. arXiv:2412.06867  [pdf, other]

    cs.LG cs.AI cs.CC

    Lossless Model Compression via Joint Low-Rank Factorization Optimization

    Authors: Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, Jiake Tian

    Abstract: Low-rank factorization is a popular model compression technique that minimizes the error $\delta$ between approximated and original weight matrices. Despite achieving performance close to the original models when $\delta$ is optimized, a performance discrepancy remains due to the separate optimization processes for low-rank factorization and model performance, resulting in unavoidable losses. We address th… ▽ More

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: Under Review

  49. arXiv:2412.06865  [pdf, other]

    cs.LG cs.AI

    FP=xINT: A Low-Bit Series Expansion Algorithm for Post-Training Quantization

    Authors: Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu

    Abstract: Post-Training Quantization (PTQ) converts pre-trained Full-Precision (FP) models into quantized versions without training. While existing methods reduce size and computational costs, they also significantly degrade performance and quantization efficiency at extremely low settings due to quantization noise. We introduce a deep model series expansion framework to address this issue, enabling rapid a… ▽ More

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: Under Review

  50. arXiv:2412.05722  [pdf, other]

    cs.CV

    Evaluating Hallucination in Text-to-Image Diffusion Models with Scene-Graph based Question-Answering Agent

    Authors: Ziyuan Qin, Dongjie Cheng, Haoyu Wang, Huahui Yi, Yuting Shao, Zhiyuan Fan, Kang Li, Qicheng Lao

    Abstract: Contemporary Text-to-Image (T2I) models frequently depend on qualitative human evaluations to assess the consistency between synthesized images and the text prompts. There is a demand for quantitative and automatic evaluation tools, given that human evaluation lacks reproducibility. We believe that an effective T2I evaluation metric should accomplish the following: detect instances where the gener… ▽ More

    Submitted 7 December, 2024; originally announced December 2024.