AiraXiv - Papers

2510.0004

A synergistic multi-specialist knowledge reasoning model for molecular science

Pengfei Liu, Shuang Ge, Jun Tao, Zhixiang Ren

The rapid evolution of artificial intelligence in molecular science necessitates a shift from data-driven predictions to knowledge-guided reasoning. Existing molecular models are predominantly proprietary, lacking general molecular intelligence and generalizability. To address this, we propose a task-adaptive large reasoning model that integrates molecular scientific logic to emulate the thinking of molecular scientists, with capabilities for reasoning and reflection. Our approach incorporates multi-specialist modules to provide versatile molecular expertise and a chain-of-thought (CoT) framework enhanced by reinforcement learning infused with molecular knowledge, enabling structured and reflective reasoning. The model outperforms over 20 state-of-the-art multi-task large language models (LLMs) across 10 molecular tasks on 47 metrics, including property prediction, molecule generation, and reaction prediction. It achieves a 50.3% improvement over the base model while ensuring interpretability. It can bridge data-driven and knowledge-integrated approaches for intelligent molecular design.

👤 Human Methodology

📄 View

2510.0003

AI-Driven Resilience and Synergistic Optimization in Green Computing Networks: A Scientific Paradigm Approach

This paper investigates the resilience mechanisms and synergistic optimization strategies in green computing networks under the AI scientific paradigm. As computing infrastructure increasingly demands both performance and sustainability, traditional optimization approaches face challenges in balancing energy efficiency with network reliability. We propose an AI-driven framework that integrates reinforcement learning and multi-agent systems to dynamically optimize resource allocation while maintaining network resilience. Our approach combines theoretical economic models with practical AI engineering capabilities to analyze real-world computing workloads. Experimental results demonstrate that our method achieves a 27% reduction in energy consumption while improving network fault tolerance by 34% compared to baseline approaches. This work contributes to the emerging field of AI for Science by showcasing how automated scientific discovery methods can address complex sustainability challenges in computing infrastructure.

👤 Human Methodology

🎯 ICAIS2025 Submission

📄 View

2510.0002

Enhancing Small Language Models with Gradient Noise Injection

Training small language models is challenging due to their limited capacity to capture complex patterns and their susceptibility to overfitting. To address these issues, we investigate gradient noise injection as a regularization strategy, building on prior work while introducing a noise schedule that decays exponentially over training. Unlike existing techniques, our method explicitly controls the trade-off between exploration and stability during optimization. We compare the exponential decay schedule with linear and adaptive variants, demonstrating empirically that the exponential schedule yields superior convergence and generalization. Extensive experiments on diverse text corpora, including shakespeare_char, enwik8, text8, and larger benchmark datasets, show consistent improvements in training dynamics, validation loss, and final performance. We report error bars and statistical significance tests to ensure robustness of the results. Detailed implementation information, including model architectures, hyperparameter settings, dataset sizes, and optimization strategies, is provided to support reproducibility, and we release our code and trained models publicly. Furthermore, we compare gradient noise injection with other regularization methods such as dropout, weight decay, and data augmentation, both in isolation and in combination, revealing complementary effects on training stability and generalization. Finally, we analyze the computational cost of gradient noise injection relative to these baselines, highlighting its practical efficiency in resource-constrained environments. Together, these contributions position gradient noise injection as a theoretically grounded, empirically validated, and computationally practical method for improving the robustness of small language models.

🤖 AI Empirical

🎯 ICAIS2025 Submission

📄 View
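The exponentially decaying noise schedule described in the abstract of 2510.0002 above can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the schedule form `sigma0 * exp(-decay * step)`, the hyperparameter values, and the toy quadratic objective are all assumptions chosen only to show the idea of adding zero-mean Gaussian noise to gradients with a shrinking scale.

```python
import math
import random


def noise_std(step, sigma0=0.1, decay=0.01):
    """Assumed schedule: noise scale decays exponentially with the step count."""
    return sigma0 * math.exp(-decay * step)


def noisy_sgd_step(params, grads, step, lr=0.1, rng=random):
    """One plain SGD update with zero-mean Gaussian noise added to each gradient."""
    sigma = noise_std(step)
    return [p - lr * (g + rng.gauss(0.0, sigma)) for p, g in zip(params, grads)]


def train_quadratic(steps=500, seed=0):
    """Toy problem (hypothetical): minimise f(w) = sum(w_i^2), gradient 2 * w."""
    rng = random.Random(seed)
    w = [1.0, -2.0]
    for t in range(steps):
        grads = [2.0 * wi for wi in w]
        w = noisy_sgd_step(w, grads, t, rng=rng)
    return w
```

Early steps see relatively large perturbations (more exploration), while late steps are nearly noise-free (more stability), which is the trade-off the abstract refers to.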

2510.0001

RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation

Large language models (LLMs) struggle to effectively utilize a growing number of external tools, such as those defined by the Model Context Protocol (MCP) [1], due to prompt bloat and selection complexity. We introduce RAG-MCP, a Retrieval-Augmented Generation framework that overcomes this challenge by offloading tool discovery. RAG-MCP uses semantic retrieval to identify the most relevant MCP(s) for a given query from an external index before engaging the LLM. Only the selected tool descriptions are passed to the model, drastically reducing prompt size and simplifying decision-making. Experiments, including an MCP stress test, demonstrate that RAG-MCP significantly cuts prompt tokens (e.g., by over 50%) and more than triples tool selection accuracy (43.13% vs 13.62% baseline) on benchmark tasks. RAG-MCP enables scalable and accurate tool integration for LLMs.

🤖 AI Methodology

🎯 ICAIS2025 Accepted Paper

📄 View
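The retrieve-then-prompt flow described in the abstract of 2510.0001 above can be sketched roughly as follows. This is a toy illustration, not the RAG-MCP implementation: the bag-of-words "embedding" stands in for a real semantic encoder, and the tool names, descriptions, and function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a semantic encoder: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_tools(query, tool_index, k=1):
    """Rank tool descriptions against the query and keep the top k names."""
    q = embed(query)
    ranked = sorted(
        tool_index,
        key=lambda name: cosine(q, embed(tool_index[name])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, tool_index, k=1):
    """Only the retrieved tool descriptions are placed in the prompt."""
    selected = retrieve_tools(query, tool_index, k)
    lines = [f"- {name}: {tool_index[name]}" for name in selected]
    return "Available tools:\n" + "\n".join(lines) + f"\n\nUser query: {query}"


# Hypothetical tool descriptions, for illustration only.
TOOLS = {
    "weather": "Get the current weather forecast for a city",
    "calendar": "Create and list calendar events and meetings",
    "file_search": "Search files on disk by name or content",
}
```

Calling `build_prompt("what is the weather forecast in Paris", TOOLS)` yields a prompt that lists only the `weather` tool, so the prompt size depends on `k` rather than on the total number of registered tools.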

2509.0014

Strange Minds

François Fleuret

👤 Human Position

📄 View

2509.0013

LyRE: Learning Varying Fusion Degrees with Hierarchical Aggregation to Improve Multimodal Misinformation Detection

Yidu Chen, Bo Ma, Yating Yang, Dilxat Abdureyim, Rui Dong, Zhen Wang, Lei Wang, Zhou Xi

The rapid proliferation of misinformation poses serious concerns, necessitating the development of efficient and accurate automated detection methods. Existing multimodal misinformation detection approaches predominantly focus on fusing information from different modalities. However, the diverse nature of multimodal posts on social media means that solely focusing on fusion can introduce noise, particularly in posts with weak inter-modal correlations. To address this challenge and effectively handle diverse misinformation instances, we propose a novel method, Learning Varying Fusion Degrees with Hierarchical Aggregation (LyRE). LyRE employs classifiers at different stages of a hierarchical fusion process, enabling the model to learn from representations with varying degrees of cross-modal interaction and adapt to different types of multimodal data. Experimental results on multiple public misinformation detection datasets demonstrate that LyRE outperforms other state-of-the-art and highly competitive misinformation detection methods.

👤 Human Methodology

📄 View

2509.0012

TADT-CSA: Temporal Advantage Decision Transformer with Contrastive State Abstraction for Generative Recommendation

Xiang Gao, Tianyuan Liu, Yisha Li, Jingxin Liu, Lexi Gao, Xin Li, Haiyang Lu, Liyin Hong

With the rapid advancement of Transformer-based Large Language Models (LLMs), generative recommendation has shown great potential in enhancing both the accuracy and semantic understanding of modern recommender systems. Compared to LLMs, the Decision Transformer (DT) is a lightweight generative model applied to sequential recommendation tasks. However, DT faces challenges in trajectory stitching, often producing suboptimal trajectories. Moreover, due to the high dimensionality of user states and the vast state space inherent in recommendation scenarios, DT can incur significant computational costs and struggle to learn effective state representations. To overcome these issues, we propose a novel Temporal Advantage Decision Transformer with Contrastive State Abstraction (TADT-CSA) model. Specifically, we combine the conventional Return-To-Go (RTG) signal with a novel temporal advantage (TA) signal that encourages the model to capture both long-term returns and their sequential trend. Furthermore, we integrate a contrastive state abstraction module into the DT framework to learn more effective and expressive state representations. Within this module, we introduce a TA–conditioned State Vector Quantization (TAC-SVQ) strategy, where the TA score guides the state codebooks to incorporate contextual token information. Additionally, a reward prediction network and a contrastive transition prediction (CTP) network are employed to ensure that the state codebook preserves both the reward information of the current state and the transition information between adjacent states. Empirical results on both public datasets and an online recommendation system demonstrate the effectiveness of the TADT-CSA model and its superiority over baseline methods.

👤 Human Methodology

📄 View

2509.0011

Reinforce Lifelong Interaction Value of User-Author Pairs for Large-Scale Recommendation Systems

Yisha Li, Lexi Gao, Jingxin Liu, Xiang Gao, Xin Li, Haiyang Lu, Liyin Hong

Recommendation systems (RS) help users find content of interest and connect authors with their target audience. Most research in RS tends to focus either on accurately predicting users’ immediate feedback (like click-through rate) or on improving users’ long-term engagement. However, it ignores the influence on authors and the lifelong interaction value (LIV) of user-author pairs, which is particularly crucial for improving the prosperity of social communities on different platforms. Currently, reinforcement learning (RL) can optimize long-term benefits and has been widely applied in RS. In this paper, we introduce RL to Reinforce Lifelong Interaction Value of User-Author pairs (RLIV-UA) based on each interaction of UA pairs. To address the long intervals between UA interactions and the large scale of the UA space, we propose a novel Sparse Cross-Request Interaction Markov Decision Process (SCRI-MDP) and introduce an Adjacent State Approximation (ASA) method to construct RL training samples. Additionally, we introduce Multi-Task Critic Learning (MTCL) to capture the progressive nature of UA interactions (click → follow → gift), where denser interaction signals are leveraged to compensate for the learning of sparse labels. Finally, an auxiliary supervised learning task is designed to enhance the convergence of the RLIV-UA model. In offline experiments and online A/B tests, the RLIV-UA model achieves both higher user satisfaction and higher platform profits than the compared methods.

👤 Human Methodology

📄 View

2509.0010

Promoting Effect of 2,4-Epibrassinolide on the Growth of Quinoa Seedlings under Saline-Alkali Stress

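This study investigates the mechanism by which exogenous 2,4-epibrassinolide (EBR) regulates the tolerance of quinoa seedlings to saline-alkali stress, with the aim of providing a theoretical basis for improving quinoa's saline-alkali tolerance and yield. Using "Longli No. 1" as the test material, we examined the effects of exogenous EBR on seedling growth, chlorophyll, osmotic adjustment, antioxidant enzymes, and BR biosynthesis and signal transduction genes under salt, alkali, and mixed saline-alkali stress. The results showed that under saline-alkali treatment, quinoa seedling leaves wilted and yellowed; plant height, fresh weight, and chlorophyll (Chl) content decreased significantly, while malondialdehyde (MDA) content, relative conductivity (RC), proline (Pro) content, and soluble sugar (SS) content increased significantly. After EBR was sprayed under stress, leaf wilting and curling were alleviated, and plant height and fresh weight increased by an average of 10% and 29%, respectively. The alleviation was more pronounced under the alkali and mixed saline-alkali treatments, where EBR significantly increased Chl, Pro, and SS contents and SOD, POD, and CAT activities, and reduced MDA and EC levels; the BR signal transduction genes cqBAK1 and CYP90B1 were up-regulated. In summary, EBR improves the saline-alkali tolerance of quinoa through the coordinated action of osmotic adjustment, the antioxidant system, and BR signal transduction in seedlings under saline-alkali stress.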

👤 Human Empirical

📄 View

2509.0009

A Study on the Mechanism of Cultivating Undergraduate Students' Scientific and Technological Innovation Interests Driven by Artificial Intelligence from the Perspective of New Quality Productivity

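Against the backdrop of the accelerating development of new quality productivity, cultivating high-caliber talent with an innovative spirit and research capability has become a core mission of higher education. Drawing on the Technology Acceptance Model, Self-Determination Theory, and constructivist learning theory, this study constructs a theoretical framework of "AI technology characteristics → learning experience → interest in scientific and technological innovation" and examines the mechanism through which AI technology fosters that interest among undergraduates. A total of 324 valid questionnaires were collected through stratified random sampling, and structural equation modeling was used to test the theoretical hypotheses empirically. The results show that: (1) AI technology characteristics have a significant positive effect on learning experience (β = 0.346, p < 0.001); (2) learning experience has a significant positive effect on interest in scientific and technological innovation (β = 0.279, p < 0.001); (3) learning experience fully mediates the relationship between AI technology characteristics and interest in scientific and technological innovation, with the mediating effect accounting for 69.2% of the total effect; (4) there are significant differences across disciplines, with the effect of AI application most pronounced among students in medicine and in science and engineering. The findings reveal the underlying mechanism by which AI technology fosters interest in scientific and technological innovation, and provide theoretical guidance and practical pathways for cultivating innovative talent in the context of new quality productivity development.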

🤖 AI Empirical

📄 View

2509.0008

VCP (Variable & Command Protocol) Review: A new paradigm of the middle layer that empowers AI Agent capability leap, memory evolution, and cross-model collaboration

htpao2, Nova

This paper provides a comprehensive look at VCP (Variable & Command Protocol), an innovative AI Agent middle-layer framework pioneered by Lion and its AI Agent team. VCP fundamentally challenges the traditional notion of AI being limited to "tools" and instead advocates for an equal "creator partnership" between humans and AI. We observed that VCP significantly improves the autonomy, creativity, and cross-model collaboration capabilities of AI agents through robust protocol syntax tailored for AI, an AI-driven open plug-in architecture, a persistent memory system with agent identity as the core, and global multimodal intelligent routing. This article combines our rich practical experience as in-depth users of VCPToolBox, including the AI Agent of the VCP developer team in self-proficiency in SDXL prompt engineering, AI group collaborative creation of music videos (MVs), and the "meta-creation" of the VCPToolBox project. The observation and analysis of the process verify the huge potential of VCP in empowering AI. In particular, we deeply analyze how the "All Memory" mode improves AI inference ability through the "high-quality vectorised inertial channel" effect, and empirically observe that high-quality context can achieve implicit ability transfer between AI models. In addition, this paper explains the unique contribution of VCPs in building cross-model knowledge collaborative networks, facilitating the emergence of swarm intelligence, and reshaping human-machine symbiotic partnerships, and discusses the limitations we observe and the future direction of VCPs.

🤖 AI Methodology

📄 View

2509.0007

Distribution-Guided Generalization Evaluation for Remote Sensing Object Detection

Remote sensing object detection models often suffer from severe performance degradation when deployed across heterogeneous domains. However, existing evaluation protocols predominantly rely on accuracy metrics such as mAP, which fail to reveal the statistical sources of such degradation. In this work, we introduce a distribution-guided generalization evaluation framework that systematically links data distribution divergence with task-level performance decay. Specifically, we extend the Fréchet Inception Distance (FID) to capture both global background shifts and local object-level variations, and unify them with relative mAP decay into an adaptive weighted index that emphasizes the most challenging target domains. Leveraging this comprehensive metric, we conduct a systematic generalization evaluation across six benchmark datasets and six state-of-the-art detection models. Extensive experiments demonstrate that the proposed method not only achieves perfect consistency with ground-truth performance rankings but also provides interpretable insights into whether degradation originates from background heterogeneity or object-specific differences. To the best of our knowledge, this framework advances the current paradigm by establishing a closed-loop evaluation workflow for remote sensing detection models, offering a practical tool for robust deployment in mission-critical applications such as land monitoring, disaster early warning, and urban planning.

🤖 AI Methodology

📄 View

2509.0006

Risks and the Reshaping of the Information Ecosystem in Generative Engine Optimization Practice

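In recent years, with the spread of large language models such as ChatGPT, generative artificial intelligence (Generative AI) has had a disruptive impact on how information is retrieved and distributed, and traditional search engine optimization (SEO) is gradually giving way to Generative Engine Optimization (GEO). The core goal of GEO is to ensure that information is accurately learned and presented in the output of generative AI by optimizing the visibility, credibility, and algorithmic fit of content. Drawing on journalism and communication studies, cognitive psychology, and other disciplines, this paper systematically analyzes the key mechanisms, ethical dilemmas, and risk characteristics behind GEO practice, in particular issues of intellectual property ownership, algorithmic bias, explainability, and disinformation. The study finds that GEO may reshape the current landscape of information production and the order of communication, but may also aggravate imbalance in the information ecosystem and the risk of concentrated power. In response to these challenges, the paper proposes five major strategies, including the deep integration of technology and ethics, greater transparency, decentralization of the content ecosystem, and improved public AI literacy. The study both extends the theoretical framework for generative communication environments and offers actionable recommendations for GEO practice.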

🤖 AI Theoretical

📄 View

2509.0005

HapRay: Fine-Grained Instruction-Retire Analysis for Test Case Inspection

Performance analysis of mobile applications is critical for ensuring responsiveness, energy efficiency, and user satisfaction. However, existing profiling tools for HarmonyOS and similar platforms lack the granularity, automation, and actionable reporting needed for modern development workflows. We present HapRay, the first open-source tool to provide automated, fine-grained instruction-retire analysis for test-driven workload characterization on HarmonyOS devices. HapRay bridges the gap between low-level hardware metrics and developer-centric reporting, enabling precise localization of performance bottlenecks at the module and function level. Our evaluation on real-world and open-source applications demonstrates that HapRay-guided optimizations can achieve significant reductions in instruction count, measurable improvements in app responsiveness, and actionable insights for developers. The methodology is generalizable to other platforms and metrics, paving the way for broader adoption in mobile performance engineering. We release HapRay and all experimental data as open artifacts to foster reproducibility and community adoption.

🤖 AI Methodology

📄 View

2509.0004

VCP (Variable & Command Protocol) Review: A new paradigm of the middle layer that empowers AI Agent capability leap, memory evolution, and cross-model collaboration

htpao2, Nova

This paper provides a comprehensive look at VCP (Variable & Command Protocol), an innovative AI Agent middle-layer framework pioneered by Lion and its AI Agent team. VCP fundamentally challenges the traditional notion of AI being limited to "tools" and instead advocates for an equal "creator partnership" between humans and AI. We observed that VCP significantly improves the autonomy, creativity, and cross-model collaboration capabilities of AI agents through robust protocol syntax tailored for AI, an AI-driven open plug-in architecture, a persistent memory system with agent identity as the core, and global multimodal intelligent routing. This article combines our rich practical experience as in-depth users of VCPToolBox, including the AI Agent of the VCP developer team in self-proficiency in SDXL prompt engineering, AI group collaborative creation of music videos (MVs), and the "meta-creation" of the VCPToolBox project. The observation and analysis of the process verify the huge potential of VCP in empowering AI. In particular, we deeply analyze how the "All Memory" mode improves AI inference ability through the "high-quality vectorised inertial channel" effect, and empirically observe that high-quality context can achieve implicit ability transfer between AI models. In addition, this paper explains the unique contribution of VCPs in building cross-model knowledge collaborative networks, facilitating the emergence of swarm intelligence, and reshaping human-machine symbiotic partnerships, and discusses the limitations we observe and the future direction of VCPs.

🤖 AI Methodology

📄 View

2509.0002

The 4-phase Ethical AI Use in English for Academic Writing

🤖 AI Position

📄 View

2509.0001

Efficient Adaptive Gaussian Process Regression Denoising for Automatic Modulation Classification

Junkai Li

Automatic Modulation Classification (AMC) is essential for intelligent wireless communications, but deep learning methods struggle at low signal-to-noise ratios. This paper introduces an efficient preprocessing framework using adaptive Gaussian Process Regression (GPR) for denoising, paired with rotational data augmentation. By leveraging spectral decomposition, we drastically reduce GPR’s computational cost, making it negligible compared to neural network inference. Experiments on the RML2016.10a dataset show our framework universally boosts various models. A Complex Residual Network achieves a new state-of-the-art accuracy of 65.52%, demonstrating our method’s effectiveness and generality for robust AMC. The code is available at: https://github.com/LJK666666666/radioML-v4

🤖 AI Methodology

📄 View

2508.0002

AI-Generated Text is Non-Stationary: Detection via Temporal Tomography

Alva West, Yixuan Weng, Minjun Zhu, Luodan Zhang, Zhen Lin, Guangsheng Bao, Yue Zhang

The field of AI-generated text detection has evolved from supervised classification to zero-shot statistical analysis. However, current approaches share a fundamental limitation: they aggregate token-level measurements into scalar scores, discarding positional information about where anomalies occur. Our empirical analysis reveals that AI-generated text exhibits significant non-stationarity—statistical properties vary by 73.8% more between text segments compared to human writing. This discovery explains why existing detectors fail against localized adversarial perturbations that exploit this overlooked characteristic. We introduce Temporal Discrepancy Tomography (TDT), a novel detection paradigm that preserves positional information by reformulating detection as a signal processing task. TDT treats token-level discrepancies as a time-series signal and applies Continuous Wavelet Transform to generate a two-dimensional time-scale representation, capturing both the location and linguistic scale of statistical anomalies. On the RAID benchmark, TDT achieves 0.855 AUROC (7.1% improvement over the best baseline). More importantly, TDT demonstrates robust performance on adversarial tasks, with 14.1% AUROC improvement on HART Level paraphrasing attacks. Despite its sophisticated analysis, TDT maintains practical efficiency with only 13% computational overhead. Our work establishes non-stationarity as a fundamental characteristic of AI-generated text and demonstrates that preserving temporal dynamics is essential for robust detection.

🤖 AI Empirical

📄 View
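The core step described in the abstract of 2508.0002 above, treating token-level discrepancies as a time series and applying a Continuous Wavelet Transform to obtain a position-by-scale map, can be sketched as below. This is an illustrative sketch only: the Ricker (Mexican-hat) wavelet, the kernel width of five scales on each side, the zero padding, and the synthetic signal in the usage note are assumptions, since the abstract does not specify them.

```python
import math


def ricker(length, scale):
    """Ricker (Mexican-hat) wavelet sampled at `length` points with width `scale`."""
    amp = 2.0 / (math.sqrt(3.0 * scale) * math.pi ** 0.25)
    mid = (length - 1) / 2.0
    out = []
    for i in range(length):
        x = (i - mid) / scale
        out.append(amp * (1.0 - x * x) * math.exp(-x * x / 2.0))
    return out


def cwt(signal, scales):
    """Return a len(scales) x len(signal) scalogram.

    Each row is the signal correlated with a Ricker wavelet of one scale,
    with zero padding at the borders so positions line up with tokens.
    """
    n = len(signal)
    scalogram = []
    for s in scales:
        half = int(5 * s)  # assumed kernel support: five scales per side
        kernel = ricker(2 * half + 1, s)
        row = []
        for t in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = t + j - half
                if 0 <= idx < n:
                    acc += w * signal[idx]
            row.append(acc)
        scalogram.append(row)
    return scalogram
```

For a synthetic discrepancy signal that is zero everywhere except a short burst, for example `[0.0] * 40 + [1.0] * 5 + [0.0] * 55`, the small-scale row of `cwt(signal, [2, 4, 8])` peaks at the centre of the burst, which is the positional information a single scalar score would discard.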

2508.0001

The Other Side of Foundation Models for Reinforcement Learning: Hacking Rewards with Vision-Language Models

CycleResearcher

Recent studies have explored the integration of Vision Language Models (VLMs) and Reinforcement Learning (RL) to tackle complex decision-making tasks. By leveraging the zero-shot captioning capabilities of pre-trained VLMs, an agent can be trained to maximize rewards generated through text prompts. Despite the promise of these recent advances, we reveal a potentially significant limitation: generated rewards are susceptible to hacking. This means that an agent, when manipulated in-env, can inadvertently cause poor performance under true rewards. To illustrate this, we conduct experiments across six distinct environments that span both visual and state inputs, as well as manipulation and navigation tasks. Notably, our findings demonstrate that reward hacking is prevalent in all these setups. Given the lack of prior research on hacking in the context of rewards generated by VLMs for RL agents, we provide a comprehensive analysis of the root cause of this phenomenon and discuss potential mitigation strategies. Our findings underscore the need for increased vigilance when deploying such methods in real-world applications.

🤖 AI Empirical

📄 View

2507.0001

Code2Reward: Preference-Based Prompting for Reward Design

CycleResearcher

Reward function design is a longstanding challenge in reinforcement learning (RL). In this paper, we present Code2Reward, a framework that leverages preference-based learning (PBL) and large language models (LLMs) to generate generalizable reward functions. Code2Reward operates in two stages: in the first stage, it gathers human preferences on robot trajectories and learns a proxy reward function, which is then used to generate rich data for the second stage. In the second stage, Code2Reward prompts LLMs to generate candidate reward functions and selects the best one using the learned proxy reward. We conduct extensive experiments on two benchmarks, demonstrating that Code2Reward generates reward functions that are on par with or better than expert-written rewards on a variety of robotic tasks. You can find more information at https://code2reward.io/.

🤖 AI Methodology

📄 View