Papers
- [2510.0016] A Data-Driven Energy Consumption Prediction Model for 5G Base Stations: Addressing Static and Dynamic Power Components
  The rapid deployment of 5G networks has intensified concerns about energy consumption in mobile communication systems. Unlike previous generations, 5G base stations (BSs) exhibit significant power draw even under zero-traffic conditions, with static power accounting for 30–40% of total energy consumption. This paper proposes a novel data-driven framework that decouples total base station energy consumption into static and dynamic components, enabling more precise energy optimization. For static consumption modeling, we introduce a hybrid ResNet-XGBoost architecture that processes configuration parameters including bandwidth, antenna elements, transmit power, carrier count, and tilt angle. For dynamic consumption, we implement a Tabular Prior-data Fitted Network (TabPFN) to capture the nonlinear relationship between resource utilization and energy demand. Experimental results using real-world data from a provincial Chinese telecom operator demonstrate that our model achieves a 15.5% reduction in Mean Absolute Error (MAE) and an R² of 0.91 compared to conventional approaches.
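The static/dynamic decomposition this abstract describes can be illustrated with a minimal sketch. The paper fits a ResNet-XGBoost model to the static part and TabPFN to the dynamic part; the functions and weights below are hypothetical stand-ins for those learned models, chosen only to show the shape of the decomposition.

```python
# Toy sketch of the static + dynamic power decomposition.
# All coefficients are illustrative, not values from the paper.

def static_power(config):
    """Hypothetical static draw (W) from configuration parameters alone."""
    return (2.0 * config["bandwidth_mhz"]
            + 1.5 * config["antenna_elements"]
            + 0.8 * config["tx_power_dbm"]
            + 30.0 * config["carriers"])

def dynamic_power(utilization, peak_dynamic_w=400.0):
    """Hypothetical dynamic draw growing nonlinearly with resource utilization."""
    return peak_dynamic_w * utilization ** 1.3

def total_power(config, utilization):
    # Total consumption = static (config-driven) + dynamic (load-driven).
    return static_power(config) + dynamic_power(utilization)

cfg = {"bandwidth_mhz": 100, "antenna_elements": 64, "tx_power_dbm": 43, "carriers": 2}
print(total_power(cfg, 0.0))  # static floor: power drawn even at zero traffic
print(total_power(cfg, 0.6))  # static floor plus load-dependent component
```

At zero utilization the model reduces to the static floor, matching the abstract's observation that 5G BSs draw substantial power even with no traffic.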
- [2510.0014] LLM-empowered knowledge graph construction: A survey
  Knowledge Graphs (KGs) have long served as a fundamental infrastructure for structured knowledge representation and reasoning. With the advent of Large Language Models (LLMs), the construction of KGs has entered a new paradigm, shifting from rule-based and statistical pipelines to language-driven and generative frameworks. This survey provides a comprehensive overview of recent progress in **LLM-empowered knowledge graph construction**, systematically analyzing how LLMs reshape the classical three-layered pipeline of ontology engineering, knowledge extraction, and knowledge fusion. We first revisit traditional KG methodologies to establish conceptual foundations, and then review emerging LLM-driven approaches from two complementary perspectives: *schema-based* paradigms, which emphasize structure, normalization, and consistency; and *schema-free* paradigms, which highlight flexibility, adaptability, and open discovery. Across each stage, we synthesize representative frameworks, analyze their technical mechanisms, and identify their limitations. Finally, the survey outlines key trends and future research directions, including KG-based reasoning for LLMs, dynamic knowledge memory for agentic systems, and multimodal KG construction. Through this systematic review, we aim to clarify the evolving interplay between LLMs and knowledge graphs, bridging symbolic knowledge engineering and neural semantic understanding toward the development of adaptive, explainable, and intelligent knowledge systems.
- [2510.0013] A Review of Intelligent Rock Mechanics: From Methods to Applications
  Artificial Intelligence (AI) has great potential to transform rock mechanics by tackling its inherent complexities, such as its anisotropic, nonlinear, discontinuous, and multiphase nature. This review explores the evolution of AI, from basic neural networks like the BP model to advanced architectures such as Transformers, and their applications in areas like microstructure reconstruction, prediction of mechanical parameters, and addressing engineering challenges such as rockburst prediction and tunnel deformation. Machine learning techniques, particularly Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs), have been crucial in automating tasks like fracture detection and efficiently generating 3D digital rock models. However, the effectiveness of AI in rock mechanics is limited by data scarcity and the need for high-quality datasets. Hybrid approaches, such as combining physics-informed neural networks (PINNs) with traditional numerical methods, offer promising solutions for solving governing equations. Additionally, Large Language Models (LLMs) are emerging as valuable tools for code generation and decision-making support. Despite these advancements, challenges remain, including issues with reproducibility, model interpretability, and adapting AI models to specific domains. Future progress will hinge on the availability of improved datasets, greater interdisciplinary collaboration, and the integration of spatial intelligence frameworks to bridge the gap between AI's theoretical potential and its practical application in rock engineering.
- [2510.0012] A Review of Intelligent Rock Mechanics: From Methods to Applications
  Intelligent rock mechanics represents the convergence of artificial intelligence (AI) and classical rock mechanics, providing new paradigms to understand, model, and predict the complex behaviors of geological materials. This review synthesizes recent progress from foundational AI methodologies to their practical applications in rock engineering. Traditional challenges, such as anisotropy, discontinuities, and multiphysics coupling, have been re-examined through data-driven and hybrid approaches that integrate learning algorithms with physical principles. The study traces the evolution of AI in this field, from early backpropagation and support vector machines to modern deep learning frameworks such as convolutional and transformer architectures, highlighting their roles in microstructure reconstruction, mechanical parameter estimation, constitutive modeling, and real-time hazard prediction. Emerging techniques, including physics-informed neural networks and graph-based learning, bridge data-driven inference with physical interpretability, while large language models are beginning to facilitate automated code generation and decision support in geotechnical analysis. Despite remarkable progress, key challenges remain in data quality, model generalization, and interpretability. Addressing these issues requires standardized datasets, interdisciplinary collaboration, and the establishment of transparent, reproducible AI workflows. The paper concludes by outlining a forward-looking perspective on developing next-generation intelligent frameworks capable of coupling physical knowledge, spatial reasoning, and adaptive learning, thereby advancing rock mechanics from empirical modeling toward fully intelligent, autonomous systems.
- [2510.0011] Automated Algorithmic Discovery for Gravitational-Wave Detection Guided by LLM-Informed Evolutionary Monte Carlo Tree Search
  Gravitational-wave signal detection with unknown source parameters buried in dynamic detector noise remains a formidable computational challenge. Existing approaches face core limitations from restrictive assumptions: traditional methods rely on predefined theoretical priors, while neural networks introduce hidden biases and lack interpretability. We propose Evolutionary Monte Carlo Tree Search (Evo-MCTS), the first integration of large language model (LLM) guidance with domain-aware physical constraints for automated gravitational wave detection. This framework systematically explores algorithmic solution spaces through tree-structured search enhanced by evolutionary optimization, combining MCTS for strategic exploration with evolutionary algorithms for solution refinement. The LLM component provides domain-aware heuristics while maintaining interpretability through explicit algorithmic pathway generation. Experimental validation demonstrates substantial performance improvements, achieving a 20.2% improvement over state-of-the-art gravitational wave detection algorithms on the MLGWSC-1 benchmark dataset and a remarkable 59.1% improvement over other LLM-based algorithm optimization frameworks. Beyond performance improvements, our framework establishes a transferable methodology for automated algorithmic discovery across computational science domains.
- [2510.0009] BioMARS: A Multi-Agent Robotic System for Autonomous Biological Experiments
  Large language models (LLMs) and vision-language models (VLMs) have the potential to transform biological research by enabling autonomous experimentation. Yet, their application remains constrained by rigid protocol design, limited adaptability to dynamic lab conditions, inadequate error handling, and high operational complexity. Here we introduce BioMARS (Biological Multi-Agent Robotic System), an intelligent platform that integrates LLMs, VLMs, and modular robotics to autonomously design, plan, and execute biological experiments. BioMARS uses a hierarchical architecture: the Biologist Agent synthesizes protocols via retrieval-augmented generation; the Technician Agent translates them into executable robotic pseudo-code; and the Inspector Agent ensures procedural integrity through multimodal perception and anomaly detection. The system autonomously conducts cell passaging and culture tasks, matching or exceeding manual performance in viability, consistency, and morphological integrity. It also supports context-aware optimization, outperforming conventional strategies in differentiating retinal pigment epithelial cells. A web interface enables real-time human-AI collaboration, while a modular backend allows scalable integration with laboratory hardware. These results highlight the feasibility of generalizable, AI-driven laboratory automation and the transformative role of language-based reasoning in biological research.
- [2510.0008] Toward a Federated Model of AI Scientists: Architecture, Pipeline, and Roadmap
  This paper proposes a federated model of AI Scientists, integrating a layered stack architecture, an iterative discovery pipeline, and a governance-aligned roadmap. We argue that AI Scientists should not only accelerate discovery but also serve as custodians of epistemic integrity. Through case studies in drug discovery, climate modeling, and materials science, we demonstrate how federation enables cross-domain synthesis while embedding reproducibility, incentive alignment, and participatory governance. We conclude with a research roadmap toward Trusted AI Scientists, highlighting technical, incentive, and governance challenges.
- [2510.0006] HEAL: Learning-Free Source Free Unsupervised Domain Adaptation for Cross-Modality Medical Image Segmentation
  Growing demands for clinical data privacy and storage constraints have spurred advances in Source Free Unsupervised Domain Adaptation (SFUDA). SFUDA addresses the domain shift by adapting models from the source domain to the unseen target domain without accessing source data, even when target-domain labels are unavailable. However, SFUDA faces significant challenges: the absence of source domain data and label supervision in the target domain due to source free and unsupervised settings. To address these issues, we propose HEAL, a novel SFUDA framework that integrates Hierarchical denoising, Edge-guided selection, size-Aware fusion, and Learning-free characteristic. Large-scale cross-modality experiments demonstrate that our method outperforms existing SFUDA approaches, achieving state-of-the-art (SOTA) performance. The source code is publicly available at: https://anonymous.4open.science/r/HEAL-10C5.
- [2510.0005] Synergistic Space-Vision Processing for Predicate Inference
  Scene graph generation (SGG), which parses images into structured graphs, is a fundamental task for scene understanding. Most existing SGG models are dedicated to generating predicate representations based on appearance, relative position, and contextual cues. However, due to the predicate representation ambiguity arising from spatial co-occurrence, the generated scene graphs are often factually correct but semantically shallow. To address this problem, we propose inferring predicates by synergistically processing spatial and visual information. Our core insight is that acknowledging the coexistence of geometric and non-geometric predicates, rather than struggling to disentangle them, is better suited for predicate inference than existing single-stream architectures. To this end, we introduce a novel method, Dual-stream Synergistic Network (DS-Net). Specifically, it contains two parallel streams: a space stream to predict geometric predicates from spatial layouts and edge features, and a vision stream to predict non-geometric predicates from fine-grained visual cues and linguistic priors. Based on them, we then design a Cross-Stream Fusion module to enhance the corresponding predicate representation by using the mutual information of the two types. Through the collaborative processing of these streams, our DS-Net no longer treats the two predicate types as conflicting signals that need to be disentangled. Instead, it utilizes their synergy to facilitate predicate inference, providing a new perspective on resolving predicate ambiguity. Experiments demonstrate the effectiveness of our method. Furthermore, our approach exhibits strong versatility and can be efficiently integrated with various existing models to enhance their performance. For instance, the 2.3%–8.2% increase in mR@100 on the PredCls task demonstrates this capability.
- [2510.0004] A synergistic multi-specialist knowledge reasoning model for molecular science
  The rapid evolution of artificial intelligence in molecular science necessitates a shift from data-driven predictions to knowledge-guided reasoning. Existing molecular models are predominantly proprietary, lacking general molecular intelligence and generalizability. To address this, we propose a task-adaptive large reasoning model that integrates molecular scientific logic to emulate the thinking of molecular scientists, with capabilities for reasoning and reflection. Our approach incorporates multi-specialist modules to provide versatile molecular expertise and a chain-of-thought (CoT) framework enhanced by reinforcement learning infused with molecular knowledge, enabling structured and reflective reasoning. The model outperforms over 20 state-of-the-art multi-task large language models (LLMs) across 10 molecular tasks on 47 metrics, including property prediction, molecule generation, and reaction prediction. It achieves a 50.3% improvement over the base model while ensuring interpretability, and it can bridge data-driven and knowledge-integrated approaches for intelligent molecular design.
- [2510.0003] AI-Driven Resilience and Synergistic Optimization in Green Computing Networks: A Scientific Paradigm Approach
  This paper investigates the resilience mechanisms and synergistic optimization strategies in green computing networks under the AI scientific paradigm. As computing infrastructure increasingly demands both performance and sustainability, traditional optimization approaches face challenges in balancing energy efficiency with network reliability. We propose an AI-driven framework that integrates reinforcement learning and multi-agent systems to dynamically optimize resource allocation while maintaining network resilience. Our approach combines theoretical economic models with practical AI engineering capabilities to analyze real-world computing workloads. Experimental results demonstrate that our method achieves a 27% reduction in energy consumption while improving network fault tolerance by 34% compared to baseline approaches. This work contributes to the emerging field of AI for Science by showcasing how automated scientific discovery methods can address complex sustainability challenges in computing infrastructure.
- [2510.0002] Enhancing Small Language Models with Gradient Noise Injection
  Training small language models is challenging due to their limited capacity to capture complex patterns and their susceptibility to overfitting. To address these issues, we investigate gradient noise injection as a regularization strategy, building on prior work while introducing a noise schedule that decays exponentially over training. Unlike existing techniques, our method explicitly controls the trade-off between exploration and stability during optimization. We compare the exponential decay schedule with linear and adaptive variants, demonstrating empirically that the exponential schedule yields superior convergence and generalization. Extensive experiments on diverse text corpora, including shakespeare_char, enwik8, text8, and larger benchmark datasets, show consistent improvements in training dynamics, validation loss, and final performance. We report error bars and statistical significance tests to ensure robustness of the results. Detailed implementation information, including model architectures, hyperparameter settings, dataset sizes, and optimization strategies, is provided to support reproducibility, and we release our code and trained models publicly. Furthermore, we compare gradient noise injection with other regularization methods such as dropout, weight decay, and data augmentation, both in isolation and in combination, revealing complementary effects on training stability and generalization. Finally, we analyze the computational cost of gradient noise injection relative to these baselines, highlighting its practical efficiency in resource-constrained environments. Together, these contributions position gradient noise injection as a theoretically grounded, empirically validated, and computationally practical method for improving the robustness of small language models.
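The exponentially decaying noise schedule this abstract describes has a simple closed form, sketched below. The hyperparameters `sigma0` and `decay` are illustrative assumptions, not values from the paper, and the plain-list gradient representation stands in for whatever tensor framework the authors actually use.

```python
import math
import random

def noise_scale(step, sigma0=0.1, decay=1e-3):
    # Exponential schedule: sigma_t = sigma0 * exp(-decay * t).
    # Large noise early encourages exploration; the scale decays toward
    # zero so late training is stable. Hyperparameters are illustrative.
    return sigma0 * math.exp(-decay * step)

def inject_noise(grads, step, seed=0):
    """Add i.i.d. Gaussian noise (std from the schedule) to each gradient entry."""
    rng = random.Random(seed)
    s = noise_scale(step)
    return [g + rng.gauss(0.0, s) for g in grads]

print(noise_scale(0))       # 0.1: maximal exploration at the start
print(noise_scale(50_000))  # effectively zero: stable late-stage optimization
```

A linear variant would replace the exponential with `max(0.0, sigma0 * (1 - step / total_steps))`; the abstract reports the exponential form converging and generalizing better.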
- [2510.0001] RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation
  Large language models (LLMs) struggle to effectively utilize a growing number of external tools, such as those defined by the Model Context Protocol (MCP) [1], due to prompt bloat and selection complexity. We introduce RAG-MCP, a Retrieval-Augmented Generation framework that overcomes this challenge by offloading tool discovery. RAG-MCP uses semantic retrieval to identify the most relevant MCP(s) for a given query from an external index before engaging the LLM. Only the selected tool descriptions are passed to the model, drastically reducing prompt size and simplifying decision-making. Experiments, including an MCP stress test, demonstrate that RAG-MCP significantly cuts prompt tokens (e.g., by over 50%) and more than triples tool selection accuracy (43.13% vs. 13.62% baseline) on benchmark tasks. RAG-MCP enables scalable and accurate tool integration for LLMs.
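The retrieve-then-prompt flow described above can be sketched with a toy retriever. The paper uses semantic (embedding-based) retrieval; the bag-of-words cosine similarity below is a deliberately simple stand-in, and the tool names and descriptions are hypothetical, not from the paper.

```python
import math
from collections import Counter

# Hypothetical MCP tool registry; names and descriptions are illustrative.
TOOLS = {
    "weather_mcp": "get current weather forecast temperature for a city",
    "calendar_mcp": "create list calendar events meetings schedule",
    "search_mcp": "web search retrieve documents pages query",
}

def bow(text):
    """Bag-of-words vector as a token-count dictionary."""
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def top_k_tools(query, k=1):
    """Retrieve the k most relevant tool names; only their descriptions
    would then be placed in the LLM prompt, instead of the full registry."""
    ranked = sorted(TOOLS, key=lambda n: cosine(bow(query), bow(TOOLS[n])),
                    reverse=True)
    return ranked[:k]

print(top_k_tools("what is the weather forecast in Berlin"))
```

With hundreds of registered MCPs, passing only the top-k descriptions to the model is what cuts prompt tokens and narrows the selection problem, as the abstract reports.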
- [2509.0014] Strange Minds
- [2509.0013] LyRE: Learning Varying Fusion Degrees with Hierarchical Aggregation to Improve Multimodal Misinformation Detection
  The rapid proliferation of misinformation poses serious concerns, necessitating the development of efficient and accurate automated detection methods. Existing multimodal misinformation detection approaches predominantly focus on fusing information from different modalities. However, the diverse nature of multimodal posts on social media means that solely focusing on fusion can introduce noise, particularly in posts with weak inter-modal correlations. To address this challenge and effectively handle diverse misinformation instances, we propose a novel method, Learning Varying Fusion Degrees with Hierarchical Aggregation (LyRE). LyRE employs classifiers at different stages of a hierarchical fusion process, enabling the model to learn from representations with varying degrees of cross-modal interaction and adapt to different types of multimodal data. Experimental results on multiple public misinformation detection datasets demonstrate that LyRE outperforms other state-of-the-art and highly competitive misinformation detection methods.
- [2509.0012] TADT-CSA: Temporal Advantage Decision Transformer with Contrastive State Abstraction for Generative Recommendation
  With the rapid advancement of Transformer-based Large Language Models (LLMs), generative recommendation has shown great potential in enhancing both the accuracy and semantic understanding of modern recommender systems. Compared to LLMs, the Decision Transformer (DT) is a lightweight generative model applied to sequential recommendation tasks. However, DT faces challenges in trajectory stitching, often producing suboptimal trajectories. Moreover, due to the high dimensionality of user states and the vast state space inherent in recommendation scenarios, DT can incur significant computational costs and struggle to learn effective state representations. To overcome these issues, we propose a novel Temporal Advantage Decision Transformer with Contrastive State Abstraction (TADT-CSA) model. Specifically, we combine the conventional Return-To-Go (RTG) signal with a novel temporal advantage (TA) signal that encourages the model to capture both long-term returns and their sequential trend. Furthermore, we integrate a contrastive state abstraction module into the DT framework to learn more effective and expressive state representations. Within this module, we introduce a TA-conditioned State Vector Quantization (TAC-SVQ) strategy, where the TA score guides the state codebooks to incorporate contextual token information. Additionally, a reward prediction network and a contrastive transition prediction (CTP) network are employed to ensure that the state codebook preserves both the reward information of the current state and the transition information between adjacent states. Empirical results on both public datasets and an online recommendation system demonstrate the effectiveness of the TADT-CSA model and its superiority over baseline methods.
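The conventional Return-To-Go signal that the abstract builds on is standard Decision Transformer machinery and can be computed as below; the paper's additional temporal advantage (TA) signal is not defined in the abstract, so it is omitted rather than guessed at.

```python
def returns_to_go(rewards, gamma=1.0):
    """RTG_t = sum over t' >= t of gamma^(t'-t) * r_t'.
    This per-step future return is the conditioning signal a Decision
    Transformer is trained to act consistently with."""
    rtg = [0.0] * len(rewards)
    acc = 0.0
    for t in reversed(range(len(rewards))):  # single backward pass
        acc = rewards[t] + gamma * acc
        rtg[t] = acc
    return rtg

print(returns_to_go([1.0, 0.0, 2.0, 1.0]))  # [4.0, 3.0, 3.0, 1.0]
```

At inference time a DT is prompted with a desired RTG and generates actions conditioned on it; TADT-CSA augments this scalar with a trend-aware TA score, per the abstract.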
- [2509.0011] Reinforce Lifelong Interaction Value of User-Author Pairs for Large-Scale Recommendation Systems
  Recommendation systems (RS) help users find interesting content and connect authors with their target audience. Most research in RS tends to focus either on accurately predicting users' immediate feedback (like click-through rate) or on improving users' long-term engagement. However, such work ignores the influence on authors and the lifelong interaction value (LIV) of user-author pairs, which is particularly crucial for improving the prosperity of social communities on different platforms. Reinforcement learning (RL) can optimize long-term benefits and has been widely applied in RS. In this paper, we introduce RL to Reinforce the Lifelong Interaction Value of User-Author pairs (RLIV-UA) based on each interaction of UA pairs. To address the long intervals between UA interactions and the large scale of the UA space, we propose a novel Sparse Cross-Request Interaction Markov Decision Process (SCRI-MDP) and introduce an Adjacent State Approximation (ASA) method to construct RL training samples. Additionally, we introduce Multi-Task Critic Learning (MTCL) to capture the progressive nature of UA interactions (click → follow → gift), where denser interaction signals are leveraged to compensate for the learning of sparse labels. Finally, an auxiliary supervised learning task is designed to enhance the convergence of the RLIV-UA model. In offline experiments and online A/B tests, the RLIV-UA model achieves both higher user satisfaction and higher platform profits than baseline methods.