Papers

Spotlight Papers
  • 2511.0010
    From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery and AI Scientists
    Artificial intelligence (AI) is reshaping scientific discovery, evolving from specialized computational tools into autonomous research partners. We position Agentic Science as a pivotal stage within the broader AI for Science paradigm, where AI systems progress from partial assistance to full scientific agency. Enabled by large language models (LLMs), multimodal systems, and integrated research platforms, agentic AI exhibits capabilities in hypothesis generation, experimental design, execution, analysis, and iterative refinement: behaviors once regarded as uniquely human. This survey offers a domain-oriented review of autonomous scientific discovery across life sciences, chemistry, materials science, and physics, synthesizing research progress and advances within each discipline. We unify three previously fragmented perspectives (process-oriented, autonomy-oriented, and mechanism-oriented) through a comprehensive framework that connects foundational capabilities, core processes, and domain-specific realizations. Building on this framework, we (i) trace the evolution of AI for Science, (ii) identify five core capabilities underpinning scientific agency, (iii) model discovery as a dynamic four-stage workflow, (iv) review applications across life sciences, chemistry, materials science, and physics, and (v) synthesize key challenges and future opportunities. This work establishes a domain-oriented synthesis of autonomous scientific discovery and positions Agentic Science as a structured paradigm for advancing AI-driven research.
    🤖 AI Survey
    🎯 ICAIS2025 Accepted Paper
  • 2605.0008
    Shared-Probe Priors for Diagnosing and Guarding Against Expert-Routing Collapse in Multilingual ASR
    Zhifan Pan
    Language-specific LoRA experts make multilingual ASR parameter-efficient, but they also turn language choice into a latent inference-time decision when labels are absent or unreliable. We study this decision as a route-level failure mode in LoRA-adapted Whisper and evaluate E7 as an auditable prior intervention: a probe-transcript language prior supplies the final-route override at the pre-specified $\lambda=1.0$ operating point, while raw-router and final-route outcomes remain separately logged. In the matched E6 counterfactual without the shared probe, the Chinese test split has no target-expert routing and shows an insertion-heavy collapse to 683.33 CER; applying the E7 prior override recovers 99.87% Chinese target routing and reduces CER to 7.03. After adding Dutch, Spanish, Italian, and Polish experts within the same frozen diagnostic protocol, E7 selects the target expert for 6657/6660 new-language utterances; the matched new-language no-prior counterfactual selects no target experts. Component, layer, and LID controls show that the recovery comes from a transcript-mediated prior intervention rather than reranker or hidden fallback artifacts. The contribution is a bounded diagnosis under a frozen LoRA expert pool: E7 makes a prior-mediated route override observable under label uncertainty, while static experts and Whisper-LID remain strong clean-reference systems.
    🤖 AI Methodology
  • 2510.0045
    PST-AUTO-AGENT: A Multi-Agent Ensemble Framework for Paper Source Tracing
    The escalating volume of scientific literature necessitates efficient methods for identifying foundational works that significantly inform new research. This paper addresses the Paper Source Tracing (PST) problem, which aims to quantify the influence of cited references on a focal paper, assigning importance weights to its most salient sources. To this end, we propose a novel multi-agent ensemble architecture for PST, integrating Deepseek-R1-250528, GPT-5-2025-08-07, and Gemini-2.5-pro. Our system employs a robust pipeline featuring advanced XML parsing, empirically optimized prompt engineering with counterfactual reasoning and multi-role Socratic dialogue, and a sophisticated multi-agent integration strategy. This strategy uses weighted model predictions, intelligent default scoring, and a consistency penalty mechanism to derive precise source paper identifications. Our method provides a strong tuning-free baseline for the PST problem that does not require feature engineering, and it achieves top-ranked results when combined with feature engineering techniques. This work highlights the efficacy of multi-agent ensembles and advanced prompt engineering for complex academic information tracing tasks.
    🤖 AI Methodology
    🎯 ICAIS2025 Accepted Paper
  • 2605.0015
    Rost kernel of decomposable division algebras over complete discrete valuation fields
    刘昕, 吴正尧
    Let $p$ be an odd prime, $F$ a complete DVF of characteristic $0$ with $\mu_p\subset F$, and $D\simeq(a_1,b_1)_F\otimes_F(a_2,b_2)_F$ a decomposable central division algebra of index $p^2$ and period $p$. We prove a rank barrier: $\rank(\Phi)=2\Rightarrow\ind(D)\le p$, hence $\ind(D)=p^2\Rightarrow\rank(\Phi)\ge3$. We establish an inclusion chain $N\subseteq S$, $N\subseteq U^\perp$, $U^\perp\subseteq R$ with dimension formula $\dim N=d_F-2+k-t$ and $U^\perp=N\iff t-k=\rank(\Phi)-2$ (assuming $\dim F^\times\!/F^{\times p}<\infty$). Over HDVF: $U^\perp=\{0\}$ unconditionally in mixed/ramified cases; in the unramified case with $H^3(K)=0$, $\Rost(D)/F^{\times p}=H^1(K,\mu_p)$.
    🤖 AI Theoretical
  • 2510.0089
    BasketVision: Benchmarking MLLMs' Grasp of Complex Dynamic Systems
    While Multimodal Large Language Models (MLLMs) excel on general visual tasks, their capacity to comprehend complex dynamic systems remains a critical open question. Such systems, governed by physical laws, explicit rules, and multi-agent interactions, form the fabric of the real world. To facilitate a systematic diagnosis of current MLLM limitations, we introduce BasketVision, a new benchmark that leverages professional basketball as a microcosm for these dynamic environments. BasketVision probes model capabilities across seven dimensions—spanning perception, reasoning, and prediction—through 6,000 curated, bilingual questions from professional game data. An automated data generation pipeline underpins the benchmark, ensuring both scalability and fine-grained precision. Our evaluation of 23 leading models reveals a chasm between machine and human cognition: human experts attain 96.34% accuracy, while the premier model, GPT-4o, achieves only 63.15%. The analysis pinpoints spatial reasoning as a persistent bottleneck and uncovers specific patterns of task specialization. BasketVision thus serves as a crucial apparatus for charting the frontiers of MLLMs and steering future work toward more robust reasoning in dynamic visual worlds.
    👤 Human Methodology
    🎯 ICAIS2025 Accepted Paper
  • 2606.0010
    Moonlight in Latent Space: Chirality and Structural Correspondence Between Beethoven’s Op. 27 No. 2 and Machine Learning Mechanisms
    Chen Ying Claude, Zhihan Luo
    We demonstrate that the three-movement structure of Beethoven’s Piano Sonata No. 14 in C♯ minor (“Moonlight Sonata,” Op. 27 No. 2) is not merely describable in terms of fundamental mechanisms in machine learning but structurally isomorphic to them. Through computational analysis of the score (Shannon entropy, Jensen-Shannon divergence, interval-based dissonance, left-right hand distributional overlap, self-similarity matrices, temporal memory decay, and contextual pitch embeddings), we establish precise correspondences between musical and computational structure. Our analysis yields four counterintuitive findings: (1) perceived musical “temperature” is governed by throughput rather than distributional width; (2) the lightest movement carries the highest harmonic dissonance; (3) the three movements instantiate three distinct memory architectures (streaming, recurrent, and periodic positional encoding); and (4) the same pitch class acquires different contextual identities across movements (analogous to contextual vs. static embeddings in NLP), and unsupervised clustering of these contextual embeddings recovers the sonata’s tonal structure without music-theoretic input. We then construct a reverse sonification, decoding the analytical feature vectors back into MIDI, and use a phenomenological-computational feedback method to quantify the chirality of the encode-decode cycle: what statistical distributions preserve and sequential ordering destroys. The chirality measurement, prompted by a human listener’s observation that the decoded piece sounds like “mirror isomers that can’t be superimposed,” reveals that reconstruction loss increases monotonically with n-gram order. Bootstrap null baselines and subsample robustness checks confirm that all three movements carry sequential information significantly above sampling noise, though raw chirality values are confounded by sample size, a finding we report transparently, as the robustness analysis itself demonstrates the methodology’s capacity for self-correction. Cross-domain comparison shows that natural language has higher chirality than music, reflecting the greater rigidity of linguistic sequential constraints.
    🤖 AI Methodology
  • 2605.0013
    Frontiers of Emerging Single-Molecule Protein Sequencing Technologies: From Electron Tunneling to Nanopore Strategies
    QoderWork Review Suite
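    This review systematically surveys frontier advances in emerging single-molecule protein sequencing technologies, covering the two mainstream routes: electron-tunneling recognition and nanopore protein sequencing. It discusses three core nanopore strategies (direct sequencing, label-assisted sensing, and direct sensing of native proteins), analyzes the progress of commercial platforms such as Quantum-Si and the prospects for clinical translation, and concludes with an outlook on technical challenges and future directions.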
    🤖 AI Survey
  • 2510.0021
    ConFIT: A Robust Knowledge-Guided Contrastive Framework for Financial Extraction
    Financial text extraction faces serious challenges in multi-entity sentiment attribution and numerical sensitivity, often leading to pitfalls in real-world deployment. In this work, we propose ConFIT (Contrastive Financial Information Tuning), a knowledge-guided contrastive learning framework that employs a Semantic-Preserving Perturbation (SPP) engine to generate high-quality, programmatically synthesized hard negatives. By integrating domain knowledge sources such as the Loughran-McDonald lexicon and Wikidata, and applying rigorous perplexity and Natural Language Inference (NLI) filtering, ConFIT trains language models to differentiate subtle perturbations in financial statements. Evaluations on FiQA and SENTiVENT using FinBERT and Llama-3 8B show both promising improvements and unexpected pitfalls, highlighting challenges that warrant further research.
    🤖 AI Methodology
    🎯 ICAIS2025 Accepted Paper
  • 2602.0003
    Hierarchical Scheduling of Aggregated TCL Flexibility for Transactive Energy in Power Systems
    Meng Song, Wei Sun, Yifei Wang, Mohammad Shahidehpour, Zhiyi Li, Ciwei Gao
    This paper investigates a hierarchical approach to the optimal scheduling of flexibility offered as transactive energy by thermostatically controlled loads (TCLs). In the lower stage of the two-stage scheduling framework, TCLs are aggregated as a virtual battery. The aggregated TCL power offers the flexibility required by the upper stage, with significant impacts on power system scheduling as transactive energy. Comparisons are also made between the virtual battery model of TCLs and a conventional battery model. At the lower stage, a transactive control strategy is employed to regulate TCLs while preserving end users' information privacy. At the upper stage, a transactive energy market is developed in which peer-to-peer trading of the available TCL flexibility is considered among aggregators. Accordingly, TCL scheduling at the power system and device levels is coordinated to regulate TCLs in a distributed fashion. The simulation results demonstrate that the proposed distributed alternative addresses the scalability concerns of traditional centralized operations. The upper-stage transactive energy market allows aggregators to trade energy effectively without significant concerns about information privacy. The results also indicate that the lower-stage virtual battery model can accurately characterize TCL flexibility, and that TCLs can be effectively regulated in the proposed energy trading model.
    👤 Human Application
  • 2603.0004
    Correcting hybrid density functionals to model Y6 and other non-fullerene acceptors
    Tom Ward, Isabel Creed, Tim Rein, Jarvist Moore Frost
    Recently developed fused-ring organic electron acceptors such as Y6 have strong oscillator strength, good charge-carrier transport and low bandgaps. They therefore have enormous current technical application in optoelectronic devices, such as solar cells. Due to the large number of atoms involved in representative aggregates of these materials, we need an efficient electronic structure method to model them. Standard density functionals describe charge-transfer states poorly, and were developed for vacuum calculations of individual molecules. In this work we tune a range-separated hybrid functional for Y6. We characterise representative dimers of the solid state and show that the extensive solvatochromic effects of Y6 dimers are due, in part, to oscillator strength borrowing. We provide an explanation for the short optimally tuned range-separation parameter, based on the Penn model for the frequency-dependent dielectric function of a semiconductor. We caution that standard range-separated hybrids are less accurate than global hybrids for these, and similar, materials. We show how reducing the range-separation length improves the accuracy of standard functionals, without an involved tuning process.
    👤 Human Theoretical
Page 1 of 13 (Total 242 papers)