Papers
-
2510.0036
A Self-Driving Laboratory for Materials Science: An Autonomous Research Agent for Deep Data Analysis and Interpretation
As artificial intelligence increasingly permeates scientific research, the "AI for Science" paradigm is evolving to enable more autonomous scientific workflows. Traditional research processes rely heavily on researchers' expertise and manual operations, particularly in data analysis and interpretation, the critical "last mile" from raw data to profound insights. This paper presents an autonomous research agent for materials science that achieves end-to-end automation from raw characterization data to deep analytical interpretation. The system integrates four core innovations: (1) AI-driven automatic data understanding with unified ingestion of heterogeneous instrument data, (2) automated data analysis through an extensible algorithm library, (3) a one-click automated reporting system, and (4) interactive AI-powered data interpretation via natural language dialogue. We demonstrate the agent's capabilities through real-world case studies across multiple characterization techniques (Raman, UPS, UV-Vis, TG): UV-Vis bandgap analysis is accelerated by 600× compared to manual processing, while maintaining a fitting precision of R² ≥ 0.999. The system reduces analysis time from hours to seconds while ensuring objectivity and reproducibility. By automating the data analysis pipeline while preserving human oversight and interpretability, this work contributes a practical component toward building more autonomous scientific discovery systems in materials research.
-
2510.0035
MotivGraph-SoIQ: Integrating Motivational Knowledge Graphs and Socratic Dialogue for Enhanced LLM Ideation
Large Language Models (LLMs) hold substantial potential for accelerating academic ideation but face critical challenges in grounding ideas and mitigating confirmation bias for further refinement. We propose MotivGraph-SoIQ, which integrates motivational knowledge graphs and Socratic dialogue to address these limitations. This novel framework provides essential grounding and practical idea improvement steps for LLM ideation by integrating a Motivational Knowledge Graph (MotivGraph) with a Q-Driven Socratic Ideator. The MotivGraph structurally stores three key node types (problem, challenge, and solution) to offer motivation grounding for the LLM ideation process. The Ideator is a dual-agent system utilizing Socratic questioning, which facilitates a rigorous refinement process that mitigates confirmation bias and improves idea quality across novelty, experimental rigor, and motivational rationality dimensions. On the ICLR25 paper topics dataset, MotivGraph-SoIQ exhibits clear advantages over existing state-of-the-art approaches across LLM-based scoring, ELO ranking, and human evaluation metrics.
-
2510.0034
Cognitive-YOLO: LLM-Driven Architecture Synthesis from First Principles of Data for Object Detection
Designing high-performance object detection architectures is a complex task, where traditional manual design is time-consuming and labor-intensive, and Neural Architecture Search (NAS) is computationally prohibitive. While recent approaches using Large Language Models (LLMs) show promise, they often function as iterative optimizers within a search loop, rather than generating architectures directly from a holistic understanding of the data. To address this gap, we propose Cognitive-YOLO, a novel framework for LLM-driven architecture synthesis that generates network configurations directly from the intrinsic characteristics of the dataset. Our method consists of three stages: first, an analysis module extracts key meta-features (e.g., object scale distribution and scene density) from the target dataset; second, the LLM reasons upon these features, augmented with state-of-the-art components retrieved via Retrieval-Augmented Generation (RAG), to synthesize the architecture into a structured neural network description, which we term the Neural Architecture Description Language (NADL); finally, a compiler instantiates this description into a deployable model. Extensive experiments on five diverse object detection datasets demonstrate that our proposed Cognitive-YOLO consistently generates superior architectures, achieving state-of-the-art (SOTA) performance by outperforming strong baseline models across multiple benchmarks.
-
2510.0033
AI Transformation in Biomedical Research: From Data-Driven to Insight-Driven Approaches
This review examines the ongoing transformation of artificial intelligence applications in biomedical research, tracing the evolution from data-driven to insight-driven approaches. It synthesizes advances in AI-powered multimodal data integration techniques, including early, intermediate, late, and hybrid fusion strategies that effectively combine heterogeneous biomedical data sources. The review explores how network-based computational frameworks and single-cell technologies are revolutionizing disease mechanism analysis through multi-omics integration, enabling the identification of dysregulated pathways and potential therapeutic targets. It further evaluates AI's role in enabling precision medicine through personalized diagnostics, treatment selection, and radiomics-based healthcare. The integration of AI with various omics disciplines has enhanced understanding of disease mechanisms at molecular, cellular, and tissue levels, creating unprecedented opportunities for early diagnosis and targeted therapeutics. The review concludes by addressing critical challenges including model explainability and data privacy considerations, while highlighting the emergence of closed-loop AI systems that actively participate in scientific discovery through continuous learning and adaptation. These developments collectively signal a paradigm shift toward AI systems that not only analyze biomedical data but also generate actionable insights that advance clinical practice and scientific understanding.
-
2510.0032
Artificial Intelligence in Biomedical Research: From Data Integration to Precision Medicine
This comprehensive review examines the transformative role of artificial intelligence in biomedical research, from foundational data integration to clinical applications. The paper explores how AI techniques facilitate multimodal data fusion across diverse biological data types, employing both traditional statistical methods and advanced deep learning architectures including variational autoencoders, graph neural networks, and transformer models. It evaluates AI applications in medical imaging, where convolutional neural networks have achieved remarkable diagnostic accuracy (up to 94% in COVID-19 detection) while enhancing segmentation and classification tasks across multiple imaging modalities. The review further investigates generative AI's impact on molecular design and drug discovery, highlighting transformer-based architectures like TransAntivirus that navigate vast chemical spaces to optimize therapeutic candidates. Finally, it examines AI-enabled precision medicine applications, including Clinical Decision Support Systems and federated learning approaches that balance analytical power with privacy preservation. Despite significant progress, implementation challenges persist, including data heterogeneity, model explainability, and ethical concerns regarding bias and privacy. The paper underscores the importance of developing interpretable AI systems that integrate seamlessly into clinical workflows while addressing regulatory, ethical, and economic considerations to realize the full potential of AI in advancing biomedical research and healthcare delivery.
-
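2510.0031
Simulation, Influence, and Domestication: Ethical Risks and Regulatory Pathways of Audience Agents in News Communication
With the rapid development of generative artificial intelligence and agent technology, the field of journalism and communication is undergoing a paradigm shift from "content digitization" to "cognitive intelligentization." Audience agents are a new kind of digital entity able to simulate, predict, and even partially replace the cognition and behavior of human audiences. Their deep embedding in news production, distribution, and feedback improves communication efficiency but also raises complex ethical challenges. Drawing on recent empirical data, including 2025 Stanford University research on AI behavior and evaluation reports on Chinese large AI models, this paper systematically examines the ethical risks arising from the use of audience agents in news communication and constructs corresponding regulatory pathways. The study finds that the ethical risks concentrate on three levels. At the simulation level, there are risks of "digital twin" distortion, attribution paradox, and trust deficit. At the influence level, commercial value erodes public-interest attributes, and improper human-machine collaboration leads to value drift. At the domestication level, technological dependence dissolves subjectivity, and lagging rules leave a governance vacuum. In response to these risks, the paper draws on dynamic capabilities theory to propose a multidimensional governance framework centered on "sensing-seizing-reconfiguring," offering a solution of both theoretical and practical value for the steady transformation of new mainstream media in the intelligent era.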
-
2510.0030
Latent-Diffusion Guided Cross-View Alignment for Heterogeneous Graph Recommendation
Recommender systems operating on heterogeneous, multi-relational graphs contend with noise and incompleteness in auxiliary signals, which can destabilize learning and degrade ranking performance when targeting robust representations. Naive cross-view training risks propagating noise across views, and existing contrastive or augmentation-based schemes often hinge on design choices and can struggle to scale to large, complex graphs. We propose a latent-diffusion guided cross-view alignment framework for heterogeneous graph recommendation that jointly learns a relation-aware heterogeneous GNN encoder, producing paired target and auxiliary embeddings, and a compact, time-conditioned latent-space denoiser that maps noisy auxiliary latents toward target-view semantics. The denoiser provides principled supervision to disentangle structured noise, with its residual outputs fused into target embeddings to refine ranking-relevant representations. Training optimizes a joint denoising objective and a ranking objective, enabling scalable, robust cross-view alignment without ad-hoc augmentations. Empirical results on implicit-feedback data demonstrate improved robustness and ranking accuracy under noisy auxiliary signals, with flexible gradient-flow and fusion strategies supporting stable end-to-end training on large graphs. Ablations highlight the benefits of explicit noise modeling in auxiliary views, diffusion-based supervision for stability, and scalable, relation-aware encoding of practical significance for recommender systems.
-
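2510.0029
Is AI Conscious? A Multi-Level Evaluation Framework for AI Consciousness
This paper explores the frontier question of whether AI is conscious. It establishes an evaluation system, collects and organizes the latest research results, and scores the level of AI consciousness. Based on a combined analysis along three dimensions (philosophy, neuroscience, and psychology), the results show that the overall level of support for consciousness in current AI is approximately 43.84%. Charts of the results are available at acw.gixia.org.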
-
2510.0028
Estimating Rural Rooftop Solar Potential Using Semantic Segmentation and Multi-Source Data
Solar energy, as a clean and renewable resource, has gained significant global attention. In contrast to urban areas, where buildings vary in height and are often obstructed, the relatively flat rural buildings in northern China provide optimal conditions for solar panel installation. Consequently, the solar energy potential of northern rural areas has attracted significant attention from researchers. Traditional studies typically rely on solar radiation simulation software and 3D models to estimate solar radiation and the solar energy potential of buildings. However, the lack of comprehensive and accurate 3D building model data for rural areas in China has significantly hindered progress in this field. To address this limitation, this study proposes a novel method for rapidly estimating the solar energy potential of rural buildings by integrating deep learning algorithms with parametric modeling platforms. Using convolutional neural networks (CNNs), the proposed method efficiently and accurately extracts building footprints from complex satellite imagery. These footprints are then imported into the Grasshopper parametric platform to generate and optimize vector outlines of buildings. By combining these outlines with digital surface model (DSM) data containing building height information, the study constructs precise 3D building models. Furthermore, GPU-accelerated solar simulation software, Vitality 2.0, is used for rapid solar energy potential estimation. The study conducted building roof extraction based on satellite imagery for 31 villages in Tianjin and generated parametric three-dimensional village models. Through simulation, the research found that due to the relatively low height of village buildings and the absence of mutual shading between buildings, the larger the village scale, the greater the roof area, and consequently, the higher the photovoltaic power generation capacity of the village. The study also revealed that metal roofs, which have better heat dissipation, result in higher photovoltaic panel conversion efficiency. Therefore, compared to villages with roofs primarily made of concrete and ceramic tiles, villages dominated by metal roofs can recoup all the costs of photovoltaic panels in a shorter period.
-
2510.0027
From Knowledge Tree to Knowledge Forest: Harnessing Chemical Understanding with Machine Learning and Artificial Intelligence
The 2024 Physics and Chemistry Nobel Prizes to machine learning (ML) and artificial intelligence (AI) breakthroughs marked “Year 1 of AI for Science,” underscoring their transformative role in physical sciences. Yet data are not the same as understanding, a distinction central to chemistry, which has long relied on concepts such as bond, aromaticity, and reactivity as scaffolds for understanding and explanation. Building on our recent perspectives (ACS Phys. Chem. Au 2024, 4, 135–142; J. Chem. Theory Comput. 2025, DOI: 10.1021/acs.jctc.5c01299), this article explores how ML/AI can become engines of chemical understanding. We introduce a quintet of chemical knowledge (ontology, epistemology, theory, concept, and understanding) and develop the metaphors of the Knowledge Tree and Knowledge Forest to show how diverse epistemologies interact and recursively enrich one another. Case studies on aromaticity, catalysis, orbital-free density functional theory, and protein folding illustrate how ML features, when interpreted as conceptual roots, yield fruits of understanding. Contrasting multiscale modeling with hierarchical modeling, we argue that ML enables emergent, concept-driven integration across levels. Cultivating this plural and hierarchical ecosystem may guide theoretical chemistry toward its next breakthroughs, resolving Dirac’s dilemma not by brute force but by forests of concepts that transform data into enduring understanding.
-
2510.0026
Geometry-Aware Optimal Flow Matching via Convex Potentials
Generative modeling under quadratic optimal transport (OT) aims to learn deterministic maps that push mass from a simple source distribution \(p_0\) to a target distribution \(p_1\) along the Wasserstein-2 (W2) geodesics. While flow-based models and neural differential equations offer flexible transports, existing approaches typically rely on multi-step integration and yield trajectories whose curvature deviates from W2 geodesics, reducing efficiency, interpretability, and stability. We propose a geometry-aware framework that parameterizes time-dependent velocity fields as gradients of convex potentials modeled by Input Convex Neural Networks (ICNNs). This convex-potential representation guarantees transport along straight lines, exactly matching the W2 map under quadratic cost. Training uses a Flow Matching objective tailored to the convex setting, with explicit gradient computations and a dedicated inversion subproblem to recover preimages under the convex-potential flow; an optional amortization network provides favorable initializations for the inversion and accelerates optimization. The method is agnostic to the specific transport plan and can condition on arbitrary couplings between \(p_0\) and \(p_1\). Empirically, the approach yields geometry-faithful transports along W2 geodesics, enabling fast sampling with one-step or few-step updates and controlled curvature. Diagnostics on representative datasets confirm geometric fidelity and trainability, and we discuss initialization and transport-plan considerations for scalable, stable generative modeling under quadratic OT.
-
2510.0025
Beyond Essence: HUMN-DEF’s Seven-Axis Map of Scholarly Definitions of “the Human”
Definitions of the human span biology, psychology, anthropology, law, and philosophy, resisting reduction to a single trait. This study introduces HUMN-DEF, a multiaxial framework that models seven definitional axes and represents texts as Definition Profile Vectors (DPVs). The axes are Taxonomic/Evolutionary (A1), Genetic/Developmental (A2), Cognitive/Linguistic (A3), Physiological/Regulatory (A4), Sociocultural/Anthropological (A5), Legal/Normative (A6), and Phenomenological/Subjective (A7). A purposive cross-disciplinary corpus (n = 31) was coded by two independent automated procedures (Krippendorff’s α = .84), analyzed with post-stratification weights (field × decade × language), and evaluated via percentile bootstraps. Results converge on Sociocultural (A5) and Cognitive/Linguistic (A3) as predominant emphases; Taxonomy/Genetics (A1/A2) anchor but are not sufficient; Legal/Normative (A6) rises under balanced representation; Phenomenology (A7) is mid-level; Physiology (A4) is specialized. Cross-field disagreement, measured with a Definitional Diversity Index (Jensen–Shannon divergence), is moderate (0.394; 95% CI ≈ [0.345, 0.475]). We argue that “human” is best treated as a transparent, context-weighted mixture over A1–A7.
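The abstract above measures cross-field disagreement with a Jensen–Shannon divergence. As a rough illustration of that quantity only, here is a minimal sketch assuming base-2 logarithms and two profile vectors already normalized to sum to 1. The function names and the example vectors are hypothetical and are not taken from the paper, which may aggregate across fields differently.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q from their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical seven-axis profiles (A1..A7) for two fields, each summing to 1.
field_a = [0.10, 0.10, 0.30, 0.05, 0.30, 0.05, 0.10]
field_b = [0.05, 0.05, 0.15, 0.05, 0.25, 0.35, 0.10]
divergence = js_divergence(field_a, field_b)
```

With base-2 logarithms the value lies between 0 (identical profiles) and 1 (profiles with no overlapping mass), which makes a figure such as 0.394 readable as a fraction of the maximum possible disagreement under that convention.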
-
2510.0024
LECTOR: LLM-Enhanced Concept-based Test-Oriented Repetition
Spaced repetition systems are fundamental to efficient learning and memory retention, but existing algorithms often struggle with semantic interference and personalized adaptation. We present LECTOR (LLM-Enhanced Concept-based Test-Oriented Repetition), a novel adaptive scheduling algorithm specifically designed for test-oriented learning scenarios, particularly language examinations where success rate is paramount. LECTOR leverages large language models for semantic analysis while incorporating personalized learning profiles, addressing the critical challenge of semantic confusion in vocabulary learning by utilizing LLM-powered semantic similarity assessment and integrating it with established spaced repetition principles. Our comprehensive evaluation against six baseline algorithms (SSP-MMC, SM2, HLR, FSRS, ANKI, THRESHOLD) across 100 simulated learners over 100 days demonstrates significant improvements: LECTOR achieves a 90.2% success rate compared to 88.4% for the best baseline (SSP-MMC), representing a 2.0% relative improvement. The algorithm shows particular strength in handling semantically similar concepts, reducing confusion-induced errors while maintaining computational efficiency. Our results establish LECTOR as a promising direction for intelligent tutoring systems and adaptive learning platforms.
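The reported success rates can be read either as an absolute gain in percentage points or as a relative gain over the baseline. A short worked check of the arithmetic, using only the two figures quoted in the abstract (the helper name is hypothetical):

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, expressed in percent."""
    return (new - baseline) / baseline * 100.0

lector, best_baseline = 90.2, 88.4  # success rates in percent, from the abstract
absolute_gain = lector - best_baseline              # percentage points
relative_gain = relative_improvement(lector, best_baseline)
print(f"{absolute_gain:.1f} points, {relative_gain:.1f}% relative")
# prints "1.8 points, 2.0% relative"
```

The 2.0% figure is therefore the relative gain; the absolute difference between the two success rates is 1.8 percentage points.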
-
2510.0023ViewRobust Zero-Shot NER for Crises via Iterative Knowledge Distillation and Confidence-Gated InductionThis research presents a comprehensive diagnostic study of confidence-gated iterative induction for zero-shot Named Entity Recognition (NER) in crisis scenarios. While existing approaches struggle to adapt to novel disaster lexicons without manually curated resources, we investigate whether iterative knowledge distillation can overcome these limitations. Our framework leverages a pretrained language model to extract high-recall entity candidates, then iteratively distills domain knowledge through a self-correcting loop that uses high-confidence seeds to induce micro-gazetteers and syntactic rules. Comprehensive evaluations on synthetic crisis data reveal that the framework maintains a constant zero-shot F1-score of approximately 0.295 across all experimental configurations, demonstrating that the iterative mechanism provides no measurable improvement over baseline approaches. This negative result offers valuable diagnostic insights into the fundamental challenges of adaptive NER in dynamic crisis domains, including confidence threshold calibration difficulties, clustering algorithm limitations, and error propagation risks. The findings provide a cautionary tale for researchers working on adaptive NER systems and establish a foundation for future research on more robust zero-shot approaches in crisis scenarios.
-
2510.0022
Adaptive Log Anomaly Detection through Data-Centric Drift Characterization and Policy-Driven Lifelong Learning
Log-based anomaly detectors degrade over time due to concept drift arising from software updates or workload changes. Existing systems typically react by retraining entire models, leading to catastrophic forgetting and inefficiencies. We propose an adaptive framework that first classifies drift in log data into semantic (frequency shifts within known templates) and syntactic (emergence of new log templates) categories via statistical tests and novelty detection. Based on the identified drift type, a policy-driven lifelong learning manager applies targeted updates: experience replay to mitigate forgetting under semantic drift and dynamic model expansion to accommodate syntactic drift. This approach is validated on semi-synthetic logs and real-world longitudinal datasets (HDFS, Apache, and BGL), maintaining high F1-scores, reducing computational overhead, and preserving historical knowledge compared to monolithic retraining.
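The semantic/syntactic split described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes logs are already parsed into template IDs, uses the share of unseen templates as a stand-in for novelty detection, and uses a Pearson chi-square statistic as the statistical test. The function name, the thresholds, and the returned policy labels are all hypothetical.

```python
from collections import Counter

def classify_drift(reference, current, novelty_threshold=0.05, critical_value=9.49):
    """Label a window of template IDs as 'syntactic', 'semantic', or 'none'.

    novelty_threshold and critical_value are illustrative defaults; 9.49 is
    roughly the 5% chi-square critical value at 4 degrees of freedom.
    """
    if not reference or not current:
        raise ValueError("both windows must be non-empty")
    ref_counts, cur_counts = Counter(reference), Counter(current)

    # Syntactic drift: share of current events whose template was never seen.
    unseen = sum(n for t, n in cur_counts.items() if t not in ref_counts)
    if unseen / len(current) > novelty_threshold:
        return "syntactic"

    # Semantic drift: frequency shift among known templates (Pearson chi-square).
    known_total = len(current) - unseen
    chi2 = 0.0
    for template, ref_n in ref_counts.items():
        expected = known_total * ref_n / len(reference)
        if expected > 0:
            chi2 += (cur_counts.get(template, 0) - expected) ** 2 / expected
    return "semantic" if chi2 > critical_value else "none"

# Hypothetical mapping from drift type to the update named in the abstract.
POLICY = {"semantic": "experience_replay", "syntactic": "model_expansion", "none": "no_update"}

baseline = ["A"] * 50 + ["B"] * 50
shifted = ["A"] * 90 + ["B"] * 10       # same templates, different frequencies
new_template = ["A"] * 50 + ["C"] * 50  # template "C" never seen before
```

Checking novelty before frequency keeps the two drift types separate: a window dominated by new templates would otherwise also register as a large frequency shift.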
-
2510.0021
ConFIT: A Robust Knowledge-Guided Contrastive Framework for Financial Extraction
Financial text extraction faces serious challenges in multi-entity sentiment attribution and numerical sensitivity, often leading to pitfalls in real-world deployment. In this work, we propose ConFIT (Contrastive Financial Information Tuning), a knowledge-guided contrastive learning framework that employs a Semantic-Preserving Perturbation (SPP) engine to generate high-quality, programmatically synthesized hard negatives. By integrating domain knowledge sources such as the Loughran-McDonald lexicon and Wikidata, and applying rigorous perplexity and Natural Language Inference (NLI) filtering, ConFIT trains language models to differentiate subtle perturbations in financial statements. Evaluations on FiQA and SENTiVENT using FinBERT and Llama-3 8B show both promising improvements and unexpected pitfalls, highlighting challenges that warrant further research.
-
2510.0020
Hierarchical Change Signature Analysis: A Framework for Online Discrimination of Incipient Faults and Benign Drifts in Industrial Time Series
Industrial fault detection systems often struggle to distinguish benign operational drifts (e.g., tool wear, recipe changes) from incipient faults, frequently adapting to faults as new "normal" states and risking catastrophic failures. This work proposes a hierarchical framework that decouples change detection from change characterization. When a drift is detected, the system generates a Multi-Scale Change Signature (MSCS) that quantifies geometric and statistical transformations in the primary detector's latent space. An unsupervised Drift Characterization Module (DCM), trained on an Online Normality Baseline (ONB), classifies each signature as benign or potentially faulty. Benign drifts are ignored, while potential faults are flagged for review; confirmed benign drifts are incorporated into the ONB for future adaptation. The framework is model-agnostic, computationally efficient, and scalable through a tiered human-in-the-loop mechanism. Experiments on the Tennessee Eastman Process dataset with injected drifts and faults demonstrate high fault detection rates, fewer false alarms, and efficient adaptation to benign changes.
-
2510.0019
Hierarchical Adaptive Normalization: A Placement-Conditioned Cascade for Robust Wearable Activity Recognition
Wearable Human Activity Recognition (HAR) systems face significant performance degradation when sensors are placed at different body locations or orientations. We introduce a hierarchical adaptive normalization method that addresses these challenges through a two-stage cascade. The first stage combines gravity-based orientation correction with placement context inference using signal variance analysis, while a novel stability gate prevents harmful adaptation during unstable periods. The second stage employs placement-conditioned adaptive Batch Normalization to refine feature representations in real time. Comprehensive evaluations on public and custom datasets show that our method achieves 0.847±0.023 macro F1-score, outperforming static baselines by 36% and state-of-the-art unsupervised domain adaptation methods by 13.7%. The approach maintains real-time performance with only 2.3 ms inference time and 45.2 MB memory usage, demonstrating practical viability for on-device deployment in dynamic real-world scenarios.
-
2510.0018
Adaptive Evidential Meta-Learning with Hyper-Conditioned Priors for Calibrated ECG Personalisation
This research addresses a fundamental gap in uncertainty calibration during electrocardiogram (ECG) model personalisation. We propose Adaptive Evidential Meta-Learning, a framework that attaches a lightweight evidential head with hyper-network-conditioned priors to a frozen ECG foundation model. The hyper-network dynamically sets the evidential prior using robust, class-conditional statistics computed from a few patient-specific ECG samples. Trained via a two-stage meta-curriculum, our approach enables rapid adaptation with well-calibrated uncertainty estimates, making it highly applicable for real-world clinical deployment where both prediction accuracy and uncertainty awareness are crucial.
-
2510.0017
EREA: Enhanced Research Exploration and Analysis
The increasing volume of scientific publications poses challenges for researchers in efficiently identifying relevant literature, synthesizing research trends, and exploring emerging ideas. Manual search and analysis processes are time-consuming and often insufficient for capturing complex citation relationships. This project presents an open-source Python-based system, EREA (Enhanced Research Exploration and Analysis), that integrates generative artificial intelligence, automated information retrieval, semantic vector search, and citation-based visualization to support enhanced research exploration. User-defined queries are processed to extract structured keywords, retrieve scholarly articles from Google Scholar, and supplement metadata using OpenAlex. Retrieved data are structured, embedded in a vector database for semantic retrieval, and visualized through interactive, offline HTML graphs. A research report is generated through large language model-assisted synthesis. Developed according to the FAIR (Findability, Accessibility, Interoperability, and Reusability) Data Principles, the system accelerates research exploration, provides structured thematic insights, facilitates understanding through visual citation networks, and supports the identification of research gaps and future directions.