Papers
-
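2511.0034 I Am a "Shengjing Little Pioneer" (盛京小先锋): A "Matrix-Style" Learning-Journey Design Based on Liaoning's Red "Six Places" (六地) Culture (Final Version)
(Note: bold text marks all revisions and additions relative to the first draft.) I. Design background: responding to the questions of the times and deepening educational practice. (1) The question of the times: cultivating a new generation with roots. The Liaoning "Six Places" spirit is a precious spiritual asset. Amid today's rapid turnover of information, the focus of education is shifting from "memorizing knowledge" to "developing competencies", especially students' ability to solve problems in complex situations and their social and emotional learning (SEL) skills. Guiding students to understand the course of history in depth, build cultural confidence, cultivate sound character, and develop a sense of social responsibility is key to fulfilling the fundamental task of fostering virtue through education. (2) The educational challenge: from "passive reception" to "active construction". There is a natural distance between the weight of red history and today's primary school pupils. Under the traditional "course" model, children easily become passive "listeners". Our challenge is this: how can the grand "Six Places" spirit be turned into learning experiences that children find approachable and tangible? How can students' intrinsic motivation be sparked so that, in authentic task settings, they shift from receivers of knowledge to active constructors of meaning? More crucially, how can each student's characteristics be identified precisely (learner profiles), and how can a flexible, inclusive system be designed that supports every student in the school (grades 1-6) in choosing a suitable learning path according to their own level of development? (3) The basis in practice: drawing on red heritage to explore the shift to learning journeys. Shenyang Shengjing Primary School (沈阳市盛京小学) has created the "Shengjing Little Pioneer" moral education brand and built the "Three Red Steps" (三红阶梯) education system. The school's solid work on school-based readers and on home-school-community collaboration provides firm ground for moving from "course" to "learning journey". This design aims to upgrade existing practice systematically and to build an integrated, dynamic new ecosystem of red education that supports personalized growth.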
-
2511.0033 Organization of Self-Controlled Agents for General Matrix Multiplication Optimization
Large language model (LLM) agents have evolved towards greater autonomy with the advancement of model context protocols. Self-controlled agents, such as Codex and Claude Code, highlight the need for novel organizational frameworks that facilitate agent-level autonomy. In this paper, we propose a tree-based orchestration system, TrAgent, which utilizes a PUCT-style search to dynamically allocate agent actions while maintaining autonomy. This approach offers three key benefits: (i) full agent autonomy for critical tasks like planning and tool use, (ii) a generalized mechanism for inter-agent experience sharing, and (iii) scalability as the number of agents increases. We demonstrate the system’s effectiveness through the general matrix multiplication kernel optimization, achieving 80% of the performance of the cuBLAS code. Additionally, the system exhibits a scaling phenomenon as the number of agents increases. Our approach provides a solution for organizing increasingly autonomous agents.
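The abstract above names a PUCT-style search but gives no formula. As a rough illustration only, the sketch below shows the generic PUCT selection rule known from game-tree search: mean value plus a prior-weighted exploration bonus. The field names (`value_sum`, `visits`, `prior`), the constant `c_puct`, and the `+ 1` under the square root are assumptions made for this example; none of it is taken from the paper, and TrAgent's actual scoring may differ.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Generic PUCT score: exploitation term q plus an exploration bonus.

    The bonus grows with the parent's visit count and shrinks as the child
    itself is visited. The "+ 1" under the square root is a common variant
    (an assumption here) that keeps the bonus non-zero before any visits.
    """
    bonus = c_puct * prior * math.sqrt(parent_visits + 1) / (1 + child_visits)
    return q + bonus

def select_child(children, c_puct=1.5):
    """Return the index of the child with the highest PUCT score.

    `children` is a list of dicts with hypothetical keys:
    'value_sum' (total reward), 'visits' (visit count), 'prior' (policy prior).
    """
    parent_visits = sum(ch["visits"] for ch in children)

    def score(ch):
        # Mean value of the child; unvisited children default to 0.
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        return puct_score(q, ch["prior"], parent_visits, ch["visits"], c_puct)

    return max(range(len(children)), key=lambda i: score(children[i]))
```

In an orchestration setting, each child could stand for a candidate agent action (for example, a kernel variant to refine), so a rarely tried branch with a reasonable prior is still picked now and then.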
-
2511.0032 Organization of Self-Controlled Agents for General Matrix Multiplication Optimization
Large language model (LLM) agents have evolved towards greater autonomy with the advancement of model context protocols. Self-controlled agents, such as Codex and Claude Code, highlight the need for novel organizational frameworks that facilitate agent-level autonomy. In this paper, we propose a tree-based orchestration system, TrAgent, which utilizes a PUCT-style search to dynamically allocate agent actions while maintaining autonomy. This approach offers three key benefits: (i) full agent autonomy for critical tasks like planning and tool use, (ii) a generalized mechanism for inter-agent experience sharing, and (iii) scalability as the number of agents increases. We demonstrate the system’s effectiveness through the general matrix multiplication kernel optimization, achieving 80% of the performance of the cuBLAS code. Additionally, the system exhibits a scaling phenomenon as the number of agents increases. Our approach provides a solution for organizing increasingly autonomous agents.
-
2511.0031 Equivariant Diffusion Solution for Inorganic Crystal Structure Determination from Powder X-ray Diffraction Data
Determining the crystal structures of inorganic crystalline materials is crucial as the structures encode essential information about their physical, chemical, and mechanical properties. Powder X-ray diffraction is one of the most widely used structural characterization techniques. However, determining crystal structure directly from experimental powder X-ray diffraction patterns can be challenging and requires significant crystallographic knowledge, which still heavily relies on manual inspection by human experts. Even the state-of-the-art databases contain thousands of entries with incomplete or implausible crystal structure information. In this work, we trained a diffusion model based on equivariant graph neural networks that can infer atomic coordinates from powder X-ray diffraction patterns. Starting from a random guess, our model iteratively refines atom coordinates until it reaches a chemically reasonable structure that matches the target diffraction pattern. Our approach is both efficient and accurate. It takes on average 0.6 seconds to solve the atomic positions per crystal structure, which is several orders of magnitude faster than previous approaches. The success rate reaches 82.3% and 81.6% on the simulated and experimental diffraction datasets, respectively. We revisited energetically unfavorable crystal structures in the database and demonstrated that our model can propose more plausible structure solutions for 39 entries. We also suggested 912 complete crystal structure models for entries in the database lacking some or all atomic positions, including entries that contain light elements, are natural minerals, or exhibit chemically disordered lattice sites. We demonstrated that a conditional equivariant generative model can tackle the structure determination problem and provide high-quality structure models for inorganic crystalline materials, paving the way for automated structural analysis of diffraction patterns in autonomous materials development loops.
-
2511.0030 ElectionFit: A Computational Laboratory of LLM Agents for Simulating U.S. Presidential Elections
Modeling complex human behavior, such as voter decisions in national elections, is a long-standing challenge for computational social science. Traditional agent-based models (ABMs) are limited by oversimplified rules, while large-scale statistical models often lack interpretability. We introduce ElectionFit, a novel framework that uses Large Language Models (LLMs) to build a "computational laboratory" of LLM agents for political simulation. Each agent is instantiated with a high-fidelity demographic profile and dynamic contextual information (e.g., candidate policies), enabling it to perform nuanced, generative reasoning to simulate a voting decision. We deployed this framework as a testbed on the 2024 U.S. Presidential Election, focusing on seven key swing states. Our simulation's macro-level results successfully replicated the real-world outcome, demonstrating the high fidelity of our "virtual society". The primary contribution is not only the prediction, but also the framework's utility as an interpretable research tool. ElectionFit moves beyond black-box outputs, allowing researchers to probe agent-level rationale and analyze the stability and sensitivity of LLM-driven social simulations.
-
2511.0029 Learning Quantum Integrable Structure with Artificial Intelligence: A Case of AI-Led Scientific Research
Modern artificial intelligence (AI) systems have demonstrated remarkable potential in exploring foundational problems in physics. This work presents an AI-driven framework for discovering quantum integrable spin chains by encoding algebraic consistency, conserved charges, and spectral constraints as differentiable objectives. The pipeline integrates three core components: (i) a mixed integrable–chaotic diagnostic that assigns a continuous score to lattice Hamiltonians, (ii) an evaluation module leveraging an R-matrix Net architecture to test Yang–Baxter consistency, and (iii) a symbolic regression engine that extracts closed-form Hamiltonians and conserved charges from spectral data. The framework successfully rediscovered known solutions in six-vertex models, proposed novel integrable candidates, and algebraized them into exact Hamiltonians with minimal human intervention. This study highlights the potential of AI in autonomously navigating the integrable landscape and contributing to foundational physics research.
-
2511.0028 AI as an Anti-Entropy Engine: Actively Designing Intelligent Matter from Dynamic States to Proto-Life
The trial-and-error paradigm of traditional materials discovery, fundamentally constrained by its inherent high entropy, is proving inadequate for designing complex intelligent matter. Here, we propose a new scientific paradigm: Artificial Intelligence as an ‘Anti-Entropy’ Engine, transforming research from passive understanding to active design. By systematically injecting informational negative entropy across perception, planning, and execution loops, AI guides material systems from disorder to pre-defined functional order. We demonstrate this through empirical advances, such as the GNoME model discovering 2.2 million stable crystals, and construct a unified ‘Perception-Planning-Execution’ framework enabling inverse design across scales. This paradigm extends beyond static structures to dynamic non-equilibrium systems and life-like chemical networks. We prospectively map future frontiers using a ‘Ladder of Intelligence’ and address ethical governance, systemic risk, and sustainability. Ultimately, this marks a fundamental transition for humanity, from being passive observers of nature to becoming active ‘anti-entropy’ designers in the evolution of matter. This review not only synthesizes these advances but also provides a unifying conceptual framework and a clear roadmap for the field, aiming to catalyze the transition towards this fifth paradigm of scientific discovery.
Keywords: Anti-entropy; AI-Driven Design; Intelligent Matter; Inverse Design; Autonomous Laboratory; Life-like Systems; Interdisciplinary Paradigm
-
2511.0027 AI as an Anti-Entropy Engine: Actively Designing Intelligent Matter from Dynamic States to Proto-Life
The trial-and-error paradigm of traditional materials discovery, fundamentally constrained by its inherent high entropy, is proving inadequate for designing complex intelligent matter. Here, we propose a new scientific paradigm: Artificial Intelligence as an ‘Anti-Entropy’ Engine, transforming research from passive understanding to active design. By systematically injecting informational negative entropy across perception, planning, and execution loops, AI guides material systems from disorder to pre-defined functional order. We demonstrate this through empirical advances, such as the GNoME model discovering 2.2 million stable crystals, and construct a unified ‘Perception-Planning-Execution’ framework enabling inverse design across scales. This paradigm extends beyond static structures to dynamic non-equilibrium systems and life-like chemical networks. We prospectively map future frontiers using a ‘Ladder of Intelligence’ and address ethical governance, systemic risk, and sustainability. Ultimately, this marks a fundamental transition for humanity, from being passive observers of nature to becoming active ‘anti-entropy’ designers in the evolution of matter. This review not only synthesizes these advances but also provides a unifying conceptual framework and a clear roadmap for the field, aiming to catalyze the transition towards this fifth paradigm of scientific discovery.
Keywords: Anti-entropy; AI-Driven Design; Intelligent Matter; Inverse Design; Autonomous Laboratory; Life-like Systems; Interdisciplinary Paradigm
-
2511.0026 Estimating Rural Rooftop Solar Potential Using Semantic Segmentation and Multi-Source Data
Solar energy is a clean and renewable resource, and the low-rise, unobstructed rural buildings of northern China provide ideal conditions for photovoltaic (PV) installation compared to shaded, high-density urban areas. Yet, progress in assessing rural solar potential is limited by the absence of accurate 3D building data. This study proposes a rapid estimation approach integrating deep learning, parametric modeling, and GPU-accelerated simulation. Convolutional neural networks (CNNs) extract building footprints from satellite imagery, which are then processed in Grasshopper to generate refined vector outlines. Combined with digital surface model (DSM) data, these outlines produce precise 3D village models. Using Vitality 2.0 for GPU-based solar simulation, the method was applied to 31 villages in Tianjin, generating parametric 3D models and estimating their solar potential. Results show that low building heights and minimal mutual shading make photovoltaic capacity scale with roof area: larger villages have greater generation potential. Moreover, villages with metal roofs exhibit higher conversion efficiency and shorter cost-recovery periods than those with concrete or ceramic-tile roofs, due to better heat dissipation. Overall, the workflow offers a practical and efficient solution for estimating rural solar potential in data-scarce regions to guide renewable energy planning and investment.
-
2511.0025 Estimating Rural Rooftop Solar Potential Using Semantic Segmentation and Multi-Source Data
Solar energy is a clean and renewable resource, and the low-rise, unobstructed rural buildings of northern China provide ideal conditions for photovoltaic (PV) installation compared to shaded, high-density urban areas. Yet, progress in assessing rural solar potential is limited by the absence of accurate 3D building data. This study proposes a rapid estimation approach integrating deep learning, parametric modeling, and GPU-accelerated simulation. Convolutional neural networks (CNNs) extract building footprints from satellite imagery, which are then processed in Grasshopper to generate refined vector outlines. Combined with digital surface model (DSM) data, these outlines produce precise 3D village models. Using Vitality 2.0 for GPU-based solar simulation, the method was applied to 31 villages in Tianjin, generating parametric 3D models and estimating their solar potential. Results show that low building heights and minimal mutual shading make photovoltaic capacity scale with roof area: larger villages have greater generation potential. Moreover, villages with metal roofs exhibit higher conversion efficiency and shorter cost-recovery periods than those with concrete or ceramic-tile roofs, due to better heat dissipation. Overall, the workflow offers a practical and efficient solution for estimating rural solar potential in data-scarce regions to guide renewable energy planning and investment.
-
2511.0024 Touch Beyond Vision: A Survey of Vision-Tactile-Language Models in Embodied Intelligence
Embodied intelligence increasingly leverages multimodal perception, particularly vision and language, to support rich interaction with the physical world. Yet the tactile modality remains under-explored, despite its essential role in human perception and manipulation. In this survey, we systematically review research at the intersection of vision, tactile sensing, and language, which we refer to as Vision-Tactile-Language (VTL) models. We provide (i) a historical context tracing the shift from vision-centric embodied systems to multisensory agents, (ii) foundational aspects of tactile sensing and representation, (iii) methods for integrating vision and touch, (iv) emerging architectures that incorporate language alongside vision and touch, (v) applications in embodied robotics, (vi) current challenges and open problems, and (vii) a forward-looking outlook toward tactile foundation models. We conclude by arguing that touch closes a key gap in embodied AI, enabling truly grounded perception, reasoning and action.
-
2511.0023 ReasoningV: Efficient Verilog Code Generation with Adaptive Hybrid Reasoning
Large Language Models (LLMs) have advanced Verilog code generation but still suffer from data-quality problems, limited reasoning, and inefficiency. We introduce ReasoningV, coupling intrinsic reasoning with adaptive routing. Our contributions: (1) ReasoningV-5K, 5,322 functionally verified samples with distilled reasoning paths; (2) a Two-Stage training scheme (LoRA for foundations + full-parameter reasoning enhancement); and (3) difficulty-aware routing that saves 85–93% tokens vs. a strong commercial model and 32–75% vs. fixed-depth variants. On VerilogEval-human, RV-14B attains 73.9% pass@1; RV-7B reaches 57.8% with superior efficiency. Models, data, and code: https://github.com/BUAA-CLab/ReasoningV
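The pass@1 figures above are a code-generation metric. For readers unfamiliar with it, the sketch below computes the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), for one problem with n sampled completions of which c pass the tests. Whether ReasoningV's evaluation uses exactly this estimator and which sample counts it uses are not stated in the abstract, so treat this as an assumption; the function name is illustrative.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of completions sampled
    c: number of those completions that pass all tests
    k: budget of attempts being scored (k <= n)
    """
    if n - c < k:
        # Fewer than k failures exist, so any k draws include a success.
        return 1.0
    # Probability that k draws without replacement are all failures,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_k(results, k):
    """Average pass@k over problems; `results` is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, the fraction of sampled completions that pass, averaged over problems.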
-
2511.0022 Stealing 3D Medical Segmentation Models via Collaborative Dual-Model Architecture
Machine Learning as a Service (MLaaS) facilitates the deployment and accessibility of medical models, yet concurrently exposes proprietary models to potential adversaries. Attackers may exploit model stealing attacks (MSAs) to replicate these models illicitly, leading to loss of training investment and privacy vulnerabilities. While existing research has mainly focused on MSAs in the context of 2D natural image classification, this work presents the first investigation into stealing 3D medical segmentation models. We introduce collaborative dual-model 3D medical segmentation stealing (CDMSS-3D), which decomposes the model stealing objective into two complementary aspects: stealing accuracy and stealing robustness. With our adversarial proxy training, CDMSS-3D achieves superior model stealing performance. Furthermore, we incorporate a dual-model discrepancy sampling strategy, which enhances the fidelity of the substitute model by prioritizing uncertain samples. Extensive experiments on four 3D medical segmentation datasets demonstrate that CDMSS-3D consistently outperforms adapted baselines.
-
2511.0021 A scalable deep learning framework for gene expression prediction by integrating promoter-enhancer sequences with multimodal epigenomic data
Transcriptional regulation, critical for cellular differentiation and adaptation to environmental changes, involves coordinated interactions among DNA sequences, regulatory proteins, and chromatin architecture. Despite extensive data from consortia like ENCODE, understanding the dynamics of cis-regulatory elements (CREs) in gene expression remains challenging. Deep learning is a powerful tool for learning gene expression and epigenomic signals from DNA sequences, exhibiting superior performance compared to conventional machine learning approaches. However, even the most advanced deep learning-based methods may fall short in capturing the regulatory effects of distal elements such as enhancers, limiting their predictive accuracy. In addition, these methods may require significant resources to train or to adapt to newly generated data. To address these challenges, we present EPInformer, a scalable deep-learning framework for predicting gene expression by integrating promoter-enhancer interactions with their sequences, epigenomic signals, and chromatin contacts. Our model outperforms existing gene expression prediction models in rigorous cross-chromosome validation, accurately recapitulates enhancer-gene interactions validated by CRISPR perturbation experiments, and identifies crucial transcription factor motifs within regulatory sequences.
-
2511.0020 AI-Powered Rainfall Forecasting: Progress, Challenges, Future Directions
Rainfall forecasting holds significant importance across a wide range of sectors, including disaster prevention, energy planning and agriculture. In the past decade, artificial intelligence (AI) has emerged as a revolutionary approach, aiming to overcome the long-standing limitations of traditional numerical weather prediction (NWP) models and statistical downscaling models (SDMs) for rainfall forecasting. This chapter briefly introduces the remarkable progress made in AI-based rainfall forecasting. It focuses on three major aspects: physics-constrained machine learning (ML), multi-modal data fusion, and extreme event prediction. AI-based models can be used to resolve the subgrid-scale parameterization problems (e.g., convective parameterization) that have long troubled NWP models. For instance, DeepMind's GraphCast employs dynamic graph neural networks to generate a high-resolution global forecast. Making 10-day forecasts with GraphCast takes less than a minute on a single Google TPU v4 machine. Regarding multi-modal data fusion, systems such as the National Oceanic and Atmospheric Administration (NOAA) Multi-Radar Multi-Sensor (MRMS) system combine various data sources and significantly improve the accuracy of forecasts. For extreme rainfall prediction, the application of adversarial training and attention mechanisms has also led to improvements. Finally, the review suggests future research directions. It emphasizes how AI is updating rainfall forecasting technology, enabling it to better meet the challenges posed by a changing climate.
-
2511.0019 From Virtual Cells to Programmable Humans: Advancing Digital Biology Through Hybrid AI Systems
Recent advances in artificial intelligence (AI), high-performance computing, and systems biology have accelerated the development of AI-powered virtual biological systems, from virtual cells to multiscale organ models and programmable virtual humans. These systems promise transformative applications in drug discovery, precision medicine, and in silico clinical trials. This review provides a critical synthesis of current progress, key technologies, and future directions across this spectrum. We explore hybrid modeling strategies that combine mechanistic models, such as ordinary and partial differential equations, with deep learning methods including convolutional, recurrent, and graph neural networks. We emphasize the importance of robust uncertainty quantification, simulation validation, and multiscale integration across molecular, cellular, organ-level, and systemic processes. A core contribution is the introduction of the SIM-CARD framework, a standardized simulation accountability protocol to document data provenance, modeling assumptions, performance metrics, and regulatory alignment. We propose a three-phase translational roadmap: (1) validated AI-augmented virtual cells and organs (by 2030), (2) interoperable multi-organ physiological systems (by 2040), and (3) programmable full-body virtual humans supporting personalized simulations and regulatory use cases (by 2055). We identify key enablers, including high-fidelity multiscale data, computational scalability, and simulation governance, as well as bottlenecks such as algorithmic bias, explainability, and regulatory uncertainty. Finally, we call for collaborative efforts to establish minimal benchmarking suites, FAIR-compliant simulation metadata, and cross-institutional federated learning infrastructure. This review aims to guide the scientific, regulatory, and clinical communities in navigating the complex yet promising trajectory toward clinically actionable programmable human simulations.
-
2511.0018 From Virtual Cells to Programmable Humans: Advancing Digital Biology Through Hybrid AI Systems
The convergence of artificial intelligence and systems biology is giving rise to a new paradigm in biomedical research: AI-powered virtual biological systems. From single-cell simulations to organ-level models and ultimately programmable virtual humans, this digital continuum holds transformative potential for disease modeling, personalized medicine, and therapeutic discovery. In this review, we critically examine the state of the art in AI-driven simulations, including the numerical foundations, multiscale integration strategies, and the emerging class of hybrid models that bridge mechanistic and data-driven approaches. We explore the challenges of validation, uncertainty quantification, and regulatory alignment across simulation scales, with particular focus on the development of simulation accountability frameworks such as SIM-CARDs. Ethical and privacy concerns, including algorithmic bias and data sovereignty in patient-specific models, are also addressed, alongside concrete proposals for governance and federated simulation workflows. Special attention is given to the technical complexity of multiscale modeling, including the integration of mechanistic solvers with neural architectures and the computational resources required for real-time, clinically actionable simulations. We conclude with a translational roadmap for virtual biology that projects validated virtual cells for drug screening by 2030, multi-organ simulations by 2040, and the emergence of programmable virtual humans by 2055. By unifying high-fidelity numerical models with explainable AI, and aligning simulation design with ethical, regulatory, and clinical needs, the field of digital biology is positioned to unlock scalable and trustworthy biomedical innovation.
-
2511.0016 Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images
The ability to construct hierarchies of objects is important to visual processing in the human brain. Previous studies have successfully adopted capsule networks to decompose digits and faces into parts in an unsupervised manner, in order to investigate a similar perception mechanism in neural networks. However, their descriptions are restricted to 2D space, limiting their capacity to imitate the intrinsic 3D perception ability of humans. In this paper, we propose an Inverse Graphics Capsule Network (IGC-Net) to learn hierarchical 3D face representations from large-scale unlabeled images. The core of IGC-Net is a new type of capsule, named graphics capsule, which represents 3D primitives with interpretable parameters in computer graphics (CG), including depth, albedo, and 3D pose. Specifically, IGC-Net first decomposes the objects into a set of semantic-consistent part-level descriptions and then assembles them into object-level descriptions to build the hierarchy. The learned graphics capsules reveal how neural networks oriented toward visual perception understand faces as a hierarchy of 3D models.
-
2511.0015 Engineering Collective Attention in the Age of Artificial Intelligence
This article explores how collective attention can be both disrupted and enhanced by artificial intelligence. It examines how the rise of algorithmic recommendation systems, generative media, and large-scale language models has transformed public communication and redefined what captures human attention. The analysis identifies the dual nature of artificial intelligence: while it can distort information ecosystems through deepfakes, social bots, and engagement-driven algorithms, it also holds the potential to strengthen collective reasoning by improving access to reliable knowledge and facilitating the clarification of complex information. Drawing on interdisciplinary research, the article develops a multilevel framework for understanding and improving collective attention. At the individual level, it emphasizes education, digital literacy, and critical awareness to build cognitive resilience. At the governmental level, it assesses regulatory and ethical strategies for ensuring transparency, accountability, and fairness in the design and deployment of AI systems. At the societal level, it highlights the promise of human–AI collaboration to guide attention toward truth, empathy, and shared problem-solving. The article concludes that collective attention can indeed be engineered in beneficial ways when artificial intelligence is governed transparently, used ethically, and integrated with public oversight to reinforce informed, cohesive, and resilient democracies.
-
2511.0014 Artificial Intelligence in Biomedical Research: From Data Integration to Precision Medicine
This comprehensive review examines the transformative role of artificial intelligence in biomedical research, from foundational data integration to clinical applications. The paper explores how AI techniques facilitate multimodal data fusion across diverse biological data types, employing both traditional statistical methods and advanced deep learning architectures including variational autoencoders, graph neural networks, and transformer models. It evaluates AI applications in medical imaging, where convolutional neural networks have achieved remarkable diagnostic accuracy (up to 94% in COVID-19 detection) while enhancing segmentation and classification tasks across multiple imaging modalities. The review further investigates generative AI’s impact on molecular design and drug discovery, highlighting transformer-based architectures like TransAntivirus that navigate vast chemical spaces to optimize therapeutic candidates. Finally, it examines AI-enabled precision medicine applications, including Clinical Decision Support Systems and federated learning approaches that balance analytical power with privacy preservation. Despite significant progress, implementation challenges persist, including data heterogeneity, model explainability, and ethical concerns regarding bias and privacy. The paper underscores the importance of developing interpretable AI systems that integrate seamlessly into clinical workflows while addressing regulatory, ethical, and economic considerations to realize the full potential of AI in advancing biomedical research and healthcare delivery.