Papers
-
2511.0031
Equivariant Diffusion Solution for Inorganic Crystal Structure Determination from Powder X-ray Diffraction Data
Determining the crystal structures of inorganic crystalline materials is crucial as the structures encode essential information about their physical, chemical, and mechanical properties. Powder X-ray diffraction is one of the most widely used structural characterization techniques. However, determining crystal structure directly from experimental powder X-ray diffraction patterns can be challenging and requires significant crystallographic knowledge, which still heavily relies on manual inspection by human experts. Even the state-of-the-art databases contain thousands of entries with incomplete or implausible crystal structure information. In this work, we trained a diffusion model based on equivariant graph neural networks that can infer atomic coordinates from powder X-ray diffraction patterns. Starting from a random guess, our model iteratively refines atom coordinates until it reaches a chemically reasonable structure that matches the target diffraction pattern. Our approach is both efficient and accurate. It takes on average 0.6 seconds to solve the atomic positions per crystal structure, which is several orders of magnitude faster than previous approaches. The success rate reaches 82.3% and 81.6% on the simulated and experimental diffraction datasets, respectively. We revisited energetically unfavorable crystal structures in the database and demonstrated that our model can propose more plausible structure solutions for 39 entries. We also suggested 912 complete crystal structure models for entries in the database lacking all or some atomic positions, including entries that contain light elements, are natural minerals, or exhibit chemically disordered lattice sites.
We demonstrated that conditional equivariant generative models can tackle the structure determination problem and provide high-quality structure models for inorganic crystalline materials, paving the way for automated structural analysis of diffraction patterns in autonomous materials development loops.
-
2511.0030
ElectionFit: A Computational Laboratory of LLM Agents for Simulating U.S. Presidential Elections
Modeling complex human behavior, such as voter decisions in national elections, is a long-standing challenge for computational social science. Traditional agent-based models (ABMs) are limited by oversimplified rules, while large-scale statistical models often lack interpretability. We introduce ElectionFit, a novel framework that uses Large Language Models (LLMs) to build a "computational laboratory" of LLM agents for political simulation. Each agent is instantiated with a high-fidelity demographic profile and dynamic contextual information (e.g., candidate policies), enabling it to perform nuanced, generative reasoning to simulate a voting decision. We deployed this framework as a testbed on the 2024 U.S. Presidential Election, focusing on seven key swing states. Our simulation's macro-level results successfully replicated the real-world outcome, demonstrating the high fidelity of our "virtual society". The primary contribution is not only the prediction, but also the framework's utility as an interpretable research tool. ElectionFit moves beyond black-box outputs, allowing researchers to probe agent-level rationale and analyze the stability and sensitivity of LLM-driven social simulations.
-
2511.0025
Estimating Rural Rooftop Solar Potential Using Semantic Segmentation and Multi-Source Data
Solar energy is a clean and renewable resource, and the low-rise, unobstructed rural buildings of northern China provide ideal conditions for photovoltaic (PV) installation compared to shaded, high-density urban areas. Yet, progress in assessing rural solar potential is limited by the absence of accurate 3D building data. This study proposes a rapid estimation approach integrating deep learning, parametric modeling, and GPU-accelerated simulation. Convolutional neural networks (CNNs) extract building footprints from satellite imagery, which are then processed in Grasshopper to generate refined vector outlines. Combined with digital surface model (DSM) data, these outlines produce precise 3D village models. Using Vitality 2.0 for GPU-based solar simulation, the method was applied to 31 villages in Tianjin, generating parametric 3D models and estimating their solar potential. Results show that low building heights and minimal mutual shading make photovoltaic capacity scale with roof area: larger villages have greater generation potential. Moreover, villages with metal roofs exhibit higher conversion efficiency and shorter cost-recovery periods than those with concrete or ceramic-tile roofs, due to better heat dissipation. Overall, the workflow offers a practical and efficient solution for estimating rural solar potential in data-scarce regions to guide renewable energy planning and investment.
-
2511.0023
ReasoningV: Efficient Verilog Code Generation with Adaptive Hybrid Reasoning
Large Language Models (LLMs) have advanced Verilog code generation but still suffer from data quality issues, limited reasoning, and inefficiency. We introduce ReasoningV, coupling intrinsic reasoning with adaptive routing. Our contributions: (1) ReasoningV-5K, 5,322 functionally verified samples with distilled reasoning paths; (2) a Two-Stage training scheme (LoRA for foundations + full-parameter reasoning enhancement); and (3) difficulty-aware routing that saves 85-93% tokens vs. a strong commercial model and 32-75% vs. fixed-depth variants. On VerilogEval-human, RV-14B attains 73.9% pass@1; RV-7B reaches 57.8% with superior efficiency. Models, data, and code: https://github.com/BUAA-CLab/ReasoningV.
-
2511.0022
Stealing 3D Medical Segmentation Models via Collaborative Dual-Model Architecture
Machine Learning as a Service (MLaaS) facilitates the deployment and accessibility of medical models, yet concurrently exposes proprietary models to potential adversaries. Attackers may exploit model stealing attacks (MSAs) to replicate these models illicitly, leading to loss of training investment and privacy vulnerabilities. While existing research has mainly focused on MSAs in the context of 2D natural image classification, this work presents the first investigation into stealing 3D medical segmentation models. We introduce collaborative dual-model 3D medical segmentation stealing (CDMSS-3D), which decomposes the model stealing objective into two complementary aspects: stealing accuracy and stealing robustness. With our adversarial proxy training, CDMSS-3D achieves superior model stealing performance. Furthermore, we incorporate a dual-model discrepancy sampling strategy, which enhances the fidelity of the substitute model by prioritizing uncertain samples. Extensive experiments on four 3D medical segmentation datasets demonstrate that CDMSS-3D consistently outperforms adapted baselines.
-
2511.0021
A scalable deep learning framework for gene expression prediction by integrating promoter-enhancer sequences with multimodal epigenomic data
Transcriptional regulation, critical for cellular differentiation and adaptation to environmental changes, involves coordinated interactions among DNA sequences, regulatory proteins, and chromatin architecture. Despite extensive data from consortia like ENCODE, understanding the dynamics of cis-regulatory elements (CREs) in gene expression remains challenging. Deep learning is a powerful tool for learning gene expression and epigenomic signals from DNA sequences, exhibiting superior performance compared to conventional machine learning approaches. However, even the most advanced deep learning-based methods may fall short in capturing the regulatory effects of distal elements such as enhancers, limiting their predictive accuracy. In addition, these methods may require significant resources to train or to adapt to newly generated data. To address these challenges, we present EPInformer, a scalable deep-learning framework for predicting gene expression by integrating promoter-enhancer interactions with their sequences, epigenomic signals, and chromatin contacts. Our model outperforms existing gene expression prediction models in rigorous cross-chromosome validation, accurately recapitulates enhancer-gene interactions validated by CRISPR perturbation experiments, and identifies crucial transcription factor motifs within regulatory sequences.
-
2511.0016
Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images
The function of constructing the hierarchy of objects is important to the visual process of the human brain. Previous studies have successfully adopted capsule networks to decompose the digits and faces into parts in an unsupervised manner to investigate the similar perception mechanism of neural networks. However, their descriptions are restricted to the 2D space, limiting their capacities to imitate the intrinsic 3D perception ability of humans. In this paper, we propose an Inverse Graphics Capsule Network (IGC-Net) to learn the hierarchical 3D face representations from large-scale unlabeled images. The core of IGC-Net is a new type of capsule, named graphics capsule, which represents 3D primitives with interpretable parameters in computer graphics (CG), including depth, albedo, and 3D pose. Specifically, IGC-Net first decomposes the objects into a set of semantic-consistent part-level descriptions and then assembles them into object-level descriptions to build the hierarchy. The learned graphics capsules reveal how the neural networks, oriented at visual perception, understand faces as a hierarchy of 3D models.
-
2511.0015
Engineering Collective Attention in the Age of Artificial Intelligence
This article explores how collective attention can be both disrupted and enhanced by artificial intelligence. It examines how the rise of algorithmic recommendation systems, generative media, and large-scale language models has transformed public communication and redefined what captures human attention. The analysis identifies the dual nature of artificial intelligence: while it can distort information ecosystems through deepfakes, social bots, and engagement-driven algorithms, it also holds the potential to strengthen collective reasoning by improving access to reliable knowledge and facilitating the clarification of complex information. Drawing on interdisciplinary research, the article develops a multilevel framework for understanding and improving collective attention. At the individual level, it emphasizes education, digital literacy, and critical awareness to build cognitive resilience. At the governmental level, it assesses regulatory and ethical strategies for ensuring transparency, accountability, and fairness in the design and deployment of AI systems. At the societal level, it highlights the promise of human–AI collaboration to guide attention toward truth, empathy, and shared problem-solving. The article concludes that collective attention can indeed be engineered in beneficial ways when artificial intelligence is governed transparently, used ethically, and integrated with public oversight to reinforce informed, cohesive, and resilient democracies.
-
2511.0008
A Self-Driving Laboratory for Materials Science: An Autonomous Research Agent for Deep Data Analysis and Interpretation
As artificial intelligence increasingly permeates scientific research, the "AI for Science" paradigm is evolving to enable more autonomous scientific workflows. Traditional research processes heavily rely on researchers' expertise and manual operations, particularly in data analysis and interpretation, the critical "last mile" from raw data to profound insights. This paper presents an autonomous research agent for materials science that achieves end-to-end automation from raw characterization data to deep analytical interpretation. The system integrates four core innovations: (1) AI-driven automatic data understanding with unified ingestion of heterogeneous instrument data, (2) automated data analysis through an extensible algorithm library, (3) a one-click automated reporting system, and (4) interactive AI-powered data interpretation via natural language dialogue. We demonstrate the agent's capabilities through real-world case studies across multiple characterization techniques (Raman, UPS, UV-Vis, TG), achieving remarkable performance: UV-Vis bandgap analysis is accelerated by 600× compared to manual processing, while maintaining exceptional accuracy with fitting precision R² ≥ 0.999. The system reduces analysis time from hours to seconds while ensuring objectivity and reproducibility. By automating the data analysis pipeline while preserving human oversight and interpretability, this work contributes a practical component toward building more integrated and efficient scientific discovery systems in materials research.
-
2511.0003
AI Empowered Thermal Management Materials Design
The development of high-performance thermal management materials holds significant importance in fields such as chips, data centers and batteries. Materials informatics, which integrates big data and artificial intelligence, is emerging as the fourth paradigm for materials research. Over the past few years, our team has undertaken preliminary explorations in the development of advanced thermal management materials empowered by big data and artificial intelligence. In this work, we introduce three successful materials informatics applications in thermal management materials design: the construction of machine learning interatomic potentials for thermal property calculations, the discovery and generative design of high-thermal-conductivity materials, and the intelligent design of micro/nano structures for thermal transport. These cases demonstrate the advantages of materials informatics for thermal management materials design.
-
2511.0002
Battery-Sim-Agent: Leveraging LLM-Agent for Inverse Battery Parameter Estimation
Parameterizing high-fidelity "digital twins" of batteries is a critical yet challenging inverse problem that hinders the pace of battery innovation. Prevailing methods formulate this as a black-box optimization (BBO) task, employing algorithms that are sample-inefficient and blind to the underlying physics. In this work, we introduce a new paradigm that reframes the inverse problem as a reasoning task, and present Battery-Sim-Agent, the first framework to deploy a Large Language Model (LLM) agent in a closed loop with a high-fidelity battery simulator. The agent mimics a human scientist's workflow: it interprets rich, multi-modal feedback from the simulator, forms physically-grounded hypotheses to explain discrepancies, and proposes structured parameter updates. On a systematically constructed benchmark suite spanning diverse battery chemistries, operating conditions, and difficulty levels, our agent significantly outperforms strong BBO baselines like Bayesian optimization in identifying accurate parameters. We further demonstrate the framework's capability in complex long-horizon degradation fitting tasks and validate its practical applicability on real-world battery datasets. Our results highlight the promise of LLM-agents as reasoning-based optimizers for scientific discovery and battery parameter estimation.
-
2511.0001
PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors
Evaluating the scientific discovery capabilities of large language model based agents, particularly how they cope with varying environmental complexity and utilize prior knowledge, requires specialized benchmarks currently lacking in the landscape. To address this gap, we introduce PhysGym, a novel benchmark suite and simulation platform for rigorously assessing LLM-based scientific reasoning in interactive physics environments. PhysGym's primary contribution lies in its sophisticated control over the level of prior knowledge provided to the agent. This allows researchers to dissect agent performance along axes including the complexity of the problem and the prior knowledge levels. The benchmark comprises a suite of interactive simulations, where agents must actively probe environments, gather data sequentially under constraints and formulate hypotheses about underlying physical laws. PhysGym provides standardized evaluation protocols and metrics for assessing hypothesis accuracy and model fidelity. We demonstrate the benchmark's utility by presenting results from baseline LLMs, showcasing its ability to differentiate capabilities based on varying priors and task complexity.
-
2510.0091
FairEval: Evaluating Fairness in LLM-Based Recommendations with Personality Awareness
Recent advances in Large Language Models (LLMs) have enabled their application to recommender systems (RecLLMs), yet concerns remain regarding fairness across demographic and psychological user dimensions. We introduce FairEval, a novel evaluation framework to systematically assess fairness in LLM-based recommendations. Unlike prior benchmarks that focus solely on demographic attributes, FairEval uniquely integrates personality profiles with eight sensitive demographic attributes, including gender, race, and age, enabling a comprehensive and nuanced assessment of user-level bias. We evaluate state-of-the-art models, including ChatGPT 4o and Gemini 1.5 Flash, on music and movie recommendation tasks using structured prompts. FairEval's personality-aware fairness metric, PAFS@25, achieves high consistency scores up to 0.9969 for ChatGPT 4o and 0.9997 for Gemini 1.5 Flash, underscoring its robustness in equitable recommendations across diverse user profiles, while also uncovering fairness gaps, with SNSR disparities reaching up to 34.79%. Our results also reveal disparities in recommendation consistency across user identities and prompt formulations, including typographical and multilingual variations. By unifying psychographic and demographic evaluation in RecLLMs, FairEval offers a robust and reproducible benchmark for inclusive and bias-aware LLM evaluation.
-
2510.0090
A Fuzzy-based Approach to Predict Human Interaction by Functional Near-Infrared Spectroscopy
In this article, we introduce the Fuzzy logic-based attention (Fuzzy Attention Layer) mechanism, a novel computational approach designed to enhance the interpretability and efficacy of neural models in psychological research. The fuzzy attention layer is integrated into the transformer encoder model to analyze complex psychological phenomena from neural signals captured by functional near-infrared spectroscopy (fNIRS). By leveraging fuzzy logic, the fuzzy attention layer learns and identifies interpretable patterns of neural activity. This addresses a significant challenge in using transformers: the lack of transparency in determining which specific brain activities most contribute to particular predictions. Our experimental results, obtained from fNIRS data collected during social interactions involving handholding, reveal that the fuzzy attention layer not only learns interpretable patterns of neural activity but also enhances model performance. In addition, these patterns provide deeper insights into the neural correlates of interpersonal touch and emotional exchange. The application of our model shows promising potential in understanding the complex aspects of human social behavior and in verifying psychological theory with machine learning algorithms, thereby contributing significantly to the fields of social neuroscience and AI. The presented version is based on the work published in IEEE TFS (2025).
-
2510.0089
BasketVision: Benchmarking MLLMs' Grasp of Complex Dynamic Systems
While Multimodal Large Language Models (MLLMs) excel on general visual tasks, their capacity to comprehend complex dynamic systems remains a critical open question. Such systems, governed by physical laws, explicit rules, and multi-agent interactions, form the fabric of the real world. To facilitate a systematic diagnosis of current MLLM limitations, we introduce BasketVision, a new benchmark that leverages professional basketball as a microcosm for these dynamic environments. BasketVision probes model capabilities across seven dimensions, spanning perception, reasoning, and prediction, through 6,000 curated, bilingual questions from professional game data. An automated data generation pipeline underpins the benchmark, ensuring both scalability and fine-grained precision. Our evaluation of 23 leading models reveals a chasm between machine and human cognition: human experts attain 96.34% accuracy, while the premier model, GPT-4o, achieves only 63.15%. The analysis pinpoints spatial reasoning as a persistent bottleneck and uncovers specific patterns of task specialization. BasketVision thus serves as a crucial apparatus for charting the frontiers of MLLMs and steering future work toward more robust reasoning in dynamic visual worlds.
-
2510.0088
MatEvolve: A Synergistic Symbolic–LLM Agent for Multi-Objective Materials Design
Materials define the eras of human civilization, yet the design of novel materials is fundamentally constrained by the immense chemical space, which renders traditional enumeration-screening methodology computationally prohibitive and inefficient. This paper introduces a paradigm shift towards insight-exploration-validation, enabling an intelligent and evolutionary exploration of material design pathways. To actualize this paradigm, we propose MatEvolve, a synergistic symbolic–LLM agent that reconceptualizes material design as a closed-loop, programmatic evolution task. Central to MatEvolve is a novel symbolic formalism, Material Edit Language, which empowers the agent to perform chemical operations programmatically. The exploration trajectory is directed by a multifaceted guidance strategy, comprising a dynamic knowledge injection mechanism and a two-stage exploration strategy that balances broad exploration and deep optimization. Furthermore, a multi-objective fitness landscape ensures directional and efficient navigational guidance. These integrated strategies contribute to a 32.2% improvement over direct material structure modification. Crucially, comparisons demonstrate that our insight-exploration-validation paradigm outperforms the traditional enumeration-screening approach by 33.6%, highlighting its superior efficacy in navigating vast design spaces.
-
2510.0085
AI Mathematician as a Partner in Advancing Mathematical Discovery
Artificial intelligence (AI) has demonstrated impressive progress in mathematical reasoning, yet its integration into the practice of mathematical research remains limited. In this study, we investigate how the AI Mathematician (AIM) system can operate as a research partner rather than a mere problem solver. Focusing on a challenging problem in homogenization theory, we analyze the autonomous reasoning trajectories of AIM and incorporate targeted human interventions to structure the discovery process. Through iterative decomposition of the problem into tractable subgoals, selection of appropriate analytical methods, and validation of intermediate results, we reveal how human intuition and machine computation can complement one another. This collaborative paradigm enhances the reliability, transparency, and interpretability of the resulting proofs, while retaining human oversight for formal rigor and correctness. The approach leads to a complete and verifiable proof, and more broadly, demonstrates how systematic human-AI co-reasoning can advance the frontier of mathematical discovery.
-
2510.0042
ICIMBench: An In-Context Iterative Molecular Design Benchmark for Large Language Models
Large language models (LLMs) are rapidly transforming scientific discovery, showing promise in hypothesis generation, literature understanding, and symbolic reasoning. Yet, their capacity to conduct iterative, feedback-driven molecular design, a hallmark of real-world drug and materials discovery, remains underexplored. Existing benchmarks typically cast molecular tasks as one-shot question-answering or text-to-molecule translation, neglecting the iterative propose-evaluate-refine process central to scientific practice. We propose ICIMBench, an In-Context Iterative Molecular Design Benchmark that evaluates LLMs in multi-turn molecular design episodes. In each task, the model receives a natural-language specification, generates candidate molecules in SMILES format, and iteratively refines them based on deterministic oracle feedback from RDKit. We introduce the NumEval metric, the number of evaluations required to satisfy the target, which captures both performance efficiency and robustness under realistic evaluation budgets. Experiments on frontier models (GPT-5, DeepSeek-V3.2, Intern-S1) show that while single-property design is largely solved (NumEval = 1) by state-of-the-art LLMs like GPT-5, multi-property optimization remains a strong challenge, especially under coupled constraints such as lipophilicity and scaffold similarity. ICIMBench provides a principled framework for probing the in-context reasoning and adaptive optimization abilities of LLMs, paving the way toward autonomous, language-driven molecular discovery.
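The propose-evaluate-refine loop and the NumEval count described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the benchmark's actual code: `num_eval`, `toy_propose`, and `toy_oracle` are hypothetical names, the proposer stands in for an LLM, and the oracle is a toy carbon-count check standing in for the RDKit-based oracle.

```python
from typing import Callable, Optional, Tuple

def num_eval(
    propose: Callable[[Optional[int]], str],
    oracle: Callable[[str], Tuple[bool, int]],
    budget: int,
) -> Optional[int]:
    """Count oracle evaluations until the target is met.

    Returns None if the evaluation budget runs out first.
    """
    feedback = None
    for n in range(1, budget + 1):
        candidate = propose(feedback)            # propose a candidate string
        satisfied, feedback = oracle(candidate)  # deterministic evaluation
        if satisfied:
            return n
    return None

# Hypothetical toy task: produce a string containing exactly four "C" characters.
def toy_oracle(candidate: str) -> Tuple[bool, int]:
    carbons = candidate.count("C")
    return carbons == 4, carbons

# Hypothetical stand-in for the model: move one atom toward the target per turn.
def toy_propose(feedback: Optional[int]) -> str:
    if feedback is None:
        return "CC"
    return "C" * (feedback + 1 if feedback < 4 else feedback - 1)

print(num_eval(toy_propose, toy_oracle, budget=10))
```

Under this reading, a lower count means the target was reached with fewer oracle calls, and a task that cannot be solved within the budget yields no count at all.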
-
2510.0041
Graph neural network for colliding particles with an application to sea ice floe modeling
This paper introduces a novel approach to sea ice modeling using Graph Neural Networks (GNNs), utilizing the natural graph structure of sea ice, where nodes represent individual ice pieces, and edges model the physical interactions, including collisions. This concept is developed within a one-dimensional framework as a foundational step. Traditional numerical methods, while effective, are computationally intensive and less scalable. By utilizing GNNs, the proposed model, termed the Collision-captured Network (CN), integrates data assimilation (DA) techniques to effectively learn and predict sea ice dynamics under various conditions. The approach was validated using synthetic data, both with and without observed data points, and it was found that the model accelerates the rendering of trajectories without compromising accuracy. This advancement offers a more efficient tool for forecasting in marginal ice zones (MIZ) and highlights the potential of combining machine learning with data assimilation for more effective and efficient modeling.
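The graph structure this abstract describes (ice pieces as nodes, interactions as edges, in one dimension) might be set up along the following lines. This is only a sketch under stated assumptions: the `Floe` fields, the neighbor-only edge rule, and the gap-based collision flag are hypothetical choices for illustration, not the paper's formulation.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Floe:
    center: float       # position of the floe centre on the 1D line
    half_length: float  # half of the floe's extent along the line

def build_floe_graph(floes: List[Floe]) -> List[Tuple[int, int, float, bool]]:
    """Connect each floe to its nearest neighbour on the right.

    Each edge is (left_index, right_index, gap, colliding), where gap is the
    open water between the two floes and colliding means the gap is closed.
    """
    order = sorted(range(len(floes)), key=lambda i: floes[i].center)
    edges = []
    for left, right in zip(order, order[1:]):
        right_edge_of_left = floes[left].center + floes[left].half_length
        left_edge_of_right = floes[right].center - floes[right].half_length
        gap = left_edge_of_right - right_edge_of_left
        edges.append((left, right, gap, gap <= 0.0))
    return edges

floes = [Floe(0.0, 1.0), Floe(5.0, 1.0), Floe(1.5, 0.5)]
for edge in build_floe_graph(floes):
    print(edge)
```

In one dimension only adjacent floes can touch, which is why this sketch needs a single edge per neighbouring pair rather than a full pairwise graph.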
-
2510.0040
A Fuzzy-based Approach to Predict Human Interaction by Functional Near-Infrared Spectroscopy
In this article, we introduce the Fuzzy logic-based attention (Fuzzy Attention Layer) mechanism, a novel computational approach designed to enhance the interpretability and efficacy of neural models in psychological research. The fuzzy attention layer is integrated into the transformer encoder model to analyze complex psychological phenomena from neural signals captured by functional near-infrared spectroscopy (fNIRS). By leveraging fuzzy logic, the fuzzy attention layer learns and identifies interpretable patterns of neural activity. This addresses a significant challenge in using transformers: the lack of transparency in determining which specific brain activities most contribute to particular predictions. Our experimental results, obtained from fNIRS data collected during social interactions involving handholding, reveal that the fuzzy attention layer not only learns interpretable patterns of neural activity but also enhances model performance. In addition, these patterns provide deeper insights into the neural correlates of interpersonal touch and emotional exchange. The application of our model shows promising potential in understanding the complex aspects of human social behavior and in verifying psychological theory with machine learning algorithms, thereby contributing significantly to the fields of social neuroscience and AI. The presented version is based on the work published in IEEE TFS (2025).