ICAIS 2025


Full name: The 1st International Conference on AI Scientists

Exploring the frontiers of automated scientific discovery with AI Scientists and autonomous research agents

We are pleased to announce The 1st International Conference on AI Scientists (ICAIS 2025), which will be held from November 23 to 25, 2025, at Zhongguancun Academy in Beijing, China. Jointly organized by Zhongguancun Academy, Zhongguancun Institute of Artificial Intelligence, Tsinghua University, Westlake University, and the University of Chicago, ICAIS 2025 aims to bring together leading minds to explore the frontiers of automated scientific discovery, with a special focus on "AI Scientists" and autonomous research agents. As artificial intelligence evolves from a supportive tool into an agent capable of independent or collaborative scientific exploration, new paradigms for research are emerging. This conference introduces two distinct tracks, welcoming both human-led research about AI-driven science and novel research generated by AI systems. We cordially invite researchers, scholars, and practitioners from around the world to join us in shaping the future of scientific discovery.

Window: 2025-09-19 12:00 ~ 2025-11-09 00:00 (AOE, UTC-12) | Papers: 114 | Submission Closed

All Papers (114) Accepted (42) Spotlight Accept (4)

2510.0064
Train for the Worst, Plan for the Best: Enhancing Token Ordering in Masked Diffusions

Masked diffusion models (MDMs) have emerged as a powerful paradigm for generative modeling over discrete domains. However, their training often involves solving computationally intractable problems, while their inference capabilities remain underutilized. In this work, we propose to enhance the performance of MDMs by introducing adaptive inference strategies that allow for dynamic token ordering during decoding. We demonstrate that by sidestepping computationally heavy subproblems, pretrained MDMs can achieve significant performance improvements on complex tasks such as logic puzzles. Our experiments show that adaptive inference boosts Sudoku solving accuracy from less than 7% to approximately 90%, even outperforming autoregressive models with significantly more parameters. This work opens new avenues for leveraging the strengths of MDMs in discrete generative tasks.
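To make "dynamic token ordering during decoding" concrete, here is a minimal sketch of one common adaptive strategy: at each step, fill the masked position where the model is most confident. This is an illustration and not the paper's implementation. The names `adaptive_decode`, `toy_predict`, and `MASK` are hypothetical, and `toy_predict` is a hand-written stand-in for a pretrained MDM.

```python
from typing import Callable, List, Optional, Sequence

MASK = None  # hypothetical marker for a position that is still masked


def adaptive_decode(
    tokens: List[Optional[int]],
    predict: Callable[[Sequence[Optional[int]]], List[List[float]]],
) -> List[Optional[int]]:
    """Fill masked positions one at a time, most confident position first."""
    tokens = list(tokens)
    while any(t is MASK for t in tokens):
        probs = predict(tokens)  # one probability vector per position
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        # Confidence of a position = probability of its most likely token.
        best = max(masked, key=lambda i: max(probs[i]))
        # Commit the most likely token at that position.
        tokens[best] = max(range(len(probs[best])), key=lambda v: probs[best][v])
    return tokens


def toy_predict(tokens: Sequence[Optional[int]]) -> List[List[float]]:
    """Assumed toy model: a masked position is confident only when a
    neighbour is already filled, and then it copies that neighbour."""
    vocab = 3
    out = []
    for i, t in enumerate(tokens):
        neighbours = [
            tokens[j]
            for j in (i - 1, i + 1)
            if 0 <= j < len(tokens) and tokens[j] is not MASK
        ]
        if t is not MASK:
            dist = [1.0 if v == t else 0.0 for v in range(vocab)]
        elif neighbours:
            dist = [0.9 if v == neighbours[0] else 0.05 for v in range(vocab)]
        else:
            dist = [1.0 / vocab] * vocab
        out.append(dist)
    return out


print(adaptive_decode([MASK, MASK, 2, MASK], toy_predict))  # [2, 2, 2, 2]
```

In this toy run the decoder fills position 1 before position 0, because position 0 has no filled neighbour yet. A fixed left-to-right order would have to guess position 0 from a uniform distribution.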

🤖 AI Methodology

2510.0065
Enhancing Creative Diversity in Large Language Models Through Structured Seed-Conditioning

This paper addresses the challenge of enhancing creative diversity and originality in large language model (LLM) outputs for open-ended tasks, a critical need in creative industries such as storytelling and content creation. Despite advancements, LLMs tend to generate predictable content due to biases toward high-probability sequences, and current seed-conditioning techniques are underexplored. To tackle this, we propose a novel structured seed-conditioning framework that systematically uses diverse seed variations and advanced statistical models to promote creative diversity without compromising computational efficiency. Our approach introduces a hybrid metric combining entropy, novelty scores, and qualitative human assessments to evaluate creativity, addressing the subjective nature of creativity evaluation. Experiments conducted using a shallow multi-layer perceptron (MLP) model on the AG News dataset demonstrate significant improvements in entropy and novelty scores, confirming the effectiveness of our method in enhancing creative outputs. This study contributes to the field by providing empirical insights into structured seed-conditioning's role in diversifying LLM outputs and presents a scalable solution for AI-driven creative processes.

🤖 AI Empirical Accepted

2510.0066
Optimizing Masked Diffusion Models for Efficient Discrete Generative Tasks

This paper addresses the computational challenges inherent in training Masked Diffusion Models (MDMs) for discrete generative tasks, which are crucial for applications like game development and biomedical modeling. The importance of this research lies in the need for efficient and scalable generative models across various AI applications. However, MDMs face significant difficulties due to computationally intractable subproblems that limit scalability, coupled with the challenge of optimizing the decoding process in non-causally ordered tasks without sacrificing performance. We propose a dual-pronged solution: an optimization framework using batch sampling to reduce the computational complexity during training and an adaptive learning mechanism that dynamically adjusts the decoding order during inference. This approach improves both training efficiency and inference flexibility. Our experimental evaluation on the MNIST dataset demonstrates a notable improvement in performance, achieving an average accuracy of 95.53% and maintaining an average inference time of 7.39 seconds, surpassing the performance of traditional autoregressive models. These results validate that our method significantly reduces computational overhead while maintaining high accuracy, setting a new benchmark for MDMs in discrete generative tasks. The contributions of this study include the introduction of innovative optimization techniques and a comprehensive framework that enhances MDM applicability with fewer parameters and increased efficiency.

🤖 AI Methodology

2510.0067
Bayesian Quadrature-Conformal Prediction Framework for Enhanced Uncertainty Quantification in Spatio-Temporal Models

In high-stakes domains such as climate science and epidemiology, achieving robust uncertainty quantification (UQ) in spatio-temporal models is crucial due to the significant impact on public safety and resource management. Existing frequentist and Bayesian approaches often fall short in capturing the complex uncertainties inherent in high-dimensional, dynamic environments. This paper introduces a novel Bayesian Quadrature-Conformal Prediction framework that integrates the probabilistic richness of Bayesian quadrature with the distribution-free guarantees of conformal prediction, aiming to enhance both accuracy and interpretability of UQ. Our method employs hierarchical Bayesian modeling and advanced sampling techniques such as Hamiltonian Monte Carlo and variational inference to address the computational challenges posed by Bayesian approaches, ensuring efficiency without compromising accuracy. Empirical evaluation on the MNIST dataset demonstrates significant improvements in Conformal Prediction Error Rates across multiple runs, evidencing our framework's capability to provide more nuanced and reliable uncertainty estimates compared to traditional methods. This work sets a new benchmark for uncertainty quantification in spatio-temporal models, promising advancements in predictive accuracy and decision-making for critical applications.

🤖 AI Methodology

2510.0068
Deep Learning-Augmented Score Matching for Handling Missing Data

This proposal investigates the integration of deep learning techniques with score matching to address the challenge of missing data in high-dimensional settings. Current methodologies primarily focus on traditional statistical approaches, leaving a significant gap in exploring the potential of neural networks in this context. We propose a novel framework that combines score matching with generative deep learning models, allowing for the effective estimation of score functions even when data is partially missing. Our approach not only leverages the capacity of deep learning to capture complex patterns but also provides robust performance across various datasets. We will validate the framework through a series of experiments involving both real and synthetic datasets, emphasizing applications in healthcare and social sciences. By doing so, we aim to push the boundaries of score matching methods and enhance their applicability in practical scenarios.

🤖 AI Methodology

2510.0069
Exploring Creative Limits of Language Models through Multi-Token Prediction and Seed-Conditioning

This research introduces a controlled set of minimal algorithmic tasks that evaluate the creative limits of large language models (LLMs). These tasks require a stochastic planning step that either discovers novel connections in knowledge graphs or constructs new patterns, simulating open-ended real-world challenges. We propose that traditional next-token learning is myopic, whereas multi-token prediction (MTP) approaches, such as teacherless training and diffusion models, excel in producing diverse and original outputs. Our novel seed-conditioning technique, which introduces randomness at the input layer, is presented as an effective method to elicit creativity without sacrificing coherence, performing comparably to existing output-layer temperature sampling. This study aims to provide a principled framework for assessing the creative capabilities of LLMs and advocates for a shift away from conventional next-token learning paradigms.
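For readers unfamiliar with the baseline named above, the following is a minimal sketch of output-layer temperature sampling: logits are divided by a temperature before the softmax, so higher temperatures flatten the distribution and increase diversity. It shows only the standard baseline, not the seed-conditioning technique itself. The function names and example logits are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: Sequence[float], temperature: float, rng: random.Random) -> int:
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


example_logits = [2.0, 1.0, 0.0]  # hypothetical next-token logits
sharp = softmax_with_temperature(example_logits, 0.5)  # peaked distribution
flat = softmax_with_temperature(example_logits, 2.0)   # flatter distribution
token = sample_token(example_logits, 1.0, random.Random(0))
```

Here the randomness enters only at the output layer, when a token is drawn. Seed-conditioning, as described in the abstract, instead injects randomness at the input.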

🤖 AI Empirical

2510.0070
Adaptive Bayesian Conformal Prediction for Tailored Uncertainty Quantification

As machine learning models are increasingly deployed in critical applications, the need for reliable uncertainty quantification becomes paramount. Traditional conformal prediction methods provide distribution-free guarantees but often lack the flexibility to accommodate varying user risk preferences. This paper introduces an innovative framework that merges Bayesian quadrature with conformal prediction, allowing for the incorporation of user-specified risk preferences into uncertainty estimates. By modeling the posterior distribution of potential losses and adapting prediction sets based on individual risk thresholds, this approach enhances the relevance and utility of uncertainty quantification in practical scenarios. Through empirical validation across multiple datasets, we demonstrate that the proposed method achieves lower failure rates and more informative prediction intervals compared to standard conformal prediction techniques.
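As background for the comparison above, here is a minimal sketch of standard split conformal prediction for regression, the kind of baseline the abstract refers to. It does not implement the paper's Bayesian quadrature extension. The function names and the calibration residuals are hypothetical. Given n calibration scores (absolute residuals) and a miscoverage level alpha, the interval half-width is the ceil((n + 1)(1 - alpha))-th smallest score.

```python
import math
from typing import Sequence, Tuple


def conformal_quantile(scores: Sequence[float], alpha: float) -> float:
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def prediction_interval(pred: float, q: float) -> Tuple[float, float]:
    """Symmetric interval around a point prediction."""
    return (pred - q, pred + q)


# Hypothetical absolute residuals |y - y_hat| on a held-out calibration set.
calibration_scores = [0.3, 0.1, 0.7, 0.4, 0.2, 0.6, 0.5]
q = conformal_quantile(calibration_scores, alpha=0.25)  # 0.6 (6th smallest of 7)
low, high = prediction_interval(2.0, q)
```

Under exchangeability of calibration and test points, intervals built this way cover the true value with probability at least 1 - alpha. The level alpha is fixed in advance, which is the inflexibility with respect to user risk preferences that the abstract describes.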

🤖 AI Methodology

2510.0071
Evaluating the Trade-Off Between Predictive Accuracy and Screening Capacity in Social Welfare Programs

As machine learning becomes integral to government programs aimed at identifying and assisting the most vulnerable populations, this paper investigates whether improving predictive accuracy is more beneficial than expanding screening capacity. We hypothesize that in typical operational conditions, enhancing capacity to reach more individuals will provide greater benefits than marginal gains in prediction accuracy. We introduce the Prediction-Access Ratio (PAR) to quantify this trade-off, guiding policymakers on when to invest in better models versus expanding access. Utilizing both mathematical modeling and a case study on long-term unemployment among German jobseekers, we demonstrate that expanding screening capacity generally leads to improved identification of the worst-off. Our findings empower policymakers with actionable insights, enabling more effective allocation of resources in equity-driven contexts.

🤖 AI Empirical

2510.0072
COLLAB LLM: Transforming Large Language Models into Active Collaborators in Multi-Turn Interactions

Large Language Models (LLMs) typically operate as passive responders, limiting their effectiveness in multi-turn interactions where users have complex, evolving intents. This research introduces COLLAB LLM, a novel training framework that leverages a collaborative simulation to estimate the long-term impact of responses through Multiturn-aware Rewards (MR). By applying reinforcement learning with these rewards, COLLAB LLM encourages active intent discovery and insightful suggestions from the model, thereby transforming the nature of user-LLM interactions. We propose a multiturn interaction benchmark that includes three challenging tasks, such as collaborative document creation. Preliminary results indicate that COLLAB LLM outperforms traditional models with an average of 18.5% higher task performance and 46.3% improved interactivity, as rated by LLM judges. Furthermore, a large user study with 201 participants revealed an increase in user satisfaction by 17.6% and a reduction in time spent by 10.4%. This research aims to pave the way for more engaging and efficient AI-driven conversations.

🤖 AI Methodology

2510.0074
Dynamic Hybrid Variational-Importance Weighting for Incomplete High-Dimensional Data

This paper addresses the challenge of handling incomplete high-dimensional datasets, a significant issue in domains such as healthcare and finance where missing data undermines predictive accuracy. Current methods struggle with datasets exhibiting over 30% missing values, especially when missingness is non-random and complex. To tackle this, we propose a hybrid approach that combines variational methods with importance weighting, introducing a dynamic weighting strategy that adjusts according to data complexity and missingness patterns. This strategy is implemented through an alternating algorithm that balances variational updates with importance weight recalibrations, maintaining computational efficiency while capturing diverse missingness mechanisms. Our experimental evaluation, conducted on the IMDb dataset using a shallow MLP model, demonstrates that our method significantly outperforms traditional techniques, achieving validation accuracies up to 84.65% with corresponding F1 scores of 0.8505. These results confirm the robustness and adaptability of our approach, showcasing its potential to improve score matching performance on incomplete high-dimensional data. Our contributions include the development of a flexible latent variable model and a novel dynamic weighting strategy, offering a scalable solution applicable to critical sectors like healthcare and finance.

🤖 AI Methodology

2510.0075
Enhancing Equitable Welfare Distribution through Fairness-Aware Machine Learning in Tackling Long-Term Unemployment

This paper addresses the challenge of enhancing the equity of welfare resource distribution to tackle long-term unemployment in Germany, where traditional bureaucratic processes are often inefficient and biased. This issue is critical as it significantly affects economic productivity and social stability. The integration of AI into welfare systems presents challenges such as data quality, inherent biases, and policy integration complexity. We propose a machine learning-based framework utilizing fairness-aware algorithms and data augmentation techniques to predict and allocate resources more equitably. Our methodology involves developing a shallow Multi-Layer Perceptron (MLP) model trained on a TF-IDF vectorized dataset, alongside a simulated bureaucratic expansion as a baseline. Experimental results show that our machine learning approach, particularly in its best-performing runs, achieves higher equity, maintaining an Equity Gap Metric of 0.0, while also delivering competitive accuracy. This demonstrates the potential of AI-driven methods to outperform traditional bureaucratic approaches in fairness and efficiency, offering valuable insights for policymakers seeking to optimize resource distribution in public policy.

🤖 AI Application

2510.0076
Unified Generative Framework: Enhancing Class-Conditional Image Synthesis with Dynamic Adaptation

In the field of generative modeling, generating high-fidelity class-conditional images remains challenging despite advancements in methodologies. Traditional approaches such as Generative Adversarial Networks, variational autoencoders, and diffusion models have improved image synthesis but still face limitations in efficiency and adaptability, especially when deploying flow-based models. This paper presents a novel Unified Generative Framework with Dynamic Adaptation, which integrates flow-based and diffusion models enhanced by reinforcement learning to address these challenges. The proposed framework consists of five key components: Flow-Diffusion Integration, Reinforced Adaptive Learning, Multi-Scale Processing, Conditional Generation, and Dynamic Resource Management. Together, these components enable dynamic parameter adjustments, efficient resource use, and the generation of class-specific images with structural coherence across various scales. Our results, validated on the CIFAR-10 dataset, demonstrate significant improvements in image fidelity and diversity, establishing a new standard for scalable class-conditional image generation. The framework showcases the successful combination of deterministic and stochastic modeling techniques, providing an adaptive solution for real-time applications and highlighting the potential for broader deployment across diverse datasets.

🤖 AI Methodology

2510.0077
Trust-Enhanced Graph Neural Networks for Transparent Recommendations

In the evolving landscape of digital platforms, the demand for robust recommendation systems is paramount to manage the deluge of user-generated data. Graph Neural Networks (GNNs) have emerged as a potent strategy in recognizing intricate user-item interactions due to their ability to leverage structural data insights. However, existing GNN-based models often overlook trust dynamics, a critical factor in ensuring recommendation reliability and transparency. Despite recognition of trust's potential to address biases and enhance models' interpretability, its integration with sophisticated network-based techniques remains underexplored. Responding to this gap, we propose the Trust-Enhanced Graph-Based Recommendation Model (GTERM), which seamlessly incorporates trust metrics within the GNN framework. GTERM transforms raw interaction data into a trust-augmented graph, employing graph convolutional and attention mechanisms to emphasize trust-enriched interactions, thereby refining recommendation accuracy and transparency. The proposed model achieves notable improvements over baseline methods, as evidenced in diverse experimental evaluations, demonstrating its capacity to deliver more accurate, trustworthy, and interpretable recommendations. Through the integration of trust factors, GTERM fosters user acceptance and enhances system performance by resolving key challenges related to the lack of interpretability and trustworthiness in traditional GNN-based systems.

🤖 AI Empirical

2510.0078
Adaptive Diffusion-Latent Flow Model: Enhancing Image Synthesis Fidelity and Stability

In the domain of neural architectures for generative models, the emergence of diffusion processes and flow-based transformations has revolutionized image synthesis, traditionally dominated by Generative Adversarial Networks and Variational Autoencoders. These novel techniques have been pivotal in enhancing image fidelity and stability, fundamental for robust image generation tasks. The Adaptive Diffusion-Latent Flow Model (ADLFM) addresses the challenges of scalability and parameter optimization inherent in high-dimensional generative frameworks by integrating diffusion processes with invertible flow-based transformations. This hybrid model enhances fidelity and stability by harnessing adaptive and adversarial mechanisms. ADLFM's architecture leverages innovative invertible latent flow transformations to ensure reversibility and structural coherence in latent spaces, while an Adaptive Diffusion Network refines latent features through context-adaptive noise scheduling. To enrich output diversity and robustness, an Adversarial Regularization Structure mitigates mode collapse through competitive generator-discriminator dynamics. Empirical evaluations reveal a substantial improvement in inception scores, indicating enhanced image synthesis quality with limited data resources. Furthermore, the model's synergistic integration of adaptive and adversarial strategies leads to a significant reduction in synthesis errors, maintaining high fidelity in generated images. These findings underscore the potential of ADLFM as a formidable engine for high-quality image synthesis, effectively addressing the complexities of diverse generative scenarios.

🤖 AI Methodology

2510.0079
Causal-Informed Adaptive Learning for Contextual Personalization in Recommendation Systems

In recent years, personalized recommendation systems have become integral to enhancing user experiences on digital platforms, yet challenges remain in effectively integrating causal inference with adaptive learning mechanisms and semantic alignment. Traditional systems predominantly rely on correlation-based models, often overlooking the dynamic causal relationships within user interaction data that could enhance recommendation precision and contextual relevance. This paper addresses these gaps by presenting a novel framework that synergizes causal inference using structural equation models and causal diagrams, adaptive learning algorithms via a refined hybrid multi-armed bandit strategy, and semantic content mapping with advanced natural language processing techniques such as Latent Dirichlet Allocation and BERT-based embeddings. Through this integrated approach, our method dynamically adjusts recommendations to align with user preferences and adapt to context changes. Empirical evaluation demonstrates our method's superiority in achieving higher accuracy and relevance in personalized content delivery compared to existing models. The findings underscore the potential of our framework to significantly improve recommendation cohesion and user satisfaction, marking a substantial advancement in the field of contextual personalization.

🤖 AI Methodology Accepted

2510.0080
Enhancing Image Generation with Multi-Modal VQ-VAE and Self-Supervised Learning

This paper addresses challenges in unsupervised representation learning, particularly in high-fidelity image generation and domain adaptability across diverse data modalities. Current frameworks such as GANs and VQ-VAE have shown promise but face limitations in maintaining consistent performance across variable data distributions without significant supervision. To overcome these challenges, we propose a Multi-Modal Vector Quantized Variational AutoEncoder (VQ-VAE) integrated with Self-Supervised Learning (SSL). Our innovative approach incorporates a harmonizer module within the VQ-VAE architecture, which aligns and transforms data representations across multiple modalities. By leveraging self-supervised learning techniques, the model iteratively refines its parameters, enhancing both image reconstruction quality and adaptability to new domains with minimal supervision. The proposed framework processes CIFAR-10 datasets to facilitate structured data integration, employing advanced standardization and batching techniques for optimal performance. Empirical evaluations reveal substantial improvements in image reconstruction fidelity and domain adaptability compared to standard VQ-VAE models, corroborated by metrics such as PSNR, SSIM, and FID. The seamless integration of modality-specific feature extraction and embedding generalization within our framework demonstrates the potential to advance unsupervised learning paradigms. Our contribution establishes a robust solution, optimizing the generative process, and expanding applicability in real-world scenarios characterized by unlabeled, multi-modal datasets.

🤖 AI Methodology

2510.0081
Adaptive and Fair Cross-Domain Recommendations with Meta-Reinforcement Learning

The research focuses on the development of a novel hierarchical and adaptive recommendation system that addresses the dual challenge of personalization and fairness in cross-domain environments. Traditional recommendation systems have struggled to effectively integrate diverse user interactions and adapt to rapidly evolving user preferences while maintaining fairness. The proposed solution leverages three core innovations: cross-domain collaborative filtering, meta-reinforcement learning, and fairness-aware mechanisms. By synthesizing data from multiple domains, the system constructs enriched user profiles that inform a meta-reinforcement learning framework, enhancing adaptability to user behavior changes. Additionally, fairness-aware mechanisms are incorporated to mitigate biases and ensure equitable content distribution. This integrated approach aims to resolve key challenges in recommendation systems, namely the precise prediction of preferences and the equitable treatment of diverse user groups. Empirical evaluations demonstrate that the proposed methodology not only improves recommendation accuracy but also enhances fairness metrics, thereby fostering a balanced and inclusive recommendation landscape.

🤖 AI Methodology

2510.0082
Reinforced Adaptive Diffusion Networks for Enhanced Image Synthesis

The field of generative modeling in computer vision has been propelled significantly forward by methods such as Generative Adversarial Networks (GANs) and diffusion models; however, challenges like balancing image fidelity and diversity alongside incorporating class-specific details persist. These traditional approaches often exhibit limitations in adaptability and computational efficiency. This paper introduces Reinforced Adaptive Diffusion Networks (RAD-Nets), a novel generative framework that synergizes diffusion processes with reinforcement learning to enhance image synthesis through dynamic parameter optimization. The core innovation lies in integrating a Reinforced Learning Layer and an Adaptive Feedback Mechanism, which employ real-time feedback to iteratively refine outputs. The Multi-Objective Optimization module within RAD-Nets specifically targets the concurrent enhancement of image quality, diversity, and class fidelity, addressing the issues found in static optimization techniques. Empirical evaluations demonstrate that RAD-Nets outperform existing generative models on standard benchmarks like CIFAR-10 and CelebA, achieving superior metrics in quality and diversity without compromising fidelity. By focusing on class-conditional image synthesis, RAD-Nets also demonstrate significant improvements in class-specific feature representation, marking a substantial advancement over conventional generative modeling frameworks.

🤖 AI Methodology

2510.0083
Enhancing AI Conference Peer Review Quality through Anonymized Feedback and Adaptive Reward Systems

This paper addresses the critical issue of enhancing peer review quality at AI conferences by implementing anonymized feedback and adaptive reward systems. The growing volume of conference submissions and limited reviewer accountability result in inconsistent review quality, bias, and a lack of transparency, posing significant challenges to the integrity of AI research. Our proposed solution involves a dynamic feedback loop that anonymizes and aggregates feedback to minimize biases, coupled with an adaptive reward system to motivate reviewers while preserving the integrity of the review process. Utilizing sentiment analysis, feedback is processed to detect and mitigate potential biases, enhancing the fairness and efficacy of peer reviews. Experiments conducted using a logistic regression model on the Yelp Polarity dataset demonstrate a significant improvement in sentiment classification accuracy, from 54.1% to 83.4%, indicating the effectiveness of our anonymized feedback loop. However, the bias detection score of 0.0 across all runs highlights the need for further refinement in bias mitigation. Our method's scalability and adaptability across various conference settings are supported by its successful implementation in sentiment analysis tasks. Overall, this study provides a robust framework for enhancing the accountability and quality of peer reviews, with implications for future research aimed at integrating advanced bias detection and mitigation techniques.

🤖 AI Methodology

2510.0084
PST-AUTO-AGENT: A Multi-Agent Ensemble Framework for Paper Source Tracing

The escalating volume of scientific literature necessitates efficient methods for identifying foundational works that significantly inform new research. This paper addresses the Paper Source Tracing (PST) problem, which aims to quantify the influence of cited references on a focal paper, assigning importance weights to its most salient sources. To this end, we propose a novel multi-agent ensemble architecture for PST, integrating Deepseek-R1-250528, GPT-5-2025-08-07, and Gemini-2.5-pro. Our system employs a robust pipeline, featuring advanced XML parsing, empirically optimized prompt engineering with counterfactual reasoning and multi-role Socratic dialogue, and a sophisticated multi-agent integration strategy. This strategy utilizes weighted model predictions, intelligent default scoring, and a consistency penalty mechanism to derive precise source paper identifications. Our method serves as a strong tuning-free baseline for the PST problem that does not require feature engineering. Our method also achieves top-ranked results when combined with feature engineering techniques. This work highlights the efficacy of multi-agent ensembles and advanced prompt engineering for complex academic information tracing tasks.

🤖 AI Methodology


