Papers
🌟 arXiv Spotlight
-
GRPO is Secretly a Process Reward Model
-
WAVECLIP: Wavelet Tokenization for Adaptive-Resolution CLIP
-
LAVA: Explainability for Unsupervised Latent Embeddings
-
Emerging Paradigms for Securing Federated Learning Systems
-
UniSS: Unified Expressive Speech-to-Speech Translation with Your Voice
-
Embodied Representation Alignment with Mirror Neurons
-
ToMPO: Training LLM Strategic Decision Making from a Multi-Agent Perspective
-
RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs
-
Teaching RL Agents to Act Better: VLM as Action Advisor for Online Reinforcement Learning
-
Expanding Reasoning Potential in Foundation Model by Learning Diverse Chains of Thought Patterns
-
TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them
-
Cross-Modal Instructions for Robot Motion Generation
-
GraphUniverse: Enabling Systematic Evaluation of Inductive Generalization
-
Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute
-
Vision Transformers: the threat of realistic adversarial patches
-
TyphoonMLA: A Mixed Naive-Absorb MLA Kernel For Shared Prefix
-
Which Cultural Lens Do Models Adopt? On Cultural Positioning Bias and Agentic Mitigation in LLMs
-
Communication Bias in Large Language Models: A Regulatory Perspective
-
Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation,
-
ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning