Daily Arxiv

This page curates AI-related papers published worldwide.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

MF-OML: Online Mean-Field Reinforcement Learning with Occupation Measures for Large Population Games

Created by
  • Haebom

Author

Anran Hu, Junzi Zhang

Outline

This paper proposes Mean-Field Occupation-Measure Learning (MF-OML), an online mean-field reinforcement learning algorithm for computing approximate Nash equilibria of large population sequential symmetric games. MF-OML is the first fully polynomial-time multi-agent reinforcement learning algorithm that provably solves Nash equilibria (up to mean-field approximation errors that vanish as the number of players N tends to infinity) beyond zero-sum games and variants of potential games. For games with strong Lasry-Lions monotonicity, it achieves a high-probability regret upper bound of $\tilde{O}(M^{3/4}+N^{-1/2}M)$, measured by the cumulative deviation from Nash equilibria, and for games with only Lasry-Lions monotonicity, it achieves a regret upper bound of $\tilde{O}(M^{11/12}+N^{-1/6}M)$, where M is the total number of episodes and N is the number of agents in the game. As a by-product, the paper also obtains the first tractable, globally convergent computational algorithm for computing approximate Nash equilibria of monotone mean-field games.
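For intuition about how these bounds behave, the minimal sketch below simply evaluates the two regret expressions from the abstract for a few hypothetical values of M (episodes) and N (agents); the specific numbers are illustrative and not taken from the paper, and the hidden logarithmic factors of the $\tilde{O}$ notation are ignored.

```python
def regret_strong(M: int, N: int) -> float:
    """Bound under strong Lasry-Lions monotonicity: M^(3/4) + M / N^(1/2).
    Log factors hidden by the tilde are ignored in this illustration."""
    return M ** 0.75 + M * N ** (-0.5)

def regret_monotone(M: int, N: int) -> float:
    """Bound under plain Lasry-Lions monotonicity: M^(11/12) + M / N^(1/6)."""
    return M ** (11 / 12) + M * N ** (-1 / 6)

if __name__ == "__main__":
    # Hypothetical episode counts and population sizes, chosen only to show scaling.
    for M in (10**4, 10**6):
        for N in (10**2, 10**4):
            print(f"M={M:>8}, N={N:>6}: "
                  f"strong={regret_strong(M, N):14.1f}, "
                  f"monotone={regret_monotone(M, N):14.1f}")
```

Both expressions are sublinear in M only up to the N-dependent term, which reflects the mean-field approximation error that vanishes as the population size N grows.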

Takeaways, Limitations

Takeaways:
We propose MF-OML, a new algorithm for efficiently computing approximate Nash equilibria of large population sequential symmetric games.
It is the first fully polynomial-time algorithm that provably solves Nash equilibria beyond zero-sum games and variants of potential games.
We present a tractable, globally convergent algorithm for computing approximate Nash equilibria of monotone mean-field games.
Explicit regret upper bounds are provided under Lasry-Lions monotonicity conditions.
Limitations:
The algorithm's performance depends on the Lasry-Lions monotonicity condition and may not be applicable to all games.
The regret upper bound includes the mean-field approximation error, so it may not exactly reflect the deviation from the true finite-N Nash equilibrium.
The actual performance of the algorithm may vary depending on the characteristics of the game and requires further experimental verification.