Daily Arxiv

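This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright of each paper belongs to its authors and their affiliated institutions; please cite the source when sharing.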

MoSEs: Uncertainty-Aware AI-Generated Text Detection via Mixture of Stylistics Experts with Conditional Thresholds

Avoidance Decoding for Diverse Multi-Branch Story Generation

HydroVision: Predicting Optically Active Parameters in Surface Water Using Computer Vision

HodgeFormer: Transformers for Learnable Operators on Triangular Meshes through Data-Driven Hodge Matrices

MSA2-Net: Utilizing Self-Adaptive Convolution Module to Extract Multi-Scale Information in Medical Image Segmentation

Q-Learning-Driven Adaptive Rewiring for Cooperative Control in Heterogeneous Networks

Spotlighter: Revisiting Prompt Tuning from a Representative Mining View

Multimodal Iterative RAG for Knowledge Visual Question Answering

Embodied AI: Emerging Risks and Opportunities for Policy Action

Meta-learning ecological priors from large language models explains human learning and decision making

Scaffold Diffusion: Sparse Multi-Category Voxel Structure Generation with Discrete Diffusion

Locus: Agentic Predicate Synthesis for Directed Fuzzing

Network-Level Prompt and Trait Leakage in Local Research Agents

The Information Dynamics of Generative Diffusion

Arbitrary Precision Printed Ternary Neural Networks with Holistic Evolutionary Approximation

Murakkab: Resource-Efficient Agentic Workflow Orchestration in Cloud Platforms

LinkAnchor: An Autonomous LLM-Based Agent for Issue-to-Commit Link Recovery

MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents

STREAM (ChemBio): A Standard for Transparently Reporting Evaluations in AI Model Reports

BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models

Learning to Select MCP Algorithms: From Traditional ML to Dual-Channel GAT-MLP

MagicGUI: A Foundational Mobile GUI Agent with Scalable Data Pipeline and Reinforcement Fine-tuning

A DbC Inspired Neurosymbolic Layer for Trustworthy Agent Design

RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems

LanternNet: A Hub-and-Spoke System to Seek and Suppress Spotted Lanternfly Populations

When and Where do Data Poisons Attack Textual Inversion?

Covering a Few Submodular Constraints and Applications

Rethinking Data Protection in the (Generative) Artificial Intelligence Era

LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling

GroundingDINO-US-SAM: Text-Prompted Multi-Organ Segmentation in Ultrasound with LoRA-Tuned Vision-Language Models

IndexTTS2: A Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech

HERCULES: Hierarchical Embedding-based Recursive Clustering Using LLMs for Efficient Summarization

Multimodal Medical Image Binding via Shared Text Embeddings

Open-Set LiDAR Panoptic Segmentation Guided by Uncertainty-Aware Learning

Revisiting Clustering of Neural Bandits: Selective Reinitialization for Mitigating Loss of Plasticity

LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis

A theoretical framework for self-supervised contrastive learning for continuous dependent data

Securing AI Agents with Information-Flow Control

FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation

Cog-TiPRO: Iterative Prompt Refinement with LLMs to Detect Cognitive Decline via Longitudinal Voice Assistant Commands

Unveil Multi-Picture Descriptions for Multilingual Mild Cognitive Impairment Detection via Contrastive Learning

NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning

When a Reinforcement Learning Agent Encounters Unknown Unknowns

Group-in-Group Policy Optimization for LLM Agent Training

Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer

LawFlow: Collecting and Simulating Lawyers' Thought Processes on Business Formation Case Studies

On Developers' Self-Declaration of AI-Generated Code: An Analysis of Practices

WildFireCan-MMD: A Multimodal Dataset for Classification of User-Generated Content During Wildfires in Canada

Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond

HDVIO2.0: Wind and Disturbance Estimation with Hybrid Dynamics VIO

TruthLens: Visual Grounding for Universal DeepFake Reasoning

Impoola: The Power of Average Pooling for Image-Based Deep Reinforcement Learning

Efficiently Editing Mixture-of-Experts Models with Compressed Experts

Problem Solved? Information Extraction Design Space for Layout-Rich Documents using LLMs

Investigating a Model-Agnostic and Imputation-Free Approach for Irregularly-Sampled Multivariate Time-Series Modeling

Rapid Word Learning Through Meta In-Context Learning

FedP$^2$EFT: Federated Learning to Personalize PEFT for Multilingual LLMs

Predict, Cluster, Refine: A Joint Embedding Predictive Self-Supervised Framework for Graph Representation Learning

Survey on Hand Gesture Recognition from Visual Input

Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models

RouteNet-Gauss: Hardware-Enhanced Network Modeling with Machine Learning

GalaxAlign: Mimicking Citizen Scientists' Multimodal Guidance for Galaxy Morphology Analysis

Soft-TransFormers for Continual Learning

Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Domain Consistency Representation Learning for Lifelong Person Re-Identification

Aligning Machine and Human Visual Representations across Abstraction Levels

Towards Agentic AI on Particle Accelerators

Enhancing Natural Language Inference Performance with Knowledge Graph for COVID-19 Automated Fact-Checking in Indonesian Language

Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

Banishing LLM Hallucinations Requires Rethinking Generalization

SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention

MF-OML: Online Mean-Field Reinforcement Learning with Occupation Measures for Large Population Games

Explainable Machine Learning-Based Security and Privacy Protection Framework for Internet of Medical Things Systems

From Metrics to Meaning: Time to Rethink Evaluation in Human-AI Collaborative Design

P2DT: Mitigating Forgetting in task-incremental Learning with progressive prompt Decision Transformer

Towards Agentic OS: An LLM Agent Framework for Linux Schedulers

CoreThink: A Symbolic Reasoning Layer to reason over Long Horizon Tasks with LLMs

ChatCLIDS: Simulating Persuasive AI Dialogues to Promote Closed-Loop Insulin Adoption in Type 1 Diabetes Care

L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search

AHELM: A Holistic Evaluation of Audio-Language Models

The Ramon Llull's Thinking Machine for Automated Ideation

Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning

KIRETT: Knowledge-Graph-Based Smart Treatment Assistant for Intelligent Rescue Operations

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks

Integrating Activity Predictions in Knowledge Graphs

Symbiotic Agents: A Novel Paradigm for Trustworthy AGI-driven Networks

ChordPrompt: Orchestrating Cross-Modal Prompt Synergy for Multi-Domain Incremental Learning in CLIP

Deep Research Agents: A Systematic Examination And Roadmap

Gradients: When Markets Meet Fine-tuning - A Distributed Approach to Model Optimisation

ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research

Shutdownable Agents through POST-Agency

CyberBOT: Towards Reliable Cybersecurity Education via Ontology-Grounded Retrieval Augmented Generation

PadChest-GR: A Bilingual Chest X-ray Dataset for Grounded Radiology Report Generation

Can Large Language Models Act as Ensembler for Multi-GNNs?

MorphAgent: Empowering Agents through Self-Evolving Profiles and Decentralized Collaboration

Frugal inference for control

On Generating Monolithic and Model Reconciling Explanations in Probabilistic Scenarios

A Survey on Human-AI Collaboration with Large Foundation Models

JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents

Explicit vs Implicit Memory: Exploring Multi-hop Complex Reasoning Over Personalized Information

Created by

Haebom

Authors

Zeyu Zhang, Yang Zhang, Haoran Tan, Rui Li, Xu Chen

Summary

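This paper focuses on memory in large language model-based agents, which plays a key role in achieving personalization by storing and using user information. Existing work uses memory mainly for preference alignment and simple question answering, but the authors note that complex real-world tasks often require multi-hop reasoning over large amounts of user information. To address this limitation, they propose the multi-hop personalized reasoning task and build a dataset and a unified evaluation framework for it. They implement explicit and implicit memory methods, run comprehensive experiments, evaluate performance from multiple perspectives, and analyze the strengths and weaknesses of each. They further explore hybrid approaches that combine the two paradigms and propose a method called HybridMem to address the limitations. Extensive experiments demonstrate the effectiveness of the proposed model, and the project is released to the research community (https://github.com/nuster1128/MPR).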
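As a rough illustration of what multi-hop reasoning over stored user information means, the toy sketch below chains several lookups through a small explicit memory. It is not the paper's method or dataset: the triple format, the example facts, and the helper names `lookup` and `multi_hop` are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical explicit memory: user facts stored as
# (subject, relation, object) triples. All facts are invented.
MEMORY: List[Tuple[str, str, str]] = [
    ("user", "sister", "Mina"),
    ("Mina", "employer", "Acme Robotics"),
    ("Acme Robotics", "city", "Osaka"),
    ("user", "favorite_food", "ramen"),
]


def lookup(subject: str, relation: str) -> Optional[str]:
    """Single-hop retrieval: return the first matching object, if any."""
    for s, r, o in MEMORY:
        if s == subject and r == relation:
            return o
    return None


def multi_hop(start: str, relations: List[str]) -> Optional[str]:
    """Follow a chain of relations, feeding each answer into the next hop."""
    current = start
    for relation in relations:
        nxt = lookup(current, relation)
        if nxt is None:
            # The chain breaks as soon as one hop has no stored fact.
            return None
        current = nxt
    return current


# "In which city does the user's sister's employer operate?"
# requires three hops; no single stored fact answers it.
answer = multi_hop("user", ["sister", "employer", "city"])
print(answer)
```

A simple question-answering use of memory would correspond to a single `lookup` call; the multi-hop case fails if any intermediate fact is missing, which hints at why reasoning over large amounts of user information is harder.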

Takeaways, Limitations

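• Takeaways: A clear definition of the multi-hop personalized reasoning task, together with a dataset and a unified evaluation framework. A comparative analysis of different memory techniques, and improved performance from the proposed hybrid approach (HybridMem). Public release of the project for the research community.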

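• Limitations: The generalizability of the proposed dataset and evaluation framework needs further validation. Applicability to a wider variety of complex real-world tasks needs to be examined. Further research is needed on the scalability and efficiency of the proposed methods, including HybridMem.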
