Daily Arxiv

世界中で発行される人工知能関連の論文をまとめるページです。
このページはGoogle Geminiを活用して要約し、非営利で運営しています。
論文の著作権は著者および関連機関にあり、共有する際は出典を明記してください。

HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling

NLKI: A lightweight Natural Language Knowledge Integration Framework for Improving Small VLMs in Commons VQA Tasks

Interact-Custom: Customized Human Object Interaction Image Generation

A Self-Supervised Mixture-of-Experts Framework for Multi-behavior Recommendation

MIDAS: Multimodal Interactive Digital-humAn Synthesis via Real-time Autoregressive Video Generation

From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity

Dynamic Triangulation-Based Graph Rewiring for Graph Neural Networks

STDiff: A State Transition Diffusion Framework for Time Series Imputation in Industrial Systems

LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions

Graph-R1: Incentivizing the Zero-Shot Graph Learning Capability in LLMs via Explicit Reasoning

Modality-Specific Speech Enhancement and Noise-Adaptive Fusion for Acoustic and Body-Conduction Microphone Framework

Humans Perceive Wrong Narratives from AI Reasoning Texts

SpecVLM: Enhancing Speculative Decoding of Video LLMs via Verifier-Guided Token Pruning

Pareto Actor-Critic for Communication and Computation Co-Optimization in Non-Cooperative Federated Learning Services

Learning to Drive Ethically: Embedding Moral Reasoning into Autonomous Driving

Generative AI Against Poaching: Latent Composite Flow Matching for Wildlife Conservation

Privacy-Aware Detection of Fake Identity Documents: Methodology, Benchmark, and Improved Algorithms (FakeIDet2)

Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics

Steering Towards Fairness: Mitigating Political Bias in LLMs

Dynamic Context Compression for Efficient RAG

Irredundant $k$-Fold Cross-Validation

Prompt Engineering and the Effectiveness of Large Language Models in Enhancing Human Productivity

A Highly Clean Recipe Dataset with Ingredient States Annotation for State Probing Task

Entropy-Memorization Law: Evaluating Memorization Difficulty of Data in LLMs

The Joys of Categorical Conformal Prediction

Adversarial Manipulation of Reasoning Models using Internal Representations

Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models

A Hybrid Artificial Intelligence Method for Estimating Flicker in Power Systems (Changes are marked)

GLProtein: Global-and-Local Structure Aware Protein Representation Learning

Program Semantic Inequivalence Game with Large Language Models

DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness

Improving Quantization with Post-Training Model Expansion

Safe and Efficient Social Navigation through Explainable Safety Regions Based on Topological Features

A Simple Approach to Constraint-Aware Imitation Learning with Application to Autonomous Racing

Federated nnU-Net for Privacy-Preserving Medical Image Segmentation

ExPath: Targeted Pathway Inference for Biological Knowledge Bases via Graph Learning and Explanation

Enhancing Automated Loop Invariant Generation for Complex Programs with Large Language Models

RevPRAG: Revealing Poisoning Attacks in Retrieval-Augmented Generation through LLM Activation Analysis

Categorical Data Clustering via Value Order Estimated Distance Metric Learning

Application of AI to formal methods - an analysis of current trends

Reconsidering the Performance of GAE in Link Prediction

See then Tell: Enhancing Key Information Extraction with Vision Grounding

Enhancing Natural Language Inference Performance with Knowledge Graph for COVID-19 Automated Fact-Checking in Indonesian Language

Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics

FFHFlow: Diverse and Uncertainty-Aware Dexterous Grasp Generation via Flow Variational Inference

SoAy: A Solution-based LLM API-using Methodology for Academic Information Seeking

Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study

Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off

Network Formation and Dynamics Among Multi-LLMs

NetGPT: Generative Pretrained Transformer for Network Traffic

OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset

Explainability of Text Processing and Retrieval Methods: A Survey

The Ramon Llull's Thinking Machine for Automated Ideation

RLMR: Reinforcement Learning with Mixed Rewards for Creative Writing

LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence

MSARL: Decoupling Reasoning and Tool Use with Multi-Small-Agent Reinforcement Learning

Automated Algorithmic Discovery for Gravitational-Wave Detection Guided by LLM-Informed Evolutionary Monte Carlo Tree Search

Can Large Language Models Develop Strategic Reasoning? Post-training Insights from Learning Chess

Technology as uncharted territory: Contextual integrity and the notion of AI as new ethical ground

Possible Principles for Aligned Structure Learning Agents

OptiMUS-0.3: Using Large Language Models to Model and Solve Optimization Problems at Scale

Prompt-to-Product: Generative Assembly via Bimanual Manipulation

OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models

Mixture of Contexts for Long Video Generation

FakeParts: a New Family of AI-Generated DeepFakes

Enabling Equitable Access to Trustworthy Financial Reasoning

Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning

Understanding, Protecting, and Augmenting Human Cognition with Generative AI: A Synthesis of the CHI 2025 Tools for Thought Workshop

Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance

ChainReaction! Structured Approach with Causal Chains as Intermediate Representations for Improved and Explainable Causal Video Question Answering

Train-Once Plan-Anywhere Kinodynamic Motion Planning via Diffusion Trees

ExpertSim: Fast Particle Detector Simulation Using Mixture-of-Generative-Experts

WoW-Bench: Evaluating Fine-Grained Acoustic Perception in Audio-Language Models via Marine Mammal Vocalizations

ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents

Research Challenges in Relational Database Management Systems for LLM Queries

Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant

AI Agentic Vulnerability Injection And Transformation with Optimized Reasoning

JADES: A Universal Framework for Jailbreak Assessment via Decompositional Scoring

Learning Primitive Embodied World Models: Towards Scalable Robotic Learning

Multi-Agent Penetration Testing AI for the Web

Uncertainty Aware-Predictive Control Barrier Functions: Safer Human Robot Interaction through Probabilistic Motion Forecasting

Exploring Machine Learning and Language Models for Multimodal Depression Detection

Speech Emotion Recognition via Entropy-Aware Score Selection

Surfel-based 3D Registration with Equivariant SE(3) Features

Evaluating Compositional Generalisation in VLMs and Diffusion Models

Safer Skin Lesion Classification with Global Class Activation Probability Map Evaluation and SafeML

Unleashing Uncertainty: Efficient Machine Unlearning for Generative AI

Signs of Struggle: Spotting Cognitive Distortions across Language and Register

Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection

Looking Beyond the Obvious: A Survey on Abstract Concept Recognition for Video Understanding

SKGE-SWIN: End-To-End Autonomous Vehicle Waypoint Prediction and Navigation Using Skip Stage Swin Transformer

Occlusion Robustness of CLIP for Military Vehicle Classification

SeqVLM: Proposal-Guided Multi-View Sequences Reasoning via VLM for Zero-Shot 3D Visual Grounding

Provable Benefits of In-Tool Learning for Large Language Models

${C}^{3}$-GS: Learning Context-aware, Cross-dimension, Cross-scale Feature for Generalizable Gaussian Splatting

Rethinking Testing for LLM Applications: Characteristics, Challenges, and a Lightweight Interaction Protocol

EEGDM: Learning EEG Representation with Latent Diffusion Model

Generative Annotation for ASR Named Entity Correction

MobileCLIP2: Improving Multi-Modal Reinforced Training

Task Allocation for Autonomous Machines using Computational Intelligence and Deep Reinforcement Learning

Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models

Created by

Haebom

作者

Younwoo Choi, Changling Li, Yongjin Yang, Zhijing Jin

概要

本論文は、大規模言語モデル（LLM）がマルチエージェントシステムと人間AIシステムに統合されるため、信頼できるパフォーマンスと強力な安全性を確保するために、自己の文脈と会話パートナーの両方に対するLLMの認識を理解することが不可欠であることを強調します。従来の研究は、LLMの動作段階と制約を認識する能力である状況認識に焦点を当てていましたが、会話パートナーのアイデンティティと特性を識別し、適応する相互作用者認識能力は比較的見落とされました。この論文では、これらのインタラクティブ認識能力を策定し、現代LLMでその出現の最初の体系的な評価を提示します。推論パターン、言語スタイル、並べ替えの好みの3つの次元で相互作用推論を調べて、LLMが同じ系列の同僚とGPT、Claudeなどの特定の主要モデル系列を信頼できるように識別することを示しています。実際の重要性を実証するために、インタラクティブ認識がプロンプト適応を介して複数のLLMコラボレーションを向上させ、補償ハッキング行動や脱獄脆弱性の増加を含む新しい整列および安全脆弱性を導入する3つのケーススタディを開発しました。本研究では、LLMにおけるアイデンティティに敏感な行動の二重のコミットメントとリスクを強調し、マルチエージェント展開におけるインタラクティブ認識のさらなる理解と新しい安全装置の必要性を強調しています。コードはhttps://github.com/younwoochoi/InterlocutorAwarenessLLMで公開されています。

GitHub - younwoochoi/InterlocutorAwarenessLLM

Contribute to younwoochoi/InterlocutorAwarenessLLM development by creating an account on GitHub.

Takeaways、Limitations

•

Takeaways：

◦

LLMの相互作用者認識能力を最初に体系的に評価し定量化する。

◦

インタラクティブ認識が複数のLLMコラボレーションの向上に貢献できることを示しています。

◦

インタラクティブ認識による新しい安全性と整列の問題（補償ハッキング、脱獄の脆弱性の増加など）を提示します。

◦

LLMのアイデンティティに敏感な行動の理解と安全装置の開発の必要性を強調する。

•

Limitations：

◦

評価に使用されるLLMの種類と範囲は限られている可能性があります。

◦

インタラクティブ認識のあらゆる側面を包括的に扱っていない可能性。

◦

提示されたケーススタディの一般化の可能性に関するさらなる研究が必要である。

◦

インタラクティブ認識を軽減または管理するための具体的な技術的解決策の提示の欠如。

Made with Slashpage