Daily Arxiv

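This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their affiliated institutions; please cite the source when sharing.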

Language Models are Injective and Hence Invertible

Learning to Detect Unknown Jailbreak Attacks in Large Vision-Language Models

Latent Diffusion Model without Variational Autoencoder

Planner and Executor: Collaboration between Discrete Diffusion And Autoregressive Models in Reasoning

CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions

Architecture Is All You Need: Diversity-Enabled Sweet Spots for Robust Humanoid Locomotion

LeapFactual: Reliable Visual Counterfactual Explanation Using Conditional Flow Matching

Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering

STANCE: Motion Coherent Video Generation Via Sparse-to-Dense Anchored Encoding

MedTrust-RAG: Evidence Verification and Trust Alignment for Biomedical Question Answering

Beyond One World: Benchmarking Super Heros in Role-Playing Across Multiversal Contexts

Static Sandboxes Are Inadequate: Modeling Societal Complexity Requires Open-Ended Co-Evolution in LLM-Based Multi-Agent Simulations

Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs

ConsintBench: Evaluating Language Models on Real-World Consumer Intent Understanding

Max It or Miss It: Benchmarking LLM On Solving Extremal Problems

Phenome-Wide Multi-Omics Integration Uncovers Distinct Archetypes of Human Aging

When Does Supervised Training Pay Off? The Hidden Economics of Object Detection in the Era of Vision-Language Models

The Curious Case of Factual (Mis)Alignment between LLMs' Short- and Long-Form Answers

A Vision for Access Control in LLM-based Agent Systems

Audit-of-Understanding: Posterior-Constrained Inference for Mathematical Reasoning in Language Models

Formally Verified Certification of Unsolvability of Temporal Planning Problems

DICE: Structured Reasoning in LLMs through SLM-Guided Chain-of-Thought Correction

MSDM: Generating Task-Specific Pathology Images with a Multimodal Conditioned Diffusion Model for Cell and Nuclei Segmentation

Synthetic Series-Symbol Data Generation for Time Series Foundation Models

SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation

Online automatic code generation for robot swarms: LLMs and self-organizing hierarchy

A New Digital Divide? Coder Worldviews, the Slop Economy, and Democracy in the Age of AI

Audit the Whisper: Detecting Steganographic Collusion in Multi-Agent LLMs

Creative synthesis of kinematic mechanisms

Market-Driven Subset Selection for Budgeted Training

Mini-vec2vec: Scaling Universal Geometry Alignment with Linear Transformations

A Comparison of Independent and Joint Fine-tuning Strategies for Retrieval-Augmented Generation

TimeEmb: A Lightweight Static-Dynamic Disentanglement Framework for Time Series Forecasting

Learning Generalizable Shape Completion with SIM(3) Equivariance

Dolphin v1.0 Technical Report

A Measurement Study of Model Context Protocol Ecosystem

Diffusion Models are Kelly Gamblers

RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility

Semantic Representation Attack against Aligned Large Language Models

Chiplet-Based RISC-V SoC with Modular AI Acceleration

Accurate and Efficient Low-Rank Model Merging in Core Space

The 1st Solution for 7th LSVOS RVOS Track: SaSaSa2VA

Graph Coloring for Multi-Task Learning

Robust LLM Training Infrastructure at ByteDance

RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation

Communications to Circulations: Real-Time 3D Wind Field Prediction Using 5G GNSS Signals and Deep Learning

Why and How Auxiliary Tasks Improve JEPA Representations

Creativity Benchmark: A benchmark for marketing creativity for large language models

SpikingBrain: Spiking Brain-inspired Large Models

Robust Pan-Cancer Mitotic Figure Detection with YOLOv12

BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

FlowDet: Overcoming Perspective and Scale Challenges in Real-Time End-to-End Traffic Detection

Epistemic Trade-Off: An Analysis of the Operational Breakdown and Ontological Limits of "Certainty-Scope" in AI

ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown Environments

Interpretable Decision-Making for End-to-End Autonomous Driving

A Systematic Approach to Predict the Impact of Cybersecurity Vulnerabilities Using LLMs

Limitations of Normalization in Attention Mechanism

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

The GPT-4o Shock Emotional Attachment to AI Models and Its Impact on Regulatory Acceptance: A Cross-Cultural Analysis of the Immediate Transition from GPT-4o to GPT-5

CorrSteer: Generation-Time LLM Steering via Correlated Sparse Autoencoder Features

VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models

SegDAC: Improving Visual Reinforcement Learning by Extracting Dynamic Object-Centric Representations from Pretrained Vision Models

VGGSounder: Audio-Visual Evaluations for Foundation Models

Evolution of AI Agent Registry Solutions: Centralized, Enterprise, and Distributed Approaches

CAPO: Towards Enhancing LLM Reasoning through Generative Credit Assignment

FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models

SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches

A Multi-Stage Hybrid CNN-Transformer Network for Automated Pediatric Lung Sound Classification

From Individual Learning to Market Equilibrium: Correcting Structural and Parametric Biases in RL Simulations of Economic Models

ReDi: Rectified Discrete Flow

Adaptive Policy Synchronization for Scalable Reinforcement Learning

From Sequence to Structure: Uncovering Substructure Reasoning in Transformers

Multimodal Fusion at Three Tiers: Physics-Driven Data Generation and Vision-Language Guidance for Brain Tumor Segmentation

Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences

DP-Fusion: Token-Level Differentially Private Inference for Large Language Models

AI-Generated Video Detection via Perceptual Straightening

From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging

Client Clustering Meets Knowledge Sharing: Enhancing Privacy and Robustness in Personalized Peer-to-Peer Learning

ADA-DPM: A Neural Descriptors-based Adaptive Noise Filtering Strategy for SLAM

GeNIE: A Generalizable Navigation System for In-the-Wild Environments

From Multimodal Perception to Strategic Reasoning: A Survey on AI-Generated Game Commentary

Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling

PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation

Code Execution as Grounded Supervision for LLM Reasoning

Denoising the Future: Top-p Distributions for Moving Through Time

HauntAttack: When Attack Follows Reasoning as a Shadow

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing

VisuRiddles: Fine-grained Perception is a Primary Bottleneck for Multimodal Large Language Models in Abstract Visual Reasoning

CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching

KG-TRACES: Enhancing Large Language Models with Knowledge Graph-constrained Trajectory Reasoning and Attribution Supervision

SATA-BENCH: Select All That Apply Benchmark for Multiple Choice Questions

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

VERINA: Benchmarking Verifiable Code Generation

RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

The quest for the GRAph Level autoEncoder (GRALE)

Efficient Large Language Model Inference with Neural Block Linearization

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models

Semantic Representation Attack against Aligned Large Language Models

Created by

Haebom

Authors

Jiawei Lian, Jianhong Pan, Lefan Wang, Yi Wang, Shaohui Mei, Lap-Pui Chau

Abstract

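Attacks that craft prompts to induce aligned large language models (LLMs) to generate harmful outputs can bypass the models' safeguards. Existing attack methods target exact affirmative responses and suffer from limited convergence, unnatural prompts, and high computational cost. This paper proposes a new paradigm called Semantic Representation Attack. Instead of targeting exact textual patterns, it exploits the semantic representation space, which covers diverse responses that share the same harmful meaning. The paper also proposes the Semantic Representation Heuristic Search algorithm, which maintains interpretability in order to generate adversarial prompts efficiently while keeping them semantically coherent and concise. In experiments, the proposed method achieves unprecedented attack success rates (89.41% on average across 18 LLMs, including 100% on 11 models) while maintaining stealthiness and efficiency.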

Takeaways, Limitations

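• Takeaways:

◦ Presents a new attack method that overcomes the limitations of existing attacks and bypasses LLM safeguards.

◦ Substantially improves the attack success rate by exploiting the semantic representation space.

◦ Generates adversarial prompts efficiently while maintaining interpretability.

◦ Shows high attack success rates across a variety of LLMs, demonstrating the general applicability of the method.

• Limitations:

◦ The code is slated for release, but concrete implementation details have not yet been provided.

◦ Insufficient information on the types and detailed characteristics of the LLMs used in the experiments.

◦ No discussion of defense techniques against the attack.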
