Daily Arxiv
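This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their affiliated institutions; please cite the source when sharing.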
VarCoNet: A variability-aware self-supervised framework for functional connectome extraction from resting-state fMRI
KAIROS: Unified Training for Universal Non-Autoregressive Time Series Forecasting
SingMOS-Pro: An Comprehensive Benchmark for Singing Quality Assessment
Pack and Force Your Memory: Long-form and Consistent Video Generation
Understanding Adversarial Transfer: Why Representation-Space Attacks Fail Where Data-Space Attacks Succeed
GPT and Prejudice: A Sparse Approach to Understanding Learned Representations in Large Language Models
Analyzing Latent Concepts in Code Language Models
Less is More: Lean yet Powerful Vision-Language Model for Autonomous Driving
DM-Bench: Benchmarking LLMs for Personalized Decision Making in Diabetes Management
YOLO-Based Defect Detection for Metal Sheets
Jina-reranker-v3: Last but Not Late Interaction for Listwise Document Reranking
SecInfer: Preventing Prompt Injection via Inference-time Scaling
Putnam-like dataset summary: LLMs as mathematical competition contestants
Causal-Adapter: Taming Text-to-Image Diffusion for Faithful Counterfactual Generation
Enhancing LLM Steering through Sparse Autoencoder-Based Vector Refinement
Observation-Free Attacks on Online Learning to Rank
MTRec: Learning to Align with User Preferences via Mental Reward Models
MobiLLM: An Agentic AI Framework for Closed-Loop Threat Mitigation in 6G Open RANs
When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models
Flow-Induced Diagonal Gaussian Processes
Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach
Dual-Stage Reweighted MoE for Long-Tailed Egocentric Mistake Detection
Robust Pan-Cancer Mitotic Figure Detection with YOLOv12
Scam2Prompt: A Scalable Framework for Auditing Malicious Scam Endpoints in Production LLMs
Better by Comparison: Retrieval-Augmented Contrastive Reasoning for Automatic Prompt Optimization
STORI: A Benchmark and Taxonomy for Stochastic Environments
A Study on the Framework for Evaluating the Ethics and Trustworthiness of Generative AI
Grounding the Ungrounded: A Spectral-Graph Framework for Quantifying Hallucinations in multimodal LLMs
FinAgentBench: A Benchmark Dataset for Agentic Retrieval in Financial Question Answering
RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization
Quantum-RAG and PunGPT2: Advancing Low-Resource Language Generation and Retrieval for the Punjabi Language
Tuning LLM-based Code Optimization via Meta-Prompting: An Industrial Perspective
SBP-YOLO: A Lightweight Real-Time Model for Detecting Speed Bumps and Potholes toward Intelligent Vehicle Suspension Systems
An Architecture for Spatial Networking
A Comprehensive Review on Harnessing Large Language Models to Overcome Recommender System Challenges
First Hallucination Tokens Are Different from Conditional Ones
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
Model Parallelism With Subnetwork Data Parallelism
VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting
A Survey of Pun Generation: Datasets, Evaluations and Methodologies
Controlled Generation with Equivariant Variational Flow Matching
CAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
SP-VLA: A Joint Model Scheduling and Token Pruning Approach for VLA Model Acceleration
Semantic Preprocessing for LLM-based Malware Analysis
Manipulating 3D Molecules in a Fixed-Dimensional E(3)-Equivariant Latent Space
Permissioned LLMs: Enforcing Access Control in Large Language Models
Efficient Preimage Approximation for Neural Network Certification
JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models
NeSyGeo: A Neuro-Symbolic Framework for Multimodal Geometric Reasoning Data Generation
Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model
Pre-training Limited Memory Language Models with Internal and External Knowledge
OT Score: An OT based Confidence Score for Source Free Unsupervised Domain Adaptation
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Experiments
A Survey of Deep Learning for Complex Speech Spectrograms
Continuous Thought Machines
CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
PropRAG: Guiding Retrieval with Beam Search over Proposition Paths
Activated LoRA: Fine-tuned LLMs for Intrinsics
Not a nuisance but a useful heuristic: Outlier dimensions favor frequent tokens in language models
Verbosity Tradeoffs and the Impact of Scale on the Faithfulness of LLM Self-Explanations
Towards Quantifying Long-Range Interactions in Graph Machine Learning: a Large Graph Dataset and a Measurement
DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
Rethinking the Vulnerability of Concept Erasure and a New Method
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
MarketSenseAI 2.0: Enhancing Stock Analysis through LLM Agents
CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification
Graph Neural Networks for Transmission Grid Topology Control: Busbar Information Asymmetry and Heterogeneous Representations
Inferring Pluggable Types with Machine Learning
Optimizing Container Loading and Unloading through Dual-Cycling and Dockyard Rehandle Reduction Using a Hybrid Genetic Algorithm
LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing
Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders
RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives
Unified Domain Adaptive Semantic Segmentation
Do AI Models Perform Human-like Abstract Reasoning Across Modalities?
Learning to Decide with Just Enough: Information-Theoretic Context Summarization for CMDPs
Thinkquel: A Model Dedicated to Text-to-dbt Using Synthetic Data and a Span-Aware Objective
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
Learning to Interact in World Latent for Team Coordination
Understanding Generative Recommendation with Semantic IDs from a Model-scaling View
GUI-PRA: Process Reward Agent for GUI Tasks
PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning
Efficient & Correct Predictive Equivalence for Decision Trees
THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning
Gala: Global LLM Agents for Text-to-Model Translation
Disentangling Multiplex Spatial-Temporal Transition Graph Representation Learning for Socially Enhanced POI Recommendation
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
Bridging Ethical Principles and Algorithmic Methods: An Alternative Approach for Assessing Trustworthiness in AI Systems
V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving
MIRROR: Modular Internal Processing for Personalized Safety in LLM Dialogue
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
Grounding Multimodal LLMs to Embodied Agents that Ask for Help with Reinforcement Learning
ViLBias: Detecting and Reasoning about Bias in Multimodal Content
OML: A Primitive for Reconciling Open Access with Owner Control in AI Model Distribution
Improved Monte Carlo Planning via Causal Disentanglement for Structurally-Decomposed Markov Decision Processes
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
Created by
Haebom
Authors
Jingze Zhu, Yongliang Wu, Wenbo Zhu, Jiawang Cao, Yanqiang Zheng, Jiawei Chen, Xu Yang, Bernt Schiele, Jonas Fischer, Xinting Hu
Abstract
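Large language models (LLMs) excel at natural language understanding and generation, but they remain prone to factual errors, which limits their reliability in knowledge-intensive tasks. Decoding-time strategies offer an efficient, training-free remedy, yet existing methods treat token-level and layer-level signals separately and overlook the joint dynamics between them. This work introduces a token-aware, layer-localized contrastive decoding method that aligns specific token types with the transformer layers where they are most influential, in order to improve factual generation. An empirical attention analysis identifies two main patterns: punctuation tokens receive dominant attention in the early layers, while conceptual tokens govern semantic reasoning in the intermediate layers. By selectively suppressing attention to each token type at its corresponding depth, the method induces a controlled factual degradation and derives from it a contrastive signal that guides the final factual decoding. The method requires no additional training or model modification, and experiments show that it consistently improves factuality across several LLMs and a range of benchmarks.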
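The decoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `scale`, `alpha`, and `beta` parameters, the `(1 + alpha) * base - alpha * degraded` scoring rule, and the plausibility cutoff are all assumptions borrowed from generic contrastive decoding, and the toy logits are made up. The sketch only shows the two ingredients the abstract names: suppressing attention to selected token positions, and contrasting the intact pass against the degraded pass.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def suppress_attention(weights, suppressed_positions, scale=0.0):
    """Down-weight attention to selected positions, then renormalize.

    `scale` is a hypothetical knob: 0.0 removes the attention entirely.
    In the setting described above this would be applied to punctuation
    positions in early layers and conceptual positions in middle layers.
    """
    scaled = [w * scale if i in suppressed_positions else w
              for i, w in enumerate(weights)]
    total = sum(scaled)
    if total == 0.0:
        return list(weights)  # nothing left to attend to; keep the row as is
    return [w / total for w in scaled]


def contrastive_scores(base_logits, degraded_logits, alpha=1.0, beta=0.1):
    """Score tokens by how strongly the intact pass prefers them.

    Assumed rule: (1 + alpha) * log p_base - alpha * log p_degraded,
    restricted to tokens whose base probability is at least
    beta times the highest base probability.
    """
    base = log_softmax(base_logits)
    degraded = log_softmax(degraded_logits)
    cutoff = max(base) + math.log(beta)
    return [
        (1.0 + alpha) * b - alpha * d if b >= cutoff else float("-inf")
        for b, d in zip(base, degraded)
    ]


# Toy 4-token vocabulary: the degraded pass loses confidence in token 1.
base_logits = [2.0, 1.8, 0.1, -3.0]
degraded_logits = [2.0, 0.5, 0.1, -3.0]
scores = contrastive_scores(base_logits, degraded_logits)
best_token = max(range(len(scores)), key=scores.__getitem__)
```

In this toy example greedy decoding on `base_logits` alone would pick token 0, while the contrast picks token 1, the token whose probability drops most when the degraded pass is used. In a real model the degraded logits would come from a second forward pass with `suppress_attention` applied inside the chosen layers, which this sketch does not attempt.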
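Takeaways, Limitations
• Takeaways:
◦ Presents a new approach to improving LLM factuality that accounts for the joint dynamics between token-level and layer-level signals.
◦ Improves factuality across a variety of LLMs without additional training or model modification.
◦ Analyzes the attention patterns of punctuation tokens and conceptual tokens and uses them to design the method.
◦ Presents a novel way to derive a contrastive signal by inducing controlled factual degradation.
• Limitations:
◦ Because the method relies on attention-pattern analysis of specific token types (punctuation and conceptual tokens), its generalization to other token types or model architectures may be limited.
◦ Performance may be specific to the LLMs and benchmarks tested; applicability to other domains needs further validation.
◦ The effect of the attention-suppression mechanism on other LLM capabilities (fluency, creativity, etc.) needs further analysis.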