Daily Arxiv
A page collecting artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the site is run on a non-profit basis.
Copyright of the papers belongs to the authors and their institutions; please credit the source when sharing.
VarCoNet: A variability-aware self-supervised framework for functional connectome extraction from resting-state fMRI
KAIROS: Unified Training for Universal Non-Autoregressive Time Series Forecasting
SingMOS-Pro: A Comprehensive Benchmark for Singing Quality Assessment
Pack and Force Your Memory: Long-form and Consistent Video Generation
Understanding Adversarial Transfer: Why Representation-Space Attacks Fail Where Data-Space Attacks Succeed
GPT and Prejudice: A Sparse Approach to Understanding Learned Representations in Large Language Models
Analyzing Latent Concepts in Code Language Models
Less is More: Lean yet Powerful Vision-Language Model for Autonomous Driving
DM-Bench: Benchmarking LLMs for Personalized Decision Making in Diabetes Management
YOLO-Based Defect Detection for Metal Sheets
Jina-reranker-v3: Last but Not Late Interaction for Listwise Document Reranking
SecInfer: Preventing Prompt Injection via Inference-time Scaling
Putnam-like dataset summary: LLMs as mathematical competition contestants
Causal-Adapter: Taming Text-to-Image Diffusion for Faithful Counterfactual Generation
Enhancing LLM Steering through Sparse Autoencoder-Based Vector Refinement
Observation-Free Attacks on Online Learning to Rank
MTRec: Learning to Align with User Preferences via Mental Reward Models
MobiLLM: An Agentic AI Framework for Closed-Loop Threat Mitigation in 6G Open RANs
When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models
Flow-Induced Diagonal Gaussian Processes
Towards Size-invariant Salient Object Detection: A Generic Evaluation and Optimization Approach
Dual-Stage Reweighted MoE for Long-Tailed Egocentric Mistake Detection
Robust Pan-Cancer Mitotic Figure Detection with YOLOv12
Scam2Prompt: A Scalable Framework for Auditing Malicious Scam Endpoints in Production LLMs
Better by Comparison: Retrieval-Augmented Contrastive Reasoning for Automatic Prompt Optimization
STORI: A Benchmark and Taxonomy for Stochastic Environments
A Study on the Framework for Evaluating the Ethics and Trustworthiness of Generative AI
Grounding the Ungrounded: A Spectral-Graph Framework for Quantifying Hallucinations in multimodal LLMs
FinAgentBench: A Benchmark Dataset for Agentic Retrieval in Financial Question Answering
RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization
Quantum-RAG and PunGPT2: Advancing Low-Resource Language Generation and Retrieval for the Punjabi Language
Tuning LLM-based Code Optimization via Meta-Prompting: An Industrial Perspective
SBP-YOLO: A Lightweight Real-Time Model for Detecting Speed Bumps and Potholes toward Intelligent Vehicle Suspension Systems
An Architecture for Spatial Networking
A Comprehensive Review on Harnessing Large Language Models to Overcome Recommender System Challenges
First Hallucination Tokens Are Different from Conditional Ones
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
Model Parallelism With Subnetwork Data Parallelism
VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting
A Survey of Pun Generation: Datasets, Evaluations and Methodologies
Controlled Generation with Equivariant Variational Flow Matching
CAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
SP-VLA: A Joint Model Scheduling and Token Pruning Approach for VLA Model Acceleration
Semantic Preprocessing for LLM-based Malware Analysis
Manipulating 3D Molecules in a Fixed-Dimensional E(3)-Equivariant Latent Space
Permissioned LLMs: Enforcing Access Control in Large Language Models
Efficient Preimage Approximation for Neural Network Certification
JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models
NeSyGeo: A Neuro-Symbolic Framework for Multimodal Geometric Reasoning Data Generation
Leveraging Online Data to Enhance Medical Knowledge in a Small Persian Language Model
Pre-training Limited Memory Language Models with Internal and External Knowledge
OT Score: An OT based Confidence Score for Source Free Unsupervised Domain Adaptation
Comparing Exploration-Exploitation Strategies of LLMs and Humans: Insights from Standard Multi-armed Bandit Experiments
A Survey of Deep Learning for Complex Speech Spectrograms
Continuous Thought Machines
CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering
XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
PropRAG: Guiding Retrieval with Beam Search over Proposition Paths
Activated LoRA: Fine-tuned LLMs for Intrinsics
Not a nuisance but a useful heuristic: Outlier dimensions favor frequent tokens in language models
Verbosity Tradeoffs and the Impact of Scale on the Faithfulness of LLM Self-Explanations
Towards Quantifying Long-Range Interactions in Graph Machine Learning: a Large Graph Dataset and a Measurement
DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
A Multi-Fidelity Control Variate Approach for Policy Gradient Estimation
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
Rethinking the Vulnerability of Concept Erasure and a New Method
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
MarketSenseAI 2.0: Enhancing Stock Analysis through LLM Agents
CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification
Graph Neural Networks for Transmission Grid Topology Control: Busbar Information Asymmetry and Heterogeneous Representations
Inferring Pluggable Types with Machine Learning
Optimizing Container Loading and Unloading through Dual-Cycling and Dockyard Rehandle Reduction Using a Hybrid Genetic Algorithm
LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing
Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders
RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives
Unified Domain Adaptive Semantic Segmentation
Do AI Models Perform Human-like Abstract Reasoning Across Modalities?
Learning to Decide with Just Enough: Information-Theoretic Context Summarization for CMDPs
Thinkquel: A Model Dedicated to Text-to-dbt Using Synthetic Data and a Span-Aware Objective
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
Learning to Interact in World Latent for Team Coordination
Understanding Generative Recommendation with Semantic IDs from a Model-scaling View
GUI-PRA: Process Reward Agent for GUI Tasks
PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning
Efficient & Correct Predictive Equivalence for Decision Trees
THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning
Gala: Global LLM Agents for Text-to-Model Translation
Disentangling Multiplex Spatial-Temporal Transition Graph Representation Learning for Socially Enhanced POI Recommendation
LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers
Bridging Ethical Principles and Algorithmic Methods: An Alternative Approach for Assessing Trustworthiness in AI Systems
V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving
MIRROR: Modular Internal Processing for Personalized Safety in LLM Dialogue
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
Grounding Multimodal LLMs to Embodied Agents that Ask for Help with Reinforcement Learning
ViLBias: Detecting and Reasoning about Bias in Multimodal Content
OML: A Primitive for Reconciling Open Access with Owner Control in AI Model Distribution
Improved Monte Carlo Planning via Causal Disentanglement for Structurally-Decomposed Markov Decision Processes
When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models
Created by
Haebom
Authors
Yingming Zheng, Hanqi Li, Kai Yu, Lu Chen
Overview
Large language models (LLMs) have shown impressive performance on natural language processing (NLP) tasks. As demand for longer context windows grows in real-world applications, continued pre-training on long-context data and supervised fine-tuning (SFT) have become common practice. While the effect of data length on continued pre-training has been studied extensively, its effect on SFT has remained unclear. This work systematically investigates how SFT data length influences LLM behavior on short-context tasks. Counterintuitively, the authors find that long-context SFT improves short-context performance. To uncover the mechanism behind this phenomenon, they analyze the two major architectural components, Multi-Head Attention (MHA) and the Feed-Forward Network (FFN), in isolation, and show that both benefit independently from long-context SFT. Studying the interaction between the two further reveals a knowledge-preference bias. Finally, they demonstrate that hybrid training mitigates this bias, providing explainable guidance for LLM fine-tuning.
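The hybrid training mentioned above amounts to mixing long- and short-context SFT examples during fine-tuning so that neither knowledge preference dominates. A minimal sketch of such a data-mixing step is below; the function name, the `long_ratio` parameter, and the example format are illustrative assumptions, not the paper's actual implementation:

```python
import random

def mix_sft_examples(short_pool, long_pool, long_ratio=0.5, seed=0):
    """Build a hybrid SFT dataset by drawing a fixed fraction of examples
    from the long-context pool and the rest from the short-context pool.

    Illustrative sketch: `long_ratio` is the target fraction of
    long-context examples; draws are capped by each pool's size.
    """
    rng = random.Random(seed)
    total = len(short_pool) + len(long_pool)
    n_long = min(len(long_pool), int(total * long_ratio))
    n_short = min(len(short_pool), total - n_long)
    mixed = rng.sample(long_pool, n_long) + rng.sample(short_pool, n_short)
    rng.shuffle(mixed)  # interleave so batches see both context lengths
    return mixed
```

In practice the ratio would be a tuning knob: per the summary, more long-context data shifts the model toward contextual knowledge, while more short-context data shifts it toward parametric knowledge.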
Takeaways and Limitations
•
Long-context SFT can improve performance on short-context tasks.
•
Both MHA and FFN benefit from long-context SFT.
•
A knowledge bias exists: long-context SFT favors contextual knowledge, while short-context SFT favors parametric knowledge.
•
Hybrid training can mitigate this bias.
•
The study examines the effect of SFT data length only on a narrow range of tasks; further research may be needed on generalization to other task types.
View PDF