Daily Arxiv

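This page collects artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.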

Exploring Diffusion Transformer Designs via Grafting

Teaming in the AI Era: AI-Augmented Frameworks for Forming, Simulating, and Optimizing Human Teams

ECoRAG: Evidentiality-guided Compression for Long Context RAG

Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective

Does It Make Sense to Speak of Introspection in Large Language Models?

Sparse Autoencoders, Again?

Feature-Based Lie Group Transformer for Real-World Applications

TracLLM: A Generic Framework for Attributing Long Context LLMs

SemiOccam: A Robust Semi-Supervised Image Recognition Network Using Sparse Labels

Labelling Data with Unknown References

Tug-of-war between idiom's figurative and literal meanings in LLMs

State-Covering Trajectory Stitching for Diffusion Planners

Deep Learning Weather Models for Subregional Ocean Forecasting: A Case Study on the Canary Current Upwelling System

DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation

Subspecialty-Specific Foundation Model for Intelligent Gastrointestinal Pathology

RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving

An Uncertainty-Aware ED-LSTM for Probabilistic Suffix Prediction

SageAttention2++: A More Efficient Implementation of SageAttention2

Autocomp: LLM-Driven Code Optimization for Tensor Accelerators

IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis

Decoupling Representation and Learning in Genetic Programming: the LaSER Approach

Common Data Format (CDF): A Standardized Format for Match-Data in Football (Soccer)

Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models

How can Diffusion Models Evolve into Continual Generators?

Open Your Eyes: Vision Enhances Message Passing Neural Networks in Link Prediction

m-KAILIN: Knowledge-Driven Agentic Scientific Corpus Distillation Framework for Biomedical Large Language Models Training

FinSage: A Multi-aspect RAG System for Financial Filings Question Answering

Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning

LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models

Reasoning Towards Fairness: Mitigating Bias in Language Models through Reasoning-Guided Fine-Tuning

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

Multivariate Temporal Regression at Scale: A Three-Pillar Framework Combining ML, XAI, and NLP

GENIUS: A Generative Framework for Universal Multimodal Search

TinySQL: A Progressive Text-to-SQL Dataset for Mechanistic Interpretability Research

ARMOR: Empowering Multimodal Understanding Model with Interleaved Multimodal Generation Capability

A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models

Knowledge Retention for Continual Model-Based Reinforcement Learning

Adversarial Tokenization

UDora: A Unified Red Teaming Framework against LLM Agents by Dynamically Hijacking Their Own Reasoning

SAGE: A Framework of Precise Retrieval for RAG

Graph Attention Networks Unleashed: A Fast and Explainable Vulnerability Assessment Framework for Microgrids

SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models

Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models

Improving Customer Service with Automatic Topic Detection in User Emails

DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model

Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations

SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference

MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models

A Comprehensive Survey on Concept Erasure in Text-to-Image Diffusion Models

LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws

Maximum Entropy Reinforcement Learning with Diffusion Policy

TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking

Relational Conformal Prediction for Correlated Time Series

RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models

UniDB: A Unified Diffusion Bridge Framework via Stochastic Optimal Control

The Complexity of Learning Sparse Superposed Features with Feedback

Peri-LN: Revisiting Normalization Layer in the Transformer Architecture

Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods

An Optimal Cascade Feature-Level Spatiotemporal Fusion Strategy for Anomaly Detection in CAN Bus

ProofAug: Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis

FDLLM: A Dedicated Detector for Black-Box LLMs Fingerprinting

The Bakers and Millers Game with Restricted Locations

Diving into Self-Evolving Training for Multimodal Reasoning

Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation

A Riemannian Optimization Perspective of the Gauss-Newton Method for Feedforward Neural Networks

CoopetitiveV: Leveraging LLM-powered Coopetitive Multi-Agent Prompting for High-quality Verilog Generation

TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies

The Synergy of LLMs & RL Unlocks Offline Learning of Generalizable Language-Conditioned Policies with Low-fidelity Data

Understanding Memorization in Generative Models via Sharpness in Probability Landscapes

A Cognac shot to forget bad memories: Corrective Unlearning in GNNs

Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

Who Can Withstand Chat-Audio Attacks? An Evaluation Benchmark for Large Audio-Language Models

CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP

The Impact of Inference Acceleration on Bias of LLMs

pLDDT-Predictor: High-speed Protein Screening Using Transformer and ESM2

Simmering: Sufficient is better than optimal for training neural networks

PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning

Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation

AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML

VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Deconfounding Multi-Cause Latent Confounders: A Factor-Model Approach to Climate Model Bias Correction

DyGMamba: Efficiently Modeling Long-Term Temporal Dependency on Continuous-Time Dynamic Graphs with State Space Models

Proximal Policy Distillation

BoA: Attention-aware Post-training Quantization without Backpropagation

Certification for Differentially Private Prediction in Gradient-Based Training

Multi-Agent Collaboration via Cross-Team Orchestration

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

Computational Limits of Low-Rank Adaptation (LoRA) Fine-Tuning for Transformer Models

Boolean matrix logic programming for active learning of gene functions in genome-scale metabolic network models

Mirage: A Multi-Level Superoptimizer for Tensor Programs

Multidimensional Adaptive Coefficient for Inference Trajectory Optimization in Flow and Diffusion

Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages

Structure Guided Large Language Model for SQL Generation

GraphGPT: Generative Pre-trained Graph Eulerian Transformer

Graph Deep Learning for Time Series Forecasting

Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning

Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning

Rethinking Machine Unlearning in Image Generation Models

The Coming Crisis of Multi-Agent Misalignment: AI Alignment Must Be a Dynamic and Social Process

Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods

Created by

Haebom

Authors

Isha Puri, Shivchander Sudalairaj, Guangxuan Xu, Kai Xu, Akash Srivastava

Overview

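This paper seeks to improve large language model (LLM) performance by scaling the compute spent at inference time rather than scaling model size or data. Existing inference-time scaling methods typically use a reward model and frame the task as a search problem, which makes them vulnerable to reward hacking because the reward model is only an approximation. The paper instead frames inference-time scaling as a probabilistic inference problem and uses sampling-based techniques to explore the typical set of the state distribution of a state-space model with an approximate likelihood. It proposes a new inference-time scaling method based on particle-based Monte Carlo methods and shows empirically that, on a range of challenging mathematical reasoning tasks, it achieves a 4 to 16 times better scaling rate than deterministic search counterparts. With the proposed method, Qwen2.5-Math-1.5B-Instruct surpasses GPT-4o accuracy with only 4 rollouts, and Qwen2.5-Math-7B-Instruct reaches o1-level accuracy with only 32 rollouts. Beyond offering an effective inference-time scaling method, the work connects the rich literature on probabilistic inference with inference-time scaling of LLMs, laying groundwork for more robust algorithms.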
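To make the particle-based idea concrete, here is a minimal sketch of a generic particle filter loop (propose, weight, resample). It is not the authors' implementation: `toy_extend` stands in for an LLM generating the next reasoning step, `toy_score` stands in for a reward model used as an approximate likelihood, and all names and parameters are hypothetical.

```python
import math
import random


def toy_extend(partial, rng):
    # Hypothetical stand-in for an LLM sampling one more step of a solution.
    return partial + (rng.randint(0, 9),)


def toy_score(partial):
    # Hypothetical stand-in for a reward model scoring a partial solution.
    return sum(partial) / 10.0


def particle_filter(init, extend, score, num_particles=8, num_steps=5, seed=0):
    rng = random.Random(seed)
    particles = [init for _ in range(num_particles)]
    for _ in range(num_steps):
        # Propose: extend every particle by one sampled step.
        particles = [extend(p, rng) for p in particles]
        # Weight: exponentiate the scores (shifted by the max for stability).
        scores = [score(p) for p in particles]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        # Resample: draw the next population in proportion to the weights,
        # so weaker particles can still survive instead of being pruned outright.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return particles


if __name__ == "__main__":
    final = particle_filter((), toy_extend, toy_score)
    print(max(final, key=toy_score))
```

The stochastic resampling step is the point of contrast with deterministic search: a greedy or beam-style method keeps only the top-scoring candidates, whereas here candidates are kept with probability proportional to their weight, which is one way to be less sensitive to errors in an imperfect scorer.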

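Implications and Limitations

• Implications:
  ◦ Framing inference-time scaling as a probabilistic inference problem reduces vulnerability to reward hacking.
  ◦ Using particle-based Monte Carlo methods yields a 4 to 16 times better scaling rate than existing methods.
  ◦ Experiments show strong performance even with a limited number of rollouts (for example, with the Qwen2.5 models).
  ◦ Connecting the field of probabilistic inference with research on LLM inference-time scaling points to directions for future work.

• Limitations:
  ◦ The method's performance on general LLM tasks requires further study.
  ◦ Results are presented mainly for specific mathematical reasoning tasks, so generalization to other kinds of tasks needs further validation.
  ◦ The characteristics of the Qwen models used are not described in detail.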
