Daily Arxiv

This page curates papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

Rethinking KL Regularization in RLHF: From Value Estimation to Gradient Optimization

Created by
  • Haebom

Author

Kezhao Liu, Jason Klein Liu, Mingtao Chen, Yiming Liu

Outline

By analyzing how the KL-divergence penalty is implemented in RLHF, we propose a unified framework that bridges the two implementation styles, "k_n as reward" and "k_n as loss." The framework clarifies the principle of reverse-KL (RKL) regularization and proves that, under on-policy conditions, "k_2 as loss" is gradient-equivalent to "k_1 as reward." Furthermore, we show that "k_3 as loss" is a biased approximation of the RKL gradient and propose a correction for the bias that arises in off-policy implementations.
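For context, the sketch below writes out the three standard estimators k_1, k_2, k_3 of the reverse KL and the two implementation styles the paper contrasts: folding a detached estimate into the reward versus keeping the estimator differentiable and adding it to the loss. This is a minimal illustration assuming PyTorch tensors of per-token log-probabilities; the function names and the β coefficient are hypothetical, not taken from the paper.

```python
# Illustrative sketch only (hypothetical names; not the paper's or any library's code).
# logp_theta, logp_ref: log-probabilities of the *sampled* tokens, sampled from pi_theta.
import torch

def kl_estimators(logp_theta: torch.Tensor, logp_ref: torch.Tensor):
    """The three standard per-sample estimators of KL(pi_theta || pi_ref)."""
    log_ratio = logp_theta - logp_ref                 # log(pi_theta / pi_ref)
    k1 = log_ratio                                    # unbiased for the KL value, high variance
    k2 = 0.5 * log_ratio ** 2                         # biased for the value, low variance
    k3 = (torch.exp(-log_ratio) - 1) + log_ratio      # unbiased for the value, low variance
    return k1, k2, k3

# Style A, "k_1 as reward": the penalty is detached and folded into the scalar reward,
# so it reaches the policy only through the policy-gradient (advantage) term.
def penalized_reward(reward, logp_theta, logp_ref, beta=0.1):
    k1, _, _ = kl_estimators(logp_theta, logp_ref)
    return reward - beta * k1.detach()

# Style B, "k_n as loss": the estimator is kept differentiable and added to the loss,
# so its gradient flows directly through logp_theta.
def kl_loss(logp_theta, logp_ref, beta=0.1, which="k2"):
    k1, k2, k3 = kl_estimators(logp_theta, logp_ref)
    return beta * {"k1": k1, "k2": k2, "k3": k3}[which].mean()
```

Note that being a good estimator of the KL value (as k_3 is) does not by itself guarantee a useful gradient when backpropagated as a loss; this distinction between value estimation and gradient behavior is what the paper's title points to.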

Takeaways, Limitations

Takeaways:
  • We provide a comprehensive picture of how the KL-divergence penalty is implemented, which helps improve the stability and efficiency of RLHF systems.
  • We identify a correct implementation of the RKL objective by proving that "k_2 as loss" and "k_1 as reward" are gradient-equivalent (a toy numerical check of this equivalence appears after this section).
  • We point out the limitations of "k_3 as loss" and propose a way to correct the bias that arises in off-policy implementations.
Limitations:
  • The paper may not include details on applying the proposed methodology to an actual RLHF system or verifying its performance there.
  • The analysis of how the proposed framework relates to other work on KL-divergence regularization may be limited.
  • Because the gradient-equivalence result holds only under on-policy conditions, further research on off-policy settings may be required.
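The equivalence and bias claims above can be checked numerically on a toy categorical policy. The NumPy sketch below is illustrative only; the variable names and the finite-difference check are assumptions, not the paper's experiments. It compares the expected on-policy gradients of "k_2 as loss" and "k_3 as loss" against finite-difference gradients of the reverse and forward KL.

```python
# Toy numeric check (illustrative only): compare the expected on-policy gradients of
# "k2 as loss" and "k3 as loss" with finite-difference gradients of reverse/forward KL.
import numpy as np

rng = np.random.default_rng(0)
K = 5
theta = rng.normal(size=K)               # policy logits
pi_ref = rng.dirichlet(np.ones(K))       # fixed reference distribution

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def reverse_kl(t):                       # KL(pi_theta || pi_ref)
    p = softmax(t)
    return np.sum(p * (np.log(p) - np.log(pi_ref)))

def forward_kl(t):                       # KL(pi_ref || pi_theta)
    p = softmax(t)
    return np.sum(pi_ref * (np.log(pi_ref) - np.log(p)))

def num_grad(f, t, eps=1e-6):            # central finite differences
    g = np.zeros_like(t)
    for j in range(len(t)):
        d = np.zeros_like(t); d[j] = eps
        g[j] = (f(t + d) - f(t - d)) / (2 * eps)
    return g

pi = softmax(theta)
score = np.eye(K) - pi[None, :]          # d log pi_i / d theta_j for softmax logits
log_ratio = np.log(pi) - np.log(pi_ref)  # log(pi_theta / pi_ref)

# Expected "k2 as loss" gradient: E_{x~pi}[ log_ratio(x) * d log pi(x)/d theta ]
grad_k2_loss = (pi * log_ratio) @ score
# Expected "k3 as loss" gradient: per-sample grad of (pi_ref/pi - 1) + log_ratio
# is (1 - pi_ref/pi) * d log pi/d theta
grad_k3_loss = (pi * (1.0 - pi_ref / pi)) @ score

print(np.allclose(grad_k2_loss, num_grad(reverse_kl, theta), atol=1e-5))  # True: matches reverse KL
print(np.allclose(grad_k3_loss, num_grad(reverse_kl, theta), atol=1e-5))  # False: biased for reverse KL
print(np.allclose(grad_k3_loss, num_grad(forward_kl, theta), atol=1e-5))  # True: it is the forward-KL gradient
```

In this toy setting the expected "k_2 as loss" gradient coincides with the reverse-KL gradient, while the expected "k_3 as loss" gradient instead coincides with the forward-KL gradient, which is one concrete way to see the bias discussed above.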