Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation

Created by
  • Haebom

Author

Joachim Baumann, Paul Röttger, Aleksandra Urman, Albert Wendsjö, Flor Miriam Plaza-del-Arco, Johannes B. Gruber, Dirk Hovy

Outline

While large language models (LLMs) enable the automation of data annotation in social science research, their outputs can vary substantially with researcher choices (e.g., model selection, prompt strategy). This variability introduces systematic bias and random error into downstream analyses, producing Type I, Type II, Type S (sign), and Type M (magnitude) errors; the authors call this phenomenon LLM hacking. Intentional LLM hacking is strikingly simple: in a replication of 37 data annotation tasks, merely rephrasing prompts was enough to produce statistically significant results. Moreover, an analysis of 13 million labels from 18 LLMs across 2,361 realistic hypotheses reveals a high risk of accidental LLM hacking even when standard research practices are followed: state-of-the-art LLMs lead to incorrect conclusions for roughly 31% of hypotheses, and small language models for about half. The risk decreases as effect sizes grow, and human annotations prove critical for preventing false positives. The paper closes with practical recommendations for mitigating LLM hacking risk.
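Below is a minimal, hypothetical simulation (not from the paper) of the mechanism behind accidental LLM hacking: when an LLM annotator's errors are correlated with the grouping variable being studied, standard significance tests on the LLM labels produce inflated false-positive rates even though the true group difference is zero. All error rates and sample sizes are invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps, alpha = 1000, 500, 0.05  # documents per group, simulated studies, significance level

def annotate(truth, fpr, fnr):
    """Hypothetical LLM annotator with fixed false-positive/false-negative rates."""
    flip_neg = rng.binomial(1, fpr, truth.size)        # 0 -> 1 errors
    keep_pos = rng.binomial(1, 1.0 - fnr, truth.size)  # a true 1 stays 1 with prob 1 - fnr
    return np.where(truth == 1, keep_pos, flip_neg)

def false_positive_rate(fpr_a, fnr_a, fpr_b, fnr_b):
    """Share of simulated studies that find a 'significant' group difference
    although the true label rate (0.5) is identical in both groups."""
    hits = 0
    for _ in range(reps):
        truth_a = rng.binomial(1, 0.5, n)
        truth_b = rng.binomial(1, 0.5, n)
        a = annotate(truth_a, fpr_a, fnr_a)
        b = annotate(truth_b, fpr_b, fnr_b)
        table = [[a.sum(), n - a.sum()], [b.sum(), n - b.sum()]]
        _, p, _, _ = stats.chi2_contingency(table)
        hits += p < alpha
    return hits / reps

# Errors independent of group: false positives stay near the nominal 5%.
print("group-independent errors:", false_positive_rate(0.05, 0.05, 0.05, 0.05))
# Errors correlated with group (e.g., a prompt that over-labels one group):
# false positives are inflated well above 5%.
print("group-correlated errors :", false_positive_rate(0.05, 0.05, 0.15, 0.05))
```

The second configuration differs only in how the annotator errs on one group, yet it yields spurious "findings" far more often than the nominal 5% — the kind of Type I risk the paper quantifies at scale.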

Takeaways, Limitations

Takeaways:
LLMs can accelerate social science research, but results can vary significantly with researcher choices such as model selection and prompt strategy.
Incorrect conclusions arise not only from intentional manipulation; accidental LLM hacking occurs even when standard research practices are followed.
Improved LLM performance reduces, but does not eliminate, the risk of LLM hacking.
Smaller effect sizes carry a higher risk of LLM hacking, so LLM-based results near significance thresholds should be verified especially rigorously.
Human annotations are effective at preventing false positives, while regression-estimator correction techniques trade one error type off against another (a minimal sketch of one such correction appears after this list).
The paper presents practical recommendations for preventing LLM hacking.
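As a concrete illustration of the correction mentioned above, here is a minimal sketch of one simple approach: a difference estimator that debiases the LLM-only estimate of a label's prevalence using a small human-annotated subsample (in the spirit of prediction-powered inference; the estimators evaluated in the paper may differ, and all numbers below are invented).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5000, 200  # corpus size, human-audited subsample size

# Simulated ground truth (unknown in practice; used here only to check the estimates).
truth = rng.binomial(1, 0.30, n)

# Hypothetical LLM labels with asymmetric errors (recall 0.90, false-positive rate 0.10).
llm = np.where(truth == 1,
               rng.binomial(1, 0.90, n),
               rng.binomial(1, 0.10, n))

# A random subsample is re-annotated by humans (assumed correct here).
idx = rng.choice(n, size=m, replace=False)
human = truth[idx]

naive = llm.mean()                                         # biased LLM-only estimate
corrected = llm.mean() - (llm[idx].mean() - human.mean())  # difference estimator

print(f"true label rate : {truth.mean():.3f}")
print(f"LLM-only        : {naive:.3f}")
print(f"bias-corrected  : {corrected:.3f}")
```

The corrected estimate removes the LLM-only bias but inherits the noise of the small human subsample, which is one way such corrections trade false positives for reduced statistical power.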
Limitations:
Specific LLM hacking prevention techniques are not described in detail.
The effectiveness of the proposed mitigation techniques is not analyzed quantitatively.
The study may be limited to particular social science tasks; further research is needed to establish generalizability to other fields.