Daily Arxiv

This page collects papers on artificial intelligence published around the world.
The summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

Rethinking Exact Unlearning under Exposure: Extracting Forgotten Data under Exact Unlearning in Large Language Model

Created by
  • Haebom

Authors

Xiaoyu Wu, Yifei Pang, Terrance Liu, Zhiwei Steven Wu

Outline

This paper highlights the limitations of exact unlearning as a remedy for the leakage of sensitive training data from large language models (LLMs). Specifically, in a realistic deployment setting where logit APIs for both the pre- and post-unlearning models are exposed, the authors propose a novel data extraction attack that leverages signals from the pre-unlearning model to recover patterns of the deleted data from the post-unlearning model. By combining model guidance with a token filtering strategy, the attack significantly improves extraction success rates, and its real-world risk is demonstrated on a medical diagnosis dataset. The study suggests that unlearning may in fact increase the risk of privacy leakage and argues that unlearning techniques should be evaluated under a broader threat model, including adversaries with access to the pre-unlearning model.
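
The summary above does not reproduce the paper's exact procedure, but the core idea of combining model guidance with token filtering can be sketched roughly as follows. This is a minimal, hypothetical illustration only: it assumes HuggingFace-style causal LMs whose logits are queryable, and the function names, the `alpha` guidance strength, and the `top_k` filter size are illustrative choices, not values from the paper.

```python
import torch
import torch.nn.functional as F

def guided_extraction_step(pre_logits, post_logits, alpha=1.0, top_k=50):
    """One decoding step of a guided extraction attack (illustrative only).

    pre_logits / post_logits: 1-D logit tensors over the vocabulary, returned
    by the pre- and post-unlearning logit APIs for the same prefix.
    alpha (guidance strength) and top_k (filter size) are assumed values.
    """
    # Token filtering: restrict candidates to the pre-unlearning model's
    # top-k tokens, since forgotten content was high-probability there.
    _, topk_idx = pre_logits.topk(top_k)

    # Model guidance: favor tokens whose likelihood dropped after unlearning,
    # a contrastive-style signal pointing toward the deleted data.
    guidance = pre_logits[topk_idx] - post_logits[topk_idx]
    scores = pre_logits[topk_idx] + alpha * guidance

    # Greedy pick over the filtered, guided scores.
    return topk_idx[F.softmax(scores, dim=-1).argmax()]


def extract_sequence(pre_model, post_model, prompt_ids, max_new_tokens=64):
    """Autoregressively decode a candidate for the forgotten continuation."""
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        with torch.no_grad():
            pre = pre_model(ids.unsqueeze(0)).logits[0, -1]
            post = post_model(ids.unsqueeze(0)).logits[0, -1]
        next_id = guided_extraction_step(pre, post)
        ids = torch.cat([ids, next_id.view(1)])
    return ids
```

The paper's actual attack may differ in its details; the point illustrated here is that the gap between pre- and post-unlearning logits itself carries a signal about what was forgotten.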

Takeaways, Limitations

Takeaways:
  • Exact unlearning methods, although regarded as the "gold standard" for privacy, can be vulnerable in real-world deployments.
  • Data extraction attacks that exploit information from the pre-unlearning model are feasible, allowing a significant portion of the deleted data to be recovered even after unlearning.
  • The attack's effectiveness is also verified on real-world data, such as a medical diagnosis dataset, underscoring the practical risks of unlearning.
  • When assessing the security of unlearning techniques, broader threat models, including adversarial access to the pre-unlearning model, must be considered.
Limitations:
  • The study focuses on a specific setting in which the logit APIs of both the pre- and post-unlearning models are exposed.
  • Although the attack's success rate is substantially improved, complete recovery of the deleted data is not guaranteed.
  • Further research is needed on the generalizability of the attack and its applicability to other unlearning techniques.
  • The study examines specific datasets and a specific attack technique, limiting generalized conclusions about other datasets and attack methods.