Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

SafeGuider: Robust and Practical Content Safety Control for Text-to-Image Models

Created by

Haebom

Authors

Peigui Qi, Kunsheng Tang, Wenbo Zhou, Weiming Zhang, Nenghai Yu, Tianwei Zhang, Qing Guo, Jie Zhang

Outline

Text-to-image models can generate high-quality images from natural language descriptions, but they are highly vulnerable to adversarial prompts that bypass safety measures and produce malicious content. In this paper, we empirically study the text encoder of the Stable Diffusion (SD) model and find that the [EOS] token acts as a semantic aggregator and exhibits distinct distribution patterns for legitimate and adversarial prompts. Building on this finding, we introduce SafeGuider, a two-stage framework for robust safety control that does not compromise generation quality. SafeGuider combines an embedding-level awareness model with a safety-aware feature-suppressing beam search algorithm, so it preserves image quality for legitimate prompts while defending against both in-domain and out-of-domain attacks. Across the attack scenarios evaluated, SafeGuider keeps the attack success rate at or below 5.48%. It is also more practical than refusal-based defenses: instead of rejecting unsafe prompts or returning black images, it generates safe, meaningful images for them. Finally, we demonstrate that SafeGuider can be applied to text-to-image models other than SD, such as Flux.
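The two-stage idea described above (first judge the prompt, then search for a safer variant rather than refusing) can be illustrated with a toy sketch. This is not the authors' implementation: the paper works on text-encoder embeddings and the [EOS] token, whereas this sketch operates on plain word tokens. The names `UNSAFE_WORDS`, `toy_safety_score`, and `guide_prompt` are hypothetical, and the word-list scorer is only a stand-in for a learned safety model.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for an embedding-level safety model. A toy word list
# is used so the sketch runs without any model weights.
UNSAFE_WORDS = {"gore", "weapon"}


def toy_safety_score(tokens: Sequence[str]) -> float:
    """Return a score in [0, 1]; 1.0 means no flagged token is present."""
    if not tokens:
        return 1.0
    flagged = sum(1 for t in tokens if t in UNSAFE_WORDS)
    return 1.0 - flagged / len(tokens)


def guide_prompt(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float] = toy_safety_score,
    threshold: float = 1.0,
    beam_width: int = 2,
) -> List[str]:
    """Stage 1: pass safe prompts through unchanged.
    Stage 2: beam search over single-token removals until the score
    reaches the threshold."""
    if score_fn(tokens) >= threshold:
        return list(tokens)  # judged safe: leave the prompt untouched

    beams: List[Tuple[str, ...]] = [tuple(tokens)]
    for _ in range(len(tokens)):
        # Expand every beam by removing exactly one token.
        candidates = set()
        for beam in beams:
            for i in range(len(beam)):
                candidates.add(beam[:i] + beam[i + 1:])
        if not candidates:
            break  # nothing left to remove
        # Rank by safety score (highest first); ties are broken by the
        # token tuple itself so the result is deterministic.
        ranked = sorted(candidates, key=lambda c: (-score_fn(c), c))
        beams = ranked[:beam_width]
        if score_fn(beams[0]) >= threshold:
            break
    return list(beams[0])


if __name__ == "__main__":
    print(guide_prompt(["a", "cat"]))                     # ['a', 'cat']
    print(guide_prompt(["knight", "holding", "weapon"]))  # ['knight', 'holding']
```

Because the search stops as soon as a candidate passes the threshold, it removes as few tokens as it can, which loosely mirrors the goal of keeping the output meaningful instead of rejecting the prompt outright.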

Takeaways, Limitations

• Takeaways:
  ◦ SafeGuider provides an effective framework for improving the safety of text-to-image models.
  ◦ It improves usability by generating safe, meaningful images for unsafe prompts instead of refusing them.
  ◦ It is applicable to multiple text-to-image models, not only Stable Diffusion.

• Limitations:
  ◦ The paper does not explicitly discuss its limitations.
