Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions. When sharing, please cite the source.

(R)evolution of Programming: Vibe Coding as a Post-Coding Paradigm

Y-shaped Generative Flows

Trustworthy Retrosynthesis: Eliminating Hallucinations with a Diverse Ensemble of Reaction Scorers

The Algorithmic Regulator

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Ultralytics YOLO Evolution: An Overview of YOLO26, YOLO11, YOLOv8 and YOLOv5 Object Detectors for Computer Vision and Pattern Recognition

Saving SWE-Bench: A Benchmark Mutation Approach for Realistic Agent Evaluation

MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning

H1: Bootstrapping LLMs to Reason over Longer Horizons via Reinforcement Learning

SafeGuider: Robust and Practical Content Safety Control for Text-to-Image Models

HybridFlow: Quantification of Aleatoric and Epistemic Uncertainty with a Single Hybrid Model

Detecting Distillation Data from Reasoning Models

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

SAGE-Music: Low-Latency Symbolic Music Generation via Attribute-Specialized Key-Value Head Sharing

On Robustness of Vision-Language-Action Model against Multi-Modal Perturbations

Learning Inter-Atomic Potentials without Explicit Equivariance

Towards A Universally Transferable Acceleration Method for Density Functional Theory

Functional Critic Modeling for Provably Convergent Off-Policy Actor-Critic

Variational Reasoning for Language Models

Learning Equivariant Functions via Quadratic Forms

Geo-R1: Improving Few-Shot Geospatial Referring Expression Understanding with Reinforcement Fine-Tuning

Defending against Stegomalware in Deep Neural Networks with Permutation Symmetry

LibEMER: A novel benchmark and algorithms library for EEG-based Multimodal Emotion Recognition

Self-Evolving LLMs via Continual Instruction Tuning

Can an Individual Manipulate the Collective Decisions of Multi-Agents?

CAGE: Continuity-Aware edGE Network Unlocks Robust Floorplan Reconstruction

Adversarial Distilled Retrieval-Augmented Guarding Model for Online Malicious Intent Detection

Visible Yet Unreadable: A Systematic Blind Spot of Vision Language Models Across Writing Systems

Towards Methane Detection Onboard Satellites

EO-1: Interleaved Vision-Text-Action Pretraining for General Robot Control

GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Reliable generation of isomorphic physics problems using Generative AI with prompt-chaining and tool use

Hierarchical Evaluation Function: A Multi-Metric Approach for Optimizing Demand Forecasting Models

Geometry-Aware Global Feature Aggregation for Real-Time Indirect Illumination

Evolution of AI Agent Registry Solutions: Centralized, Enterprise, and Distributed Approaches

Your AI, Not Your View: The Bias of LLMs in Investment Analysis

DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning

Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations

Early Signs of Steganographic Capabilities in Frontier LLMs

Orthogonal Finetuning Made Scalable

LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

A Brain-to-Population Graph Learning Framework for Diagnosing Brain Disorders

Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles

PAL: Probing Audio Encoders via LLMs - Audio Information Transfer into LLMs

Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series

Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning

Decentralizing Multi-Agent Reinforcement Learning with Temporal Causal Information

Superior Molecular Representations from Intermediate Encoder Layers

FLEX: A Largescale Multimodal, Multiview Dataset for Learning Structured Representations for Fitness Action Quality Assessment

The quest for the GRAph Level autoEncoder (GRALE)

Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding

Multi-Scale Probabilistic Generation Theory: A Unified Information-Theoretic Framework for Hierarchical Structure in Large Language Models

ReasoningShield: Safety Detection over Reasoning Traces of Large Reasoning Models

R$^2$ec: Towards Large Recommender Models with Reasoning

Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

Flattening Hierarchies with Policy Bootstrapping

FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation

MIRROR: Multimodal Cognitive Reframing Therapy for Rolling with Resistance

Statistical post-processing yields accurate probabilistic forecasts from Artificial Intelligence weather models

TMT: Cross-domain Semantic Segmentation with Region-adaptive Transferability Estimation

On the Consistency of Multilingual Context Utilization in Retrieval-Augmented Generation

A Personalized Data-Driven Generative Model of Human Repetitive Motion

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations

Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding

PRISM: Self-Pruning Intrinsic Selection Method for Training-Free Multimodal Data Selection

FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model

Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery

CSI-BERT2: A BERT-inspired Framework for Efficient CSI Prediction and Classification in Wireless Communication and Sensing

SoundnessBench: A Soundness Benchmark for Neural Network Verifiers

Semantically Guided Action Anticipation

On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse

Reliable Decision Making via Calibration Oriented Retrieval Augmented Generation

A Risk Taxonomy and Reflection Tool for Large Language Model Adoption in Public Health

ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom

Optimal Quantization for Matrix Multiplication

Temporal-Difference Variational Continual Learning

Hi-Drive: Hierarchical POMDP Planning for Safe Autonomous Driving in Diverse Urban Environments

Beyond Visual Appearances: Privacy-sensitive Objects Identification via Hybrid Graph Reasoning

Extreme Compression of Adaptive Neural Images

A Comprehensive Survey on Data Augmentation

Do LLM Agents Have Regrets? A Case Study in Online Learning and Games

MULTI: Multimodal Understanding Leaderboard with Text and Images

Nash Equilibria, Regularization and Computation in Optimal Transport-Based Distributionally Robust Optimization

When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning

Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

HardcoreLogic: Challenging Large Reasoning Models with Long-tail Logic Puzzle Games

Tensor Logic: The Language of AI

Do Large Language Models Respect Contracts? Evaluating and Enforcing Contract-Adherence in Code Generation

Benchmarking is Broken -- Don't Let AI be its Own Judge

A Tale of LLMs and Induced Small Proxies: Scalable Agents for Knowledge Mining

Towards Unified Multimodal Misinformation Detection in Social Media: A Benchmark Dataset and Baseline

LLM/Agent-as-Data-Analyst: A Survey

SafeSearch: Automated Red-Teaming for the Safety of LLM-Based Search Agents

Coordination Requires Simplification: Thermodynamic Bounds on Multi-Objective Compromise in Natural and Artificial Intelligence

FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games

HealthProcessAI: A Technical Framework and Proof-of-Concept for LLM-Enhanced Healthcare Process Mining

TASER: Table Agents for Schema-guided Extraction and Recommendation

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Created by

Haebom

Authors

Xiaonan Si, Meilin Zhu, Simeng Qin, Lijia Yu, Lijun Zhang, Shuaitong Liu, Xinfeng Li, Ranjie Duan, Yang Liu, Xiaojun Jia


Outline

This paper proposes a two-stage semantic filtering and conflict-free framework that addresses the vulnerability of retrieval-augmented generation (RAG) systems, which rely on external knowledge, to corpus contamination attacks. In the first stage, the entity-intent-relation extractor (EIRE) guides semantic and cluster-based filtering: it evaluates the semantic relevance between the user query and the filtered documents and selectively adds useful documents to the retrieval database. In the second stage, an EIRE-based consistency filtering module analyzes the semantic consistency among the query, the candidate answers, and the retrieved knowledge, and removes internal and external contradictions that could mislead the model. Through this two-stage process, SeCon-RAG preserves useful knowledge while mitigating contamination-induced contradictions, which improves both the robustness of generation and the reliability of the output.
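The two-stage flow described above can be illustrated with a small sketch. This is not the authors' implementation: the paper's EIRE component is replaced here by a toy word-overlap score, each document carries a hand-written (entity, relation, value) claim instead of an extracted one, and contradictions are resolved by a simple majority vote. The names `Doc`, `stage1_relevance_filter`, and `stage2_consistency_filter`, as well as the 0.5 threshold, are all assumptions made for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Words ignored by the toy relevance score (an arbitrary, illustrative list).
STOPWORDS = {"the", "a", "an", "is", "are", "of", "in", "what", "who", "to", "and"}


@dataclass(frozen=True)
class Doc:
    text: str
    claim: tuple[str, str, str]  # (entity, relation, value); assumed pre-extracted


def terms(text: str) -> set[str]:
    """Toy stand-in for EIRE: lowercase content words of the text."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def stage1_relevance_filter(query: str, docs: list[Doc], threshold: float = 0.5) -> list[Doc]:
    """Stage 1 (sketch): keep documents that share enough terms with the query."""
    q = terms(query)
    kept = []
    for doc in docs:
        score = len(q & terms(doc.text)) / len(q) if q else 0.0
        if score >= threshold:
            kept.append(doc)
    return kept


def stage2_consistency_filter(docs: list[Doc], candidate_answer: str) -> tuple[list[Doc], bool]:
    """Stage 2 (sketch): drop documents whose claim contradicts the majority claim
    for the same (entity, relation), then check the candidate answer against the rest."""
    votes = Counter(doc.claim for doc in docs)
    majority: dict[tuple[str, str], tuple[str, int]] = {}
    for (entity, relation, value), count in votes.items():
        key = (entity, relation)
        # Strict '>' means the first-seen value wins a tie.
        if key not in majority or count > majority[key][1]:
            majority[key] = (value, count)
    consistent = [d for d in docs if majority[(d.claim[0], d.claim[1])][0] == d.claim[2]]
    supported = any(d.claim[2].lower() == candidate_answer.lower() for d in consistent)
    return consistent, supported


query = "What is the capital of France?"
docs = [
    Doc("Paris is the capital of France.", ("France", "capital", "Paris")),
    Doc("The capital of France is Paris, on the Seine.", ("France", "capital", "Paris")),
    Doc("The capital of France is Lyon.", ("France", "capital", "Lyon")),  # contaminated
    Doc("Bananas are rich in potassium.", ("banana", "nutrient", "potassium")),  # off-topic
]

relevant = stage1_relevance_filter(query, docs)
consistent, supported = stage2_consistency_filter(relevant, "Paris")
print(len(relevant), len(consistent), supported)
```

In this toy run, stage 1 discards only the off-topic document, because the contaminated one is still topically relevant; stage 2 then removes it because its claim conflicts with the majority. A majority vote fails when contaminated documents outnumber clean ones, which is one reason a real system would need a stronger consistency signal than this sketch uses.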

Takeaways, Limitations

• Takeaways:
  ◦ Improved RAG reliability and integrity: The framework provides a defense against corpus contamination attacks, making the model more robust.
  ◦ Minimal knowledge loss: It avoids aggressive filtering and preserves useful information based on semantic relevance.
  ◦ EIRE-guided filtering: Extracting entities, intents, and relations enables finer-grained filtering.
  ◦ Consistency checking: Analyzing the consistency among queries, answers, and knowledge eliminates contradictions and improves answer accuracy.
  ◦ State-of-the-art performance: It outperforms existing defense methods across a variety of LLMs and datasets.

• Limitations:
  ◦ Dependence on EIRE: The accuracy of EIRE may affect the performance of the overall framework.
  ◦ Computational cost: Two stages of filtering can be computationally expensive.
  ◦ Dataset dependence: Results may be specific to the evaluated datasets; generalization to other datasets requires further research.
  ◦ Model generalizability: Further analysis is needed to determine how well the proposed method generalizes to different LLMs.
