Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Boosting Cross-problem Generalization in Diffusion-Based Neural Combinatorial Solver via Inference Time Adaptation

Created by
  • Haebom

Author

Haoyu Lei, Kaiwen Zhou, Yinchuan Li, Zhitang Chen, Farzan Farnia

Outline

This paper presents a diffusion-model-based approach to neural combinatorial optimization (NCO) for NP-complete problems. To address the limitations of existing NCO methods, namely poor size and cross-problem generalization and high training costs, the authors propose DIFU-Ada, a framework that adapts at inference time without any additional training. DIFU-Ada uses predefined guidance functions to enable conditional generation, zero-shot cross-problem transfer, and size generalization. The authors analyze cross-problem transferability theoretically and experimentally show that a diffusion solver trained only on the traveling salesman problem (TSP) achieves competitive zero-shot transfer performance on TSP variants such as the prize-collecting TSP (PCTSP) and the orienteering problem (OP).
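The core idea, steering a pretrained diffusion sampler with a problem-specific guidance function at inference time, can be sketched as follows. This is a minimal illustration under assumed simplifications (a toy linear noise schedule and a DDIM-style update); the `denoiser` and `guidance` stand-ins are hypothetical and do not reflect the authors' actual DIFU-Ada implementation.

```python
import numpy as np

def guided_denoise_step(x_t, t, denoiser, guidance_grad, scale=1.0, T=100):
    """One reverse-diffusion step steered by a guidance gradient.

    Illustrative only: a DDIM-style x0 estimate with a toy linear
    noise schedule, not the paper's actual sampler.
    """
    eps = denoiser(x_t, t)                  # model's noise prediction
    eps = eps - scale * guidance_grad(x_t)  # inject problem-specific guidance
    alpha = 1.0 - t / T                     # toy schedule: alpha in (0, 1]
    x0_hat = (x_t - np.sqrt(1.0 - alpha) * eps) / np.sqrt(alpha)
    return x0_hat

# Hypothetical stand-ins: a trained TSP denoiser and a guidance function
# encoding, e.g., a prize/penalty objective for a TSP variant.
denoiser = lambda x, t: np.zeros_like(x)
guidance = lambda x: 0.1 * x

x = np.ones((5, 5))                        # toy edge-score matrix
out = guided_denoise_step(x, t=50, denoiser=denoiser, guidance_grad=guidance)
print(out.shape)
```

Because the guidance term is added only inside the sampling loop, the same pretrained denoiser can be reused across problem variants by swapping the guidance function, which is the mechanism behind the zero-shot transfer described above.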

Takeaways, Limitations

Takeaways:
Proposes DIFU-Ada, a novel framework that adapts at inference time without additional training, addressing the high training cost and poor generalization of existing diffusion-based NCO solvers.
Experimentally verifies improved zero-shot cross-problem transfer and size generalization.
Provides a theoretical analysis that deepens the understanding of cross-problem transferability.
Limitations:
The method is demonstrated only on TSP and its variants; generalization to other classes of combinatorial optimization problems requires further study.
Performance may depend on the design of the predefined guidance functions, and finding optimal guidance functions remains an open question.
Evaluation on a wider range of problem sizes is needed, as the results may be biased toward certain instance sizes.