Daily Arxiv

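This page compiles artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright in each paper belongs to its authors and their institutions; when sharing, you need only cite the source.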

Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions

On The Impact of Merge Request Deviations on Code Review Practices

Societal AI Research Has Become Less Interdisciplinary

Geometric deep learning for local growth prediction on abdominal aortic aneurysm surfaces

Auto-Regressive vs Flow-Matching: a Comparative Study of Modeling Paradigms for Text-to-Music Generation

KP-PINNs: Kernel Packet Accelerated Physics Informed Neural Networks

Teaching Physical Awareness to LLMs through Sounds

TGRPO: Fine-tuning Vision-Language-Action Model via Trajectory-wise Group Relative Policy Optimization

TACTIC: Translation Agents with Cognitive-Theoretic Interactive Collaboration

Your Agent Can Defend Itself against Backdoor Attacks

Learnable Spatial-Temporal Positional Encoding for Link Prediction

Unable to Forget: Proactive Interference Reveals Working Memory Limits in LLMs Beyond Context Length

IGraSS: Learning to Identify Infrastructure Networks from Satellite Imagery by Iterative Graph-constrained Semantic Segmentation

STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation

Physics-Informed Teleconnection-Aware Transformer for Global Subseasonal-to-Seasonal Forecasting

Toward Reliable AR-Guided Surgical Navigation: Interactive Deformation Modeling with Data-Driven Biomechanics and Prompts

Modality-Balancing Preference Optimization of Large Multimodal Models by Adversarial Negative Mining

Vision Transformers Don't Need Trained Registers

Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations

AbstRaL: Augmenting LLMs' Reasoning by Reinforcing Abstract Thinking

Synthesis by Design: Controlled Data Generation via Structural Guidance

MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization

MedChat: A Multi-Agent Framework for Multimodal Diagnosis with Large Language Models

Pre-trained Large Language Models Learn Hidden Markov Models In-context

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Auditing Black-Box LLM APIs with a Rank-Based Uniformity Test

Can LLMs Generate Reliable Test Case Generators? A Study on Competition-Level Programming Problems

A Reinforcement Learning Approach for RIS-aided Fair Communications

Multi-Modal Multi-Task Federated Foundation Models for Next-Generation Extended Reality Systems: Towards Privacy-Preserving Distributed Intelligence in AR/VR/MR

Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs

Context Is Not Comprehension: Unmasking LLM reasoning blind spots with VLO

HoliSafe: Holistic Safety Benchmarking and Modeling with Safety Meta Token for Vision-Language Model

Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025

GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation

Fourier-Modulated Implicit Neural Representation for Multispectral Satellite Image Compression

NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction

Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis

Bayesian Neural Scaling Law Extrapolation with Prior-Fitted Networks

DeepMultiConnectome: Deep Multi-Task Prediction of Structural Connectomes Directly from Diffusion MRI Tractography

SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting

Large Language Models Miss the Multi-Agent Mark

Rethinking Text-based Protein Understanding: Retrieval or LLM?

Follow the Energy, Find the Path: Riemannian Metrics from Energy-Based Models

Discovering Forbidden Topics in Language Models

LIFEBench: Evaluating Length Instruction Following in Large Language Models

Fine-tuning Diffusion Policies with Backpropagation Through Diffusion Timesteps

Reciprocity as the Foundational Substrate of Society: How Reciprocal Dynamics Scale into Social Systems

LLM Enhancers for GNNs: An Analysis from the Perspective of Causal Mechanism Identification

Product of Experts with LLMs: Boosting Performance on ARC Is a Matter of Perspective

Convert Language Model into a Value-based Strategic Planner

Griffin: Towards a Graph-Centric Relational Database Foundation Model

Value Portrait: Assessing Language Models' Values through Psychometrically and Ecologically Valid Items

Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism

Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment

Assessment of Evolving Large Language Models in Upper Secondary Mathematics

TerraMind: Large-Scale Generative Multimodality for Earth Observation

LEMUR Neural Network Dataset: Towards Seamless AutoML

Style over Substance: Distilled Language Models Reason Via Stylistic Replication

Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition

Chem42: a Family of chemical Language Models for Target-aware Ligand Generation

AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification

FC-Attack: Jailbreaking Multimodal Large Language Models via Auto-Generated Flowcharts

Weakly Supervised Multiple Instance Learning for Whale Call Detection and Temporal Localization in Long-Duration Passive Acoustic Monitoring

Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation

AAD-LLM: Neural Attention-Driven Auditory Scene Understanding

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

Mem2Ego: Empowering Vision-Language Models with Global-to-Ego Memory for Long-Horizon Embodied Navigation

Lost in Sequence: Do Large Language Models Understand Sequential Recommendation?

Conformal Prediction as Bayesian Quadrature

On the Privacy Risks of Spiking Neural Networks: A Membership Inference Analysis

Trustworthy AI: Safety, Bias, and Privacy -- A Survey

NestQuant: Nested Lattice Quantization for Matrix Products and LLMs

Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies

MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents

Position: Emergent Machina Sapiens Urge Rethinking Multi-Agent Paradigms

PatchPilot: A Cost-Efficient Software Engineering Agent with Early Attempts on Formal Verification

Bias Detection via Maximum Subgroup Discrepancy

Irony Detection, Reasoning and Understanding in Zero-shot Learning

TSVC: Tripartite Learning with Semantic Variation Consistency for Robust Image-Text Retrieval

An LLM-Empowered Adaptive Evolutionary Algorithm For Multi-Component Deep Learning Systems

Large Language Models for Scholarly Ontology Generation: An Extensive Analysis in the Engineering Field

7B Fully Open Source Moxin-LLM/VLM -- From Pretraining to GRPO-based Reinforcement Learning Enhancement

Multi-Party Supervised Fine-tuning of Language Models for Multi-Party Dialogue Generation

Meaningless is better: hashing bias-inducing words in LLM prompts improves performance in logical reasoning and statistical learning

CROW: Eliminating Backdoors from Large Language Models via Internal Consistency Regularization

GenJoin: Conditional Generative Plan-to-Plan Query Optimizer that Learns from Subplan Hints

Code-Switching Curriculum Learning for Multilingual Transfer in LLMs

CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis

Phonology-Guided Speech-to-Speech Translation for African Languages

The Causal Information Bottleneck and Optimal Causal Variable Abstractions

Multimodal Pragmatic Jailbreak on Text-to-image Models

Code Vulnerability Repair with Large Language Model using Context-Aware Prompt Tuning

A Survey on Knowledge Organization Systems of Research Fields: Resources and Challenges

LogProber: Disentangling confidence from contamination in LLM responses

Holistic Uncertainty Estimation For Open-Set Recognition

AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling

XMeCap: Meme Caption Generation with Sub-Image Adaptability

CHOSEN: Compilation to Hardware Optimization Stack for Efficient Vision Transformer Inference

The Remarkable Robustness of LLMs: Stages of Inference?

BiCo-Fusion: Bidirectional Complementary LiDAR-Camera Fusion for Semantic- and Spatial-Aware 3D Object Detection
