Daily Arxiv
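This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, you only need to cite the source.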
What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
Squeeze the Soaked Sponge: Efficient Off-policy Reinforcement Finetuning for Large Language Model
The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover
Artificial Generals Intelligence: Mastering Generals.io with Reinforcement Learning
HeLo: Heterogeneous Multi-Modal Fusion with Label Correlation for Emotion Distribution Learning
ixi-GEN: Efficient Industrial sLLMs through Domain Adaptive Continual Pretraining
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications
PWD: Prior-Guided and Wavelet-Enhanced Diffusion Model for Limited-Angle CT
VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting
Adaptation of Multi-modal Representation Models for Multi-task Surgical Computer Vision
Multi-modal Representations for Fine-grained Multi-label Critical View of Safety Recognition
MCFormer: A Multi-Cost-Volume Network and Comprehensive Benchmark for Particle Image Velocimetry
Toward Efficient Speech Emotion Recognition via Spectral Learning and Attention
Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards
Solving the Hubbard model with Neural Quantum States
S2FGL: Spatial Spectral Federated Graph Learning
Beyond Spatial Frequency: Pixel-wise Temporal Frequency-based Deepfake Video Detection
Are Vision Transformer Representations Semantically Meaningful? A Case Study in Medical Imaging
Description of the Training Process of Neural Networks via Ergodic Theorem : Ghost nodes
A Theory of Inference Compute Scaling: Reasoning through Directed Stochastic Skill Search
Damba-ST: Domain-Adaptive Mamba for Efficient Urban Spatio-Temporal Prediction
Studying and Improving Graph Neural Network-based Motif Estimation
Learning Algorithms in the Limit
Thought Crime: Backdoors and Emergent Misalignment in Reasoning Models
HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations
MAEBE: Multi-Agent Emergent Behavior Framework
Evaluating LLM Agent Adherence to Hierarchical Safety Principles: A Lightweight Benchmark for Probing Foundational Controllability Components
What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training
From Images to Signals: Are Large Vision Models Useful for Time Series Analysis?
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory
BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems
Anchoring AI Capabilities in Market Valuations: The Capability Realization Rate Model and Valuation Misalignment Risk
Fair Uncertainty Quantification for Depression Prediction
MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework
A Cryptographic Perspective on Mitigation vs. Detection in Machine Learning
Constraint Programming Models For Serial Batch Scheduling With Minimum Batch Size
Toward Holistic Evaluation of Recommender Systems Powered by Generative Models
Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation
Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation
Decoding AI Judgment: How LLMs Assess News Credibility and Bias
Ethical Concerns of Generative AI and Mitigation Strategies: A Systematic Mapping Study
Diffusion Augmented Retrieval: A Training-Free Approach to Interactive Text-to-Image Retrieval
Derivation of Output Correlation Inferences for Multi-Output (aka Multi-Task) Gaussian Process
Cosmos World Foundation Model Platform for Physical AI
Enhancing Transformers for Generalizable First-Order Logical Entailment
Multi-Scenario Reasoning: Unlocking Cognitive Autonomy in Humanoid Robots for Multimodal Understanding
DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness
Tiny-Align: Bridging Automatic Speech Recognition and Large Language Model on the Edge
Understanding Chain-of-Thought in LLMs through Information Theory
A Multi-Granularity Supervised Contrastive Framework for Remaining Useful Life Prediction of Aero-engines
MarineFormer: A Spatio-Temporal Attention Model for USV Navigation in Dynamic Marine Environments
HARMONIC: Cognitive and Control Collaboration in Human-Robotic Teams
Investigating Context-Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style
Masked Image Modeling: A Survey
Time Makes Space: Emergence of Place Fields in Networks Encoding Temporally Continuous Sensory Experiences
Curriculum Negative Mining For Temporal Networks
C3T: Cross-modal Transfer Through Time for Sensor-based Human Activity Recognition
Multi-Head RAG: Solving Multi-Aspect Problems with LLMs
Solving Probabilistic Verification Problems of Neural Networks using Branch and Bound
Offline Trajectory Optimization for Offline Reinforcement Learning
Structure Guided Large Language Model for SQL Generation
A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive
Don't Push the Button! Exploring Data Leakage Risks in Machine Learning and Transfer Learning
Unsupervised Automata Learning via Discrete Optimization
Don't Get Me Wrong: How to Apply Deep Visual Interpretations to Time Series
An Algorithm for Learning Smaller Representations of Models With Scarce Data
GTA1: GUI Test-time Scaling Agent
Fuzzy Classification Aggregation for a Continuum of Agents
Rule Learning for Knowledge Graph Reasoning under Agnostic Distribution Shift
Establishing Best Practices for Building Rigorous Agentic Benchmarks
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
AI's Euclid's Elements Moment: From Language Models to Computable Thought
Closer to Language than Steam: AI as the Cognitive Engine of a New Productivity Revolution
Access Controls Will Solve the Dual-Use Dilemma
Task Assignment and Exploration Optimization for Low Altitude UAV Rescue via Generative AI Enhanced Multi-agent Reinforcement Learning
Affordable AI Assistants with Knowledge Graph of Thoughts
Deontic Temporal Logic for Formal Verification of AI Ethics
Multi-Agent Pathfinding Under Team-Connected Communication Constraint via Adaptive Path Expansion and Dynamic Leading
Constrain Alignment with Sparse Autoencoders
Multi-modal Generative AI: Multi-modal LLMs, Diffusions and the Unification
SimSUM: Simulated Benchmark with Structured and Unstructured Medical Records
Solving a Stackelberg Game on Transportation Networks in a Dynamic Crime Scenario: A Mixed Approach on Multi-Layer Networks
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
PyVision: Agentic Vision with Dynamic Tooling
Single-pass Adaptive Image Tokenization for Minimum Program Search
Multigranular Evaluation for Brain Visual Decoding
Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs
EXPO: Stable Reinforcement Learning with Expressive Policies
Performance and Practical Considerations of Large and Small Language Models in Clinical Decision Support in Rheumatology
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
Why is Your Language Model a Poor Implicit Reward Model?
Reinforcement Learning with Action Chunking
Scaling RL to Long Videos
MIRIX: Multi-Agent Memory System for LLM-Based Agents
Low Resource Reconstruction Attacks Through Benign Prompts
Probing Experts' Perspectives on AI-Assisted Public Speaking Training
Towards Continuous Home Cage Monitoring: An Evaluation of Tracking and Identification Strategies for Laboratory Mice
DTECT: Dynamic Topic Explorer & Context Tracker
Agentic Retrieval of Topics and Insights from Earnings Calls