haebom
Daily Arxiv
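This page collects artificial intelligence papers published around the world.
Summaries on this page are produced with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply credit the source.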
Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level Dropin & Neuroplasticity Mechanisms
Probabilistic Geometric Alignment via Bayesian Latent Transport for Domain-Adaptive Foundation Models
Cognitive Training for Language Models: Towards General Capabilities via Cross-Entropy Games
Learning When to Act: Interval-Aware Reinforcement Learning with Predictive Temporal Structure
P^2O: Joint Policy and Prompt Optimization
mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT
TRACE: A Multi-Agent System for Autonomous Physical Reasoning for Seismology
Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis
Elastic Weight Consolidation Done Right for Continual Learning
When Should a Robot Think? Resource-Aware Reasoning via Reinforcement Learning for Embodied Robotic Decision-Making
360° Image Perception with MLLMs: A Comprehensive Benchmark and a Training-Free Method
Seeking Physics in Diffusion Noise
SemBench: A Universal Semantic Framework for LLM Evaluation
UtilityMax Prompting: A Formal Framework for Multi-Objective Large Language Model Optimization
Theory of Dynamic Adaptive Coordination
Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI
The DMA Streaming Framework: Kernel-Level Buffer Orchestration for High-Performance AI Data Paths
Graph-of-Mark: Promote Spatial Reasoning in Multimodal Language Models with Graph-Based Visual Prompting
Why Adam Can Beat SGD: Second-Moment Normalization Yields Sharper Tails
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing
See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis
The Landscape of AI in Science Education: What is Changing and How to Respond
Impact of AI Search Summaries on Website Traffic: Evidence from Google AI Overviews and Wikipedia
Monocular Normal Estimation via Shading Sequence Estimation
Towards Exploratory and Focused Manipulation with Bimanual Active Perception: A New Problem, Benchmark and Strategy
Temporal Sepsis Modeling: a Fully Interpretable Relational Way
Gradient Regularized Natural Gradients
SciCoQA: Quality Assurance for Scientific Paper–Code Alignment
Information Access of the Oppressed: A Problem-Posing Framework for Envisioning Emancipatory Information Access Platforms
TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts
Context Matters: Peer-Aware Student Behavioral Engagement Measurement via VLM Action Parsing and LLM Sequence Classification
IDESplat: Iterative Depth Probability Estimation for Generalizable 3D Gaussian Splatting
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
SWAA: Sliding Window Attention Adaptation for Efficient and Quality Preserving Long Context Processing
ByteStorm: a multi-step data-driven approach for Tropical Cyclones detection and tracking
Constant-Time Motion Planning with Manipulation Behaviors
Epistemic Bias Injection: Biasing LLMs via Selective Context Retrieval
A cross-species neural foundation model for end-to-end speech decoding
Foundry: Distilling 3D Foundation Models for the Edge
Generative deep learning for foundational video translation in ultrasound
Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation
Constrained Diffusion for Protein Design with Hard Structural Constraints
CQA-Eval: Designing Reliable Evaluations of Multi-paragraph Clinical QA under Resource Constraints
DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models
GeoResponder: Towards Building Geospatial LLMs for Time-Critical Disaster Response
End-to-End Low-Level Neural Control of an Industrial-Grade 6D Magnetic Levitation System
MedShift: Implicit Conditional Transport for X-Ray Domain Adaptation
The Information Dynamics of Generative Diffusion
Mapping the Course for Prompt-based Structured Prediction
Hierarchical Adaptive networks with Task vectors for Test-Time Adaptation
Acoustic Imaging for Low-SNR UAV Detection: Dense Beamformed Energy Maps and U-Net SELD
CodeNER: Code Prompting for Named Entity Recognition
Predicting Human Mobility during Extreme Events via LLM-Enhanced Cross-City Learning
U-DREAM: Unsupervised Dereverberation guided by a Reverberation Model
BMFM-RNA: whole-cell expression decoding improves transcriptomic foundation models
Instruction Following by Principled Boosting Attention of Large Language Models
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents
The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Recognition
Physics-Informed Evolution: An Evolutionary Framework for Solving Quantum Control Problems Involving the Schrödinger Equation
The Limits of Inference Scaling Through Resampling
Adaptive Online Mirror Descent for Tchebycheff Scalarization in Multi-Objective Learning
LLM4AD: Large Language Models for Autonomous Driving – Concept, Review, Benchmark, Experiments, and Future Trends
LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts
CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers
The Future of AI-Driven Software Engineering
MindSet: Vision. A toolbox for testing DNNs on key psychological experiments
A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge
AI-Supervisor: Autonomous AI Research Supervision via a Persistent Research World Model
Environment Maps: Structured Environmental Representations for Long-Horizon Agents
MIRAGE: The Illusion of Visual Understanding
Man and machine: artificial intelligence and judicial decision making
Characterizing Linear Alignment Across Language Models
Conflict-Based Search for Multi Agent Path Finding with Asynchronous Actions
Consequentialist Objectives and Catastrophe
RetroAgent: From Solving to Evolving via Retrospective Dual Intrinsic Feedback
XGrammar-2: Efficient Dynamic Structured Generation Engine for Agentic LLMs
Analysing Environmental Efficiency in AI for X-Ray Diagnosis
Planned Diffusion
From What to Why: A Multi-Agent System for Evidence-based Chemical Reaction Condition Reasoning
Do Language Models Follow Occam's Razor? An Evaluation of Parsimony in Inductive and Abductive Reasoning
Interactive Query Answering on Knowledge Graphs with Soft Entity Constraints
Ludax: A GPU-Accelerated Domain Specific Language for Board Games
TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving
Concepts Learned Visually by Infants Can Contribute to Visual Learning and Understanding in AI Models
Research on environment perception and behavior prediction of intelligent UAV based on semantic communication
Semi-Strongly solved: a New Definition Leading Computer to Perfect Gameplay
Working Paper: Active Causal Structure Learning with Latent Variables: Towards Learning to Detour in Autonomous Robots
Vega: Learning to Drive with Natural Language Instructions
Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference
PixelSmile: Toward Fine-Grained Facial Expression Editing
Natural-Language Agent Harnesses
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
Neural Network Conversion of Machine Learning Pipelines
The Kitchen Loop: User-Spec-Driven Development for a Self-Evolving Codebase
A Unified Memory Perspective for Probabilistic Trustworthy AI
Just Zoom In: Cross-View Geo-Localization via Autoregressive Zooming
Measuring What Matters – or What's Convenient?: Robustness of LLM-Based Scoring Systems to Construct-Irrelevant Factors