Daily Arxiv
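This page collects artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.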
PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts
Preacher: Paper-to-Video Agentic System
Hallucination vs interpretation: rethinking accuracy and precision in AI-assisted data extraction for knowledge synthesis
Decentralized Weather Forecasting via Distributed Machine Learning and Blockchain-Based Model Validation
Biased AI improves human decision-making but reduces trust
Personalized Feature Translation for Expression Recognition: An Efficient Source-Free Domain Adaptation Method
A Neurosymbolic Framework for Interpretable Cognitive Attack Detection in Augmented Reality
IAD-R1: Reinforcing Consistent Reasoning in Industrial Anomaly Detection
EvaDrive: Evolutionary Adversarial Policy Optimization for End-to-End Autonomous Driving
To Theoretically Understand Transformer-Based In-Context Learning for Optimizing CSMA
ASPD: Unlocking Adaptive Serial-Parallel Decoding by Exploring Intrinsic Parallelism in LLMs
BiasGym: Fantastic LLM Biases and How to Find (and Remove) Them
Yan: Foundational Interactive Video Generation
M3-Net: A Cost-Effective Graph-Free MLP-Based Model for Traffic Prediction
LLM-Driven Adaptive 6G-Ready Wireless Body Area Networks: Survey and Framework
The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs
On Understanding of the Dynamics of Model Capacity in Continual Learning
WeChat-YATT: A Simple, Scalable and Balanced RLHF Trainer
Improved Personalized Headline Generation via Denoising Fake Interests from Implicit Feedback
Hardness-Aware Dynamic Curriculum Learning for Robust Multimodal Emotion Recognition with Missing Modalities
Echoes of Automation: The Increasing Use of LLMs in Newsmaking
SIFThinker: Spatially-Aware Image Focus for Visual Reasoning
Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle
Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction
Position: The Current AI Conference Model is Unsustainable! Diagnosing the Crisis of Centralized AI Conference
MSC: A Marine Wildlife Video Dataset with Grounded Segmentation and Clip-Level Captioning
Self-Questioning Language Models
Exploring the Application of Visual Question Answering (VQA) for Classroom Activity Monitoring
Oranits: Mission Assignment and Task Offloading in Open RAN-based ITS using Metaheuristic and Deep Reinforcement Learning
DeepWriter: A Fact-Grounded Multimodal Writing Assistant Based On Offline Knowledge Base
Class-Proportional Coreset Selection for Difficulty-Separable Data
Warehouse Spatial Question Answering with LLM Agent
CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks
AmpLyze: A Deep Learning Model for Predicting the Hemolytic Concentration
EXAONE Path 2.0: Pathology Foundation Model with End-to-End Supervision
GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study
Discrepancy-Aware Graph Mask Auto-Encoder
Semantic Structure-Aware Generative Attacks for Enhanced Adversarial Transferability
Quantitative Comparison of Fine-Tuning Techniques for Pretrained Latent Diffusion Models in the Generation of Unseen SAR Images
PromptTSS: A Prompting-Based Approach for Interactive Multi-Granularity Time Series Segmentation
15,500 Seconds: Lean UAV Classification Using EfficientNet and Lightweight Fine-Tuning
Prompt Attacks Reveal Superficial Knowledge Removal in Unlearning Methods
Data Pruning by Information Maximization
CCL-LGS: Contrastive Codebook Learning for 3D Language Gaussian Splatting
Security Concerns for Large Language Models: A Survey
Is Quantum Optimization Ready? An Effort Towards Neural Network Compression using Adiabatic Quantum Computing
Unraveling the iterative CHAD
FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference
LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation
Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free
Adaptive Budgeted Multi-Armed Bandits for IoT with Dynamic Resource Constraints
Vision Transformers in Precision Agriculture: A Comprehensive Survey
Goal-Oriented Time-Series Forecasting: Foundation Framework Design
CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting
FinSage: A Multi-aspect RAG System for Financial Filings Question Answering
GraspClutter6D: A Large-scale Real-world Dataset for Robust Perception and Grasping in Cluttered Scenes
Hyperflux: Pruning Reveals the Importance of Weights
ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
VectorFit : Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache
Explainable Sentiment Analysis with DeepSeek-R1: Performance, Efficiency, and Few-Shot Learning
Continual Learning for Multiple Modalities
Advancing MAPF towards the Real World: A Scalable Multi-Agent Realistic Testbed (SMART)
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
Boosting Cross-problem Generalization in Diffusion-Based Neural Combinatorial Solver via Inference Time Adaptation
Rhythmic sharing: A bio-inspired paradigm for zero-shot adaptive learning in neural networks
Measuring Diversity in Synthetic Datasets
Delayed Feedback Modeling with Influence Functions
Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding
Interpretable Neural ODEs for Gene Regulatory Network Discovery under Perturbations
A Lightweight Transformer with Phase-Only Cross-Attention for Illumination-Invariant Biometric Authentication
Understanding Transformer-based Vision Models through Inversion
INSIGHT: Explainable Weakly-Supervised Medical Image Analysis
Visual SLAMMOT Considering Multiple Motion Models
A Training-Free Approach for Music Style Transfer with Latent Diffusion Models
Multi-objective Optimization in CPU Design Space Exploration: Attention is All You Need
DiRW: Path-Aware Digraph Learning for Heterophily
Diversifying Policy Behaviors with Extrinsic Behavioral Curiosity
Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience
Neural Networks Generalize on Low Complexity Data
Knowledge-based Consistency Testing of Large Language Models
Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning
An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach
Communication Cost Reduction for Subgraph Counting under Local Differential Privacy via Hash Functions
Mathematical Computation and Reasoning Errors by Large Language Models
OpenCUA: Open Foundations for Computer-Use Agents
Compass-Thinker-7B Technical Report
TextQuests: How Good are LLMs at Text-Based Video Games?
On the Definition of Intelligence
Beyond Accuracy: How AI Metacognitive Sensitivity improves AI-assisted Decision Making
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization
FAIRGAME: a Framework for AI Agents Bias Recognition using Game Theory
MedRep: Medical Concept Representation for General Electronic Health Record Foundation Models
A Random-Key Optimizer for Combinatorial Optimization
Federated Cross-Training Learners for Robust Generalization under Data Heterogeneity
Leveraging Large Language Models for Relevance Judgments in Legal Case Retrieval
Class-Proportional Coreset Selection for Difficulty-Separable Data
Created by
Haebom
Authors
Elisa Tsai, Haizhong Zheng, Atul Prakash