Daily Arxiv
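This page compiles AI-related papers published around the world.
The summaries on this page are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, you only need to cite the source.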
A Survey on Event-driven 3D Reconstruction: Development under Different Categories
CamSAM2: Segment Anything Accurately in Camouflaged Videos
Scaling Laws of Synthetic Data for Language Models
Graph-Level Label-Only Membership Inference Attack against Graph Neural Networks
Threshold Crossings as Tail Events for Catastrophic AI Risk
Three Kinds of AI Ethics
PRECTR: A Synergistic Framework for Integrating Personalized Search Relevance Matching and CTR Prediction
PG-SAM: Prior-Guided SAM with Medical for Multi-organ Segmentation
LaMOuR: Leveraging Language Models for Out-of-Distribution Recovery in Reinforcement Learning
Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks
ARFlow: Human Action-Reaction Flow Matching with Physical Guidance
Unleashing Vecset Diffusion Model for Fast Shape Generation
Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection
Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data
Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model
OASST-ETC Dataset: Alignment Signals from Eye-tracking Analysis of LLM Responses
Task-Specific Activation Functions for Neuroevolution using Grammatical Evolution
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis
Uni$\textbf{F}^2$ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models
Training Domain Draft Models for Speculative Decoding: Best Practices and Insights
Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection
Implementation of a Generative AI Assistant in K-12 Education: The CyberScholar Initiative
END: Early Noise Dropping for Efficient and Effective Context Denoising
VesselSAM: Leveraging SAM for Aortic Vessel Segmentation with LoRA and Atrous Attention
Agentic AI Software Engineer: Programming with Trust
MetaDE: Evolving Differential Evolution by Differential Evolution
MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation
Context-Aware Semantic Recomposition Mechanism for Large Language Models
Intelligent Code Embedding Framework for High-Precision Ransomware Detection via Multimodal Execution Path Analysis
TransPlace: Transferable Circuit Global Placement via Graph Neural Network
HLV-1K: A Large-scale Hour-Long Video Benchmark for Time-Specific Long Video Understanding
Towards End-to-End Neuromorphic Voxel-based 3D Object Reconstruction Without Physical Priors
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning
DEIM: DETR with Improved Matching for Fast Convergence
Intuitive Axial Augmentation Using Polar-Sine-Based Piecewise Distortion for Medical Slice-Wise Segmentation
Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
Inference-Time Policy Steering through Human Interactions
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation
The importance of the clustering model to detect new types of intrusion in data traffic
MC-LLaVA: Multi-Concept Personalized Vision-Language Model
Two pathways to resolve relational inconsistencies
Bonsai: Gradient-free Graph Condensation for Node Classification
A Multimodal Vision Foundation Model for Clinical Dermatology
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation
Autoregressive Action Sequence Learning for Robotic Manipulation
Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization
General-purpose Clothes Manipulation with Semantic Keypoints
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Fine-Grained Domain Generalization with Feature Structuralization
Data Augmentation in Earth Observation: A Diffusion Model Approach
ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation
Networking Systems for Video Anomaly Detection: A Tutorial and Survey
High-Dimension Human Value Representation in Large Language Models
Motion-Boundary-Driven Unsupervised Surgical Instrument Segmentation in Low-Quality Optical Flow
Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging
Graph-Instructed Neural Networks for Sparse Grid-Based Discontinuity Detectors
Semiring Provenance for Lightweight Description Logics
Certified Robustness via Dynamic Margin Maximization and Improved Lipschitz Regularization
Elastic Federated Learning over Open Radio Access Network (O-RAN) for Concurrent Execution of Multiple Distributed Learning Tasks
Making AI Less "Thirsty": Uncovering and Addressing the Secret Water Footprint of AI Models
Mixture of Robust Experts (MoRE): A Robust Denoising Method towards multiple perturbations
Multi-agent Application System in Office Collaboration Scenarios
Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models
Human Motion Instruction Tuning
Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data
Fully Distributed Fog Load Balancing with Multi-Agent Reinforcement Learning
Socratic Planner: Self-QA-Based Zero-Shot Planning for Embodied Instruction Following
TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models
Truck Parking Usage Prediction with Decomposed Graph Neural Networks
FREIDA: A Framework for developing quantitative agent based models based on qualitative expert knowledge: an example of organised crime
Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark
Understanding R1-Zero-Like Training: A Critical Perspective
ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning
Optimal Scaling Laws for Efficiency Gains in a Theoretical Transformer-Augmented Sectional MoE Framework
High Quality Diffusion Distillation on a Single GPU with Relative and Absolute Position Matching
Quantum Neural Network Restatement of Markov Jump Process
Emotion Detection and Music Recommendation System
Flip Learning: Weakly Supervised Erase to Segment Nodules in Breast Ultrasound
Probabilistic Forecasting for Network Resource Analysis in Integrated Terrestrial and Non-Terrestrial Networks
AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports
TN-Eval: Rubric and Evaluation Protocols for Measuring the Quality of Behavioral Therapy Notes
$\beta$-GNN: A Robust Ensemble Approach Against Graph Structure Perturbation
Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions
State-Aware Perturbation Optimization for Robust Deep Reinforcement Learning
A decision-theoretic approach to dealing with uncertainty in quantum mechanics
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs
GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving
Design and Evaluation of Neural Network-Based Receiver Architectures for Reliable Communication
Towards Efficient and General-Purpose Few-Shot Misclassification Detection for Vision-Language Models
Underwater Image Enhancement by Convolutional Spiking Neural Networks
Contrastive Learning Guided Latent Diffusion Model for Image-to-Image Translation
A multi-agentic framework for real-time, autonomous freeform metasurface design
From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment
Attention Xception UNet (AXUNet): A Novel Combination of CNN and Self-Attention for Brain Tumor Segmentation