haebom
Daily Arxiv
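This page curates artificial intelligence papers published around the world.
Summaries on this page are produced with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.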
An Introduction to Sliced Optimal Transport
Created by
Haebom
Author
Khai Nguyen
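Overview
Sliced Optimal Transport (SOT) is a rapidly developing branch of optimal transport (OT) that exploits the tractability of one-dimensional OT problems. By combining tools from OT, integral geometry, and computational statistics, SOT enables fast and scalable computation of distances, barycenters, and kernels for probability measures while preserving rich geometric structure. This paper comprehensively reviews the mathematical foundations, methodological advances, computational methods, and applications of SOT.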
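To make the "tractability of one-dimensional OT" concrete, the following is a minimal sketch of a Monte Carlo estimate of the sliced Wasserstein distance, not the paper's own code. It assumes two point clouds of equal size with uniform weights, so that each one-dimensional OT problem reduces to sorting the projected samples and matching them in order. The function name `sliced_wasserstein` and its parameters `n_projections`, `p`, and `seed` are illustrative choices, not taken from the paper.

```python
import numpy as np


def sliced_wasserstein(x, y, n_projections=200, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    Assumes x and y are arrays of shape (n, d) with the same n, each
    representing an empirical measure with uniform weights 1/n.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 2 or x.shape != y.shape:
        raise ValueError("x and y must both have shape (n, d)")

    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Draw directions uniformly on the unit sphere by normalizing Gaussians.
    theta = rng.standard_normal((n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project every sample onto every direction: shape (n, n_projections).
    x_proj = x @ theta.T
    y_proj = y @ theta.T

    # 1D OT between equal-size uniform empirical measures is solved by
    # sorting both sets of projections and pairing them in order.
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)

    # Average the p-th power cost over samples and directions, then take
    # the p-th root.
    cost = np.mean(np.abs(x_sorted - y_sorted) ** p)
    return cost ** (1.0 / p)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.standard_normal((500, 3))
    b = rng.standard_normal((500, 3)) + np.array([2.0, 0.0, 0.0])
    print(sliced_wasserstein(a, b))
```

Each projection costs one sort, so the whole estimate runs in roughly O(L n log n) time for L projections and n samples, which is where the scalability advantage over solving the full d-dimensional OT problem comes from. Unequal sample sizes or non-uniform weights would need a quantile-based 1D solver instead of the plain sort used here.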
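Implications and Limitations
• Provides a comprehensive review of SOT's mathematical foundations, methodology, computational methods, and applications.
• Offers faster and more scalable computation than conventional OT.
• Points to applications across machine learning, statistics, computer graphics, computer vision, and other fields.
• Discusses recent methodological advances, including nonlinear projections, improved Monte Carlo approximations, and weighted slicing techniques.
• Covers extensions to unbalanced, partial, multi-marginal, and Gromov-Wasserstein settings.
• The paper does not explicitly state its own limitations.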