Daily Arxiv
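This page collects artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.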
Segment Anything for Satellite Imagery: A Strong Baseline and a Regional Dataset for Automatic Field Delineation
VRAIL: Vectorized Reward-based Attribution for Interpretable Learning
LLM Web Dynamics: Tracing Model Collapse in a Network of LLMs
Accurate and scalable exchange-correlation with deep learning
Working Document -- Formalising Software Requirements with Large Language Models
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
Bures-Wasserstein Flow Matching for Graph Generation
Distributional Training Data Attribution
AFBS:Buffer Gradient Selection in Semi-asynchronous Federated Learning
Robust LLM Unlearning with MUDMAN: Meta-Unlearning with Disruption Masking And Normalization
Exploring the Secondary Risks of Large Language Models
Supernova Event Dataset: Interpreting Large Language Models' Personality through Critical Event Analysis
Agent-RLVR: Training Software Engineering Agents via Guidance and Environment Rewards
C-SEO Bench: Does Conversational SEO Work?
xInv: Explainable Optimization of Inverse Problems
Effective Red-Teaming of Policy-Adherent Agents
SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
PlantDeBERTa: An Open Source Language Model for Plant Science
Reinforcement Learning Teachers of Test Time Scaling
DriveSuprim: Towards Precise Trajectory Selection for End-to-End Planning
OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis
Optimizing Sensory Neurons: Nonlinear Attention Mechanisms for Accelerated Convergence in Permutation-Invariant Neural Networks for Reinforcement Learning
Dual Debiasing for Noisy In-Context Learning for Text Generation
Eye of Judgement: Dissecting the Evaluation of Russian-speaking LLMs with POLLUX
Maximizing Confidence Alone Improves Reasoning
Analysis and Evaluation of Synthetic Data Generation in Speech Dysfluency Detection
FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering
Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs)
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
Pretraining Language Models to Ponder in Continuous Space
SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback
Learning to Insert for Constructive Neural Vehicle Routing Solver
LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference
MIRAGE: A Multi-modal Benchmark for Spatial Perception, Reasoning, and Intelligence
Introducing voice timbre attribute detection
The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan
TrumorGPT: Graph-Based Retrieval-Augmented Large Language Model for Fact-Checking
Democracy of AI Numerical Weather Models: An Example of Global Forecasting with FourCastNetv2 Made by a University Research Lab Using GPU
Learning to Reason under Off-Policy Guidance
Protecting Your Voice: Temporal-aware Robust Watermarking
Personalized News Recommendation with Multi-granularity Candidate-aware User Modeling
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
AutoPDL: Automatic Prompt Optimization for LLM Agents
PiCo: Jailbreaking Multimodal Large Language Models via $\textbf{Pi}$ctorial $\textbf{Co}$de Contextualization
From Easy to Hard: Building a Shortcut for Differentially Private Image Synthesis
Context-Aware Human Behavior Prediction Using Multimodal Large Language Models: Challenges and Insights
Rubric Is All You Need: Enhancing LLM-based Code Evaluation With Question-Specific Rubrics
Shapley Revisited: Tractable Responsibility Measures for Query Answers
Simple and Critical Iterative Denoising: A Recasting of Discrete Diffusion in Graph Generation
Large Language Models powered Malicious Traffic Detection: Architecture, Opportunities and Case Study
TreeSynth: Synthesizing Diverse Data from Scratch via Tree-Guided Subspace Partitioning
A Dual-Directional Context-Aware Test-Time Learning for Text Classification
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation
HiRAG: Retrieval-Augmented Generation with Hierarchical Knowledge
Trajectory Prediction for Autonomous Driving: Progress, Limitations, and Future Directions
POPGym Arcade: Parallel Pixelated POMDPs
BAnG: Bidirectional Anchored Generation for Conditional RNA Design
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation
Steering LLMs for Formal Theorem Proving
Exploring the Potential of Encoder-free Architectures in 3D LMMs
Compromising Honesty and Harmlessness in Language Models via Deception Attacks
Large Language Model Guided Self-Debugging Code Generation
ASCenD-BDS: Adaptable, Stochastic and Context-aware framework for Detection of Bias, Discrimination and Stereotyping
LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently
Segmentation-Aware Generative Reinforcement Network (GRN) for Tissue Layer Segmentation in 3-D Ultrasound Images for Chronic Low-back Pain (cLBP) Assessment
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs
The Impact of Input Order Bias on Large Language Models for Software Fault Localization
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation
Rethinking Cancer Gene Identification through Graph Anomaly Analysis
SurgSora: Object-Aware Diffusion Model for Controllable Surgical Video Generation
DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models
Human Action CLIPs: Detecting AI-generated Human Motion
FLARE: Toward Universal Dataset Purification against Backdoor Attacks
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning
Song Form-aware Full-Song Text-to-Lyrics Generation with Multi-Level Granularity Syllable Count Control
Generating Energy-efficient code with LLMs
DeepMedcast: A Deep Learning Method for Generating Intermediate Weather Forecasts among Multiple NWP Models
How Far is Video Generation from World Model: A Physical Law Perspective
One-Step is Enough: Sparse Autoencoders for Text-to-Image Diffusion Models
The Hive Mind is a Single Reinforcement Learning Agent
How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs
FutureFill: Fast Generation from Convolutional Sequence Models
Leveraging Model Guidance to Extract Training Data from Personalized Diffusion Models
Machine-learning based high-bandwidth magnetic sensing
MOST: MR reconstruction Optimization for multiple downStream Tasks via continual learning
Bridging Geometric Diffusion and Energy Minimization: A Unified Framework for Neural Message Passing
Large Language Models for Disease Diagnosis: A Scoping Review
RePST: Language Model Empowered Spatio-Temporal Forecasting via Semantic-Oriented Reprogramming
Smooth InfoMax -- Towards Easier Post-Hoc Interpretability
PREMAP: A Unifying PREiMage APproximation Framework for Neural Networks
Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference
UniMoT: Unified Molecule-Text Language Model with Discrete Token Representation
Handling Numeric Expressions in Automatic Speech Recognition
"I understand why I got this grade": Automatic Short Answer Grading with Feedback
DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training
Rich Interoperable Metadata for Cultural Heritage Projects at Jagiellonian University
Designing an efficient and equitable humanitarian supply chain dynamically via reinforcement learning
Created by
Haebom
Authors
Weijia Jin
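Overview
This study uses reinforcement learning (Proximal Policy Optimization, PPO) to dynamically design an efficient and equitable humanitarian supply chain, and compares the approach against heuristic algorithms. The results show that the PPO model gives top priority to average satisfaction.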
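The summary does not describe the paper's environment, reward, or metrics, so the following is only an illustrative sketch of why average satisfaction and equity are separate goals. The function names, the equity proxy (gap between the best- and worst-served site), and the numbers are all assumptions made for this example and are not taken from the paper.

```python
from statistics import mean


def satisfaction_rates(delivered, demand):
    """Share of each site's demand that was met, capped at 1.0."""
    return [min(d / q, 1.0) if q > 0 else 1.0 for d, q in zip(delivered, demand)]


def average_satisfaction(rates):
    """Efficiency proxy (assumed): mean satisfaction across sites."""
    return mean(rates)


def equity_gap(rates):
    """Equity proxy (assumed): spread between best- and worst-served site."""
    return max(rates) - min(rates)


# Hypothetical example: three sites with equal demand and 240 units to distribute.
demand = [100, 100, 100]
plan_a = [100, 100, 40]  # serves two sites fully, one poorly
plan_b = [80, 80, 80]    # spreads the same supply evenly

rates_a = satisfaction_rates(plan_a, demand)
rates_b = satisfaction_rates(plan_b, demand)

print(average_satisfaction(rates_a), equity_gap(rates_a))
print(average_satisfaction(rates_b), equity_gap(rates_b))
```

In this toy case both plans reach the same average satisfaction, yet only the second leaves no gap between sites. An objective built on average satisfaction alone would not distinguish them, which is why an equity term has to be measured separately.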
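Implications and Limitations
• Implications: The work suggests that a reinforcement-learning-based PPO algorithm can improve the efficiency and the equity of a humanitarian supply chain at the same time. It also confirms that the PPO model tends to prioritize average satisfaction.
• Limitations: The specific heuristic algorithms used and the performance comparison against them are not described in detail. The real-world applicability and generalization performance of the PPO model need further validation. The experimental design may not cover a sufficiently diverse range of scenarios and constraints.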