Daily Arxiv

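This page organizes artificial intelligence papers published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.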

AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation

Self-Guided Process Reward Optimization with Redefined Step-wise Advantage for Process Reinforcement Learning

Crafting Hanzi as Narrative Bridges: An AI Co-Creation Workshop for Elderly Migrants

Distributional Soft Actor-Critic with Diffusion Policy

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Fast AI Model Splitting over Edge Networks

From Sentences to Sequences: Rethinking Languages in Biological System

MTCNet: Motion and Topology Consistency Guided Learning for Mitral Valve Segmentation in 4D Ultrasound

Horus: A Protocol for Trustless Delegation Under Uncertainty

Mixture of Reasonings: Teach Large Language Models to Reason with Adaptive Strategies

Benchmarking Generalizable Bimanual Manipulation: RoboTwin Dual-Arm Collaboration Challenge at CVPR 2025 MEIS Workshop

Red Teaming for Generative AI, Report on a Copyright-Focused Exercise Completed in an Academic Medical Center

AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration

Semantic Structure-Aware Generative Attacks for Enhanced Adversarial Transferability

Aligning Frozen LLMs by Reinforcement Learning: An Iterative Reweight-then-Optimize Approach

Distinguishing Predictive and Generative AI in Regulation

AIn't Nothing But a Survey? Using Large Language Models for Coding German Open-Ended Survey Responses on Survey Motivation

Text-Aware Image Restoration with Diffusion Models

How Good LLM-Generated Password Policies Are?

Towards an Explainable Comparison and Alignment of Feature Embeddings

Gradient-Based Model Fingerprinting for LLM Similarity Detection and Family Classification

Empowering Intelligent Low-altitude Economy with Large AI Model Deployment

Incorporating LLMs for Large-Scale Urban Complex Mobility Simulation

Generating Hypotheses of Dynamic Causal Graphs in Neuroscience: Leveraging Generative Factor Models of Observed Time Series

Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs

Threat Modeling for AI: The Case for an Asset-Centric Approach

SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings

PAD: Phase-Amplitude Decoupling Fusion for Multi-Modal Land Cover Classification

Significativity Indices for Agreement Values

Transferrable Surrogates in Expressive Neural Architecture Search Spaces

Privacy-Preserving Operating Room Workflow Analysis using Digital Twins

Uncertainty-Guided Coarse-to-Fine Tumor Segmentation with Anatomy-Aware Post-Processing

CMD-HAR: Cross-Modal Disentanglement for Wearable Human Activity Recognition

Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models

Understanding-informed Bias Mitigation for Fair CMR Segmentation

HAPI: A Model for Learning Robot Facial Expressions from Human Preferences

MaizeField3D: A Curated 3D Point Cloud and Procedural Model Dataset of Field-Grown Maize from a Diversity Panel

Illuminant and light direction estimation using Wasserstein distance method

Fundamental Limits of Hierarchical Secure Aggregation with Cyclic User Association

LLM-Powered Prediction of Hyperglycemia and Discovery of Behavioral Treatment Pathways from Wearables and Diet

Interleaved Gibbs Diffusion: Generating Discrete-Continuous Data with Implicit Constraints

EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks

Circuit-tuning: A Mechanistic Approach for Identifying Parameter Redundancy and Fine-tuning Neural Networks

EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference

Learning Traffic Anomalies from Generative Models on Real-Time Observations

Enabling Population-Level Parallelism in Tree-Based Genetic Programming for Comprehensive GPU Acceleration

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Quantifying the Importance of Data Alignment in Downstream Model Performance

Quantum-enhanced causal discovery for a small number of samples

On Characterizations for Language Generation: Interplay of Hallucinations, Breadth, and Stability

Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs

COEF-VQ: Cost-Efficient Video Quality Understanding through a Cascaded Multimodal LLM Framework

GeMID: Generalizable Models for IoT Device Identification

Next-Token Prediction Task Assumes Optimal Data Ordering for LLM Training in Proof Generation

Is Complex Query Answering Really Complex?

Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning

Offline Reinforcement Learning for Learning to Dispatch for Job Shop Scheduling

Reconsidering the energy efficiency of spiking neural networks

Exploring the Integration of Large Language Models in Industrial Test Maintenance Processes

Sequence-aware Pre-training for Echocardiography Probe Movement Guidance

Anatomical Foundation Models for Brain MRIs

Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective

Quantifying the Cross-sectoral Intersecting Discrepancies within Multiple Groups Using Latent Class Analysis Towards Fairness

Delving into LLM-assisted writing in biomedical publications through excess vocabulary

Towards a Novel Measure of User Trust in XAI Systems

Avoiding Catastrophe in Online Learning by Asking for Help

Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning

Beyond Scale: The Diversity Coefficient as a Data Quality Metric for Variability in Natural Language Data

Kernel Density Bayesian Inverse Reinforcement Learning

Embodied AI Agents: Modeling the World

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

AI Flow: Perspectives, Scenarios, and Approaches

A framework for Conditional Reasoning in Answer Set Programming

Autoformalization in the Era of Large Language Models: A Survey

Agentic AI Process Observability: Discovering Behavioral Variability

Artificial Intelligence Index Report 2025

MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science

XGeM: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation

Direct Preference Optimization Using Sparse Feature-Level Constraints

Unsupervised Cognition

Urban Region Pre-training and Prompting: A Graph-based Approach

Road Graph Generator: Mapping roads at construction sites from GPS data

Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory

LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans

Answer Matching Outperforms Multiple Choice for Language Model Evaluation

Subtyping in DHOL -- Extended preprint

MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs

USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network

DNN-Based Precoding in RIS-Aided mmWave MIMO Systems With Practical Phase Shift

SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model

Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs

Multi-agent Auditory Scene Analysis

Fast and Simplex: 2-Simplicial Attention in Triton

Synthesizable by Design: A Retrosynthesis-Guided Framework for Molecular Analog Generation

Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics

Early Signs of Steganographic Capabilities in Frontier LLMs

Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks

FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models

APT: Adaptive Personalized Training for Diffusion Models with Limited Data

ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning

Beamforming and Resource Allocation for Delay Optimization in RIS-Assisted OFDM Systems

Created by

Haebom

Authors

Yu Ma, Xiao Li, Chongtao Guo, Le Liang, Shi Jin

Overview

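This paper investigates a joint phase design and resource allocation problem for optimizing average delay in a downlink reconfigurable intelligent surface (RIS)-assisted orthogonal frequency division multiplexing (OFDM) system in which data packets arrive stochastically at the base station. Because the sequential optimization problem is inherently a Markov decision process (MDP), it falls within the scope of reinforcement learning. To handle the mixed action space effectively and reduce the dimensionality of the state space, the authors propose a hybrid deep reinforcement learning (DRL) approach. Specifically, Proximal Policy Optimization (PPO)-$\Theta$ optimizes the RIS phase shift design, while PPO-N handles the subcarrier allocation decisions. To further mitigate the curse of dimensionality associated with subcarrier allocation, a multi-agent strategy is introduced to optimize the subcarrier allocation indicators more efficiently. In addition, to achieve more adaptive resource allocation and capture network dynamics accurately, key factors closely related to average delay, including the number of backlogged packets in the buffers and the current packet arrival rates, are incorporated into the state space. A transfer learning framework is also introduced to improve training efficiency and accelerate convergence. Simulation results show that the proposed algorithm significantly reduces average delay, improves resource allocation efficiency, and achieves better system robustness and fairness than baseline methods.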
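
The mixed action space and the delay-related state described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `HybridAction`, `build_state`, `random_action`, and `step`, the fixed number of packets served per subcarrier, the Poisson arrival model, and the negative-backlog reward (chosen because, by Little's law, average delay is proportional to average queue length for a fixed arrival rate). The RIS phases are carried in the action, but this toy service model ignores the wireless channel, so they do not affect the queues here.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class HybridAction:
    """Hypothetical container for the mixed action space."""
    phases: list      # continuous part: one RIS phase shift per element, in [0, 2*pi]
    allocation: list  # discrete part: user index assigned to each subcarrier


def build_state(queue_lengths, arrival_rates):
    """Concatenate per-user backlog and arrival rate into one state vector."""
    return list(queue_lengths) + list(arrival_rates)


def random_action(num_elements, num_subcarriers, num_users, rng):
    """Sample a random hybrid action (stand-in for the two learned policies)."""
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_elements)]
    allocation = [rng.randrange(num_users) for _ in range(num_subcarriers)]
    return HybridAction(phases, allocation)


def sample_poisson(rate, rng):
    """Draw a Poisson sample with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def step(queue_lengths, arrival_rates, action, packets_per_subcarrier, rng):
    """Serve queued packets according to the allocation, then add new arrivals.

    Assumed toy model: each subcarrier drains a fixed number of packets from
    the user it is assigned to. Returns the next queues and a reward equal to
    the negative total backlog.
    """
    served = [0] * len(queue_lengths)
    for user in action.allocation:
        served[user] += packets_per_subcarrier
    next_queues = [
        max(q - s, 0) + sample_poisson(lam, rng)
        for q, s, lam in zip(queue_lengths, served, arrival_rates)
    ]
    return next_queues, -sum(next_queues)


if __name__ == "__main__":
    rng = random.Random(0)
    queues, rates = [3, 1], [0.5, 1.0]
    for _ in range(5):
        action = random_action(num_elements=4, num_subcarriers=3, num_users=2, rng=rng)
        queues, reward = step(queues, rates, action, packets_per_subcarrier=1, rng=rng)
        print(build_state(queues, rates), reward)
```

In a learned version of this sketch, one policy would output `phases` and another would output `allocation`, which mirrors the split between PPO-$\Theta$ and PPO-N in the summary; the random sampler only stands in for them.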

Implications and Limitations

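• Implications:

◦ Presents an efficient hybrid DRL-based resource allocation and phase design algorithm that minimizes average delay in RIS-assisted OFDM systems.

◦ Uses a multi-agent strategy and transfer learning to improve the algorithm's efficiency and convergence speed.

◦ Accounts for buffer state and packet arrival rates so that network dynamics are reflected accurately.

◦ Achieves lower average delay, higher resource allocation efficiency, and better system robustness and fairness than existing methods.

• Limitations:

◦ The reported performance rests on simulation results; evaluation in real-world environments is still needed.

◦ Further study is needed to optimize the number of agents and their interaction scheme in the multi-agent strategy.

◦ A way to address the computational complexity of applying DRL in high-dimensional state spaces is needed.

◦ Because the algorithm is tuned to a specific system setting, its generalization to other system settings needs to be verified.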
