Daily Arxiv
This page collects artificial-intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright in each paper belongs to its authors and their institutions; please credit the source when sharing.
AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation
Self-Guided Process Reward Optimization with Redefined Step-wise Advantage for Process Reinforcement Learning
Crafting Hanzi as Narrative Bridges: An AI Co-Creation Workshop for Elderly Migrants
Distributional Soft Actor-Critic with Diffusion Policy
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
Fast AI Model Splitting over Edge Networks
From Sentences to Sequences: Rethinking Languages in Biological System
MTCNet: Motion and Topology Consistency Guided Learning for Mitral Valve Segmentation in 4D Ultrasound
Horus: A Protocol for Trustless Delegation Under Uncertainty
Mixture of Reasonings: Teach Large Language Models to Reason with Adaptive Strategies
Benchmarking Generalizable Bimanual Manipulation: RoboTwin Dual-Arm Collaboration Challenge at CVPR 2025 MEIS Workshop
Red Teaming for Generative AI, Report on a Copyright-Focused Exercise Completed in an Academic Medical Center
AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration
Semantic Structure-Aware Generative Attacks for Enhanced Adversarial Transferability
Aligning Frozen LLMs by Reinforcement Learning: An Iterative Reweight-then-Optimize Approach
Distinguishing Predictive and Generative AI in Regulation
AIn't Nothing But a Survey? Using Large Language Models for Coding German Open-Ended Survey Responses on Survey Motivation
Text-Aware Image Restoration with Diffusion Models
How Good LLM-Generated Password Policies Are?
Towards an Explainable Comparison and Alignment of Feature Embeddings
Gradient-Based Model Fingerprinting for LLM Similarity Detection and Family Classification
Empowering Intelligent Low-altitude Economy with Large AI Model Deployment
Incorporating LLMs for Large-Scale Urban Complex Mobility Simulation
Generating Hypotheses of Dynamic Causal Graphs in Neuroscience: Leveraging Generative Factor Models of Observed Time Series
Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs
Threat Modeling for AI: The Case for an Asset-Centric Approach
SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings
PAD: Phase-Amplitude Decoupling Fusion for Multi-Modal Land Cover Classification
Significativity Indices for Agreement Values
Transferrable Surrogates in Expressive Neural Architecture Search Spaces
Privacy-Preserving Operating Room Workflow Analysis using Digital Twins
Uncertainty-Guided Coarse-to-Fine Tumor Segmentation with Anatomy-Aware Post-Processing
CMD-HAR: Cross-Modal Disentanglement for Wearable Human Activity Recognition
Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models
Understanding-informed Bias Mitigation for Fair CMR Segmentation
HAPI: A Model for Learning Robot Facial Expressions from Human Preferences
MaizeField3D: A Curated 3D Point Cloud and Procedural Model Dataset of Field-Grown Maize from a Diversity Panel
Illuminant and light direction estimation using Wasserstein distance method
Fundamental Limits of Hierarchical Secure Aggregation with Cyclic User Association
LLM-Powered Prediction of Hyperglycemia and Discovery of Behavioral Treatment Pathways from Wearables and Diet
Interleaved Gibbs Diffusion: Generating Discrete-Continuous Data with Implicit Constraints
EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Networks
Circuit-tuning: A Mechanistic Approach for Identifying Parameter Redundancy and Fine-tuning Neural Networks
EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference
Learning Traffic Anomalies from Generative Models on Real-Time Observations
Enabling Population-Level Parallelism in Tree-Based Genetic Programming for Comprehensive GPU Acceleration
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Quantifying the Importance of Data Alignment in Downstream Model Performance
Quantum-enhanced causal discovery for a small number of samples
On Characterizations for Language Generation: Interplay of Hallucinations, Breadth, and Stability
Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs
COEF-VQ: Cost-Efficient Video Quality Understanding through a Cascaded Multimodal LLM Framework
GeMID: Generalizable Models for IoT Device Identification
Next-Token Prediction Task Assumes Optimal Data Ordering for LLM Training in Proof Generation
Is Complex Query Answering Really Complex?
Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning
Offline Reinforcement Learning for Learning to Dispatch for Job Shop Scheduling
Reconsidering the energy efficiency of spiking neural networks
Exploring the Integration of Large Language Models in Industrial Test Maintenance Processes
Sequence-aware Pre-training for Echocardiography Probe Movement Guidance
Anatomical Foundation Models for Brain MRIs
Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective
Quantifying the Cross-sectoral Intersecting Discrepancies within Multiple Groups Using Latent Class Analysis Towards Fairness
Delving into LLM-assisted writing in biomedical publications through excess vocabulary
Towards a Novel Measure of User Trust in XAI Systems
Avoiding Catastrophe in Online Learning by Asking for Help
Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning
Beyond Scale: The Diversity Coefficient as a Data Quality Metric for Variability in Natural Language Data
Kernel Density Bayesian Inverse Reinforcement Learning
Embodied AI Agents: Modeling the World
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
AI Flow: Perspectives, Scenarios, and Approaches
A framework for Conditional Reasoning in Answer Set Programming
Autoformalization in the Era of Large Language Models: A Survey
Agentic AI Process Observability: Discovering Behavioral Variability
Artificial Intelligence Index Report 2025
MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science
XGeM: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation
Direct Preference Optimization Using Sparse Feature-Level Constraints
Unsupervised Cognition
Urban Region Pre-training and Prompting: A Graph-based Approach
Road Graph Generator: Mapping roads at construction sites from GPS data
Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory
LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans
Answer Matching Outperforms Multiple Choice for Language Model Evaluation
Subtyping in DHOL - Extended preprint
MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs
USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network
DNN-Based Precoding in RIS-Aided mmWave MIMO Systems With Practical Phase Shift
SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model
Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs
Multi-agent Auditory Scene Analysis
Fast and Simplex: 2-Simplicial Attention in Triton
Synthesizable by Design: A Retrosynthesis-Guided Framework for Molecular Analog Generation
Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics
Early Signs of Steganographic Capabilities in Frontier LLMs
Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks
FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models
APT: Adaptive Personalized Training for Diffusion Models with Limited Data
ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning
Fast AI Model Splitting over Edge Networks
Created by Haebom
Authors
Zuguang Li, Wen Wu, Shaohua Wu, Songge Zhang, Ye Wang, Xuemin (Sherman) Shen
Overview
This paper proposes an algorithm for efficiently finding the optimal model split in split learning (SL), a form of distributed learning. An arbitrary AI model is represented as a directed acyclic graph (DAG), and the optimal model-splitting problem is recast as a minimum s-t cut problem. The proposed DAG-based algorithm reconstructs the DAG and finds the optimal split with a max-flow method. A theoretical analysis proves the algorithm's optimality, and a block-wise splitting algorithm is additionally presented for AI models with block structure to reduce computational complexity. Experiments show that the proposed algorithm finds the optimal model split within milliseconds and, compared with state-of-the-art methods, reduces training latency in dynamic edge networks by 24.62% to 38.95%.
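To make the key reduction above concrete (model → DAG → minimum s-t cut solved by max-flow), here is a minimal sketch in Python using networkx. It is an illustration only: the three-layer toy model, layer names, and all costs are invented, input-upload and server-to-device transfer costs are omitted, and none of the paper's DAG-reconstruction or block-wise machinery is reproduced. In this encoding, cutting s→v places layer v on the server, cutting v→t keeps it on the device, and a cut forward edge u→v charges the cost of shipping u's activation across the network, so the min-cut value acts as a proxy for end-to-end latency.

```python
# Minimal sketch of a min s-t cut formulation for model splitting.
# Hypothetical toy example; not the paper's actual algorithm.
import networkx as nx

# Toy model DAG: layer -> (device_cost, server_cost), plus per-edge
# communication cost for sending an activation device -> server.
layers = {
    "conv1": (8.0, 2.0),
    "conv2": (9.0, 2.5),
    "fc":    (4.0, 1.0),
}
dag_edges = {("conv1", "conv2"): 3.0, ("conv2", "fc"): 1.5}

G = nx.DiGraph()
for layer, (device_cost, server_cost) in layers.items():
    # Cutting s->layer places the layer on the server (pay server cost);
    # cutting layer->t keeps it on the device (pay device cost).
    G.add_edge("s", layer, capacity=server_cost)
    G.add_edge(layer, "t", capacity=device_cost)
for (u, v), comm_cost in dag_edges.items():
    # Cut when u stays on the device and v moves to the server:
    # the activation of u must cross the network. (Reverse,
    # server->device transfers are ignored in this sketch.)
    G.add_edge(u, v, capacity=comm_cost)

cut_value, (device_side, server_side) = nx.minimum_cut(G, "s", "t")
print(f"total latency proxy: {cut_value}")
print("on device:", sorted(device_side - {"s"}))
print("on server:", sorted(server_side - {"t"}))
```

Solving the cut with a max-flow routine, as above, is what lets the split be recomputed in milliseconds when network conditions change; the paper's block-wise variant presumably shrinks the graph further for block-structured models.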
Takeaways, Limitations
• Takeaways:
◦ A new algorithm that efficiently solves the optimal model-splitting problem by representing the model as a DAG and recasting the problem as a min-cut problem.
◦ Reduced computational complexity for block-structured AI models via a block-wise splitting algorithm.
◦ Experimental validation of a substantial reduction in training latency.
◦ Derivation of the optimal model split within milliseconds.
• Limitations:
◦ The algorithm's optimality is proven theoretically, but further study is needed on how well it generalizes to diverse real-world models and environments.
◦ Applicability and efficiency for models without block structure remain to be examined.
◦ Given the specificity of the experimental setup, performance should be evaluated in a wider range of environments.
◦ The complexity of the DAG-reconstruction step needs further analysis.
View PDF