Daily Arxiv

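This page collects artificial intelligence papers published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.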

Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges

From Roots to Rewards: Dynamic Tree Reasoning with RL

Illuminating the Three Dogmas of Reinforcement Learning under Evolutionary Light

Instance space analysis of the capacitated vehicle routing problem

Multi-Agent LLMs as Ethics Advocates for AI-Based Systems

GATSim: Urban Mobility Simulation with Generative Agents

Reasoning about Uncertainty: Do Reasoning Models Know When They Don't Know?

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Strategic Reflectivism In Intelligent Systems

SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator

What the F*ck Is Artificial General Intelligence?

Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models

From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios

To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization

BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems

UniEmoX: Cross-modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception

CorMulT: A Semi-supervised Modality Correlation-aware Multimodal Transformer for Sentiment Analysis

Toward Temporal Causal Representation Learning with Tensor Decomposition

Kolmogorov Arnold Networks (KANs) for Imbalanced Data - An Empirical Perspective

NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining

Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) Track

Multi-Centre Validation of a Deep Learning Model for Scoliosis Assessment

The Emotion-Memory Link: Do Memorability Annotations Matter for Intelligent Systems?

DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits

Edge Intelligence with Spiking Neural Networks

VLA-Mark: A cross modal watermark for large vision-language alignment model

Noradrenergic-inspired gain modulation attenuates the stability gap in joint training

A multi-strategy improved snake optimizer for 3-dimensional UAV path planning and engineering problems

Photonic Fabric Platform for AI Accelerators

OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models

A segmented robot grasping perception neural network for edge AI

Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need

DUALRec: A Hybrid Sequential and Language Model Framework for Context-Aware Movie Recommendation

Exploiting Primacy Effect To Improve Large Language Models

Generalist Forecasting with Frozen Video Models via Latent Diffusion

Convergent transformations of visual representation in brains and models

Preprint: Did I Just Browse A Website Written by LLMs?

The Levers of Political Persuasion with Conversational AI

Political Leaning and Politicalness Classification of Texts

Self-supervised learning on gene expression data

Using LLMs to identify features of personal and professional skills in an open-response situational judgment test

Real-Time Fusion of Visual and Chart Data for Enhanced Maritime Vision

When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models

SPARQL Query Generation with LLMs: Measuring the Impact of Training Data Memorization and Knowledge Injection

Scalable Submodular Policy Optimization via Pruned Submodularity Graph

RAG-based Architectures for Drug Side Effect Retrieval in LLMs

Team of One: Cracking Complex Video QA with Model Synergy

Food safety trends across Europe: insights from the 392-million-entry CompreHensive European Food Safety (CHEFS) database

One Step Closer: Creating the Future to Boost Monocular Semantic Scene Completion

Localized FNO for Spatiotemporal Hemodynamic Upsampling in Aneurysm MRI

Learning Spectral Diffusion Prior for Hyperspectral Image Reconstruction

Search-Optimized Quantization in Biomedical Ontology Alignment

SamGoG: A Sampling-Based Graph-of-Graphs Framework for Imbalanced Graph Classification

Can Synthetic Images Conquer Forgetting? Beyond Unexplored Doubts in Few-Shot Class-Incremental Learning

AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework

Point of Interest Recommendation: Pitfalls and Viable Solutions

Binarizing Physics-Inspired GNNs for Combinatorial Optimization

LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues

HeCoFuse: Cross-Modal Complementary V2X Cooperative Perception with Heterogeneous Sensors

When Person Re-Identification Meets Event Camera: A Benchmark Dataset and An Attribute-guided Re-Identification Framework

Improved particle swarm optimization algorithm: multi-target trajectory optimization for swarm drones

A Comprehensive Review of Transformer-based language models for Protein Sequence Analysis and Design

Large Language Models in Cybersecurity: Applications, Vulnerabilities, and Defense Techniques

Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters

Linguistic and Embedding-Based Profiling of Texts generated by Humans and Large Language Models

BreastSegNet: Multi-label Segmentation of Breast MRI

GIFT: Gradient-aware Immunization of diffusion models against malicious Fine-Tuning with safe concepts retention

Learning Pluralistic User Preferences through Reinforcement Learning Fine-tuned Summaries

Apple Intelligence Foundation Language Models: Tech Report 2025

Change of Thought: Adaptive Test-Time Computation

Time Series Forecastability Measures

Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder

Loss-Complexity Landscape and Model Structure Functions

Acoustic Index: A Novel AI-Driven Parameter for Cardiac Disease Risk Stratification Using Echocardiography

Humans learn to prefer trustworthy AI over human partners

PHASE: Passive Human Activity Simulation Evaluation

AI-Assisted Fixes to Code Review Comments at Scale

Neural Architecture Search with Mixed Bio-inspired Learning Rules

ERR@HRI 2.0 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Conversations

Graph Neural Network Surrogates for Contacting Deformable Bodies with Necessary and Sufficient Contact Detection

"PhyWorldBench": A Comprehensive Evaluation of Physical Realism in Text-to-Video Models

CaSTFormer: Causal Spatio-Temporal Transformer for Driving Intention Prediction

Air Traffic Controller Task Demand via Graph Neural Networks: An Interpretable Approach to Airspace Complexity

AI-ming backwards: Vanishing archaeological landscapes in Mesopotamia and automatic detection of sites on CORONA imagery

Soft-ECM: An extension of Evidential C-Means for complex data

Single- to multi-fidelity history-dependent learning with uncertainty quantification and disentanglement: application to data-driven constitutive modeling

SEER: Semantic Enhancement and Emotional Reasoning Network for Multimodal Fake News Detection

Gauge Flow Models

Aligning Knowledge Graphs and Language Models for Factual Accuracy

Causal Language Control in Multilingual Transformers via Sparse Feature Steering

A Deep Learning-Based Ensemble System for Automated Shoulder Fracture Detection in Clinical Radiographs

IConMark: Robust Interpretable Concept-Based Watermark For AI Images

Mitigating Stylistic Biases of Machine Translation Systems via Monolingual Corpora Only

TopicImpact: Improving Customer Feedback Analysis with Opinion Units for Topic Modeling and Star-Rating Prediction

Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models

Persona-Based Synthetic Data Generation Using Multi-Stage Conditioning with Large Language Models for Emotion Recognition

Smart Routing for Multimodal Video Retrieval: When to Search What

Enhancing Breast Cancer Detection with Vision Transformers and Graph Neural Networks

Transformer-Based Framework for Motion Capture Denoising and Anomaly Detection in Medical Rehabilitation

High-Throughput LLM inference on Heterogeneous Clusters

Created by

Haebom

Authors

Yi Xiong, Jinqi Huang, Wenjie Huang, Xuebing Yu, Entong Li, Zhixiong Ning, Jinhua Zhou, Li Zeng, Xin Chen

Abstract

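This paper proposes a high-throughput inference system for serving large language models (LLMs) on heterogeneous clusters. First, the system models the available resources and the expected throughput, then uses exhaustive search to optimize the deployment configuration. Second, it introduces a new request scheduling mechanism that accounts for the differing processing capabilities of individual instances. In experiments, the proposed scheduler improved throughput by 122.5% and 33.6% on two heterogeneous clusters, respectively. The main challenges are that performance varies widely across the possible deployment configurations of a heterogeneous cluster, and that differences in per-instance processing capability make efficient request scheduling difficult.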
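To make the configuration step concrete, here is a minimal sketch of exhaustive search over deployment configurations. It is not the authors' implementation: the cluster description, the memory requirement, the efficiency factor, and the toy throughput model (`CLUSTER`, `MODEL_MEMORY_GB`, `PARALLEL_EFFICIENCY`, `instance_throughput`) are all hypothetical values chosen only for illustration.

```python
from itertools import product

# Hypothetical cluster: device type -> (device count, memory per device in GB,
# assumed tokens/s contributed by one device). These numbers are illustrative
# and do not come from the paper.
CLUSTER = {
    "fast": (4, 80, 1000.0),
    "slow": (8, 32, 300.0),
}
MODEL_MEMORY_GB = 60        # assumed memory needed by one model replica
PARALLEL_EFFICIENCY = 0.9   # assumed scaling loss per doubling of devices
TP_CHOICES = (1, 2, 4, 8)   # candidate devices-per-instance values


def instance_throughput(per_device_tps, tp):
    """Toy throughput model: linear scaling, discounted once per doubling."""
    doublings = tp.bit_length() - 1
    return per_device_tps * tp * (PARALLEL_EFFICIENCY ** doublings)


def evaluate(config):
    """Return the expected cluster throughput, or None if the config is infeasible."""
    total = 0.0
    for name, tp in config.items():
        count, mem_gb, tps = CLUSTER[name]
        # An instance must fit on the available devices and hold the model.
        if tp > count or tp * mem_gb < MODEL_MEMORY_GB:
            return None
        instances = count // tp
        total += instances * instance_throughput(tps, tp)
    return total


def exhaustive_search():
    """Try every combination of per-type choices and keep the best feasible one."""
    names = list(CLUSTER)
    best_config, best_tps = None, 0.0
    for choice in product(TP_CHOICES, repeat=len(names)):
        config = dict(zip(names, choice))
        tps = evaluate(config)
        if tps is not None and tps > best_tps:
            best_config, best_tps = config, tps
    return best_config, best_tps
```

Calling `exhaustive_search()` enumerates `len(TP_CHOICES) ** len(CLUSTER)` candidates, so in this sketch the search space grows exponentially with the number of device types.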

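Takeaways, Limitations

• Takeaways:

  ◦ Presents an effective system that substantially improves the throughput of LLM inference serving on heterogeneous clusters.

  ◦ Optimizing the deployment configuration with exhaustive search, combined with a scheduling mechanism that accounts for per-instance processing capability, yields substantial performance gains.

  ◦ The proposed system can help reduce the cost of LLM inference serving and speed up task processing.

• Limitations:

  ◦ The computational cost of exhaustive search may grow exponentially as the cluster scales.

  ◦ The scheduling mechanism's performance depends on accurately predicting each instance's processing capability; prediction errors may degrade performance.

  ◦ Further research is needed on how well the approach generalizes to other kinds of heterogeneous clusters.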
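The idea of scheduling requests according to each instance's processing capability can be sketched as follows. This is a simple greedy heuristic (send each request to the instance expected to finish it first), offered as an assumption-laden illustration and not as the paper's scheduler; the `Instance` class, its `capacity` estimate, and the `schedule` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    capacity: float              # assumed tokens/s estimate for this instance
    pending_tokens: float = 0.0  # work already queued on this instance

    def finish_time_if_assigned(self, tokens):
        """Estimated time to drain the queue if this request is added."""
        return (self.pending_tokens + tokens) / self.capacity


def schedule(requests, instances):
    """Assign each request (a token count) to the instance expected to finish it first."""
    placement = []
    for tokens in requests:
        target = min(instances, key=lambda inst: inst.finish_time_if_assigned(tokens))
        target.pending_tokens += tokens
        placement.append(target.name)
    return placement
```

For example, `schedule([200] * 5, [Instance("fast", 250.0), Instance("slow", 100.0)])` sends most requests to the faster instance while still using the slower one once the fast queue grows long enough. Because the decision relies on `capacity`, an inaccurate estimate skews the placement, which matches the limitation noted above about prediction errors.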
