Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Intra-DP: A High Performance Collaborative Inference System for Mobile Edge Computing

Created by
  • Haebom

Authors

Zekai Sun, Xiuxian Guan, Zheng Lin, Zihan Fang, Xiangming Cai, Zhe Chen, Fangming Liu, Heming Cui, Jie Xiong, Wei Ni, Chau Yuen

Outline

This paper proposes Intra-DP, a high-performance collaborative inference system that addresses the limited computational resources and battery life of resource-constrained mobile devices running deep neural networks (DNNs) in real time. Existing collaborative inference approaches based on Mobile Edge Computing (MEC) rely on layer-by-layer model partitioning, which causes transmission bottlenecks because DNN operations must execute sequentially. Intra-DP instead employs a novel parallel computing technique based on local operators (operators, such as convolution kernels, whose minimum input unit is a small region of the tensor rather than the entire input tensor). It decomposes the computation into multiple independent sub-operations and overlaps the computation and transmission of these sub-operations through parallel execution, mitigating the transmission bottleneck in MEC and ultimately enabling fast and energy-efficient inference.
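
To make the compute/transmission overlap concrete, below is a minimal, hypothetical Python sketch, not the authors' implementation. It splits an input along its height into tiles that a toy local operator can process independently, then ships each finished tile to the edge server on a background thread while the next tile is still being computed. All names here (`conv_tile`, `send_to_server`, `split_into_tiles`) are illustrative stand-ins.

```python
# Sketch only: overlap local computation of independent sub-operations with
# transmission of their results, the core idea behind local-operator parallelism.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

KERNEL = 3  # 3x3 local operator; each output row needs only KERNEL input rows


def conv_tile(tile: np.ndarray) -> np.ndarray:
    """Toy 'local operator': a mean filter over each valid 3x3 window."""
    h, w = tile.shape
    out = np.empty((h - KERNEL + 1, w - KERNEL + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = tile[i:i + KERNEL, j:j + KERNEL].mean()
    return out


def send_to_server(result: np.ndarray, tile_id: int) -> None:
    """Hypothetical transmission stub; a real system would stream over the network."""
    print(f"tile {tile_id}: sent result of shape {result.shape} to edge server")


def split_into_tiles(x: np.ndarray, n_tiles: int):
    """Split along height with (KERNEL - 1) rows of overlap so tiles are independent."""
    step = x.shape[0] // n_tiles
    for t in range(n_tiles):
        start = t * step
        stop = min(x.shape[0], start + step + KERNEL - 1)
        yield t, x[start:stop]


if __name__ == "__main__":
    image = np.random.rand(64, 64)
    with ThreadPoolExecutor(max_workers=2) as pool:
        pending = []
        for tile_id, tile in split_into_tiles(image, n_tiles=4):
            result = conv_tile(tile)  # compute this sub-operation locally
            # transmission runs in the background while the next tile computes
            pending.append(pool.submit(send_to_server, result, tile_id))
        for p in pending:
            p.result()
```

In this sketch the overlap comes from threading alone; the actual system decides how to partition and schedule sub-operations between the mobile device and the edge server, which this example does not attempt to model.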

Takeaways, Limitations

Takeaways:
  • Presents a novel method that dramatically reduces the latency and energy consumption of DNN inference in MEC environments (up to 50% latency reduction and up to 75% energy reduction).
  • Effectively resolves transmission bottlenecks through local-operator-based parallel computing.
  • Achieves these performance improvements without compromising accuracy.
Limitations:
  • The performance gains of Intra-DP may vary with the specific DNN model and MEC environment; further research is needed to establish generalizability.
  • Optimization techniques for local-operator decomposition and parallel scheduling may require further study.
  • Additional experimental validation on a wider range of mobile devices and network conditions is required.