Daily Arxiv

This page curates papers on artificial intelligence published worldwide.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.

Joint Memory Frequency and Computing Frequency Scaling for Energy-efficient DNN Inference

Created by
  • Haebom

Authors

Yunchu Han, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

Outline

This paper argues that memory frequency scaling, in addition to processor frequency scaling, is essential for addressing the high latency and energy consumption of deep neural network (DNN) inference in resource-constrained environments. Using a combined model- and data-driven approach, the authors characterize how jointly tuning memory and compute frequencies affects inference time and energy consumption, and they compare the fitted model parameters across various DNN models to assess how broadly the joint-tuning model applies. Finally, simulation results for both local and collaborative inference verify that jointly scaling memory and compute frequencies reduces energy consumption.
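
To make the joint-tuning idea concrete, here is a minimal sketch of the kind of optimization involved. It is not the paper's method: the latency model t = a/f_c + b/f_m, the power model (cubic in compute frequency, linear in memory frequency), the frequency levels, and all parameter values are illustrative assumptions. An exhaustive search then picks the frequency pair that minimizes energy per inference under a latency deadline.

```python
# Hypothetical sketch of joint memory/compute frequency selection for
# energy-efficient DNN inference. The latency model t = a/f_c + b/f_m and
# the power model (cubic in compute frequency, linear in memory frequency)
# are illustrative assumptions, not the paper's fitted models.

import itertools

COMPUTE_FREQS = [0.6, 0.9, 1.2, 1.5]  # GHz, hypothetical DVFS levels
MEMORY_FREQS = [0.8, 1.1, 1.6, 2.1]   # GHz

A, B = 2.0, 1.5             # made-up compute- and memory-bound coefficients
P0, P1, P2 = 0.5, 0.8, 0.4  # made-up static / compute / memory power terms
LATENCY_BUDGET = 3.0        # seconds, deadline for one inference


def latency(f_c: float, f_m: float) -> float:
    """Inference time: compute term scales as 1/f_c, memory term as 1/f_m."""
    return A / f_c + B / f_m


def power(f_c: float, f_m: float) -> float:
    """Device power: cubic in compute frequency, linear in memory frequency."""
    return P0 + P1 * f_c**3 + P2 * f_m


def best_frequency_pair():
    """Exhaustively search frequency pairs, minimizing energy per inference
    subject to the latency deadline."""
    best = None
    for f_c, f_m in itertools.product(COMPUTE_FREQS, MEMORY_FREQS):
        t = latency(f_c, f_m)
        if t > LATENCY_BUDGET:
            continue  # this pair misses the deadline
        energy = power(f_c, f_m) * t
        if best is None or energy < best[0]:
            best = (energy, f_c, f_m, t)
    return best


if __name__ == "__main__":
    energy, f_c, f_m, t = best_frequency_pair()
    print(f"f_c={f_c} GHz, f_m={f_m} GHz -> {t:.2f} s, {energy:.2f} J")
```

Under these assumed numbers the minimizer pairs a low compute frequency with a high memory frequency, illustrating why tuning the two knobs jointly can beat scaling either one alone.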

Takeaways, Limitations

Takeaways:
  • Demonstrates the importance of accounting for memory frequency scaling, not just compute frequency scaling, when optimizing DNN inference.
  • Proposes an energy-efficient DNN inference method based on jointly tuning memory and compute frequencies.
  • Validates the effectiveness of the proposed method in both local and collaborative inference settings (see the sketch after this list).
Limitations:
  • The applicability of the proposed model may be limited to specific DNN models and hardware platforms.
  • The analysis is based on simulation results, so performance in real environments requires further verification.
  • Extensive experiments across diverse DNN models and hardware platforms are lacking.
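
For the collaborative case, a similarly hypothetical sketch extends the search with a split point: the first k layers run on-device at the chosen frequencies and the rest are offloaded to an edge server. The layer costs, link rate, radio power, and edge speed below are all made-up illustrative values, not figures from the paper.

```python
# Hypothetical sketch extending the joint frequency search to collaborative
# inference: the first k layers run on-device, the remainder on an edge
# server. All layer costs, link figures, and power parameters are made-up
# illustrations, not values from the paper.

import itertools

COMPUTE_FREQS = [0.6, 0.9, 1.2, 1.5]  # GHz, hypothetical DVFS levels
MEMORY_FREQS = [0.8, 1.1, 1.6, 2.1]   # GHz

# Per-layer (compute work, memory work, output size in MB); made-up values.
LAYERS = [
    (1.0, 0.8, 4.0),
    (2.0, 1.2, 2.0),
    (1.5, 0.6, 0.5),
    (0.5, 0.3, 0.1),
]
INPUT_MB = 12.0            # raw input size if all layers are offloaded
TX_POWER = 1.2             # W, radio power while uploading
TX_RATE = 2.0              # MB/s, uplink rate
EDGE_TIME_PER_WORK = 0.05  # s per unit of compute work done at the edge


def device_power(f_c: float, f_m: float) -> float:
    """Assumed device power: cubic in compute frequency, linear in memory."""
    return 0.5 + 0.8 * f_c**3 + 0.4 * f_m


def best_split(latency_budget: float = 4.0):
    """Jointly choose the split layer k and the frequency pair that minimize
    device-side energy while meeting the end-to-end deadline."""
    best = None
    for k in range(len(LAYERS) + 1):  # k = number of layers run locally
        out_mb = LAYERS[k - 1][2] if k > 0 else INPUT_MB
        tx_time = out_mb / TX_RATE
        edge_time = EDGE_TIME_PER_WORK * sum(c for c, _, _ in LAYERS[k:])
        for f_c, f_m in itertools.product(COMPUTE_FREQS, MEMORY_FREQS):
            local_time = sum(c / f_c + m / f_m for c, m, _ in LAYERS[:k])
            total_time = local_time + tx_time + edge_time
            if total_time > latency_budget:
                continue  # end-to-end deadline violated
            energy = device_power(f_c, f_m) * local_time + TX_POWER * tx_time
            if best is None or energy < best[0]:
                best = (energy, k, f_c, f_m, total_time)
    return best


if __name__ == "__main__":
    energy, k, f_c, f_m, t = best_split()
    print(f"split after layer {k}: f_c={f_c} GHz, f_m={f_m} GHz, "
          f"{t:.2f} s, {energy:.2f} J on device")
```

With these numbers the search offloads after the first layer and clocks the device below its maximum frequency, showing how the split decision and the frequency pair interact.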