Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.

Joint Memory Frequency and Computing Frequency Scaling for Energy-efficient DNN Inference

Created by
  • Haebom

Author

Yunchu Han, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

Outline

Deep neural networks (DNNs) are widely used across many fields, but they incur high latency and energy overhead on resource-constrained devices. Most prior work focuses on dynamic voltage and frequency scaling (DVFS), which balances latency and energy consumption by varying the processor's computing frequency; memory frequency scaling, by contrast, is often neglected as a lever for improving DNN inference efficiency. This paper investigates, using both model-based and data-driven methods, how jointly scaling the memory and computing frequencies affects inference time and energy consumption. It then conducts a preliminary analysis of the proposed model by combining the fitted parameters of various DNN models and verifies the effectiveness of joint scaling. Finally, simulation results for local and collaborative inference confirm that jointly scaling memory and computing frequencies reduces device energy consumption.
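
To make the joint-scaling idea concrete, here is a minimal sketch, assuming a common DVFS-style cost model rather than the paper's actual one: inference time splits into a compute-bound term C/f_c and a memory-bound term M/f_m, dynamic power grows cubically with each frequency, and the device exposes discrete frequency levels. All workload constants, power coefficients, and frequency grids below are hypothetical.

```python
"""Minimal joint frequency-scaling sketch; all numbers are hypothetical."""
import itertools

# Hypothetical workload parameters for one inference pass.
C = 0.4e9       # compute work, in processor cycles
M = 0.2e9       # memory traffic, in memory-clock cycles
DEADLINE = 0.5  # latency budget (s)

# Hypothetical power-model coefficients (dynamic power ~ kappa * f^3).
KAPPA_C = 1.0e-27   # compute dynamic-power coefficient (W / Hz^3)
KAPPA_M = 4.0e-28   # memory dynamic-power coefficient (W / Hz^3)
P_STATIC = 0.5      # static power (W)

# Discrete frequency levels, as real devices expose them (Hz).
F_COMPUTE = [0.8e9, 1.2e9, 1.6e9, 2.0e9]
F_MEMORY = [0.4e9, 0.8e9, 1.2e9, 1.6e9]

def latency(f_c, f_m):
    # Compute-bound time plus memory-bound time.
    return C / f_c + M / f_m

def energy(f_c, f_m):
    # Energy = (dynamic + static power) * execution time.
    p_dyn = KAPPA_C * f_c ** 3 + KAPPA_M * f_m ** 3
    return (p_dyn + P_STATIC) * latency(f_c, f_m)

# Exhaustive search over the joint grid, keeping only deadline-feasible pairs.
feasible = [(f_c, f_m) for f_c, f_m in itertools.product(F_COMPUTE, F_MEMORY)
            if latency(f_c, f_m) <= DEADLINE]
f_c, f_m = min(feasible, key=lambda p: energy(*p))
print(f"best: f_c={f_c/1e9:.1f} GHz, f_m={f_m/1e9:.1f} GHz, "
      f"T={latency(f_c, f_m)*1e3:.0f} ms, E={energy(f_c, f_m):.2f} J")
```

Under these made-up constants the optimum is an interior pair (1.2 GHz, 1.2 GHz) rather than either extreme, which illustrates why sweeping both frequencies together can beat compute-only DVFS with the memory clock pinned at its maximum.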

Takeaways, Limitations

  • Demonstrates that jointly scaling memory and computing frequencies can reduce energy consumption.
  • Analyzes the problem with both model-based and data-driven methods.
  • Validates the approach through simulations of local and collaborative inference (a toy split-point sketch follows this list).
  • Combines the fitted parameters of various DNN models, suggesting the analysis generalizes across models.
  • The summary does not state the paper's specific limitations.
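
On the collaborative-inference side, the summary gives no details of the paper's simulation setup, so the following is only a toy sketch of the standard device/edge split formulation: run the first k layers on the device, upload the intermediate activation, finish on an edge server, and choose the split that minimizes device-side energy within a latency budget. Every number below (layer costs, link rate, powers) is invented for illustration.

```python
"""Collaborative-inference split sketch; all per-layer numbers are made up."""

# Hypothetical per-layer device time (s), device energy (J), and output size (bits).
DEV_TIME   = [0.02, 0.05, 0.04, 0.03, 0.01]
DEV_ENERGY = [0.10, 0.30, 0.25, 0.20, 0.05]
OUT_BITS   = [8e6, 4e6, 2e6, 1e6, 1e4]

SRV_TIME = [0.004, 0.010, 0.008, 0.006, 0.002]  # hypothetical server per-layer time (s)
INPUT_BITS = 16e6   # raw input size (bits)
UPLINK_BPS = 20e6   # uplink rate (bits/s)
P_TX = 1.0          # radio transmit power (W)
DEADLINE = 0.25     # end-to-end latency budget (s)

def split_cost(k):
    """Latency and device energy when the first k layers run on the device."""
    n = len(DEV_TIME)
    # Bits uploaded at the split: the raw input if k == 0, layer k's output
    # otherwise, and nothing if the whole network runs locally.
    bits = INPUT_BITS if k == 0 else (OUT_BITS[k - 1] if k < n else 0.0)
    tx_time = bits / UPLINK_BPS
    total_latency = sum(DEV_TIME[:k]) + tx_time + sum(SRV_TIME[k:])
    device_energy = sum(DEV_ENERGY[:k]) + P_TX * tx_time
    return total_latency, device_energy

# Pick the deadline-feasible split that minimizes device-side energy.
candidates = [(k, *split_cost(k)) for k in range(len(DEV_TIME) + 1)]
k, t, e = min((c for c in candidates if c[1] <= DEADLINE), key=lambda c: c[2])
print(f"split after layer {k}: latency {t*1e3:.0f} ms, device energy {e:.2f} J")
```

With these invented numbers the best split keeps three layers on the device: uploading earlier costs too much transmit time and energy, while running everything locally spends more compute energy than the radio saves.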