Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

The Hidden Costs of Translation Accuracy: Distillation, Quantization, and Environmental Impact

Created by
  • Haebom

Author

Dhaathri Vijay, Anandaswarup Vadapalli

Outline

This study examines the trade-off between translation quality and efficiency in large language models (LLMs), using machine translation as a case study. The authors compare full, distilled, and quantized models on the Flores+ benchmark and with human evaluations of conversational translation in French, Hindi, and Kannada. The 3.3B FP32 model achieved the highest BLEU scores but also the largest environmental footprint. The distilled 600M FP32 model reduced inference time by 71-78% and carbon emissions by 63-65% with only a minimal drop in BLEU. Aggressive quantization (INT4) likewise maintained high accuracy and fluency.
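To make the INT4 result concrete, here is a minimal sketch of symmetric INT4 weight quantization, the general technique the paper evaluates. This is not the paper's actual pipeline; the function names and sample weights are illustrative. Each float weight is mapped to an integer in [-8, 7] (the signed 4-bit range) using a single scale factor, so the rounding error per weight is bounded by half a quantization step:

```python
def quantize_int4(weights):
    """Symmetric INT4 quantization: map floats to integers in [-8, 7]
    using one shared scale (per-tensor scaling, the simplest scheme)."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return [v * scale for v in q]

# Toy weight vector standing in for one row of a model's weight matrix.
weights = [0.42, -1.3, 0.07, 2.1, -0.55]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)

# Rounding error is at most half a quantization step (scale / 2).
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing 4-bit integers plus one scale instead of 32-bit floats cuts weight memory roughly 8x, which is where the inference-time and energy savings reported above come from; the paper's finding is that translation quality survives this loss of precision surprisingly well.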

Takeaways, Limitations

Takeaways:
Model compression strategies can significantly reduce computational requirements and environmental impact while maintaining competitive translation quality.
The field needs an evaluation framework that treats efficiency and sustainability, alongside accuracy, as key metrics of NLP progress.
Limitations:
The quality-efficiency trade-offs are more pronounced for low-resource languages.