From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference

4 October 2023

Joseph McDonald

Baolin Li

Michael Jones

William Bergeron

Devesh Tiwari

Papers citing "From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference"

How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference Nidhal Jegham Marwen Abdelatti Lassad Elmoubarki Abdeltawab Hendawi 26 0 0 14 May 2025
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation Hannes Waldetoft Jakob Torgander Måns Magnusson 29 0 0 05 May 2025
Backdoor Attacks Against Patch-based Mixture of Experts Cedric Chan Jona te Lintelo S. Picek AAML MoE 151 0 0 03 May 2025
From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System Rohan Surana Junda Wu Zhouhang Xie Yu Xia Harald Steck Dawen Liang Nathan Kallus Julian McAuley 28 0 0 21 Apr 2025
Green Prompting Marta Adamska Daria Smirnova Hamid Nasiri Zhengxin Yu Peter Garraghan 160 0 0 09 Mar 2025
Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models S. Poddar Paramita Koley Janardan Misra Niloy Ganguly Saptarshi Ghosh 61 0 0 08 Feb 2025
OverThink: Slowdown Attacks on Reasoning LLMs A. Kumar Jaechul Roh A. Naseh Marzena Karpinska Mohit Iyyer Amir Houmansadr Eugene Bagdasarian LRM 64 14 0 04 Feb 2025
Large Language Models, Knowledge Graphs and Search Engines: A Crossroads for Answering Users' Questions Aidan Hogan Xin Luna Dong Denny Vrandečić Gerhard Weikum 52 1 0 12 Jan 2025
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI Arya Tschand Arun Tejusve Raghunath Rajan S. Idgunji Anirban Ghosh J. Holleman ... Rowan Taubitz Sean Zhan Scott Wasson David Kanter Vijay Janapa Reddi 62 3 0 15 Oct 2024
L3iTC at the FinLLM Challenge Task: Quantization for Financial Text Classification & Summarization Elvys Linhares Pontes Carlos-Emiliano González-Gallardo Mohamed Benjannet Caryn Qu A. Doucet 24 1 0 06 Aug 2024
Accelerating Large Language Model Inference with Self-Supervised Early Exits Florian Valade LRM 44 1 0 30 Jul 2024
Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases Xiang Zhang Khatoon Khedri Reza Rawassizadeh 32 2 0 12 Apr 2024
SpikeExplorer: hardware-oriented Design Space Exploration for Spiking Neural Networks on FPGA Dario Padovano Alessio Carpegna Alessandro Savino S. Di Carlo 42 1 0 04 Apr 2024
Towards Pareto Optimal Throughput in Small Language Model Serving Pol G. Recasens Yue Zhu Chen Wang Eun Kyung Lee Olivier Tardieu Alaa Youssef Jordi Torres Josep Ll. Berral 40 4 0 04 Apr 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes Lucio Dery Steven Kolawole Jean-Francois Kagey Virginia Smith Graham Neubig Ameet Talwalkar 41 28 0 08 Feb 2024
Adaptive Inference: Theoretical Limits and Unexplored Opportunities S. Hor Ying Qian Mert Pilanci Amin Arbabian 23 0 0 06 Feb 2024
Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models Joseph McDonald Baolin Li Nathan C. Frey Devesh Tiwari V. Gadepally S. Samsi 34 44 0 19 May 2022
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism M. Shoeybi M. Patwary Raul Puri P. LeGresley Jared Casper Bryan Catanzaro MoE 245 1,821 0 17 Sep 2019