HEC-Ecole de gestion de l'Université de Liège
MASTER THESIS

LLM Size Reduction & Carbon Footprint

Dosquet, Pierre ULiège
Promotor(s) : Ittoo, Ashwin ULiège
Date of defense : 20-Jun-2025/24-Jun-2025 • Permalink : http://hdl.handle.net/2268.2/22785
Details
Committee's member(s) : Smitz, Joseph ULiège
Language : English
Number of pages : 68
Keywords : large language models; compression methods; carbon footprint; energy consumption; inference
Discipline(s) : Business & economic sciences > Multidisciplinary, general & others
Target public : Researchers
Institution(s) : Université de Liège, Liège, Belgique
Degree: Master en ingénieur de gestion, à finalité spécialisée en digital business (Master in Business Engineering, specialized focus in digital business)
Faculty: HEC-Ecole de gestion de l'Université de Liège

Abstract

In recent years, model compression techniques have proven highly effective at reducing the storage footprint of large language models and accelerating inference. By lowering memory requirements and increasing computational speed, these methods offer a pathway to more energy-efficient large models. However, altering model parameters through compression inherently risks degrading performance, so the trade-off between efficiency gains and potential quality loss remains central when adopting compression strategies.
While the environmental impact of training large models has received growing attention, the effects of compression on hardware energy consumption and the related greenhouse-gas (GHG) emissions during inference remain largely unexplored.
To address this gap, I conducted an empirical study investigating the effects of compression on both model performance and energy consumption across hardware components—specifically CPU, GPU, and RAM. Four decoder-only transformer models were selected for their academic relevance and maturity: LLaMA-7B, LLaMA-30B, Mistral-7B-v0.3, and Mistral Small 3. Each model was compressed using the OPTQ method at 4-bit precision.
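The thesis applies OPTQ, which quantizes weights layer by layer while minimizing reconstruction error using second-order information. The sketch below is deliberately simpler — plain symmetric round-to-nearest 4-bit quantization — and is only meant to illustrate what "4-bit precision" means for weight storage; it is not the OPTQ algorithm itself, and all values are illustrative.

```python
import numpy as np

def quantize_4bit(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric round-to-nearest 4-bit quantization of a weight tensor.

    NOT the OPTQ method (which minimizes layer-wise reconstruction
    error); this only shows the storage side of 4-bit quantization.
    """
    qmax = 7  # signed 4-bit integers span [-8, 7]
    scale = float(np.abs(weights).max()) / qmax
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map 4-bit integers back to approximate float weights."""
    return q.astype(np.float32) * scale

# Each 4-bit weight needs a quarter of the memory of a 16-bit weight,
# which is where the storage and memory-bandwidth savings come from.
w = np.array([0.12, -0.53, 0.90, -0.07], dtype=np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
```

The reconstruction error per weight is bounded by half the quantization step, which is why, as the results below show, well-calibrated 4-bit models lose very little benchmark accuracy.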
Model performance was evaluated using WikiText-2 perplexity, and MMLU and IFEval accuracy. Energy consumption was measured using CodeCarbon during inference on two high-end hardware configurations. To contextualize these results, I conducted a comparative carbon footprint analysis using four electricity mixes, offering a grid-aware perspective on compression-related emissions in CO₂eq.
The findings show that (1) quantization-induced performance degradation is marginal: compressed models retain nearly all capabilities across benchmarks. (2) Compression can substantially reduce energy use, but the magnitude depends on hardware: one configuration yielded up to 39% savings, while another saw a 26% increase. (3) Environmental impact hinges not only on model and energy use but also on deployment geography: a compressed model on a carbon-intensive grid can emit up to six times more than a full-sized model on a clean grid. The sustainability benefits of compression must therefore be assessed in relation to both hardware and geographic context.
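Finding (3) rests on simple arithmetic: emissions in CO₂eq are the measured energy consumption multiplied by the carbon intensity of the local electricity mix. A minimal sketch, with hypothetical energy figures and grid intensities (the thesis used real measurements from CodeCarbon and four actual electricity mixes):

```python
# Hypothetical grid carbon intensities in gCO2eq per kWh (assumed values,
# chosen only to illustrate the mechanism, not taken from the thesis).
GRID_INTENSITY_G_PER_KWH = {
    "low_carbon_grid": 30.0,        # e.g. a hydro/nuclear-heavy mix
    "carbon_intensive_grid": 700.0, # e.g. a coal-heavy mix
}

def emissions_g_co2eq(energy_kwh: float, grid: str) -> float:
    """CO2-equivalent emissions = energy consumed x grid carbon intensity."""
    return energy_kwh * GRID_INTENSITY_G_PER_KWH[grid]

# A compressed model that halves energy use can still emit far more
# overall if it is deployed on a dirtier grid:
compressed_dirty = emissions_g_co2eq(0.5, "carbon_intensive_grid")
full_clean = emissions_g_co2eq(1.0, "low_carbon_grid")
```

With these assumed numbers the compressed model on the dirty grid emits over ten times more than the full model on the clean grid, which is the grid-aware effect the abstract describes (the thesis reports a factor of up to six for its measured configurations).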


File(s)

Master_Thesis_s202250.pdf (Adobe PDF, 2.88 MB)

All documents available on MatheO are protected by copyright and subject to the usual rules for fair use.
The University of Liège does not guarantee the scientific quality of these students' works or the accuracy of all the information they contain.