The University of Rhode Island’s AI lab estimates that GPT-5 averages just over 18 Wh per query, so putting all of ChatGPT’s reported 2.5 billion requests a day through the model could see energy usage as high as 45 GWh.
A daily energy use of 45 GWh is enormous. A typical modern nuclear power plant produces between 1 and 1.6 GW of electricity per reactor, so data centers running OpenAI’s GPT-5 at 18 Wh per query could require the power equivalent of two to three nuclear power reactors, an amount that could be enough to power a small country.
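The arithmetic behind those figures can be checked with a quick sketch. The per-query energy, the query count, and the reactor output range are the ones quoted above; the function and variable names are made up for illustration, and the result is an average draw over 24 hours, not a peak.

```python
# Back-of-the-envelope check of the figures quoted above.
WH_PER_QUERY = 18            # Wh per query (estimate quoted above)
QUERIES_PER_DAY = 2.5e9      # reported ChatGPT requests per day
REACTOR_GW = (1.0, 1.6)      # output range per reactor quoted above


def daily_energy_gwh(wh_per_query, queries_per_day):
    """Total daily energy in GWh (1 GWh = 1e9 Wh)."""
    return wh_per_query * queries_per_day / 1e9


def average_power_gw(energy_gwh_per_day):
    """Average continuous power in GW needed to deliver that energy over 24 h."""
    return energy_gwh_per_day / 24


energy = daily_energy_gwh(WH_PER_QUERY, QUERIES_PER_DAY)
power = average_power_gw(energy)
reactors_low = power / REACTOR_GW[1]   # fewest reactors (largest output each)
reactors_high = power / REACTOR_GW[0]  # most reactors (smallest output each)

print(f"{energy:.0f} GWh/day, {power:.2f} GW average")
print(f"{reactors_low:.1f} to {reactors_high:.1f} reactors running flat out")
```

On these inputs the average draw comes out a bit under two reactors’ worth, so the quoted “two to three” presumably includes headroom that the excerpt doesn’t spell out.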



And an LLM that you could run locally from a flash drive will do most of what it can do.
I mean, no, not at all, but local LLMs are a less energy-reckless way to use AI.
Why not… for the ignorant such as myself?
AI models require a LOT of VRAM to run. Failing that, they need some serious CPU power, but it’ll be dog slow.
A consumer model that is only a small fraction of the capability of the latest ChatGPT model would require at least a $2,000+ graphics card, if not more than one.
Like, I run a local LLM with an RTX 5070 Ti, and the best model I can run with that thing is good for, like, ingesting some text to generate tags and such, but not a whole lot else.
How slow?
Loading up a website with Flash and GIFs on 90s dial-up slow… or worse?
Basically, I can run 9b models on my 16 GB GPU mostly fine, like getting responses of, let’s say, 10 lines in a few seconds.
Bigger models, if they don’t outright crash, take like 5x or 10x longer for the same task, so long that it isn’t even useful anymore.
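A rough way to see why 9b is about the ceiling for a 16 GB card: the weights alone take roughly the parameter count times the bytes per parameter. This is a minimal sketch, assuming 16-bit, 8-bit and 4-bit weight formats and ignoring the context cache and runtime overhead, which add more on top. The function name and the 32B and 70B comparison sizes are made up for illustration.

```python
def weight_memory_gb(params_billions, bits_per_param):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


VRAM_GB = 16  # the card mentioned above

for size_b in (9, 32, 70):       # model sizes in billions of parameters (32 and 70 are assumed examples)
    for bits in (16, 8, 4):      # assumed weight precisions
        need = weight_memory_gb(size_b, bits)
        verdict = "fits" if need < VRAM_GB else "does not fit"
        print(f"{size_b}B @ {bits}-bit: {need:.1f} GB of weights, {verdict} in {VRAM_GB} GB")
```

By this estimate a 9b model only fits once it is squeezed to 8 bits or fewer per weight, and anything much bigger spills out of VRAM, which is where the 5x or 10x slowdown kicks in.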
So very worse.
Like make a query and then go make yourself a sandwich while it spits out a word every other second slow.
There are very small models that can run on mid-range graphics cards and all, but it’s not something you’d look at and say “Yeah, this does most of what ChatGPT does.”
I have a model running on a GTX 1660 and I use it with Hoarder to parse articles and create a handful of tags for them, and it’s not… great at that.
It’s horrendously slow, unusable imo. With the larger DeepSeek distilled models I tried that didn’t fit into VRAM, you could easily wait 5 minutes until it was done writing its essay, compared to just a few seconds when the model does fit. But that’s with an RTX 3070 Ti, probably not something the average ChatGPT user has lying around.
Probably not a flash drive, but you can get decent mileage out of 7b models that run on any old laptop for tasks like text generation, shortening or summarizing.
What do you use your USB drive LLM for?
Porn. Obviously.
Can you give an example?