

Better yet, download Qwen 3.5/3.6 and poke at it with a “raw” notepad like Mikupad. Try it yourself:
https://huggingface.co/ubergarm/Qwen3.6-27B-GGUF
https://github.com/lmg-anon/mikupad
One might observe:
- Chat formatting, and how janky the “thinking” block is.
- How words are broken up into tokens, not characters.
- How particularly funky that gets with numbers.
- Precisely how sampling “randomizes” the answers, by visualizing “all possible answers” with the logprobs display.
- And, thus, precisely how and why carb counting in ChatGPT fails, yet a measly local LLM on a desktop/phone could get it right with a little tooling or adjustment.
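If you want to see the sampling part without installing anything, here is a rough sketch in plain Python. The candidate tokens and their probabilities are made up for illustration (they are not from any real model), and `softmax_with_temperature` and `sample_token` are hypothetical helper names, not part of Mikupad or any backend. It only mimics what a logprobs display shows: a handful of possible next tokens, and a dice roll over them.

```python
import math
import random

def softmax_with_temperature(logprobs, temperature):
    """Rescale token log-probabilities by temperature and renormalize."""
    scaled = {tok: lp / temperature for tok, lp in logprobs.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

def sample_token(logprobs, temperature, rng):
    """Pick one token; temperature 0 is treated as greedy (argmax)."""
    if temperature == 0:
        return max(logprobs, key=logprobs.get)
    probs = softmax_with_temperature(logprobs, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical next-token candidates after a prompt like "Carbs (g): ".
# These numbers are invented; a real model would supply its own logprobs.
candidates = {"45": math.log(0.40), "30": math.log(0.35), "60": math.log(0.25)}

rng = random.Random(0)  # fixed seed so the run is repeatable
greedy = [sample_token(candidates, 0, rng) for _ in range(5)]
sampled = [sample_token(candidates, 1.0, rng) for _ in range(5)]
print("greedy :", greedy)   # always the single most likely token
print("sampled:", sampled)  # can differ from draw to draw
```

With temperature 0 the same token comes back every time. With temperature 1.0, a number that is only slightly less likely can get picked instead, which is one plausible reason the "same" question can return different carb counts across runs.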
This is exactly what OpenAI/Anthropic don’t want you to do. They want users dumb and tethered, as with a cloud subscription or social media platform. Not cognizant of how the tools they peddle as magic lamps actually work. And why, and how, they’re often stupid.


Slap in a spare GPU, and self-host one!
The 30B-class models are unbelievably good now, for being so small. They’re kinda where Claude was like a year ago, if not less. And (with the right backend) they aren’t expensive to host.