@vrighter

vrighter@discuss.tchncs.de · 8 days ago

I will never go back to ubuntu, begrudgingly or not

vrighter@discuss.tchncs.de · 1 month ago

as far as i had read about it, accurate enough to rely on for a whole flight without worrying about drift

vrighter@discuss.tchncs.de · 1 month ago

“good catch! That’s a very astute observation. Here’s a bunch of paragraphs explaining (incorrectly) how you’re wrong!”

vrighter@discuss.tchncs.de · 1 month ago

the windows just works argument actually refers to the fact that it’s consistent.

If you have a problem with the desktop, nobody needs to ask you which de you use, or which parts you have substituted out. You have a graphics problem, nobody asks if wayland or x11. You have a problem with audio, nobody asks you whether you have pipewire-pulse installed and to use pipewire. Shit’s the same everywhere.

I say this as an arch linux user. The choice we all love is actually a detriment to the average non-power user.

vrighter@discuss.tchncs.de · 1 month ago

is there any picture of the guy without his hand up like that?

vrighter@discuss.tchncs.de · 1 month ago

oh quit that bs. There was waterproof (not resistant) micro usb more than a literal decade ago. If anything they should have gotten better.

vrighter@discuss.tchncs.de · 1 month ago

this also happens to me occasionally when using wsl (i have to use windows at work). There’s an update to wsl? just force shutdown the wsl vm

vrighter@discuss.tchncs.de · 1 month ago

Why can’t they just say what they have published papers about? i.e. “it’s because ai does not work and can’t really be coerced to”

vrighter@discuss.tchncs.de · 1 month ago

are they still? I don’t really think so. They used to, that much I agree with. But those days are long gone

vrighter@discuss.tchncs.de · 2 months ago

exactly. But what if there were more than just three (the infamous “guardrails”)

vrighter@discuss.tchncs.de · 2 months ago

it is perfectly descriptive. It is not a forum. I wish it was, but those went pretty much extinct. If they called it a forum it’d be lying

vrighter@discuss.tchncs.de · 2 months ago

and the only reason it’s not slowing you down on other things is that you don’t know enough about those other things to recognize all the stuff you need to fix

vrighter@discuss.tchncs.de · 2 months ago

funny how everyone who wants to write a new browser (except the ladybird guys) always skimp on writing the actual browser part

vrighter@discuss.tchncs.de · 2 months ago

in yes/no type questions, 50% success rate is the absolute worst one can do. Any worse and you’re just giving an inverted correct answer more than half the time
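A quick sketch of that point, using a made-up predictor (`bad_predictor` and the sample questions are hypothetical, purely for illustration): anything that scores below 50% on yes/no questions turns into something that scores above 50% once you flip its answers.

```python
def accuracy(predict, samples):
    """Fraction of (question, truth) pairs the predictor answers correctly."""
    return sum(predict(q) == truth for q, truth in samples) / len(samples)


def invert(predict):
    """Wrap a yes/no predictor so it always gives the opposite answer."""
    return lambda q: not predict(q)


# Ten hypothetical yes/no questions; the "truth" is whether q is even.
samples = [(q, q % 2 == 0) for q in range(10)]


def bad_predictor(q):
    """Hypothetical predictor: right on the first 3 questions, wrong on the rest."""
    truth = q % 2 == 0
    return truth if q < 3 else not truth


acc = accuracy(bad_predictor, samples)              # below 50%
inv_acc = accuracy(invert(bad_predictor), samples)  # the complement, above 50%
print(acc, inv_acc)
```

The two accuracies always add up to 1, so the useful accuracy of any yes/no predictor is `max(acc, 1 - acc)`, which bottoms out at 50%.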

vrighter@discuss.tchncs.de · 2 months ago

they are improving at an exponential rate. It’s just that the exponent is less than one.

vrighter@discuss.tchncs.de · 3 months ago

so? It was never advertised as intelligent and capable of solving any task other than that one.

Meanwhile slop generators are advertised as capable of doing a lot of things, and of reasoning.

One claims to be good at chess. The other claims to be good at everything.

vrighter@discuss.tchncs.de · edited 3 months ago

because the over 70 different binaries of systemd are “not modular” because they are designed to work together. What makes a monolith is, apparently, the name of the overarching project, not it being a single binary (which again, it’s not)

vrighter@discuss.tchncs.de · 3 months ago

you wouldn’t be “freezing” anything. Each possible combination of input tokens maps to one output probability distribution. Those values are fixed and they are what they are whether you compute them or not, or when, or how many times.

Now you can either precompute the whole table (theory), or somehow compute each cell value every time you need it (practice). In either case, the resulting function (table lookup vs matrix multiplications) takes in only the context, and produces a probability distribution. And the mapping they generate is the same for all possible inputs. So they are the same function. A function can be implemented in multiple ways, but the implementation is not the function itself. The only difference between the two in this case is the implementation, or more specifically, whether you precompute a table or not. But the function itself is the same.

You are somehow saying that your choice of implementation for that function will somehow change the function. Which means that according to you, if you do precompute (or possibly cache, full precomputation is just an infinite cache size) individual mappings it somehow magically makes some magic happen that gains some deep insight. It does not. We have already established that it is the same function.
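To make the "same function, two implementations" argument concrete, here is a toy sketch. Everything in it is an assumption for illustration: a 3-token vocabulary, a context of 2 tokens, and a made-up weight matrix `W` standing in for trained weights. It is nowhere near a real model, but the structure of the argument is the same.

```python
import itertools
import math

VOCAB = (0, 1, 2)   # toy vocabulary (assumption)
CONTEXT_LEN = 2     # toy context length (assumption)

# Made-up fixed weights standing in for the trained matrices.
W = [[0.5, -1.0, 2.0],
     [1.5, 0.0, -0.5],
     [-1.0, 1.0, 0.25]]


def next_token_dist(context):
    """Implementation 1: compute the distribution every time it is needed."""
    logits = [
        sum(W[tok][out] * (pos + 1) for pos, tok in enumerate(context))
        for out in VOCAB
    ]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]  # softmax, shifted for stability
    total = sum(exps)
    return tuple(e / total for e in exps)


# Implementation 2: precompute one row per possible context, then look it up.
TABLE = {
    ctx: next_token_dist(ctx)
    for ctx in itertools.product(VOCAB, repeat=CONTEXT_LEN)
}


def next_token_dist_lookup(context):
    return TABLE[tuple(context)]


# Same mapping for every possible input, so it is the same function.
same = all(next_token_dist(c) == next_token_dist_lookup(c) for c in TABLE)
print(len(TABLE), same)
```

The table here has 3^2 = 9 rows. For a real vocabulary and context length the table is far too big to ever build, but that only changes whether precomputing is practical, not what the function is.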

vrighter@discuss.tchncs.de · 3 months ago

the fact that it is a fixed function that only depends on the context AND that there are a finite number of discrete inputs possible does make it equivalent to a huge, finite table. You really don’t want this to be true. And again, you are describing training. Once training finishes, nothing you said applies anymore and you are left with fixed, unchanging matrices, which in turn means that it is a mathematical function of the context (by the mathematical definition of “function”: stateless and deterministic) which also has the property that the set of all possible inputs is finite. So the set of possible outputs is also finite, and its size is smaller than or equal to the size of the set of possible inputs. This means the actual function that the tokens are passed through CAN be precomputed in full (in theory), making it equivalent to a conventional state transition table.

This is true whether you’d like it to be or not. The training process builds a markov chain.
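A minimal sketch of what that looks like, under loud assumptions: a 2-token vocabulary, a sliding context window of 2 tokens as the state, and a made-up `next_token_probs` function standing in for the trained model. The state is the whole context window, and the next state depends only on the current one.

```python
import itertools
import random

VOCAB = ("a", "b")  # toy vocabulary (assumption)
WINDOW = 2          # toy context window; the window contents are the state


def next_token_probs(state):
    """Hypothetical fixed function of the context, standing in for a trained model."""
    p_a = (1 + state.count("a")) / (2 + WINDOW)
    return {"a": p_a, "b": 1 - p_a}


# Every possible state, and the full state transition table built from the function.
STATES = list(itertools.product(VOCAB, repeat=WINDOW))
TRANSITIONS = {}
for s in STATES:
    row = {}
    for tok, p in next_token_probs(s).items():
        # Appending a token and dropping the oldest one gives the next state.
        row[s[1:] + (tok,)] = p
    TRANSITIONS[s] = row


def step(state, rng):
    """Sample the next state using only the current state's row of the table."""
    candidates = list(TRANSITIONS[state])
    weights = [TRANSITIONS[state][c] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)
state = ("a", "b")
for _ in range(5):
    state = step(state, rng)
print(state)
```

Each row of `TRANSITIONS` is a probability distribution over next states, which is exactly a row of a markov chain's transition matrix. The number of states blows up with vocabulary size and window length, but it stays finite.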

vrighter@discuss.tchncs.de · 3 months ago

no, not any computer program is a markov chain. only those that depend only on the current state and ignore prior history. Which fits llms perfectly.

Those sophisticated methods you talk about are just a couple of matrix multiplications. Those matrices are what’s learned. Anything sophisticated happens during training. Inference is so not sophisticated. It’s just multiplying some matrices together and taking the rightmost column of the result. That’s it.