on llms (Ⅳ)

I’ve been thinking about LLMs a lot lately. Haven’t we all.

Here’s my reading from the last few years:


[reading list links elided; grouped by year: 2016 (easter egg!), 2023, 2024, 2025, 2026]

A lot of it is very, very negatively inclined; I’ve done my best to take an interest in alternate viewpoints over the years, as well as some technical detail to stay grounded in the “what”. Despite my obviously strong feelings about the matter, I really do believe in keeping an open mind and reading widely. (It goes without saying that an article appearing above does not equal endorsement.) Likewise, despite not having an account on Lobste.rs, I continue to read most of the discussion that goes on there, and there sure has been a lot of it about LLMs!

The thing about keeping an open mind is that it can change. My concerns about LLMs have been pretty total:

  • Environmental impact. Our world is burning, and you know what it doesn’t need more of right now? Datacentres.
  • Economic impact. The share markets are a fuck, the flow-on effects (RAM prices!) widely felt, and every product is now ✨ intelligent ✨ in a way no-one ever asked for.
  • They Can’t Work.
  • For programmers particularly, there’s a continuum of fun new stuff from identity loss to an overwhelm of slop patches and everything in between.
  • Theft and DDoS at scale are cool and we all love it

So far, so good. Sometimes the criticisms do get a little unhinged when evaluated from a “love thy neighbour” standpoint:

Well that’s obviously relative and subjective — it’s reasonable for people who cheer for LLMs to trample on others’ infra, DDoS it, perform incessant scraping, ignore copyright and treat everything on the Internet as their own public domain dataset, destroy FOSS ecosystems, destroy personal computing and move us into a future of rented compute all for the sake of apparent results and “substance”. It’s reasonable for such people to agree with the principle of the end justifies the means.

And I guess, as someone who really likes nuance, arguments that totally discard nuance and any appreciation whatsoever for the humanity and experiences of “the other side” do grate.

One argument that has come up a lot traditionally, and that I have made myself, is that they just Don’t Work; sometimes the argument is that they Can’t Work. Seeing the sheer quantity of people using these tools and being quite confident that they do work really made me wonder about this argument. People really still like it in the Year of our Lord 2026.

I’ve been employed at a big, publicly-owned software development software development company [sic] for the past five months. This means a lot of “AI” push from the top. One month in, I was pretty sure I could avoid using LLMs until the bubble popped. Eventually, though, a combination of (a) being pushed to use them, (b) the insistence that they Don’t Work, from people who by and large don’t use them, (c) the slowly growing portion of usually pretty level-headed people who are quite sure they do, (d) the availability of the best of breed of these tools to me at work without my having to personally give money to The Companies, and (e) that bubble not popping yet!!, all combined to my deciding to evaluate those claims.

I try really hard to steelman my own arguments against myself, so instead of sticking to my “running qwen3-coder:30b locally is completely not worth the time and results” line, given the opportunity, let’s see what agentic Opus 4.5 through OpenCode (via DAP) for work tasks can do. I started very small, and found myself remembering to try it out once every few days.

And what do you know, it’s really capable. It is very good. I wouldn’t ever let it or any LLM communicate for me — much like I have never let any tool do that, be it LLMs, Gmail’s age-old “here I’ll try to guess what you want to say!!” or otherwise — but at development tasks, it’s super capable. It will indefatigably work out why something happened, in a huge codebase, with reference to megabytes of logs. It’ll capably suggest solutions, poke holes in yours, and get things wrong itself. It will happily substitute for your theory if you let it, it’ll write poor code if you’re not careful, and if you’re not clear enough about what you’re doing, it may well be a net-negative. It is a very quick, tireless, and reasonably unhinged assistant, and it absolutely makes Enterprise™ working conditions for the senior+ engineer so much more palatable. I can only speak from my own experience, but my experience is that it’s solid.

I don’t say any of this lightly. I am really good at my job. I am not at all afraid of this replacing me. But holy shit, does it make some really tricky tasks virtually effortless?! Today a random set of 20 unrelated tests started to fail consistently in the GitLab 18.8 backport branch. I pointed it at some job logs, explained the situation and what I’d observed, and set the agents to work. I don’t have any experience with any of the tests or the feature areas affected; I have nothing to lean on here other than my observations trying and failing to get my backports past CI, realising these flakes were no longer flaking in the “let things pass” direction. It read all the failing tests’ code, read all the code those tests invoked (THERE IS A LOT), then read the test harnesses, the helpers for those, and within 10 minutes produced a theory which explained precisely what was going on to cause this extreme edge case. That’s a super helpful tool to have, and saying otherwise is just incorrect? Would it have been better for me to load and commit all that to my own memory with some hours of reading and poking, and be a hero of diagnosing this one weird failure in the 18.8 backport branch, which will receive its last patch two months from now? I don’t think so.

And iunno, here we get to the pointy end of things. I long didn’t have to care about thinking through the Hard™ ethics issues around LLM use, because it fundamentally Doesn’t Work and therefore resolving them was fruitless. It turns out, though, that it Does Work. So where does that leave us?

Theft and DDoS at scale are cool and we all love it

The truth is, theft is cool. I mean it. A long time ago I believed in copyleft — I was a card-carrying FSF member, GNU plus Linux, GPLv3 all the things. Part of me still has a huge amount of sympathy for that position.

But part of me also slaps the UNLICENSE, WTFPL, CC0 or otherwise on a lot of things I produce. Part of me would much rather abolish copyright than try to use copyright against corporations. I felt the “it’s better to have a seat at the table” angle vacate my body entirely when GitHub’s top brass used that argument to maintain a cosy relationship with ICE: is weaponising copyright law legitimising it? We all hate the DMCA, yet a takedown notice is often the tool of choice for FOSS licensing violations!

I don’t think it’s easy to resolve this tension, but I’m very happy about Sci-Hub, I use The Pirate Bay and private trackers for some things while insisting on buying all the music I listen to (and buying copies for friends!), and I prefer the world where people and organisations get to use my open-sourced stuff (and yes, make money from it without having to give me any) to the alternate one where essentially no-one uses it.

I realise this is a bit self-serving (inasmuch as I derive satisfaction from making stuff that’s useful), but I’m doing my best to be honest and not apologise for the fact that my positions don’t/can’t neatly resolve into one side. If there was a different alternate world where we weren’t already fatally capitalistically fucked in the head and the software was useful for lots of people without there being anyone needing or wanting to make money from it, that would be so cool! This ain’t it, though.

So like, okay. Abolish copyright. Abolish intellectual property. It’s actually fine that a bunch of companies decided to Compile The Whole Internet into a file. Wish I did that first. Hope someone leaks that file.

Now like, I don’t actually love that for a number of reasons; I’m particularly unhappy with OpenAI as a company and as a concept, based on just about everything I have ever heard of that has even the slightest thing to do with Sam Altman, as well as their scraping practices, business practices, etc. etc. etc.

But I no longer feel that someone using large models on the scale of those of OpenAI is necessarily committing a particularly ethically dubious act by virtue of the model they have used. Yeah, they ingested your writing and code and art. Mine too. Takes a village to build a real proper egregore I guess. Totally figures that it’d be Corporate America to do that first.

And it super totally sucks that they DDoS us all in order to do this. I deploy Anubis and I think you should too. (Pause for a beat while you think about Anubis’s creator using LLMs! Guess how much pause this has given me!) Inasmuch as it’s very hard to separate the practices involved in compiling these datasets from the results themselves, that seems pretty clear-cut?

But. I also think it’s ethically dubious to sample the DNA of living animals to create seeds to generate lab-grown meat. THAT SAID, if people have done that already — or do do that, today — and it means that as a result, vast quantities of lab-grown meat are consumed instead of killing animals, is that a bit of a win for me as a vegetarian? It is quite a bit of a win.

We are perhaps beyond the point where the frontier models need or can even find useful new data to ingest; it would appear the internet has been thoroughly scraped a hundred times over. That obviously doesn’t mean they’ll stop, or that new labs won’t start, but in terms of the models that exist today and which Generate Value™ not from further scraping but from what’s been done — imo, if they actually help us, real actual People, they can yet be part of what is “a bit of a win”. I’m not convinced we need to punish the technology for the sins of the creators, even if they are still entangled at this moment. (And I just don’t think we, as individuals, can meaningfully punish the creators, but I’ll get to that.)

For programmers particularly, there’s a continuum of fun new stuff from identity loss to an overwhelm of slop patches and everything in between.

Yeah it sucks ass, but every technology transition has this; programmers have had an easy few decades when the worst thing most people can name is Agile or having to learn Kubernetes or AWS or whatever. I think Competence as Tragedy summed it up better than I could.

Slop patches are part of this too; Claude Code is the new AOL Usenet gateway. We’ll all adapt in different ways. On Comrak I simply banned LLMs. Everyone gets to make up their own mind, and they don’t have to be of the same mind everywhere. We are headed to nuance-town, baby.

I still think Naur is right, and if you forget to treat an LLM as an actual other — meaning, when it writes the code, if you haven’t exercised what it’s suggested in your mind, and ideally on the actual computer, you most certainly haven’t got the theory — you will struggle to keep a hold on what’s going on. But, like many other others (coworkers, past selves, etc.), you can ask an LLM to try to explain what it’s come up with, and these days it will often do a pretty good job. This dovetails with …

They Can’t Work.

So it turns out they do, and this throws a bit of a metaphysical spanner in the works. What does it mean if they can’t work, but they do? And like, a year ago it was very clearly “they kinda work but mostly don’t and this is actually gonna be a waste of time” (particularly if you were like me and insisted on only ever trying local models, so as not to provide demand signal beyond one-off model downloads). Today it is very clearly “they mostly work really fucking well, often absurdly so, and while they can still get lost or overlook things, generally this is akin to magic when it comes to actually useful needed tasks”. They make things possible. How does one square that with “but they can’t”?

A point I was really hung up on before was the notion that the transformer architecture fundamentally hallucinates; that when you boil it down, of course it’s just text prediction, heck I’ve written dozens of hidden Markov models in my lifetime, and so it’s not even like a model of a human-ish brain where we could say, oh sure, increase n by enough orders of magnitude and perhaps intelligence emerges, and we know it might because look at us. (And we’re not getting there because the human brain and human intelligence is very much embodied, and we’re never really proposing making full-on universal physics simulators. (are we?))
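
(For the unfamiliar, here’s roughly what “just text prediction” means at its most stripped-down: a toy bigram predictor in Python, of the Markov family. This is purely illustrative and several universes simpler than a transformer, but the core loop of “given what came before, sample a likely next token” is the same shape.)

    import random
    from collections import defaultdict

    # Count, for each token, which tokens follow it and how often.
    def train(tokens):
        counts = defaultdict(lambda: defaultdict(int))
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
        return counts

    # Sample a next token proportionally to how often it followed `prev`.
    def predict(counts, prev):
        following = counts.get(prev)
        if not following:
            return None
        choices, weights = zip(*following.items())
        return random.choices(choices, weights=weights)[0]

    corpus = "the cat sat on the mat and the cat slept on the couch".split()
    model = train(corpus)
    out = ["the"]
    for _ in range(8):
        nxt = predict(model, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    print(" ".join(out))  # e.g. "the cat sat on the mat and the cat slept"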

So it can’t be intelligent, and it can’t act intelligent. So what the fuck is going on when it does? Something else, obviously, but the fact is, you as an intelligent being can intelligently formulate an intelligent proposition to it, and it will give you an answer/construct a solution/form a counter-proposal/agree with what you said and execute on it while figuring out what to do with all the gaps in the language you used in a very, uhhhhh, Clever, manner.

The tl;dr for this section is, I am super happy to say, y’know what, something like intelligence does ~emerge~ when you take a language model and keep cranking up the power factor. They briefly appear intelligent by their use of language at low n but are clearly not when they say or do things that don’t actually follow. Does it follow that they aren’t at high n, even when the percentage of things said or done that don’t follow falls to a very low number indeed? I don’t know, but I do know that if you reduce the total complexity of the human brain by, say, ablating part of Wernicke’s area, to outside observers you might also briefly appear intelligent, and then suddenly not. Does this mean we are not actually intelligent? Much to think about.

And while their failure modes are completely unrelatable to us, and “experience” likewise due to their completely different substrate and architecture, I just don’t think we can conclude that they therefore do not possess (or embody!!) any kind of intelligence, nor have any kind of experience. I’m not proposing sentience here, but I can’t even say for sure that they haven’t that in some way, either, or that our definition isn’t overly fitted to the ways in which we reason and use it! Can you say for sure anyone else does possess sentience? These weren’t exactly solved problems before LLMs came along; I was thinking about solipsism when I was in primary school. (Just trans things haha) LLMs are probably the single most complex things we have ever created (in an information-theoretic kind of way). I just, I dunno — I’m not ready to make solid assertions here, and I don’t really get how so many are so willing to dispense with nuance to make bold claims in completely understudied (and partly unknowable) areas! Please re-read this essay series.

Economic impact. The share markets are a fuck, the flow-on effects (RAM prices!) widely felt, and every product is now ✨ intelligent ✨ in ways no-one ever asked for.

WELCOME TO CAPITALISM OPERATING AS INTENDED. Remember “there is no ethical consumption under …”? We are SEVERELY beyond the point of normalisation, and refusing to admit that is like refusing to admit that your riding a bicycle to work isn’t going to make the rest of the world give up ICE vehicles and make us unpave all the roads. Yes, cars have had a really big head start, but we’ve barely begun building the new datacentres supposedly needed for AI’s Line Goes Up Forever and if you haven’t noticed, this stuff is already widespread! I bet you have noticed, though, because we’re all constantly talking about it! This is the magic of software, of things whose substrate is information, which I think we were all already aware of! The cat well and truly is out of the bag.

Choosing to abstain here is not going to make RAM prices come back down. The CEOs were busy masturbating at the altar of share prices when they were all making the replacement claims, and they have continued to do so as they realise that agents cannot in fact replace workers (though they may well try to reduce the number of juniors and screw over the funnel for the industry as a whole; it’s been happening for a while now, and we see the same happening with middle-management, though I don’t feel the funnel concern carrying over there. Does this have anything to do with LLMs? I don’t think so — but I digress). They still do so as they blame the recession-hiding-under-the-covers-of-inflated-share-prices layoffs on reduced need for workers due to AI. We will stop getting ✨ Summarize buttons (spelled with a ‘z’ ofc) as soon as they figure out the company to pump after Nvidia. And yet the models are not going to suddenly vanish when that happens.

In short, when did the decisions made by billion-to-trillion-dollar company CEOs actually make the slightest difference to what you should or should not be doing? You asking Gemini a translation question or vibe-coding a menubar widget is not what makes or breaks Sundar Pichai and topples this thing over. They’ve pushed everything this way; does that mean you must, by rights, do the opposite thing? It doesn’t follow.

I also really don’t want to seem like I’m saying “yeaahhhh just go do terrible shit”, here, either, because I’m not; I’m saying that while intoning YOU LLM FANATICS ALL THINK THE END JUSTIFIES THE MEANS is an emotionally satisfying position to hold, I don’t think it really holds. YOU CAR DRIVERS ALL THINK THE DESTINATION JUSTIFIES THE POLLUTION. Idk.

Environmental impact. Our world is burning, and you know what it doesn’t need more of right now? Datacentres.

I keep coming back to the cars/fossil fuels analogy and I see some other folks around this time are too. We have paved over most of our cities with horrid amounts of roads. Cars themselves are obviously polluting, with companies being happy to commit scandals to work around the fact that governments were starting to think this was maybe a bad idea. For a fun one, try considering air travel. Coal power sucks. It sucks that nuclear failed to get off the ground, but who knows, maybe we dodged a bullet.

You can usually point the finger at capitalism and be correct, and the same generally applies here. The (proposed) build-outs are not in service of any realistically desirable goal; just Line Goes Up. But at this point, the actual use of these services (inference costs) is negligible. The water consumption is negligible. The demand signal is negligible.
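
(Back of the envelope, with loudly hedged numbers: Sam Altman has claimed the average ChatGPT query uses about 0.34 Wh of electricity and roughly 0.3 mL of water. Take that with however much salt a CEO’s blog post deserves and assume it’s off by 10×: a hundred queries is then about 340 Wh, roughly three kettle boils. The training runs and the speculative datacentre build-outs are where the real numbers live, not in your chat window.)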

And I used to humm and haaa about normalisation specifically re: demand signals and the prospect of using these tools and that being known, even if all the other concerns were somehow negated, but like, that window has well and truly passed us, and I don’t think any one (or hundred or more) of us could have really impacted that. It is too late, and it’s really hard to say “this is realism and not fatalism” without it just sounding like fatalism to someone who’s two feet sunk deep in the contra- tribe (and it would be silly, I think, to not observe the tribalism that’s developed). I am aware of this. I am not really trying to convince you, though; here I’m just writing out my reasoning, because y’know, writing and thinking go hand-in-hand.

We were committed to this line as well and truly as we were to the global re-rise of fascism, to the “post COVID except for the bit where we’re post COVID, oops, here, have some vaccine denialism” bit, to the not-so-proxy wars bit, not because any one person was too happy to be swayed by Trump, or was too anti-authoritarian, etc. etc., but because the world system we’ve ended up with allowed the interests of not the 1% but the 0.01% to control everything. Neoliberalism has well and truly had its way with us.

This being the case, it makes sense to protect yourself against racists and transphobes and do your best to educate where possible, to protect yourself against deadly airborne diseases and encourage your friends to do the same, and to be wary of your warmongering neighbour while also ever keeping in mind that your warmongering neighbour’s citizens are just as caught up in this as you. We do not simply give in.

What about LLMs? Does it make sense to then strictly abstain? I dunno for sure, but I’ve come to conclude: probably not.

You should be careful, of course, because there are new knives in the kitchen and they are sharp in unexpected ways; their failure modes of thinking (of “thinking”) are indeed unlike ours; LLM psychosis is real and those insisting it isn’t have no fucking idea what psychosis is (and I assure you, you do not truly find out any way except experientially); you will certainly lose the theory (or just fail to build one) if you go full-vibe-code; I imagine it’s relatively easy to use the tools in ways that cause a loss of skill, or degradation of memory. (It sounds a lot like I’m describing hallucinogens here, and qué sorpresa, I don’t think you should strictly abstain from those either.)

But I don’t believe any more that total abstention is the only moral way. I know there are loads of reasons to be angry, but I don’t think LLM users are the correct target for this anger, much like I don’t think car users are, or people who don’t fund 100% of their home’s electricity with Green Energy Credits™, or meat eaters. I say this having spent some time uncomfortably both being angry at LLM users while also tentatively occupying the role myself. It would’ve been a lot easier to retreat back to the comparatively psychologically safe black-and-white position I inhabited before (coulda saved myself this whole essay!), but it wouldn’t have been honest.

Thanks for reading.
