A little while ago I gave a talk about what happens to a large organization once LLMs stop being a shiny novelty and start being everywhere. Not the "look, it wrote a whole function!" phase, we're well past that. I mean the part that comes after: the tools are here, people are already using them, half of them without asking. So now what?
That "now what" is the whole thing, and it turns out there's a lot of it.
Lines of code were never the point
We all know LLMs are great at producing code. The trouble is we figured out years ago that lines of code is a terrible way to measure value, and somehow the arrival of a machine that produces lines of code by the bucketload has made everyone forget that.
#LoC shipped != Value created
The labs would love for you to believe otherwise: every single one of them leans on some flavour of "look how much it writes". But shipping more code faster was never the goal. Shipping the right thing was. A faster way to type doesn't change that. It just makes it easier to produce the wrong thing at speed.
The monster in the shadows
Here's the part people skip past: the code is one small piece. There's a whole organization attached to it, built up over years, full of hidden processes, undocumented procedures and a long lifecycle that someone, somewhere, quietly depends on. The technical system survives because of all that scaffolding around it.
And right now LLMs are showing up all over that scaffolding, including the places they probably shouldn't be. In most of the cases I've seen there was no real plan to adopt these tools, people just... started. Shadow IT's best friend is Shadow AI, and it's already in the building.
This isn't hypothetical. Even when the tools are sanctioned, things break. See the Amazon all-hands not too long ago, where senior engineers were pulled in over a spike in outages. Approval is not a plan.
You can't blame the agent
Two risks ride along with all that output.
The first is cognitive debt. If an agent wrote most of your code, against specs that were probably incomplete, how confident are you that the result is actually correct? Systems quietly turn into black boxes nobody on the team really understands anymore.
The second is accountability, and it's the uncomfortable one. When production falls over, you can't turn to your stakeholders and say "well, the agent wrote it". The model isn't on the hook. You are. An agent can author the code, but it can never be accountable for it. Hold that thought, it comes back later.
And if you want the worst-case version of "nobody was really watching", there's the now-infamous example of someone letting an agent run loose and watching it wipe a production database. Newer models might guard against that sort of thing better, but there's no way to know for sure. You find out the moment it happens, and by then your prod environment is already gone.
We need a new process
There are real gains here. When the conditions are right, the productivity jump is obvious. But it shows up at the individual level. If only the engineers get faster, how is the rest of the organization supposed to keep pace?
Because shipping software was never just writing the code. We understand the build part pretty well, and now we've got a tool that can shrink "build" down to almost nothing. Great, except all that really does is reveal the bottlenecks that were hiding behind it.
So we need a different way of working to handle this rate of change. I'll be honest: I don't fully know what that looks like yet. I've got plenty of mileage with Agile and Extreme Programming, and both feel like a good fit. The balanced team of product, design and engineering probably still holds up, but the centre of gravity shifts hard towards working prototypes. There's still some figuring out left to do.
There are a few things I am fairly sure about, though.
The things I'm fairly sure about
Adopt it on purpose. Introduce AI into the organization deliberately. Set standards and policies: what can be used, what can't, and what data is allowed anywhere near a public model. Whatever you do, don't let this bubble up from the bottom. There's an overwhelming amount of choice out there and no shared standards, and "everyone picks their own" is just a longer way of saying chaos.
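To make "what can be used, and with what data" a bit more concrete, here is a minimal sketch of such a policy written down as data instead of living in people's heads. Everything in it is made up for illustration: the tool names, the sensitivity levels and the `is_allowed` helper are assumptions, not a real product or standard. The one deliberate choice is default deny, so a tool nobody vetted is simply not allowed.

```python
from dataclasses import dataclass
from enum import IntEnum


class DataClass(IntEnum):
    """Hypothetical data sensitivity levels, lowest to highest."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    PERSONAL = 3  # e.g. data that falls under GDPR


@dataclass(frozen=True)
class ToolPolicy:
    """What one approved tool may be given."""
    name: str
    max_data_class: DataClass


# Hypothetical allowlist: anything not listed here is not approved.
APPROVED_TOOLS = {
    "public-chat-model": ToolPolicy("public-chat-model", DataClass.PUBLIC),
    "self-hosted-model": ToolPolicy("self-hosted-model", DataClass.CONFIDENTIAL),
}


def is_allowed(tool: str, data_class: DataClass) -> bool:
    """Return True only if the tool is approved for this class of data."""
    policy = APPROVED_TOOLS.get(tool)
    if policy is None:
        return False  # unknown tool: default deny
    return data_class <= policy.max_data_class
```

With this table, `is_allowed("public-chat-model", DataClass.INTERNAL)` is `False`, and so is any tool that isn't on the list at all. Your real levels and tools will differ; what matters is that the rules exist somewhere a person, or a script, can check them.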
Ownership doesn't move. This is the answer to that accountability problem from earlier, and it's wonderfully boring: nothing changes. You build it, you run it, so you own it, start to finish. A platform might carry some of the operational weight, but when the app breaks the team handles it. AI-assisted coding makes the work faster; it doesn't transfer the responsibility anywhere. And this goes for every role, not just the engineers. If you ship it, you own it. Yes, including you, Mr Vibecoding PO.
Professionals have standards. Put coding standards in place with the right tooling. Agree on a way of working that actually includes the agents: an agents.md / claude.md file, plus a decision on whether you go full agentic with QA afterwards or treat it more like pairing. Find what works and turn it into a practice rather than a vibe.
CI/CD earns its keep. It always had a place, both for validating changes predictably (the tests pass or they don't) and as the gate to production. When iteration gets cheap you don't want to ship every tiny change, so you fire the delivery pipeline only when you're actually ready. And this is the hill I'll happily stand on: the LLM does not touch production. We spent years building deterministic systems. Handing the keys to an unpredictable operator with a wildcard up its sleeve is an accident politely waiting to happen.
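As a rough illustration of that gate, here is a small sketch of the decision in code. It is not tied to any real CI system: the `Change` record, its fields and the `may_deploy` function are all hypothetical names I'm assuming for the example. It encodes the three conditions from the paragraph above: a human pulls the trigger, the tests pass, and someone has decided the change is actually ready.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical summary of a change waiting at the production gate."""
    tests_passed: bool
    marked_ready: bool   # a human decided this is worth shipping
    triggered_by: str    # "human" or "agent"


def may_deploy(change: Change) -> tuple[bool, str]:
    """Decide whether the delivery pipeline should run, with a reason."""
    if change.triggered_by != "human":
        return False, "only a human may trigger a production deploy"
    if not change.tests_passed:
        return False, "tests failed"
    if not change.marked_ready:
        return False, "not marked ready to ship"
    return True, "ok"
```

The function is deliberately boring and deterministic: same input, same answer, every time. That is the property you want at the door to production, whoever or whatever wrote the code behind it.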
AI has no taste
This one deserves its own breath.
Give an underspecified task to a good developer and they'll start asking questions. Missing acceptance criteria or an unclear edge case? They'll come back to check before writing a line. Give that same fuzzy task to an agent and it will cheerfully fill the gaps with the most mid solution sitting in its training data, with total confidence.
The worst part isn't the mediocre output, it's the timing. You don't catch it up front the way a developer's questions would have. You catch it during validation, after the work is already done. The cost quietly moved to the end of the process, which is the most expensive possible place for it to live.
So if agents are doing the work, the specification can't be an afterthought anymore.
Iteration is basically free
Now for the fun part. Creating things was never this easy. The front of your value stream, ideation and prototyping, can go absolutely nuts even while the tail end stays rigid.
PoC everything that might work. Try the bad ideas too, because implementation costs almost nothing now. Validate, and start over if it flops. Agile can finally be agile again (round two). Roles start to blur in a good way: a designer building a working prototype instead of a mockup, a product owner spinning up a small experiment just to feel out an idea.
So fly a little closer to the sun. Take more risk on your PoCs, that's exactly what they're for. They can be messy, a bit buggy, held together with tape, that's all fine. They're the cheapest, fastest feedback you'll ever get.
The bottleneck just moved
Here's the catch. If you can spit out many possible solutions very quickly, how are you going to validate them all? When you hand stakeholders 37 options, how do you make sure you still only move forward with the genuinely good ones? The hard part stops being "can we build it" and becomes "are we building the right thing", and your process has to keep up with the new pace or it just becomes the new wall everything piles up against.
Which means a few things have to actually change:
- Design gets explicit. AI has no taste, remember, so anything you leave unsaid gets filled in with a hallucination.
- Automated testing matters more than ever. It's the only thing that makes a fail-fast loop safe, and it's what the agents lean on to know they haven't broken anything.
- QA moves to the people steering the agents. They own the result, so checking it is theirs too. Did the agent follow the standards? Is the codebase healthy? Does the thing actually work?
- Documentation flips. Humans read the docs maybe twice, at the start and when something breaks. An agent reads them at the top of every new session. Suddenly writing things down has a very direct payoff.
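The fail-fast loop from the list above can be sketched in a few lines. This is an assumption-heavy toy, not a real tool: `first_failure` and the check names are hypothetical, and in practice each check would call your actual test runner, linter or review step instead of a lambda.

```python
from typing import Callable, Optional

Check = Callable[[], bool]


def first_failure(checks: list[tuple[str, Check]]) -> Optional[str]:
    """Run checks in order; return the name of the first that fails, else None."""
    for name, check in checks:
        if not check():
            return name  # fail fast: later checks are skipped
    return None


# Hypothetical checks standing in for real tooling.
checks = [
    ("tests pass", lambda: True),
    ("standards followed", lambda: False),
    ("docs updated", lambda: True),
]

print(first_failure(checks))  # prints "standards followed"
```

The person steering the agent gets one clear answer about what broke first, and the agent can be pointed at the same list at the start of its next session.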
Fast isn't always better
We can go fast. Should we, always? Probably not. There's a real pull towards letting the agent own the whole loop, and that's exactly where quality quietly leaks out. You don't need to know every system in minute detail, but you do need to understand what it's supposed to do and what's expected of it, or the cognitive debt piles up until it's suddenly due.
And spare a thought for the people on the other end. If you ship sixteen updates a week and move the furniture every single time, don't be surprised when your users start a small riot. Speed isn't a feature to them.
You still have to act
For all the "this might not even work" energy, sitting it out isn't really an option. These tools are already showing up in your value streams, including the corners where you'd rather they didn't. The perceived boost is too good for most people to ignore, so they won't. Think of it as modernization on steroids, the same old movement to remove friction, just with sharper tools, and your job is to point it at the places where it actually has leverage.
So set guardrails, because people will find use-cases you never anticipated. When everyone is left to pick freely, things drift fast: GDPR violations lurking in places nobody's looking, tools and services nobody vetted. Who owns them? Who's paying for them? Are you even allowed to use them? Some freedom to experiment is essential, but freedom with no guardrails is just the shadow monster again, wearing a new hat.
Wrapping up
None of this is settled, and I'm genuinely curious where it all lands. We're all figuring it out at the same time, which is half the fun and half the terror.
If you take three things away: keep building those PoCs and one-shot them to your heart's content, keep production clean and controlled and out of the agent's direct reach, and please, please don't start counting pull requests or lines of code as value. We knew better than that before the robots showed up. Let's not unlearn it now.
Disclaimer: This post was created from my slides and talk notes with the help of an LLM. I did review the whole thing though.