The Gap Is Widening and It’s Not Slowing Down

Brendan O’Leary wrote a piece this week titled “The AI Coding Revolution Hasn’t Started Yet”. His headline observation: most professional engineers haven’t adopted AI coding tools in any meaningful way. Not even close.

In my experience too, he’s right. I’d push it further.

I’ve been deep in this work – helping companies actually get Generative AI tooling integrated into real development workflows, not just installed, not just “evaluated,” but actually integrated. Running sessions, doing the pairing, reviewing the setups, watching teams try to navigate the gap from “we have Copilot turned on” to “we’re doing something meaningfully different with how we build software.” And what I keep seeing isn’t just that most companies haven’t started. It’s that the gap between the teams that have started and everyone else is compounding. Every week that passes, that gap doesn’t hold steady. It widens.

The Observation from the Inside

The conference Brendan was at, where he made this observation, is a good one. You talk to staff engineers, architects, team leads – people who are not amateurs at this craft – and you find out they’re still in the “I’ve heard of Claude Code” phase. That’s a real data point and it matches what I’m seeing at companies I work with. More than a few times they haven’t even gotten to that point.

But here’s the wrinkle that the conference floor view doesn’t fully capture: the gap isn’t just about adoption level. It’s about trajectory. The practitioners who’ve gone deep aren’t standing still. They’re getting faster, refining workflows, building intuition about how to orchestrate agents, where to trust the output, where to keep a tighter leash, and where to let things run. They’re compounding their advantage every single week. Meanwhile, a team that’s still on “tab completion is the whole idea” isn’t sitting on a fixed baseline. They’re falling behind in a relative sense, even if their absolute productivity is unchanged.

That’s a slow-moving disaster in competitive terms.

The Questions That Tell the Story

I’ve had almost the exact same hallway conversations Brendan describes. The tells are in the questions:

“Wait, so the agent actually runs the tests itself?”

“How are you keeping it from just rewriting everything it touches?”

“Our security team said no to all of this, so we haven’t tried anything.”

None of these are dumb questions. In fact, they’re the correct questions — they mean the person is starting to actually reason about agentic tooling rather than dismissing it. But they’re also questions that someone who’s been working in this space for six months has already burned through, experimented on, and formed opinions about. There’s a widening experiential gap underneath the tooling gap and that part is harder to close quickly.

Why the Lag Is Rational (But Costly)

The reasons Brendan lists for slow adoption (security lockdowns, bad early Copilot experiences, tool landscape churn, team skepticism) are all real, and I’ve run into every single one of them. Let me add a few more from what I’ve encountered.

The “we tried it and it wasn’t impressive” problem is particularly pernicious, because the delta between the experience of using autocomplete in 2023 and using an agentic coding workflow in 2026 is genuinely massive. If you hit the bad experience first and walked away, you should probably revisit it now. I know demos that happen at conferences aren’t likely to convince you. Neither is a well-produced YouTube video. But get your hands dirty and use it; otherwise the lag is going to send you, or the group you work with, into Luddite land and get you made “redundant,” as they say in some places (i.e. laid off, unemployed, etc).

That’s one of the core things I try to do when working with teams: get past the “is this real” phase as fast as possible by showing it working on their actual code. Because abstract capabilities don’t move people. Seeing an agent navigate through a service you wrote, flag something you’d have missed, and propose a coherent refactor in thirty seconds — that moves people.

The security lockdown problem is also real but I want to name it more precisely: what organizations actually mean when they say “we can’t use AI tools on our codebase” is usually “we haven’t yet done the work to understand what the actual risk surface is.” Which is a very different thing than a reasoned security position. It’s a deferral masquerading as a decision. And that deferral has a cost that most orgs aren’t properly accounting for on their risk register.

The Vibe Coding Trap Is Real Too

Here’s where I’ll add some nuance that doesn’t always make it into the “you should adopt this” framing: adoption without discipline is its own problem.

There’s a Stanford study that just landed, SWE-chat, looking at 6,000 real coding agent sessions from open-source developers in the wild, and the numbers are sobering. (I have a lot of frustration with how often it gets misinterpreted; see, for example, the comment I left on the post.) Only 44% of agent-produced code makes it into commits. Vibe-coded sessions (where the agent authors virtually all the code) burn roughly 3x more tokens and dollars per committed line than collaborative sessions. And vibe-coded code introduces about 9x more security vulnerabilities per committed line than code humans write themselves.

I’ve talked about this at length in Hurting or Helping Devs and in my breakdown of the types of code changes that AI agents produce. The tools are genuinely powerful. They’re also genuinely capable of quietly reshaping a codebase into something that looks correct but behaves like it was written by an overly confident intern with root access and no fear of consequences.

The right model isn’t “hand everything to the agent.” It’s orchestration with discipline – scoped prompts, diff limits, human gatekeeping on production deployments, and a clear-eyed understanding of where agent judgment is trustworthy and where it isn’t. That’s a craft skill that takes time to develop. It can’t be skipped.
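Diff limits are the easiest of those controls to automate, so here is a minimal sketch of what such a gate could look like. The function names and the 50-line default are my own illustrative assumptions, not something a particular tool ships with, and the parsing is deliberately naive: it reads a plain unified diff and would miscount a removed line whose own content starts with "--".

```javascript
// Count added and removed lines in a unified diff.
// Assumption: input is standard unified diff text (e.g. from `git diff`).
function countChangedLines(diffText) {
  let added = 0;
  let removed = 0;
  for (const line of diffText.split(/\r?\n/)) {
    // "+++ b/file" and "--- a/file" are file headers, not content changes.
    // Known limitation: a removed line whose content begins with "--"
    // also looks like "---" and is skipped by this naive check.
    if (line.startsWith("+++") || line.startsWith("---")) continue;
    if (line.startsWith("+")) added += 1;
    else if (line.startsWith("-")) removed += 1;
  }
  return { added, removed, total: added + removed };
}

// Hypothetical gate: the default limit of 50 changed lines is an assumption.
function checkDiffLimit(diffText, maxChangedLines = 50) {
  const counts = countChangedLines(diffText);
  return { ...counts, withinLimit: counts.total <= maxChangedLines };
}
```

A check like this could run before an agent's change is handed to a human reviewer. The number itself matters less than having a hard stop that forces oversized changes to be split.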

So I’m not just saying adopt. I’m saying adopt correctly with discipline, which is harder, takes longer, and requires more deliberate investment. But the teams doing it right are building a durable advantage. The teams doing it sloppily are creating technical debt at a rate that will bite them in ways that are currently hard to see.

The Compounding Problem

Here’s the thing about compounding gaps that I keep coming back to: they don’t feel urgent when you’re inside them.

If your team is shipping at more or less the same pace it shipped at a year ago, nothing feels broken. Nothing is on fire. You’re not obviously behind. The gap is invisible to you because the other side of it isn’t in your day-to-day view.

But a team that’s been running agentic workflows for six months has built intuition, muscle memory, and workflow patterns that can’t be copied in a week. They’ve figured out what to scope, what to constrain, where to trust, and where to verify. They’ve failed in some interesting ways and learned from it. They’re operating at a different surface area of the problem than a team that’s starting from scratch — even if both teams have access to the same models and tools.

That’s the part that concerns me most when I work with orgs that are still in “evaluation mode” two years into this transition. The tools aren’t the moat. The practice is the moat. And practice requires time.

The clock is running.

What Needs to Happen

Brendan is optimistic about the diffusion curve tipping soon, and I think that’s probably right. The on-ramp needs to get lower – better model-agnostic tooling, less lock-in, less requirement to reconstruct your entire workflow to get started. Those are the right levers.

But I’d add one more: organizations need someone to physically show them what the other side looks like, in their context, with their problems. Not a demo environment. Not a benchmark. Their actual code. Their actual team. That’s the thing that moves the needle from “heard about it” to “we’re doing this.”

If you’re at a company still sitting on the sidelines on this – not because of a reasoned, deliberate decision, but because it hasn’t risen to the top of the priority stack yet – I’d genuinely encourage you to treat that as a risk, and a significant one at that. Not a vague future risk. A present, compounding one.

The teams on the other side of that gap are not slowing down.

Hurting or Helping Devs?

This video features me, a Principal Software Engineer, discussing the impact of AI on software development, the risks of “vibe coding,” and the necessary shifts in engineering practices. Adron argues that traditional manual coding is becoming obsolete and that developers must adapt to a new paradigm defined by systems thinking and AI orchestration.

Key Takeaways:

The Dangers of “Vibe Coding”: Adron defines “vibe coding” as the practice of relying on AI to generate code without a deep understanding of the system (0:08:31). This often leads to unmaintainable, “disposable” software—a phenomenon he calls the shinification of software—which can cause significant production issues when systems fail (0:00:46, 0:08:31).
Managing AI Agents: To maintain code quality, developers must:
- Rein in Scope: Avoid open-ended prompts; instead, provide specific, well-defined architectural plans to AI agents (0:05:13, 0:06:01).
- Diff Discipline: Enforce hard limits on diff sizes (e.g., aiming for ~50 lines) to ensure human reviewers can feasibly audit changes (0:52:37, 0:55:00).
- Human Gatekeeping: Keep humans as the final gatekeepers for production deployments to ensure security and reliability (0:16:50, 0:57:29).
The Evolution of the Developer Role: The junior pipeline is changing; instead of focusing on syntax or pixel-pushing, future developers should act as systemic architects who understand how to orchestrate AI tools and manage complex workflows (0:24:05, 0:26:05).
The Industry Reckoning: As VC-subsidized AI adoption faces future economic corrections, companies will need to prioritize efficiency, energy production, and true orchestration over simply generating massive amounts of code (1:02:41, 1:05:00).
Future Predictions: Adron predicts that AI will eventually develop its own programming language optimized for machine-to-machine communication, further distancing development from manual human typing (1:09:48).

In this episode, you’ll learn:

Why writing code manually means you are already too far behind.
How to manage the six specific types of AI code changes.
The reason Diff Discipline is the only way to survive vibe coding.

Time Sliced Segments

(03:14) Why the junior developer pipeline is imploding
(05:13) How to rein in agent scope for better results
(08:31) The slow creeping dread of vibe coding
(12:50) Moving past communication cycles with prototypes
(16:50) Why shipping to production needs a human gatekeeper
(20:20) How roles shift when agents handle the workflow
(24:05) Why slinging individual lines of code is over
(29:47) Bringing a generalist approach back to computer science
(34:57) Breaking down the six types of code changes
(41:40) Why AI optimizes for plausible output instead of correctness
(52:37) Enforcing diff limits to keep human reviewers sane
(57:29) Setting up no-fly zones for sensitive code
(01:02:41) The coming hundred x shock to the tech industry
(01:11:27) What it means to be a coder in 2026

Security Was Already a Mess. Generative AI Is About to Prove It.

I was thinking about some of the points from the Polyglot Conf list of predictions for Gen AI, titled “Second Order Effects of AI Acceleration: 22 Predictions from Polyglot Conference Vancouver”. One thing that stands out to me, and I’m sure many of you have read about this scenario, is misplaced keys, tokens, passwords, usernames, or whatever other security collateral gets left in a repo. It’s been such an issue that orgs like AWS have set up triggers so that when they find keys on the internet, they trace them back and try to alert their users (i.e. if a user of theirs has stuck account keys in a repo). It’s wild how big of a problem this is.

Once you’ve spent any serious amount of time inside corporate IT, you eventually come to a slightly uncomfortable realization. Exponentially so if you focus on InfoSec or other security related things. Security, broadly speaking, is not in a particularly great state.

That might sound dramatic, but it’s not really. It is the standard modus operandi of corporate IT. The cost of really good security is too high for most corporations to focus where they should, and when corporations do focus on security they’ll often miss the forest for the trees. There are absolutely teams doing excellent security work, so don’t get the idea I’m saying there aren’t some solid people doing the work to secure systems and environments. There are some organizations that invest heavily in it. There are people in security roles who take the mission extremely seriously and do very good engineering.

A lot of what passes for security is really just a mixture of documentation, policy, and a little bit of obscurity. Systems are complicated enough that people assume things are protected. Access is restricted mostly because people don’t know where to look. Credentials are hidden in configuration files or environment variables that nobody outside the team sees.

And that becomes the de facto security posture.

Not deliberate protection.

Just… quiet obscurity.

I’ve lost count of the number of times I’ve been pulled into a system review, or some troubleshooting session, where a secret shows up in a place it absolutely shouldn’t be. An API key sitting in a script. A database password in a config file. An environment file committed to a repository six months ago that nobody noticed.

That sort of thing happens constantly. Not out of malice. Out of convenience. But now we’ve introduced something new into the environment.

Generative AI.

More importantly though, the agentic tooling built around it. Tooling that literally takes actions on your behalf. Tools that can read entire repositories, analyze logs, scan infrastructure configuration, generate code, and help debug systems in seconds. Tools that engineers increasingly rely on as a kind of external thinking partner while they work through problems.

All that benefit comes with AI tools. But think about what happens when a secret gets pasted into one of them. The AI doesn’t care about the secret. It’s just processing text. But the act of pasting it there matters. Because the moment that secret leaves your controlled environment, you no longer know exactly where it goes, how it’s stored, or how long it persists on the other side.

The mental model a lot of people are using right now is wrong. They treat AI like a scratch pad or an extension of their own thoughts.

It isn’t.

The more accurate model is this: an AI tool is another resource participating in your workflow. Another staff member, effectively.

Except instead of being a person sitting at the desk next to you, it’s a system operated by someone else, running on infrastructure you don’t control, processing information you send to it. Including keys and secrets.

Once you start looking at it that way, a few things become obvious. You wouldn’t casually hand a contractor your production API keys while asking them to help debug something. You wouldn’t drop a full .env file containing service credentials into a conversation with someone who doesn’t actually need those values.

Yet that is exactly the pattern that is quietly emerging with generative AI tools. Especially among new users of said tools! Developers paste configuration files, snippets of infrastructure code, environment variables, connection strings, and logs directly into prompts because it’s the fastest way to get an answer.

It feels harmless. But secrets have a way of spreading through systems once they start moving.

The real issue here is that generative AI doesn’t create security problems. It amplifies the ones that already exist. Problems that the industry has failed (miserably might I add) at solving. If an organization already has sloppy credential management, AI just gives those credentials another place to leak. If engineers already pass secrets around informally to get work done, AI becomes another convenient channel for that behavior.

And because AI tools accelerate everything, they accelerate the consequences too. What used to take hours of searching through documentation can now happen instantly. A repository full of configuration files can be analyzed in seconds. Systems that were once opaque are now far easier to reason about.

The Takeaway (Including secrets!)

The practical takeaway here isn’t that people should stop using AI tools. That’s not realistic and, frankly, a career-limiting maneuver at this point. The tools are genuinely useful and they’re going to become a permanent part of how software gets built.

What needs to change – desperately – is operational discipline.

Secrets should never be treated casually, and that includes interactions with generative systems. API keys, tokens, passwords, certificates, environment files, connection strings—none of those belong in prompts or screenshots or debugging sessions with external tools.

If you need to ask an AI for help, scrub the sensitive pieces first. Replace real values with placeholders. Remove anything that grants access to a system. Set up ignore rules for the env files and don’t let production env values (or vault values, whatever you’re using) leak into your Generative AI systems.
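As a rough illustration of "replace real values with placeholders," here is a small sketch that redacts every value in .env-style text before it goes anywhere near a prompt. The function name, the `<REDACTED>` placeholder, and the simple KEY=value format it assumes are all my own choices for the example. It is not a substitute for a real secret scanner, and it will not catch secrets embedded in other kinds of files.

```javascript
// Redact values in .env-style text, keeping the variable names so the
// structure of the file is still useful when asking for help.
// Assumption: one KEY=value pair per line, optionally prefixed by "export".
function redactEnv(envText) {
  // Also matches commented-out assignments like "# API_KEY=abc",
  // since stale secrets often linger in comments.
  const assignment = /^(\s*#?\s*(?:export\s+)?[A-Za-z_][A-Za-z0-9_]*\s*=)(.*)$/;
  return envText
    .split(/\r?\n/)
    .map((line) => {
      const match = line.match(assignment);
      if (!match) return line; // plain comments and blank lines pass through
      const [, prefix, value] = match;
      // Leave empty assignments alone; there is nothing to hide.
      return value.trim() === "" ? line : `${prefix}<REDACTED>`;
    })
    .join("\n");
}
```

Keeping the names and dropping the values usually preserves enough context for debugging, since the question is rarely about the secret itself.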

Treat every AI interaction the same way you would treat a conversation with another engineer outside your organization, or better yet outside the company (or Government, etc) altogether.

But not someone you’d hand the keys to the kingdom. Don’t hand them to your AI tooling either.

AI Is Forcing Docs To Finally Grow Up

For years we talked a big game about documentation being “a product” (which I just wrote about yesterday right here) but let’s be honest, most of the industry never treated it that way. Docs were usually the afterthought stapled onto the release cycle, the box to tick for PMs, the chore no one wanted but everyone relied on. Then generative AI rolled in and quietly exposed just how brittle most documentation is. Suddenly the docs that were just barely acceptable for humans became completely useless for LLMs. That gap is now forcing organizations to rethink how docs get written, structured, published, and maintained.

The shift is subtle but fundamental. We’re no longer writing solely for people and search engines. We’re writing for people, search engines, and AI models that read differently than humans but still need clarity, structure, and semantic meaning to deliver accurate results. This new audience doesn’t replace human readers, it simply demands higher quality and tighter consistency. In the process, it pushes documentation to finally become the product we always claimed it was.

Why AI Is Changing How We Write Docs

AI assistants (tooling/agents/whatever) like ChatGPT and Claude don’t “browse” docs. They parse them. They consume them through embeddings or retrieval systems. They chunk them. They analyze the relationships between sentences, headings, bullets, and examples. When a user asks an LLM a question, the model is leaning heavily on how well that documentation was written, how well it was structured, and how easily it can be transformed into a correct semantic representation.

When the docs are good, AI becomes the ultimate just-in-time guide. When the docs are sparse, meandering, inconsistent, or buried in PDFs, AI either hallucinates its way forward or simply fails. The AI lens exposes what humans have tolerated for years.

That is why companies are starting to optimize docs not only for readers and SEO crawlers, but for vector databases, RAG pipelines, and automated summarizers. The end result benefits everyone. Better structured content helps AI perform better and human readers navigate faster. AI becomes a multiplier for great doc systems and a harsh critic for bad ones.

What Makes Great Modern Documentation Now

Modern documentation can’t just be readable. It has to be machine digestible, SEO friendly, and human friendly at the same time. After picking through dozens of doc systems and tearing apart patterns in both good and terrible documentation, here is what consistently shows up in the good stuff.

The Criteria

1. Clear, hierarchical structure using consistent headings
2. Small, semantically meaningful chunks that can be indexed cleanly
3. Realistic examples, not toy snippets
4. Explicit pathfinding: quickstart, deeper guides, reference, troubleshooting
5. Direct language without fluff
6. Predictable URLs and logical navigation trees
7. Copy-pastable examples that actually work
8. Strong inbound and outbound linking
9. No PDF dumping ground
10. Schema, config, API, and CLI references that are complete, not partial
11. Contextual explanations right next to code samples
12. Versioning that doesn’t break links every release
13. Upgrade guides that don’t pretend breaking changes are rare
14. A single authoritative source of truth instead of fractured side systems
15. Accessible to LLMs: consistent formatting, predictable patterns, clean text, no wild markdown gymnastics

Nothing magical here. Most teams already know these rules. AI just stops letting you ignore them.
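To show why consistent headings and small chunks matter to a retrieval pipeline, here is a simplified sketch of heading-based chunking. It assumes the docs are Markdown with `#`-style headings, and the function name and output shape are hypothetical. Real RAG pipelines typically layer token limits, overlap, and metadata on top of something like this.

```javascript
// Split Markdown into chunks, one per heading, and label each chunk with
// its heading trail (e.g. "Guide > Install") so it keeps its context.
// Assumption: ATX headings ("# Title"); fenced code blocks are left intact.
function chunkByHeadings(markdown) {
  const path = []; // current heading trail, indexed by heading level - 1
  let current = { headingPath: "", text: "" }; // text before the first heading
  const chunks = [current];
  let inFence = false;

  for (const line of markdown.split(/\r?\n/)) {
    // Toggle on code fences so "# comment" lines inside code are not headings.
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const heading = inFence ? null : line.match(/^(#{1,6})\s+(.+?)\s*$/);
    if (heading) {
      const level = heading[1].length;
      path.length = level - 1; // drop sibling and deeper headings
      path[level - 1] = heading[2];
      current = { headingPath: path.filter(Boolean).join(" > "), text: "" };
      chunks.push(current);
    } else {
      current.text += line + "\n";
    }
  }
  // Discard headings that had no body text under them.
  return chunks
    .map((c) => ({ headingPath: c.headingPath, text: c.text.trim() }))
    .filter((c) => c.text.length > 0);
}
```

When headings are inconsistent or skipped, the heading trail attached to each chunk degrades with them, which is one concrete way messy structure turns into worse retrieval.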

Five Examples Of Documentation That Nails It

Below are five strong documentation ecosystems. Each one does something particularly well and gives AI models enough structure to be genuinely useful when parsing or answering questions. I’ll break down why each works and how it maps to the criteria above.

1. Stripe API Docs

https://stripe.com/docs/api

Stripe has been the gold standard for a while. Even after dozens of competitors tried to clone the style, Stripe still leads because they iterate constantly and keep everything ruthlessly consistent.

Why it’s great
• Every endpoint is its own semantic block. LLMs love that.
• Request and response examples are always complete, never partial.
• Navigation is predictable and deep linking is stable.
• They pair conceptual docs, quickstarts, and reference material without overlap.
• All examples are real world and cross language.

How it maps to the criteria
• Structured headings and deep linking check 1, 6, and 12.
• Chunking and semantic units check 2 and 15.
• Real examples and direct language check 3 and 5.
• Pathfinding is excellent which checks 4.
• Copy-pasteable working examples check 7.

2. MDN Web Docs

https://developer.mozilla.org

MDN has decades of content, but it’s shockingly consistent, well-maintained, and semantically structured. It’s one of the best corpora for training and grounding AI models in web fundamentals.

Why it’s great
• Long history yet content stays current.
• Clear separation of reference vs guides vs tutorials.
• Canonical examples for everything the web platform offers.
• Clean, predictable markdown structure across thousands of pages.

How it maps
• Nearly perfect hierarchy and predictable formatting check 1 and 15.
• Chunked explanations with immediately adjacent examples check 2 and 11.
• Stable URLs for almost everything check 6 and 12.
• Strong pathfinding check 4.

3. HashiCorp Terraform Docs

https://developer.hashicorp.com/terraform/docs

Terraform’s documentation is extremely structured which makes it exceptionally machine readable.

Why it’s great
• Providers, resources, and data sources follow identical templates.
• Every argument and attribute is listed with exact behavior.
• Examples aren’t fluff, they reflect real infrastructure patterns.
• Cross linking between providers and core Terraform concepts is tight.

How it maps
• The template system hits 1, 2, 6, 10, 11, and 15.
• Cross linking and clear navigation cover 8.
• Complete reference material covers 10.
• Realistic examples check 3 and 7.

4. Kubernetes Documentation

https://kubernetes.io/docs/home

Kubernetes docs are huge, maybe too huge, but they’re structured well enough that LLMs and humans can still navigate them without losing their minds.

Why it’s great
• Strong concept guides and operator manuals.
• Structured task pages with prerequisites and step-by-step clarity.
• Reference pages built from source-of-truth schemas.
• Thoughtful linking between concepts and tasks.

How it maps
• Strong hierarchy and navigation hit 1 and 6.
• Machine readable chunks via consistent template patterns hit 2 and 15.
• Clear examples and commands check 3 and 7.
• Having both reference and conceptual breakdowns checks 4, 10, and 11.

5. Supabase Docs

https://supabase.com/docs

Supabase’s docs are modern, developer-focused, and written with obvious attention to how AI and search engines consume content. They basically optimized for RAG without ever claiming they did.

Why it’s great
• APIs, client libraries, schema definitions, and guides all interlink tightly.
• Clear quickstarts that become progressively more advanced.
• Rich examples spanning REST, RPC, SQL, and client SDKs.
• Consistent layouts across different product surfaces.

How it maps
• Strong pathfinding and multi-surface linking check 4 and 8.
• Full reference material checks 10.
• Predictable structure and formatting check 1 and 15.
• Example-rich guides check 3, 7, and 11.

Documentation Is Finally Being Treated As A Real Product

The interesting thing is that AI didn’t magically fix documentation. It simply raised expectations. Companies now need their documentation to be clean, complete, structured, predictable, link-friendly, example-rich, and semantically coherent because that is the only way AI can navigate it and support users in meaningful ways. This pressure is good. It forces consistency. It rewards clarity. It makes the entire documentation discipline more rigorous.

The companies that embrace this will have far better support funnels, drastically fewer user frustrations, higher product adoption, and an ecosystem that AI can actually help with instead of stumbling through. The ones that don’t will keep wondering why users stay confused and why their AI chatbots give terrible answers.

Documentation has always been a product. AI is just the first thing that has held us accountable to that truth.

VS Code & Copilot: The Chat-First Spec Definition Method

My initial review of Copilot and getting started is available here.

(What it does well – and why it’s not magic, but almost!)

Let me be clear: the Copilot Chat feature in VS Code can feel like a miracle until it’s not. When it’s working, you fire off a multi-line prompt defining what you want: “Build a function for X, validate Y, return Z…” and boom, VS Code’s inline chat generates a draft that is scary good.

What actually wins: It interprets your specification in context – your open file, project imports, naming conventions – and spits out runnable sample code. That’s not trivial; reputable models often lose context threading. Here, the chat lives in your editor, not detached, and that nuance matters.
It’s like sketching the spec in natural language, then having VS Code autocomplete not just code but entire behavior.
What you still have to do: Take a breath, a le sigh, and read it. Always. Control flow, edge cases, off-by-one errors: Copilot doesn’t care. Security? Data leakage? All on you. Copilot doesn’t own the logic; it just stitches together patterns it’s seen. You own the correctness.
Trick that matters: Iterate. Ask follow-up: “Okay, now handle invalid inputs by throwing InvalidArgumentException,” or “Refactor this to async/await.” Having a chat continuum in the editor is powerful but don’t forget it’s your spec, not the AI’s.

Technique 2: Prompt With Skeleton First

Skip blindly describing behavior. Instead, scaffold it:

// Function: validateUserInput
// Takes { name: string, age: number }
// Returns { valid: boolean, errors: string[] }
// Edge cases: missing name, non-numeric age

function validateUserInput(input) {
  // ...
}

Then let Copilot fill in the body. Why this rocks:

You’re giving structure: types, return shapes, edge conditions.
The code auto-generated fits into your skeleton, adhering to your naming, your data model.
You retain control over boundaries, types, and structure even before Copilot chimes in.

Downside? If your skeleton is misleading or incomplete, Copilot will “fill in” confidently, in code that compiles but does the wrong thing. Again, your code review has to rule.
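For illustration, here is one body that could plausibly come back for the skeleton above. This is my own hand-written sketch, not actual Copilot output, and the error message strings are assumptions. The comments mark the spots I’d check in review.

```javascript
// Function: validateUserInput
// Takes { name: string, age: number }
// Returns { valid: boolean, errors: string[] }
// Edge cases: missing name, non-numeric age
function validateUserInput(input) {
  const errors = [];
  // Review point: the skeleton never said what happens when input itself
  // is null or undefined, so this sketch treats it as an empty object.
  const { name, age } = input ?? {};

  // Missing name: also rejects non-strings and whitespace-only strings.
  if (typeof name !== "string" || name.trim() === "") {
    errors.push("name is required");
  }

  // Non-numeric age: Number.isFinite does no type coercion, so it rejects
  // strings like "42" as well as NaN and Infinity.
  if (!Number.isFinite(age)) {
    errors.push("age must be a number");
  }

  return { valid: errors.length === 0, errors };
}
```

Notice what the skeleton left open: negative ages, fractional ages, and maximum name length are all accepted here because nothing in the comments ruled them out. That is the kind of gap a generated body will quietly paper over.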

Technique 3: In-Context Refactoring Conversations (AKA “Let me fix your mess, Copilot”)

Ever accepted a Copilot suggestion, then hated it? Instead of discarding, turn on Copilot Chat:

Ask it: “Refactor this to reduce nesting and improve readability,” or “Convert this to use .reduce() instead of .forEach().”
Watch it rewrite within the same context, not tangential code thrown at you.

That’s one of its massive values – context-aware surgical refactoring – not blanket “clean this up” that ends in a different variable naming scheme or method order from your repo.

The catch: refactor prompts depend on Copilot’s parsing of your style. If your code is sloppy, it’s going to be sloppily refactored. So yes, you still have to keep code clean, comment clearly, and limit complexity. Copilot is the editor version of duct tape, not a refactor wizard.
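To make the “.reduce() instead of .forEach()” request concrete, here is the shape of a before and after I’d expect from that kind of prompt. Both functions and the item fields (`category`, `amount`) are made up for the example; the point is that the refactor should keep names and behavior and change only the mechanics.

```javascript
// Before: forEach mutating an accumulator declared outside the callback.
function totalsByCategoryForEach(items) {
  const totals = {};
  items.forEach((item) => {
    if (totals[item.category] === undefined) {
      totals[item.category] = 0;
    }
    totals[item.category] += item.amount;
  });
  return totals;
}

// After: the same logic expressed with reduce, accumulator passed explicitly.
function totalsByCategoryReduce(items) {
  return items.reduce((totals, item) => {
    totals[item.category] = (totals[item.category] ?? 0) + item.amount;
    return totals;
  }, {});
}
```

A quick way to review a refactor like this is to run both versions against the same input and compare the results before deleting the old one.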

The Brutal Truth

VS Code + Copilot isn’t a magical co-developer. It’s a smart auto-completer with chat, living in your IDE, context-aware but utterly obedient to your prompts.
The trick is not the AI; it’s how you lead it. The better your spec, skeleton, or prompt, the better your code.
Your style (skeptical, questioning, pragmatic) fits perfectly. You don’t let it ride; you interrogate. And that’s exactly how it should be.

TL;DR Summary

Technique	What Works	What Fails Without You
Chat-first spec	Detailed natural-language spec → meaningful code	No spec clarity → garbage logic
Skeleton prompts	Provides structure, types, expectations	Bad skeleton = bad code, fast
In-editor refactoring chat	Context-preserving improvements	Messy code → messy refactor

If you want more detail on integrating Copilot into CI, or on personal prompt templates, drop me a request below and I’ll tackle it head-on next time.
