Vibe coding tools

January 27, 2026

A lot of people whom I know professionally and respect deeply have not yet had a good experience getting real programming work done with LLM assistance, whether through tools like Claude Code or Cursor or just through chatting with ChatGPT, Claude, Gemini, or whatever. This includes folks who are definitely better programmers than me, some with more experience and some with less. I have been able to get real things built and do real software engineering work with these tools. I hope this post helps those folks get unstuck if they want to.

You can’t just vibe code

The phrase vibe coding comes from Andrej Karpathy in February 2025, and describes letting your AI coding assistant just write a bunch of code that you then won’t bother to review or understand, but will probably just test for basic correctness and then call it a day. But you can use these things like a real programmer, with a real software development lifecycle. It’s great!

If you aren’t a programmer, then this is pretty much what you already do when you ask programmers to write something for you. “Hey nerd, I need a thing,” you might say, and then the nerd builds the thing, and you don’t really look at the code or understand it because it’s not worth your time. This is fine.

If you are a programmer, then this is still what you already do when you write code. You rely on the Python or Java runtime to do garbage collection for you, or you rely on malloc and free to do what they say on the tin without actually reading your operating system’s implementation of those routines. Even if you do that kind of kernel hacking, odds are good that you expect CPUs and DRAMs and PCI buses and USB chips to do what they say on the tin, and that you don’t do any analytical materials science or microscopy work to see how the photolithography and the ion implantation were done.

So we want to be able to treat an LLM coding assistant as just another layer in the stack, just another tool. Maybe you can think of pure YOLO-mode vibe coding as the equivalent of rolling your head across the keyboard in VS Code until your pyunit tests pass. It might work, but it’ll take a while and probably won’t be exactly what you meant. For anything nontrivial you’re better off actually thinking for yourself. Here are some ways I’ve found that help.

Aside: One-shotting things actually does work for trivial stuff

For non-trivial things you still have to understand what’s going on so you can test and maintain it, or debug it, or perceive when the one-shot vibe coded solution fails. But for trivial stuff it’s great. It’s “autocomplete” in Cursor and Windsurf, and it’s asking simple questions of the AI chatbot.

Last year at work (pre-sabbatical) I had a Spring Boot app where I wanted to have two different URL paths map to the same function, but my code had red squiggles under it and the thing wouldn’t compile.

    @RestController
    public class BlogController {

        @GetMapping("/posts")    // red squiggles here
        @GetMapping("/blog")     // red squiggles here
        public String blogIndex() {
            return "This is the blog index";
        }
    }

It gave me errors along the lines of “Can’t have two of these” which is total nonsense because this is exactly what you’d do in Flask in Python. In theory I could have found a bunch of (maybe current?) documentation or run a bunch of spammy web searches or interrupted a colleague and ruined their momentum for the afternoon, but instead I asked Cursor to one-shot the problem by typing:

    make this not be an error

and the code was instantly rewritten with the correct syntax:

    @RestController
    public class BlogController {

        @GetMapping({"/blog", "/posts"})  // I mean, sure, this is also what I mean
        public String blogIndex() {
            return "This is the blog index";
        }
    }

I have neither curiosity nor professional interest in understanding why this is the right syntax for a multi-path GetMapping annotation in Java code using the Spring Boot framework. My AI coding assistant one-shotted the problem for me, and I went on with the coding work that addressed the problem I was actually trying to solve. My concentration remained unbroken. I didn’t even have to formulate a coherent question. I didn’t even have to switch windows.

This “do what I mean without breaking my concentration” party trick is a trivial but real improvement in software tooling, even if not billions of dollars worth of benefit.
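If you are curious anyway: Java only lets an annotation appear twice on the same method when that annotation is declared repeatable, and GetMapping is not. Its path attribute is an array of strings instead, so one annotation can carry several paths. Here is a minimal sketch of the same mechanics without Spring. Route, MultiPathDemo, and pathsFor are hypothetical names invented for this illustration; they are not Spring APIs, and real Spring does far more than this at startup.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

public class MultiPathDemo {

    // Hypothetical stand-in for @GetMapping: a single annotation whose value is an array.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Route {
        String[] value();
    }

    static class BlogController {
        // Writing @Route twice here would not compile, because Route is not @Repeatable.
        @Route({"/blog", "/posts"})
        public String blogIndex() {
            return "This is the blog index";
        }
    }

    // Read the declared paths back via reflection, roughly as a framework might.
    static List<String> pathsFor(Class<?> controller, String methodName) {
        try {
            Method method = controller.getDeclaredMethod(methodName);
            Route route = method.getAnnotation(Route.class);
            if (route == null) {
                return List.of();
            }
            return Arrays.asList(route.value());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No method named " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Prints both paths registered for the one handler method.
        System.out.println(pathsFor(BlogController.class, "blogIndex"));
    }
}
```

The point of the anecdote stands either way: you can ship the fix without knowing any of this.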

Making the billions of dollars in tooling shine

Your experience as a software engineer is exactly what can cause these tools to sing.

Tell it to follow a normal software development lifecycle. A good but very long example is all the things you find in the Google SWE Book. A better and shorter example is “spec-driven development”, for which GitHub publishes a toolkit called spec-kit; the version I’ve been using is called Superpowers.

Start by reading the Superpowers repo. It’s almost entirely English text in Markdown files. It’s a tree of prompts, made available to a coding agent (like Claude Code) which is then directed to obey them. There are a few short scripts alongside some of the skills/prompts (so that you can speed things up and save some tokens instead of asking the LLM to one-shot some bash command).

Veterans of not just Google-sized organizations but really any places with more than a dozen or two programmers will recognize the pattern right away:

requirements,
then a detailed engineering design,
then a step by step implementation plan,
then execution of the plan.

It can feel like an annoyingly heavy amount of process when it’s all people driven (and definitely when you have to argue about it at every step with skeptical or overenthusiastic colleagues). But it’s super fast and useful with modern AI coding assistants.

The first bit of magic is that the first two steps can be massively improved by an LLM-driven Socratic dialog with you about what you really want, and what implementation choices make sense to you, and how to weigh engineering design tradeoffs. You need to be alert and engaged here, and need to be able to think clearly. Writing clearly helps a great deal, though I’d argue that if you can’t write clearly then you may never have thought clearly to begin with.

The second bit of magic is that once the detailed engineering design is done, the tree of skills under “go implement this now” can run without your intervention. These skills (prompts) break down the implementation plan into steps, each of which gets its own detailed LLM-generated instructions—more prompts for a different independent LLM agent to execute in its own context—including using skills like “Test Driven Development” and “Independent Code Review” and “Use a Task List” and “Write Clearly”. Because of this hierarchical task breakdown and automatable checks for whether things are working, the “execution of the plan” bit can be done (if you like) without human intervention. The result is (or should be) a working feature branch that you can test out, ask for improvements to (I usually find things that need fixing before release), and then merge if you like. (It’s possible to have the thing run autonomously on your laptop; but you should probably spin up a disposable VM workspace instead. Claude Code and probably all the others have this ability baked into their web versions.)
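To make the shape of that loop concrete, here is a toy model of "break the plan into steps, hand each step to a fresh agent, and gate it on an automated check". This is not how Superpowers is implemented (it is mostly Markdown prompts, as noted above). PlanRunner, Step, Result, and the stand-in agent and check are all hypothetical names made up for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

public class PlanRunner {

    // One unit of the implementation plan, with its own generated instructions.
    record Step(String name, String instructions) {}

    // What happened when a step was attempted.
    record Result(String stepName, boolean passed, int attempts) {}

    // Run each step through an agent, retrying until the automated check passes
    // or the attempt budget runs out. Stops at the first step that never passes,
    // which is where a human would take over.
    static List<Result> run(List<Step> plan,
                            Function<Step, String> agent,
                            Predicate<String> check,
                            int maxAttempts) {
        List<Result> results = new ArrayList<>();
        for (Step step : plan) {
            boolean passed = false;
            int attempts = 0;
            while (!passed && attempts < maxAttempts) {
                attempts++;
                String output = agent.apply(step); // each call stands for a fresh agent context
                passed = check.test(output);       // e.g. "did the test suite pass?"
            }
            results.add(new Result(step.name(), passed, attempts));
            if (!passed) {
                break;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Step> plan = List.of(
            new Step("write failing test", "Add a test for the new endpoint."),
            new Step("implement feature", "Make the new test pass."));

        // Stand-in agent: fails its first attempt at each step, then succeeds.
        Map<String, Integer> calls = new HashMap<>();
        Function<Step, String> agent = step ->
            calls.merge(step.name(), 1, Integer::sum) < 2 ? "tests failed" : "tests passed";

        run(plan, agent, output -> output.equals("tests passed"), 3)
            .forEach(System.out::println);
    }
}
```

The design choice worth noticing is that the check is a plain yes or no that a machine can evaluate. That is what lets the execution phase run unattended, and it is also why the run halts rather than pressing on when a step keeps failing.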

Evidence that this works

You should have high expectations of people (like me!) who say that this is a better way to work. If it really is, then average programmers who can’t complete side projects should suddenly start finishing things. Excellent programmers should suddenly become visibly more productive. If slop isn’t a blocking problem, then it should be possible (not automatic) for a project to move quickly without code quality crashing through the floor.

In three months of holidays and using this stuff in mild anger, I personally have cranked out (or at least made noticeable progress on) the following side projects that definitely were not moving quickly before AI coding assistance:

a fix to a keybinding bug in version 1.2 of ghostty (which ended up being a dupe of this conversation) even though I don’t know Zig.
docker-compose, hugo, and caddy magic for this space that you’re reading (which I could have figured out unassisted, but it would have taken way, way more than ninety minutes)
a macOS menu bar app that alerts me (and emails me) when aurora borealis might be visible in Massachusetts
a commuter rail schedule app with exactly the data I want for my personal commute (in Phoenix LiveView, which I learned a bit of during this project)
fancy network code to push data from my in-car speedometer to my laptop (in Swift, which I don’t actually know)
serious improvements to reliability and observability in my twenty-year GPS archive, soon to be written up here
stealth mode prototypes in the space of data quality and analytics
the ad blocker and focus app soon to be described elsewhere in this space (and I’m not really a macOS programmer)

Better programmers, meanwhile, are doing a bunch of LLM coding things. Just off the top of my head:

antirez (author of Redis) ported a 700-line library from C to Rust, and detected and fixed a bunch of bugs in Redis.
simonw (Python board member, co-creator of Django, creator of Datasette) has made hundreds of little tools and has a very active blog full of intelligent stuff you should read about AI. And a ton of Datasette work.
Y Combinator has claimed that a quarter of its Winter 2025 class is vibe coding everything. Wait, is that hype?
Andrej Karpathy himself sheepishly admits that he’s recently cut over to more of this coding style; long analysis there. He observes that this cutover has already been done by some small double digit percentage of programmers out there. It seems I may be one of them.
mitchellh is enjoying retirement from HashiCorp by writing ghostty and running coding agents for stuff. (Added Feb 2026)

January 2026 reading list:

superpowers
skills
spec-kit
Just make it keep track of its work: read and write markdown files, tell it to run simple scripts you provide. Just breaking things into a todo list helps a lot.
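
As a minimal sketch of the "simple scripts you provide" idea, here is a tiny helper that treats Markdown checkboxes as the todo list: it finds the next unchecked item and ticks one off. The checkbox convention and the names TodoFile, nextOpenTask, and markDone are assumptions made for this example, not something any of the tools above require.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TodoFile {

    // Matches Markdown task items such as "- [ ] add tests" or "* [x] write spec".
    private static final Pattern ITEM = Pattern.compile("^\\s*[-*] \\[( |x|X)\\] (.+)$");

    // Return the text of the first unchecked item, if there is one.
    static Optional<String> nextOpenTask(List<String> lines) {
        for (String line : lines) {
            Matcher m = ITEM.matcher(line);
            if (m.matches() && m.group(1).equals(" ")) {
                return Optional.of(m.group(2).trim());
            }
        }
        return Optional.empty();
    }

    // Return a copy of the lines with the first matching open item checked off.
    static List<String> markDone(List<String> lines, String task) {
        List<String> updated = new ArrayList<>(lines);
        for (int i = 0; i < updated.size(); i++) {
            Matcher m = ITEM.matcher(updated.get(i));
            if (m.matches() && m.group(1).equals(" ") && m.group(2).trim().equals(task)) {
                updated.set(i, updated.get(i).replaceFirst("\\[ \\]", "[x]"));
                break;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "# Plan",
            "- [x] write spec",
            "- [ ] add tests",
            "- [ ] implement");

        String next = nextOpenTask(lines).orElse("(nothing left)");
        System.out.println("Next: " + next);
        markDone(lines, next).forEach(System.out::println);
    }
}
```

In practice you would read and write a real file (for example with java.nio.file.Files.readAllLines and Files.write) and tell the agent to run the script before and after each step, so its progress lives in the file rather than in the conversation.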

Hopefully this is all helpful to someone. Drop me a line if you want me to show you what I think I know about how to use this stuff.