Karpathy Says Vibe Coding Is Passé. He's Calling It 'Agentic Engineering' Now.

Andrej Karpathy coined 'vibe coding' in February 2025 in a single tweet that launched an industry category. Now, roughly 15 months later, Karpathy has published a new framing that he says supersedes it: 'agentic engineering.' In a long-form essay published this week, Karpathy argues that vibe coding, defined by its improvisational, natural-language-first, 'just vibe with the AI' ethos, was appropriate for the tool capabilities of early 2025 but has been outgrown by what current AI systems can actually do. 'Agentic engineering,' in Karpathy's framing, is what vibe coding matures into when you take it seriously: systematic, specification-driven, AI-assisted development with genuine engineering discipline applied to what agents can and can't be trusted to do autonomously. This post unpacks what Karpathy actually said, how it differs from vibe coding, and what the evolution means for how you should be working in 2026.

What You'll Learn

You'll understand:

- Exactly what Karpathy means by 'agentic engineering' and how it differs from his original vibe coding definition
- Why Karpathy thinks the improvisational vibe coding approach has reached its limits, and what evidence he cites
- The practical workflow differences between vibe coding and agentic engineering
- Whether this is a genuine paradigm shift or a rebranding of the same practice
- How to adapt your current vibe coding workflow toward agentic engineering principles without starting over

What Karpathy Actually Said

Karpathy's essay (published on his blog and cross-posted to X) is careful to acknowledge vibe coding's role while arguing it's being superseded:

'Vibe coding served its purpose. It lowered the activation energy to build software with AI dramatically. It made the tools accessible and the practice fun. But the tools have grown up. Agents can now run for hours, modify hundreds of files, execute tests, read documentation, and self-correct. The 'just vibe with it' relationship to an AI is insufficient for that level of autonomy. You need engineering judgment to be the load-bearing element, not the vibe.' — Andrej Karpathy, May 2026

His core argument has three parts:

Part 1: Vibe coding's improvisational ethos was appropriate for early 2025 tools.

In early 2025, AI coding tools were still primarily single-turn code generators. You prompted, you got code, you reviewed it, you adjusted. The feedback loop was tight and human-in-the-loop. 'Vibing with the AI' meant staying curious, trying things, accepting imperfection, and iterating quickly. The ethos matched the tool:

Vibe coding (early 2025 tool capabilities):
├── Tool: Single-turn code generation + chat
├── Autonomy level: Low — AI generates, human reviews each step
├── Error surface: Small — one file or function at a time
├── Required human role: Creative direction + acceptance judgment
├── Appropriate mindset: Exploratory, improvisational, low-friction
└── 'Just vibe with it' worked because: errors were visible and small-scope

Part 2: 2026 agents require a different human role.

Background agents running for hours on multi-file tasks create a fundamentally different risk surface. When an agent runs 200 tool calls across 50 files while you sleep, 'vibing with the output' in the morning is insufficient oversight:

Agentic engineering (2026 tool capabilities):
├── Tool: Persistent background agents, parallel execution, multi-hour runs
├── Autonomy level: High — agent plans and executes entire feature implementations
├── Error surface: Large — agent can introduce subtle bugs across many files,
│   write code that tests don't catch, make architectural decisions at scale
├── Required human role: Specification author + system boundary designer
│                        + acceptance criteria enforcer + architectural judge
├── Appropriate mindset: Systematic, specification-driven, risk-aware
└── 'Just vibe with it' fails because: errors are large-scope and not immediately visible

Part 3: The new required skills are engineering skills, not vibe skills.

Karpathy argues that the skills that make someone effective with 2026-level agents are classical software engineering skills, applied to the human-AI boundary:

Karpathy's 'agentic engineering' required skills:

1. Specification writing:
   'Can you write a precise, unambiguous spec that an agent can execute
   without mid-task clarification?' — this is a hard engineering skill.

2. Exit criteria definition:
   'Can you define what 'done' looks like in testable terms before
   the agent starts?' — this is test engineering applied to agent tasking.

3. Trust boundary design:
   'Can you identify which decisions should be delegated to the agent and
   which require human judgment?' — this is systems design applied to AI.

4. Output verification:
   'Can you efficiently verify that a 200-file agent changeset is correct
   and safe to merge?' — this is code review at scale.

5. Cognitive debt management:
   'Can you maintain architectural understanding of a codebase that an
   agent is evolving faster than you can manually review?' — this is
   a new challenge with no classical analog.

The Cognitive Debt Warning

The most provocative section of Karpathy's essay introduces the concept of 'cognitive debt' — his term for the gap between what an agent has built and what the human developer actually understands:

'The productivity gains from vibe coding are real. The cognitive debt it generates is also real, and it accumulates faster than anyone is acknowledging. You can vibe-code a 50,000 line codebase in six months that would have taken three years to write manually. But can you modify it safely? Do you understand its failure modes? Can you onboard a new engineer to it? Cognitive debt is vibe coding's dark side. Agentic engineering is the discipline that controls it.'

Cognitive debt: the gap between what AI built and what you understand

Manually-written codebase:
├── Understanding: ~95% (you wrote it)
├── Modification safety: High (you know the risks)
├── Onboarding: Difficult but possible (documentation + code reading)
└── Failure mode discovery: Gradual (errors appear as you add features)

Vibe-coded codebase (no agentic engineering discipline):
├── Understanding: ~40-60% (you directed it, AI built it)
├── Modification safety: Medium-low (subtle interactions you didn't design)
├── Onboarding: Very difficult (AI logic can be opaque)
└── Failure mode discovery: Sudden (complex agent-built interactions fail unexpectedly)

Agentic engineering approach:
├── Understanding: ~75-85% (you specified it precisely, agent executed)
├── Modification safety: Medium-high (spec-driven development leaves decision trail)
├── Onboarding: Moderate (specs + tests document intent)
└── Failure mode discovery: Controlled (exit criteria catch issues at build time)

Vibe Coding vs. Agentic Engineering: The Practical Workflow Differences

Karpathy's framing has concrete workflow implications:

Vibe Coding workflow:
1. Identify what you want to build
2. Describe it conversationally to the AI
3. Review output, give feedback, iterate
4. Accept when it 'looks right'
5. Move to next feature

Agentic Engineering workflow:
1. Write a precise specification:
   - What the feature does
   - Inputs, outputs, edge cases
   - Acceptance criteria (how will we verify it works?)
   - Architectural constraints (what must the agent NOT touch?)
2. Define exit criteria in testable terms:
   - Which tests must pass?
   - What does the system behavior look like when done?
3. Delegate to agent with explicit scope:
   - Which files can the agent modify?
   - Which decisions require human approval before proceeding?
4. Verify output against specification:
   - Run acceptance tests
   - Review architectural decisions made by the agent
   - Check for cognitive debt: does the agent's solution make sense?
5. Document the decision trail:
   - Record why the agent made the architectural choices it did
   - Update mental model of codebase
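
Part of steps 3 and 4 can be checked mechanically. The following is a minimal Python sketch, not something taken from Karpathy's essay. It assumes you can obtain the list of paths the agent changed (for example from your version control tool) and that you express the delegated scope as glob patterns. The function name `out_of_scope` and the example paths are hypothetical.

```python
from fnmatch import fnmatch


def out_of_scope(changed_files, allowed_patterns, off_limits_patterns=()):
    """Return the changed paths that fall outside the delegated scope.

    A path is a violation if it matches an off-limits pattern, or if it
    matches none of the allowed patterns. Note that fnmatch's `*` also
    matches `/`, so "src/billing/*" covers nested directories too.
    """
    violations = []
    for path in changed_files:
        allowed = any(fnmatch(path, p) for p in allowed_patterns)
        forbidden = any(fnmatch(path, p) for p in off_limits_patterns)
        if forbidden or not allowed:
            violations.append(path)
    return violations


# Hypothetical changeset: the agent was asked to touch billing code and tests only.
changed = [
    "src/billing/invoice.py",
    "src/auth/session.py",
    "tests/test_invoice.py",
]
violations = out_of_scope(
    changed,
    allowed_patterns=["src/billing/*", "tests/*"],
    off_limits_patterns=["src/auth/*"],
)
print(violations)  # ['src/auth/session.py']
```

A non-empty result does not mean the change is wrong, only that the agent made a decision outside the scope you delegated, which is the point at which human review is supposed to happen.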

The agentic engineering workflow is more upfront work per task — the spec step alone can add 30-60 minutes to a task that a vibe coder would start in 2 minutes. Karpathy argues the time investment pays off in two ways: the agent produces better output when given a precise spec (fewer iteration loops), and the cognitive debt generated is lower (the spec documents intent).

Is This a Real Paradigm Shift or a Rebrand?

The honest answer: it's both, and the distinction matters.

It's a genuine shift in required practice because background agents with 200+ tool call chains and multi-hour runtimes genuinely require more engineering discipline than single-turn code generation. The error surface is orders of magnitude larger. Vibe coding's improvisational, low-overhead approach was optimized for a tool capability that no longer represents the frontier.

It's also continuity because the best vibe coders were already applying these disciplines intuitively. The developers getting the most from Claude Code Routines and Cursor 3.0 parallel agents are already writing precise task specs, defining exit criteria, and reviewing agent diffs carefully before merging. Karpathy is naming and systematizing a practice that the top quartile of vibe coders had already evolved toward.

Who this matters most for:

Early adopters (already evolved):
├── Likely already doing agentic engineering intuitively
├── Karpathy's framing gives vocabulary for what's working
└── Value: framework to teach others and codify their instincts

Mainstream vibe coders:
├── The 'just vibe with it' ethos may be limiting as they move to background agents
├── Karpathy's warning about cognitive debt is directly relevant
└── Value: explicit permission to add engineering discipline to vibe coding

Beginners:
├── 'Agentic engineering' is more intimidating than 'vibe coding'
├── The lower activation energy of vibe coding is still valuable as an entry point
└── Value: understand where the practice is heading, not where to start

How to Evolve Your Vibe Coding Workflow Toward Agentic Engineering

You don't need to abandon vibe coding — you need to add engineering discipline at the agent boundary:

Step 1: Adopt a lightweight spec template for background agent tasks.

Before launching any Claude Code Routine or Cursor background agent, write a brief spec:

## Task: [Name]

**What it does:** [1-2 sentences, precise]
**Inputs:** [What data/state does this depend on?]
**Outputs:** [What files/state does this produce?]
**Acceptance criteria:** [How will I verify it worked?]
**Off-limits:** [What should the agent NOT touch?]
**Architectural constraints:** [What patterns must be followed?]

Filling in this lightweight template takes 5-10 minutes per task, less than the 30-60 minutes a full spec can add, and it reduces both agent iteration loops and cognitive debt.
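
If you prefer to keep specs in code, the template above can be represented as a small data structure. This is one possible sketch in Python, under the assumption that every field is free text. The class name `TaskSpec`, its field names, and the example values are hypothetical and chosen only to mirror the template.

```python
from dataclasses import dataclass, fields


@dataclass
class TaskSpec:
    name: str
    what_it_does: str
    inputs: str
    outputs: str
    acceptance_criteria: str
    off_limits: str
    architectural_constraints: str

    def missing_fields(self):
        """Names of fields left blank, so an incomplete spec is caught before launch."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self):
        """Render the spec in the same layout as the template."""
        return "\n".join([
            f"## Task: {self.name}",
            "",
            f"**What it does:** {self.what_it_does}",
            f"**Inputs:** {self.inputs}",
            f"**Outputs:** {self.outputs}",
            f"**Acceptance criteria:** {self.acceptance_criteria}",
            f"**Off-limits:** {self.off_limits}",
            f"**Architectural constraints:** {self.architectural_constraints}",
        ])


# Hypothetical example task; off_limits is deliberately left blank.
spec = TaskSpec(
    name="Invoice PDF export",
    what_it_does="Adds a PDF export for a single invoice.",
    inputs="An existing invoice record.",
    outputs="A new export module and its tests.",
    acceptance_criteria="All tests in tests/test_invoice.py pass.",
    off_limits="",
    architectural_constraints="Reuse the existing rendering helpers.",
)
print(spec.missing_fields())  # ['off_limits']
print(spec.render())
```

Refusing to launch an agent while `missing_fields()` is non-empty is a cheap way to make the spec habit stick.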

Step 2: Write acceptance tests before the agent runs (TDD for agentic tasks).

Define what 'done' looks like in code before the agent starts. The agent then has a runnable success criterion and you have an objective verification method.
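
As an illustration of Step 2, the sketch below writes the exit criteria first, as a function that takes a candidate implementation and returns the list of failed criteria. It is a hedged example rather than a prescribed method. The feature (`normalize_email`), the test cases, and the names `acceptance_failures`, `stub`, and `candidate` are all hypothetical.

```python
def acceptance_failures(normalize_email):
    """Run the exit criteria against a candidate; return descriptions of failures."""
    cases = [
        ("  Ada@Example.COM ", "ada@example.com"),  # trims and lowercases
        ("bob@example.com", "bob@example.com"),     # already normalized
    ]
    failures = []
    for raw, expected in cases:
        try:
            actual = normalize_email(raw)
        except Exception as exc:
            failures.append(f"{raw!r}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{raw!r}: expected {expected!r}, got {actual!r}")

    # Invalid input must be rejected with ValueError, not silently accepted.
    try:
        normalize_email("not-an-email")
        failures.append("'not-an-email': expected ValueError")
    except ValueError:
        pass
    except Exception as exc:
        failures.append(f"'not-an-email': raised {exc!r} instead of ValueError")
    return failures


def stub(raw):
    """State of the code before the agent runs: nothing implemented yet."""
    raise NotImplementedError


def candidate(raw):
    """Stand-in for what an agent might produce."""
    cleaned = raw.strip().lower()
    if "@" not in cleaned:
        raise ValueError(f"invalid email: {raw!r}")
    return cleaned


print(len(acceptance_failures(stub)))   # 3
print(acceptance_failures(candidate))   # []
```

The stub failing all three criteria before the run confirms the checks can actually fail, and an empty list afterward gives you an objective signal that the task is done.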

Step 3: Conduct a 'cognitive debt audit' after every major agent session.

After a multi-file agent changeset, spend 15 minutes actively reviewing what the agent built — not to find bugs, but to update your mental model. Ask: 'Do I understand what this code does? Could I explain it to a new team member? Do I know how it will fail?'
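
With only 15 minutes, it helps to decide where to look first. One option, sketched below in Python, is to rank the changed files by how many lines changed, using the text output of `git diff --numstat` (tab-separated added, deleted, and path columns, with `-` in the count columns for binary files). Treating churn as a proxy for where your mental model is most out of date is an assumption of this sketch, not a claim from Karpathy's essay. The function name `rank_by_churn` and the sample data are hypothetical.

```python
def rank_by_churn(numstat_output):
    """Parse `git diff --numstat` text into (path, lines_changed), largest first."""
    ranked = []
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        ranked.append((path, int(added) + int(deleted)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical numstat output from an agent session.
sample = (
    "120\t30\tsrc/billing/invoice.py\n"
    "4\t1\tREADME.md\n"
    "-\t-\tassets/logo.png\n"
    "58\t0\ttests/test_invoice.py\n"
)
for path, churn in rank_by_churn(sample):
    print(f"{churn:>5}  {path}")
```

Start the audit at the top of the list and stop when the time box runs out. Files you never reach are a rough measure of the debt you are carrying forward.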

Step 4: Use CLAUDE.md to encode architectural constraints as persistent agent context.

Think of CLAUDE.md not just as preferences but as the engineering specification that all agents must follow. Include: architectural decisions and their rationale, patterns that must be followed, areas that require human review before modification, and the cognitive debt audit results from previous sessions.

Common Challenges

'Is Karpathy saying vibe coding was a mistake?'

No. He's explicit that vibe coding served its purpose and lowered the barrier to AI-assisted development. He's saying the tools have grown past where the vibe coding mindset is sufficient for the most powerful workflows. Think of it as vibe coding graduating into something more rigorous, not being discredited.

'Should I stop calling myself a vibe coder?'

Labels are less important than the practices. If you're already writing precise specs for background agents and auditing cognitive debt, you're doing agentic engineering regardless of what you call it. The Vibe Coding Academy will continue using 'vibe coding' as its curriculum label (it remains the most widely understood term for AI-assisted development) while incorporating agentic engineering disciplines into the Advanced Track.

'Does this mean I need a computer science degree to use AI coding tools effectively?'

No. Karpathy's required skills are learnable without a formal CS background: specification writing is technical writing, exit criteria definition is QA thinking, cognitive debt management is a new skill everyone is learning. The skills are professional discipline, not academic prerequisites.

'My codebase is entirely vibe-coded. Am I in trouble?'

Depends on your development stage. If you're still iterating quickly on a prototype, pure vibe coding is fine. If you're at a stage where stability, onboarding, or safety matters, start auditing your cognitive debt and adding specs to new agent tasks going forward. You don't need to retrofit the whole codebase; just stop the accumulation.

Advanced Tips

Read Karpathy's full essay. This post summarizes and synthesizes, but his original framing has nuances worth reading directly. Search for 'Karpathy agentic engineering 2026'; the full essay is worth the 30 minutes.

Build a spec habit before you need it. The best time to start writing agent specs is before your codebase becomes complex enough that cognitive debt is a problem. A 5-line spec takes 5 minutes when you're building a new feature and is extremely hard to reconstruct retroactively for features an agent built 3 months ago.

Treat CLAUDE.md as your living architectural specification. As you apply agentic engineering discipline, your CLAUDE.md evolves from a preferences file into a document that encodes your codebase's architectural decisions, constraints, and cognitive debt audit results. This is the most direct implementation of Karpathy's agentic engineering approach in Claude Code.

Cognitive debt audits as a team practice. If you're on a team using AI coding tools, make cognitive debt audits a sprint ritual: 30 minutes every two weeks where the team reviews what agents built and updates the shared mental model. This is the team-level analog of Karpathy's individual cognitive debt management practice.

The Vibe Coding Academy Advanced Track Module 14 (Sustainable Workflows) covers the full agentic engineering framework (spec writing, exit criteria, cognitive debt management, and CLAUDE.md as a living spec) with practical templates and exercises. The Vibe Coding Ebook Chapter 6 (Agent Revolution) has been updated today with Karpathy's agentic engineering framing and the cognitive debt concept.

Conclusion

Karpathy's 'agentic engineering' framing is both a natural evolution of vibe coding and a genuine call to more disciplined practice. He's not saying vibe coding failed; he's saying it succeeded well enough that the tools are now powerful enough to require more from the humans directing them. Background agents that run for hours across hundreds of files create error surfaces and cognitive debt that improvisational oversight can't adequately manage.

The discipline that agentic engineering adds (spec writing, exit criteria, cognitive debt audits, architectural constraint encoding) isn't a retreat from the 'ship fast with AI' ethos. It's the infrastructure that makes fast AI-assisted shipping sustainable at scale. The vibe coders who add this discipline will build codebases they can maintain, onboard, and scale. Those who don't will hit a wall when the cognitive debt exceeds their ability to safely modify what their agents built.

Start with the spec template. Add the cognitive debt audit habit. Keep vibing, just with more engineering judgment at the agent boundary.

The Vibe Coding Academy teaches both the vibe coding entry point and the agentic engineering evolution in a single curriculum. Follow Karpathy's ongoing thinking and all major AI coding developments at EndOfCoding.