
Add principle 5: Machine-Verified Completeness#44

Open
Mehdys wants to merge 1 commit into forrestchang:main from Mehdys:add-machine-verified-completeness

Conversation


@Mehdys Mehdys commented Apr 13, 2026

What this adds

A fifth principle: Machine-Verified Completeness — the enforcement layer missing from the four behavioral principles.

Karpathy's principles fix behavior. This fixes verification: how do you actually confirm the AI did what it said?

The principle

## 5. Machine-Verified Completeness

**"Looks right" is not done. Passing gates are done.**

After every non-trivial change:
- Run your project's lint and type-check. Both must pass before claiming "fixed"
- For multi-step tasks, declare success criteria upfront — not just a plan, but what passing looks like
- Declare gaps explicitly before saying "done": what was verified by automation vs. what requires runtime/browser testing

The test: "Build passes" ≠ "done". Only verified execution of the user's actual scenario is done.
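The gate described above can be sketched as a small pre-"done" script. This is a minimal, hypothetical sketch: the actual lint and type-check commands (shown here as placeholder `true` calls, with example tools in comments) depend on your project and are assumptions, not part of the proposal.

```shell
#!/bin/sh
set -e  # stop at the first failing gate

# run_gate <label> <command...>: run one machine check and report pass/fail.
run_gate() {
  label="$1"; shift
  if "$@"; then
    echo "PASS: $label"
  else
    echo "FAIL: $label"
    exit 1
  fi
}

# Placeholder gates -- substitute your project's real commands,
# e.g. `npx eslint .` for lint or `npx tsc --noEmit` for types.
run_gate "lint" true
run_gate "types" true

# Machine gates passing is necessary, not sufficient: remaining
# runtime/browser gaps still have to be declared explicitly.
echo "All machine gates passed; declare remaining manual-test gaps."
```

A change would only be claimed as "fixed" after this script exits 0, and even then the gap declaration step still applies.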

Why it belongs here

The four principles address what the LLM does. This addresses what it claims. Without it, the model can follow all four principles perfectly and still deliver unverified work — the most common failure mode when shipping a real product.

Also updated the trailing "working if" line to include the machine verification signal.


Derived from shipping a real product with Claude Code as a co-engineer.

Karpathy's four principles address behavior and planning.
This adds the missing enforcement layer: machine gates,
gap declaration, and verified completeness.

'Build passes' ≠ 'done'. Only verified execution of the
user's actual scenario is done.
Author

Mehdys commented Apr 13, 2026

Derived from shipping a real production app with Claude Code as a co-engineer.

The four principles fix behavior — what the model does while working. But there's a gap: nothing enforces that the model actually verified its work before declaring it done.

In practice, the most common failure mode isn't overcomplication or style drift. It's the model saying "fixed" when it only checked the code visually: it didn't run lint, didn't check types, and didn't declare what still needs browser/runtime testing.

Principle 5 closes that loop:

  • Machine gates (lint + type-check) must pass, not just "looks right"
  • Gap declaration before "done": what was verified by automation vs. what needs manual testing
  • Multi-step tasks get explicit [step] → verify: [check] success criteria upfront

Happy to iterate on the wording if it doesn't fit the style of the other principles.

