Validation Playbooks
Three demand tests — smoke offers, concierge MVPs, paid signals — plus threshold metrics and kill criteria that prove buyers exist before build capital moves.
How Exiid's Validation Engine prices demand before a single build sprint is funded. Three playbooks, one signal ladder, and the kill criteria that keep capital honest.
Who this is for
For: founders, operators, and capital partners deciding whether the next dollar goes to a test or a build.
Useful when: a thesis has cleared screening and the team is itching to ship. That itch is exactly what this system is designed to interrupt.
What must be true
Validation is not market research. Research asks people what they would do. Validation puts a real offer in front of real buyers and measures what they actually do. Inside Exiid's six-engine system, the Validation Engine sits between Opportunity and Product for one reason: building is the most expensive way to learn whether anyone wants the thing.
Four rules govern every test we run:
| Rule | Meaning |
| --- | --- |
| Behavior over opinion | Surveys and interviews inform tests. They never replace them. |
| Price in the test | An offer without a price tests politeness, not demand. |
| Thresholds before traffic | The pass number is written down before the test goes live. |
| Kill criteria written first | The conditions for stopping are named while everyone is still calm. |
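The last three rules can be checked mechanically before any traffic is bought. The following is a minimal Python sketch, assuming a hypothetical `ValidationBrief` record; the class, its field names, and `launch_blockers` are illustrative and not an Exiid schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ValidationBrief:
    """Hypothetical test brief; field names are illustrative assumptions."""
    thesis: str
    target_buyer: str
    price: Optional[float] = None  # the real price shown in the offer
    thresholds: dict = field(default_factory=dict)      # metric name -> pass number
    kill_criteria: list = field(default_factory=list)   # named stop conditions


def launch_blockers(brief: ValidationBrief) -> list:
    """Return the rules the brief still violates; an empty list means ready."""
    blockers = []
    if brief.price is None or brief.price <= 0:
        blockers.append("Price in the test: no real price is set")
    if not brief.thresholds:
        blockers.append("Thresholds before traffic: no pass number is written down")
    if not brief.kill_criteria:
        blockers.append("Kill criteria written first: no stop conditions are named")
    return blockers


draft = ValidationBrief(thesis="Example thesis", target_buyer="Example buyer")
for blocker in launch_blockers(draft):
    print(blocker)
```

The first rule, behavior over opinion, is a property of the test design itself, so this sketch does not try to check it.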
The Signal Ladder
Demand evidence has rungs. Most failed ventures were "validated" on the bottom three.
| Rung | Signal | Example | Strength |
| --- | --- | --- | --- |
| 1 | Stated interest | "I would definitely use this" | Noise |
| 2 | Attention | Clicks, time on page, video completion | Weak |
| 3 | Identity | Email signup, waitlist entry | Moderate |
| 4 | Commitment | Booked call, detailed application, signed LOI | Strong |
| 5 | Payment | Deposit, pre-order, paid pilot | Decisive |
The operating rule: no build capital moves on evidence below rung 4. Each playbook below exists to push a prospect up the ladder — and to make the climb measurable.
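The operating rule reduces to a comparison on the ladder. Below is a minimal sketch, assuming the rungs are encoded as an ordered enum; `Rung`, `BUILD_GATE`, and `clears_evidence_gate` are hypothetical names introduced for illustration.

```python
from enum import IntEnum


class Rung(IntEnum):
    """The five rungs of the signal ladder, ordered weakest to strongest."""
    STATED_INTEREST = 1
    ATTENTION = 2
    IDENTITY = 3
    COMMITMENT = 4
    PAYMENT = 5


# No build capital moves on evidence below rung 4.
BUILD_GATE = Rung.COMMITMENT


def clears_evidence_gate(signals) -> bool:
    """True when the strongest observed signal reaches rung 4 or higher.

    Clearing the gate is necessary, not sufficient: thresholds and
    kill criteria still apply.
    """
    strongest = max(signals, default=None)
    return strongest is not None and strongest >= BUILD_GATE


print(clears_evidence_gate([Rung.ATTENTION, Rung.IDENTITY]))  # waitlist only
print(clears_evidence_gate([Rung.IDENTITY, Rung.PAYMENT]))    # a deposit landed
```

Using the strongest signal rather than a count means that any number of rung-3 signups still fails the gate.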
Playbook 1: The Smoke Offer
A smoke offer is a complete sales experience for a product that does not exist yet. One landing page, one offer, one real price, one call to action — backed by paid traffic and, behind the button, an honest "we are onboarding our first cohort" message.
Mechanics:
- One page, one buyer, one promise. Multi-audience pages produce uninterpretable data.
- Paid traffic only, on a small budget. Organic traffic from your own network flatters the numbers.
- The CTA must reach rung 4 at minimum: checkout intent, a deposit, or a booked sales call. A bare email capture stops at rung 3 and proves curiosity, not demand.
- Run until the sample is real — typically 300 to 500 qualified visitors, not 40.
What it proves: that strangers, at a stated price, take a costly action. What it cannot prove: retention, delivery economics, or repeat behavior. A smoke offer that converts earns a deeper test. It does not earn a roadmap.
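One way to see why 40 visitors is not a sample is to put an interval around the observed conversion rate. The sketch below uses the Wilson score interval, a standard statistical technique that this playbook does not itself prescribe. The 5 percent observed rate is an illustrative assumption, and the 3 percent line is the smoke-offer calibration starting point listed under Threshold metrics.

```python
import math


def wilson_interval(conversions: int, visitors: int, z: float = 1.96):
    """Wilson score interval for a conversion rate (z=1.96 is roughly 95%)."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    if not 0 <= conversions <= visitors:
        raise ValueError("conversions must be between 0 and visitors")
    p = conversions / visitors
    denom = 1 + z * z / visitors
    center = (p + z * z / (2 * visitors)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / visitors + z * z / (4 * visitors ** 2)) / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


KILL_LINE = 0.03  # smoke-offer calibration starting point

# Same observed 5 percent conversion at two sample sizes.
for conversions, visitors in [(2, 40), (20, 400)]:
    low, high = wilson_interval(conversions, visitors)
    verdict = "clears" if low > KILL_LINE else "cannot rule out"
    print(f"{conversions}/{visitors}: {low:.3f} to {high:.3f}, {verdict} the kill line")
```

At 2 conversions from 40 visitors the lower bound falls to roughly 1.4 percent, so the result cannot be distinguished from a miss. At 20 from 400 it stays above 3 percent.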
Playbook 2: The Concierge MVP
A concierge MVP delivers the full outcome manually before any software exists. The customer experiences a product. The team experiences a checklist, spreadsheets, and late nights.
Mechanics:
- Recruit 5 to 15 paying customers — ideally converted from the smoke offer, which keeps the evidence chain unbroken.
- Charge full price or close to it. Heavy discounting contaminates the willingness-to-pay signal.
- Deliver everything by hand and document every step: time spent, tools touched, decisions made, exceptions hit.
- Measure the second purchase, not the first. First purchases test the pitch. Second purchases test the product.
The documentation is not overhead. At Exiid, concierge delivery logs become the automation spec the Product and AI engines build against. The manual run is simultaneously a demand test and a systems blueprint — every step a human performed is a candidate for software or an agent.
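A delivery log of this kind can be kept as structured records from day one. The sketch below is one possible shape, not Exiid's actual log format: the `DeliveryStep` fields mirror the four things the playbook says to document, while the `automatable` flag and the hourly cost figure are assumptions added for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DeliveryStep:
    """One manually performed step in a concierge delivery (hypothetical schema)."""
    name: str
    minutes: float                              # time spent
    tools: list = field(default_factory=list)   # tools touched
    decision: str = ""                          # decision made, if any
    exception: str = ""                         # exception hit, if any
    automatable: bool = False                   # operator's judgment call


def delivery_cost_ratio(steps, hourly_cost: float, price: float) -> float:
    """Manual labor cost of one delivery as a fraction of the price charged."""
    if price <= 0:
        raise ValueError("price must be positive")
    labor_cost = sum(step.minutes for step in steps) / 60 * hourly_cost
    return labor_cost / price


def automation_candidates(steps):
    """Steps flagged automatable, longest first, as a rough build-priority list."""
    flagged = [step for step in steps if step.automatable]
    return sorted(flagged, key=lambda step: step.minutes, reverse=True)


log = [
    DeliveryStep("intake call", 60, ["calendar"]),
    DeliveryStep("draft report", 120, ["spreadsheet"], automatable=True),
    DeliveryStep("data pull", 30, ["crm"], automatable=True),
]
print(delivery_cost_ratio(log, hourly_cost=50, price=500))
print([step.name for step in automation_candidates(log)])
```

The ratio feeds the concierge kill line on manual delivery cost versus price, and the candidate list is a starting point for the automation spec.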
Playbook 3: The Paid Signal Test
When the offer is high-trust or high-ticket, traffic math breaks down and money becomes the instrument. The paid signal test makes payment itself the experiment: refundable deposits, paid pilots, pre-orders, or a paid discovery engagement.
Mechanics:
- The amount must sting slightly. A token charge filters out bots; a meaningful one filters for intent.
- Make refunds easy and say so. A refundable deposit that people still hesitate over is one of the cleanest demand reads available.
- Track withdrawal rate as closely as conversion rate. Deposits that quietly evaporate before delivery are a no, delivered late.
Money is the only survey question nobody answers politely. Ten refundable deposits from strangers outweigh a hundred enthusiastic interviews.
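Tracking withdrawals alongside conversions takes only three counts. A minimal sketch follows, assuming withdrawal rate is measured against deposits placed and conversion against qualified conversations; `PaidSignalResult` and its fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaidSignalResult:
    """Counts from one paid signal test (hypothetical record)."""
    qualified_conversations: int
    deposits: int
    withdrawn: int  # deposits refunded or cancelled before delivery

    def __post_init__(self):
        if not 0 <= self.withdrawn <= self.deposits <= self.qualified_conversations:
            raise ValueError("expected 0 <= withdrawn <= deposits <= conversations")

    @property
    def deposit_conversion(self) -> float:
        if self.qualified_conversations == 0:
            return 0.0
        return self.deposits / self.qualified_conversations

    @property
    def withdrawal_rate(self) -> float:
        if self.deposits == 0:
            return 0.0
        return self.withdrawn / self.deposits

    @property
    def net_conversion(self) -> float:
        """Conversion after deposits that evaporated are removed."""
        if self.qualified_conversations == 0:
            return 0.0
        return (self.deposits - self.withdrawn) / self.qualified_conversations


result = PaidSignalResult(qualified_conversations=40, deposits=10, withdrawn=3)
print(result.deposit_conversion, result.withdrawal_rate, result.net_conversion)
```

In this made-up example, a 25 percent headline conversion drops to 17.5 percent once withdrawals are counted, which is why both rates belong on the same report.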
Threshold metrics
A threshold is a pass number committed to before launch. Without it, every result gets interpreted toward yes — sunk cost does the reading. Calibration starting points by playbook:
| Test | Metric | Typical kill line |
| --- | --- | --- |
| Smoke offer | Qualified visitor to rung-4 action | Under 3 percent |
| Smoke offer | Cost per committed lead vs modeled CAC | Above 50 percent of target CAC |
| Concierge MVP | Repeat purchase within one buying cycle | Under 40 percent |
| Concierge MVP | Manual delivery cost vs price | Above 70 percent with no automation path |
| Paid signal | Deposit conversion from qualified conversations | Under 20 percent |
| All | Refund or withdrawal rate | Above 25 percent |
The specific numbers shift by category and ticket size. The discipline does not: the number exists in writing before the first visitor arrives, and missing it means something.
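Writing the numbers down can be as literal as committing them to a file before launch. The sketch below encodes the calibration starting points from the table as data and checks observed results against them. The metric keys and the `KillLine` type are hypothetical, and the "no automation path" qualifier on delivery cost is a judgment call that the code does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KillLine:
    metric: str
    limit: float
    kill_if: str  # "below" or "above" the limit

    def missed(self, observed: float) -> bool:
        if self.kill_if == "below":
            return observed < self.limit
        if self.kill_if == "above":
            return observed > self.limit
        raise ValueError(f"unknown direction: {self.kill_if}")


# Calibration starting points from the table; adjust per category and ticket size.
KILL_LINES = [
    KillLine("rung4_conversion", 0.03, "below"),
    KillLine("cost_per_lead_vs_target_cac", 0.50, "above"),
    KillLine("repeat_purchase_rate", 0.40, "below"),
    KillLine("delivery_cost_vs_price", 0.70, "above"),  # qualifier not modeled
    KillLine("deposit_conversion", 0.20, "below"),
    KillLine("refund_or_withdrawal_rate", 0.25, "above"),
]


def missed_thresholds(observed: dict) -> list:
    """Names of the metrics whose observed value falls on the kill side."""
    return [
        line.metric
        for line in KILL_LINES
        if line.metric in observed and line.missed(observed[line.metric])
    ]


print(missed_thresholds({"rung4_conversion": 0.021, "refund_or_withdrawal_rate": 0.10}))
```

Metrics absent from `observed` are skipped rather than treated as passes, so a test that never measured repeat purchase cannot quietly clear that line.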
Kill criteria
Thresholds catch weak conversion. Kill criteria catch weak ventures that convert anyway. A test stops — regardless of topline numbers — when any of these holds:
- The wrong buyer converted. The people paying are not the segment in the thesis, and the actual segment is too small or too costly to reach.
- The signal needed founder distortion. Conversions came from personal networks, manual heroics, or quiet discounting that cannot scale.
- The economics fail at test scale. If CAC math is already underwater with hand-picked channels and a motivated founder, scale makes it worse, not better.
- Delivery cannot be systematized. Concierge delivery revealed steps that resist automation and require talent the model cannot afford.
- The threshold missed after a full sample. Not paused, not reframed, not "directionally encouraging." Missed.
A kill is the system working. A killed smoke offer costs a few thousand dollars and two weeks. A killed build costs six figures, two quarters, and the team's best operators. The Validation Engine exists to make sure death happens at the cheap end.
The operating cadence
The full sequence runs in roughly six weeks:
- Week 0: test brief — thesis, target buyer, price, thresholds, kill criteria. One page, signed before anything ships.
- Weeks 1–2: smoke offer live against paid traffic.
- Weeks 2–6: concierge MVP or paid pilot, seeded with converted demand from the smoke test.
- Decision: build, re-test, or kill.
One re-test maximum, and only when a single variable — price, segment, or channel — has a named reason to change. A thesis on its third validation attempt is not being tested. It is being rescued. The Product Engine takes what survives; everything else returns capital and attention to the Opportunity queue, where they compound.
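The decision step and the re-test rule can be written as a small function. This is one possible reading of the rules above, offered as a sketch rather than Exiid's actual decision logic; `decide` and its parameters are hypothetical names.

```python
RETESTABLE_VARIABLES = {"price", "segment", "channel"}
MAX_RETESTS = 1


def decide(missed, kill_criteria_hit, retests_used, proposed_change=None) -> str:
    """Return "build", "re-test", or "kill".

    missed:            names of thresholds missed after a full sample
    kill_criteria_hit: names of kill criteria that hold
    retests_used:      re-tests already spent on this thesis
    proposed_change:   optional (variable, reason) pair for a single re-test
    """
    if kill_criteria_hit:
        return "kill"      # kill criteria apply regardless of topline numbers
    if not missed:
        return "build"
    if retests_used >= MAX_RETESTS or proposed_change is None:
        return "kill"      # a third attempt is a rescue, not a test
    variable, reason = proposed_change
    if variable in RETESTABLE_VARIABLES and reason.strip():
        return "re-test"   # one variable, with a named reason to change
    return "kill"


print(decide([], [], 0))
print(decide(["rung4_conversion"], [], 0, ("price", "price anchored above segment budget")))
print(decide(["rung4_conversion"], [], 1, ("channel", "wrong audience on first channel")))
```

Passing `proposed_change` as a single pair makes it impossible to request a re-test that changes two variables at once.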
Read next
- Recon before roadmap — the sprint that decides whether a thesis deserves a validation pass at all.
- Is this model worth transferring? — four gates for scoring the thesis before any test is designed.