04 — Design & Run Experiments

Purpose: Run structured tests that produce decision-grade evidence.

Outcome

Experiments executed with interpretable results tied to hypotheses.

This is the point where the team turns planning into contact with reality.

Time to complete

Varies by experiment type; target the shortest cycle that still produces reliable evidence.

Inputs

  • Hypotheses from Step 03.

  • Experiment constraints (time, audience, channels, budget).

Steps in SwiftCNS

  1. Define experiment method and scope.

  2. Set success criteria and invalidation criteria.

  3. Assign owner and timeline.

  4. Run the experiment and capture results in SwiftCNS.

  5. Record anomalies and context notes for interpretation.
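The steps above imply an experiment record with a fixed set of fields set before execution. As a minimal sketch (SwiftCNS's actual schema is not specified here; every field name below is an illustrative assumption), such a record might look like:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: field names are assumptions, not SwiftCNS's actual schema.
@dataclass
class ExperimentRecord:
    hypothesis_id: str          # link back to the Step 03 hypothesis
    method: str                 # experiment method, e.g. "concierge test"
    scope: str                  # audience, channel, and budget limits
    success_criteria: str       # defined before execution
    invalidation_criteria: str  # what result would falsify the hypothesis
    owner: str                  # assigned owner
    deadline: str               # timeline, e.g. an ISO date
    results: list[str] = field(default_factory=list)    # captured during the run
    anomalies: list[str] = field(default_factory=list)  # context notes for interpretation

# Hypothetical usage: criteria and ownership are fixed before the run starts.
record = ExperimentRecord(
    hypothesis_id="H-03.2",
    method="concierge test",
    scope="20 pilot users, two weeks",
    success_criteria=">= 8 of 20 users complete the flow unaided",
    invalidation_criteria="< 4 of 20 complete, or all completions need support",
    owner="PM",
    deadline="2025-03-01",
)
```

Keeping success and invalidation criteria as separate required fields makes Step 2 explicit rather than something reconstructed after the fact.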

Why this stage matters

Experiments are where teams earn the right to learn. Without a real test, the cycle stays theoretical. But the value of experimentation is not the activity itself. It is the clarity it creates for a decision.

That is why this stage should always stay connected to the question: “What are we trying to learn, and how will this change what we do next?”

Role lenses

  • Startup: optimize for speed while preserving evidence quality.

  • Program manager: monitor cycle time and quality gate compliance.

  • Mentor: review experimental validity and bias risk.

What strong output looks like

A strong experiment:

  • is tied to a specific hypothesis,

  • has visible success and invalidation criteria,

  • fits the team’s actual constraints,

  • produces evidence the team can interpret without guesswork.
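The checklist above can be treated as a pre-execution gate. A hedged sketch (assuming the experiment record is a plain dict; the field names are illustrative, not a SwiftCNS API):

```python
# Illustrative readiness check: treats the "strong experiment" checklist
# as required, non-empty fields on an experiment record (a plain dict here).
def design_gaps(record: dict) -> list[str]:
    """Return the checklist items the design still fails (empty list = ready)."""
    required = {
        "hypothesis_id": "tied to a specific hypothesis",
        "success_criteria": "visible success criteria",
        "invalidation_criteria": "visible invalidation criteria",
        "scope": "fits the team's actual constraints (scope stated)",
    }
    return [reason for key, reason in required.items() if not record.get(key)]

# Hypothetical usage: a design missing both criteria fields fails the gate.
gaps = design_gaps({"hypothesis_id": "H-03.2", "scope": "20 users, 2 weeks"})
```

A non-empty result means the design needs work before execution, which mirrors the guidance in the failure-mode section below.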

Weak vs strong pattern

Weak

  • experiment is busy but not decision-relevant,

  • criteria are vague or added after the fact,

  • output is hard to interpret,

  • the team cannot explain what confidence should change.

Strong

  • experiment is focused,

  • criteria are clear before execution,

  • evidence is interpretable,

  • outcome clearly informs the next stage.

Outputs

  • Experiment record with measurable outcomes.

  • Evidence set ready for learning extraction.

Definition of done

  • Results are documented and interpretable.

  • Team can state what changed in confidence.

Common failure mode

The most common trap here is experiment theater: a lot of visible effort with little decision value. If the team cannot connect the experiment back to a hypothesis and forward to a likely decision, the design needs work before the result does.

If blocked

See Common Failure Patterns for fixes to experiment theater and vague or after-the-fact criteria.

Next step

Continue to 05 — Extract Key Learnings.

Last updated