A methodology for flowing intent through formal proof into running code — with mechanical consistency enforced at every seam.
Most software methodologies fail at the same seam — the gap between what the business meant and what the code does.
Requirements describe behaviour in prose. Code expresses behaviour in machinery. Tests confirm a thin slice of agreement between the two.
Over time, the prose and the machinery drift apart, until nobody trusts either as the source of truth.
The usual responses — more documentation, more testing, more conversation — each address a piece of the problem. None close the gap.
No amount of discipline applied to either side closes a gap that the artifacts themselves cannot close.
·
·
·
·
·
·
}
L.1
The Catalog— business behaviour in the BA's vocabulary
A shared, versioned vocabulary for a bounded context. Commands carry intent and preconditions. Events are immutable facts. Invariants are policy statements in the native register of the business. Scenarios are how the business says behaviour should run.
Aliases are append-only. Cross-references resolve through wiki-links. The vocabulary is what every other layer dereferences — a scenario that uses a noun not in the catalog is a lint failure.
This is what the business says is true.
2
3
4
5
6
7
8
9
10
kind: command aliases: [IssueInvoice] aggregate: [[Invoice]] emits: - [[InvoiceIssued]] - [[LineItemRecorded]] enforces: - [[InvoiceTotalMatchesLineItems]] # Issue an invoice against a confirmed order. # Pre: order confirmed; customer has billing terms.
Every scenario has a witnessing trace in the spec. Every restatement returns to the BA's register. When the two diverge, the concept has not yet resolved.
// drift becomes a question, not a defect. bidirectionality is dialogue, not duplication.
·
·
·
·
·
·
⟩
L.2
Quint— the same behaviour, expressed mathematically
A state-machine modelling language descended from TLA+, with model-checking support and a modern toolkit. Actions produce events. Temporal properties hold across reachable states. The checker proves or refutes claims automatically.
Every scenario in the catalog is emitted as an executable run. The checker explores reachable states beyond the written scenarios — surfacing invariant violations and edge cases that human authors miss.
This is what the mathematics holds.
2
3
4
5
6
7
8
9
10
11
action issueInvoice(order: OrderId) = all { orderStatus.get(order) == "confirmed", customerHasBillingTerms(customer.get(order)), invoiceIssued' = invoiceIssued.put(order, true), totals' = totals.put(order, computeTotal(order)), } temporal invoiceTotalsAlwaysMatch = always(all_orders.forall(o => consistent(o))) // every scenario in the catalog is a `run` here
Every trace the implementation produces refines a trace in the spec. Every counterexample becomes a test the code must answer.
// the refinement mapping is the seam where the two layers shake hands. drift is detected mechanically, not noticed eventually.
·
·
·
·
·
·
]
L.3
The Implementation— running F# code, refined against the spec
Each aggregate exposes a refinement mapping — a pure function from its internal state to the formal model's state — which is the seam at which Quint-generated traces are replayed against the implementation.
Every counterexample becomes a test case. Every scenario becomes a golden trace. Every invariant becomes both a runtime guard and a property-based check. The glue library is small, stable, and invisible to upper layers.
This is what executes.
2
3
4
5
6
7
8
9
10
let issueInvoice : IssueInvoice -> InvoiceState -> Result<_, _> = action { quintRef "billing::issueInvoice" requires (fun cmd state -> state.OrderConfirmed cmd.OrderId) ensures (fun _ state -> state.TotalsMatch) emit (fun cmd -> [ InvoiceIssued cmd; LineItemsRecorded cmd ]) } // each binding pairs with its Quint counterpart via `quintRef` // the Mandorla CLI verifies the pairing on every build
Five qualities. Two registers.
3 atomporal qualities answer is this artifact good right now? 2 temporal attributes answer is this artifact good over time? Trade-offs happen within each group, not across the divide.
| # | quality | scale | definition |
|---|---|---|---|
| Atemporal — is this artifact good right now? | |||
| 01 | Correctness | snapshot | What ships behaves as the business specified — not merely as a developer understood the specification. Every scenario has a witnessing trace; every invariant is a theorem or a verified runtime guard. |
| 02 | Readability | snapshot | Each artifact is comprehensible in the native register of its primary audience. Scenarios read as scenarios. Mathematics reads as mathematics. Code reads as domain workflow, not as plumbing. |
| 03 | Provability | snapshot | Claims are backed by mechanical evidence. Invariants are checked by a model checker or a property-based test, not asserted in prose. Refinement is verified by replaying traces. |
| Temporal — is this artifact good over time? | |||
| 04 | Stability | over time | Committed shapes impose bounded cost on their consumers. A name, contract, or correspondence continues to mean what it meant; the work of relying on it does not grow as the system ages. |
| 05 | Evolvability | over time | Committed shapes can be reshaped — through supersession, refinement, or replacement — with cost proportional to the size of the change rather than to the size of the system. |
The priority shifts as the shape settles.
Draft favours Evolvability: the shape is still finding itself. Names you might rename. Loose parameterisation.