mithril-go/STM-SPRINT.md
Kayos 20853a9d44 STM-SPRINT.md: detailed plan for the BLS verification sprint
Read the upstream Rust (mithril-stm) end to end for STM-side types,
wrote up what we need to port, library choices, gotchas to watch,
and milestone breakdown (A: decoder, B: BLS single verify, C: Merkle,
D: lottery threshold, E: full aggregate verify, F: chain verify).

Ready for a focused crypto sprint — genesis Ed25519 + M2M plumbing
are done and shipping.
2026-04-23 15:36:05 -07:00

164 lines
6.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# STM BLS Verification Sprint
Notes for the next push. Everything here comes from reading the upstream
Rust (`input-output-hk/mithril`, as of 2026-04) plus the Mithril paper
(Chotard/Kiayias/Peters 2021).
## Scheme summary
Mithril uses the **BLS `min_sig`** variant of `blst`:
- Signatures in **G1** (48 bytes compressed)
- Verification keys in **G2** (96 bytes compressed)
- Pairing used for batch verification
Every signer has a BLS keypair plus a stake weight. Signers try to sign
for lottery indices `0..m` — they qualify at index `i` if the dense-map
hash over `(msg, i, sig)` falls below a threshold proportional to their
stake fraction. A `k`-of-`m` threshold of lottery wins is required for
a valid aggregate.
## Key primitives to port
### `BlsSignature::verify(msg, vk)` (mithril-stm `signature_scheme/bls_multi_signature/signature.rs`)
Plain BLS verification — hash msg to G1, pair against vk in G2. Use
`gnark-crypto/ecc/bls12-381` with the standard IETF `BLS_SIG_BLS12381G1_XMD:SHA-256_SSWU_RO_NUL_`
ciphersuite (matches `blst`'s `min_sig` default).
### `evaluate_dense_mapping(msg, index)` (same file)
```
ev = BLAKE2b512( "map" || msg || index.to_le_bytes() || sig.to_bytes() )
```
Returns 64 bytes. Used for lottery check.
### Lottery threshold
`ev < 2^{512} · (1 - (1 - phi_f)^{stake/totalStake})`
Where `phi_f` is an `f64` from the protocol parameters (e.g. 0.7 on preprod).
Care: the 2^512 vs ev comparison is an arbitrary-precision compare; use
`math/big.Int`.
### `ProtocolAggregateVerificationKey` — Merkle commitment
AVK = Merkle-tree root over a **sorted** list of `(vk_bytes, stake_u64)`
pairs. Tree leaves are `BLAKE2b-256(vk || stake_le_bytes)`. Internal nodes
are `BLAKE2b-256(left || right)`.
The AVK that the aggregator publishes in `protocol_message.next_aggregate_verification_key`
is the hex-of-JSON of a struct like:
```json
{"mt_commitment":{"root":[b0,b1,...,b31],"nr_leaves":N,"hasher":null},
"total_stake":<u64>}
```
(Seen in preprod epoch 196 genesis). `nr_leaves` = number of registered
signers; `total_stake` is the sum.
### `StmAggrSig` — the multi_signature payload
From `mithril-stm/src/protocol/aggregate_signature/signature.rs`. The
`verify_aggregate(msg, avk, params)` function is the top-level we're
mirroring. Struct shape (after CBOR decode):
- **signatures**: k × `StmSigRegParty`
- `sig` (BlsSignature, 48 bytes)
- `indexes` (Vec<LotteryIndex>, u64 each — winning indices for this signer)
- `signer_index` (u64 — signer's position in the registered party list)
- `reg_party` (BlsVerificationKey 96B + stake u64 + Merkle path bytes)
- **batch_proof**: Merkle multi-proof certifying all included signers
belong to the AVK
CBOR encoding is **serde_cbor with default settings** in the Rust code.
Use `fxamacker/cbor/v2` on the Go side with `cbor.DecOptions{...}`
matching.
### Verify algorithm sketch
Given `(msg, multi_sig, avk, params={k, m, phi_f})`:
1. Decode CBOR → struct.
2. Check there are ≥ k distinct lottery indices across all signatures.
3. For each `(signer, index, sig)` tuple:
- BLS-verify `sig` of `concat(msg, index_le_bytes)` against `signer.vk`.
(Check upstream for exact message construction — might include a
domain separator.)
- Compute `ev = evaluate_dense_mapping(msg, index)` with `sig`.
- Compute threshold `T = 2^512 * (1 - (1 - phi_f)^{stake/total})`.
- Assert `ev_as_bignum < T`.
4. Merkle-verify each `signer` is in `avk` (batch proof).
5. All checks pass → valid aggregate.
## Plan of attack
**Milestone A — decoder first (no crypto).** Goal: parse a real `multi_signature`
CBOR from a preprod cert into Go structs and print it. One day, high
confidence. Unlocks visibility into all the field sizes and shapes.
**Milestone B — single-sig BLS verify.** Implement `BlsSignature::verify`
using `gnark-crypto`. Test against a hand-constructed case from the Rust
reference (grab the test vectors from `mithril-stm/src/signature_scheme/bls_multi_signature/signature.rs`
test section if they publish them; else round-trip-test via generating
a pair in Rust and verifying in Go). One to two days.
**Milestone C — Merkle AVK verification.** Implement BLAKE2b-256 Merkle
proof verification against the `mt_commitment.root`. Standalone, easy
to unit-test.
**Milestone D — lottery threshold arithmetic.** `math/big.Int` + `math/big.Float`
for the `phi_f` power computation; reference the Rust code for exact
semantics (they use `rug` or `malachite` — careful about rounding).
**Milestone E — full aggregate verify.** Glue A-D together, test against
real preprod certs. This is the milestone where we can say "mithril-go
is a real validating client."
**Milestone F — certificate chain verify.** Walk the chain: each STM
cert's AVK must match the previous cert's `next_aggregate_verification_key`,
plus the STM verify above. Closes the loop all the way to genesis.
## Library choice reminder
- `gnark-crypto` (`github.com/consensys/gnark-crypto`) — pure Go, audited,
standard library for BLS12-381 in Go land. Has min_sig mode via
`ecc/bls12-381/signature/bls`.
- `fxamacker/cbor/v2` — pure Go CBOR, serde-compatible in the common
shapes. Confirm byte-exact interop with serde_cbor via round-trip
tests against captured multi_signature bytes.
- `golang.org/x/crypto/blake2b` — stdlib-adjacent, works for BLAKE2b-256
and BLAKE2b-512.
`blst` Go bindings are faster but CGo — rejected per the "pure single
binary" project goal.
## Gotchas to watch
- **Byte order on indexes**: `index.to_le_bytes()` in Rust — little-endian
8-byte u64. Get this wrong and every evaluate_dense_mapping disagrees.
- **Signature compression**: `blst` default is compressed in G1 (48B).
gnark-crypto uses the same serialization (IETF 4.2.1) — but confirm
the sign-bit convention matches when round-tripping.
- **CBOR tag handling**: serde_cbor historically used canonical CBOR;
`fxamacker/cbor` needs `DecOptions{DupMapKey: DupMapKeyEnforcedAPIError}`
if we want strict compatibility.
- **BTreeMap serialization order**: the `protocol_message` hash (already
handled for genesis) relies on Rust enum declaration order, not alpha.
Any NEW fields upstream need the Go `messagePartOrder` slice updated.
- **phi_f arithmetic**: the Rust uses higher-precision rationals for
`(1 - phi_f)^stake_frac`. Go's `math/big.Float` at 512-bit precision
should be enough; double-check against the Rust unit tests.
## What's already done
Genesis Ed25519 path is fully working end-to-end:
- SHA256(protocol_message) digest ✓
- Ed25519 verify over ASCII-of-hex ✓
- Genesis verify key decode ✓
- Tested against live mainnet + preprod ✓
The verify subcommand dispatches to `verifyGenesisCert` already; STM
needs a parallel `verifyStandardCert` wired through the same path.