STM-SPRINT.md: detailed plan for the BLS verification sprint

Read the upstream Rust (mithril-stm) end to end for STM-side types,
wrote up what we need to port, library choices, gotchas to watch,
and milestone breakdown (A: decoder, B: BLS single verify, C: Merkle,
D: lottery threshold, E: full aggregate verify, F: chain verify).

Ready for a focused crypto sprint — genesis Ed25519 + M2M plumbing
are done and shipping.
This commit is contained in:
Kayos 2026-04-23 15:36:05 -07:00
parent 97a9434106
commit 20853a9d44

164
STM-SPRINT.md Normal file
View file

@ -0,0 +1,164 @@
# STM BLS Verification Sprint
Notes for the next push. Everything here comes from reading the upstream
Rust (`input-output-hk/mithril`, as of 2026-04) plus the Mithril paper
(Chotard/Kiayias/Peters 2021).
## Scheme summary
Mithril uses the **BLS `min_sig`** variant of `blst`:
- Signatures in **G1** (48 bytes compressed)
- Verification keys in **G2** (96 bytes compressed)
- Pairing used for batch verification
Every signer has a BLS keypair plus a stake weight. Signers try to sign
for lottery indices `0..m` — they qualify at index `i` if the dense-map
hash over `(msg, i, sig)` falls below a threshold proportional to their
stake fraction. A `k`-of-`m` threshold of lottery wins is required for
a valid aggregate.
## Key primitives to port
### `BlsSignature::verify(msg, vk)` (mithril-stm `signature_scheme/bls_multi_signature/signature.rs`)
Plain BLS verification — hash msg to G1, pair against vk in G2. Use
`gnark-crypto/ecc/bls12-381` with the standard IETF `BLS_SIG_BLS12381G1_XMD:SHA-256_SSWU_RO_NUL_`
ciphersuite (matches `blst`'s `min_sig` default).
### `evaluate_dense_mapping(msg, index)` (same file)
```
ev = BLAKE2b512( "map" || msg || index.to_le_bytes() || sig.to_bytes() )
```
Returns 64 bytes. Used for lottery check.
### Lottery threshold
`ev < 2^{512} · (1 - (1 - phi_f)^{stake/totalStake})`
Where `phi_f` is an `f64` from the protocol parameters (e.g. 0.7 on preprod).
Care: the 2^512 vs ev comparison is an arbitrary-precision compare; use
`math/big.Int`.
### `ProtocolAggregateVerificationKey` — Merkle commitment
AVK = Merkle-tree root over a **sorted** list of `(vk_bytes, stake_u64)`
pairs. Tree leaves are `BLAKE2b-256(vk || stake_le_bytes)`. Internal nodes
are `BLAKE2b-256(left || right)`.
The AVK that the aggregator publishes in `protocol_message.next_aggregate_verification_key`
is the hex-of-JSON of a struct like:
```json
{"mt_commitment":{"root":[b0,b1,...,b31],"nr_leaves":N,"hasher":null},
"total_stake":<u64>}
```
(Seen in preprod epoch 196 genesis). `nr_leaves` = number of registered
signers; `total_stake` is the sum.
### `StmAggrSig` — the multi_signature payload
From `mithril-stm/src/protocol/aggregate_signature/signature.rs`. The
`verify_aggregate(msg, avk, params)` function is the top-level we're
mirroring. Struct shape (after CBOR decode):
- **signatures**: k × `StmSigRegParty`
- `sig` (BlsSignature, 48 bytes)
- `indexes` (Vec<LotteryIndex>, u64 each — winning indices for this signer)
- `signer_index` (u64 — signer's position in the registered party list)
- `reg_party` (BlsVerificationKey 96B + stake u64 + Merkle path bytes)
- **batch_proof**: Merkle multi-proof certifying all included signers
belong to the AVK
CBOR encoding is **serde_cbor with default settings** in the Rust code.
Use `fxamacker/cbor/v2` on the Go side with `cbor.DecOptions{...}`
matching.
### Verify algorithm sketch
Given `(msg, multi_sig, avk, params={k, m, phi_f})`:
1. Decode CBOR → struct.
2. Check there are ≥ k distinct lottery indices across all signatures.
3. For each `(signer, index, sig)` tuple:
- BLS-verify `sig` of `concat(msg, index_le_bytes)` against `signer.vk`.
(Check upstream for exact message construction — might include a
domain separator.)
- Compute `ev = evaluate_dense_mapping(msg, index)` with `sig`.
- Compute threshold `T = 2^512 * (1 - (1 - phi_f)^{stake/total})`.
- Assert `ev_as_bignum < T`.
4. Merkle-verify each `signer` is in `avk` (batch proof).
5. All checks pass → valid aggregate.
## Plan of attack
**Milestone A — decoder first (no crypto).** Goal: parse a real `multi_signature`
CBOR from a preprod cert into Go structs and print it. One day, high
confidence. Unlocks visibility into all the field sizes and shapes.
**Milestone B — single-sig BLS verify.** Implement `BlsSignature::verify`
using `gnark-crypto`. Test against a hand-constructed case from the Rust
reference (grab the test vectors from `mithril-stm/src/signature_scheme/bls_multi_signature/signature.rs`
test section if they publish them; else round-trip-test via generating
a pair in Rust and verifying in Go). One to two days.
**Milestone C — Merkle AVK verification.** Implement BLAKE2b-256 Merkle
proof verification against the `mt_commitment.root`. Standalone, easy
to unit-test.
**Milestone D — lottery threshold arithmetic.** `math/big.Int` + `math/big.Float`
for the `phi_f` power computation; reference the Rust code for exact
semantics (they use `rug` or `malachite` — careful about rounding).
**Milestone E — full aggregate verify.** Glue A-D together, test against
real preprod certs. This is the milestone where we can say "mithril-go
is a real validating client."
**Milestone F — certificate chain verify.** Walk the chain: each STM
cert's AVK must match the previous cert's `next_aggregate_verification_key`,
plus the STM verify above. Closes the loop all the way to genesis.
## Library choice reminder
- `gnark-crypto` (`github.com/consensys/gnark-crypto`) — pure Go, audited,
standard library for BLS12-381 in Go land. Has min_sig mode via
`ecc/bls12-381/signature/bls`.
- `fxamacker/cbor/v2` — pure Go CBOR, serde-compatible in the common
shapes. Confirm byte-exact interop with serde_cbor via round-trip
tests against captured multi_signature bytes.
- `golang.org/x/crypto/blake2b` — stdlib-adjacent, works for BLAKE2b-256
and BLAKE2b-512.
`blst` Go bindings are faster but CGo — rejected per the "pure single
binary" project goal.
## Gotchas to watch
- **Byte order on indexes**: `index.to_le_bytes()` in Rust — little-endian
8-byte u64. Get this wrong and every evaluate_dense_mapping disagrees.
- **Signature compression**: `blst` default is compressed in G1 (48B).
gnark-crypto uses the same serialization (IETF 4.2.1) — but confirm
the sign-bit convention matches when round-tripping.
- **CBOR tag handling**: serde_cbor historically used canonical CBOR;
`fxamacker/cbor` needs `DecOptions{DupMapKey: DupMapKeyEnforcedAPIError}`
if we want strict compatibility.
- **BTreeMap serialization order**: the `protocol_message` hash (already
handled for genesis) relies on Rust enum declaration order, not alpha.
Any NEW fields upstream need the Go `messagePartOrder` slice updated.
- **phi_f arithmetic**: the Rust uses higher-precision rationals for
`(1 - phi_f)^stake_frac`. Go's `math/big.Float` at 512-bit precision
should be enough; double-check against the Rust unit tests.
## What's already done
Genesis Ed25519 path is fully working end-to-end:
- SHA256(protocol_message) digest ✓
- Ed25519 verify over ASCII-of-hex ✓
- Genesis verify key decode ✓
- Tested against live mainnet + preprod ✓
The verify subcommand dispatches to `verifyGenesisCert` already; STM
needs a parallel `verifyStandardCert` wired through the same path.