mithril-go/STM-SPRINT.md
Kayos 20853a9d44 STM-SPRINT.md: detailed plan for the BLS verification sprint
Read the upstream Rust (mithril-stm) end to end for STM-side types,
wrote up what we need to port, library choices, gotchas to watch,
and milestone breakdown (A: decoder, B: BLS single verify, C: Merkle,
D: lottery threshold, E: full aggregate verify, F: chain verify).

Ready for a focused crypto sprint — genesis Ed25519 + M2M plumbing
are done and shipping.
2026-04-23 15:36:05 -07:00

6.7 KiB
Raw Permalink Blame History

STM BLS Verification Sprint

Notes for the next push. Everything here comes from reading the upstream Rust (input-output-hk/mithril, as of 2026-04) plus the Mithril paper (Chotard/Kiayias/Peters 2021).

Scheme summary

Mithril uses the BLS min_sig variant of blst:

  • Signatures in G1 (48 bytes compressed)
  • Verification keys in G2 (96 bytes compressed)
  • Pairing used for batch verification

Every signer has a BLS keypair plus a stake weight. Signers try to sign for lottery indices 0..m — they qualify at index i if the dense-map hash over (msg, i, sig) falls below a threshold proportional to their stake fraction. A k-of-m threshold of lottery wins is required for a valid aggregate.

Key primitives to port

BlsSignature::verify(msg, vk) (mithril-stm signature_scheme/bls_multi_signature/signature.rs)

Plain BLS verification — hash msg to G1, pair against vk in G2. Use gnark-crypto/ecc/bls12-381 with the standard IETF BLS_SIG_BLS12381G1_XMD:SHA-256_SSWU_RO_NUL_ ciphersuite (matches blst's min_sig default).

evaluate_dense_mapping(msg, index) (same file)

ev = BLAKE2b512( "map" || msg || index.to_le_bytes() || sig.to_bytes() )

Returns 64 bytes. Used for lottery check.

Lottery threshold

ev < 2^{512} · (1 - (1 - phi_f)^{stake/totalStake})

Where phi_f is an f64 from the protocol parameters (e.g. 0.7 on preprod). Care: the 2^512 vs ev comparison is an arbitrary-precision compare; use math/big.Int.

ProtocolAggregateVerificationKey — Merkle commitment

AVK = Merkle-tree root over a sorted list of (vk_bytes, stake_u64) pairs. Tree leaves are BLAKE2b-256(vk || stake_le_bytes). Internal nodes are BLAKE2b-256(left || right).

The AVK that the aggregator publishes in protocol_message.next_aggregate_verification_key is the hex-of-JSON of a struct like:

{"mt_commitment":{"root":[b0,b1,...,b31],"nr_leaves":N,"hasher":null},
 "total_stake":<u64>}

(Seen in preprod epoch 196 genesis). nr_leaves = number of registered signers; total_stake is the sum.

StmAggrSig — the multi_signature payload

From mithril-stm/src/protocol/aggregate_signature/signature.rs. The verify_aggregate(msg, avk, params) function is the top-level we're mirroring. Struct shape (after CBOR decode):

  • signatures: k × StmSigRegParty
    • sig (BlsSignature, 48 bytes)
    • indexes (Vec, u64 each — winning indices for this signer)
    • signer_index (u64 — signer's position in the registered party list)
    • reg_party (BlsVerificationKey 96B + stake u64 + Merkle path bytes)
  • batch_proof: Merkle multi-proof certifying all included signers belong to the AVK

CBOR encoding is serde_cbor with default settings in the Rust code. Use fxamacker/cbor/v2 on the Go side with cbor.DecOptions{...} matching.

Verify algorithm sketch

Given (msg, multi_sig, avk, params={k, m, phi_f}):

  1. Decode CBOR → struct.
  2. Check there are ≥ k distinct lottery indices across all signatures.
  3. For each (signer, index, sig) tuple:
    • BLS-verify sig of concat(msg, index_le_bytes) against signer.vk. (Check upstream for exact message construction — might include a domain separator.)
    • Compute ev = evaluate_dense_mapping(msg, index) with sig.
    • Compute threshold T = 2^512 * (1 - (1 - phi_f)^{stake/total}).
    • Assert ev_as_bignum < T.
  4. Merkle-verify each signer is in avk (batch proof).
  5. All checks pass → valid aggregate.

Plan of attack

Milestone A — decoder first (no crypto). Goal: parse a real multi_signature CBOR from a preprod cert into Go structs and print it. One day, high confidence. Unlocks visibility into all the field sizes and shapes.

Milestone B — single-sig BLS verify. Implement BlsSignature::verify using gnark-crypto. Test against a hand-constructed case from the Rust reference (grab the test vectors from mithril-stm/src/signature_scheme/bls_multi_signature/signature.rs test section if they publish them; else round-trip-test via generating a pair in Rust and verifying in Go). One to two days.

Milestone C — Merkle AVK verification. Implement BLAKE2b-256 Merkle proof verification against the mt_commitment.root. Standalone, easy to unit-test.

Milestone D — lottery threshold arithmetic. math/big.Int + math/big.Float for the phi_f power computation; reference the Rust code for exact semantics (they use rug or malachite — careful about rounding).

Milestone E — full aggregate verify. Glue A-D together, test against real preprod certs. This is the milestone where we can say "mithril-go is a real validating client."

Milestone F — certificate chain verify. Walk the chain: each STM cert's AVK must match the previous cert's next_aggregate_verification_key, plus the STM verify above. Closes the loop all the way to genesis.

Library choice reminder

  • gnark-crypto (github.com/consensys/gnark-crypto) — pure Go, audited, standard library for BLS12-381 in Go land. Has min_sig mode via ecc/bls12-381/signature/bls.
  • fxamacker/cbor/v2 — pure Go CBOR, serde-compatible in the common shapes. Confirm byte-exact interop with serde_cbor via round-trip tests against captured multi_signature bytes.
  • golang.org/x/crypto/blake2b — stdlib-adjacent, works for BLAKE2b-256 and BLAKE2b-512.

blst Go bindings are faster but CGo — rejected per the "pure single binary" project goal.

Gotchas to watch

  • Byte order on indexes: index.to_le_bytes() in Rust — little-endian 8-byte u64. Get this wrong and every evaluate_dense_mapping disagrees.
  • Signature compression: blst default is compressed in G1 (48B). gnark-crypto uses the same serialization (IETF 4.2.1) — but confirm the sign-bit convention matches when round-tripping.
  • CBOR tag handling: serde_cbor historically used canonical CBOR; fxamacker/cbor needs DecOptions{DupMapKey: DupMapKeyEnforcedAPIError} if we want strict compatibility.
  • BTreeMap serialization order: the protocol_message hash (already handled for genesis) relies on Rust enum declaration order, not alpha. Any NEW fields upstream need the Go messagePartOrder slice updated.
  • phi_f arithmetic: the Rust uses higher-precision rationals for (1 - phi_f)^stake_frac. Go's math/big.Float at 512-bit precision should be enough; double-check against the Rust unit tests.

What's already done

Genesis Ed25519 path is fully working end-to-end:

  • SHA256(protocol_message) digest ✓
  • Ed25519 verify over ASCII-of-hex ✓
  • Genesis verify key decode ✓
  • Tested against live mainnet + preprod ✓

The verify subcommand dispatches to verifyGenesisCert already; STM needs a parallel verifyStandardCert wired through the same path.