New bulk job that scans Mealie's foods table for duplicate-feeling
clusters, asks Sonnet to pick the canonical survivor + flag the rest
as merge candidates, and uses Mealie's PUT /api/foods/merge to
consolidate. After each successful merge, alias_additions get pushed
onto the survivor so Mealie's CRF/NLP parser fuzzy-matches the
discarded variant names from then on.
Architecture mirrors bulk_sterilize.py:
- Migrations 018+019 add cauldron_consolidate_jobs +
cauldron_consolidate_proposals (state machine: running → review →
applying → done/failed/cancelled)
- New consolidate_foods.py — daemon-thread runner with cancel-respect
and stuck-job recovery
- /api/foods/consolidate-{start,status,jobs,apply,cancel} for session
users + /api/admin/foods/consolidate-{start,jobs,cancel} for kayos
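A minimal sketch of the job state machine and a cancel-respecting runner loop. The transition table is an assumption (the states come from the migration description, but which states may move to failed/cancelled is guessed), and `advance` / `run_scan` are hypothetical names, not the ones in consolidate_foods.py.

```python
import threading
from typing import Any, Callable, Dict, Iterable, Set

# Assumed transitions; only the happy path (running -> review -> applying
# -> done) is spelled out in the commit, the rest is a guess.
TRANSITIONS: Dict[str, Set[str]] = {
    "running": {"review", "failed", "cancelled"},
    "review": {"applying", "cancelled"},
    "applying": {"done", "failed", "cancelled"},
    "done": set(),
    "failed": set(),
    "cancelled": set(),
}


def advance(current: str, target: str) -> str:
    """Return target if the move is allowed, else raise ValueError."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


def run_scan(
    clusters: Iterable[Any],
    handle: Callable[[Any], None],
    cancel: threading.Event,
) -> str:
    """Process clusters one at a time, checking the cancel flag between items."""
    state = "running"
    try:
        for cluster in clusters:
            if cancel.is_set():
                return advance(state, "cancelled")
            handle(cluster)
    except Exception:
        # Any handler error ends the scan in the failed state.
        return advance(state, "failed")
    return advance(state, "review")
```

In a runner like the one described, `run_scan` would be the target of a `threading.Thread(..., daemon=True)`, with the cancel endpoint calling `cancel.set()`.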
Sonnet integration:
- forge.cluster_decision(foods) → returns {merge, canonical_id,
canonical_name, discard_ids, alias_additions, reason}
- Conservative-by-default: when in doubt Sonnet returns merge=false
(the "olive oil vs olive" false-positive case from the prompt)
- Alias rules in the prompt explain why we want discarded names to
travel back to the survivor as aliases (parser future-proofing)
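A sketch of how a decision with the shape above could be sanity-checked before it is stored as a proposal. `validate_decision` is a hypothetical helper and the specific checks are assumptions, not something forge.cluster_decision is known to do.

```python
from typing import Any, Dict, List, Optional

REQUIRED_KEYS = {
    "merge", "canonical_id", "canonical_name",
    "discard_ids", "alias_additions", "reason",
}


def validate_decision(decision: Dict[str, Any], cluster_ids: List[str]) -> Optional[str]:
    """Return None if the decision looks usable, else a short reason string."""
    missing = REQUIRED_KEYS - decision.keys()
    if missing:
        return f"missing keys: {sorted(missing)}"
    if not decision["merge"]:
        # merge=false is the conservative default; nothing else to check.
        return None
    ids = set(cluster_ids)
    discards = decision["discard_ids"]
    if decision["canonical_id"] not in ids:
        return "canonical_id is not a cluster member"
    if not discards:
        return "merge=true but discard_ids is empty"
    if decision["canonical_id"] in discards:
        return "canonical_id also listed in discard_ids"
    if not set(discards) <= ids:
        return "discard_ids contains a non-member"
    return None
```

A non-None result could be treated the same as merge=false, which keeps the job conservative when the model output is malformed.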
Mealie integration:
- mealie.merge_foods(from_id, to_id) → PUT /api/foods/merge
- mealie.update_food(food_id, body) → for pushing aliases onto the
survivor after merges land
- Apply path catches 403/permission errors and surfaces them as the
per-cluster apply_error (cross-household merge attempts will fail
here, same way as the sterilize cross-household path)
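A sketch of the per-cluster apply path with the Mealie calls injected as callables, so no HTTP details are implied. `MealieHTTPError` and `apply_cluster` are hypothetical names, and the `{"aliases": [{"name": ...}]}` body shape is an assumption about what update_food sends.

```python
from typing import Any, Callable, Dict, List


class MealieHTTPError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status


def apply_cluster(
    decision: Dict[str, Any],
    merge_foods: Callable[[str, str], None],
    update_food: Callable[[str, Dict[str, Any]], None],
    existing_aliases: List[str],
) -> Dict[str, Any]:
    """Merge each discard into the survivor, then push aliases onto it."""
    merged: List[str] = []
    survivor = decision["canonical_id"]
    try:
        for from_id in decision["discard_ids"]:
            merge_foods(from_id, survivor)
            merged.append(from_id)
        if merged and decision["alias_additions"]:
            # Keep existing aliases and drop exact duplicates, preserving order.
            names = list(dict.fromkeys(
                list(existing_aliases) + list(decision["alias_additions"])
            ))
            # Assumed body shape for the alias update.
            update_food(survivor, {"aliases": [{"name": n} for n in names]})
    except MealieHTTPError as exc:
        # A 403 (e.g. cross-household merge) becomes the cluster's apply_error.
        return {"merged": merged, "apply_error": f"HTTP {exc.status}: {exc}"}
    return {"merged": merged, "apply_error": None}
```

Returning the partial `merged` list alongside `apply_error` lets the review grid show which discards landed before the failure.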
Clustering:
- rapidfuzz token_set_ratio ≥ 88 (slightly stricter than Mealie's
parser threshold of 85 to reduce false-positive clusters)
- Single-link agglomerative: O(n²), but Cobb's ~3000 foods = ~9M
comparisons at worst (~4.5M unique pairs), runs in seconds
- Singleton clusters (no merge candidates) are dropped, not stored
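A sketch of threshold-based single-link clustering via union-find, with singletons dropped as described above. The scorer is passed in so the block runs without third-party packages; `toy_score` is a stand-in (token Jaccard), NOT rapidfuzz, and `single_link_clusters` is a hypothetical name.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence


def single_link_clusters(
    names: Sequence[str],
    score: Callable[[str, str], float],
    threshold: float = 88.0,
) -> List[List[str]]:
    """Link every pair scoring >= threshold; return clusters of size >= 2."""
    parent = list(range(len(names)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All unique pairs: n * (n - 1) / 2 scorer calls.
    for a, b in combinations(range(len(names)), 2):
        if score(names[a], names[b]) >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[rb] = ra

    groups: Dict[int, List[str]] = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    # Singletons have no merge candidates, so they are dropped.
    return [members for members in groups.values() if len(members) > 1]


def toy_score(a: str, b: str) -> float:
    """Stand-in scorer on a 0-100 scale: case-insensitive token Jaccard."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return 100.0 * len(ta & tb) / len(ta | tb)
```

With rapidfuzz installed, `rapidfuzz.fuzz.token_set_ratio` can be passed as `score` in place of `toy_score`. Single-link means one strong pair is enough to join two groups, so chains of near-matches can end up in one cluster; the Sonnet pass is what decides whether such a cluster is actually merged.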
UI:
- /consolidate — same shape as /sterilize: progress bar → review grid
→ apply button. Cards show member chips with the canonical marked
★, discards marked × in red, alias_additions listed in green, plus
Sonnet's one-line reasoning. Mergeable clusters are approved by
default; the user toggles individual clusters off if they disagree.
- Linked from /me → tools section, alongside bulk sterilize.
Total: ~600 LoC across 6 files. This lays the foundation for the
"Mealie owns canonical names" architectural rule, which is now
actually enforceable: cobb runs this once, his foods table gets
cleaned up, and Sonnet's catalog-aware parser (Step 1) starts
matching aliases for free.