cauldron/cauldron/data
Kayos d649b99aef v0.3 step 5: lean shopping list — claude on-demand foods + game strip
Two changes:

1. foods catalog grows organically. Switch the canonical seed from the
   noisy USDA dump (2462 rows of "'s, classic chicken noodle soup")
   to the Sonnet-curated cut (229 clean rows). search_food() is now
   exact + case-insensitive — Mealie's parser already canonicalizes
   food names household-side, so cauldron just needs to look them up
   verbatim. On miss, the /list view calls forge.fetch_food_info() to
   ask Sonnet for {density_g_per_ml, default_unit_class, common_size_g,
   category}, persists the row with source='claude', and the household's
   actual kitchen catalog builds itself out as Abby uses it.

   Killer case verified end-to-end: "2 cups + 50g + 1.25 lb rice"
   collapses to a single "2.25 lb rice" line on the shopping list once
   rice has a density row.

2. Game system stripped from /plan. Scoreboard panel, streak banner,
   "first to lock takes the week" / "🏆 you locked this one in" copy
   all gone. award_pick_points calls in /api/plan/generate +
   /api/plan/regenerate stopped firing. household_scoreboard /
   household_streak DB methods kept as dead code; cauldron_pick_points
   table left in place — non-destructive, easy to revive later if
   gamification comes back. Goal: get the base flow (pick → plan →
   list) working for Abby first, layer features on after.
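
A minimal sketch of the lookup-then-fetch flow from change 1, not the code in this commit. It assumes a plain dict in place of the cauldron_foods table and takes the fetcher as a parameter so the example runs without clawdforge. get_or_fetch_food and fake_fetch are hypothetical names, and the field values are made up.

```python
def search_food(catalog, name):
    """Exact, case-insensitive lookup. Returns the stored row or None."""
    return catalog.get(name.lower())


def get_or_fetch_food(catalog, name, fetch_food_info):
    """Look the food up; on a miss, fetch its info and persist the new row."""
    row = search_food(catalog, name)
    if row is None:
        info = fetch_food_info(name)
        row = {"canonical_name": name.lower(), **info, "source": "claude"}
        catalog[row["canonical_name"]] = row
    return row


calls = []  # records every name the stub fetcher was asked about


def fake_fetch(name):
    """Stub standing in for forge.fetch_food_info(); the values are made up."""
    calls.append(name)
    return {
        "density_g_per_ml": 0.78,
        "default_unit_class": "mass",
        "common_size_g": None,
        "category": "grain",
    }


catalog = {}
first = get_or_fetch_food(catalog, "rice", fake_fetch)   # miss: fetched and stored
second = get_or_fetch_food(catalog, "Rice", fake_fetch)  # hit: no second fetch
print(len(calls), first["source"])
```

The second call differs only in case, so it resolves to the row persisted by the first call and the fetcher is not asked again.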
2026-04-29 22:02:20 -07:00
foods_seed.json v0.3 step 5: lean shopping list — claude on-demand foods + game strip 2026-04-29 22:02:20 -07:00
foods_seed_usda.json v0.3 step 1: foods schema + USDA SR Legacy density seed 2026-04-28 22:03:17 -07:00
README.md v0.3 step 1: foods schema + USDA SR Legacy density seed 2026-04-28 22:03:17 -07:00

cauldron/data — seed data shipped with the app

foods_seed_usda.json

Canonical foods catalog seeded from USDA SR Legacy (2018-04 release). Each entry has a density_g_per_ml value derived from USDA's foodPortions data: the gram weight reported for one cup, tablespoon, and so on, divided by that measure's volume in millilitres.

The aggregator engine (Phase A step 2) uses these density values to combine "2 cups rice + 1.25 lb rice" into a single shopping-list line.
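
The idea can be sketched as follows. This is an illustration under stated assumptions, not the aggregator's actual code: to_grams and combine are hypothetical names, and the rice density (roughly 185 g per cup) is an illustrative value rather than a row read from the seed.

```python
# Unit tables: volume units in millilitres, mass units in grams.
ML_PER_UNIT = {"ml": 1.0, "tsp": 4.92892, "tbsp": 14.7868, "cup": 236.588}
G_PER_UNIT = {"g": 1.0, "kg": 1000.0, "oz": 28.3495, "lb": 453.592}


def to_grams(qty, unit, density_g_per_ml):
    """Convert one quantity to grams; volume units go through the density."""
    if unit in G_PER_UNIT:
        return qty * G_PER_UNIT[unit]
    if unit in ML_PER_UNIT:
        return qty * ML_PER_UNIT[unit] * density_g_per_ml
    raise ValueError(f"unknown unit: {unit!r}")


def combine(lines, density_g_per_ml):
    """Sum (qty, unit) pairs for a single food and return the total in grams."""
    return sum(to_grams(qty, unit, density_g_per_ml) for qty, unit in lines)


# Illustrative density only (assumption): about 185 g of rice per cup.
RICE_DENSITY = 185 / ML_PER_UNIT["cup"]

total_g = combine([(2, "cup"), (1.25, "lb")], RICE_DENSITY)
total_lb = total_g / G_PER_UNIT["lb"]
print(f"{total_lb:.2f} lb rice")
```

Everything is converted to grams first, so volume and weight entries for the same food can be added directly; the display unit is chosen only at the end.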

Regenerate

If USDA ships a new SR Legacy dataset:

# 1. Download the new dataset from https://fdc.nal.usda.gov/download-datasets
#    Pick the "SR Legacy / Full Download — JSON" link.
# 2. Unzip somewhere local, e.g. /tmp/usda-sr.json
# 3. Re-run the extractor:
python3 scripts/build_foods_seed.py /tmp/usda-sr.json > cauldron/data/foods_seed_usda.json
# 4. Commit the resulting JSON

Loading

cauldron/foods.py runs load_seed_if_empty(db) on app boot. It loads the seed only when the cauldron_foods table is empty, so redeploying does not re-load it. For a manual reload (e.g. after updating the seed without dropping the table), call foods.reload_seed(db). It uses INSERT IGNORE on the canonical_name unique key, so rows that already exist are left untouched and only new names are added.

Known issues / iteration

The USDA SR Legacy descriptions are verbose and brand-laden ("Apples, sulfured, stewed, with added sugar"). Our normalization is a heuristic — expect ~15% of entries to have suboptimal canonical names. The Phase A step 3 plan is to feed the seed through clawdforge → Sonnet to:

  1. Drop entries that aren't useful cooking foods
  2. Normalize names (drop ", raw", merge brand variants)
  3. Add count-based foods USDA doesn't cover (e.g. "egg", "onion" in count form)
  4. Curate down to ~500-800 high-relevance foods

Until that step lands, expect to manually search the canonical_name field to find what you want; the aggregator's fuzzy matching covers most of it.