Two changes:
1. foods catalog grows organically. Switch the canonical seed from the
noisy USDA dump (2462 rows of "'s, classic chicken noodle soup")
to the Sonnet-curated cut (229 clean rows). search_food() is now
exact + case-insensitive — Mealie's parser already canonicalizes
food names household-side, so cauldron just needs to look them up
verbatim. On a miss, the /list view calls forge.fetch_food_info() to
ask Sonnet for {density_g_per_ml, default_unit_class, common_size_g,
category} and persists the row with source='claude', so the household's
actual kitchen catalog builds itself out as Abby uses it.
Killer case verified end-to-end: "2 cups + 50g + 1.25 lb rice"
collapses to a single "2.25 lb rice" line on the shopping list once
rice has a density row.
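The lookup-then-fetch flow described in item 1 could look roughly like the sketch below. It is not the actual cauldron code: the simplified table schema, the make_db() and lookup_or_fetch() helpers, and the injected fetch_food_info callable (standing in for forge.fetch_food_info()) are all assumptions for illustration, and SQLite is used only so the example runs on its own.

```python
import sqlite3

# Assumed, simplified column set; the real cauldron_foods schema may differ.
COLUMNS = ("canonical_name", "density_g_per_ml", "default_unit_class",
           "common_size_g", "category", "source")


def make_db():
    """Hypothetical helper: in-memory stand-in for the cauldron_foods table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE cauldron_foods ("
        "canonical_name TEXT UNIQUE, density_g_per_ml REAL, "
        "default_unit_class TEXT, common_size_g REAL, "
        "category TEXT, source TEXT)"
    )
    return db


def search_food(db, name):
    """Exact, case-insensitive match on canonical_name; None on a miss."""
    row = db.execute(
        f"SELECT {', '.join(COLUMNS)} FROM cauldron_foods "
        "WHERE lower(canonical_name) = lower(?)",
        (name.strip(),),
    ).fetchone()
    return dict(zip(COLUMNS, row)) if row else None


def lookup_or_fetch(db, name, fetch_food_info):
    """Return the catalog row, asking the fetcher and persisting on a miss."""
    hit = search_food(db, name)
    if hit is not None:
        return hit
    info = fetch_food_info(name)  # stand-in for forge.fetch_food_info()
    db.execute(
        "INSERT INTO cauldron_foods VALUES (?, ?, ?, ?, ?, 'claude')",
        (name.strip(), info["density_g_per_ml"], info["default_unit_class"],
         info["common_size_g"], info["category"]),
    )
    db.commit()
    return search_food(db, name)
```

Passing the fetcher in as an argument keeps the sketch testable without a live model call; a second lookup for the same name, in any letter case, is served from the table.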
2. Game system stripped from /plan. Scoreboard panel, streak banner,
"first to lock takes the week" / "🏆 you locked this one in" copy
all gone. award_pick_points calls in /api/plan/generate +
/api/plan/regenerate stopped firing. household_scoreboard /
household_streak DB methods kept as dead code; cauldron_pick_points
table left in place — non-destructive, easy to revive later if
gamification comes back. Goal: get the base flow (pick → plan →
list) working for Abby first, layer features on after.
cauldron/data — seed data shipped with the app
foods_seed_usda.json
Canonical foods catalog seeded from USDA SR Legacy (2018-04 release).
Each entry has a derived density_g_per_ml from USDA's foodPortions
data — the quantity-of-grams reported for one cup / tablespoon / etc.
The aggregator engine (Phase A step 2) uses these density values to combine "2 cups rice + 1.25 lb rice" into a single shopping-list line.
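As a rough illustration of how a density value lets volume and mass quantities be summed, here is a minimal sketch. The function names, the unit tables, and the 200 g per cup portion are assumptions made for the example; they are not taken from the aggregator engine or from USDA data.

```python
# Standard US customary conversions, rounded.
ML_PER_UNIT = {"ml": 1.0, "tsp": 4.92892, "tbsp": 14.7868, "cup": 236.588}
G_PER_UNIT = {"g": 1.0, "kg": 1000.0, "oz": 28.3495, "lb": 453.592}


def density_from_portion(gram_weight, amount, unit):
    """Derive g/ml from a portion record such as '200 g per 1 cup'."""
    return gram_weight / (amount * ML_PER_UNIT[unit])


def total_grams(quantities, density_g_per_ml):
    """Sum (amount, unit) pairs for one food, converting volumes via density."""
    total = 0.0
    for amount, unit in quantities:
        if unit in G_PER_UNIT:
            total += amount * G_PER_UNIT[unit]
        elif unit in ML_PER_UNIT:
            total += amount * ML_PER_UNIT[unit] * density_g_per_ml
        else:
            raise ValueError(f"unknown unit: {unit!r}")
    return total


# Illustrative portion only (not a USDA figure): 200 g per cup.
density = density_from_portion(200.0, 1, "cup")
grams = total_grams([(2, "cup"), (1.25, "lb")], density)
pounds = grams / G_PER_UNIT["lb"]
```

Normalizing everything to grams first and choosing the display unit afterwards keeps the summation independent of how the shopping list decides to present the line.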
Regenerate
If USDA ships a new SR Legacy dataset:
# 1. Download the new dataset from https://fdc.nal.usda.gov/download-datasets
# Pick the "SR Legacy / Full Download — JSON" link.
# 2. Unzip somewhere local, e.g. /tmp/usda-sr.json
# 3. Re-run the extractor:
python3 scripts/build_foods_seed.py /tmp/usda-sr.json > cauldron/data/foods_seed_usda.json
# 4. Commit the resulting JSON
Loading
cauldron/foods.py runs load_seed_if_empty(db) on app boot — only loads
when the cauldron_foods table is empty. Safe to redeploy without re-loading.
For a manual reload (e.g. after updating the seed without dropping the
table), call foods.reload_seed(db) which uses INSERT IGNORE on the
canonical_name unique key.
Known issues / iteration
The USDA SR Legacy descriptions are verbose and often brand-laden (e.g. "Apples, sulfured, stewed, with added sugar"). Our normalization is a heuristic, so expect ~15% of entries to have suboptimal canonical names. The Phase A step 3 plan is to feed the seed through clawdforge → Sonnet to:
- Drop entries that aren't useful cooking foods
- Normalize names (drop ", raw", merge brand variants)
- Add count-based foods USDA doesn't cover (e.g. "egg", "onion" in count form)
- Curate down to ~500-800 high-relevance foods
Until that step lands, expect to search the canonical_name field manually to find the entry you want; the aggregator's fuzzy matching covers most cases.