Author SHA1 Message Date
edf679504d v0.3 step 1: foods schema + USDA SR Legacy density seed

Phase A foundation. Cobb 2026-04-29: 'go big or go home' on the
density-table aggregator. This commit lands the schema and seed data so
the aggregator engine has something to look up against in step 2.

DB:
- migration 010: cauldron_foods (canonical_name PK, density_g_per_ml,
  default_unit_class enum mass/volume/count/mixed, common_size_g,
  category, usda_fdc_id, source enum)
- migration 011: cauldron_food_mapping (per-household Mealie food_id →
  cauldron canonical food_id, used by aggregator + foods-dedupe later)
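
A rough sketch of what the two tables might look like, for orientation
only. It uses SQLite so it runs standalone; the real engine, column types,
and the values of the source enum are not stated in this commit, and the
mapping-table column names are hypothetical. The density value in the
sample insert is illustrative, not taken from the seed.

```python
import sqlite3

# Sketch only: column names for cauldron_foods come from the commit message;
# types and constraints are assumptions.
SCHEMA = """
CREATE TABLE cauldron_foods (
    canonical_name     TEXT PRIMARY KEY,
    density_g_per_ml   REAL,
    default_unit_class TEXT NOT NULL
        CHECK (default_unit_class IN ('mass', 'volume', 'count', 'mixed')),
    common_size_g      REAL,
    category           TEXT,
    usda_fdc_id        INTEGER,
    source             TEXT  -- an enum in the migration; its values are not listed
);
CREATE TABLE cauldron_food_mapping (
    household_id   TEXT NOT NULL,  -- hypothetical column name
    mealie_food_id TEXT NOT NULL,  -- hypothetical column name
    canonical_name TEXT NOT NULL REFERENCES cauldron_foods(canonical_name),
    PRIMARY KEY (household_id, mealie_food_id)
);
"""


def create_schema(conn):
    """Create both tables on an open connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute(
    "INSERT INTO cauldron_foods (canonical_name, density_g_per_ml, default_unit_class)"
    " VALUES (?, ?, ?)",
    ("olive oil", 0.91, "volume"),  # illustrative value
)
```

The CHECK constraint stands in for the enum, so a row with an unknown
default_unit_class is rejected at insert time.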

Seed data:
- scripts/build_foods_seed.py — extractor that walks USDA SR Legacy
  foodPortions, derives density g/ml from cup/tbsp/tsp/fl-oz/ml/etc
  measurements (handles SR Legacy's quirk of putting unit in 'modifier'
  with measureUnit.name='undetermined'), filters out babyfood / branded
  / fast-food / alcoholic-beverage clutter, normalizes names, categorizes
  via longest-keyword-wins
- cauldron/data/foods_seed_usda.json — 2,462 foods with density values
  derived from USDA. 636KB, ships in the image.
- cauldron/data/README.md — regen instructions + known issues / iteration
  plan (next pass: claude-curated cleanup → ~500-800 high-relevance entries
  + count-based foods like egg/onion that USDA doesn't cover)
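
The density derivation described above can be sketched roughly as follows.
This is not the code in scripts/build_foods_seed.py; the helper names are
hypothetical, the unit table covers only a few US volume units, and the
modifier handling (take the text before the first comma) is a simplifying
assumption.

```python
# Millilitres per US volume unit (rounded).
ML_PER_UNIT = {
    "cup": 236.588,
    "tbsp": 14.787,
    "tsp": 4.929,
    "fl oz": 29.574,
    "ml": 1.0,
}


def portion_unit(portion):
    """Return the unit name, falling back to 'modifier' when measureUnit is undetermined."""
    name = ((portion.get("measureUnit") or {}).get("name") or "").lower()
    if name in ("", "undetermined"):
        # SR Legacy quirk: the unit lives in 'modifier', e.g. "cup, chopped".
        name = (portion.get("modifier") or "").lower().split(",")[0]
    return name.strip()


def density_g_per_ml(portion):
    """Derive g/ml from one foodPortions entry, or None if it is not a known volume unit."""
    ml = ML_PER_UNIT.get(portion_unit(portion))
    amount = portion.get("amount") or 0
    grams = portion.get("gramWeight") or 0
    if ml is None or amount <= 0 or grams <= 0:
        return None
    return grams / (amount * ml)


example = {
    "measureUnit": {"name": "undetermined"},
    "modifier": "cup",
    "amount": 1.0,
    "gramWeight": 244.0,
}
density = density_g_per_ml(example)  # 244 g per 236.588 ml
```

Portions whose unit is not a recognised volume (for example "slice" or
"large") return None, which is one way such entries could be skipped.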

Loader (cauldron/foods.py):
- load_seed_if_empty(db) called on app startup right after migrate().
  Idempotent — won't reload if table is non-empty.
- reload_seed(db) for forced reloads (INSERT IGNORE).
- search_food(db, name) helper for the aggregator + UI.

Categories present in seed:
  produce-vegetable: 300, spice: 256, dairy: 207, condiment: 197,
  legume: 189, meat: 166, beverage: 153, baking: 129, produce-fruit: 128,
  oil-fat: 126, nut-seed: 115, grain: 89, other: 407
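
For reference, longest-keyword-wins categorization could look roughly like
this. The keyword table is a made-up illustration, not the one used by the
extractor, and plain substring matching is a simplifying assumption.

```python
# Hypothetical keyword -> category table; the real one is in the extractor script.
CATEGORY_KEYWORDS = {
    "oil": "oil-fat",
    "olive oil": "oil-fat",
    "mayonnaise": "condiment",
    "pepper": "spice",
    "peppers, sweet": "produce-vegetable",
}


def categorize(name):
    """Pick the category of the longest keyword found in the name, else 'other'."""
    lowered = name.lower()
    matches = [kw for kw in CATEGORY_KEYWORDS if kw in lowered]
    if not matches:
        return "other"
    return CATEGORY_KEYWORDS[max(matches, key=len)]


print(categorize("Mayonnaise, reduced fat, with olive oil"))
print(categorize("Peppers, sweet, red, raw"))
```

Preferring the longest match lets a specific keyword ("peppers, sweet")
override a generic one ("pepper"); names that match nothing fall into the
'other' bucket.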

The 407-entry 'other' bucket and the verbose USDA names (e.g. 'mayonnaise,
reduced fat, with olive oil') will get cleaned up via clawdforge in step 3.
For now the aggregator can already do the math against this seed; the
unit-conversion engine is the next commit.
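
As an illustration of the math the seed enables, a volume quantity can be
turned into grams with a density lookup. This is a sketch under assumed
names (volume_to_grams, the unit table), not the aggregator or the
unit-conversion engine itself.

```python
# Millilitres per US volume unit (rounded).
ML_PER_UNIT = {"cup": 236.588, "tbsp": 14.787, "tsp": 4.929, "fl oz": 29.574, "ml": 1.0}


def volume_to_grams(amount, unit, density_g_per_ml):
    """Convert a volume quantity to grams: amount * ml-per-unit * g/ml."""
    if unit not in ML_PER_UNIT:
        raise ValueError(f"not a known volume unit: {unit!r}")
    return amount * ML_PER_UNIT[unit] * density_g_per_ml


# Two cups of a food with density 0.5 g/ml (an illustrative value).
grams = volume_to_grams(2, "cup", 0.5)
```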
2026-04-28 22:03:17 -07:00