recipe enrichment: per-recipe Sonnet meta for smarter planning

The 'fancy data fun' Cobb wanted: pre-compute structured metadata for
every recipe so the plan generator can match preferences to actual
recipe characteristics, not just match keywords on names.

Sonnet returns per recipe:
  - tags[]: curated descriptors (high-protein, weeknight, one-pan,
    leftovers-good, kid-friendly, etc — picks 3-8 that genuinely apply)
  - cuisine, complexity (easy/medium/involved), estimated_minutes
  - meal_type (breakfast/lunch/dinner/snack/dessert/side/sauce/drink)
  - primary_protein (chicken/beef/pork/fish/seafood/tofu/...)
  - primary_carb (rice/pasta/bread/potato/tortilla/quinoa/...)
  - veg_forward (veg-forward/mixed/meat-forward)
  - comfort_tier (weeknight-easy/hearty-comfort/fancy-occasion/...)
  - season_fit[] + summary one-liner + best_for short phrase

Schema:
- Migration 024: cauldron_recipe_meta keyed by (household_id, recipe_slug),
  meta_json + enrich_version (bumping the version invalidates the cache
  and forces re-walk). One row per Mealie recipe Cobb owns.
- Migration 025: cauldron_enrich_jobs — job runner state. No
  proposals/review needed since metadata is purely additive.
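
A minimal sketch of the one-row-per-recipe upsert described above. It uses an in-memory SQLite table as a stand-in for MySQL, so the statement is spelled ON CONFLICT ... DO UPDATE rather than ON DUPLICATE KEY UPDATE. The table name recipe_meta and the helper upsert_meta are illustrative only, not the names used in the commit.

```python
import json
import sqlite3

# In-memory stand-in for cauldron_recipe_meta (assumption: SQLite >= 3.24).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE recipe_meta (
           household_id   INTEGER NOT NULL,
           recipe_slug    TEXT    NOT NULL,
           meta_json      TEXT,
           enrich_version INTEGER NOT NULL DEFAULT 1,
           PRIMARY KEY (household_id, recipe_slug)
       )"""
)


def upsert_meta(household_id: int, slug: str, meta: dict, version: int) -> None:
    """Insert the row, or overwrite blob + version if the key already exists."""
    conn.execute(
        """INSERT INTO recipe_meta
               (household_id, recipe_slug, meta_json, enrich_version)
           VALUES (?, ?, ?, ?)
           ON CONFLICT (household_id, recipe_slug) DO UPDATE SET
               meta_json = excluded.meta_json,
               enrich_version = excluded.enrich_version""",
        (household_id, slug, json.dumps(meta), version),
    )


# Second call simulates a re-walk after a version bump: same key, new blob.
upsert_meta(1, "chicken-stir-fry", {"cuisine": "asian"}, 1)
upsert_meta(1, "chicken-stir-fry", {"cuisine": "asian", "complexity": "easy"}, 2)
rows = conn.execute("SELECT recipe_slug, enrich_version FROM recipe_meta").fetchall()
```

The composite primary key is what keeps re-enrichment from piling up duplicate rows; the older blob is replaced, not kept alongside.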

Forge:
- enrich_recipe(recipe) builds a compact prompt with name + description
  + ingredients + steps (capped at 2000 chars total) + yields, asks
  Sonnet for the structured blob. _extract_recipe_meta validates and
  coerces types.
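
The 2000-char cap on steps can be sketched as a small pure function. This mirrors the loop in enrich_recipe as I read it; cap_steps is a hypothetical name, and the real code works on Mealie step dicts rather than plain strings.

```python
def cap_steps(texts: list[str], budget: int = 2000) -> list[str]:
    """Keep step texts in order until the shared character budget runs out."""
    kept: list[str] = []
    for raw in texts:
        text = raw.strip()
        if not text or budget <= 0:
            continue  # blank step, or nothing left to spend
        if len(text) > budget:
            text = text[:budget]  # truncate the step that crosses the limit
        kept.append(text)
        budget -= len(text)
    return kept
```

For example, steps of 1500 and 1000 characters come back as 1500 and 500, and anything after that is dropped.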

Module enrich_recipes.py:
- Daemon thread runner, walks all household recipes, skips already-
  enriched at current ENRICH_VERSION (idempotent), respects external
  cancel + stuck-job recovery. Skips cross-household recipes (Lake
  Elsinore stuff visible but not enrichable).
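
The per-recipe decision the walker makes can be sketched like this. walk_decision and its string return values are hypothetical; the real runner updates job counters inline instead of returning a label.

```python
from typing import Optional


def walk_decision(
    recipe_household: Optional[str],
    user_household: Optional[str],
    existing: Optional[dict],
    current_version: int,
    force: bool = False,
) -> str:
    """Classify one recipe as 'skip-foreign', 'skip-current', or 'enrich'."""
    # Cross-household recipes are visible but not ours to enrich.
    if user_household and recipe_household and recipe_household != user_household:
        return "skip-foreign"
    # Already enriched at the current version: nothing to do unless forced.
    if not force and existing and existing.get("enrich_version") == current_version:
        return "skip-current"
    return "enrich"
```

Bumping the version makes every stored row fall through to 'enrich', which is the cache invalidation the schema notes describe.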

Plan generator hookup:
- /api/plan/generate + regenerate now pull cauldron_recipe_meta and
  splice it into the recipe pool prompt. Each pool line goes from:
      - chicken-stir-fry: Chicken Stir Fry  [asian]
  to:
      - chicken-stir-fry: Chicken Stir Fry  [asian · easy · 30min ·
        protein:chicken · carb:rice · high-protein/weeknight/one-pan]
        quick weeknight stir-fry with leftover-friendly portions
  Sonnet now has rich attributes to actually match a 'high protein
  week' or 'comfort food' or 'quick' preference against, instead of
  guessing from titles.
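
A rough sketch of how one enriched pool line gets assembled, following the field order in the example above. pool_line is a hypothetical helper; the real code also prepends up to three Mealie tags, which this leaves out, and the exact spacing and indent are assumptions.

```python
def pool_line(slug: str, name: str, meta: dict) -> str:
    """Render one recipe-pool prompt line from a validated meta blob."""
    extras: list[str] = []
    cuisine = meta.get("cuisine")
    if cuisine and cuisine not in ("unknown", "other"):
        extras.append(cuisine)
    if meta.get("complexity"):
        extras.append(meta["complexity"])
    minutes = meta.get("estimated_minutes")
    if isinstance(minutes, int) and minutes > 0:
        extras.append(f"{minutes}min")
    protein = meta.get("primary_protein")
    if protein and protein != "none":
        extras.append(f"protein:{protein}")
    carb = meta.get("primary_carb")
    if carb and carb != "none":
        extras.append(f"carb:{carb}")
    veg = meta.get("veg_forward")
    if veg and veg != "mixed":  # 'mixed' is the default, so it adds no signal
        extras.append(veg)
    tags = meta.get("tags") or []
    if tags:
        extras.append("/".join(tags[:5]))

    line = f"- {slug}: {name}"
    if extras:
        line += f"  [{' · '.join(extras)}]"
    if meta.get("summary"):
        line += "\n    " + str(meta["summary"])[:140]  # summary on its own line
    return line


example = pool_line(
    "chicken-stir-fry",
    "Chicken Stir Fry",
    {
        "cuisine": "asian", "complexity": "easy", "estimated_minutes": 30,
        "primary_protein": "chicken", "primary_carb": "rice",
        "veg_forward": "mixed",
        "tags": ["high-protein", "weeknight", "one-pan"],
        "summary": "quick weeknight stir-fry with leftover-friendly portions",
    },
)
```

Default-ish values ('unknown', 'none', 'mixed') are dropped so they do not spend prompt tokens on noise.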

Endpoints:
- /enrich-recipes UI page (progress bar + start + force re-enrich +
  cancel; no review/approve since meta is additive)
- /api/recipes/enrich-{start,status,cancel} session-authed
- /api/admin/recipes/enrich-start bearer-authed for kayos kick-off

Cost (one-time): ~5s/recipe × 226 = ~20 min walk. Subsequent runs
only process new/changed recipes.
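
The estimate above as plain arithmetic, assuming the walk stays serial (one recipe at a time on a single thread) and the ~5s figure holds on average:

```python
SECONDS_PER_RECIPE = 5   # rough per-call latency quoted above
RECIPE_COUNT = 226       # recipes in the first full walk

total_seconds = SECONDS_PER_RECIPE * RECIPE_COUNT
total_minutes = total_seconds / 60
print(f"{total_seconds} s ≈ {total_minutes:.1f} min")
```

That works out to 1130 s, a little under 19 minutes, which the message rounds to ~20.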
Kayos 2026-04-30 20:08:20 -07:00
parent 820d65171b
commit 10849e0e95
6 changed files with 828 additions and 7 deletions


@ -408,6 +408,52 @@ MIGRATIONS = [
ALTER TABLE cauldron_meal_plans
ADD COLUMN IF NOT EXISTS preference_prompt VARCHAR(1000)
""",
# 024 — Per-recipe AI-generated metadata. Sonnet looks at the full
# recipe (name, description, ingredients, steps, yields) and returns
# a structured blob: tags, cuisine, complexity, estimated_minutes,
# meal_type, primary_protein, primary_carb, veg_forward, comfort_tier,
# season_fit, summary, best_for. Plan generator uses this so "high
# protein week" becomes a real query, not just a vibe-prompt.
# enrich_version lets us bump the prompt and re-enrich; rows from the
# old version stay readable until the walker overwrites them.
"""
CREATE TABLE IF NOT EXISTS cauldron_recipe_meta (
household_id BIGINT NOT NULL,
recipe_slug VARCHAR(255) NOT NULL,
meta_json JSON,
enrich_version INT NOT NULL DEFAULT 1,
last_enriched_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (household_id, recipe_slug),
INDEX idx_household (household_id),
FOREIGN KEY (household_id) REFERENCES cauldron_households(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
""",
# 025 — Recipe-enrichment bulk job state. Runs through every household
# recipe, calls Sonnet, persists meta. No apply/review step — meta is
# purely additive so we just write it. Same daemon-thread runner +
# cancel + stuck-recovery pattern as the other bulk jobs.
"""
CREATE TABLE IF NOT EXISTS cauldron_enrich_jobs (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
household_id BIGINT NOT NULL,
started_by_sub VARCHAR(190) NOT NULL,
started_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
last_progress_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
finished_at DATETIME,
total_recipes INT NOT NULL DEFAULT 0,
enriched_count INT NOT NULL DEFAULT 0,
skipped_count INT NOT NULL DEFAULT 0,
error_count INT NOT NULL DEFAULT 0,
current_slug VARCHAR(255),
last_error VARCHAR(500),
state ENUM('running','done','failed','cancelled')
NOT NULL DEFAULT 'running',
INDEX idx_household_state (household_id, state),
FOREIGN KEY (household_id) REFERENCES cauldron_households(id) ON DELETE CASCADE,
FOREIGN KEY (started_by_sub) REFERENCES cauldron_users(authentik_sub) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
""",
]
@ -1499,6 +1545,142 @@ class DB:
(proposal_id,),
)
# --- recipe enrichment ------------------------------------------------
ENRICH_VERSION = 1
def get_recipe_meta(self, household_id: int, recipe_slug: str) -> dict | None:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""SELECT recipe_slug, meta_json, enrich_version, last_enriched_at
FROM cauldron_recipe_meta
WHERE household_id=%s AND recipe_slug=%s""",
(household_id, recipe_slug),
)
row = cur.fetchone()
return dict(row) if row else None
def list_recipe_meta_for_household(self, household_id: int) -> list[dict]:
"""Used by the plan generator to splice meta into the recipe pool prompt."""
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""SELECT recipe_slug, meta_json, enrich_version
FROM cauldron_recipe_meta
WHERE household_id=%s""",
(household_id,),
)
return [dict(r) for r in cur.fetchall()]
def upsert_recipe_meta(
self,
*,
household_id: int,
recipe_slug: str,
meta_json: str,
version: int,
) -> None:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""INSERT INTO cauldron_recipe_meta
(household_id, recipe_slug, meta_json, enrich_version)
VALUES (%s, %s, %s, %s)
ON DUPLICATE KEY UPDATE
meta_json = VALUES(meta_json),
enrich_version = VALUES(enrich_version)""",
(household_id, recipe_slug, meta_json, version),
)
def create_enrich_job(self, *, household_id: int, started_by_sub: str) -> int:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""INSERT INTO cauldron_enrich_jobs
(household_id, started_by_sub, state)
VALUES (%s, %s, 'running')""",
(household_id, started_by_sub),
)
return cur.lastrowid
def get_enrich_job(self, job_id: int) -> dict | None:
with self.conn() as c, c.cursor() as cur:
cur.execute("SELECT * FROM cauldron_enrich_jobs WHERE id=%s", (job_id,))
return cur.fetchone()
def get_enrich_job_state(self, job_id: int) -> str | None:
with self.conn() as c, c.cursor() as cur:
cur.execute("SELECT state FROM cauldron_enrich_jobs WHERE id=%s", (job_id,))
row = cur.fetchone()
return row["state"] if row else None
def latest_enrich_job_for_household(self, household_id: int) -> dict | None:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""SELECT * FROM cauldron_enrich_jobs
WHERE household_id=%s ORDER BY started_at DESC LIMIT 1""",
(household_id,),
)
return cur.fetchone()
def running_enrich_job_for_household(self, household_id: int) -> dict | None:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""SELECT * FROM cauldron_enrich_jobs
WHERE household_id=%s AND state='running'
ORDER BY started_at DESC LIMIT 1""",
(household_id,),
)
return cur.fetchone()
def update_enrich_job_progress(
self,
job_id: int,
*,
enriched_delta: int = 0,
skipped_delta: int = 0,
error_delta: int = 0,
current_slug: str | None = None,
last_error: str | None = None,
) -> None:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""UPDATE cauldron_enrich_jobs
SET enriched_count = enriched_count + %s,
skipped_count = skipped_count + %s,
error_count = error_count + %s,
current_slug = COALESCE(%s, current_slug),
last_error = COALESCE(%s, last_error),
last_progress_at = NOW()
WHERE id=%s""",
(enriched_delta, skipped_delta, error_delta,
current_slug, last_error, job_id),
)
def finalize_enrich_job(self, job_id: int, *, state: str) -> None:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""UPDATE cauldron_enrich_jobs
SET state=%s,
finished_at = CASE WHEN %s IN ('done','failed','cancelled')
THEN NOW() ELSE finished_at END,
last_progress_at = NOW(),
current_slug = NULL
WHERE id=%s
AND state NOT IN ('done','failed','cancelled')""",
(state, state, job_id),
)
def fail_stuck_enrich_jobs(self, *, stale_minutes: int = 15) -> int:
with self.conn() as c, c.cursor() as cur:
cur.execute(
"""UPDATE cauldron_enrich_jobs
SET state='failed',
finished_at=NOW(),
last_error=COALESCE(last_error, 'recovery: worker exited mid-run')
WHERE state='running'
AND last_progress_at < NOW() - INTERVAL %s MINUTE""",
(stale_minutes,),
)
return cur.rowcount
def fail_stuck_recipe_dedupe_jobs(self, *, stale_minutes: int = 15) -> int:
with self.conn() as c, c.cursor() as cur:
cur.execute(

cauldron/enrich_recipes.py

@ -0,0 +1,179 @@
"""Recipe metadata enrichment — once per recipe, persist forever.
Walks the user's household recipes, calls forge.enrich_recipe(recipe)
on each one, persists the structured metadata to cauldron_recipe_meta
keyed by (household_id, recipe_slug).
No review/apply step: the metadata is purely additive. The plan
generator reads it next time it runs.
Idempotent: skips recipes already enriched at the current
db.DB.ENRICH_VERSION. Bumping the version (when the prompt or schema
changes) forces a re-walk.
Same daemon-thread + cancel + stuck-recovery pattern as the other bulk jobs.
"""
from __future__ import annotations
import json
import logging
import threading
from .db import DB
from .forge import Forge, ForgeError
from .mealie import Mealie, MealieError
log = logging.getLogger(__name__)
def _household_id_for(mealie: Mealie) -> str | None:
me = mealie.who_am_i()
hid = me.get("householdId") or me.get("household_id")
if not hid:
h = me.get("household")
if isinstance(h, dict):
hid = h.get("id")
return hid
def _recipe_household_id(recipe: dict) -> str | None:
hid = recipe.get("householdId") or recipe.get("household_id")
if hid:
return hid
h = recipe.get("household")
if isinstance(h, dict):
return h.get("id")
return None
def run_enrich(
*,
db: DB,
job_id: int,
household_id: int,
mealie: Mealie,
forge: Forge,
force: bool = False,
) -> None:
"""Walk all recipes in the user's household, enrich each via Sonnet,
persist. Runs in a daemon thread; respects external cancel."""
log.info("[enrich:%s] start (force=%s)", job_id, force)
def _cancelled() -> bool:
s = db.get_enrich_job_state(job_id)
return s in ("cancelled", "failed", "done")
try:
user_household = _household_id_for(mealie)
# Pull every recipe slug from Mealie (paginated)
slugs: list[tuple[str, str]] = []
page = 1
while page <= 50:
resp = mealie.list_recipes(page=page, per_page=100)
items = resp.get("items") or []
for r in items:
slug = r.get("slug")
name = r.get("name") or slug or ""
if slug:
slugs.append((slug, name))
tp = resp.get("total_pages") or resp.get("totalPages") or 1
if not items or page >= tp:
break
page += 1
with db.conn() as c, c.cursor() as cur:
cur.execute(
"UPDATE cauldron_enrich_jobs SET total_recipes=%s WHERE id=%s",
(len(slugs), job_id),
)
for slug, name in slugs:
if _cancelled():
log.info("[enrich:%s] aborted (state changed)", job_id)
return
# Skip cross-household — only enrich what the user owns
try:
recipe = mealie.get_recipe(slug)
except MealieError as e:
msg = str(e)[:500]
log.warning("[enrich:%s] get_recipe(%s): %s", job_id, slug, msg)
db.update_enrich_job_progress(
job_id, error_delta=1, current_slug=slug, last_error=msg
)
continue
if user_household:
rec_hh = _recipe_household_id(recipe)
if rec_hh and rec_hh != user_household:
db.update_enrich_job_progress(
job_id, skipped_delta=1, current_slug=slug
)
continue
# Skip if already enriched at the current version (unless forced)
if not force:
existing = db.get_recipe_meta(household_id, slug)
if existing and existing.get("enrich_version") == db.ENRICH_VERSION:
db.update_enrich_job_progress(
job_id, skipped_delta=1, current_slug=slug
)
continue
db.update_enrich_job_progress(job_id, current_slug=slug)
try:
meta = forge.enrich_recipe(recipe)
except (ForgeError, RuntimeError) as e:
msg = str(e)[:500]
log.warning("[enrich:%s] enrich_recipe(%s): %s", job_id, slug, msg)
db.update_enrich_job_progress(
job_id, error_delta=1, current_slug=slug, last_error=msg
)
continue
try:
db.upsert_recipe_meta(
household_id=household_id,
recipe_slug=slug,
meta_json=json.dumps(meta, ensure_ascii=False),
version=db.ENRICH_VERSION,
)
db.update_enrich_job_progress(job_id, enriched_delta=1)
except Exception as e:
msg = str(e)[:500]
log.warning("[enrich:%s] persist(%s): %s", job_id, slug, msg)
db.update_enrich_job_progress(
job_id, error_delta=1, current_slug=slug, last_error=msg
)
db.finalize_enrich_job(job_id, state="done")
log.info("[enrich:%s] done", job_id)
except Exception:
log.exception("[enrich:%s] crashed", job_id)
try:
db.finalize_enrich_job(job_id, state="failed")
except Exception:
pass
def spawn_thread(
*,
db: DB,
job_id: int,
household_id: int,
mealie: Mealie,
forge: Forge,
force: bool = False,
) -> threading.Thread:
t = threading.Thread(
target=run_enrich,
kwargs={
"db": db, "job_id": job_id, "household_id": household_id,
"mealie": mealie, "forge": forge, "force": force,
},
name=f"enrich-recipes-{job_id}",
daemon=True,
)
t.start()
return t


@ -169,9 +169,11 @@ class Forge:
slug = r.get("slug") or ""
name = r.get("name") or slug
tags = r.get("tags") or []
tag_str = ""
meta = r.get("meta") or {}
extras: list[str] = []
# First 3 Mealie tags
if tags:
# First 3 tags only — keeps prompt token count under control
cleaned = []
for t in tags[:3]:
if isinstance(t, dict):
@ -180,8 +182,32 @@ class Forge:
cleaned.append(t)
cleaned = [c for c in cleaned if c]
if cleaned:
tag_str = f" [{', '.join(cleaned)}]"
pool_lines.append(f"- {slug}: {name}{tag_str}")
extras.append(", ".join(cleaned))
# Sonnet-generated meta — the actual high-signal stuff
if meta:
if meta.get("cuisine") and meta["cuisine"] not in ("unknown", "other"):
extras.append(meta["cuisine"])
if meta.get("complexity"):
extras.append(meta["complexity"])
em = meta.get("estimated_minutes")
if isinstance(em, int) and em > 0:
extras.append(f"{em}min")
if meta.get("primary_protein") and meta["primary_protein"] != "none":
extras.append(f"protein:{meta['primary_protein']}")
if meta.get("primary_carb") and meta["primary_carb"] != "none":
extras.append(f"carb:{meta['primary_carb']}")
if meta.get("veg_forward") and meta["veg_forward"] != "mixed":
extras.append(meta["veg_forward"])
meta_tags = meta.get("tags") or []
if meta_tags:
extras.append("/".join(meta_tags[:5]))
if meta.get("summary"):
# Inline 1-line summary helps Sonnet match preferences
summary = str(meta["summary"])[:140]
pool_lines.append(f"- {slug}: {name} [{' · '.join(extras)}]\n {summary}")
continue
extra_str = f" [{' · '.join(extras)}]" if extras else ""
pool_lines.append(f"- {slug}: {name}{extra_str}")
pick_lines = []
for p in picks:
@ -354,6 +380,113 @@ class Forge:
result = self.run(prompt, model=model or "sonnet", timeout_secs=60)
return _extract_cluster_decision(result)
def enrich_recipe(self, recipe: dict, *, model: str | None = None) -> dict:
"""Generate structured metadata for a recipe so the plan generator
can match preferences to actual recipe characteristics, not just
names.
Input: a Mealie recipe dict (uses name + description + recipeIngredient
+ recipeInstructions + recipeYield).
Output (validated):
{
"tags": [<curated descriptor strings>],
# e.g. "high-protein", "weeknight", "one-pan",
# "kid-friendly", "leftovers-good", "freezer-friendly"
"cuisine": "<american|italian|asian|mexican|...|other|unknown>",
"complexity": "easy|medium|involved",
"estimated_minutes": <int>,
"meal_type": "breakfast|lunch|dinner|snack|dessert|side|sauce|drink",
"primary_protein": "<chicken|beef|pork|fish|tofu|beans|eggs|none|mixed>",
"primary_carb": "<rice|pasta|bread|potato|tortilla|quinoa|none|mixed>",
"veg_forward": "veg-forward|mixed|meat-forward",
"comfort_tier": "<weeknight-easy|comfort|fancy|kid-friendly|...>",
"season_fit": [<season strings>],
"summary": "<one-line vibe>",
"best_for": "<short phrase about when this is the right pick>"
}
Cheap call, idempotent: run once per recipe and cache forever
(or until enrich_version bumps)."""
# Build a compact recipe summary for the prompt
ings = recipe.get("recipeIngredient") or []
ing_lines: list[str] = []
for i in ings[:30]:
food = (i.get("food") or {}).get("name") if isinstance(i.get("food"), dict) else None
qty = i.get("quantity")
unit = (i.get("unit") or {}).get("name") if isinstance(i.get("unit"), dict) else None
note = i.get("note") or ""
line = ""
if qty not in (None, ""):
line += f"{qty} "
if unit:
line += f"{unit} "
if food:
line += food
elif note:
line += note
if line.strip():
ing_lines.append(line.strip())
instructions = recipe.get("recipeInstructions") or []
steps: list[str] = []
char_budget = 2000
for step in instructions:
if not isinstance(step, dict):
continue
text = (step.get("text") or "").strip()
if not text or char_budget <= 0:
continue
if len(text) > char_budget:
text = text[:char_budget]
steps.append(text)
char_budget -= len(text)
prompt = (
"Given the following recipe, return structured metadata to help "
"an AI meal planner pick recipes that match user preferences "
"('high protein week', 'carb load', 'light recovery', etc).\n\n"
f"NAME: {recipe.get('name') or '(unnamed)'}\n"
f"DESCRIPTION: {(recipe.get('description') or '').strip()[:400]}\n"
f"YIELDS: {(recipe.get('recipeYield') or '').strip()[:80]}\n"
f"INGREDIENTS:\n - " + "\n - ".join(ing_lines or ['(none listed)']) + "\n"
f"STEPS:\n - " + "\n - ".join(steps or ['(none listed)']) + "\n\n"
"Output JSON ONLY, no prose:\n"
"{\n"
' "tags": [<curated descriptor strings — pick 3-8 from these or invent close variants: '
'"high-protein","low-carb","high-carb","low-fat","high-fiber",'
'"vegetarian","vegan","gluten-free","dairy-free","keto","paleo",'
'"weeknight","weekend","one-pan","one-pot","sheet-pan","slow-cooker","instant-pot",'
'"freezer-friendly","leftovers-good","kid-friendly","spicy","mild",'
'"hearty","light","fresh","comfort","fancy","quick","make-ahead">],\n'
' "cuisine": "<american|italian|asian|mexican|mediterranean|indian|french|middle-eastern|other|unknown>",\n'
' "complexity": "<easy|medium|involved>",\n'
' "estimated_minutes": <int total time including prep>,\n'
' "meal_type": "<breakfast|lunch|dinner|snack|dessert|side|sauce|drink>",\n'
' "primary_protein": "<chicken|beef|pork|fish|seafood|tofu|tempeh|beans|eggs|cheese|nuts|none|mixed>",\n'
' "primary_carb": "<rice|pasta|bread|potato|tortilla|quinoa|noodles|grain|none|mixed>",\n'
' "veg_forward": "<veg-forward|mixed|meat-forward>",\n'
' "comfort_tier": "<weeknight-easy|hearty-comfort|fancy-occasion|kid-friendly|date-night|crowd-pleaser>",\n'
' "season_fit": [<one or more of "spring","summer","fall","winter","year-round">],\n'
' "summary": "<one-line vibe — what KIND of meal is this>",\n'
' "best_for": "<short phrase: when is this the right pick>"\n'
"}\n\n"
"Rules:\n"
"- Return ONLY the JSON object, no markdown fences, no prose.\n"
"- Be concrete: 'high-protein' goes in tags ONLY if the recipe genuinely "
"qualifies (significant meat/eggs/dairy/protein source per serving).\n"
"- estimated_minutes: best guess from prep + cook implied by steps. Dishes "
"needing rise/marinade time count that time.\n"
"- complexity: 'easy' = ≤30 min + ≤7 ingredients + simple technique; "
"'medium' = 30-90 min OR moderate technique; 'involved' = >90 min OR "
"advanced technique (lamination, fermentation, multi-component).\n"
"- summary should describe the vibe / use-case, not just restate the name. "
"e.g. 'quick weeknight stir-fry with leftover-friendly portions' beats "
"'chicken stir fry with rice'.\n"
"- When uncertain on a categorical, use 'unknown' or 'other' rather than guessing."
)
result = self.run(prompt, model=model or "sonnet", timeout_secs=90)
return _extract_recipe_meta(result)
def fetch_food_info(self, name: str, *, model: str | None = None) -> dict:
"""Ask Sonnet for density + unit class + common size of a single
food. Returns a dict shaped like:
@ -390,6 +523,54 @@ class Forge:
return _extract_food_info(result)
def _extract_recipe_meta(forge_result: dict) -> dict:
"""Validate the recipe metadata blob from Sonnet. Coerces types,
normalizes enums to lowercase, drops fields not in the schema."""
if not isinstance(forge_result, dict):
raise ForgeError("forge result not a dict")
inner = forge_result.get("result", forge_result)
if isinstance(inner, str):
inner = _parse_json_blob(inner)
if not isinstance(inner, dict):
raise ForgeError(f"recipe meta not a dict: {str(inner)[:200]}")
def _str(v, default=""):
return str(v).strip().lower()[:64] if isinstance(v, str) and v.strip() else default
def _str_long(v, default=""):
return str(v).strip()[:300] if isinstance(v, str) and v.strip() else default
def _str_list(v) -> list[str]:
if not isinstance(v, list):
return []
out = []
for item in v:
if isinstance(item, str) and item.strip():
out.append(item.strip().lower()[:48])
return out[:12]
def _int(v, default=0):
try:
return max(0, int(v))
except (TypeError, ValueError):
return default
return {
"tags": _str_list(inner.get("tags")),
"cuisine": _str(inner.get("cuisine"), "unknown"),
"complexity": _str(inner.get("complexity"), "medium"),
"estimated_minutes": _int(inner.get("estimated_minutes")),
"meal_type": _str(inner.get("meal_type"), "dinner"),
"primary_protein": _str(inner.get("primary_protein"), "none"),
"primary_carb": _str(inner.get("primary_carb"), "none"),
"veg_forward": _str(inner.get("veg_forward"), "mixed"),
"comfort_tier": _str(inner.get("comfort_tier"), "weeknight-easy"),
"season_fit": _str_list(inner.get("season_fit")) or ["year-round"],
"summary": _str_long(inner.get("summary")),
"best_for": _str_long(inner.get("best_for")),
}
def _extract_recipe_dedupe_decision(forge_result: dict) -> dict:
if not isinstance(forge_result, dict):
raise ForgeError("forge result not a dict")


@ -33,7 +33,7 @@ from .config import load
from .crypto import TokenCrypto
from .db import DB
from .forge import Forge, ForgeError
from . import aggregator, bulk_sterilize, consolidate_foods, dedupe_recipes, foods
from . import aggregator, bulk_sterilize, consolidate_foods, dedupe_recipes, enrich_recipes, foods
from .mealie import Mealie, MealieError
from .oidc import init_oauth
from .recipe_index import flatten_recipe, refresh_household_index, search_index
@ -125,6 +125,13 @@ def create_app() -> Flask:
except Exception as e:
app.logger.warning("recipe-dedupe stuck-job recovery failed: %s", e)
try:
n_failed = db.fail_stuck_enrich_jobs(stale_minutes=15)
if n_failed:
app.logger.info("failed %d stuck enrich jobs at boot", n_failed)
except Exception as e:
app.logger.warning("enrich stuck-job recovery failed: %s", e)
oauth = init_oauth(
app,
issuer=cfg.oidc_issuer,
@ -662,9 +669,23 @@ def create_app() -> Flask:
db.set_plan_preference(plan["id"], preference)
plan["preference_prompt"] = preference[:1000]
# Pull picks (with picker_subs) + recipe pool (slug+name+tags only)
# Pull picks + recipe pool. The pool now splices in cauldron_recipe_meta
# (Sonnet-generated per-recipe attributes: cuisine, complexity,
# estimated_minutes, meal_type, primary_protein/carb, comfort_tier, summary)
# so the planner can match preferences to actual recipe characteristics,
# not just names.
picks = db.list_household_picks_with_pickers(hid)
rows = db.list_indexed_recipes(hid, limit=2000, offset=0)
meta_rows = db.list_recipe_meta_for_household(hid)
meta_by_slug: dict[str, dict] = {}
for mr in meta_rows:
blob = mr.get("meta_json")
if isinstance(blob, str):
try:
meta_by_slug[mr["recipe_slug"]] = _json_loads(blob)
except Exception:
pass
elif isinstance(blob, dict):
meta_by_slug[mr["recipe_slug"]] = blob
recipes = []
for r in rows:
tags = []
@ -676,7 +697,11 @@ def create_app() -> Flask:
raw = None
if isinstance(raw, dict):
tags = raw.get("tags") or []
recipes.append({"slug": r["slug"], "name": r["name"], "tags": tags})
entry = {"slug": r["slug"], "name": r["name"], "tags": tags}
m = meta_by_slug.get(r["slug"])
if m:
entry["meta"] = m
recipes.append(entry)
if not recipes:
return jsonify({"error": "no_recipes_indexed"}), 409
@ -1049,6 +1074,99 @@ def create_app() -> Flask:
db.finalize_sterilize_job(job_id, state="cancelled")
return jsonify({"ok": True})
# ---------- recipe metadata enrichment -----------------------------
@app.get("/enrich-recipes")
@require_session
def enrich_recipes_page():
hid = current_household_id()
if not hid:
return redirect(url_for("connect_mealie_get"))
latest = db.latest_enrich_job_for_household(hid)
existing_count = len(db.list_recipe_meta_for_household(hid))
return render_template(
"enrich_recipes.html",
active="enrich",
latest_job=latest,
existing_count=existing_count,
)
@app.post("/api/recipes/enrich-start")
@require_session
def enrich_recipes_start():
u = session["user"]
hid = current_household_id()
if not hid:
return jsonify({"error": "no household"}), 409
active = db.running_enrich_job_for_household(hid)
if active:
return jsonify({"error": "already_running", "job_id": active["id"]}), 409
client = current_user_mealie()
if client is None:
# JSON endpoint: report the problem instead of redirecting the fetch() caller
return jsonify({"error": "user_not_connected_to_mealie"}), 409
body = request.get_json(silent=True) or {}
force = bool(body.get("force"))
job_id = db.create_enrich_job(household_id=hid, started_by_sub=u["sub"])
enrich_recipes.spawn_thread(
db=db, job_id=job_id, household_id=hid,
mealie=client, forge=forge, force=force,
)
return jsonify({"ok": True, "job_id": job_id})
@app.get("/api/recipes/enrich-status")
@require_session
def enrich_recipes_status():
hid = current_household_id()
if not hid:
return jsonify({"error": "no household"}), 409
job = db.latest_enrich_job_for_household(hid)
if not job:
return jsonify({"job": None})
return jsonify({"job": _consolidate_job_payload(job)})
@app.post("/api/recipes/enrich-cancel/<int:job_id>")
@require_session
def enrich_recipes_cancel(job_id: int):
hid = current_household_id()
if not hid:
return jsonify({"error": "no household"}), 409
job = db.get_enrich_job(job_id)
if not job or job["household_id"] != hid:
return jsonify({"error": "not_found"}), 404
if job["state"] != "running":
return jsonify({"error": f"bad_state:{job['state']}"}), 409
db.finalize_enrich_job(job_id, state="cancelled")
return jsonify({"ok": True})
@app.post("/api/admin/recipes/enrich-start")
@require_bearer
def admin_enrich_recipes_start():
body = request.get_json(silent=True) or {}
sub = (body.get("started_by_sub") or "").strip()
if not sub:
return jsonify({"error": "started_by_sub required"}), 400
hid = db.get_user_household_id(sub)
if not hid:
return jsonify({"error": "user has no household"}), 404
active = db.running_enrich_job_for_household(hid)
if active:
return jsonify({"error": "already_running", "job_id": active["id"]}), 409
blob = db.get_user_mealie_token_blob(sub)
if not blob:
return jsonify({"error": "user_not_connected_to_mealie"}), 409
try:
tok = crypto.decrypt(blob)
except Exception:
return jsonify({"error": "user_token_undecryptable"}), 500
mealie = Mealie(base_url=cfg.mealie_api_url, api_token=tok)
force = bool(body.get("force"))
job_id = db.create_enrich_job(household_id=hid, started_by_sub=sub)
enrich_recipes.spawn_thread(
db=db, job_id=job_id, household_id=hid,
mealie=mealie, forge=forge, force=force,
)
return jsonify({"ok": True, "job_id": job_id})
# ---------- recipe dedupe ------------------------------------------
@app.get("/dedupe-recipes")


@ -0,0 +1,158 @@
{% extends "_base.html" %}
{% block title %}Enrich Recipes · Cauldron{% endblock %}
{% block content %}
<style>
.progress-rail { width:100%; height:14px; background:var(--bg-2);
border:1px solid var(--line); border-radius:8px; overflow:hidden;
margin:12px 0 6px 0; }
.progress-fill { height:100%;
background:linear-gradient(90deg, var(--purple-deep), var(--purple-bright));
transition:width .3s ease; box-shadow:0 0 12px -2px var(--purple-glow); }
.progress-meta { color:var(--bone-dim); font-family:var(--mono); font-size:12px;
letter-spacing:.1em; display:flex; gap:18px; flex-wrap:wrap; }
.progress-meta strong { color:var(--bone); }
.stat-card { padding:14px; background:var(--bg-2); border:1px solid var(--line);
border-radius:8px; margin-bottom:14px; }
.stat-card .big { font-family:var(--serif); font-size:1.5em; color:var(--bone); }
.stat-card .lbl { color:var(--muted); font-family:var(--mono); font-size:11px;
letter-spacing:.15em; text-transform:uppercase; }
</style>
<div class="page-head">
<div class="crumb">// enrich · per-recipe metadata for smarter planning</div>
<h1>recipe <span class="accent">enrich</span></h1>
<div class="lede">
walk every household recipe and have sonnet generate structured metadata:
cuisine, complexity, estimated time, meal type, primary protein + carb,
comfort tier, one-line summary. the plan generator uses this so
"high protein week" is matched against real recipe attributes, not just titles.
</div>
</div>
<section class="panel">
<div class="panel-head">
<h2>state</h2>
<span class="pill" id="state-pill">loading…</span>
<span class="ctx" id="state-ctx"></span>
</div>
<div class="stat-card">
<div class="big" id="existing-count">{{ existing_count }}</div>
<div class="lbl">recipes already enriched in this household</div>
</div>
<div id="empty-pane" style="display:none;">
<p>kick off an enrichment run? walks every recipe and skips any already
enriched at the current schema version.</p>
<button class="btn btn-purple" id="start-btn" type="button" onclick="startRun(false)">🪄 enrich recipes</button>
<button class="btn" type="button" onclick="startRun(true)" title="ignore enrich_version cache and re-run all">↻ force re-enrich all</button>
<p class="muted" style="margin-top:8px;">
cost: ~5s/recipe via clawdforge. ~226 recipes ≈ 20 min the first time.
after that, only newly-imported / unenriched recipes process.
</p>
</div>
<div id="progress-pane" style="display:none;">
<div class="progress-rail"><div class="progress-fill" id="bar" style="width:0%;"></div></div>
<div class="progress-meta">
<span><strong id="enriched">0</strong> enriched</span>
<span><strong id="skipped">0</strong> already-enriched</span>
<span><strong id="errors">0</strong> errors</span>
<span>of <strong id="total">?</strong></span>
<span class="muted" id="current-slug"></span>
</div>
<div class="btn-row" style="margin-top:12px;">
<button class="btn" type="button" onclick="cancelJob()">cancel</button>
</div>
</div>
<div id="done-pane" style="display:none;">
<p id="done-line"></p>
<button class="btn btn-purple" type="button" onclick="startRun(false)">↻ enrich newly-imported</button>
<button class="btn" type="button" onclick="startRun(true)">↻ force re-enrich all</button>
</div>
<div id="failed-pane" style="display:none;">
<p style="color:var(--crit);" id="failed-line"></p>
<button class="btn btn-purple" type="button" onclick="startRun(false)">↻ retry</button>
</div>
</section>
<script>
let job = {{ (latest_job | tojson) if latest_job else 'null' }};
let pollTimer = null;
function $(id){return document.getElementById(id);}
function showPane(name){
for(const p of ['empty','progress','done','failed']){
$(`${p}-pane`).style.display = (p === name) ? '' : 'none';
}
}
function setStatePill(text, klass){
const el = $('state-pill'); el.textContent = text;
el.className = 'pill ' + (klass || 'pill-mute');
}
function paint(){
if(!job) return;
const total = job.total_recipes || 0;
const done = (job.enriched_count || 0) + (job.skipped_count || 0) + (job.error_count || 0);
const pct = total>0 ? Math.round((done/total)*100) : 0;
$('bar').style.width = pct+'%';
$('enriched').textContent = job.enriched_count || 0;
$('skipped').textContent = job.skipped_count || 0;
$('errors').textContent = job.error_count || 0;
$('total').textContent = total || '?';
$('current-slug').textContent = job.current_slug ? `· ${job.current_slug}` : '';
}
async function fetchJob(){
try {
const r = await fetch('/api/recipes/enrich-status');
const d = await r.json();
job = d.job || null; route();
} catch(e){ console.error('status poll failed', e); }
}
function route(){
if(!job){ stopPoll(); setStatePill('idle','pill-mute'); $('state-ctx').textContent=''; showPane('empty'); return; }
$('state-ctx').textContent = `started ${new Date(job.started_at).toLocaleString()}`;
const s = job.state;
if(s === 'running'){ setStatePill('walking','pill-ok'); paint(); showPane('progress'); startPoll(); }
else if(s === 'done'){
setStatePill('done','pill-mute');
const e = job.enriched_count || 0;
const sk = job.skipped_count || 0;
$('done-line').textContent = `enriched ${e} recipe${e===1?'':'s'} · ${sk} already-current.`;
showPane('done'); stopPoll();
}
else if(s === 'failed'){ setStatePill('failed','pill-mute'); $('failed-line').textContent = job.last_error || 'job failed'; showPane('failed'); stopPoll(); }
else if(s === 'cancelled'){ setStatePill('cancelled','pill-mute'); $('done-line').textContent='job cancelled.'; showPane('done'); stopPoll(); }
}
function startPoll(){ if(pollTimer) return; pollTimer = setInterval(fetchJob, 2000); }
function stopPoll(){ if(pollTimer){ clearInterval(pollTimer); pollTimer=null; } }
async function startRun(force){
const btn = $('start-btn');
if(btn){ btn.disabled = true; btn.textContent = 'kicking off…'; }
try {
const r = await fetch('/api/recipes/enrich-start',{
method:'POST', headers:{'Content-Type':'application/json'},
body: JSON.stringify({force: !!force}),
});
if(!r.ok){ const j = await r.json().catch(()=>({})); throw new Error(j.error || r.status); }
await fetchJob();
} catch(e){
alert('start failed: ' + e.message);
if(btn){ btn.disabled = false; btn.textContent = '🪄 enrich recipes'; }
}
}
async function cancelJob(){
if(!job) return;
if(!confirm('cancel?')) return;
try { await fetch('/api/recipes/enrich-cancel/'+job.id,{method:'POST'}); await fetchJob(); }
catch(e){ alert('cancel failed: '+e.message); }
}
route();
if(job && job.state === 'running') startPoll();
</script>
{% endblock %}


@ -62,6 +62,9 @@
<p class="muted" style="margin-top:14px;">find duplicate recipes by name + ingredient similarity. sonnet picks the canonical to keep; you confirm per cluster before mealie deletes the others. permanent — review carefully.</p>
<p><a class="btn" href="/dedupe-recipes">🌀 dedupe recipes →</a></p>
<p class="muted" style="margin-top:14px;">have sonnet generate per-recipe metadata: cuisine, complexity, estimated time, primary protein/carb, comfort tier, summary. the plan generator reads this so "high protein week" is a real query, not just a vibe.</p>
<p><a class="btn" href="/enrich-recipes">✨ enrich recipes →</a></p>
</section>
{% endif %}