skald/migrations
Kayos 4a91e0738d schema: narration_findings — audio-layer audit table
Closes the TTS schema layer. The v0.2 render pipeline auto-runs an
audit chain after each chapter narration:

  F5 render → narration_runs (succeeded)
    → ffmpeg chunk into ~30s windows
    → Whisper-large-v3 STT each chunk
    → word-level diff vs source chapter text
    → mismatches → narration_findings (kind=pronunciation|skip|insert)
    → ffmpeg silence/clip detect → narration_findings (kind=glitch)
    → (optional) Gemini Flash audio review pass
      → narration_findings (kind=prosody|tone)
    → unresolved critical findings trigger an automatic re-roll with a new seed
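
The word-level diff step above could be sketched roughly as follows. This is only an illustration using Python's standard difflib, not the pipeline's actual code: the function names, the tokenizer, and the mapping of diff opcodes to finding kinds (replace → pronunciation, delete → skip, insert → insert) are all assumptions.

```python
import difflib
import re


def tokenize(text):
    # Lowercase and drop punctuation so only the words are compared.
    return re.findall(r"[a-z0-9']+", text.lower())


def diff_findings(source_text, transcript):
    """Compare source chapter text against an STT transcript, word by word.

    Returns one dict per mismatch. The kind mapping is an assumption:
    a replaced word is treated as a likely mispronunciation.
    """
    src = tokenize(source_text)
    hyp = tokenize(transcript)
    matcher = difflib.SequenceMatcher(a=src, b=hyp, autojunk=False)
    kinds = {"replace": "pronunciation", "delete": "skip", "insert": "insert"}
    findings = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        findings.append({
            "kind": kinds[tag],
            "expected": " ".join(src[i1:i2]),  # words from the chapter text
            "heard": " ".join(hyp[j1:j2]),     # words from the transcript
        })
    return findings
```

Each returned dict would correspond to one candidate narration_findings row; how severity is assigned is not covered here.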

Distinct from audit_findings: that table holds canon/continuity findings
at the text layer, populated by the third-Opus canon-audit pass.
narration_findings is audio-quality only, populated by detectors
that consume the rendered WAV.
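
As one example of a detector that consumes the rendered WAV, here is a rough sketch of turning ffmpeg silencedetect log output into glitch findings. It assumes the log comes from a run such as `ffmpeg -i chapter.wav -af silencedetect=noise=-40dB:d=0.5 -f null -`; the function name, the 2-second cutoff, and the output field names are hypothetical, and the commit does not say which ffmpeg filters the pipeline actually uses.

```python
import re

# silencedetect logs lines like:
#   [silencedetect @ 0x...] silence_start: 10.25
#   [silencedetect @ 0x...] silence_end: 13.75 | silence_duration: 3.5
SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(
    r"silence_end:\s*(-?\d+(?:\.\d+)?)\s*\|\s*silence_duration:\s*(-?\d+(?:\.\d+)?)"
)


def parse_silences(log_text, max_pause_s=2.0):
    """Return a glitch finding for every silence longer than max_pause_s.

    max_pause_s is an illustrative threshold, not a value from the pipeline.
    """
    findings = []
    start = None
    for line in log_text.splitlines():
        m = SILENCE_START.search(line)
        if m:
            start = float(m.group(1))
            continue
        m = SILENCE_END.search(line)
        if m and start is not None:
            duration = float(m.group(2))
            if duration > max_pause_s:
                findings.append({
                    "kind": "glitch",
                    "start_s": start,
                    "end_s": float(m.group(1)),
                    "duration_s": duration,
                })
            start = None
    return findings
```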

The 'detector' field records which model or tool produced the finding
so we can tune thresholds per detector when one over- or under-flags.
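
A minimal sketch of how per-detector tuning might be informed, assuming each finding carries a 'detector' name plus a hypothetical boolean 'dismissed' flag set when a reviewer rejects the finding. Neither the flag nor the function name comes from the schema itself.

```python
from collections import defaultdict


def dismissal_rate_by_detector(findings):
    """Fraction of findings per detector that a reviewer dismissed.

    A rate near 1.0 suggests the detector over-flags and its threshold
    could be loosened. 'dismissed' is an assumed field for illustration.
    """
    totals = defaultdict(int)
    dismissed = defaultdict(int)
    for finding in findings:
        totals[finding["detector"]] += 1
        if finding["dismissed"]:
            dismissed[finding["detector"]] += 1
    return {name: dismissed[name] / totals[name] for name in totals}
```

Under-flagging cannot be seen from dismissals alone; it would need a separate sample of narrations checked by ear.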

cobb's audio-agent intuition was right: STT-and-diff reliably catches
the 'name came out wrong' case, and a separate audio-native LLM call
catches the subtler 'this sentence sounded weird' cases that Whisper
can't see.
2026-05-13 10:10:04 -07:00
0001_init.sql                      scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI   2026-05-13 09:04:28 -07:00
0002_voices_and_pronunciation.sql  schema: voices + pronunciation_overrides + narration_runs (v0.2 prep)             2026-05-13 10:07:32 -07:00
0003_narration_findings.sql        schema: narration_findings — audio-layer audit table                              2026-05-13 10:10:04 -07:00