skald

7 commits 3 branches 0 tags 304 KiB

Author	SHA1	Message	Date
Kayos	39e991240a	summarize: first real forge call — generate per-chapter summaries skald summarize --story <uuid> walks every chapter without an existing summary, calls Forge::summarize() (clawdforge → opus → ~250 words of plot/character/setting/threads), and inserts the result into chapter_summaries. Side effects: - generation_runs row per chapter (kind='summary', status flow running → succeeded|failed). Errors update the row + bail; happy path closes it with ended_at + tokens. - ON CONFLICT (chapter_id) means re-running with --force replaces the previous summary cleanly. CLI: skald summarize --story <uuid> # only-missing skald summarize --story <uuid> --force # re-summarize all Reads from env (loaded by skald.env in the container): CLAWDFORGE_URL — base URL of clawdforge HTTP service CLAWDFORGE_TOKEN — app-level bearer (per-app, not the admin token) SKALD_MODEL — defaults to 'opus' This is the first subcommand that actually exercises the forge. Unlocks ContinuationContext::assemble's coverage metric (was stuck at 24% on Coast-Down because the 5 placeholder summaries don't actually carry the prose). After running summarize against Coast-Down: coverage should jump to ~100% and the context blob for any sequel becomes fully canon-faithful without dragging the full ~21k words of earlier-chapter prose along. Forge prompt template for summarize ships REAL (not stubbed) — it's the simplest pass and has a well-defined shape. The gen/cleanup/audit prompts remain stubs pending the deeper prose-craft session.	2026-05-13 10:42:51 -07:00
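The generation_runs status flow described in 39e991240a (running → succeeded|failed) could be modelled roughly as below. This is a minimal sketch, not the repository's code: `RunRow`, `RunStatus`, `start_summary`, and `finish` are hypothetical names, timestamps are plain integers, and leaving `ended_at` unset on failure is an assumption, since the commit message only says the happy path sets it.

```rust
/// Hypothetical in-memory mirror of a generation_runs row.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunStatus {
    Running,
    Succeeded,
    Failed,
}

impl RunStatus {
    /// Text form as it might be stored in the status column (assumed spelling).
    pub fn as_str(self) -> &'static str {
        match self {
            RunStatus::Running => "running",
            RunStatus::Succeeded => "succeeded",
            RunStatus::Failed => "failed",
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunRow {
    pub kind: &'static str,
    pub status: RunStatus,
    pub ended_at: Option<u64>,
    pub tokens: Option<u64>,
    pub error: Option<String>,
}

impl RunRow {
    /// A new row for one chapter's summary pass starts out as running.
    pub fn start_summary() -> Self {
        RunRow {
            kind: "summary",
            status: RunStatus::Running,
            ended_at: None,
            tokens: None,
            error: None,
        }
    }

    /// Close the row exactly once: Ok(tokens) marks success, Err(msg) marks failure.
    pub fn finish(&mut self, outcome: Result<u64, String>, now: u64) -> Result<(), String> {
        if self.status != RunStatus::Running {
            return Err("run already closed".to_string());
        }
        match outcome {
            Ok(tokens) => {
                self.status = RunStatus::Succeeded;
                self.ended_at = Some(now);
                self.tokens = Some(tokens);
            }
            Err(msg) => {
                // Assumption: only the error and status are recorded on failure.
                self.status = RunStatus::Failed;
                self.error = Some(msg);
            }
        }
        Ok(())
    }
}
```

Rejecting a second `finish` call keeps a closed row from flipping between terminal states, which is one way to keep the audit trail trustworthy.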
Kayos	b32938ef43	dockerfile: copy vendor/ during cache layer (path-dep needs full crate)	2026-05-13 10:30:58 -07:00
Kayos	5b418369c0	context: assemble DB→opus blob + skald show-context CLI skald-core::context is the bridge between 'rows in postgres' and 'prompt-ready markdown blob.' ContinuationContext::assemble(pool, parent_story_id, recent_n) pulls: - parent story meta (title, series, total word count) - characters split real / fictional - canon_facts grouped by category - chapter summaries for everything older than the recent window - FULL prose for the last recent_n chapters render_markdown() formats it with the most-condensed data first (characters, canon) and the richest detail last (recent chapter prose). Opus reads it linearly so by the time it's writing the new chapter, the previous chapter's prose is freshest in its context window. The 'continuation reads ≥85% of parent' rule lands here via parent_coverage() which counts recent prose + summaries-as-proxy (250 words / summary) against parent word_count. The web UI / CLI can warn before firing a gen pass if coverage is below threshold. New CLI subcommand: skald show-context --story <uuid> --recent <N> Assembles + prints the blob to stdout (eprintln'd stats summary goes to stderr). No LLM call — pre-flight inspection so we see what would be sent before paying for it. Useful for prompt-eng work in the next session. Module structure now: skald-core/ config.rs ForgeConfig context.rs ContinuationContext (new) db.rs connect_and_migrate forge.rs Forge — three-pass orchestration ingest.rs markdown parser models.rs row types lib.rs MIGRATOR + module exports skald/ main.rs clap CLI serve.rs axum + /health + migrations import.rs skald import-markdown show_context.rs skald show-context (new)	2026-05-13 10:30:16 -07:00
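One literal reading of the coverage rule in 5b418369c0 (recent prose plus 250 proxy words per summary, measured against the parent word count, with an 85% threshold) can be sketched as follows. The struct and function signatures here are hypothetical, the real parent_coverage() works on database rows, and capping the ratio at 1.0 is an assumption made for this illustration.

```rust
/// Hypothetical plain-data inputs; the real method reads these from Postgres.
pub struct CoverageInputs {
    /// Total word count of the parent story.
    pub parent_word_count: u64,
    /// Words of full prose included for the most recent chapters.
    pub recent_prose_words: u64,
    /// Number of older chapters represented by a summary.
    pub summary_count: u64,
}

/// Each summary stands in for this many words (figure from the commit message).
pub const SUMMARY_PROXY_WORDS: u64 = 250;

/// Threshold from the "continuation reads >= 85% of parent" rule.
pub const COVERAGE_THRESHOLD: f64 = 0.85;

/// Fraction of the parent story that the context blob accounts for.
pub fn parent_coverage(inputs: &CoverageInputs) -> f64 {
    if inputs.parent_word_count == 0 {
        return 0.0;
    }
    let covered = inputs.recent_prose_words + inputs.summary_count * SUMMARY_PROXY_WORDS;
    // Assumption: cap at 1.0 so short stories cannot report more than full coverage.
    (covered as f64 / inputs.parent_word_count as f64).min(1.0)
}

/// True when a caller should warn before firing a generation pass.
pub fn should_warn(inputs: &CoverageInputs) -> bool {
    parent_coverage(inputs) < COVERAGE_THRESHOLD
}
```

A CLI or web UI could call `should_warn` during pre-flight and print the ratio to stderr alongside the other stats.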
Kayos	f71b533e52	v0.2 scaffold: vendor clawdforge SDK + forge module + Whisper plan The Rust SDK already existed at Sulkta-Coop/clawdforge clients/rust/ — async, reqwest-based, bearer-auth, exposes Client::run() + Session for multi-turn. Vendoring it into vendor/clawdforge so skald is self-contained: no git-submodule + no needing the clawdforge repo cloned next to skald. Trade-off accepted: updates require manual re-copy until both sides stabilize and we publish to a private cargo registry. What landed: - vendor/clawdforge/ — full SDK source from Sulkta-Coop/clawdforge HEAD. Pinned in skald-core/Cargo.toml as a path dep. - skald-core/src/forge.rs — three-pass orchestration shell. Forge wraps clawdforge::Client; generate() / cleanup() / audit() each build a RunRequest with the right system prompt + model alias (always opus), call client.run(), return a PassOutput. Prompt templates are TODO stubs (SYSTEM_GEN_TODO etc) — filling in the actual prose-craft prompts is its own deep session. - skald-core/src/config.rs — ForgeConfig { base_url, app_token, model }. Resolved by the binary from env (CLAWDFORGE_URL + CLAWDFORGE_TOKEN); lib stays env-agnostic. - skald-core::AuditFinding + AuditResponse — parse shape for what the third-Opus canon audit returns, ready to map onto audit_findings rows. - docs/tts-pipeline.md — full plan for v0.2 narration + post-TTS audit chain. Whisper-large-v3 STT does text-to-text verification on every render; an optional Gemini Flash audio pass catches subjective issues (prosody, tone) Whisper can't see. Reroll loop on crit findings. What's still stubbed: - Prompt templates in forge.rs (gen / cleanup / audit) — placeholders that describe the role but don't constrain output shape yet. - context.rs (assemble the LLM context blob from DB rows) — entire module TBD. - No CLI subcommand yet for invoking forge — that comes after context.rs. 
Naming note: in Rust 2024 'gen' is a reserved keyword (for generators), so the method is Forge::generate(), not Forge::gen().	2026-05-13 10:18:56 -07:00
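The split described in f71b533e52, where the binary resolves ForgeConfig from the environment while the library stays env-agnostic, might look like the sketch below. Only the three field names and the variable names CLAWDFORGE_URL and CLAWDFORGE_TOKEN come from the commit message; the `resolve_forge_config` function, its lookup-closure parameter, the error strings, and the SKALD_MODEL fallback to "opus" are assumptions for illustration.

```rust
/// Field names follow the commit message; everything else is illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ForgeConfig {
    pub base_url: String,
    pub app_token: String,
    pub model: String,
}

/// Build a ForgeConfig from any key lookup, so the library never touches
/// the process environment itself (hypothetical helper).
pub fn resolve_forge_config<F>(lookup: F) -> Result<ForgeConfig, String>
where
    F: Fn(&str) -> Option<String>,
{
    // A required key must be present and non-blank.
    let required = |key: &str| {
        lookup(key)
            .filter(|value| !value.trim().is_empty())
            .ok_or_else(|| format!("{key} is not set"))
    };
    Ok(ForgeConfig {
        base_url: required("CLAWDFORGE_URL")?,
        app_token: required("CLAWDFORGE_TOKEN")?,
        // Assumption: the model alias is optional and falls back to "opus".
        model: lookup("SKALD_MODEL").unwrap_or_else(|| "opus".to_string()),
    })
}
```

A binary would pass `|key| std::env::var(key).ok()` as the lookup, while tests can pass a closure over fixed values and avoid mutating process-wide state.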
Kayos	4a91e0738d	schema: narration_findings — audio-layer audit table Closes the TTS schema layer. The v0.2 render pipeline auto-runs an audit chain after each chapter narration: F5 render → narration_runs (succeeded) → ffmpeg chunk into ~30s windows → Whisper-large-v3 STT each chunk → word-level diff vs source chapter text → mismatches → narration_findings (kind=pronunciation|skip|insert) → ffmpeg silence/clip detect → narration_findings (kind=glitch) → (optional) Gemini Flash audio review pass → narration_findings (kind=prosody|tone) → unresolved crits trigger automatic re-roll with new seed Distinct from audit_findings: that table is canon/continuity at the text layer, populated by the third-Opus canon-audit pass. narration_findings is audio-quality only, populated by detectors that consume the rendered WAV. The 'detector' field captures which model produced the finding so we can tune thresholds per detector when one over- or under-flags. cobb's audio agent intuition was right: STT-and-diff catches the 'name came out wrong' case airtight, and a separate audio-native LLM call catches the subtler 'this sentence sounded weird' cases Whisper can't see.	2026-05-13 10:10:04 -07:00
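The word-level diff step in 4a91e0738d (STT transcript compared against the source chapter text, with mismatches classified as pronunciation, skip, or insert) could be approximated with a word-based edit-distance alignment. This is a sketch under stated assumptions, not the planned detector: the `Finding` enum, `normalize`, and `diff_words` are hypothetical names, and treating every substitution as a pronunciation finding is a simplification chosen for this example.

```rust
/// Hypothetical finding kinds, named after the narration_findings kinds.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Finding {
    /// The transcript has a different word where the source word was expected.
    Pronunciation { expected: String, heard: String },
    /// A source word is missing from the transcript.
    Skip { expected: String },
    /// The transcript contains a word that is not in the source.
    Insert { heard: String },
}

/// Lowercase and strip surrounding punctuation so "Home." matches "home".
fn normalize(text: &str) -> Vec<String> {
    text.split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .filter(|w| !w.is_empty())
        .collect()
}

/// Align source and transcript by word-level edit distance, then walk the
/// table backwards to report each mismatch in reading order.
pub fn diff_words(source: &str, transcript: &str) -> Vec<Finding> {
    let a = normalize(source);
    let b = normalize(transcript);
    let (n, m) = (a.len(), b.len());

    // dist[i][j] = edit distance between a[..i] and b[..j].
    let mut dist = vec![vec![0usize; m + 1]; n + 1];
    for i in 0..=n {
        dist[i][0] = i;
    }
    for j in 0..=m {
        dist[0][j] = j;
    }
    for i in 1..=n {
        for j in 1..=m {
            let sub = dist[i - 1][j - 1] + usize::from(a[i - 1] != b[j - 1]);
            dist[i][j] = sub.min(dist[i - 1][j] + 1).min(dist[i][j - 1] + 1);
        }
    }

    let mut findings = Vec::new();
    let (mut i, mut j) = (n, m);
    while i > 0 || j > 0 {
        if i > 0 && j > 0 && dist[i][j] == dist[i - 1][j - 1] + usize::from(a[i - 1] != b[j - 1]) {
            if a[i - 1] != b[j - 1] {
                findings.push(Finding::Pronunciation {
                    expected: a[i - 1].clone(),
                    heard: b[j - 1].clone(),
                });
            }
            i -= 1;
            j -= 1;
        } else if i > 0 && dist[i][j] == dist[i - 1][j] + 1 {
            findings.push(Finding::Skip { expected: a[i - 1].clone() });
            i -= 1;
        } else {
            findings.push(Finding::Insert { heard: b[j - 1].clone() });
            j -= 1;
        }
    }
    findings.reverse();
    findings
}
```

The quadratic table is acceptable for the ~30s windows mentioned in the commit, since each chunk holds only a few dozen words.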
Kayos	465c94b745	schema: voices + pronunciation_overrides + narration_runs (v0.2 prep) TTS layer landed as schema-only — synthesis pipeline ships in v0.2. Putting the tables in v0.1 means imports already carry the right shape; we won't need a 'migrate every existing story' pass later. Decisions locked 2026-05-13: - Engine: F5-TTS (best 8GB FOSS option, mid-2026 SOTA) - Default voice source: LJ Speech (Linda Johnson, PD released specifically for TTS training — airtight for sharing/uploading generated audio. The 'AI-consent-released' license posture is the difference between 'should be fine' and 'definitely fine.') - Variety voices: Hi-Fi TTS speaker IDs (Apache 2.0, same consent shape). LibriVox is optional but never default. - Pronunciation overrides DB layer (story-scoped + global) to fix proper-noun mispronunciation — the actual TTS-quality gap on Cobb's bar of 'must not wake me up.' Pre-pass with Opus extracts proper nouns + IPA, operator verifies, table caches forever. Tables: - voices — name, license, reference_path/text, sample_rate, default flag - pronunciation_overrides — story-scoped or global, IPA/arpabet - narration_runs — TTS audit trail mirroring generation_runs - stories.preferred_voice_id FK Unique constraints: - one default voice (partial index) - one row per (story, word) override - one global row per word	2026-05-13 10:07:32 -07:00
Kayos	f575ad3722	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI Skald is a generic story-writer. The database is the product; the binary is the tooling. Everything story-specific lives in rows, not in code. cwho's monorepo + binary-per-role pattern transplanted to this domain. What this commit ships: - Cargo workspace (resolver=3, edition 2024): skald-core (lib) + skald (bin) - Migration 0001: stories, characters, canon_facts, chapters, chapter_summaries, passages (vector(1536)), generation_runs, audit_findings, tags. pgvector + pg_trgm extensions. ivfflat index deferred until we have data (post-import the first ~1k passages and add the index). - skald-core::ingest — markdown parser for the cwho/coast-down shape: '# Title' → '## Chapter N — date' headings → '# Continuity Bible' section with character roster (real + fictional sub-sections) + setting / mystery / historical / liberty / hook sub-sections. Decomposed into structured rows; original bullet body preserved in key_facts/body fields for fidelity. 6 unit tests cover the shape. - skald-core::db — Postgres connection pool + migration runner. - skald-core::models — row types via sqlx::FromRow. - skald binary — clap CLI: 'serve' (http + migrations) and 'import-markdown' (one-shot ingest). - Dockerfile — multi-stage: rust:1.95-bookworm builder, pgvector/pgvector:pg17 runtime, tini under PID 1, custom entrypoint.sh that boots embedded postgres then execs skald serve. - compose.yml — singleton container, postgres data in volume, story corpus mounted read-only at /seed. Decisions locked 2026-05-13: 1. DB in same container 'till we have a real working tool' (cobb) 2. postgres+pgvector (NOT sqlite) — keeps semantic-search story 3. Network-not-socket connection (postgresql://localhost:5432) from day one so future split is config-only, not code-rewrite Not yet wired: - Web UI - clawdforge calls (gen → cleanup → canon-audit pipeline) - Embedding pass - TTS sidecar	2026-05-13 09:04:28 -07:00
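The chapter-heading part of the markdown shape described in f575ad3722 (a '## Chapter N' heading followed by a dash and a date, with chapters ending at '# Continuity Bible') might be recognised as sketched below. This is not the skald-core::ingest parser: `ChapterHeading`, `parse_chapter_heading`, and `chapter_headings` are hypothetical names, the date is kept as an unparsed string because its format is not stated, and the separator is assumed to be U+2014 as it appears in the commit message.

```rust
/// Hypothetical parsed form of one chapter heading.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChapterHeading {
    pub number: u32,
    /// Left as raw text; the date format is not specified in the source.
    pub date: String,
}

/// Separator between chapter number and date (U+2014, assumed from the commit text).
const SEPARATOR: char = '\u{2014}';

/// Parse a single line; returns None for anything that is not a chapter heading.
pub fn parse_chapter_heading(line: &str) -> Option<ChapterHeading> {
    let rest = line.trim().strip_prefix("## Chapter ")?;
    let (number, date) = rest.split_once(SEPARATOR)?;
    Some(ChapterHeading {
        number: number.trim().parse().ok()?,
        date: date.trim().to_string(),
    })
}

/// Collect chapter headings, stopping where the Continuity Bible section begins.
pub fn chapter_headings(doc: &str) -> Vec<ChapterHeading> {
    doc.lines()
        .take_while(|line| line.trim() != "# Continuity Bible")
        .filter_map(parse_chapter_heading)
        .collect()
}
```

Returning `Option` keeps the scan tolerant of ordinary prose lines and other headings, so a caller can run it over a whole file without pre-filtering.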