Commit graph

4 commits

Author SHA1 Message Date
c9bd38034c multi-voice: per-character dialogue rendering
Schema: characters.voice_id + characters.slug (migration 0007).
voice_id is FK to voices(id); slug is the stable lowercase token
the narrate_prep pass uses inside [voice:slug]...[/voice].

Forge::narrate_prep takes &[CharacterSpeaker]. System prompt
expanded to instruct the author to wrap dialogue lines in voice
tags based on a roster supplied in the user prompt (slug + name +
short hint from key_facts). Unattributed dialogue stays unwrapped
and inherits the narrator voice.

skald narrate substitutes [voice:<character-slug>] →
[voice:<kokoro-voice-name>] right before sending to Kokoro, using
characters.voice_id JOIN voices.reference_path as the map. Slugs
with no voice or no character row fall back to the narrator voice
defensively (logged as warn).

kokoro_server.py v0.4: splitter recognises [voice:X]...[/voice]
blocks at the paragraph level. Each text node carries an optional
voice attribution; renderer feeds it to Kokoro per-segment. Outside
voice blocks the request's default voice is used. voices_used is
reported back so callers can verify multi-voice actually ran.

Only kokoro-routed renders pre-process voice tags; F5 paths leave
the tags in place (F5 multi-voice not implemented). Defensive
fallback: orphan/unclosed [/voice] markers are silently absorbed
rather than failing the render.
2026-05-14 08:35:33 -07:00
75a609d507 web: chapter audio player + render button
Chapter view now shows a narration card between title and prose
with three states:
  - succeeded → HTML5 <audio> + voice + duration + download link
  - running   → 'rendering…' banner with relative start time
  - none/failed → 'Render audio' POST button (spawns background
                  tokio task calling narrate::run)

ServeDir mounted at /audio serves WAVs from the f5-tts bind-mount
read-only. Range requests work, so 16-min chapters seek cleanly.

Deploy needs: compose mount /mnt/cache/appdata/f5-tts/audio:/audio:ro
on skald (already staged in /mnt/cache/appdata/skald/compose.yml on
Lucy).
2026-05-13 17:08:43 -07:00
f71b533e52 v0.2 scaffold: vendor clawdforge SDK + forge module + Whisper plan
The Rust SDK already existed at Sulkta-Coop/clawdforge clients/rust/ — async,
reqwest-based, bearer-auth, exposes Client::run() + Session for multi-turn.
Vendoring it into vendor/clawdforge so skald is self-contained: no
git-submodule + no needing the clawdforge repo cloned next to skald.
Trade-off accepted: updates require manual re-copy until both sides
stabilize and we publish to a private cargo registry.

What landed:

- vendor/clawdforge/ — full SDK source from Sulkta-Coop/clawdforge HEAD.
  Pinned in skald-core/Cargo.toml as a path dep.
- skald-core/src/forge.rs — three-pass orchestration shell. Forge wraps
  clawdforge::Client; generate() / cleanup() / audit() each build a
  RunRequest with the right system prompt + model alias (always opus),
  call client.run(), return a PassOutput.
  Prompt templates are TODO stubs (SYSTEM_GEN_TODO etc) — filling in the
  actual prose-craft prompts is its own deep session.
- skald-core/src/config.rs — ForgeConfig { base_url, app_token, model }.
  Resolved by the binary from env (CLAWDFORGE_URL + CLAWDFORGE_TOKEN);
  lib stays env-agnostic.
- skald-core::AuditFinding + AuditResponse — parse shape for what the
  third-Opus canon audit returns, ready to map onto audit_findings rows.
- docs/tts-pipeline.md — full plan for v0.2 narration + post-TTS audit
  chain. Whisper-large-v3 STT does text-to-text verification on every
  render; an optional Gemini Flash audio pass catches subjective issues
  (prosody, tone) Whisper can't see. Reroll loop on crit findings.

What's still stubbed:

- Prompt templates in forge.rs (gen / cleanup / audit) — placeholders
  that describe the role but don't constrain output shape yet.
- context.rs (assemble the LLM context blob from DB rows) — entire module
  TBD.
- No CLI subcommand yet for invoking forge — that comes after context.rs.

Naming note: in Rust 2024 'gen' is a reserved keyword (for generators),
so the method is Forge::generate(), not Forge::gen().
2026-05-13 10:18:56 -07:00
f575ad3722 scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI
Skald is a generic story-writer. The database is the product; the
binary is the tooling. Everything story-specific lives in rows, not
in code. cwho's monorepo + binary-per-role pattern transplanted to
this domain.

What this commit ships:
- Cargo workspace (resolver=3, edition 2024): skald-core (lib) +
  skald (bin)
- Migration 0001: stories, characters, canon_facts, chapters,
  chapter_summaries, passages (vector(1536)), generation_runs,
  audit_findings, tags. pgvector + pg_trgm extensions. ivfflat
  index deferred until we have data (post-import the first ~1k
  passages and add the index).
- skald-core::ingest — markdown parser for the cwho/coast-down shape:
  '# Title' → '## Chapter N — date' headings → '# Continuity Bible'
  section with character roster (real + fictional sub-sections) +
  setting / mystery / historical / liberty / hook sub-sections.
  Decomposed into structured rows; original bullet body preserved
  in key_facts/body fields for fidelity. 6 unit tests cover the
  shape.
- skald-core::db — Postgres connection pool + migration runner.
- skald-core::models — row types via sqlx::FromRow.
- skald binary — clap CLI: 'serve' (http + migrations) and
  'import-markdown' (one-shot ingest).
- Dockerfile — multi-stage: rust:1.95-bookworm builder, pgvector/
  pgvector:pg17 runtime, tini under PID 1, custom entrypoint.sh
  that boots embedded postgres then execs skald serve.
- compose.yml — singleton container, postgres data in volume,
  story corpus mounted read-only at /seed.

Decisions locked 2026-05-13:
1. DB in same container 'till we have a real working tool' (cobb)
2. postgres+pgvector (NOT sqlite) — keeps semantic-search story
3. Network-not-socket connection (postgresql://localhost:5432) from
   day one so future split is config-only, not code-rewrite

Not yet wired:
- Web UI
- clawdforge calls (gen → cleanup → canon-audit pipeline)
- Embedding pass
- TTS sidecar
2026-05-13 09:04:28 -07:00