skald

cobb/skald

Commit graph

Author

SHA1

Message

Date

Kayos

465c94b745

schema: voices + pronunciation_overrides + narration_runs (v0.2 prep)

TTS layer landed as schema-only — synthesis pipeline ships in v0.2.
Putting the tables in v0.1 means imports already carry the right
shape; we won't need a 'migrate every existing story' pass later.

Decisions locked 2026-05-13:
- Engine: F5-TTS (best 8GB FOSS option, mid-2026 SOTA)
- Default voice source: LJ Speech (Linda Johnson, PD released
  specifically for TTS training — airtight for sharing/uploading
  generated audio. The 'AI-consent-released' license posture is
  the difference between 'should be fine' and 'definitely fine.')
- Variety voices: Hi-Fi TTS speaker IDs (Apache 2.0, same consent
  shape). LibriVox is optional but never default.
- Pronunciation overrides DB layer (story-scoped + global) to fix
  proper-noun mispronunciation — the actual TTS-quality gap on
  Cobb's bar of 'must not wake me up.' Pre-pass with Opus extracts
  proper nouns + IPA, operator verifies, table caches forever.
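
A minimal in-memory sketch of how a story-scoped plus global override lookup could behave. The commit only states that both scopes exist; the precedence shown here (story-scoped row wins, global row is the fallback), the exact-match comparison, and the names `Overrides`, `upsert`, and `resolve` are assumptions for illustration, not the schema's actual access layer.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the pronunciation_overrides table (hypothetical).
/// Key: (Some(story_id) for a story-scoped row, None for a global row, word).
/// Value: the IPA string an operator has verified.
#[derive(Default)]
pub struct Overrides {
    rows: HashMap<(Option<i64>, String), String>,
}

impl Overrides {
    /// Insert or replace a row. Keying on (scope, word) mirrors the
    /// "one row per (story, word)" and "one global row per word" constraints.
    pub fn upsert(&mut self, story_id: Option<i64>, word: &str, ipa: &str) {
        self.rows.insert((story_id, word.to_string()), ipa.to_string());
    }

    /// Assumed precedence: the story-scoped row first, then the global row.
    pub fn resolve(&self, story_id: i64, word: &str) -> Option<&str> {
        let key = word.to_string();
        self.rows
            .get(&(Some(story_id), key.clone()))
            .or_else(|| self.rows.get(&(None, key)))
            .map(String::as_str)
    }
}
```

With this shape, a word that has only a global row resolves the same way in every story, and a story can shadow it by adding its own row.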

Tables:
- voices — name, license, reference_path/text, sample_rate, default flag
- pronunciation_overrides — story-scoped or global, IPA/arpabet
- narration_runs — TTS audit trail mirroring generation_runs
- stories.preferred_voice_id FK

Unique constraints:
- one default voice (partial index)
- one row per (story, word) override
- one global row per word
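
As a rough illustration of the "one default voice" rule, the sketch below models the effect a partial unique index would have: any number of non-default rows, at most one default row. The `Voice` fields are a subset of the columns listed above, and `is_default`, `VoiceTable`, and the error text are hypothetical names, not the migration's.

```rust
/// Subset of the voices columns; `is_default` is an assumed name for the default flag.
#[derive(Debug, Clone, PartialEq)]
pub struct Voice {
    pub name: String,
    pub license: String,
    pub is_default: bool,
}

/// In-memory stand-in for the voices table (hypothetical).
#[derive(Default)]
pub struct VoiceTable {
    rows: Vec<Voice>,
}

impl VoiceTable {
    /// Rejects a second default voice, which is the behavior a partial
    /// unique index over the default rows would enforce in the database.
    pub fn insert(&mut self, voice: Voice) -> Result<(), String> {
        if voice.is_default && self.rows.iter().any(|v| v.is_default) {
            return Err(format!(
                "a default voice already exists; cannot add '{}' as default",
                voice.name
            ));
        }
        self.rows.push(voice);
        Ok(())
    }

    /// Returns the single default voice, if one has been inserted.
    pub fn default_voice(&self) -> Option<&Voice> {
        self.rows.iter().find(|v| v.is_default)
    }
}
```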

2026-05-13 10:07:32 -07:00

Kayos

f575ad3722

scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI

Skald is a generic story-writer. The database is the product; the
binary is the tooling. Everything story-specific lives in rows, not
in code. cwho's monorepo + binary-per-role pattern transplanted to
this domain.

What this commit ships:
- Cargo workspace (resolver=3, edition 2024): skald-core (lib) +
  skald (bin)
- Migration 0001: stories, characters, canon_facts, chapters,
  chapter_summaries, passages (vector(1536)), generation_runs,
  audit_findings, tags. pgvector + pg_trgm extensions. ivfflat
  index deferred until we have data (post-import the first ~1k
  passages and add the index).
- skald-core::ingest — markdown parser for the cwho/coast-down shape:
  '# Title' → '## Chapter N — date' headings → '# Continuity Bible'
  section with character roster (real + fictional sub-sections) +
  setting / mystery / historical / liberty / hook sub-sections.
  Decomposed into structured rows; original bullet body preserved
  in key_facts/body fields for fidelity. 6 unit tests cover the
  shape.
- skald-core::db — Postgres connection pool + migration runner.
- skald-core::models — row types via sqlx::FromRow.
- skald binary — clap CLI: 'serve' (http + migrations) and
  'import-markdown' (one-shot ingest).
- Dockerfile — multi-stage: rust:1.95-bookworm builder, pgvector/
  pgvector:pg17 runtime, tini under PID 1, custom entrypoint.sh
  that boots embedded postgres then execs skald serve.
- compose.yml — singleton container, postgres data in volume,
  story corpus mounted read-only at /seed.
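
The heading shape the ingest bullet describes can be sketched as a small line classifier. This is not the code in skald-core::ingest; the `Heading` enum and `classify` function are hypothetical, and the sketch assumes the chapter number is a plain integer and that the separator before the date is an em dash surrounded by spaces, as in the heading shape quoted above.

```rust
/// The three heading kinds named in the ingest description (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Heading {
    Title(String),
    Chapter { number: u32, date: String },
    ContinuityBible,
}

/// Classify one markdown line; returns None for non-heading lines and for
/// level-2 headings that do not match the assumed chapter shape.
pub fn classify(line: &str) -> Option<Heading> {
    let line = line.trim();
    if let Some(rest) = line.strip_prefix("## ") {
        let rest = rest.strip_prefix("Chapter ")?;
        // "\u{2014}" is the em dash used between chapter number and date.
        let (num, date) = rest.split_once(" \u{2014} ")?;
        let number = num.trim().parse::<u32>().ok()?;
        return Some(Heading::Chapter {
            number,
            date: date.trim().to_string(),
        });
    }
    if let Some(rest) = line.strip_prefix("# ") {
        let rest = rest.trim();
        if rest == "Continuity Bible" {
            return Some(Heading::ContinuityBible);
        }
        return Some(Heading::Title(rest.to_string()));
    }
    None
}
```

A real parser would also need to track which section it is in, since the sub-sections under the Continuity Bible heading are decomposed into different row types.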

Decisions locked 2026-05-13:
1. DB in same container 'till we have a real working tool' (cobb)
2. postgres+pgvector (NOT sqlite) — keeps semantic-search story
3. Network-not-socket connection (postgresql://localhost:5432) from
   day one so future split is config-only, not code-rewrite
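
Decision 3 can be illustrated with a small helper that always resolves the database location from a URL, so moving Postgres out of the container later means changing one value. The `DATABASE_URL` variable name and both function names are assumptions; the commit only gives the default address.

```rust
/// Default from the commit message: Postgres reached over the network
/// on localhost, not over a Unix socket.
pub const DEFAULT_DATABASE_URL: &str = "postgresql://localhost:5432";

/// Pick the configured URL if it is present and non-blank, else the default.
pub fn resolve_database_url(configured: Option<&str>) -> String {
    match configured.map(str::trim) {
        Some(url) if !url.is_empty() => url.to_string(),
        _ => DEFAULT_DATABASE_URL.to_string(),
    }
}

/// Read the URL from the environment. `DATABASE_URL` is an assumed name.
pub fn database_url_from_env() -> String {
    resolve_database_url(std::env::var("DATABASE_URL").ok().as_deref())
}
```

Because the caller only ever sees a URL, pointing it at a separate database host would not require touching the connection code.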

Not yet wired:
- Web UI
- clawdforge calls (gen → cleanup → canon-audit pipeline)
- Embedding pass
- TTS sidecar

2026-05-13 09:04:28 -07:00

2 commits