cobb/skald


Long-form story-writer with canon-keeping, sequel-continuity, and self-hosted narration. Database-is-source-of-truth — writer is the tooling.


Kayos 465c94b745 schema: voices + pronunciation_overrides + narration_runs (v0.2 prep)

TTS layer landed as schema-only; the synthesis pipeline ships in v0.2. Putting the tables in v0.1 means imports already carry the right shape; we won't need a 'migrate every existing story' pass later.

Decisions locked 2026-05-13:
- Engine: F5-TTS (best 8GB FOSS option, mid-2026 SOTA)
- Default voice source: LJ Speech (Linda Johnson, PD released specifically for TTS training; airtight for sharing/uploading generated audio. The 'AI-consent-released' license posture is the difference between 'should be fine' and 'definitely fine.')
- Variety voices: Hi-Fi TTS speaker IDs (Apache 2.0, same consent shape). LibriVox is optional but never default.
- Pronunciation overrides DB layer (story-scoped + global) to fix proper-noun mispronunciation, the actual TTS-quality gap on Cobb's bar of 'must not wake me up.' Pre-pass with Opus extracts proper nouns + IPA, operator verifies, table caches forever.

Tables:
- voices: name, license, reference_path/text, sample_rate, default flag
- pronunciation_overrides: story-scoped or global, IPA/arpabet
- narration_runs: TTS audit trail mirroring generation_runs
- stories.preferred_voice_id FK

Unique constraints:
- one default voice (partial index)
- one row per (story, word) override
- one global row per word

2026-05-13 10:07:32 -07:00
migrations	schema: voices + pronunciation_overrides + narration_runs (v0.2 prep)	2026-05-13 10:07:32 -07:00
skald	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
skald-core	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
.gitignore	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
Cargo.lock	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
Cargo.toml	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
compose.yml	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
Dockerfile	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
entrypoint.sh	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00
README.md	scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI	2026-05-13 09:04:28 -07:00

README.md

skald

Long-form story-writer with canon-keeping, sequel continuity, and (future) self-hosted audiobook narration. Database is the source of truth — the writer is the tooling.

Named for the Old Norse poets who composed and memorized kings' sagas across generations.

Status: v0.1 — scaffold

What's wired:

Rust workspace (skald-core + skald)
Postgres schema for stories, characters, canon facts, chapters, passages, generation runs, audit findings, tags
pgvector extension installed for future similarity search
skald import-markdown ingests a story file (chapters + bible) into the schema
skald serve exposes /health and runs migrations on boot
Single-container deploy: postgres + skald in one image

Not yet wired:

Web UI (the inbox + browse + queue surface)
clawdforge calls (the actual generate / cleanup / canon-audit pipeline)
Embeddings + similarity search
TTS sidecar

v0.1 smoke

docker compose -p skald up -d
docker exec skald skald import-markdown \
    --path /seed/coast-down.md \
    --title "The Coast-Down"

curl http://lucy:7780/health
# → { ok: true, db_ok: true, story_count: 1, ... }

Schema (cheat sheet)

stories         → meta + status + parent/root for series
characters      → real or fictional, story-scoped
canon_facts     → setting, mystery, theme, rule, historical_anchor, hook
chapters        → full prose body
chapter_summaries → short summaries for cheap context loading
passages        → paragraph-level + embedding vector(1536)
generation_runs → every LLM call logged
audit_findings  → canon audit output (severity + area)
tags            → arbitrary labels
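
As a rough illustration of how a chapter body might map onto paragraph-level passages rows, here is a minimal sketch in Rust. It is not taken from the skald source: the function name `split_passages` is hypothetical, and it assumes paragraphs are separated by one or more blank lines, which may not match what the importer actually does.

```rust
/// Hypothetical helper, not part of skald: split a chapter body into
/// paragraph-level passages.
///
/// Assumption: paragraphs are separated by one or more blank lines, and
/// hard-wrapped lines inside a paragraph are joined with a single space.
pub fn split_passages(body: &str) -> Vec<String> {
    let mut passages = Vec::new();
    let mut current: Vec<&str> = Vec::new();

    for line in body.lines() {
        if line.trim().is_empty() {
            // A blank line closes the paragraph being collected, if any.
            if !current.is_empty() {
                passages.push(current.join(" "));
                current.clear();
            }
        } else {
            current.push(line.trim());
        }
    }

    // Flush the final paragraph when the body does not end with a blank line.
    if !current.is_empty() {
        passages.push(current.join(" "));
    }

    passages
}
```

Each returned string would correspond to one passages row; computing and storing the embedding is a separate step that is not wired in v0.1.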

Architecture (v0.1 + the plan)

┌─────────────────────────────────┐
│  skald container                │
│  ┌───────────┐  ┌────────────┐  │
│  │ postgres  │  │ skald-rust │  │
│  │ pgvector  │←─│ axum + cli │  │
│  │ localhost │  │ :7780      │  │
│  └───────────┘  └─────┬──────┘  │
└─────────────────────────┼────────┘
                          │ HTTP (future)
                          ↓
                    ┌──────────┐
                    │clawdforge│
                    └─────┬────┘
                          ↓
                     opus calls

v1.0+: extract postgres to its own container on db-net. skald becomes a pure stateless Rust service and connects via DATABASE_URL. Migration is a connection-string change plus a network move; the binary doesn't care where the DB lives.
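
The "connection-string change" idea can be sketched as follows. This is an assumption-laden illustration rather than skald's actual startup code: the function names and the fallback DSN (user, port, and database name) are hypothetical, and only the `DATABASE_URL` variable name comes from the text above.

```rust
use std::env;

/// Hypothetical fallback for the single-container layout, where postgres
/// listens on localhost next to the binary. The user, port, and database
/// name are illustrative assumptions.
pub const DEFAULT_DATABASE_URL: &str = "postgres://skald@localhost:5432/skald";

/// Pick the connection string: an explicit, non-empty value wins,
/// otherwise fall back to the in-container default.
pub fn resolve_database_url(configured: Option<String>) -> String {
    match configured {
        Some(url) if !url.trim().is_empty() => url,
        _ => DEFAULT_DATABASE_URL.to_string(),
    }
}

/// Read DATABASE_URL from the process environment and resolve it.
pub fn database_url_from_env() -> String {
    resolve_database_url(env::var("DATABASE_URL").ok())
}
```

Under this sketch, moving postgres to its own container would only require setting `DATABASE_URL` to the new host; the rest of the binary would keep calling `database_url_from_env()` unchanged.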

License

MIT.