skald/README.md
Kayos f575ad3722 scaffold v0.1: postgres+pgvector inside-container, schema, markdown ingest, CLI
Skald is a generic story-writer. The database is the product; the
binary is the tooling. Everything story-specific lives in rows, not
in code. cwho's monorepo + binary-per-role pattern transplanted to
this domain.

What this commit ships:
- Cargo workspace (resolver=3, edition 2024): skald-core (lib) +
  skald (bin)
- Migration 0001: stories, characters, canon_facts, chapters,
  chapter_summaries, passages (vector(1536)), generation_runs,
  audit_findings, tags. pgvector + pg_trgm extensions. ivfflat
  index deferred until we have data (post-import the first ~1k
  passages and add the index).
- skald-core::ingest — markdown parser for the cwho/coast-down shape:
  '# Title' → '## Chapter N — date' headings → '# Continuity Bible'
  section with character roster (real + fictional sub-sections) +
  setting / mystery / historical / liberty / hook sub-sections.
  Decomposed into structured rows; original bullet body preserved
  in key_facts/body fields for fidelity. 6 unit tests cover the
  shape.
- skald-core::db — Postgres connection pool + migration runner.
- skald-core::models — row types via sqlx::FromRow.
- skald binary — clap CLI: 'serve' (http + migrations) and
  'import-markdown' (one-shot ingest).
- Dockerfile — multi-stage: rust:1.95-bookworm builder, pgvector/
  pgvector:pg17 runtime, tini under PID 1, custom entrypoint.sh
  that boots embedded postgres then execs skald serve.
- compose.yml — singleton container, postgres data in volume,
  story corpus mounted read-only at /seed.

Decisions locked 2026-05-13:
1. DB in same container 'till we have a real working tool' (cobb)
2. postgres+pgvector (NOT sqlite) — keeps semantic-search story
3. Network-not-socket connection (postgresql://localhost:5432) from
   day one so future split is config-only, not code-rewrite

Not yet wired:
- Web UI
- clawdforge calls (gen → cleanup → canon-audit pipeline)
- Embedding pass
- TTS sidecar
2026-05-13 09:04:28 -07:00

84 lines
2.8 KiB
Markdown

# skald
Long-form story-writer with canon-keeping, sequel continuity, and
(future) self-hosted audiobook narration. Database is the source of
truth — the writer is the tooling.
Named for the Old Norse poets who composed and memorized kings'
sagas across generations.
## Status: v0.1 — scaffold
What's wired:
- Rust workspace (`skald-core` + `skald`)
- Postgres schema for stories, characters, canon facts, chapters,
passages, generation runs, audit findings, tags
- pgvector extension installed for future similarity search
- `skald import-markdown` ingests a story file (chapters + bible)
into the schema
- `skald serve` exposes `/health` and runs migrations on boot
- Single-container deploy: postgres + skald in one image
Not yet wired:
- Web UI (the inbox + browse + queue surface)
- clawdforge calls (the actual generate / cleanup / canon-audit
pipeline)
- Embeddings + similarity search
- TTS sidecar
## v0.1 smoke
```bash
docker compose -p skald up -d
docker exec skald skald import-markdown \
--path /seed/coast-down.md \
--title "The Coast-Down"
curl http://lucy:7780/health
# → { ok: true, db_ok: true, story_count: 1, ... }
```
## Schema (cheat sheet)
```
stories → meta + status + parent/root for series
characters → real or fictional, story-scoped
canon_facts → setting, mystery, theme, rule, historical_anchor, hook
chapters → full prose body
chapter_summaries → short summaries for cheap context loading
passages → paragraph-level + embedding vector(1536)
generation_runs → every LLM call logged
audit_findings → canon audit output (severity + area)
tags → arbitrary labels
```
## Architecture (v0.1 + the plan)
```
┌─────────────────────────────────┐
│ skald container │
│ ┌───────────┐ ┌────────────┐ │
│ │ postgres │ │ skald-rust │ │
│ │ pgvector │←─│ axum + cli │ │
│ │ localhost │ │ :7780 │ │
│ └───────────┘ └─────┬──────┘ │
└─────────────────────────┼────────┘
│ HTTP (future)
┌──────────┐
│clawdforge│
└─────┬────┘
opus calls
```
v1.0+: extract postgres to its own container on db-net. skald
becomes pure stateless rust, connects via `DATABASE_URL`. Migration
is a connection-string change + a network move; the binary doesn't
care where the DB lives.
## License
MIT.