skald/skald-core/src/forge.rs
Kayos 39e991240a summarize: first real forge call — generate per-chapter summaries
skald summarize --story <uuid> walks every chapter without an
existing summary, calls Forge::summarize() (clawdforge → opus →
~250 words of plot/character/setting/threads), and inserts the
result into chapter_summaries.

Side effects:
- generation_runs row per chapter (kind='summary', status flow
  running → succeeded|failed). Errors update the row + bail; happy
  path closes it with ended_at + tokens.
- ON CONFLICT (chapter_id) means re-running with --force replaces
  the previous summary cleanly.
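
The status flow above can be sketched as a small state machine. This is a minimal illustration, not the shipped code: the `RunStatus` type and its `close` helper are hypothetical names, and only the three status strings come from this message.

```rust
/// Lifecycle of one `generation_runs` row (hypothetical type; only the
/// status strings are taken from the commit message).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunStatus {
    Running,
    Succeeded,
    Failed,
}

impl RunStatus {
    /// Value stored in the `status` column.
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Running => "running",
            Self::Succeeded => "succeeded",
            Self::Failed => "failed",
        }
    }

    /// Hypothetical helper: close a running row from the outcome of the
    /// forge call. Rows already in a terminal state are left unchanged.
    pub fn close<T, E>(self, outcome: &Result<T, E>) -> RunStatus {
        match (self, outcome) {
            (Self::Running, Ok(_)) => Self::Succeeded,
            (Self::Running, Err(_)) => Self::Failed,
            (terminal, _) => terminal,
        }
    }
}
```

Keeping terminal states sticky means a late or duplicated update cannot flip a failed run back to succeeded.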

CLI:
  skald summarize --story <uuid>           # only-missing
  skald summarize --story <uuid> --force   # re-summarize all
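
The two modes differ only in which chapters are selected. A rough sketch of that selection, assuming a trimmed-down `ChapterRow` with a numeric id (the real rows are keyed by uuid and live in the database; both names here are hypothetical):

```rust
/// Hypothetical, trimmed-down view of a chapter row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChapterRow {
    pub id: u32,
    pub has_summary: bool,
}

/// Returns the ids of the chapters one run should summarize.
/// Without `force`, only chapters that lack a summary are picked;
/// with `force`, every chapter is picked again.
pub fn chapters_to_summarize(chapters: &[ChapterRow], force: bool) -> Vec<u32> {
    chapters
        .iter()
        .filter(|c| force || !c.has_summary)
        .map(|c| c.id)
        .collect()
}
```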

Reads from env (loaded by skald.env in the container):
  CLAWDFORGE_URL    — base URL of clawdforge HTTP service
  CLAWDFORGE_TOKEN  — app-level bearer (per-app, not the admin token)
  SKALD_MODEL       — defaults to 'opus'
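
A minimal sketch of how these three variables could be resolved, including the 'opus' default. `ForgeEnv` and `forge_env_from` are hypothetical names for illustration; the real settings type is `ForgeConfig` in `crate::config`, whose loading code is not shown here.

```rust
/// Hypothetical mirror of the three settings listed above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ForgeEnv {
    pub base_url: String,
    pub app_token: String,
    pub model: String,
}

/// Builds the settings from any key lookup, so the same logic works
/// against the process environment or a test fixture.
pub fn forge_env_from<F>(lookup: F) -> Result<ForgeEnv, String>
where
    F: Fn(&str) -> Option<String>,
{
    // URL and token are required; a missing key is reported by name.
    let required = |key: &str| lookup(key).ok_or_else(|| format!("{key} is not set"));
    Ok(ForgeEnv {
        base_url: required("CLAWDFORGE_URL")?,
        app_token: required("CLAWDFORGE_TOKEN")?,
        // SKALD_MODEL is optional and falls back to 'opus'.
        model: lookup("SKALD_MODEL").unwrap_or_else(|| "opus".to_string()),
    })
}
```

Against the real environment this would be called as `forge_env_from(|k| std::env::var(k).ok())`.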

This is the first subcommand that actually exercises the forge.
Unlocks ContinuationContext::assemble's coverage metric (was stuck
at 24% on Coast-Down because the 5 placeholder summaries don't
actually carry the prose). After running summarize against
Coast-Down: coverage should jump to ~100% and the context blob
for any sequel becomes fully canon-faithful without dragging the
full ~21k words of earlier-chapter prose along.

Forge prompt template for summarize ships REAL (not stubbed) — it's
the simplest pass and has a well-defined shape. The gen/cleanup/
audit prompts remain stubs pending the deeper prose-craft session.
2026-05-13 10:42:51 -07:00

244 lines
9.4 KiB
Rust

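//! clawdforge wiring. Three generation passes per chapter (gen,
//! cleanup, audit) plus a per-chapter summary pass. The gen/cleanup/
//! audit prompt templates are TODO (v0.2 prompt-engineering sprint);
//! only the summarize template is real. This module ships the
//! plumbing so prompts can be filled in without refactoring.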
//!
//! The three passes:
//!
//! 1. **gen** — produces a new chapter draft from an assembled
//! context blob (parent prose + bible + characters + similarity-
//! matched passages, all from the database). Opus, max effort.
//!
//! 2. **cleanup** — polishes the draft for prose quality, voice
//! consistency, dialogue rhythm, pacing dead spots. Same Opus,
//! fresh eyes; sees gen pass output + same context.
//!
//! 3. **audit** — third Opus reads parent prose + sequel prose +
//! bible, returns structured findings: dropped threads, character
//! voice drift, retconned facts, timeline contradictions. Output
//! parses into rows for the `audit_findings` table.
//!
//! Every pass is logged as a `generation_runs` row before / after
//! for cost tracking, replay, and forensics.
//!
//! ## Naming context
//!
//! The Rust binding for clawdforge is the upstream `clawdforge` crate
//! (vendored at `vendor/clawdforge`). This module is the skald-side
//! glue: turn a story-id + a pass-kind into the right RunRequest +
//! parse the response into the right shape.

use std::time::Duration;

use clawdforge::{Client, ClientBuilder, RunRequest, RunResult};
use serde::{Deserialize, Serialize};

use crate::config::ForgeConfig;

/// Thin wrapper around the clawdforge `Client`. Configured once,
/// cheap to clone — each pass just calls `.run()` with a different
/// prompt.
#[derive(Clone)]
pub struct Forge {
    client: Client,
    /// The model alias we pass to clawdforge. Skald is opinionated:
    /// always opus max effort. (See `project_story_writer_container.md`.)
    /// `clawdforge` resolves the alias to the actual claude CLI flag.
    model: String,
}

/// Per-pass output. `result` is the raw response from clawdforge.
/// Callers parse it into the shape they need.
#[derive(Debug, Clone)]
pub struct PassOutput {
    pub kind: PassKind,
    pub result: RunResult,
    pub duration_ms: u64,
}

/// What a given pass over the model is for.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum PassKind {
    /// First-pass long-form draft from prompt + context.
    Gen,
    /// Polish + humanize the gen pass output.
    Cleanup,
    /// Canon audit across parent + sequel. Outputs findings JSON.
    Audit,
    /// Chapter summary for cheap context loading on long series.
    Summary,
}

impl PassKind {
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Gen => "gen",
            Self::Cleanup => "cleanup",
            Self::Audit => "audit",
            Self::Summary => "summary",
        }
    }
}

impl Forge {
    pub fn new(cfg: &ForgeConfig) -> anyhow::Result<Self> {
        let client = ClientBuilder::default()
            .base_url(&cfg.base_url)
            .token(&cfg.app_token)
            // Generation passes are slow — 600s is the clawdforge
            // server-side max anyway, and gen passes routinely hit
            // 5+ minutes on opus max-effort. Default 120s would
            // strand them.
            .timeout(Duration::from_secs(600))
            .user_agent(concat!("skald/", env!("CARGO_PKG_VERSION")))
            .build()?;
        Ok(Self {
            client,
            model: cfg.model.clone(),
        })
    }

    /// First-pass draft. `prompt` is the user-supplied story prompt;
    /// `context` is the full assembled blob (bible + characters +
    /// parent prose summaries + passages).
    ///
    /// Prompt template is TODO (v0.2). Stub builds the simplest
    /// possible request shape so the wiring compiles.
    pub async fn generate(&self, prompt: &str, context: &str) -> anyhow::Result<PassOutput> {
        let body = build_request(
            &self.model,
            PassKind::Gen,
            prompt,
            context,
            SYSTEM_GEN_TODO,
        );
        let r = self.client.run(body).await?;
        let duration_ms = r.duration_ms;
        Ok(PassOutput { kind: PassKind::Gen, result: r, duration_ms })
    }

    /// Cleanup / humanize pass over the gen draft.
    pub async fn cleanup(&self, draft: &str, context: &str) -> anyhow::Result<PassOutput> {
        let body = build_request(
            &self.model,
            PassKind::Cleanup,
            draft,
            context,
            SYSTEM_CLEANUP_TODO,
        );
        let r = self.client.run(body).await?;
        let duration_ms = r.duration_ms;
        Ok(PassOutput { kind: PassKind::Cleanup, result: r, duration_ms })
    }

    /// Canon audit comparing parent + sequel against the bible.
    /// Expected to return structured JSON parseable into
    /// `Vec<AuditFinding>`.
    pub async fn audit(
        &self,
        parent_prose: &str,
        sequel_prose: &str,
        bible: &str,
    ) -> anyhow::Result<PassOutput> {
        let body = build_audit_request(
            &self.model,
            parent_prose,
            sequel_prose,
            bible,
        );
        let r = self.client.run(body).await?;
        let duration_ms = r.duration_ms;
        Ok(PassOutput { kind: PassKind::Audit, result: r, duration_ms })
    }

    /// Summarize one chapter to ~250 words. The summary feeds into
    /// the continuation context for older chapters so the token
    /// budget stays sane on long series (book 12 doesn't carry book 1
    /// in full prose; carries summaries of books 1-10 + full prose of
    /// books 11-12).
    ///
    /// Unlike gen/cleanup/audit, summarize has a real prompt template
    /// shipped here — summarization is a simple, well-defined task
    /// and doesn't need the prose-craft TODO treatment.
    pub async fn summarize(
        &self,
        chapter_body_md: &str,
        chapter_label: &str,
    ) -> anyhow::Result<PassOutput> {
        let prompt = format!(
            "Summarize the following chapter in ~250 words for use as future \
             sequel context. Capture: (1) plot beats in order, (2) character \
             developments and emotional shifts, (3) setting changes, (4) any \
             explicit or implied unresolved threads, (5) the chapter's \
             closing position for each named character.\n\nReturn prose only \
             — no headings, no bullet lists, no commentary about the task. \
             Write as if you're handing this to another author who needs to \
             write the next chapter without re-reading this one.\n\n\
             ## {chapter_label}\n\n{chapter_body_md}"
        );
        let body = RunRequest {
            prompt,
            model: Some(self.model.clone()),
            system: Some(SYSTEM_SUMMARIZE.to_string()),
            timeout_secs: Some(300),
            ..Default::default()
        };
        let r = self.client.run(body).await?;
        let duration_ms = r.duration_ms;
        Ok(PassOutput { kind: PassKind::Summary, result: r, duration_ms })
    }
}

const SYSTEM_SUMMARIZE: &str = "You are a continuity assistant for a long-form \
    fiction author. You write chapter summaries that future authors of sequels \
    will read to understand what happened. Be specific. Names, dates, locations. \
    Don't editorialize — just compress the events.";

fn build_request(
    model: &str,
    kind: PassKind,
    primary: &str,
    context: &str,
    system: &str,
) -> RunRequest {
    let prompt = format!(
        "# Pass: {kind}\n\n## Context\n\n{context}\n\n## Input\n\n{primary}",
        kind = kind.as_str(),
    );
    RunRequest {
        prompt,
        model: Some(model.to_string()),
        system: Some(system.to_string()),
        timeout_secs: Some(600),
        ..Default::default()
    }
}

fn build_audit_request(model: &str, parent: &str, sequel: &str, bible: &str) -> RunRequest {
    let prompt = format!(
        "## Bible\n\n{bible}\n\n## Parent story prose\n\n{parent}\n\n## Sequel story prose\n\n{sequel}\n\nReturn JSON: {{ \"findings\": [ {{ \"severity\": \"info|warn|crit\", \"area\": \"character|continuity|tone|fact|timeline|other\", \"body\": \"...\" }} ] }}"
    );
    RunRequest {
        prompt,
        model: Some(model.to_string()),
        system: Some(SYSTEM_AUDIT_TODO.to_string()),
        timeout_secs: Some(600),
        ..Default::default()
    }
}

// ─── Prompt templates (TODO v0.2 — these are placeholder stubs) ───

const SYSTEM_GEN_TODO: &str = "You are a long-form fiction author. \
    Write in measured, literary prose. Honor the bible and character voices \
    exactly. (Full prompt template: TODO v0.2.)";

const SYSTEM_CLEANUP_TODO: &str = "You are a copy editor for long-form fiction. \
    Polish the draft for prose quality, tighten dialogue, fix pacing dead \
    spots, keep voice consistent. Do not add new plot. (Full prompt template: TODO v0.2.)";

const SYSTEM_AUDIT_TODO: &str = "You are a canon auditor. Compare the parent \
    and sequel against the bible. Flag contradictions, character voice drift, \
    retconned facts, dropped threads, timeline issues. Output structured \
    JSON only — no commentary. (Full prompt template: TODO v0.2.)";

/// Audit finding shape returned by the audit pass. Parses out of the
/// `result` field on the audit pass's [`RunResult`].
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditFinding {
    pub severity: String,
    pub area: String,
    pub body: String,
}

/// Wrapper shape for the audit response.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditResponse {
    pub findings: Vec<AuditFinding>,
}
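
`AuditFinding` keeps `severity` as a free-form `String`, while the audit prompt asks the model for one of `info|warn|crit`. One possible way to validate that field after parsing is sketched below. The `Severity` enum is a hypothetical addition, not part of this module, and the ordering (info < warn < crit) is an assumption made for illustration.

```rust
use std::str::FromStr;

/// Hypothetical typed form of `AuditFinding::severity`. Variant order
/// gives info < warn < crit for sorting and thresholding.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Info,
    Warn,
    Crit,
}

impl FromStr for Severity {
    type Err = String;

    /// Accepts exactly the three values named in the audit prompt;
    /// anything else is rejected so model drift is caught early.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "info" => Ok(Self::Info),
            "warn" => Ok(Self::Warn),
            "crit" => Ok(Self::Crit),
            other => Err(format!("unknown severity: {other}")),
        }
    }
}
```

With this in place, a caller could write `finding.severity.parse::<Severity>()` and decide whether an unknown value should fail the run or be logged and skipped.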