forge: dedup pass — the fix half of the audit loop

Adds `skald dedup --story <id>`: reads a story's most recent
prose-audit findings and walks every chapter, handing the author
the chapter prose plus the findings with instructions to rephrase
ONLY the flagged repetitions (each recurrence made distinct) and
fix flagged continuity errors; everything else stays verbatim.
A surgical dedup, not a rewrite. Overwrites body_md and clears
body_md_tts so the chapter is re-prepped before narration. Runs at
high effort (prose-craft). Migration 0011 adds the 'dedup' run kind.

Completes the QC loop: audit (find) -> dedup (fix) -> re-audit.
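
A minimal sketch of the per-chapter walk described above, under stated
assumptions: the `Chapter` struct, the `dedup_story` function and the
closure standing in for the model call are all hypothetical and are not
taken from this commit. Only the two column names (body_md, body_md_tts)
come from the message. The real command is presumably async and
database-backed; this version is synchronous and in-memory to keep the
shape visible.

```rust
/// Hypothetical in-memory stand-in for a stored chapter. Only the two
/// columns named in the commit message are modeled.
#[derive(Debug, Clone, PartialEq)]
struct Chapter {
    body_md: String,
    body_md_tts: Option<String>,
}

/// Walk every chapter, overwrite its prose with the deduped text, and
/// clear the narration-prepped copy so it has to be regenerated.
/// `dedup_one` is an assumed stand-in for the model call: it takes the
/// chapter prose and the story-wide findings and returns revised prose.
/// Returns the number of chapters processed, or the first error.
fn dedup_story<F>(
    chapters: &mut [Chapter],
    findings: &str,
    mut dedup_one: F,
) -> Result<usize, String>
where
    F: FnMut(&str, &str) -> Result<String, String>,
{
    let mut processed = 0;
    for chapter in chapters.iter_mut() {
        // Every chapter sees the same story-wide findings.
        let revised = dedup_one(&chapter.body_md, findings)?;
        chapter.body_md = revised;
        // The old TTS text no longer matches the prose, so drop it.
        chapter.body_md_tts = None;
        processed += 1;
    }
    Ok(processed)
}
```

In this sketch a failure stops the walk at the first failing chapter and
leaves earlier chapters already overwritten; whether the real command
behaves that way is not stated in the commit.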
Kayos 2026-05-15 14:49:08 -07:00
parent 4de484cd35
commit 2820d173e8
4 changed files with 317 additions and 0 deletions
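
The system prompt for the pass is built by filling three placeholders in
the author's scaffold, as `Forge::dedup` does in the diff below. A
standalone sketch of that substitution follows. The placeholder names
are taken from the diff; the `render_system` helper and the sample
scaffold in the usage note are hypothetical.

```rust
/// Fill the three scaffold placeholders with chained `str::replace`
/// calls, in the same order the diff uses. Hypothetical helper; the
/// commit inlines this logic inside `Forge::dedup`.
fn render_system(scaffold: &str, display_name: &str, directive: &str, soul: &str) -> String {
    scaffold
        .replace("{{display_name}}", display_name)
        .replace("{{pass_directive}}", directive)
        // `{{soul}}` is filled last, so a literal "{{soul}}" inside the
        // display name or directive would also be expanded here.
        .replace("{{soul}}", soul)
}
```

For example, `render_system("You are {{display_name}}.\n{{pass_directive}}{{soul}}", "Ada", "DEDUP.\n\n", "Plain voice.")`
yields `"You are Ada.\nDEDUP.\n\nPlain voice."`. A placeholder missing
from the scaffold is skipped silently, because `replace` with no match
returns the input unchanged.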


@ -85,6 +85,10 @@ pub enum PassKind {
    /// to end and flags repetition, template tics, self-restatement
    /// and continuity drift. The QC gate before narration.
    ProseAudit,
    /// Surgical dedup: given a chapter and the story's audit findings,
    /// rephrases only flagged repetitions and fixes flagged continuity
    /// errors; the rest stays verbatim. The fix half of the audit loop.
    Dedup,
}
impl PassKind {
@ -97,6 +101,7 @@ impl PassKind {
            Self::NarratePrep => "narrate_prep",
            Self::Rewrite => "rewrite",
            Self::ProseAudit => "prose_audit",
            Self::Dedup => "dedup",
        }
    }
}
@ -232,6 +237,43 @@ impl Forge {
        Ok(PassOutput { kind: PassKind::ProseAudit, result: r, duration_ms })
    }

    /// Surgical dedup of one chapter, the fix half of the audit
    /// loop. Receives the chapter's prose plus the whole story's
    /// audit findings, rephrases ONLY the flagged repetitions that
    /// occur in this chapter, and corrects flagged continuity errors
    /// that touch it; everything else stays verbatim. Author REQUIRED:
    /// the fresh phrasing lands in the author's voice
    /// (SystemMode::Replace). High effort: prose-craft, like rewrite.
    pub async fn dedup(
        &self,
        prose: &str,
        findings: &str,
        author: &AuthorWithRevision,
    ) -> anyhow::Result<PassOutput> {
        let scaffold = author
            .revision
            .system_template
            .as_deref()
            .unwrap_or(DEFAULT_AUTHOR_SCAFFOLD);
        let system = scaffold
            .replace("{{display_name}}", &author.author.display_name)
            .replace("{{pass_directive}}", DEDUP_DIRECTIVE)
            .replace("{{soul}}", &author.revision.soul);
        let user_prompt = dedup_user_prompt(prose, findings);
        let body = RunRequest {
            prompt: user_prompt,
            model: Some(self.model.clone()),
            system: Some(system),
            system_mode: Some(SystemMode::Replace),
            effort: Some(Effort::High),
            timeout_secs: Some(1800),
            ..Default::default()
        };
        let r = self.client.run(body).await?;
        let duration_ms = r.duration_ms;
        Ok(PassOutput { kind: PassKind::Dedup, result: r, duration_ms })
    }

    /// Annotate prose with narration control tags. The model
    /// receives the full chapter prose and returns the SAME prose
    /// with `[pause:Xs]`, `[breath]`, `[scene]` markers inserted
@ -470,6 +512,8 @@ const HOUSE_NARRATE_PREP_SYSTEM: &str = "You are a senior audiobook director ann
/// voice tags, one narrator throughout.
const HOUSE_NARRATE_PREP_SYSTEM_SINGLE: &str = "You are a senior audiobook director annotating prose for a SINGLE-narrator reading. You insert (a) beat markers — `[breath]`, `[pause:Xs]`, `[scene]` — where a skilled narrator would breathe or pause, and (b) occasional humanizing narrator stumbles using em-dash repetition or self-correction (sparingly — maybe 1-3 per chapter, on proper nouns or hard words). Do NOT add `[voice:...]` speaker tags — the whole chapter is one voice. Apart from those stumbles you do NOT change a word of the prose. Return the prose verbatim plus beat markers and (rare) stumbles inline. No preamble, no commentary.";
const DEDUP_DIRECTIVE: &str = "This is a DEDUP pass. The user prompt contains ONE chapter of a story you wrote, plus a list of audit findings — repeated phrases, motifs, similes, sentence templates and continuity errors found across the whole book. Your job: return this chapter with every flagged repetition that occurs IN IT rephrased fresh, and everything else byte-identical.\n\nHARD RULES:\n- For any motif, simile, phrase, image or structural tic the findings flag as recurring: if it appears in THIS chapter, render this chapter's occurrence in fresh, distinctive wording. Never reuse the flagged original phrasing. The other chapters' occurrences are being revised separately — do NOT try to coordinate with them; just make yours distinct from the flagged original.\n- Fix any continuity error the findings flag that touches this chapter (a wrong age, number, name, date) — use the correct value the findings identify.\n- Change NOTHING the findings do not flag. Every sentence not implicated by a finding stays EXACTLY as written, word for word. This is not a rewrite, not a polish, not an edit for taste — it is a surgical dedup. When in doubt, leave it.\n- Canon is absolute: names, dates, events, the order they happen, every fact — unchanged. The chapter stays the same length and shape.\n- Return ONLY the chapter prose. No heading unless the source had one. No preamble, no commentary, no list of what you changed.\n\n";
const REWRITE_DIRECTIVE: &str = "This is a REWRITE pass. The user prompt contains a chapter of prose written by another hand. Re-author it entirely in YOUR voice — every sentence reworked in your style: your sentence rhythm, your word choice, your paragraph shape, your way of landing a beat. This is not editing or polishing. It is re-authoring. The reader should not be able to tell another writer ever touched it.\n\nHARD CONSTRAINTS — canon is non-negotiable:\n- Every character name, every date, every place name stays exactly as written.\n- Every event, and the ORDER events happen in, stays exactly as written.\n- Every technical or historical fact stays exactly as written.\n- Do not add new scenes, characters, or events. Do not cut any scene or beat. Same story, same shape — your telling.\n\nReturn ONLY the rewritten chapter prose. Begin with the chapter heading line (`## Chapter N — title`) exactly as in the source. No preamble, no commentary about the rewrite.";
// ─── User-prompt builders ───────────────────────────────────────
@ -518,6 +562,25 @@ pub struct CharacterSpeaker {
    pub hint: Option<String>,
}

fn dedup_user_prompt(prose: &str, findings: &str) -> String {
    let mut out = String::with_capacity(prose.len() + findings.len() + 512);
    out.push_str("# Audit findings for this story\n\n");
    out.push_str(
        "These repetitions and errors were found across the whole book. \
         Fix only the ones that occur in the chapter below.\n\n",
    );
    out.push_str(findings);
    out.push_str("\n\n# Chapter to dedup\n\n");
    out.push_str(prose);
    out.push_str(
        "\n\n# Task\n\nReturn the chapter above with every flagged repetition \
         that appears in it rephrased fresh, and any flagged continuity error \
         touching it corrected. Leave every unflagged sentence verbatim. \
         Return only the chapter prose.\n",
    );
    out
}

fn rewrite_user_prompt(prose: &str) -> String {
    let mut out = String::with_capacity(prose.len() + 256);
    out.push_str("# Chapter to re-author\n\n");