cobb/skald

Cobb Hayes 346cea515d Public-flip audit: env-driven paths, scrub audit-ticket prefixes, terser README

Lucy bind paths + LAN host pins replaced with env defaults. Repository URLs
→ git.sulkta.com. Audit-changelog scaffolding stripped from inline comments
(technical reasoning preserved). README sheds marketing scaffolding. AI-speak
in load-bearing prompts/SOULs left alone — that IS the product.

2026-05-27 11:42:58 -07:00


Skald TTS engines

This subtree holds the per-engine sidecars that skald's narrate path talks to over HTTP. Each engine has the same contract:

POST /synthesize — same JSON shape across engines so skald's one Rust client (skald-core::narrate::Narrator) deserializes all of them. See engines/<name>/server.py for the per-engine implementation.
GET /healthz — boot probe + model-loaded flag.
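As an illustration of how a caller might use the boot probe, here is a minimal Python sketch that polls GET /healthz until the engine reports its model as loaded. The response field name `model_loaded`, the helper names, and the retry settings are assumptions made for this example; check engines/<name>/server.py for the actual response shape.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_healthz(base_url: str) -> dict:
    """GET <base_url>/healthz and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/healthz"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(
    base_url: str,
    fetch: Callable[[str], dict] = fetch_healthz,
    attempts: int = 30,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the health endpoint until the model-loaded flag is true.

    Returns False if the engine is still not ready after `attempts` tries.
    """
    for attempt in range(attempts):
        try:
            # "model_loaded" is a hypothetical field name for the flag.
            if fetch(base_url).get("model_loaded"):
                return True
        except (OSError, ValueError):
            # Sidecar not listening yet, or the body was not valid JSON.
            pass
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running sidecar.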

Skald routes per-request by voices.source: a kokoro_* source goes to $KOKORO_URL, a tortoise_* source goes to $TORTOISE_URL, anything else (lj_speech, generic) goes to $F5_TTS_URL.
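The routing rule above can be sketched in a few lines of Python. This is an illustration only, not skald's implementation (the real client is the Rust skald-core::narrate::Narrator). The function name `engine_url_for` and the choice to raise when a variable is unset are assumptions.

```python
import os
from typing import Mapping, Optional


def engine_url_for(source: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the engine base URL for a voices.source value (hypothetical helper)."""
    env = os.environ if env is None else env
    if source.startswith("kokoro_"):
        var = "KOKORO_URL"
    elif source.startswith("tortoise_"):
        var = "TORTOISE_URL"
    else:
        # Everything else (lj_speech, generic, ...) falls through to F5-TTS.
        var = "F5_TTS_URL"
    # Assumption: a missing variable is a configuration error, so fail loudly.
    if var not in env:
        raise KeyError(f"{var} is not set")
    return env[var]
```

For example, with all three variables set, `engine_url_for("kokoro_af_heart")` resolves to the value of $KOKORO_URL and `engine_url_for("lj_speech")` to the value of $F5_TTS_URL.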

Engines

Dir	Engine	License (code/weights)	VRAM	Speed	Voices
`f5-tts/`	SWivid F5-TTS v1	MIT / CC-BY-NC	~5GB	fast (~2x real-time on 2070S)	voice cloning (LJ Speech reference shipped)
`kokoro/`	hexgrad Kokoro-82M	Apache 2.0 / Apache 2.0	~1GB	very fast (~50x real-time)	50+ named presets (af_, am_, bf_, bm_)
`tortoise/`	neonbjb Tortoise-TTS	Apache 2.0 / Apache 2.0	~5GB	slow (~0.014x real-time, i.e. ~74 s of compute per second of audio on 2070S, standard preset)	26 named built-ins (lj, freeman, daniel, weaver, jlaw, etc.)
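To make the speed column concrete, the sketch below turns the quoted real-time factors into rough wall-clock estimates. The factors are taken from the table as-is (the F5 and Tortoise figures were measured on a 2070S), so treat the results as ballpark numbers; the helper name `synthesis_seconds` is made up for this example.

```python
# Real-time factors from the table: seconds of audio produced per
# wall-clock second of synthesis.
REALTIME_FACTOR = {
    "f5-tts": 2.0,
    "kokoro": 50.0,
    "tortoise": 1 / 74,  # ~0.014x, i.e. ~74 s of compute per second of audio
}


def synthesis_seconds(engine: str, audio_seconds: float) -> float:
    """Estimate wall-clock synthesis time for a clip of the given length."""
    return audio_seconds / REALTIME_FACTOR[engine]
```

For one minute of audio this gives roughly 30 s on F5-TTS, 1.2 s on Kokoro, and about 4440 s (74 minutes) on Tortoise.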

Branch model

main carries the vanilla version of each engine: what you'd get from a clean pip install <engine> plus the FastAPI sidecar and control-tag splitter. No engine-specific kludges. Safe to look at without context.

engine/<name> branches hold engine-tuned tweaks that don't generalise. Examples:

engine/kokoro: doubled-?? prosody hack for the 82M's weak question intonation, paragraph/scene/breath gap durations tuned for af_heart's pacing, and notes that respellings must be all-lowercase to avoid letter-by-letter spell-out by misaki.
engine/tortoise: GPU exclusivity coordinator (stops F5 + Kokoro before a Tortoise run, since the 2070 Super can't host all three at once), preset choice ergonomics, character→tortoise-voice seed assignments.

To deploy a tuned engine, check out the engine's branch in the build dir and rebuild with docker compose up -d --build:

git fetch && git checkout engine/kokoro
docker compose up -d --build

GPU coordination

On an 8GB card F5 + Kokoro can co-reside (~5GB + ~1GB = ~6GB). Adding Tortoise (~5GB) would bring the total to ~11GB, so Tortoise needs the GPU largely to itself: the engine/tortoise branch carries a script that stops kokoro + f5 before a Tortoise run and restarts them after. Replace with proper coordination once more VRAM is available.
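The budget arithmetic and the stop/restart pattern can be sketched as follows. This is not the script from the engine/tortoise branch. It is a minimal illustration that assumes the compose service names match the directory names (`f5-tts`, `kokoro`) and uses the approximate VRAM figures from the table.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterable, List, Sequence

# Approximate VRAM footprints in GB, taken from the engines table.
VRAM_GB = {"f5-tts": 5, "kokoro": 1, "tortoise": 5}


def fits(engines: Iterable[str], budget_gb: float = 8.0) -> bool:
    """Return True if the listed engines fit in the VRAM budget together."""
    return sum(VRAM_GB[e] for e in engines) <= budget_gb


def _compose(args: List[str]) -> None:
    """Run a docker compose subcommand, raising if it fails."""
    subprocess.run(["docker", "compose", *args], check=True)


@contextmanager
def gpu_exclusive(
    others: Sequence[str] = ("f5-tts", "kokoro"),  # assumed service names
    run: Callable[[List[str]], None] = _compose,
):
    """Stop the other engines for the duration of the block, then restart them."""
    run(["stop", *others])
    try:
        yield
    finally:
        # Restart even if the Tortoise run raised.
        run(["start", *others])
```

Here `fits(["f5-tts", "kokoro"])` is true and `fits(["f5-tts", "kokoro", "tortoise"])` is false on the default 8GB budget. A Tortoise run would go inside `with gpu_exclusive(): ...`. Putting the restart in `finally` means a failed run does not leave the other two engines down.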
