Kayos 7a96031aa6 engine/tortoise: GPU exclusivity wrapper + kludges notes	2026-05-14 09:42:09 -07:00

Adds the Tortoise-specific tooling that main intentionally omits:

- engines/tortoise/exclusive-gpu.sh wraps any command, stops F5 + Kokoro on the GPU, restarts Tortoise to clear stale CUDA contexts, waits for healthz, runs the command, restarts the engines on EXIT trap. Solves the 8GB OOM that took down the first smoke.
- engines/tortoise/hacks.md captures the speed reality (~74x real-time slowdown on the 2070 Super at standard preset) and the pronunciation-overrides cross-engine compatibility note.

Deploy from this branch when you want Tortoise's tuning. Main's vanilla Tortoise is for the cross-engine reference + future 'we have more VRAM now' cleanup.
f5-tts	engines: import f5-tts + kokoro + tortoise sidecars into the tree	2026-05-14 09:40:01 -07:00
kokoro	engines: import f5-tts + kokoro + tortoise sidecars into the tree	2026-05-14 09:40:01 -07:00
tortoise	engine/tortoise: GPU exclusivity wrapper + kludges notes	2026-05-14 09:42:09 -07:00
README.md	engines: import f5-tts + kokoro + tortoise sidecars into the tree	2026-05-14 09:40:01 -07:00

README.md

Skald TTS engines

This subtree holds the per-engine sidecars that skald's narrate path talks to over HTTP. Each engine has the same contract:

POST /synthesize — same JSON shape across engines so skald's one Rust client (skald-core::narrate::Narrator) deserializes all of them. See engines/<name>/server.py for the per-engine implementation.
GET /healthz — boot probe + model-loaded flag.
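
As a rough illustration of how a caller might use the GET /healthz probe, the sketch below polls until an engine reports its model as loaded. The JSON field name model_loaded, the function names, and the timeout values are assumptions made for illustration; the real response shape is whatever engines/<name>/server.py returns.

```python
import json
import time
import urllib.request


def fetch_healthz(base_url: str) -> dict:
    """GET <base_url>/healthz and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/healthz", timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(probe, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Call probe() until it reports a loaded model or the timeout expires.

    probe is any zero-argument callable returning the decoded /healthz body.
    "model_loaded" is a hypothetical field name, not confirmed by the source.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe().get("model_loaded"):
                return True
        except OSError:
            pass  # engine still booting: connection refused, timeout, etc.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller could pass something like lambda: fetch_healthz(url) as the probe, with url taken from $KOKORO_URL, $TORTOISE_URL, or $F5_TTS_URL. Keeping the probe injectable lets the retry logic be exercised without a running engine.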

Skald routes per-request by voices.source: a kokoro_* source goes to $KOKORO_URL, a tortoise_* source goes to $TORTOISE_URL, anything else (lj_speech, generic) goes to $F5_TTS_URL.
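
The routing rule above can be sketched in a few lines. This is a Python illustration only, not skald's Rust implementation; the function name engine_url_for is hypothetical, and it assumes that kokoro_* and tortoise_* mean a plain string prefix match on voices.source.

```python
import os
from typing import Mapping


def engine_url_for(source: str, env: Mapping[str, str] = os.environ) -> str:
    """Pick the engine base URL for a given voices.source value."""
    if source.startswith("kokoro_"):
        return env["KOKORO_URL"]
    if source.startswith("tortoise_"):
        return env["TORTOISE_URL"]
    # Everything else (lj_speech, generic, ...) falls through to F5-TTS.
    return env["F5_TTS_URL"]
```

Passing the environment as a mapping keeps the function easy to test; in normal use the default os.environ applies.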

Engines

Dir	Engine	License (code/weights)	VRAM	Speed	Voices
`f5-tts/`	SWivid F5-TTS v1	MIT / CC-BY-NC	~5GB	fast (~2x real-time on 2070S)	voice cloning (LJ Speech reference shipped)
`kokoro/`	hexgrad Kokoro-82M	Apache 2.0 / Apache 2.0	~1GB	very fast (~50x real-time)	50+ named presets (af_, am_, bf_, bm_)
`tortoise/`	neonbjb Tortoise-TTS	Apache 2.0 / Apache 2.0	~5GB	slow (~0.014x real-time, i.e. ~74 s of compute per second of audio on 2070S, standard preset)	26 named built-ins (lj, freeman, daniel, weaver, jlaw, etc.)
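
To make the Speed column concrete, the sketch below turns the approximate real-time factors from the table into wall-clock estimates. The figures are the table's rough 2070 Super numbers, not fresh measurements, and the names REAL_TIME_FACTOR and synthesis_seconds are made up for this example.

```python
# Seconds of audio produced per second of compute, taken from the table above.
REAL_TIME_FACTOR = {
    "f5-tts": 2.0,        # ~2x real-time
    "kokoro": 50.0,       # ~50x real-time
    "tortoise": 1 / 74,   # ~74 s of compute per second of audio
}


def synthesis_seconds(engine: str, audio_seconds: float) -> float:
    """Estimate wall-clock synthesis time for a clip of the given length."""
    return audio_seconds / REAL_TIME_FACTOR[engine]
```

By these figures, one minute of narration would take roughly 30 s on F5-TTS, a little over 1 s on Kokoro, and about 74 minutes on Tortoise at the standard preset.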

Branch model

main carries the vanilla version of each engine: what you'd get from a clean pip install <engine> plus the FastAPI sidecar and control-tag splitter. No engine-specific kludges. Safe to look at without context.

engine/<name> branches hold engine-tuned tweaks that don't generalise. Examples:

engine/kokoro: doubled-?? prosody hack for the 82M's weak question intonation, paragraph/scene/breath gap durations tuned for af_heart's pacing, notes on how respellings need to be all-lowercase to avoid letter-by-letter spell-out by misaki.
engine/tortoise: GPU exclusivity coordinator (stops F5 + Kokoro before a Tortoise run since the 2070 Super can't host all three at once), preset choice ergonomics, character→tortoise-voice seed assignments.

When deploying an engine to Lucy, the build dir at /mnt/cache/appdata/<engine>/build/ tracks the engine's branch:

cd /mnt/cache/appdata/kokoro/build
git fetch && git checkout engine/kokoro
docker compose -p <name> up -d --build

GPU coordination (2070 Super)

The 8GB card is the bottleneck. F5 + Kokoro can co-reside (~5GB + ~1GB, about 6GB in total). Adding Tortoise's ~5GB would bring the total to roughly 11GB, so Tortoise needs the GPU largely to itself: the engine/tortoise branch will carry the script that stops kokoro + f5 before a tortoise run and restarts them after. Replace it with proper coordination once we have more VRAM.
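
The stop, run, restart pattern can be sketched as a context manager. This is a Python illustration of the idea, not the shell script on the engine/tortoise branch; the function name exclusive_gpu and all of its callable parameters are hypothetical stand-ins for whatever actually stops, starts, and probes the containers.

```python
from contextlib import contextmanager
from typing import Callable, Iterable


@contextmanager
def exclusive_gpu(
    others: Iterable[str],
    stop: Callable[[str], None],
    start: Callable[[str], None],
    restart_tortoise: Callable[[], None],
    wait_ready: Callable[[], bool],
):
    """Give Tortoise the GPU for the duration of the with-block."""
    stopped = []
    try:
        for name in others:
            stop(name)            # free VRAM held by the co-resident engines
            stopped.append(name)
        restart_tortoise()        # start from a clean CUDA context
        if not wait_ready():      # e.g. poll GET /healthz
            raise RuntimeError("tortoise did not become healthy")
        yield
    finally:
        # Runs on success and on failure, like a shell EXIT trap.
        for name in stopped:
            start(name)
```

The try/finally ensures that any engine that was stopped is started again even if the Tortoise run fails, which is the property that matters most on a shared card.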