Catches up engines/tortoise/server.py with what's been deployed on
Lucy through tonight's smoke iterations:
0.2 — _chunk_for_tortoise splits text nodes at sentence boundaries
(max 220 chars) before each tts_with_preset call. Fixes the
end-of-prompt gibberish past Tortoise's ~20s reliable horizon.
0.3 — _get_voice now .to(DEVICE) cached samples + latents. Without
this, non-lj voices crash with 'Expected all tensors to be on
the same device, but found cpu and cuda:0'.
0.4 — [voice:NAME pitch=N rate=R][/voice] tag syntax. librosa
pitch_shift + time_stretch applied per-chunk for single-voice
multi-character renders. The strategy survived the design
table, but the librosa phase-vocoder artifacts at ±5 semitones
ate the quality on the 2070 Super. Parked here for the GPU
rebuild; modulation works architecturally, it just needs a better
stretching algorithm (rubberband) + more headroom.
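
A rough sketch of how the 0.2 chunking and the 0.4 tag parsing could fit
together. This is not the code in server.py: the names parse_chunks,
split_sentences and Chunk are hypothetical, and the tag grammar is assumed
to be [voice:NAME pitch=N rate=R]text[/voice] with pitch and rate optional.
The per-chunk pitch and rate values would then be passed to
librosa.effects.pitch_shift and librosa.effects.time_stretch after each
render.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

MAX_CHARS = 220  # per-chunk ceiling quoted in the 0.2 entry

# Assumed grammar: [voice:NAME pitch=N rate=R]text[/voice]
VOICE_TAG = re.compile(
    r"\[voice:(?P<name>[\w-]+)(?P<attrs>[^\]]*)\](?P<body>.*?)\[/voice\]",
    re.DOTALL,
)
ATTR = re.compile(r"(pitch|rate)=([+-]?\d+(?:\.\d+)?)")
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


@dataclass
class Chunk:
    voice: str | None  # None means "use the default voice"
    pitch: float       # semitones, 0.0 = unchanged
    rate: float        # 1.0 = unchanged
    text: str


def split_sentences(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    chunks: list[str] = []
    current = ""
    for sentence in SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single over-long sentence falls back to a split on word gaps.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            cut = cut if cut > 0 else max_chars
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        current = sentence
    if current:
        chunks.append(current)
    return chunks


def parse_chunks(text: str) -> list[Chunk]:
    """Split tagged text into per-voice chunks; untagged text gets voice=None."""
    out: list[Chunk] = []
    pos = 0
    for m in VOICE_TAG.finditer(text):
        for piece in split_sentences(text[pos:m.start()]):
            out.append(Chunk(None, 0.0, 1.0, piece))
        attrs = {k: float(v) for k, v in ATTR.findall(m.group("attrs"))}
        for piece in split_sentences(m.group("body")):
            out.append(Chunk(m.group("name"), attrs.get("pitch", 0.0),
                             attrs.get("rate", 1.0), piece))
        pos = m.end()
    for piece in split_sentences(text[pos:]):
        out.append(Chunk(None, 0.0, 1.0, piece))
    return out
```

Packing whole sentences greedily keeps each render under the character
ceiling while avoiding cuts mid-sentence; the word-gap fallback only kicks
in for a single sentence longer than the ceiling.
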
Production stayed on Kokoro. The Coast-Down preferred_voice_id was
reverted to kokoro_af_heart in the live DB after this experiment.
Adds the Tortoise-specific tooling that main intentionally omits:
- engines/tortoise/exclusive-gpu.sh wraps any command: it stops F5 +
Kokoro on the GPU, restarts Tortoise to clear stale CUDA contexts,
waits for healthz, runs the command, then restarts the engines from
an EXIT trap. Solves the 8GB OOM that took down the first smoke.
- engines/tortoise/hacks.md captures the speed reality (~74x slower
than real time on the 2070 Super at the standard preset) and the
pronunciation-overrides cross-engine compatibility note.
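
For readers who think in Python rather than shell, the control flow that
exclusive-gpu.sh is described as implementing can be sketched as a context
manager. This is an illustration only, not a port of the script: every
callable here (stop_others, restart_target, probe, start_others) is a
hypothetical stand-in for whatever commands the script actually runs, and
the timeout values are made up.

```python
from __future__ import annotations

import time
from contextlib import contextmanager
from typing import Callable, Iterator


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 120.0,   # assumed value, not taken from the script
    interval_s: float = 2.0,    # assumed value, not taken from the script
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll probe() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while not probe():
        if time.monotonic() >= deadline:
            raise TimeoutError("engine never reported healthy")
        sleep(interval_s)


@contextmanager
def exclusive_gpu(
    stop_others: Callable[[], None],
    restart_target: Callable[[], None],
    probe: Callable[[], bool],
    start_others: Callable[[], None],
    **wait_kwargs,
) -> Iterator[None]:
    """Give one engine the GPU for the duration of the with-block."""
    try:
        stop_others()        # free VRAM held by the other engines
        restart_target()     # fresh process, so no stale CUDA contexts
        wait_until_healthy(probe, **wait_kwargs)
        yield                # the wrapped command runs here
    finally:
        start_others()       # plays the role of the shell EXIT trap
```

The try/finally does the same job as the EXIT trap: the other engines come
back whether the wrapped command succeeds, fails, or the health wait times
out.
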
Deploy from this branch when you want Tortoise's tuning. Main's
vanilla Tortoise is for the cross-engine reference + future
'we have more VRAM now' cleanup.
The Python FastAPI sidecars have lived ad hoc at
/mnt/cache/appdata/<engine>/build/ on Lucy without version control.
This brings them into the skald repo so the engine code travels with
the cross-engine routing it depends on.
This commit lands the VANILLA version of each engine on main:
  engines/f5-tts/    SWivid F5-TTS         (CC-BY-NC weights flagged)
  engines/kokoro/    hexgrad Kokoro-82M    (Apache 2.0 top to bottom)
  engines/tortoise/  neonbjb Tortoise-TTS  (Apache 2.0 top to bottom)
Engine-specific kludges (question doubling, GPU coordination,
pause-duration tuning) get layered on engine/* branches per the
README. Main stays the safe-to-read baseline.