cobb/skald

Kayos 9df378f799 engine/tortoise: sentence chunking + device fix + pitch/rate modulation

Catches up engines/tortoise/server.py with what's been deployed on Lucy through tonight's smoke iterations:

0.2: _chunk_for_tortoise splits text nodes at sentence boundaries (max 220 chars) before each tts_with_preset call. Fixes the end-of-prompt gibberish past tortoise's ~20s reliable horizon.

0.3: _get_voice now .to(DEVICE) cached samples + latents. Without this, non-lj voices crash with 'Expected all tensors to be on the same device, but found cpu and cuda:0'.

0.4: [voice:NAME pitch=N rate=R][/voice] tag syntax. librosa pitch_shift + time_stretch applied per-chunk for single-voice multi-character renders.

The strategy survived the design table, but the librosa phase-vocoder artifacts at ±5 semitones ate the quality on the 2070 Super. Parked here for the GPU rebuild; modulation works architecturally, just needs better stretching algorithm (rubberband) + more headroom.

Production stayed Kokoro. Coast-Down preferred_voice_id reverted to kokoro_af_heart in the live DB after this experiment.

2026-05-14 19:08:43 -07:00
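The 0.2 change is easier to picture with a small sketch. The code below is not the `_chunk_for_tortoise` from server.py, which is not shown here; it is a hypothetical `chunk_text` helper that assumes sentences end in `.`, `!`, or `?` followed by whitespace, packs whole sentences greedily up to the 220-character limit from the commit message, and falls back to word boundaries when a single sentence is longer than that.

```python
import re

MAX_CHARS = 220  # limit quoted in the commit message


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list:
    """Hypothetical stand-in for _chunk_for_tortoise: split at sentence
    boundaries so that no chunk exceeds max_chars."""
    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single over-long sentence is split on word boundaries instead.
        while len(sentence) > max_chars:
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars  # no space available: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].rstrip())
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        # Greedily pack whole sentences into the current chunk.
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be passed to its own `tts_with_preset` call, keeping every render inside the roughly 20-second window the commit message describes as reliable.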
compose.yml	engines: import f5-tts + kokoro + tortoise sidecars into the tree	2026-05-14 09:40:01 -07:00
Dockerfile	engines: import f5-tts + kokoro + tortoise sidecars into the tree	2026-05-14 09:40:01 -07:00
exclusive-gpu.sh	engine/tortoise: GPU exclusivity wrapper + kludges notes	2026-05-14 09:42:09 -07:00
hacks.md	engine/tortoise: GPU exclusivity wrapper + kludges notes	2026-05-14 09:42:09 -07:00
server.py	engine/tortoise: sentence chunking + device fix + pitch/rate modulation	2026-05-14 19:08:43 -07:00
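
The `[voice:NAME pitch=N rate=R][/voice]` tag syntax from the server.py commit can be illustrated with a small parser. This is a sketch, not the code in server.py: `Segment` and `parse_voice_tags` are hypothetical names, the spoken text is assumed to sit between the opening and closing tags, `pitch` is taken to be in semitones (the commit mentions ±5 semitones), and `rate` is assumed to be a speed factor where 1.0 means unchanged.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    text: str
    voice: Optional[str] = None  # None = default voice (assumption)
    pitch: float = 0.0           # semitones; 0 = unchanged (assumed default)
    rate: float = 1.0            # speed factor; 1 = unchanged (assumed default)


# Assumed grammar: [voice:NAME pitch=N rate=R]text[/voice]
_TAG = re.compile(
    r"\[voice:(?P<name>[^\s\]]+)(?P<attrs>[^\]]*)\](?P<body>.*?)\[/voice\]",
    re.DOTALL,
)
_ATTR = re.compile(r"(pitch|rate)=([-+]?\d+(?:\.\d+)?)")


def parse_voice_tags(text: str) -> List[Segment]:
    """Split text into plain segments and tagged voice segments."""
    segments = []
    pos = 0
    for match in _TAG.finditer(text):
        # Untagged text before this tag keeps the default voice.
        plain = text[pos:match.start()].strip()
        if plain:
            segments.append(Segment(plain))
        attrs = {key: float(value) for key, value in _ATTR.findall(match.group("attrs"))}
        body = match.group("body").strip()
        if body:
            segments.append(
                Segment(body, match.group("name"),
                        attrs.get("pitch", 0.0), attrs.get("rate", 1.0))
            )
        pos = match.end()
    # Untagged text after the last tag.
    tail = text[pos:].strip()
    if tail:
        segments.append(Segment(tail))
    return segments
```

Each segment's `pitch` and `rate` could then drive the per-chunk `librosa.effects.pitch_shift` and `librosa.effects.time_stretch` calls the commit message refers to.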