skald/engines/tortoise/exclusive-gpu.sh
Kayos 7a96031aa6 engine/tortoise: GPU exclusivity wrapper + kludges notes
Adds the Tortoise-specific tooling that main intentionally omits:

- engines/tortoise/exclusive-gpu.sh wraps any command, stops F5 +
  Kokoro on the GPU, restarts Tortoise to clear stale CUDA contexts,
  waits for healthz, runs the command, restarts the engines on EXIT
  trap. Solves the 8GB OOM that took down the first smoke.

- engines/tortoise/hacks.md captures the speed reality (~74x real-
  time slowdown on the 2070 Super at standard preset) and the
  pronunciation-overrides cross-engine compatibility note.

Deploy from this branch when you want Tortoise's tuning. Main's
vanilla Tortoise is for the cross-engine reference + future
'we have more VRAM now' cleanup.
2026-05-14 09:42:09 -07:00

47 lines
1.4 KiB
Bash
Executable file

#!/bin/bash
# Tortoise GPU exclusivity wrapper. The 2070 Super (8GB) can't host
# F5 (~4.5GB) + Kokoro (~2.7GB) + Tortoise (~5GB peak) simultaneously,
# so we stop the other two engines for the duration of a Tortoise run
# and restart them after.
#
# Usage:
# exclusive-gpu.sh <command...>
#
# Example:
# exclusive-gpu.sh docker exec skald skald narrate --chapter <uuid>
#
# Exits with the wrapped command's status. Restarts the engines
# regardless of success/failure (trap on EXIT).
set -euo pipefail
STOP_ENGINES=(f5-tts kokoro)
cleanup() {
local rc=$?
echo "[exclusive-gpu] restarting engines"
for engine in "${STOP_ENGINES[@]}"; do
docker start "$engine" >/dev/null 2>&1 || \
echo "[exclusive-gpu] failed to restart $engine — investigate"
done
return "$rc"
}
trap cleanup EXIT
echo "[exclusive-gpu] stopping engines: ${STOP_ENGINES[*]}"
for engine in "${STOP_ENGINES[@]}"; do
docker stop "$engine" >/dev/null 2>&1 || true
done
# Restart Tortoise to clean up any cached GPU allocations from the
# now-stopped engines (their CUDA contexts can linger briefly).
docker restart tortoise >/dev/null
echo "[exclusive-gpu] waiting for tortoise healthz..."
for i in {1..30}; do
if curl -sf http://192.168.0.5:7795/healthz | grep -q '"loaded":true'; then
break
fi
sleep 2
done
echo "[exclusive-gpu] running: $*"
"$@"