skald/engines/kokoro/Dockerfile
Kayos d1631ddffe engines: import f5-tts + kokoro + tortoise sidecars into the tree
The Python FastAPI sidecars have lived ad-hoc at /mnt/cache/appdata/
<engine>/build/ on Lucy without version control. Bringing them into
the skald repo so the engine code travels with the cross-engine
routing it depends on.

This commit lands the VANILLA version of each engine on main:

  engines/f5-tts/    SWivid F5-TTS (CC-BY-NC weights flagged)
  engines/kokoro/    hexgrad Kokoro-82M (Apache 2.0 top to bottom)
  engines/tortoise/  neonbjb Tortoise-TTS (Apache 2.0 top to bottom)

Engine-specific kludges (question doubling, GPU coordination,
pause-duration tuning) get layered on engine/* branches per the
README. Main stays the safe-to-read baseline.
2026-05-14 09:40:01 -07:00


# Sulkta build of Kokoro-82M TTS.
#
# License: Apache 2.0 (code AND model weights). Clean stack — no
# CC-BY-NC asterisk like F5-TTS's Emilia weights. This is the
# narrator engine for sleep-quality audiobook reads; F5-TTS stays
# around for voice-cloning cases.
#
# Kokoro is small enough to run on CPU, but we use the CUDA base
# image anyway to stay consistent with f5-tts, and so it picks up
# the GPU when no other tenant has it.
FROM pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime
ENV DEBIAN_FRONTEND=noninteractive \
    PYTHONUNBUFFERED=1 \
    HF_HOME=/cache/hf \
    HF_HUB_DISABLE_TELEMETRY=1
RUN apt-get update && apt-get install -y --no-install-recommends \
        ffmpeg \
        espeak-ng \
        ca-certificates \
        curl \
    && rm -rf /var/lib/apt/lists/*
# kokoro pulls phonemizer + soundfile + espeakng transitively.
RUN pip install --no-cache-dir \
        'kokoro>=0.9.0' \
        'fastapi>=0.115.0' \
        'uvicorn>=0.32.0' \
        'soundfile>=0.13.0'
RUN mkdir -p /cache/hf /audio
COPY kokoro_server.py /app/kokoro_server.py
WORKDIR /app
EXPOSE 7860
CMD ["uvicorn", "kokoro_server:app", "--host", "0.0.0.0", "--port", "7860"]
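
# Example build and run, as a rough sketch only. The image tag
# (skald-kokoro) and the host paths are assumptions for illustration
# and do not come from the repo. The build context is assumed to be
# engines/kokoro/, since kokoro_server.py is copied from there.
# Mounting /cache/hf would keep downloaded model weights across
# container restarts; drop --gpus all to run on CPU.
#
#   docker build -t skald-kokoro engines/kokoro
#   docker run --rm --gpus all -p 7860:7860 \
#       -v /path/to/hf-cache:/cache/hf \
#       -v /path/to/audio:/audio \
#       skald-kokoro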