engine/kokoro: question doubling + kludges notes
Re-applies the Kokoro-specific hacks that main intentionally omits:

- _emphasize_questions doubles '?' to '??' so the 82M's flat
  interrogative prosody gets a rising-pitch cue
- engines/kokoro/hacks.md documents this and the other Kokoro-tuned
  bits (gap durations, lowercase-only respellings) with the
  'remove when we move to a bigger model' marker

Deploy from this branch to /srv/appdata/kokoro/build/ when you want
the tuned version. Main's vanilla Kokoro is for reference / future
cleanup.
parent 01ec9ffd0e
commit b5de9776a2
1 changed file with 12 additions and 1 deletion
@@ -113,12 +113,23 @@ def _parse_tag(match: re.Match) -> float:
     return dur / 1000.0 if unit == "ms" else dur
 
 
+# [HACK — engine/kokoro] Kokoro-82M has weak question prosody on a
+# single `?`. Doubling the question mark to `??` reliably triggers a
+# more interrogative rising-pitch contour without changing semantics.
+# Skip if already doubled or part of an interrobang. See hacks.md.
+_QUESTION_RE = re.compile(r"(?<![?!])\?(?!\?)")
+
+
+def _emphasize_questions(text: str) -> str:
+    return _QUESTION_RE.sub("??", text)
+
+
 def _expand_inline(text: str, voice: str | None) -> list[Node]:
     """Expand inline [breath]/[pause]/[scene] tags inside a chunk
     of text that already has a single voice attribution. Voice
     blocks themselves are handled one level up in split_to_nodes."""
     out: list[Node] = []
-    text = text.strip()
+    text = _emphasize_questions(text.strip())
     if not text:
         return out
     cursor = 0