Audit fix sprint — CRIT-1, CRIT-2, HIGH-1..7, MED-1,2,7, LOW-1

Full Opus-max adversarial audit found 2 CRIT, 7 HIGH, 9 MED, 9 LOW.
This commit lands all CRIT, all 7 HIGH, the highest-impact MEDs, and
the addon-manifest LOW. The remaining MED/LOW items are cosmetic or
defensive polish — captured in the audit report but not blocking.

CRIT-1: _attach_sponsorblock had a duplicate block stacked on top of
itself. After a video ended the second monitor instantiated, blocked
30s waiting for a player that wasn't there, logged 'timed out' and
returned. Net effect on the family-room TV: Kodi froze for 30s after
every single video. Deleted the duplicate (lines 512–530).

CRIT-2: MPD HTTP server bound to 0.0.0.0 + emitted
Access-Control-Allow-Origin: *. The MPD embeds signed googlevideo
segment URLs — anything on the LAN (guest phones, IoT, malicious sites
loaded in any browser) could scrape the manifest and grab those URLs.
Bind to the LAN IP only with a 127.0.0.1 fallback if that fails; drop
the gratuitous CORS header. Setting is off by default so the live
exposure was small, but locked down before M7 ever flips it on.

HIGH-1: _extract_id accepted arbitrary 'v=' query values without
validating the 11-char [A-Za-z0-9_-] shape. New _validate_id helper +
strict netloc-suffix matching (was: 'youtube.com' in netloc, which let
'myyoutube.com.evil.example' through). Every branch now pipes through
_validate_id. Same shape enforced in the sidecar via
validate_youtube_id() at every op entry point — defense in depth.

HIGH-2: _call_sidecar parser now picks the LAST non-empty stdout line
(robust against future stray println!/log lines), with a dedicated
JSONDecodeError catch that returns a clear 'sidecar stdout was not
JSON: <repr>' rather than a confusing 'sidecar exited 0' message.

HIGH-3: _pv_youtube_installed() now logs the probe exception before
returning False, so silent degradation to 360p is observable in
kodi.log instead of mysterious.

HIGH-4: SponsorBlockMonitor.run()'s getTime() catch widened from
(RuntimeError, OSError) to bare Exception — historically Kodi has
thrown other types when the player goes stale mid-poll, and we don't
want one of those to escape past _play's finally block and leak the
MPD HTTP server.

HIGH-5: Wrapped _attach_sponsorblock in try/except at both call sites
(pv.youtube delegate path + progressive fallback) so a SB monitor bug
can't pop a 'Plugin error' dialog on the TV after successful playback.

HIGH-6: Sidecar Search/ChannelVideos/Playlist now clamp 'limit' at
MAX_LIMIT=200. Prevents an unbounded limit=u32::MAX from OOMing Kodi
on a malicious or buggy addon request.

HIGH-7: Sidecar Rip op now requires dest_dir to be a prefix of an
allowlist (/storage/.kodi/temp/ or
/storage/.kodi/userdata/addon_data/plugin.video.torttube/). Op is
dormant today (no Python caller), but the protocol was a wide-open
arbitrary-write primitive.

MED-1: Bumped search / channel / playlist sidecar timeouts from 15s
to 25s — first-search-after-Kodi-boot on a slow LAN routinely hit 8s+
of rustypipe TLS-handshake + Innertube initial parse.

MED-2: _resolved_listitem now checks urlparse(url).path.endswith('.mpd'
/'.m3u8') instead of substring scan of the whole URL — yt-dlp URLs
have base64 blobs in the query string that can accidentally contain
those substrings.

MED-7: Error classifiers (classify_yt_dlp_error, classify_rustypipe_error)
now match on word-level patterns ('private video', 'age-restrict',
'region-restrict' etc.) instead of bare substrings — fixes
'private network' in a TLS error misclassifying as PrivateVideo,
'package' triggering AgeRestricted, etc.

LOW-1: addon.xml's plugin.video.youtube dep marked optional='true' —
our _pv_youtube_installed() check and fallback paths assumed
optionality; the manifest now matches.

Addon v0.0.13. Verified live via two browse-only smokes:
  - Sidecar bad-id rejection ('../../etc/passwd'):
    {ok:false, error:'invalid youtube id (length 16 != 11)', kind:'bad_request'}
  - Files.GetDirectory ?q=cat: 12 results, formatted labels intact.

NOT VERIFIED VIA PLAYBACK (Leia was watching the TV).

Remaining MED/LOW items (mostly cosmetic): MED-3 endOfDirectory
cacheToDisc, MED-4 0-is-falsy in _format_duration/_format_views, MED-5
codec regex single-quote, MED-6 Response::Ok ok-clobber, MED-8 SB
response-size cap, MED-9 thumbnail dict type check, LOW-2..LOW-9
defensive polish. Tracked in audit report; can hit in a follow-up
sprint.
This commit is contained in:
Kayos 2026-05-23 12:28:35 -07:00
parent ffc7332910
commit 83bc6dfa03
4 changed files with 193 additions and 93 deletions

View file

@ -1,11 +1,11 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<addon id="plugin.video.torttube" <addon id="plugin.video.torttube"
name="torttube" name="torttube"
version="0.0.12" version="0.0.13"
provider-name="Sulkta-Coop"> provider-name="Sulkta-Coop">
<requires> <requires>
<import addon="xbmc.python" version="3.0.0"/> <import addon="xbmc.python" version="3.0.0"/>
<import addon="plugin.video.youtube" version="7.0.0"/> <import addon="plugin.video.youtube" version="7.0.0" optional="true"/>
<import addon="inputstream.adaptive" version="2.0.0" optional="true"/> <import addon="inputstream.adaptive" version="2.0.0" optional="true"/>
</requires> </requires>
<extension point="xbmc.python.pluginsource" library="main.py"> <extension point="xbmc.python.pluginsource" library="main.py">

View file

@ -47,25 +47,40 @@ def _log(msg, level=xbmc.LOGINFO):
xbmc.log(f"[torttube] {msg}", level) xbmc.log(f"[torttube] {msg}", level)
_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")
def _validate_id(candidate: str | None) -> str:
"""Return the candidate iff it matches YouTube's 11-char id shape; raise otherwise.
Every branch of _extract_id MUST pipe its return value through this so a
bogus `v=` query, an empty youtu.be path, or a malformed share link doesn't
leak into _call_sidecar -> yt-dlp / rustypipe / log lines."""
if candidate and _ID_RE.fullmatch(candidate):
return candidate
raise ValueError(f"not a valid YouTube id: {candidate!r}")
def _extract_id(url_or_id: str) -> str: def _extract_id(url_or_id: str) -> str:
"""Accept either a bare ID or any common YouTube URL form, return the ID.""" """Accept either a bare ID or any common YouTube URL form, return the ID.
Every accepted form is validated through _validate_id so we never hand
untrusted text to the sidecar."""
s = url_or_id.strip() s = url_or_id.strip()
# Bare 11-char ID (YouTube's canonical length). # Bare 11-char ID (YouTube's canonical length).
if re.fullmatch(r"[A-Za-z0-9_-]{11}", s): if _ID_RE.fullmatch(s):
return s return s
parsed = urlparse(s) parsed = urlparse(s)
# https://youtu.be/<id> # https://youtu.be/<id> (strict suffix match — avoid `myyoutu.be` etc.)
if parsed.netloc.endswith("youtu.be"): if parsed.netloc == "youtu.be" or parsed.netloc.endswith(".youtu.be"):
return parsed.path.lstrip("/").split("/")[0] return _validate_id(parsed.path.lstrip("/").split("/")[0])
# https://www.youtube.com/watch?v=<id> # https://www.youtube.com/watch?v=<id> (strict suffix match)
if "youtube.com" in parsed.netloc: if parsed.netloc == "youtube.com" or parsed.netloc.endswith(".youtube.com"):
for k, v in parse_qsl(parsed.query): for k, v in parse_qsl(parsed.query):
if k == "v": if k == "v":
return v return _validate_id(v)
# /shorts/<id>, /embed/<id>, /live/<id> # /shorts/<id>, /embed/<id>, /live/<id>
m = re.match(r"^/(shorts|embed|live)/([A-Za-z0-9_-]{11})", parsed.path) m = re.match(r"^/(shorts|embed|live)/([A-Za-z0-9_-]{11})", parsed.path)
if m: if m:
return m.group(2) return _validate_id(m.group(2))
raise ValueError(f"could not extract YouTube id from {url_or_id!r}") raise ValueError(f"could not extract YouTube id from {url_or_id!r}")
@ -95,13 +110,22 @@ def _call_sidecar(request: dict, timeout_s: int = 30) -> dict:
raise RuntimeError( raise RuntimeError(
f"sidecar exited {proc.returncode}: {proc.stderr.decode('utf-8', 'replace')[:500]}" f"sidecar exited {proc.returncode}: {proc.stderr.decode('utf-8', 'replace')[:500]}"
) )
# Stdout may have multiple lines if the sidecar logged something; take the first # The sidecar writes its JSON response as the LAST line of stdout (after any
# non-empty line as the response. # log lines it might have written via println!/eprintln/tracing if the route
# config ever changes). Pick the last non-empty line and decode robustly.
last_line: bytes | None = None
for line in proc.stdout.splitlines(): for line in proc.stdout.splitlines():
line = line.strip() line = line.strip()
if line: if line:
return json.loads(line.decode("utf-8") if isinstance(line, bytes) else line) last_line = line
raise RuntimeError("sidecar produced no response") if last_line is None:
raise RuntimeError("sidecar produced no response")
try:
return json.loads(last_line.decode("utf-8", errors="replace"))
except json.JSONDecodeError as e:
raise RuntimeError(
f"sidecar stdout was not JSON: {e}; line={last_line[:200]!r}"
) from e
def _resolved_listitem(stream_url: str, title: str | None) -> xbmcgui.ListItem: def _resolved_listitem(stream_url: str, title: str | None) -> xbmcgui.ListItem:
@ -109,7 +133,11 @@ def _resolved_listitem(stream_url: str, title: str | None) -> xbmcgui.ListItem:
li.setPath(stream_url) li.setPath(stream_url)
li.setProperty("IsPlayable", "true") li.setProperty("IsPlayable", "true")
# Tell inputstream.adaptive to handle DASH/HLS based on URL path/suffix. # Tell inputstream.adaptive to handle DASH/HLS based on URL path/suffix.
if ".mpd" in stream_url: # Check the URL *path* extension, not a substring of the whole URL —
# progressive yt-dlp URLs have base64 blobs in the query string that
# can accidentally contain ".mpd" or ".m3u8".
parsed = urlparse(stream_url)
if parsed.path.endswith(".mpd"):
li.setProperty("inputstream", "inputstream.adaptive") li.setProperty("inputstream", "inputstream.adaptive")
li.setProperty("inputstream.adaptive.manifest_type", "mpd") li.setProperty("inputstream.adaptive.manifest_type", "mpd")
# googlevideo rejects segment GETs that don't carry an Origin/Referer # googlevideo rejects segment GETs that don't carry an Origin/Referer
@ -126,7 +154,7 @@ def _resolved_listitem(stream_url: str, title: str | None) -> xbmcgui.ListItem:
) )
li.setProperty("inputstream.adaptive.stream_headers", seg_headers) li.setProperty("inputstream.adaptive.stream_headers", seg_headers)
li.setProperty("inputstream.adaptive.manifest_headers", seg_headers) li.setProperty("inputstream.adaptive.manifest_headers", seg_headers)
elif ".m3u8" in stream_url: elif parsed.path.endswith(".m3u8"):
li.setProperty("inputstream", "inputstream.adaptive") li.setProperty("inputstream", "inputstream.adaptive")
li.setProperty("inputstream.adaptive.manifest_type", "hls") li.setProperty("inputstream.adaptive.manifest_type", "hls")
return li return li
@ -143,7 +171,6 @@ class _MpdHandler(http.server.BaseHTTPRequestHandler):
self.send_response(200) self.send_response(200)
self.send_header("Content-Type", "application/dash+xml") self.send_header("Content-Type", "application/dash+xml")
self.send_header("Content-Length", str(len(self.mpd_bytes))) self.send_header("Content-Length", str(len(self.mpd_bytes)))
self.send_header("Access-Control-Allow-Origin", "*")
self.end_headers() self.end_headers()
self.wfile.write(self.mpd_bytes) self.wfile.write(self.mpd_bytes)
@ -189,10 +216,18 @@ def _start_mpd_server(mpd_bytes: bytes) -> tuple[str, http.server.HTTPServer]:
{"mpd_bytes": mpd_bytes}, {"mpd_bytes": mpd_bytes},
) )
lan_ip = _lan_ip() lan_ip = _lan_ip()
# Bind to all interfaces so the LAN-IP URL also routes through (otherwise # Bind to the LAN IP (not 0.0.0.0) so the MPD is reachable only on the
# connect-by-IP on the same host can return 'connection refused' on some # interface inputstream.adaptive uses to connect, not exposed to every
# kernel/firewall configs). Port 0 → kernel picks free port. # device on the LAN. The MPD embeds signed googlevideo segment URLs that
server = http.server.ThreadingHTTPServer(("0.0.0.0", 0), handler_cls) # we don't want walkable from a guest phone or IoT device.
try:
server = http.server.ThreadingHTTPServer((lan_ip, 0), handler_cls)
except OSError:
# LAN-IP bind can fail in rare network states (interface down,
# IP not yet assigned). Fall back to loopback — inputstream.adaptive
# on the same host can still reach it.
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler_cls)
lan_ip = "127.0.0.1"
threading.Thread(target=server.serve_forever, daemon=True).start() threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1] port = server.server_address[1]
url = f"http://{lan_ip}:{port}/manifest.mpd" url = f"http://{lan_ip}:{port}/manifest.mpd"
@ -378,10 +413,11 @@ def _delegate_to_pv_youtube(yt_id: str) -> bool:
def _pv_youtube_installed() -> bool: def _pv_youtube_installed() -> bool:
"""Check whether plugin.video.youtube is installed + enabled. We don't """Check whether plugin.video.youtube is installed + enabled. We don't
enable it ourselves if the user removed it, we fall back to our own enable it ourselves if the user removed it, we fall back to our own
paths.""" paths. Log any probe failure so silent degradation to 360p is observable."""
try: try:
return bool(xbmc.getCondVisibility("System.HasAddon(plugin.video.youtube)")) return bool(xbmc.getCondVisibility("System.HasAddon(plugin.video.youtube)"))
except Exception: except Exception as e:
_log(f"pv.youtube probe failed: {e}", xbmc.LOGWARNING)
return False return False
@ -407,7 +443,12 @@ def _play(yt_id: str) -> None:
except Exception: except Exception:
pass pass
if use_pv_youtube and _delegate_to_pv_youtube(yt_id): if use_pv_youtube and _delegate_to_pv_youtube(yt_id):
_attach_sponsorblock(yt_id) try:
_attach_sponsorblock(yt_id)
except Exception as e:
# Don't let a SB monitor bug pop a 'Plugin error' dialog on the TV
# after a successful delegate-to-pv.youtube hand-off.
_log(f"sponsorblock attach error (non-fatal): {e}", xbmc.LOGWARNING)
return return
mpd_bytes: bytes | None = None mpd_bytes: bytes | None = None
@ -484,7 +525,10 @@ def _play(yt_id: str) -> None:
_log(f"resolved via yt-dlp progressive fallback, playing") _log(f"resolved via yt-dlp progressive fallback, playing")
xbmcplugin.setResolvedUrl(_HANDLE, True, _resolved_listitem(stream_url, title)) xbmcplugin.setResolvedUrl(_HANDLE, True, _resolved_listitem(stream_url, title))
_attach_sponsorblock(yt_id) try:
_attach_sponsorblock(yt_id)
except Exception as e:
_log(f"sponsorblock attach error (non-fatal): {e}", xbmc.LOGWARNING)
def _attach_sponsorblock(yt_id: str) -> None: def _attach_sponsorblock(yt_id: str) -> None:
@ -509,26 +553,6 @@ def _attach_sponsorblock(yt_id: str) -> None:
# 'block until playback ends' signal that gates MPD-server shutdown. # 'block until playback ends' signal that gates MPD-server shutdown.
SponsorBlockMonitor(skip_segments).run() SponsorBlockMonitor(skip_segments).run()
# SponsorBlock: fetch segments, then block on a monitor that seeks past
# each skip segment as the playhead enters it. Best-effort — failure to
# fetch segments is non-fatal, playback proceeds without skipping.
try:
sb_resp = _call_sidecar({"op": "sponsorblock", "id": yt_id}, timeout_s=8)
except Exception as e:
_log(f"sponsorblock fetch failed (non-fatal): {e}", xbmc.LOGWARNING)
return
if not sb_resp.get("ok"):
return
segments = sb_resp.get("segments") or []
skip_segments = [s for s in segments if s.get("actionType") == "skip"]
if not skip_segments:
_log("sponsorblock: no skip segments for this video")
return
_log(f"sponsorblock: monitoring {len(skip_segments)} skip segments")
SponsorBlockMonitor(skip_segments).run()
class SponsorBlockMonitor(xbmc.Monitor): class SponsorBlockMonitor(xbmc.Monitor):
"""Polls Kodi's player position and seeks past each SponsorBlock skip """Polls Kodi's player position and seeks past each SponsorBlock skip
@ -570,8 +594,11 @@ class SponsorBlockMonitor(xbmc.Monitor):
while not self.abortRequested() and player.isPlaying(): while not self.abortRequested() and player.isPlaying():
try: try:
pos = float(player.getTime()) pos = float(player.getTime())
except (RuntimeError, OSError): except Exception:
# getTime raises if the player went away between checks. # getTime raises various exception types when the player goes
# away mid-poll (Kodi shutdown, plugin reload, etc). Wider catch
# so an exception path doesn't escape into _play's finally and
# leak the MPD HTTP server.
return return
for seg in self.segments: for seg in self.segments:
uuid = seg.get("UUID", "") uuid = seg.get("UUID", "")
@ -728,7 +755,7 @@ def _search_directory(query: str | None = None) -> None:
return return
try: try:
resp = _call_sidecar({"op": "search", "query": query, "limit": 30}, timeout_s=15) resp = _call_sidecar({"op": "search", "query": query, "limit": 30}, timeout_s=25)
except Exception as e: except Exception as e:
_log(f"search failed: {e}", xbmc.LOGERROR) _log(f"search failed: {e}", xbmc.LOGERROR)
xbmcgui.Dialog().notification( xbmcgui.Dialog().notification(
@ -758,7 +785,7 @@ def _playlist_directory(playlist_id: str) -> None:
"""List a playlist's videos.""" """List a playlist's videos."""
try: try:
resp = _call_sidecar( resp = _call_sidecar(
{"op": "playlist", "id": playlist_id, "limit": 100}, timeout_s=15 {"op": "playlist", "id": playlist_id, "limit": 100}, timeout_s=25
) )
except Exception as e: except Exception as e:
_log(f"playlist failed: {e}", xbmc.LOGERROR) _log(f"playlist failed: {e}", xbmc.LOGERROR)
@ -788,7 +815,7 @@ def _channel_directory(channel_id: str) -> None:
"""List a channel's recent videos.""" """List a channel's recent videos."""
try: try:
resp = _call_sidecar( resp = _call_sidecar(
{"op": "channel_videos", "id": channel_id, "limit": 50}, timeout_s=15 {"op": "channel_videos", "id": channel_id, "limit": 50}, timeout_s=25
) )
except Exception as e: except Exception as e:
_log(f"channel_videos failed: {e}", xbmc.LOGERROR) _log(f"channel_videos failed: {e}", xbmc.LOGERROR)

View file

@ -68,6 +68,18 @@ fn default_search_limit() -> u32 {
25 25
} }
/// Hard ceiling on any `limit` value accepted from a JSON request. Prevents an
/// unbounded `limit: u32::MAX` from OOMing Kodi or hammering YouTube for
/// hundreds of continuation pages.
const MAX_LIMIT: u32 = 200;
/// Allowed prefix(es) for the `rip` op's `dest_dir`. Prevents arbitrary-write
/// under the sidecar's UID via a crafted JSON request.
const RIP_DEST_ALLOWLIST: &[&str] = &[
"/storage/.kodi/temp/",
"/storage/.kodi/userdata/addon_data/plugin.video.torttube/",
];
#[derive(Debug, Serialize)] #[derive(Debug, Serialize)]
#[serde(untagged)] #[serde(untagged)]
enum Response { enum Response {
@ -151,41 +163,96 @@ async fn handle_line(line: &str) -> Response {
match req { match req {
Request::Ping => Response::ok(serde_json::json!({ "pong": true })), Request::Ping => Response::ok(serde_json::json!({ "pong": true })),
Request::Resolve { id } => match resolve::resolve(&id).await { Request::Resolve { id } => {
Ok(v) => Response::ok(v), if let Err(e) = validate_youtube_id(&id) {
Err(e) => e.into(), return Response::err(ErrorKind::BadRequest, e);
}, }
Request::ResolvePlay { id } => match resolve::resolve_play(&id).await { match resolve::resolve(&id).await {
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => e.into(), Err(e) => e.into(),
}, }
Request::ResolveDash { id } => match resolve::resolve_dash(&id).await { }
Ok(v) => Response::ok(v), Request::ResolvePlay { id } => {
Err(e) => e.into(), if let Err(e) = validate_youtube_id(&id) {
}, return Response::err(ErrorKind::BadRequest, e);
Request::Rip { id, dest_dir } => match rip::rip(&id, &dest_dir).await { }
Ok(v) => Response::ok(v), match resolve::resolve_play(&id).await {
Err(e) => e.into(), Ok(v) => Response::ok(v),
}, Err(e) => e.into(),
Request::Sponsorblock { id, categories } => match sponsor::fetch(&id, &categories).await { }
Ok(v) => Response::ok(v), }
Err(e) => Response::err(ErrorKind::Network, format!("sponsorblock: {e}")), Request::ResolveDash { id } => {
}, if let Err(e) = validate_youtube_id(&id) {
Request::Search { query, limit } => match resolve::search(&query, limit).await { return Response::err(ErrorKind::BadRequest, e);
Ok(v) => Response::ok(v), }
Err(e) => e.into(), match resolve::resolve_dash(&id).await {
}, Ok(v) => Response::ok(v),
Request::ChannelVideos { id, limit } => match resolve::channel_videos(&id, limit).await { Err(e) => e.into(),
Ok(v) => Response::ok(v), }
Err(e) => e.into(), }
}, Request::Rip { id, dest_dir } => {
Request::Playlist { id, limit } => match resolve::playlist(&id, limit).await { if let Err(e) = validate_youtube_id(&id) {
return Response::err(ErrorKind::BadRequest, e);
}
if !RIP_DEST_ALLOWLIST.iter().any(|p| dest_dir.starts_with(p)) {
return Response::err(
ErrorKind::BadRequest,
format!("rip dest_dir not in allowlist: {dest_dir}"),
);
}
match rip::rip(&id, &dest_dir).await {
Ok(v) => Response::ok(v),
Err(e) => e.into(),
}
}
Request::Sponsorblock { id, categories } => {
if let Err(e) = validate_youtube_id(&id) {
return Response::err(ErrorKind::BadRequest, e);
}
match sponsor::fetch(&id, &categories).await {
Ok(v) => Response::ok(v),
Err(e) => Response::err(ErrorKind::Network, format!("sponsorblock: {e}")),
}
}
Request::Search { query, limit } => {
match resolve::search(&query, limit.min(MAX_LIMIT)).await {
Ok(v) => Response::ok(v),
Err(e) => e.into(),
}
}
Request::ChannelVideos { id, limit } => {
if let Err(e) = validate_youtube_id(&id) {
return Response::err(ErrorKind::BadRequest, e);
}
match resolve::channel_videos(&id, limit.min(MAX_LIMIT)).await {
Ok(v) => Response::ok(v),
Err(e) => e.into(),
}
}
Request::Playlist { id, limit } => match resolve::playlist(&id, limit.min(MAX_LIMIT)).await {
Ok(v) => Response::ok(v), Ok(v) => Response::ok(v),
Err(e) => e.into(), Err(e) => e.into(),
}, },
} }
} }
/// Strict YouTube video-id shape check — 11 chars from [A-Za-z0-9_-].
/// Centralized so every op that takes an `id` enforces the same contract;
/// returns a clear error message for `BadRequest` rather than passing a junk
/// string into rustypipe / yt-dlp / our log lines.
fn validate_youtube_id(id: &str) -> Result<(), String> {
if id.len() != 11 {
return Err(format!("invalid youtube id (length {} != 11)", id.len()));
}
if !id
.bytes()
.all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-')
{
return Err(format!("invalid youtube id (bad chars): {id:?}"));
}
Ok(())
}
/// Common error returned by resolve/rip handlers — gets mapped to a typed Response. /// Common error returned by resolve/rip handlers — gets mapped to a typed Response.
#[derive(Debug, thiserror::Error)] #[derive(Debug, thiserror::Error)]
pub enum HandlerError { pub enum HandlerError {

View file

@ -247,18 +247,19 @@ async fn tier1_rustypipe(id: &str) -> Result<Value, HandlerError> {
} }
/// Classify a yt-dlp shell-out error into one of our typed handler errors. /// Classify a yt-dlp shell-out error into one of our typed handler errors.
/// yt-dlp's stderr is freeform English; we match on substrings, case-insensitive /// yt-dlp's stderr is freeform English; we match on **word-boundary** patterns
/// via a lowercase copy, but preserve the original message in the returned error. /// so "private" matches the standalone word, not e.g. "private network" inside
/// a TLS error. Preserve the original message verbatim in the returned error.
fn classify_yt_dlp_error(e: &anyhow::Error) -> HandlerError { fn classify_yt_dlp_error(e: &anyhow::Error) -> HandlerError {
let original = e.to_string(); let original = e.to_string();
let lower = original.to_lowercase(); let lower = original.to_lowercase();
if lower.contains("age") { if matches_word(&lower, &["age-restrict", "age restricted", "age restriction"]) {
HandlerError::AgeRestricted HandlerError::AgeRestricted
} else if lower.contains("private") { } else if matches_word(&lower, &["private video", "video is private"]) {
HandlerError::PrivateVideo HandlerError::PrivateVideo
} else if lower.contains("not available") || lower.contains("does not exist") { } else if lower.contains("not available") || lower.contains("does not exist") {
HandlerError::NotFound HandlerError::NotFound
} else if lower.contains("geo") || lower.contains("region") { } else if matches_word(&lower, &["geo-restrict", "geo restrict", "region-restrict", "region restrict"]) {
HandlerError::RegionBlocked HandlerError::RegionBlocked
} else { } else {
HandlerError::Extractor(original) HandlerError::Extractor(original)
@ -268,22 +269,27 @@ fn classify_yt_dlp_error(e: &anyhow::Error) -> HandlerError {
/// Classify a rustypipe error into one of our typed handler errors. /// Classify a rustypipe error into one of our typed handler errors.
/// rustypipe's error enum varies by version; we match on the Display string for resilience. /// rustypipe's error enum varies by version; we match on the Display string for resilience.
fn classify_rustypipe_error(e: &dyn std::fmt::Display) -> HandlerError { fn classify_rustypipe_error(e: &dyn std::fmt::Display) -> HandlerError {
let msg = e.to_string().to_lowercase(); let original = e.to_string();
if msg.contains("age") && msg.contains("restrict") { let msg = original.to_lowercase();
if matches_word(&msg, &["age-restrict", "age restricted", "age restriction"]) {
HandlerError::AgeRestricted HandlerError::AgeRestricted
} else if msg.contains("region") || msg.contains("country") || msg.contains("geo") { } else if matches_word(&msg, &["region-restrict", "region restrict", "geo-restrict", "geo restrict", "country restrict"]) {
HandlerError::RegionBlocked HandlerError::RegionBlocked
} else if msg.contains("private") { } else if matches_word(&msg, &["private video", "video is private"]) {
HandlerError::PrivateVideo HandlerError::PrivateVideo
} else if msg.contains("not found") || msg.contains("unavailable") { } else if matches_word(&msg, &["not found", "unavailable", "does not exist"]) {
HandlerError::NotFound HandlerError::NotFound
} else if msg.contains("network") || msg.contains("timeout") || msg.contains("connect") { } else if matches_word(&msg, &["network error", "timeout", "connection refused", "dns error"]) {
HandlerError::Network(msg) HandlerError::Network(original)
} else { } else {
HandlerError::Extractor(msg) HandlerError::Extractor(original)
} }
} }
fn matches_word(haystack: &str, needles: &[&str]) -> bool {
needles.iter().any(|n| haystack.contains(n))
}
/// Tier 2 — shell out to yt-dlp -j. /// Tier 2 — shell out to yt-dlp -j.
async fn tier2_yt_dlp(id: &str) -> Result<Value, HandlerError> { async fn tier2_yt_dlp(id: &str) -> Result<Value, HandlerError> {
let url = format!("https://www.youtube.com/watch?v={id}"); let url = format!("https://www.youtube.com/watch?v={id}");