From 03e1eb526a67fa5a9a16aa16d90749b33f548435 Mon Sep 17 00:00:00 2001 From: Kayos Date: Sat, 23 May 2026 12:52:49 -0700 Subject: [PATCH] =?UTF-8?q?Second=20audit=20fix=20sprint=20=E2=80=94=202?= =?UTF-8?q?=20CRIT=20+=205=20HIGH=20+=205=20MED?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Second adversarial Opus audit found 2 CRIT + 7 HIGH + 9 MED + 9 LOW against HEAD 503dbef. This sprint lands all CRIT, the 5 highest-value HIGH items, and the most impactful MEDs. Remaining items deferred (see notes at end). CRIT-1 (race + torn write on Watch Later / search-history JSON state): The persistence layer was load->mutate->save with plain open(w). Two real failure modes were live: - Lost-update race: phone fires two RunPlugin wl_add back-to-back, each in a fresh Python interpreter. Interpreter A reads [], inserts AAA, writes [AAA]; B was already past _load_watch_later (also read []) and writes [BBB] — AAA lost. - Torn-write loss: open(w) truncates immediately. Kodi crash / Pi yank between truncate and json.dump finish leaves a zero-byte or partial file. _load* catches JSONDecodeError and returns [], silently wiping the user's pinned-videos list on next read. Fix: _atomic_write_json (mkstemp + fsync + os.replace) + _with_lock (fcntl.flock on a sibling .lock file). All four persistence functions (_record_search, _clear_search_history, _add_to_watch_later, _remove_from_watch_later) refactored to wrap load->mutate->save under the lock. os.replace is atomic on POSIX same-filesystem; flock is per-process advisory across the concurrent plugin interpreters. CRIT-2 (rip allowlist bypass via ..): RIP_DEST_ALLOWLIST check was string-level starts_with. '/storage/.kodi/temp/../../etc/cron.d' passed the allowlist but escaped to anywhere. Dormant op (no Python caller today) but the protocol is a wide-open arbitrary-write primitive under the sidecar UID (root on LibreELEC). Fix: literal-prefix check first (cheap reject), then create_dir_all + tokio::fs::canonicalize, then a second check against the same allowlist on the canonical result. Defeats .. and symlink escape. HIGH-1 (SponsorBlock 1 MiB cap was post-hoc): The cap from the first audit landed AFTER resp.bytes().await — which buffers the entire body unbounded. A hostile mirror returning multi-GB would OOM the Pi before the cap fired. Fix: stream via resp.chunk() in a loop, bail as soon as accumulated bytes > cap. Defends OOM during ingest, not after. HIGH-2 (Container.Refresh fires in wrong context): _wl_remove_action unconditionally fired Refresh. Future refactors that expose the Remove context menu outside the Watch Later listing (or stale context-menu state across navigation) would refresh the wrong container — search results would reload pointlessly when the user just hit Remove in WL. Fix: only Refresh when Container.FolderPath contains 'action=watch_later'. HIGH-3 (thumbnail shape — promoted from first-audit MED-9): The new Watch Later code persists rustypipe Player.details which has a thumbnail string in many versions. _pick_thumbnail called max() on a string — iterates chars and crashes with AttributeError on the .get lookup. Live crash-on-render of Watch Later. Fix: _pick_thumbnail now accepts Any and dispatches on isinstance: empty/None -> '', str -> str, dict -> .url, list -> filter for dicts then max-by-area. HIGH-4 (search query unvalidated — length, control chars, surrogates): Search took an arbitrary string from the addon and shipped it to rustypipe. Three failure modes: - 10 MB query allocates 10 MB on stdin pipe + another 10 MB in serde_json::from_str + materializes in rustypipe's HTTP call. ~30 MB blip per request on the 1 GB Pi. - Newlines in query produce multi-line entries in tracing::info!(query, ...) — log injection (mild on this stack, but obfuscates real entries). - Lone UTF-16 surrogates aren't a crash but produce a confusing error path. Fix: validate_query() at the sidecar dispatch — 2 KB length cap, reject control chars (TAB allowed since YouTube treats it as whitespace). Addon-side _record_search now also collapses whitespace via ' '.join(query.split()). MED-1 (clear_history flicker): _clear_history_action finalized with succeeded=True THEN Container.Update — caused a half-tick of empty directory before the nav replaced it. Fix: succeeded=False first, then the replace navigates cleanly. MED-3 (LibreELEC-only fallback): _addon_data_path's xbmcvfs fallback hardcoded /storage/.kodi/userdata/... — non-existent on Linux desktop Kodi etc. Fix: fall back to ~/.kodi/userdata/addon_data/... which is portable. Tests + dev rigs now persist correctly. MED-4 (Response::ok clobber): Added debug_assert! so any future op returning its own 'ok' key is caught in debug builds. Not a fix per se — but a tripwire. MED-5 (codec regex single-quote): Made _MIME_CODEC_RE accept either quote style. Belt-and-braces against upstream MIME format drift. MED-8 (SponsorBlockMonitor orphaned across plugin boundary): When we delegate to plugin.video.youtube, our monitor runs in our plugin's context against xbmc.Player()'s global state. If the user starts a different video mid-monitor, the segments are for the old one — would spuriously skip the new content. Fix: capture player.getPlayingFile() at start, bail if it changes mid-loop. MED-9 / LOW-2 carried over from first audit and tracked here too. Smoke-verified via JSON-RPC (browse-only, no playback, Leia still watching the TV): - 3000-byte query rejected: 'query too long: 3000 bytes (cap 2048)' - newline-in-query rejected: 'query contains control characters' - legit search returns expected results - wl_add + watch_later directory renders without thumbnail crash Remaining deferred (cosmetic / no current impact): - MED-2 (WL staleness): accept for v0.1, add lazy refresh later - MED-6 (byte-length validate_youtube_id error message): cosmetic - MED-7 (_lan_ip multi-NIC): not currently a problem on the family LAN, document and revisit if Tailscale lands on the Pi - LOW-1..9: defensive polish, no real risk - First-audit's deferred MED/LOWs: still defer-able Addon v0.0.16. --- addon/plugin.video.torttube/addon.xml | 2 +- addon/plugin.video.torttube/main.py | 226 ++++++++++++++---- sidecar/crates/torttube-sidecar/src/main.rs | 65 +++++ .../crates/torttube-sidecar/src/sponsor.rs | 28 ++- 4 files changed, 260 insertions(+), 61 deletions(-) diff --git a/addon/plugin.video.torttube/addon.xml b/addon/plugin.video.torttube/addon.xml index f6be16f..6dceed7 100644 --- a/addon/plugin.video.torttube/addon.xml +++ b/addon/plugin.video.torttube/addon.xml @@ -1,7 +1,7 @@ diff --git a/addon/plugin.video.torttube/main.py b/addon/plugin.video.torttube/main.py index b40184f..7b6aea8 100644 --- a/addon/plugin.video.torttube/main.py +++ b/addon/plugin.video.torttube/main.py @@ -18,6 +18,7 @@ That's how Android / phone / "send to TV" flows hand off — Kodi already exposes the endpoint, we just need to register the plugin URL. """ +import fcntl import http.server import json import os @@ -25,6 +26,7 @@ import re import socket import subprocess import sys +import tempfile import threading from typing import Any from urllib.parse import parse_qsl, urlencode, urlparse @@ -39,7 +41,20 @@ ADDON = xbmcaddon.Addon() ADDON_PATH = ADDON.getAddonInfo("path") SIDECAR_BIN = os.path.join(ADDON_PATH, "bin", "torttube-sidecar") -_HANDLE = int(sys.argv[1]) if len(sys.argv) > 1 else -1 +def _safe_handle() -> int: + """Kodi always passes an int as sys.argv[1] per the plugin contract, but + if someone runs main.py directly (testing) or Kodi has a bad day we'd + crash with a ValueError before _log is even defined. -1 means 'no handle'; + setResolvedUrl / endOfDirectory are no-ops with that.""" + if len(sys.argv) <= 1: + return -1 + try: + return int(sys.argv[1]) + except ValueError: + return -1 + + +_HANDLE = _safe_handle() _QS = sys.argv[2] if len(sys.argv) > 2 else "" @@ -235,7 +250,7 @@ def _start_mpd_server(mpd_bytes: bytes) -> tuple[str, http.server.HTTPServer]: return url, server -_MIME_CODEC_RE = re.compile(r'codecs="([^"]+)"') +_MIME_CODEC_RE = re.compile(r'codecs=["\']([^"\']+)["\']') def _codec_from_mime(stream: dict[str, Any]) -> str: @@ -591,15 +606,31 @@ class SponsorBlockMonitor(xbmc.Monitor): _log("sponsorblock: timed out waiting for playback to start") return + # Capture the file path that's actually playing now; bail if it changes + # mid-monitor (delegate-to-pv.youtube means our SponsorBlockMonitor is + # alive in our plugin's context while pv.youtube drives playback — + # if the user starts a different video, our skip-segments are stale). + # Audit MED-8 (2nd pass). + try: + initial_file = player.getPlayingFile() + except Exception: + initial_file = "" + while not self.abortRequested() and player.isPlaying(): try: pos = float(player.getTime()) + current_file = player.getPlayingFile() except Exception: # getTime raises various exception types when the player goes # away mid-poll (Kodi shutdown, plugin reload, etc). Wider catch # so an exception path doesn't escape into _play's finally and # leak the MPD HTTP server. return + if initial_file and current_file and current_file != initial_file: + # User started a different video — our skip segments are for the + # old one, abort instead of spurious-skipping the new content. + _log("sponsorblock: playing file changed, monitor exiting") + return for seg in self.segments: uuid = seg.get("UUID", "") if uuid in self.skipped: @@ -660,18 +691,70 @@ def _format_views(views: int | None) -> str: return str(views) +# Persistence helpers (atomic write + advisory lock). +# Both search history and Watch Later go through these to survive concurrent +# RunPlugin invocations (each is a fresh Python interpreter — load→mutate→save +# races dropped updates without locking) and unclean shutdowns (open(w) truncates +# immediately — Kodi crash mid-write leaves the user's pinned-videos list as a +# zero-byte file that load() silently treats as empty). Audit CRIT-1 (2nd pass). + + +def _addon_data_path(filename: str) -> str: + try: + import xbmcvfs + base = xbmcvfs.translatePath("special://profile/addon_data/plugin.video.torttube/") + except Exception: + # Fallback for non-Kodi tests / generic Linux Kodi installs. + base = os.path.join( + os.path.expanduser("~"), ".kodi", "userdata", + "addon_data", "plugin.video.torttube", + ) + return os.path.join(base, filename) + + +def _atomic_write_json(path: str, data: Any) -> None: + """Write JSON to `path` via a same-filesystem tempfile + os.replace. + Crash-safe: at any moment, `path` either has the old contents or the new + contents — never a torn write.""" + os.makedirs(os.path.dirname(path), exist_ok=True) + fd, tmp = tempfile.mkstemp( + dir=os.path.dirname(path), prefix=".tmp.", suffix=".json" + ) + try: + with os.fdopen(fd, "w", encoding="utf-8") as f: + json.dump(data, f, ensure_ascii=False) + f.flush() + os.fsync(f.fileno()) + os.replace(tmp, path) + except Exception: + try: + os.remove(tmp) + except OSError: + pass + raise + + +def _with_lock(path: str, fn): + """Run `fn()` under an exclusive advisory lock on a sibling .lock file. + fcntl.flock is per-process — exactly what we need to serialize concurrent + plugin interpreters that are each modifying the same JSON state file.""" + lock_path = path + ".lock" + os.makedirs(os.path.dirname(lock_path), exist_ok=True) + with open(lock_path, "w") as lf: + fcntl.flock(lf.fileno(), fcntl.LOCK_EX) + try: + return fn() + finally: + fcntl.flock(lf.fileno(), fcntl.LOCK_UN) + + # Search history persistence — stored as a plain JSON list of recent queries. # Newest first, deduplicated, capped at SEARCH_HISTORY_MAX. SEARCH_HISTORY_MAX = 12 def _search_history_path() -> str: - try: - import xbmcvfs - base = xbmcvfs.translatePath("special://profile/addon_data/plugin.video.torttube/") - except Exception: - base = "/storage/.kodi/userdata/addon_data/plugin.video.torttube/" - return os.path.join(base, "search_history.json") + return _addon_data_path("search_history.json") def _load_search_history() -> list[str]: @@ -686,31 +769,47 @@ def _load_search_history() -> list[str]: def _save_search_history(items: list[str]) -> None: + """Atomic, no-lock save. Caller MUST already hold the lock if doing a + load→mutate→save sequence (see _record_search).""" path = _search_history_path() try: - os.makedirs(os.path.dirname(path), exist_ok=True) - with open(path, "w", encoding="utf-8") as f: - json.dump(items[:SEARCH_HISTORY_MAX], f, ensure_ascii=False) + _atomic_write_json(path, items[:SEARCH_HISTORY_MAX]) except OSError as e: _log(f"search history save failed (non-fatal): {e}", xbmc.LOGWARNING) def _record_search(query: str) -> None: - q = query.strip() + q = " ".join(query.split()) # collapse whitespace if not q: return - history = _load_search_history() - # Dedupe case-insensitively, keep newest at the front. - history = [h for h in history if h.lower() != q.lower()] - history.insert(0, q) - _save_search_history(history) + path = _search_history_path() + + def _do() -> None: + history = _load_search_history() + # Dedupe case-insensitively, keep newest at the front. + history = [h for h in history if h.lower() != q.lower()] + history.insert(0, q) + _save_search_history(history) + + try: + _with_lock(path, _do) + except OSError as e: + _log(f"search history lock failed (non-fatal): {e}", xbmc.LOGWARNING) def _clear_search_history() -> None: + path = _search_history_path() + + def _do() -> None: + try: + os.remove(path) + except (FileNotFoundError, OSError): + pass + try: - os.remove(_search_history_path()) - except (FileNotFoundError, OSError): - pass + _with_lock(path, _do) + except OSError as e: + _log(f"search history clear failed (non-fatal): {e}", xbmc.LOGWARNING) # Watch Later — user-curated list of saved videos. The anti-algorithm answer @@ -722,12 +821,7 @@ WATCH_LATER_MAX = 500 def _watch_later_path() -> str: - try: - import xbmcvfs - base = xbmcvfs.translatePath("special://profile/addon_data/plugin.video.torttube/") - except Exception: - base = "/storage/.kodi/userdata/addon_data/plugin.video.torttube/" - return os.path.join(base, "watch_later.json") + return _addon_data_path("watch_later.json") def _load_watch_later() -> list[dict[str, Any]]: @@ -745,38 +839,71 @@ def _load_watch_later() -> list[dict[str, Any]]: def _save_watch_later(items: list[dict[str, Any]]) -> None: + """Atomic, no-lock save. Caller MUST hold the lock (see _add_to / _remove_from).""" path = _watch_later_path() try: - os.makedirs(os.path.dirname(path), exist_ok=True) - with open(path, "w", encoding="utf-8") as f: - json.dump(items[:WATCH_LATER_MAX], f, ensure_ascii=False) + _atomic_write_json(path, items[:WATCH_LATER_MAX]) except OSError as e: _log(f"watch later save failed (non-fatal): {e}", xbmc.LOGWARNING) def _add_to_watch_later(item: dict[str, Any]) -> None: yt_id = item.get("id") - if not yt_id: + if not yt_id or not _ID_RE.fullmatch(str(yt_id)): return - items = _load_watch_later() - items = [i for i in items if i.get("id") != yt_id] # dedupe - items.insert(0, item) - _save_watch_later(items) + path = _watch_later_path() + + def _do() -> None: + items = _load_watch_later() + items = [i for i in items if i.get("id") != yt_id] + items.insert(0, item) + _save_watch_later(items) + + try: + _with_lock(path, _do) + except OSError as e: + _log(f"watch later add lock failed (non-fatal): {e}", xbmc.LOGWARNING) def _remove_from_watch_later(yt_id: str) -> None: - items = _load_watch_later() - items = [i for i in items if i.get("id") != yt_id] - _save_watch_later(items) + path = _watch_later_path() + + def _do() -> None: + items = _load_watch_later() + items = [i for i in items if i.get("id") != yt_id] + _save_watch_later(items) + + try: + _with_lock(path, _do) + except OSError as e: + _log(f"watch later remove lock failed (non-fatal): {e}", xbmc.LOGWARNING) -def _pick_thumbnail(thumbs: list[dict[str, Any]] | None) -> str: - """Pick the largest thumbnail from rustypipe's thumbnail list.""" +def _pick_thumbnail(thumbs: Any) -> str: + """Pick a thumbnail URL from whatever shape rustypipe / yt-dlp / our own + persisted Watch Later records hand us. + + rustypipe `Player.details.thumbnail` is a single URL **string**, while + `VideoItem.thumbnail` is a `list[dict]`. The original implementation + crashed on the string form (`max(str, ...)` iterates chars and calls + `.get` on each — `AttributeError`). Watch Later persists details-shaped + items, so this is a live crash-on-render once any video is pinned. + Audit HIGH-3 (2nd pass).""" if not thumbs: return "" - return max(thumbs, key=lambda t: (t.get("width") or 0) * (t.get("height") or 0)).get( - "url", "" - ) + if isinstance(thumbs, str): + return thumbs + if isinstance(thumbs, dict): + return thumbs.get("url", "") or "" + if isinstance(thumbs, list): + dict_thumbs = [t for t in thumbs if isinstance(t, dict)] + if not dict_thumbs: + return "" + best = max( + dict_thumbs, key=lambda t: (t.get("width") or 0) * (t.get("height") or 0) + ) + return best.get("url", "") or "" + return "" def _root_directory() -> None: @@ -856,9 +983,10 @@ def _clear_history_action() -> None: xbmcgui.Dialog().notification( "torttube", "Search history cleared", xbmcgui.NOTIFICATION_INFO, 2000 ) - # Bounce back to root. + # succeeded=False so Kodi doesn't enter an empty directory before the + # Container.Update lands — avoids the half-tick flicker the audit caught. + xbmcplugin.endOfDirectory(_HANDLE, succeeded=False) xbmc.executebuiltin("Container.Update(plugin://plugin.video.torttube/,replace)") - xbmcplugin.endOfDirectory(_HANDLE, succeeded=True) def _watch_later_directory() -> None: @@ -931,9 +1059,13 @@ def _wl_remove_action() -> None: xbmcgui.Dialog().notification( "torttube", "Removed from Watch Later", xbmcgui.NOTIFICATION_INFO, 2000 ) - # Refresh the current container so the item disappears immediately when - # invoked from inside the Watch Later list. - xbmc.executebuiltin("Container.Refresh") + # Refresh ONLY if we're currently inside the Watch Later listing. If the + # user invoked remove from somewhere else (future refactor, stale + # context-menu state), refreshing the wrong container is jarring. + # Audit HIGH-2 (2nd pass). + container_path = xbmc.getInfoLabel("Container.FolderPath") or "" + if "action=watch_later" in container_path: + xbmc.executebuiltin("Container.Refresh") def _add_video_items(items: list[dict[str, Any]], *, in_watch_later: bool = False) -> None: diff --git a/sidecar/crates/torttube-sidecar/src/main.rs b/sidecar/crates/torttube-sidecar/src/main.rs index 5c20e30..db5384f 100644 --- a/sidecar/crates/torttube-sidecar/src/main.rs +++ b/sidecar/crates/torttube-sidecar/src/main.rs @@ -104,8 +104,15 @@ enum ErrorKind { impl Response { fn ok(value: serde_json::Value) -> Self { // Inject ok:true into the value if it's an object, otherwise wrap it. + // Sanity-check: no op should ever return its own `ok` field — the + // wire protocol reserves `ok` for our wrapper. Catches drift if any + // future op forwards a remote response that happens to carry `ok`. let value = match value { serde_json::Value::Object(mut map) => { + debug_assert!( + !map.contains_key("ok"), + "op-handler returned a value with its own `ok` key; would be clobbered" + ); map.insert("ok".into(), serde_json::Value::Bool(true)); serde_json::Value::Object(map) } @@ -194,12 +201,41 @@ async fn handle_line(line: &str) -> Response { if let Err(e) = validate_youtube_id(&id) { return Response::err(ErrorKind::BadRequest, e); } + // Resolve the path against the literal allowlist FIRST (cheap reject), + // then canonicalize after create_dir_all to defeat `..` traversal. + // String-level starts_with passes `/storage/.kodi/temp/../../etc` — we + // need a real filesystem-aware check. Audit CRIT-2 + HIGH-5 (2nd pass). if !RIP_DEST_ALLOWLIST.iter().any(|p| dest_dir.starts_with(p)) { return Response::err( ErrorKind::BadRequest, format!("rip dest_dir not in allowlist: {dest_dir}"), ); } + if let Err(e) = tokio::fs::create_dir_all(&dest_dir).await { + return Response::err( + ErrorKind::Io, + format!("create_dir_all {dest_dir}: {e}"), + ); + } + let canonical = match tokio::fs::canonicalize(&dest_dir).await { + Ok(p) => p, + Err(e) => { + return Response::err( + ErrorKind::Io, + format!("canonicalize {dest_dir}: {e}"), + ) + } + }; + let canonical_str = canonical.to_string_lossy(); + let inside_allowlist = RIP_DEST_ALLOWLIST.iter().any(|p| { + canonical_str.starts_with(p.trim_end_matches('/')) + }); + if !inside_allowlist { + return Response::err( + ErrorKind::BadRequest, + format!("rip dest_dir escapes allowlist after canonicalize: {canonical_str}"), + ); + } match rip::rip(&id, &dest_dir).await { Ok(v) => Response::ok(v), Err(e) => e.into(), @@ -215,6 +251,9 @@ async fn handle_line(line: &str) -> Response { } } Request::Search { query, limit } => { + if let Err(e) = validate_query(&query) { + return Response::err(ErrorKind::BadRequest, e); + } match resolve::search(&query, limit.min(MAX_LIMIT)).await { Ok(v) => Response::ok(v), Err(e) => e.into(), @@ -236,6 +275,32 @@ async fn handle_line(line: &str) -> Response { } } +/// Validate a free-form search query. Audit HIGH-4 (2nd pass). +/// - Length cap (2KB) — prevents a multi-megabyte query from sucking up RAM +/// on the Pi 4 + provoking the OOM killer. +/// - No control chars (except TAB which YouTube treats as whitespace). +/// Protects against log-injection via newlines (sidecar emits a single +/// `tracing::info!(query, ...)` per call; embedded newlines look like +/// separate log records). Also protects against null bytes that would +/// break OS-level args if we ever stop using stdin. +fn validate_query(q: &str) -> Result<(), String> { + const MAX_QUERY_BYTES: usize = 2048; + if q.is_empty() { + return Err("query is empty".into()); + } + if q.len() > MAX_QUERY_BYTES { + return Err(format!( + "query too long: {} bytes (cap {})", + q.len(), + MAX_QUERY_BYTES + )); + } + if q.chars().any(|c| c.is_control() && c != '\t') { + return Err("query contains control characters".into()); + } + Ok(()) +} + /// Strict YouTube video-id shape check — 11 chars from [A-Za-z0-9_-]. /// Centralized so every op that takes an `id` enforces the same contract; /// returns a clear error message for `BadRequest` rather than passing a junk diff --git a/sidecar/crates/torttube-sidecar/src/sponsor.rs b/sidecar/crates/torttube-sidecar/src/sponsor.rs index 02e7e7a..5f8931a 100644 --- a/sidecar/crates/torttube-sidecar/src/sponsor.rs +++ b/sidecar/crates/torttube-sidecar/src/sponsor.rs @@ -54,7 +54,7 @@ pub(crate) async fn fetch(id: &str, categories: &[String]) -> anyhow::Result anyhow::Result SPONSORBLOCK_MAX_BYTES { - anyhow::bail!( - "sponsorblock response too large: {} bytes (cap {})", - bytes.len(), - SPONSORBLOCK_MAX_BYTES - ); + // Stream the body and bail as soon as accumulated bytes exceed the cap — + // do NOT buffer the entire response first, otherwise a hostile mirror or + // compromised DNS returning a multi-GB response would OOM the sidecar + // before any cap check fires. Audit HIGH-1 (2nd pass). + const SPONSORBLOCK_MAX_BYTES: usize = 1024 * 1024; + let mut bytes: Vec = Vec::with_capacity(64 * 1024); + while let Some(chunk) = resp.chunk().await? { + if bytes.len() + chunk.len() > SPONSORBLOCK_MAX_BYTES { + anyhow::bail!( + "sponsorblock response too large (>{} bytes)", + SPONSORBLOCK_MAX_BYTES + ); + } + bytes.extend_from_slice(&chunk); } let body: Vec = serde_json::from_slice(&bytes)?;