Commit graph

5 commits

Author SHA1 Message Date
c8dfc8a34a Public-flip audit: scrub audit-ticket prefixes + LAN refs + tighten README
URLs → git.sulkta.com. Audit-ticket prefixes (SPEC §N, audit Track X, vc=N
audit-fix, FIX (audit ...), PORT DEVIATION) stripped from comments — technical
reasoning retained. Crafting-table LAN refs softened to 'Sulkta build host'.
README sheds marketing scaffolding + stale status tables.
2026-05-27 13:29:52 -07:00
d4000a9f9a Cleanup: drop playlist + suggestion + dead client constants + suppress_unused stubs
Round-2 cruft audit punch list — mechanical deletes, no behavior change.

Whole modules deleted (no wrapper consumer):
  * youtube/playlist_extractor.rs (297 LOC) — full playlist extraction
  * youtube/linkhandler/playlist.rs (81 LOC) — playlist URL parser
  * youtube/suggestion_extractor.rs (91 LOC) — search-as-you-type
  * tests/stream_phase4_offline.rs (186 LOC) — tautological test

Dead pub fns + enum variants + constants:
  * WEB_REMIX_* constants (3) + WEB_MUSIC_ANALYTICS_* constants (3)
  * InnertubeClientRequestInfo::of_web_music_analytics_charts_client
    factory + its charts_client_omits_platform_and_screen test
  * SearchFilter::Music{Songs,Videos,Albums,Playlists,Artists} variants
    (5 of 9 cases) + uses_music_endpoint helper + the search_extractor
    'music search not implemented' reject branch
  * Two #[allow(dead_code)] _suppress_unused stub fns and the imports
    they were keeping alive (std::sync::Arc in js/extractor.rs,
    NetworkError in stream_extractor.rs)

Renamed:
  * search_extractor::test_helpers -> renderer_helpers. Mis-named:
    it's production code called from channel.rs, not a test fixture.

potoken/ kept and documented as the designed Phase-5 extension point
for YouTube bot-detection — wrapper's Android side hasn't registered
a real provider yet, but the trait + global slot stay so when YT
forces po_token universally the integration is one Kotlin patch away,
not a Rust-side rewrite.
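
The "trait + global slot" shape described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual potoken/ API: the names PoTokenProvider, register_provider and current_po_token are hypothetical, and the real trait may take different parameters.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical provider trait; the real trait in potoken/ may differ.
pub trait PoTokenProvider: Send + Sync {
    /// Return a token for the given video id, or None if none is available.
    fn po_token(&self, video_id: &str) -> Option<String>;
}

// Process-global slot. It stays empty until a host (for example the
// Android wrapper) registers a provider.
static PROVIDER: RwLock<Option<Arc<dyn PoTokenProvider>>> = RwLock::new(None);

/// Install (or replace) the global provider.
pub fn register_provider(provider: Arc<dyn PoTokenProvider>) {
    *PROVIDER.write().unwrap() = Some(provider);
}

/// Ask the registered provider for a token; None when the slot is empty.
pub fn current_po_token(video_id: &str) -> Option<String> {
    let guard = PROVIDER.read().unwrap();
    guard.as_ref().and_then(|p| p.po_token(video_id))
}
```

With this shape, callers work unchanged while the slot is empty, and a later host-side registration needs no change on the Rust side.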

~580 LOC removed from production. Wrapper does not need to change.
2026-05-26 22:16:11 -07:00
a47e142ab7 Phase 4 (complete) — stream_extractor orchestrator
Wire the Android-primary fetch path + JSON-walking + URL post-processing
into a single stream_info(video_id) entry point. Mirrors NPE
YoutubeStreamExtractor.onFetchPage() per audit Track C §1.2.

src/youtube/stream_extractor.rs
  * stream_info(video_id) + stream_info_with(video_id, options)
  * fetch_android — reel endpoint (anonymous) OR /player (with po_token)
  * check_playability_status — maps to ContentUnavailable variants
    (AgeRestricted, GeoRestricted, Paid, Private, YoutubeMusicPremium,
    AccountTerminated, Other)
  * is_player_response_not_valid — decoy-video detection
  * populate_video_details + populate_microformat + populate_streams +
    populate_manifests + populate_captions
  * process_url — sig deobf path (signatureCipher → JS function call)
    + unconditional nsig deobf + cpn append + pot append
  * build_video_progressive / build_video_only / build_audio +
    push_*_dedup helpers (FIX: NPE bug — dedup by itag id, not by
    mediaFormat.id which collides 140/141)
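
The itag-keyed dedup mentioned for the push_*_dedup helpers might look like the sketch below. The AudioStream struct and its field names are assumptions made for illustration; only the idea (compare by itag, not by media format id) comes from the notes above.

```rust
/// Hypothetical, trimmed-down stream record; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct AudioStream {
    pub itag: u32,
    /// Several itags can share one media format id (e.g. 140 and 141).
    pub media_format_id: u32,
    pub url: String,
}

/// Push `stream` unless a stream with the same itag is already present.
/// Returns true when the stream was added.
pub fn push_audio_dedup(streams: &mut Vec<AudioStream>, stream: AudioStream) -> bool {
    if streams.iter().any(|s| s.itag == stream.itag) {
        return false;
    }
    streams.push(stream);
    true
}
```

Keying on media_format_id instead would drop the second of two streams that differ only in bitrate, which is the collision the fix avoids.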

Consolidated stream_helper's local ExtractionError into the crate-wide
exceptions::ExtractionError with a new DownloaderMissing variant.

Tests: 73 lib unit tests pass (+9 since Phase 3) + 7 new Phase 4 offline
integration tests = 80 total green. Live YT end-to-end smoke deferred
to Straw integration; the code path is in place.
2026-05-24 17:08:04 -07:00
91639f26d1 Phase 2 — JS deobfuscator (rquickjs + ress)
Port NewPipeExtractor's JS pipeline: player.js fetch + cache, sig and
nsig function extraction, deobfuscation, sticky-error caching.

src/youtube/js/
  * runtime.rs        — rquickjs wrapper (mirrors utils/JavaScript.java)
                        compile_or_throw + run(snippet, name, parameter)
  * lexer.rs          — match_to_closing_brace via the `ress` JS scanner
                        (NPE's lexer is derived from the same crate
                        upstream)
  * extractor.rs      — iframe_api → embed page fallback for player.js
                        URL, regex-driven hash extraction, clean-and-fetch
  * signature.rs      — 6 sig fn name regexes (front-most-recent),
                        deobf-function-body via lexer w/ regex fallback,
                        helper-object + global-string-array extraction,
                        signatureTimestamp, snippet assembler
  * nsig.rs           — 8 nsig fn name regexes (incl. array-indirection),
                        body via lexer w/ regex fallback, fixupFunction
                        early-return strip
  * player_manager.rs — orchestrator + sticky-error cache mirroring
                        YoutubeJavaScriptPlayerManager
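
A sticky-error cache of the kind player_manager.rs is described as using can be sketched like this. It is a simplified stand-in under stated assumptions: StickyCache, get_or_init and clear are hypothetical names, and String is used for both the value and the error to keep the example self-contained.

```rust
use std::sync::Mutex;

/// Remembers the outcome of an expensive, fallible initialisation.
/// Failures are cached as well as successes, so a broken player.js is
/// not re-fetched and re-parsed on every call.
pub struct StickyCache {
    slot: Mutex<Option<Result<String, String>>>,
}

impl StickyCache {
    pub const fn new() -> Self {
        Self { slot: Mutex::new(None) }
    }

    /// Return the cached outcome, running `init` only if nothing is cached.
    pub fn get_or_init<F>(&self, init: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        let mut slot = self.slot.lock().unwrap();
        if let Some(cached) = slot.as_ref() {
            return cached.clone();
        }
        let outcome = init();
        *slot = Some(outcome.clone());
        outcome
    }

    /// Forget the cached outcome so the next call retries.
    pub fn clear(&self) {
        *self.slot.lock().unwrap() = None;
    }
}
```

The trade-off is that a transient failure also sticks until clear() is called, so a real implementation needs some explicit reset path.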

PORT DEVIATIONS from NPE (each flagged in code):
  * replaced the 6th sig fn name regex (it used the Java backref \2, which
    Rust's `regex` crate cannot express because it is backtracking-free)
    with a looser form of a pattern that NPE itself half-broke per audit
    Track B §2.1
  * dropped the Java atomic group `(?>...)` from helper-object regex —
    Rust's NFA is already linear-time
  * nsig fixup substitutes `(?:"undefined"|'undefined')` for the
    \1 backref; harmless loosening
  * sig and nsig assembled snippets prepend `var` — QuickJS rejects
    bare-assignment to undeclared identifiers; NPE relied on Rhino's
    non-strict mode
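
The `var` prepend from the last deviation amounts to a small string transformation. The helper below is a hypothetical sketch (declare_snippet is not a name from the crate) that turns a bare assignment into a declaration before the snippet is handed to the JS runtime.

```rust
/// Prefix an assembled snippet with `var` so its leading assignment
/// becomes a declaration. Hypothetical helper; the crate's own snippet
/// assembler may be structured differently.
pub fn declare_snippet(snippet: &str) -> String {
    let body = snippet.trim_start();
    if body.starts_with("var ") {
        // Already declared; leave it alone so the call is idempotent.
        body.to_string()
    } else {
        format!("var {body}")
    }
}
```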

Tests:
  * 43 lib unit tests (up from 7 in Phase 1)
  * 7 Phase 2 offline integration tests against a hand-crafted
    minified synthetic player.js — exercises the full sig pipeline
    (build_deobfuscator → runtime::run) and nsig fixup_function
  * 7 Phase 1 live smoke tests still green

57/57 total green.
2026-05-24 16:53:19 -07:00
46201c731f Phase 1 — Foundation
Mirror NPE's dependency-free spine in Rust:

* exceptions   — NetworkError + ParsingError + ContentUnavailable
                 + ExtractionError tree, with reqwest/serde_json conversions
* localization — Localization + ContentCountry, default (en, GB)
* downloader/  — Downloader trait, Request builder, Response,
                 reqwest blocking default impl
* page         — continuation-token carrier
* image        — Image + ImageSet + ResolutionLevel
                 (HEIGHT_UNKNOWN/WIDTH_UNKNOWN = -1)
* metainfo     — title/content/url/url_text grab-bag
* service      — StreamingService trait + LinkType + ServiceInfo
* newpipe      — process-global Downloader / Localization /
                 ContentCountry singleton

Foundational invariants nailed down (per SPEC §3):
* HTTP non-2xx returns Ok(Response); only 429 returns Err(NetworkError::Recaptcha)
* Response header keys lowercase-normalized
* Request.add_header keeps parity with the NPE bug (silent overwrite);
  append_header is our clean addition
* default Localization is en-GB
* No cookie jar in the default downloader
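
The first three invariants above can be illustrated with a self-contained sketch. The types here are simplified stand-ins, not the crate's real definitions: finish_response and header_values are hypothetical, and the real Request, Response and NetworkError carry more fields and variants.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's NetworkError.
#[derive(Debug, PartialEq)]
pub enum NetworkError {
    Recaptcha,
}

#[derive(Debug, PartialEq)]
pub struct Response {
    pub status: u16,
    /// Keys are stored lowercase.
    pub headers: HashMap<String, Vec<String>>,
}

#[derive(Debug, Default)]
pub struct Request {
    headers: HashMap<String, Vec<String>>,
}

impl Request {
    /// Replaces any existing values for `name` (the NPE-parity behaviour).
    pub fn add_header(mut self, name: &str, value: &str) -> Self {
        self.headers.insert(name.to_string(), vec![value.to_string()]);
        self
    }

    /// Keeps existing values for `name` and appends one more.
    pub fn append_header(mut self, name: &str, value: &str) -> Self {
        self.headers.entry(name.to_string()).or_default().push(value.to_string());
        self
    }

    pub fn header_values(&self, name: &str) -> &[String] {
        self.headers.get(name).map(Vec::as_slice).unwrap_or(&[])
    }
}

/// Turn a raw status + header list into the downloader's result:
/// only 429 is an error; every other status is handed back to the caller.
pub fn finish_response(
    status: u16,
    raw_headers: Vec<(String, String)>,
) -> Result<Response, NetworkError> {
    if status == 429 {
        return Err(NetworkError::Recaptcha);
    }
    let mut headers: HashMap<String, Vec<String>> = HashMap::new();
    for (name, value) in raw_headers {
        headers.entry(name.to_ascii_lowercase()).or_default().push(value);
    }
    Ok(Response { status, headers })
}
```

Returning Ok for 404 or 500 leaves the decision about what counts as a failure to each extractor rather than to the transport layer.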

Tests: 7 unit + 7 live smoke against httpbin.org (gated on
'online-tests' feature). All green.
2026-05-24 16:32:36 -07:00