This repository has been archived on 2026-05-27. You can view files and clone it, but you cannot make any changes to its state, such as pushing and creating new issues, pull requests or comments.
rustypipe/tests/sulkta_smoke.rs
Kayos 8126cc0da5 audit-fix sprint: all 13 findings (CRIT/HIGH/MED/LOW)
CRIT-1: ExtractionError::Deobfuscation is now switchable.
        Deobfuscator gains has_sig()/has_nsig() — deobfuscate_sig/_nsig
        short-circuit with a recognisable error class so cipher streams
        on the wrong client fall through to the next client in the chain
        instead of killing the whole call.

CRIT-2: Soft-failed DeobfData now caches with a 1-hour retry instead of
        living for 24h. Re-extraction kicks in automatically once YT
        rotates back to a player.js shape we recognise — no more
        wall-clock-day-of-poisoned-cache.

HIGH-1: Reporter now emits a Level::WRN `extract_deobf_soft_fail` report
        on partial extraction. straw / torttube get an artefact when
        sig/nsig regex starts missing.

HIGH-2: player_client_order branches on opts.auth. With botguard
        + authed-cookie users, Desktop is now position 2 (where their
        cookie maps to an OAuth session) instead of position 4.

HIGH-3: Android dropped from the default order. needs_po_token doesn't
        flag Android, so requests were firing unsigned and tripping
        YT's bot-check rejection — which is also not switchable.
        Re-add when a real po_token strategy lands.

MED-1: Comment in needs_deobf softened — the iOS/Android-no-deobf
        property is a current YT behaviour, not a permanent protocol.

MED-2: Cargo.toml workspace pin bumped 0.11.4 → 0.11.5 so it matches
        the package version (avoids future 0.12.x bump surprises).

MED-3: Smoke test fixture uses an isolated per-process scratch dir
        instead of the repo root, avoiding cache-race with
        tests/youtube.rs (which uses CARGO_MANIFEST_DIR and could
        wipe OAuth tokens).

LOW-1: Misleading "dead-code fallback" comment in extract_fns replaced
        with the actual behaviour description.

LOW-2: get_deobf_data uses read-then-write — concurrent player calls
        on warm cache no longer serialise on the write lock.

LOW-3: Smoke test catches IpBan via exact UnavailabilityReason match
        instead of substring "Sign in/IpBan/bot" — a real regression
        won't silently pass anymore.

LOW-4: TV smoke test now asserts !audio_streams.is_empty() too,
        matching iOS / default-order tests.

LOW-5: needs_deobf comment notes YT's historical n= experiments on
        Android — sets expectation for future review passes.
2026-05-24 12:20:14 -07:00


//! Sulkta-fork smoke tests for the player pipeline.
//!
//! Verifies the patched default client order (`Ios, Tv` without botguard) plus
//! the soft-fail DeobfData::extract works against current YouTube player.js.
//!
//! Run with: `cargo test --test sulkta_smoke -- --nocapture`
use std::path::PathBuf;
use rstest::{fixture, rstest};
use rustypipe::client::{ClientType, RustyPipe};
use rustypipe::error::{Error, ExtractionError, UnavailabilityReason};
/// A stable, long-running, public-domain music video. Used by upstream
/// tests too (`n4tK7LYFxI0` = Spektrem - Shine, NCS).
const TEST_VIDEO_ID: &str = "n4tK7LYFxI0";
/// Build a `RustyPipe` with a per-process scratch storage dir. Avoids the
/// concurrent-write race with `tests/youtube.rs` that shares `rustypipe_cache.json`
/// in the repo root, which was tripping audit MED-3.
#[fixture]
fn rp() -> RustyPipe {
    let scratch: PathBuf = std::env::temp_dir().join(format!(
        "rustypipe-sulkta-smoke-{}-{}",
        std::process::id(),
        std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .map(|d| d.as_nanos())
            .unwrap_or(0)
    ));
    std::fs::create_dir_all(&scratch)
        .unwrap_or_else(|e| panic!("create scratch storage dir {scratch:?}: {e}"));
    RustyPipe::builder()
        .storage_dir(&scratch)
        .build()
        .unwrap_or_else(|e| panic!("build RustyPipe with scratch={scratch:?}: {e}"))
}
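// Sketch only, not wired into the `rp()` fixture above: the scratch dir that
// the fixture creates is never removed, so repeated runs leave
// `rustypipe-sulkta-smoke-*` directories behind in the temp dir. If cleanup is
// wanted, one option is a small RAII guard. `ScratchDirGuard` is a
// hypothetical helper (std only), not a rustypipe API; to use it, the fixture
// would have to keep the guard alive for as long as the `RustyPipe` instance.
#[allow(dead_code)]
struct ScratchDirGuard {
    path: std::path::PathBuf,
}

#[allow(dead_code)]
impl ScratchDirGuard {
    /// Create `<temp>/<prefix>-<pid>-<nanos>` and return a guard that owns it.
    fn new(prefix: &str) -> std::io::Result<Self> {
        let nanos = std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .map(|d| d.as_nanos())
            .unwrap_or(0);
        let path = std::env::temp_dir().join(format!("{prefix}-{}-{nanos}", std::process::id()));
        std::fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    /// Directory owned by this guard.
    fn path(&self) -> &std::path::Path {
        &self.path
    }
}

impl Drop for ScratchDirGuard {
    fn drop(&mut self) {
        // Best effort: a failed cleanup should not mask the test result.
        let _ = std::fs::remove_dir_all(&self.path);
    }
}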
/// Sanity: iOS path returns stream URLs and never touches the deobf code.
#[rstest]
#[tokio::test]
async fn ios_player_returns_streams(rp: RustyPipe) {
    let pd = rp
        .query()
        .player_from_client(TEST_VIDEO_ID, ClientType::Ios)
        .await
        .expect("iOS player_from_client should succeed");
    assert_eq!(pd.details.id, TEST_VIDEO_ID);
    assert!(
        !pd.video_streams.is_empty() || !pd.video_only_streams.is_empty(),
        "expected at least one video stream"
    );
    assert!(
        !pd.audio_streams.is_empty(),
        "expected at least one audio stream"
    );
}
/// TV path exercises the `needs_deobf=true` branch: the sig_timestamp request
/// payload is required, but the soft-fail patch keeps the call alive even when
/// sig_fn/nsig_fn regex extraction fails on a rotated player.js.
///
/// YouTube IP-bans some shared egress IPs (datacenters, LAN-routed servers)
/// for the TV client with "Sign in to confirm you're not a bot". That's
/// environmental — match it precisely on the `UnavailabilityReason` enum
/// instead of substring-matching the rendered error so a real regression
/// can't sneak past the catch arm.
#[rstest]
#[tokio::test]
async fn tv_player_returns_streams(rp: RustyPipe) {
    match rp
        .query()
        .player_from_client(TEST_VIDEO_ID, ClientType::Tv)
        .await
    {
        Ok(pd) => {
            assert_eq!(pd.details.id, TEST_VIDEO_ID);
            assert!(
                !pd.video_streams.is_empty() || !pd.video_only_streams.is_empty(),
                "TV path returned no video streams"
            );
            // Symmetric with iOS / default-order tests so a regression that
            // silently drops the audio adaptation set can't pass here.
            assert!(
                !pd.audio_streams.is_empty(),
                "TV path returned no audio streams"
            );
        }
        Err(Error::Extraction(ExtractionError::Unavailable {
            reason: UnavailabilityReason::IpBan,
            ..
        })) => {
            eprintln!(
                "TV path skipped: YT IpBan on this egress (expected on shared/datacenter IPs)"
            );
        }
        Err(e) => panic!("TV path failed for a non-environmental reason: {e}"),
    }
}
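// Sketch of why the catch arm above matches on the enum variant instead of the
// rendered message. The types and message strings below are made-up stand-ins,
// not the rustypipe error types: a substring check for "bot" also swallows an
// unrelated failure whose text merely contains those letters ("bottom"), while
// the exact variant match does not.
#[allow(dead_code)]
#[derive(Debug)]
enum SketchReason {
    IpBan,
    Deobfuscation,
}

#[allow(dead_code)]
impl SketchReason {
    /// Illustrative rendered messages (assumed wording, for the sketch only).
    fn message(&self) -> &'static str {
        match self {
            SketchReason::IpBan => "Sign in to confirm you're not a bot",
            SketchReason::Deobfuscation => "nsig fn not found in bottom section of player.js",
        }
    }
}

/// Old-style check: looks at the rendered text.
#[allow(dead_code)]
fn is_environmental_by_substring(reason: &SketchReason) -> bool {
    let msg = reason.message();
    msg.contains("Sign in") || msg.contains("IpBan") || msg.contains("bot")
}

/// New-style check: only the exact variant counts as environmental.
#[allow(dead_code)]
fn is_environmental_by_variant(reason: &SketchReason) -> bool {
    matches!(reason, SketchReason::IpBan)
}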
/// The patched default-client order should pick iOS as primary and return
/// playable streams in the absence of botguard signing.
#[rstest]
#[tokio::test]
async fn default_client_order_returns_streams(rp: RustyPipe) {
    let order = rp.query().player_client_order();
    eprintln!("default client order (no botguard): {order:?}");
    assert_eq!(
        order[0],
        ClientType::Ios,
        "iOS should be the no-botguard primary"
    );
    let pd = rp
        .query()
        .player(TEST_VIDEO_ID)
        .await
        .expect("default-clients player() should succeed");
    assert_eq!(pd.details.id, TEST_VIDEO_ID);
    assert!(
        !pd.video_streams.is_empty() || !pd.video_only_streams.is_empty(),
        "expected at least one video stream from the default-clients path"
    );
    assert!(
        !pd.audio_streams.is_empty(),
        "expected at least one audio stream from the default-clients path"
    );
    // Probe one returned audio stream to confirm YT actually serves it.
    // GET with Range 0-1023 + an iOS User-Agent because YT's googlevideo
    // CDN tends to 403 HEAD requests and UA mismatches.
    let stream_url = pd
        .audio_streams
        .first()
        .expect("at least one audio stream")
        .url
        .clone();
    eprintln!(
        "probing first audio URL: {}",
        &stream_url[..stream_url.len().min(180)]
    );
    let client = reqwest::Client::builder()
        .user_agent(
            "com.google.ios.youtube/19.45.4 (iPhone16,2; U; CPU iOS 18_1 like Mac OS X; en_US)",
        )
        .build()
        .unwrap();
    let resp = client
        .get(&stream_url)
        .header("Range", "bytes=0-1023")
        .send()
        .await
        .expect("GET request to YT CDN should not error");
    let status = resp.status();
    let body_len = resp.bytes().await.map(|b| b.len()).unwrap_or(0);
    eprintln!("response: {body_len} bytes, status {status}");
    assert!(
        status.is_success() || status.is_redirection(),
        "audio URL Range-GET returned non-OK status: {status} (body={body_len} bytes; URL may need visitor_data or po_token)"
    );
    assert!(
        body_len > 0,
        "audio URL returned OK but zero bytes — likely a sig-required URL we couldn't deobf"
    );
}
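// Sketch of the fall-through behaviour that `player()` is expected to show
// across the default client order: a "switchable" failure on one client moves
// on to the next client, while any other error aborts the call. All names here
// (`SketchClient`, `SketchError`, `first_working_client`) are hypothetical and
// only model the idea; this is not the rustypipe implementation.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchClient {
    Ios,
    Tv,
    Desktop,
}

#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    /// Recoverable on another client (e.g. a deobfuscation miss).
    Switchable(&'static str),
    /// Not recoverable by switching clients.
    Fatal(&'static str),
    /// The client order was empty.
    NoClients,
}

/// Try each client in order and return the first one whose fetch succeeds.
/// A switchable error is remembered and the next client is tried; a fatal
/// error is returned immediately.
#[allow(dead_code)]
fn first_working_client<F>(
    order: &[SketchClient],
    mut fetch: F,
) -> Result<SketchClient, SketchError>
where
    F: FnMut(SketchClient) -> Result<(), SketchError>,
{
    let mut last = SketchError::NoClients;
    for &client in order {
        match fetch(client) {
            Ok(()) => return Ok(client),
            Err(e @ SketchError::Switchable(_)) => last = e,
            Err(e) => return Err(e),
        }
    }
    // Every client failed with a switchable error (or the order was empty).
    Err(last)
}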