Phase 1 — Foundation

Mirror NPE's dependency-free spine in Rust:

* exceptions   — NetworkError + ParsingError + ContentUnavailable
                 + ExtractionError tree, with reqwest/serde_json conversions
* localization — Localization + ContentCountry, default (en, GB)
* downloader/  — Downloader trait, Request builder, Response,
                 reqwest blocking default impl
* page         — continuation-token carrier
* image        — Image + ImageSet + ResolutionLevel
                 (HEIGHT_UNKNOWN/WIDTH_UNKNOWN = -1)
* metainfo     — title/content/url/url_text grab-bag
* service      — StreamingService trait + LinkType + ServiceInfo
* newpipe      — process-global Downloader / Localization /
                 ContentCountry singleton

Foundational invariants nailed down (per SPEC §3):
* HTTP non-2xx returns Ok(Response); only 429 returns Err(NetworkError::Recaptcha)
* Response header keys lowercase-normalized
* Request.add_header PARITY with NPE bug (silent overwrite);
  append_header is our clean addition
* default Localization is en-GB
* No cookie jar in the default downloader

Tests: 7 unit + 7 live smoke against httpbin.org (gated on
'online-tests' feature). All green.
Kayos 2026-05-24 16:32:36 -07:00
parent f44b46fab5
commit 46201c731f
16 changed files with 2689 additions and 1 deletions

1528
Cargo.lock generated Normal file

File diff suppressed because it is too large.

40
Cargo.toml Normal file

@ -0,0 +1,40 @@
[package]
name = "strawcore"
version = "0.1.0"
edition = "2021"
license = "GPL-3.0-or-later"
authors = ["Sulkta-Coop"]
repository = "http://192.168.0.5:3001/Sulkta-Coop/strawcore"
description = "Rust port of NewPipeExtractor (YT-only). Plugs into Straw via UniFFI."
[lib]
crate-type = ["rlib", "cdylib", "staticlib"]
[dependencies]
reqwest = { version = "0.12", default-features = false, features = ["rustls-tls-webpki-roots", "blocking", "gzip"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
thiserror = "1"
parking_lot = "0.12"
url = "2"
once_cell = "1"
[dev-dependencies]
serde_json = "1"
[features]
default = []
# `online-tests` gates network-dependent integration tests. Enable with
# `cargo test --features online-tests` once an internet route is available.
online-tests = []
[profile.release]
strip = true
lto = "thin"
codegen-units = 1
panic = "abort"
opt-level = "z"
[profile.dev]
opt-level = 0
debug = 1


@ -1,3 +1,36 @@
# strawcore
Rust port of NewPipeExtractor (YT-only). Plugs into Straw via UniFFI.
Rust port of [NewPipeExtractor](https://github.com/TeamNewPipe/NewPipeExtractor) (v0.26.2), YouTube-only. Plugs into [Straw](http://192.168.0.5:3001/Sulkta-Coop/straw) via UniFFI.
## Why this exists
`rustypipe` regex-parses YouTube's `player.js` and reimplements the signature deobfuscator in Rust. Every YT player rotation breaks it. NPE embeds Mozilla Rhino and executes the JS function live — resilient by design, and that's the architecture we're mirroring.
The rustypipe-backed Straw build (vc=15..17) also routed playback through iOS-progressive URLs, which hit a server-side ~917 KiB end-byte cap. NPE uses the Android client + po_token → DASH manifest path, which doesn't see the cap. Same fix, different layer.
See `memory/npe-audit-2026-05-24/SPEC.md` in the workspace repo for the full plan.
## Status
| Phase | Subsystem | Status |
|---|---|---|
| 1 | Foundation (downloader + service spine) | **in progress** |
| 2 | JS engine (rquickjs + ress) | pending |
| 3 | InnerTube + itag table | pending |
| 4 | Stream extractor + DASH | pending |
| 5 | PoTokenProvider trait + Android JNI bridge | pending |
| 6 | Search + Channel + Playlist + Kiosks | pending |
| 7 | UniFFI surface swap | pending |
| 8 | Delete rustypipe everywhere | pending |
## Build + test
```bash
cargo build
cargo test --lib # offline unit tests
cargo test --features online-tests # full smoke incl. live httpbin.org
```
## License
GPL-3.0-or-later. NPE is GPL-3.0; this port inherits.


@ -0,0 +1,97 @@
// reqwest-backed default Downloader.
//
// Mirrors NewPipe-app's OkHttpDownloaderImpl behavior:
// * blocking (mirrors NPE's sync surface; async is deferred to a later
// phase that threads tokio through the whole tree)
// * no cookie jar — apps hand-build Cookie headers
// * up to 10 redirects, ~30s timeout
// * HTTP 429 → NetworkError::Recaptcha; all other status codes surface
// as Ok(Response)
use std::time::Duration;
use reqwest::blocking::Client;
use reqwest::redirect::Policy;
use crate::downloader::request::{Method, Request};
use crate::downloader::response::{Headers as RespHeaders, Response};
use crate::downloader::Downloader;
use crate::exceptions::NetworkError;
const DEFAULT_TIMEOUT: Duration = Duration::from_secs(30);
const MAX_REDIRECTS: usize = 10;
const USER_AGENT: &str =
"Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Mobile Safari/537.36";
pub struct ReqwestDownloader {
client: Client,
}
impl ReqwestDownloader {
pub fn new() -> Result<Self, NetworkError> {
let client = Client::builder()
.user_agent(USER_AGENT)
.timeout(DEFAULT_TIMEOUT)
.redirect(Policy::limited(MAX_REDIRECTS))
.gzip(true)
.build()?;
Ok(Self { client })
}
pub fn with_client(client: Client) -> Self {
Self { client }
}
}
impl Downloader for ReqwestDownloader {
fn execute(&self, request: Request) -> Result<Response, NetworkError> {
let method = match request.method() {
Method::Get => reqwest::Method::GET,
Method::Head => reqwest::Method::HEAD,
Method::Post => reqwest::Method::POST,
Method::Put => reqwest::Method::PUT,
Method::Delete => reqwest::Method::DELETE,
};
let mut builder = self.client.request(method, request.url());
for (name, values) in request.headers() {
for value in values {
builder = builder.header(name, value);
}
}
if let Some(loc) = request.localization() {
if request.automatic_localization_header() {
builder = builder.header("Accept-Language", loc.localization_code());
}
}
if let Some(body) = request.body() {
builder = builder.body(body.to_vec());
}
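        let resp = builder.send()?;
        let status = resp.status();
        // Final URL after redirect chasing; this becomes Response::latest_url.
        let url_after_redirects = resp.url().to_string();
        // 429 is the only status that aborts. Every other code goes back to the caller.
        if status.as_u16() == 429 {
            return Err(NetworkError::Recaptcha { url: url_after_redirects });
        }
        let code = status.as_u16();
        let message = status.canonical_reason().unwrap_or("").to_string();
        let mut headers: RespHeaders = RespHeaders::new();
        for (name, value) in resp.headers().iter() {
            // Keys are lowercased per the Downloader contract. Values that
            // to_str() rejects (not visible ASCII) are stored as empty strings.
            let key = name.as_str().to_ascii_lowercase();
            let val = value.to_str().unwrap_or("").to_string();
            headers.entry(key).or_default().push(val);
        }
        // text() consumes the response, so status, URL, and headers are read first.
        let body = resp.text()?;
        Ok(Response::new(code, message, headers, body, url_after_redirects))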
}
}
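
The status rule in `execute` can be looked at in isolation from reqwest. The following is a minimal standalone sketch, not part of the commit: `classify_status` is a hypothetical helper, and the local `NetworkError` is a cut-down stand-in for the crate's enum.

```rust
// Standalone sketch. `classify_status` is hypothetical, and this `NetworkError`
// is a reduced stand-in for the crate's type.
#[derive(Debug, PartialEq)]
enum NetworkError {
    Recaptcha { url: String },
}

/// Maps a status code to the downloader's result shape:
/// 429 aborts, every other code is handed back to the caller.
fn classify_status(code: u16, final_url: &str) -> Result<u16, NetworkError> {
    if code == 429 {
        Err(NetworkError::Recaptcha { url: final_url.to_string() })
    } else {
        Ok(code)
    }
}

fn main() {
    for code in [200u16, 404, 429, 503] {
        println!("{code} -> {:?}", classify_status(code, "https://example.invalid/watch"));
    }
}
```

Callers therefore need their own check on the returned code; a 404 or 503 never shows up as an `Err`.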

48
src/downloader/mod.rs Normal file

@ -0,0 +1,48 @@
// Downloader contract — mirrors NPE's Downloader abstract class.
//
// Foundational invariants (SPEC §3, audited from NPE Downloader.java +
// OkHttpDownloaderImpl in the NewPipe-app):
//
// * No automatic cookie jar. `Cookie:` header is hand-built per request.
// * HTTP non-2xx is NOT an error. Only HTTP 429 throws
// (NetworkError::Recaptcha). Every other 4xx/5xx surfaces as Ok(Response)
// with the status set. Callers inspect themselves.
// * Response::headers normalizes keys to lowercase (OkHttp does this for
// NPE; we make it contractual).
// * Request::add_header mirrors NPE's set-on-add bug — last write wins.
// append_header is our clean addition.
pub mod default_impl;
pub mod request;
pub mod response;
use crate::exceptions::NetworkError;
use crate::localization::Localization;
pub use default_impl::ReqwestDownloader;
pub use request::{Request, RequestBuilder};
pub use response::Response;
pub trait Downloader: Send + Sync {
fn execute(&self, request: Request) -> Result<Response, NetworkError>;
fn get(&self, url: &str) -> Result<Response, NetworkError> {
self.execute(Request::get(url).build())
}
fn get_localized(
&self,
url: &str,
localization: Localization,
) -> Result<Response, NetworkError> {
self.execute(Request::get(url).localization(Some(localization)).build())
}
fn head(&self, url: &str) -> Result<Response, NetworkError> {
self.execute(Request::head(url).build())
}
fn post(&self, url: &str, body: Vec<u8>) -> Result<Response, NetworkError> {
self.execute(Request::post(url, body).build())
}
}
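
Because the convenience methods on the trait delegate to `execute`, an offline test double only has to implement that one method. The sketch below illustrates the idea with reduced stand-in types; `CannedDownloader`, `canned`, and the simplified `Request`/`Response` structs are assumptions made for this example and do not exist in the commit.

```rust
use std::collections::BTreeMap;

// Reduced stand-ins for the crate's Request, Response and NetworkError.
#[derive(Debug, Clone, PartialEq)]
struct Request {
    method: &'static str,
    url: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Response {
    code: u16,
    body: String,
    latest_url: String,
}

#[derive(Debug, PartialEq)]
enum NetworkError {
    Transport(String),
}

trait Downloader: Send + Sync {
    // The only method an implementor must write.
    fn execute(&self, request: Request) -> Result<Response, NetworkError>;

    // Provided method: builds a GET request and delegates to `execute`.
    fn get(&self, url: &str) -> Result<Response, NetworkError> {
        self.execute(Request { method: "GET", url: url.to_string() })
    }
}

/// Hypothetical offline downloader that serves bodies from a map.
struct CannedDownloader {
    pages: BTreeMap<String, String>,
}

impl Downloader for CannedDownloader {
    fn execute(&self, request: Request) -> Result<Response, NetworkError> {
        if request.url.is_empty() {
            return Err(NetworkError::Transport("empty URL".to_string()));
        }
        // Unknown URLs come back as Ok with a 404, matching the contract that
        // non-2xx statuses are not errors.
        let (code, body) = match self.pages.get(&request.url) {
            Some(body) => (200, body.clone()),
            None => (404, String::new()),
        };
        Ok(Response { code, body, latest_url: request.url })
    }
}

fn canned() -> CannedDownloader {
    let mut pages = BTreeMap::new();
    pages.insert("https://example.invalid/a".to_string(), "hello".to_string());
    CannedDownloader { pages }
}

fn main() {
    let dl = canned();
    println!("{:?}", dl.get("https://example.invalid/a"));
    println!("{:?}", dl.get("https://example.invalid/missing"));
}
```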

192
src/downloader/request.rs Normal file

@ -0,0 +1,192 @@
// Request + RequestBuilder — mirrors NPE Request.java.
//
// PARITY: add_header silently overwrites instead of appending, per NPE
// Request.java:215-221. Callers depend on this. append_header is our
// own clean addition for callers we control.
use std::collections::BTreeMap;
use crate::localization::Localization;
pub type Headers = BTreeMap<String, Vec<String>>;
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Method {
Get,
Head,
Post,
Put,
Delete,
}
impl Method {
pub fn as_str(&self) -> &'static str {
match self {
Method::Get => "GET",
Method::Head => "HEAD",
Method::Post => "POST",
Method::Put => "PUT",
Method::Delete => "DELETE",
}
}
}
#[derive(Clone, Debug)]
pub struct Request {
method: Method,
url: String,
headers: Headers,
body: Option<Vec<u8>>,
localization: Option<Localization>,
automatic_localization_header: bool,
}
impl Request {
pub fn get(url: impl Into<String>) -> RequestBuilder {
RequestBuilder::new(Method::Get, url)
}
pub fn head(url: impl Into<String>) -> RequestBuilder {
RequestBuilder::new(Method::Head, url)
}
pub fn post(url: impl Into<String>, body: Vec<u8>) -> RequestBuilder {
RequestBuilder::new(Method::Post, url).body(Some(body))
}
pub fn method(&self) -> &Method {
&self.method
}
pub fn url(&self) -> &str {
&self.url
}
pub fn headers(&self) -> &Headers {
&self.headers
}
pub fn body(&self) -> Option<&[u8]> {
self.body.as_deref()
}
pub fn localization(&self) -> Option<&Localization> {
self.localization.as_ref()
}
pub fn automatic_localization_header(&self) -> bool {
self.automatic_localization_header
}
}
#[derive(Clone, Debug)]
pub struct RequestBuilder {
method: Method,
url: String,
headers: Headers,
body: Option<Vec<u8>>,
localization: Option<Localization>,
automatic_localization_header: bool,
}
impl RequestBuilder {
pub fn new(method: Method, url: impl Into<String>) -> Self {
Self {
method,
url: url.into(),
headers: BTreeMap::new(),
body: None,
localization: None,
automatic_localization_header: true,
}
}
/// PARITY with NPE Request.Builder.addHeader: silently overwrites any
/// existing values for `name`. Callers downstream of NPE-derived code
/// depend on this. For new code prefer [`Self::append_header`].
pub fn add_header(mut self, name: impl Into<String>, value: impl Into<String>) -> Self {
let key = lowercase(name.into());
self.headers.insert(key, vec![value.into()]);
self
}
/// Appends a value to `name`, creating the entry if absent. This is the
/// behavior NPE's addHeader was intended to have. Use freely in our own
/// code; avoid when porting NPE call sites that rely on overwrite.
pub fn append_header(mut self, name: impl Into<String>, value: impl Into<String>) -> Self {
let key = lowercase(name.into());
self.headers.entry(key).or_default().push(value.into());
self
}
pub fn headers(mut self, headers: Headers) -> Self {
self.headers = headers
.into_iter()
.map(|(k, v)| (lowercase(k), v))
.collect();
self
}
pub fn body(mut self, body: Option<Vec<u8>>) -> Self {
self.body = body;
self
}
pub fn localization(mut self, localization: Option<Localization>) -> Self {
self.localization = localization;
self
}
pub fn automatic_localization_header(mut self, on: bool) -> Self {
self.automatic_localization_header = on;
self
}
pub fn build(self) -> Request {
Request {
method: self.method,
url: self.url,
headers: self.headers,
body: self.body,
localization: self.localization,
automatic_localization_header: self.automatic_localization_header,
}
}
}
fn lowercase(s: String) -> String {
s.to_ascii_lowercase()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn add_header_overwrites_parity() {
let r = Request::get("https://x")
.add_header("X-Foo", "first")
.add_header("X-Foo", "second")
.build();
assert_eq!(r.headers().get("x-foo"), Some(&vec!["second".into()]));
}
#[test]
fn append_header_accumulates() {
let r = Request::get("https://x")
.append_header("X-Foo", "first")
.append_header("X-Foo", "second")
.build();
assert_eq!(
r.headers().get("x-foo"),
Some(&vec!["first".into(), "second".into()])
);
}
#[test]
fn headers_keys_lowercased() {
let r = Request::get("https://x").add_header("Content-Type", "text/plain").build();
assert!(r.headers().contains_key("content-type"));
assert!(!r.headers().contains_key("Content-Type"));
}
}
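
One consequence of the overwrite/append split that the unit tests do not cover is mixing the two on the same header name. A minimal sketch, using free functions over the same `BTreeMap<String, Vec<String>>` shape as stand-ins for the builder methods (`build_example` is a hypothetical name used only here):

```rust
use std::collections::BTreeMap;

type Headers = BTreeMap<String, Vec<String>>;

// Free-function stand-ins for RequestBuilder::add_header / append_header.
fn add_header(headers: &mut Headers, name: &str, value: &str) {
    // Overwrite: any values already stored under this name are dropped.
    headers.insert(name.to_ascii_lowercase(), vec![value.to_string()]);
}

fn append_header(headers: &mut Headers, name: &str, value: &str) {
    // Accumulate: push onto the existing list, creating it if needed.
    headers
        .entry(name.to_ascii_lowercase())
        .or_default()
        .push(value.to_string());
}

fn build_example() -> Headers {
    let mut headers = Headers::new();
    append_header(&mut headers, "Accept", "text/html");
    append_header(&mut headers, "accept", "application/json");
    append_header(&mut headers, "X-Tag", "a");
    append_header(&mut headers, "X-Tag", "b");
    // A later add_header discards both appended X-Tag values.
    add_header(&mut headers, "x-tag", "c");
    headers
}

fn main() {
    println!("{:?}", build_example());
}
```

Call order matters: an `add_header` placed after a run of `append_header` calls resets the list for that name.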


@ -0,0 +1,96 @@
// Response — mirrors NPE Response.java.
//
// Header keys are lowercased (SPEC §3 invariant #3). latest_url tracks the
// final URL after redirect chasing — used by every linkHandler and the
// channel resolver loop.
use std::collections::BTreeMap;
pub type Headers = BTreeMap<String, Vec<String>>;
#[derive(Clone, Debug)]
pub struct Response {
response_code: u16,
response_message: String,
response_headers: Headers,
response_body: String,
latest_url: String,
}
impl Response {
pub fn new(
response_code: u16,
response_message: impl Into<String>,
response_headers: Headers,
response_body: impl Into<String>,
latest_url: impl Into<String>,
) -> Self {
let response_headers = response_headers
.into_iter()
.map(|(k, v)| (k.to_ascii_lowercase(), v))
.collect();
Self {
response_code,
response_message: response_message.into(),
response_headers,
response_body: response_body.into(),
latest_url: latest_url.into(),
}
}
pub fn response_code(&self) -> u16 {
self.response_code
}
pub fn response_message(&self) -> &str {
&self.response_message
}
pub fn response_headers(&self) -> &Headers {
&self.response_headers
}
pub fn response_body(&self) -> &str {
&self.response_body
}
pub fn latest_url(&self) -> &str {
&self.latest_url
}
pub fn header(&self, name: &str) -> Option<&str> {
let key = name.to_ascii_lowercase();
self.response_headers.get(&key).and_then(|v| v.first()).map(String::as_str)
}
pub fn headers(&self, name: &str) -> Vec<&str> {
let key = name.to_ascii_lowercase();
match self.response_headers.get(&key) {
Some(v) => v.iter().map(String::as_str).collect(),
None => Vec::new(),
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn header_lookup_case_insensitive() {
let mut h = Headers::new();
h.insert("content-type".into(), vec!["application/json".into()]);
let r = Response::new(200, "OK", h, "{}", "https://x");
assert_eq!(r.header("Content-Type"), Some("application/json"));
assert_eq!(r.header("CONTENT-TYPE"), Some("application/json"));
}
#[test]
fn headers_normalized_to_lowercase() {
let mut h = Headers::new();
h.insert("X-Foo".into(), vec!["bar".into()]);
let r = Response::new(200, "OK", h, "", "https://x");
assert!(r.response_headers().contains_key("x-foo"));
assert!(!r.response_headers().contains_key("X-Foo"));
}
}
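
The difference between `header` (first value) and `headers` (all values) matters for repeated headers such as `Set-Cookie`. The sketch below mirrors the two lookups as free functions over the same map type; `all_headers` and `sample` are names invented for this illustration.

```rust
use std::collections::BTreeMap;

type Headers = BTreeMap<String, Vec<String>>;

/// First value for `name`, matched case-insensitively.
fn header<'a>(headers: &'a Headers, name: &str) -> Option<&'a str> {
    headers
        .get(&name.to_ascii_lowercase())
        .and_then(|v| v.first())
        .map(String::as_str)
}

/// Every value for `name`, or an empty Vec when the header is absent.
fn all_headers<'a>(headers: &'a Headers, name: &str) -> Vec<&'a str> {
    headers
        .get(&name.to_ascii_lowercase())
        .map(|v| v.iter().map(String::as_str).collect::<Vec<_>>())
        .unwrap_or_default()
}

fn sample() -> Headers {
    let mut h = Headers::new();
    h.insert("set-cookie".to_string(), vec!["a=1".to_string(), "b=2".to_string()]);
    h.insert("content-type".to_string(), vec!["text/html".to_string()]);
    h
}

fn main() {
    let h = sample();
    println!("{:?}", header(&h, "Set-Cookie"));
    println!("{:?}", all_headers(&h, "Set-Cookie"));
}
```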

91
src/exceptions.rs Normal file

@ -0,0 +1,91 @@
// Mirrors NewPipeExtractor's exception tree (extractor/src/main/java/org/schabi/newpipe/extractor/exceptions/*).
//
// NPE's hierarchy:
// ExtractionException (root)
// ├── ParsingException
// ├── ContentNotAvailableException
// │ ├── AgeRestrictedContentException
// │ ├── GeographicRestrictionException
// │ ├── PaidContentException
// │ ├── PrivateContentException
// │ ├── YoutubeMusicPremiumContentException
// │ └── SoundCloudGoPlusContentException
// ├── ReCaptchaException
// └── AccountTerminatedException
//
// NetworkError is the IOException-equivalent — only transport-level failures
// throw it. HTTP non-2xx returns Ok(Response). HTTP 429 is the one
// "downloader-aborts" condition, surfaced as NetworkError::Recaptcha.
use thiserror::Error;
#[derive(Debug, Error)]
pub enum NetworkError {
#[error("network transport: {0}")]
Transport(String),
#[error("HTTP 429 reCAPTCHA at {url}")]
Recaptcha { url: String },
}
#[derive(Debug, Error)]
pub enum ParsingError {
#[error("regex didn't match: {0}")]
RegexMiss(String),
#[error("missing field: {0}")]
MissingField(String),
#[error("unexpected JSON shape: {0}")]
JsonShape(String),
#[error("invalid input: {0}")]
Invalid(String),
}
#[derive(Debug, Error)]
pub enum ContentUnavailable {
#[error("age restricted")]
AgeRestricted,
#[error("geo restricted")]
GeoRestricted,
#[error("paid content")]
Paid,
#[error("private content")]
Private,
#[error("YouTube Music premium content")]
YoutubeMusicPremium,
#[error("SoundCloud Go+ content")]
SoundCloudGoPlus,
#[error("account terminated")]
AccountTerminated,
#[error("unavailable: {0}")]
Other(String),
}
#[derive(Debug, Error)]
pub enum ExtractionError {
#[error("network: {0}")]
Network(#[from] NetworkError),
#[error("parsing: {0}")]
Parsing(#[from] ParsingError),
#[error("content unavailable: {0}")]
ContentUnavailable(#[from] ContentUnavailable),
#[error("{0}")]
Other(String),
}
impl From<reqwest::Error> for NetworkError {
fn from(e: reqwest::Error) -> Self {
NetworkError::Transport(e.to_string())
}
}
impl From<serde_json::Error> for ParsingError {
fn from(e: serde_json::Error) -> Self {
ParsingError::JsonShape(e.to_string())
}
}
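
The `From` conversions are what let `?` lift a leaf error into `ExtractionError`. The sketch below shows that flow with hand-written `From` impls instead of `thiserror`, so it compiles with the standard library alone. `field` and `video_id` are hypothetical helpers, and the 11-character check is only there to produce a second error path.

```rust
#[derive(Debug, PartialEq)]
enum ParsingError {
    MissingField(String),
    Invalid(String),
}

#[derive(Debug, PartialEq)]
enum ExtractionError {
    Parsing(ParsingError),
}

// Hand-written equivalent of what `#[from]` generates.
impl From<ParsingError> for ExtractionError {
    fn from(e: ParsingError) -> Self {
        ExtractionError::Parsing(e)
    }
}

/// Hypothetical leaf parser: pulls the value for `key` out of a query string.
fn field<'a>(query: &'a str, key: &str) -> Result<&'a str, ParsingError> {
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(k, _)| *k == key)
        .map(|(_, v)| v)
        .ok_or_else(|| ParsingError::MissingField(key.to_string()))
}

/// Hypothetical extractor step: `?` converts ParsingError into ExtractionError.
fn video_id(query: &str) -> Result<String, ExtractionError> {
    let id = field(query, "v")?;
    if id.len() != 11 {
        return Err(ParsingError::Invalid(format!("bad id length: {}", id.len())).into());
    }
    Ok(id.to_string())
}

fn main() {
    println!("{:?}", video_id("v=dQw4w9WgXcQ&t=1"));
    println!("{:?}", video_id("t=1"));
}
```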

72
src/image.rs Normal file

@ -0,0 +1,72 @@
// Image + ImageSet + ResolutionLevel. Mirrors NPE Image.java.
//
// HEIGHT_UNKNOWN / WIDTH_UNKNOWN are -1 sentinels per SPEC §3 invariant #10
// — kept as i32, not Option<u32>, because several JSON output sites encode
// this directly.
pub const HEIGHT_UNKNOWN: i32 = -1;
pub const WIDTH_UNKNOWN: i32 = -1;
#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash)]
pub enum ResolutionLevel {
Low,
Medium,
High,
Unknown,
}
impl ResolutionLevel {
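    /// Buckets a pixel height into a coarse level. Only the `HEIGHT_UNKNOWN`
    /// sentinel maps to `Unknown`; both thresholds are inclusive upper bounds
    /// (175 is `Low`, 720 is `Medium`).
    pub fn from_height(height: i32) -> Self {
        if height == HEIGHT_UNKNOWN {
            ResolutionLevel::Unknown
        } else if height <= 175 {
            ResolutionLevel::Low
        } else if height <= 720 {
            ResolutionLevel::Medium
        } else {
            ResolutionLevel::High
        }
    }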
}
#[derive(Clone, Debug)]
pub struct Image {
url: String,
height: i32,
width: i32,
estimated_resolution_level: ResolutionLevel,
}
impl Image {
pub fn new(
url: impl Into<String>,
height: i32,
width: i32,
estimated_resolution_level: ResolutionLevel,
) -> Self {
Self {
url: url.into(),
height,
width,
estimated_resolution_level,
}
}
pub fn url(&self) -> &str {
&self.url
}
pub fn height(&self) -> i32 {
self.height
}
pub fn width(&self) -> i32 {
self.width
}
pub fn estimated_resolution_level(&self) -> ResolutionLevel {
self.estimated_resolution_level
}
}
pub type ImageSet = Vec<Image>;
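
Keeping the unknown height as a `-1` sentinel instead of `Option` means callers that compare heights have to filter the sentinel themselves. A small sketch under that assumption follows; `tallest_known` is a hypothetical helper, and `from_height` is copied from the diff as a free function so the block is self-contained.

```rust
const HEIGHT_UNKNOWN: i32 = -1;

#[derive(Clone, Copy, Debug, PartialEq)]
enum ResolutionLevel {
    Low,
    Medium,
    High,
    Unknown,
}

// Same branching as ResolutionLevel::from_height in the diff.
fn from_height(height: i32) -> ResolutionLevel {
    if height == HEIGHT_UNKNOWN {
        ResolutionLevel::Unknown
    } else if height <= 175 {
        ResolutionLevel::Low
    } else if height <= 720 {
        ResolutionLevel::Medium
    } else {
        ResolutionLevel::High
    }
}

/// Hypothetical helper: tallest image height, ignoring the -1 sentinel.
fn tallest_known(heights: &[i32]) -> Option<i32> {
    heights.iter().copied().filter(|h| *h != HEIGHT_UNKNOWN).max()
}

fn main() {
    for h in [HEIGHT_UNKNOWN, 90, 360, 1080] {
        println!("{h} -> {:?}", from_height(h));
    }
    println!("{:?}", tallest_known(&[HEIGHT_UNKNOWN, 360, 90]));
}
```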

24
src/lib.rs Normal file

@ -0,0 +1,24 @@
// Rust port of NewPipeExtractor (YT-only).
//
// Phase 1 lays the dependency-free spine that everything else builds on:
// errors, localization, the Downloader contract, value types, the
// StreamingService trait, and the NewPipe singleton. None of this module
// tree knows anything about YouTube yet — that lands in Phase 3+.
pub mod downloader;
pub mod exceptions;
pub mod image;
pub mod localization;
pub mod metainfo;
pub mod newpipe;
pub mod page;
pub mod service;
pub use downloader::{Downloader, Request, Response};
pub use exceptions::{ExtractionError, NetworkError, ParsingError};
pub use image::{Image, ImageSet, ResolutionLevel};
pub use localization::{ContentCountry, Localization};
pub use metainfo::MetaInfo;
pub use newpipe::NewPipe;
pub use page::Page;
pub use service::{LinkType, ServiceInfo, StreamingService};

109
src/localization.rs Normal file

@ -0,0 +1,109 @@
// Localization + ContentCountry. Per SPEC §3 invariant #9, the DEFAULT
// Localization is ("en", "GB") — not en-US, not the system locale.
// NPE's Localization.java exposes ~100 country codes; we ship a small
// in-source set today and grow as needed.
use std::fmt;
#[derive(Clone, Debug, Eq, PartialEq, Hash)]
pub struct Localization {
language_code: String,
country_code: Option<String>,
}
impl Localization {
pub fn new(language_code: impl Into<String>, country_code: Option<String>) -> Self {
Self { language_code: language_code.into(), country_code }
}
pub fn from_localization_code(code: &str) -> Option<Self> {
let (lang, country) = code.split_once('-').unwrap_or((code, ""));
if lang.is_empty() {
return None;
}
Some(Self {
language_code: lang.to_string(),
country_code: if country.is_empty() { None } else { Some(country.to_string()) },
})
}
pub fn language_code(&self) -> &str {
&self.language_code
}
pub fn country_code(&self) -> Option<&str> {
self.country_code.as_deref()
}
pub fn localization_code(&self) -> String {
match &self.country_code {
Some(c) => format!("{}-{}", self.language_code, c),
None => self.language_code.clone(),
}
}
}
impl Default for Localization {
fn default() -> Self {
Self::new("en", Some("GB".into()))
}
}
impl fmt::Display for Localization {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.write_str(&self.localization_code())
}
}
#[derive(Clone, Debug, Eq, PartialEq, Hash)]
pub struct ContentCountry {
country_code: String,
}
impl ContentCountry {
pub fn new(country_code: impl Into<String>) -> Self {
Self { country_code: country_code.into() }
}
pub fn country_code(&self) -> &str {
&self.country_code
}
}
impl Default for ContentCountry {
fn default() -> Self {
Self::new("GB")
}
}
impl fmt::Display for ContentCountry {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.write_str(&self.country_code)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn default_is_en_gb() {
let l = Localization::default();
assert_eq!(l.language_code(), "en");
assert_eq!(l.country_code(), Some("GB"));
assert_eq!(l.localization_code(), "en-GB");
}
#[test]
fn parse_localization_code() {
let l = Localization::from_localization_code("en-US").unwrap();
assert_eq!(l.language_code(), "en");
assert_eq!(l.country_code(), Some("US"));
let l = Localization::from_localization_code("de").unwrap();
assert_eq!(l.language_code(), "de");
assert_eq!(l.country_code(), None);
assert!(Localization::from_localization_code("").is_none());
}
}
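
`from_localization_code` splits on the first `-` only, which has a few edge cases the unit tests do not exercise. The sketch below reproduces the same split as a free function returning a tuple (`parse` and `format_code` are stand-in names for this example) so the behavior can be checked without the crate.

```rust
/// Stand-in for Localization::from_localization_code: split on the first '-'.
fn parse(code: &str) -> Option<(String, Option<String>)> {
    let (lang, country) = code.split_once('-').unwrap_or((code, ""));
    if lang.is_empty() {
        return None;
    }
    let country = if country.is_empty() { None } else { Some(country.to_string()) };
    Some((lang.to_string(), country))
}

/// Stand-in for Localization::localization_code.
fn format_code(lang: &str, country: Option<&str>) -> String {
    match country {
        Some(c) => format!("{lang}-{c}"),
        None => lang.to_string(),
    }
}

fn main() {
    for code in ["en-GB", "de", "de-", "-GB", "zh-Hant-TW"] {
        println!("{code:?} -> {:?}", parse(code));
    }
    println!("{}", format_code("en", Some("GB")));
}
```

A tag with a script subtag such as `zh-Hant-TW` ends up with `Hant-TW` in the country slot. Whether that matters depends on which codes callers actually pass in.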

53
src/metainfo.rs Normal file

@ -0,0 +1,53 @@
// MetaInfo — mirrors NPE MetaInfo.java.
//
// Carries "info card" style data (knowledge-panel boxes, COVID/election
// warning banners, etc.) attached to a stream or search result. Paired URLs
// + URL texts — same indices.
use url::Url;
#[derive(Clone, Debug, Default)]
pub struct MetaInfo {
title: String,
content: String,
urls: Vec<Url>,
url_texts: Vec<String>,
}
impl MetaInfo {
pub fn new() -> Self {
Self::default()
}
pub fn title(&self) -> &str {
&self.title
}
pub fn set_title(&mut self, title: impl Into<String>) -> &mut Self {
self.title = title.into();
self
}
pub fn content(&self) -> &str {
&self.content
}
pub fn set_content(&mut self, content: impl Into<String>) -> &mut Self {
self.content = content.into();
self
}
pub fn urls(&self) -> &[Url] {
&self.urls
}
pub fn url_texts(&self) -> &[String] {
&self.url_texts
}
pub fn add_url(&mut self, url: Url, text: impl Into<String>) -> &mut Self {
self.urls.push(url);
self.url_texts.push(text.into());
self
}
}

68
src/newpipe.rs Normal file

@ -0,0 +1,68 @@
// NewPipe singleton — mirrors NPE NewPipe.java.
//
// Holds the process-global Downloader + preferred Localization +
// preferred ContentCountry. init() once at startup, then call sites read
// the globals through these getters.
//
// Concrete service registration lands in Phase 3+ once YoutubeService
// exists. Phase 1 only wires the globals.
use std::sync::Arc;
use parking_lot::RwLock;
use crate::downloader::Downloader;
use crate::localization::{ContentCountry, Localization};
pub struct NewPipe {
downloader: RwLock<Option<Arc<dyn Downloader>>>,
preferred_localization: RwLock<Localization>,
preferred_content_country: RwLock<ContentCountry>,
}
impl NewPipe {
pub fn instance() -> &'static NewPipe {
use once_cell::sync::Lazy;
static INSTANCE: Lazy<NewPipe> = Lazy::new(|| NewPipe {
downloader: RwLock::new(None),
preferred_localization: RwLock::new(Localization::default()),
preferred_content_country: RwLock::new(ContentCountry::default()),
});
&INSTANCE
}
pub fn init(downloader: Arc<dyn Downloader>) {
*Self::instance().downloader.write() = Some(downloader);
}
pub fn init_full(
downloader: Arc<dyn Downloader>,
localization: Localization,
content_country: ContentCountry,
) {
let np = Self::instance();
*np.downloader.write() = Some(downloader);
*np.preferred_localization.write() = localization;
*np.preferred_content_country.write() = content_country;
}
pub fn downloader() -> Option<Arc<dyn Downloader>> {
Self::instance().downloader.read().clone()
}
pub fn preferred_localization() -> Localization {
Self::instance().preferred_localization.read().clone()
}
pub fn preferred_content_country() -> ContentCountry {
Self::instance().preferred_content_country.read().clone()
}
pub fn set_preferred_localization(localization: Localization) {
*Self::instance().preferred_localization.write() = localization;
}
pub fn set_preferred_content_country(content_country: ContentCountry) {
*Self::instance().preferred_content_country.write() = content_country;
}
}
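
The singleton shape (lazily created static, interior `RwLock`s, `Arc<dyn Downloader>` handed out by clone) can also be sketched with the standard library only. Note the substitution: this example uses `std::sync::OnceLock` and `std::sync::RwLock` in place of the `once_cell::sync::Lazy` and `parking_lot::RwLock` used in the commit, and `Globals`, `Stub`, and the one-method `Downloader` trait are simplified stand-ins.

```rust
use std::sync::{Arc, OnceLock, RwLock};

// One-method stand-in for the crate's Downloader trait.
trait Downloader: Send + Sync {
    fn name(&self) -> &str;
}

struct Globals {
    downloader: RwLock<Option<Arc<dyn Downloader>>>,
    language: RwLock<String>,
}

/// Lazily initialized process-wide state.
fn globals() -> &'static Globals {
    static INSTANCE: OnceLock<Globals> = OnceLock::new();
    INSTANCE.get_or_init(|| Globals {
        downloader: RwLock::new(None),
        language: RwLock::new("en-GB".to_string()),
    })
}

fn init(downloader: Arc<dyn Downloader>) {
    // std's RwLock returns a Result because of lock poisoning.
    *globals().downloader.write().unwrap() = Some(downloader);
}

fn downloader() -> Option<Arc<dyn Downloader>> {
    // Cloning the Arc releases the read lock before the caller uses it.
    globals().downloader.read().unwrap().clone()
}

struct Stub;

impl Downloader for Stub {
    fn name(&self) -> &str {
        "stub"
    }
}

fn main() {
    init(Arc::new(Stub));
    let dl = downloader().expect("downloader registered");
    println!("{} / {}", dl.name(), globals().language.read().unwrap());
}
```

As in the commit, the getter returns `Option`, so code that runs before `init` sees `None` instead of panicking.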

79
src/page.rs Normal file

@ -0,0 +1,79 @@
// Page — continuation token carrier. Mirrors NPE Page.java.
//
// Used everywhere "the next page" is paginated through an opaque token
// (search results, channel videos, playlist videos, comments). The fields
// are deliberately a grab-bag — NPE callers stuff whatever they need to
// resume.
use std::collections::BTreeMap;
#[derive(Clone, Debug, Default)]
pub struct Page {
url: Option<String>,
id: Option<String>,
ids: Vec<String>,
body: Option<Vec<u8>>,
cookies: BTreeMap<String, String>,
}
impl Page {
pub fn new() -> Self {
Self::default()
}
pub fn with_url(url: impl Into<String>) -> Self {
Self { url: Some(url.into()), ..Self::default() }
}
pub fn url(&self) -> Option<&str> {
self.url.as_deref()
}
pub fn set_url(&mut self, url: Option<String>) -> &mut Self {
self.url = url;
self
}
pub fn id(&self) -> Option<&str> {
self.id.as_deref()
}
pub fn set_id(&mut self, id: Option<String>) -> &mut Self {
self.id = id;
self
}
pub fn ids(&self) -> &[String] {
&self.ids
}
pub fn set_ids(&mut self, ids: Vec<String>) -> &mut Self {
self.ids = ids;
self
}
pub fn body(&self) -> Option<&[u8]> {
self.body.as_deref()
}
pub fn set_body(&mut self, body: Option<Vec<u8>>) -> &mut Self {
self.body = body;
self
}
pub fn cookies(&self) -> &BTreeMap<String, String> {
&self.cookies
}
pub fn set_cookies(&mut self, cookies: BTreeMap<String, String>) -> &mut Self {
self.cookies = cookies;
self
}
pub fn is_valid(&self) -> bool {
self.url.as_deref().map(|s| !s.is_empty()).unwrap_or(false)
|| self.id.as_deref().map(|s| !s.is_empty()).unwrap_or(false)
|| !self.ids.is_empty()
|| self.body.as_ref().map(|b| !b.is_empty()).unwrap_or(false)
}
}
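
Since the default downloader keeps no cookie jar, a cookie map like the one `Page` carries has to be turned into a `Cookie` header by hand before a request is sent. One possible way to do that is sketched below; `cookie_header` is a hypothetical helper, not something this commit provides.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: joins cookie pairs as `name=value; name=value`.
/// Returns None for an empty map so no empty header is sent.
fn cookie_header(cookies: &BTreeMap<String, String>) -> Option<String> {
    if cookies.is_empty() {
        return None;
    }
    let joined = cookies
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect::<Vec<_>>()
        .join("; ");
    Some(joined)
}

fn sample_cookies() -> BTreeMap<String, String> {
    let mut cookies = BTreeMap::new();
    cookies.insert("session".to_string(), "abc".to_string());
    cookies.insert("lang".to_string(), "en".to_string());
    cookies
}

fn main() {
    // BTreeMap iterates in key order, so the output is deterministic.
    println!("{:?}", cookie_header(&sample_cookies()));
}
```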

63
src/service.rs Normal file

@ -0,0 +1,63 @@
// StreamingService trait + ServiceInfo + LinkType. Mirrors NPE
// StreamingService.java.
//
// Phase 1 keeps this dependency-free — no extractor traits, no per-service
// linkHandler factories. YouTube's concrete impl lands in Phase 3 once the
// JS engine is in place (Phase 2).
//
// Service IDs are stable persistence keys per SPEC §3 invariant #6:
// YouTube = 0, even if we never implement another service.
use std::fmt;
#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash)]
pub enum LinkType {
None,
Stream,
Channel,
Playlist,
}
#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash)]
pub enum MediaCapability {
Audio,
Video,
LiveStream,
Comments,
}
#[derive(Clone, Debug)]
pub struct ServiceInfo {
name: String,
media_capabilities: Vec<MediaCapability>,
}
impl ServiceInfo {
pub fn new(name: impl Into<String>, media_capabilities: Vec<MediaCapability>) -> Self {
Self { name: name.into(), media_capabilities }
}
pub fn name(&self) -> &str {
&self.name
}
pub fn media_capabilities(&self) -> &[MediaCapability] {
&self.media_capabilities
}
}
pub trait StreamingService: Send + Sync {
fn service_id(&self) -> u32;
fn service_info(&self) -> &ServiceInfo;
fn base_url(&self) -> &str;
}
impl fmt::Debug for dyn StreamingService {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("StreamingService")
.field("service_id", &self.service_id())
.field("name", &self.service_info().name())
.finish()
}
}
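
To show how small an implementor of the trait is at this phase, here is a sketch with a placeholder service registered under id 0. `PlaceholderYoutube`, `registry`, and `name_for_id` are invented for this example, and the types are trimmed copies of the ones in the diff (public fields instead of getters).

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum MediaCapability {
    Audio,
    Video,
    LiveStream,
    Comments,
}

struct ServiceInfo {
    name: String,
    media_capabilities: Vec<MediaCapability>,
}

trait StreamingService: Send + Sync {
    fn service_id(&self) -> u32;
    fn service_info(&self) -> &ServiceInfo;
    fn base_url(&self) -> &str;
}

/// Hypothetical placeholder; the concrete YouTube service lands in a later phase.
struct PlaceholderYoutube {
    info: ServiceInfo,
}

impl StreamingService for PlaceholderYoutube {
    fn service_id(&self) -> u32 {
        0 // stable persistence key
    }
    fn service_info(&self) -> &ServiceInfo {
        &self.info
    }
    fn base_url(&self) -> &str {
        "https://www.youtube.com"
    }
}

fn registry() -> Vec<Box<dyn StreamingService>> {
    let info = ServiceInfo {
        name: "YouTube".to_string(),
        media_capabilities: vec![
            MediaCapability::Audio,
            MediaCapability::Video,
            MediaCapability::LiveStream,
            MediaCapability::Comments,
        ],
    };
    vec![Box::new(PlaceholderYoutube { info })]
}

/// Hypothetical lookup by persistence id.
fn name_for_id(services: &[Box<dyn StreamingService>], id: u32) -> Option<&str> {
    services
        .iter()
        .find(|s| s.service_id() == id)
        .map(|s| s.service_info().name.as_str())
}

fn main() {
    let services = registry();
    let yt = &services[0];
    println!("{} at {}", yt.service_info().name, yt.base_url());
    println!("{:?}", yt.service_info().media_capabilities);
    println!("{:?}", name_for_id(&services, 0));
}
```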

95
tests/foundation_smoke.rs Normal file

@ -0,0 +1,95 @@
// Phase 1 smoke — exercises the foundation against live httpbin.org.
//
// Per SPEC §4 Phase 1 "Done when": build a Request, send through default
// Downloader, parse Response, confirm latest_url follows redirects.
//
// These tests hit the network, so they are gated on the `online-tests`
// feature and do not run in offline CI.
#![cfg(feature = "online-tests")]
use std::sync::Arc;
use strawcore::downloader::request::Request;
use strawcore::downloader::ReqwestDownloader;
use strawcore::exceptions::NetworkError;
use strawcore::localization::{ContentCountry, Localization};
use strawcore::{Downloader, NewPipe};
#[test]
fn get_through_default_downloader() {
let dl = ReqwestDownloader::new().expect("build downloader");
let resp = dl.get("https://httpbin.org/get").expect("transport");
assert_eq!(resp.response_code(), 200);
assert!(resp.response_body().contains("\"url\""));
}
#[test]
fn latest_url_follows_redirects() {
let dl = ReqwestDownloader::new().expect("build downloader");
let resp = dl
.get("https://httpbin.org/redirect/3")
.expect("transport");
assert_eq!(resp.response_code(), 200);
assert!(
resp.latest_url().ends_with("/get"),
"latest_url should land at /get after 3 redirects, got {}",
resp.latest_url()
);
}
#[test]
fn non_2xx_returns_ok_not_err() {
let dl = ReqwestDownloader::new().expect("build downloader");
let resp = dl.get("https://httpbin.org/status/404").expect("transport");
assert_eq!(resp.response_code(), 404);
}
#[test]
fn http_429_surfaces_as_recaptcha_err() {
let dl = ReqwestDownloader::new().expect("build downloader");
let err = dl.get("https://httpbin.org/status/429").expect_err("429 must be NetworkError");
match err {
NetworkError::Recaptcha { url } => assert!(url.contains("/status/429")),
other => panic!("expected Recaptcha, got {other:?}"),
}
}
#[test]
fn localization_header_attached_when_enabled() {
let dl = ReqwestDownloader::new().expect("build downloader");
let req = Request::get("https://httpbin.org/headers")
.localization(Some(Localization::new("en", Some("GB".into()))))
.build();
let resp = dl.execute(req).expect("transport");
assert_eq!(resp.response_code(), 200);
assert!(
resp.response_body().to_ascii_lowercase().contains("accept-language"),
"Accept-Language should be echoed by httpbin"
);
assert!(resp.response_body().contains("en-GB"));
}
#[test]
fn header_keys_lowercased_in_response() {
let dl = ReqwestDownloader::new().expect("build downloader");
let resp = dl.get("https://httpbin.org/get").expect("transport");
for (k, _) in resp.response_headers() {
assert_eq!(k, &k.to_ascii_lowercase(), "header key {k} not lowercased");
}
}
#[test]
fn newpipe_singleton_wires_downloader() {
let dl = Arc::new(ReqwestDownloader::new().expect("build downloader"));
NewPipe::init_full(
dl.clone(),
Localization::default(),
ContentCountry::default(),
);
let from_global = NewPipe::downloader().expect("downloader registered");
let resp = from_global.get("https://httpbin.org/get").expect("transport");
assert_eq!(resp.response_code(), 200);
assert_eq!(NewPipe::preferred_localization().localization_code(), "en-GB");
}