These three modules were ported from NewPipeExtractor in Phase 1 as
part of the spine. Nothing in the YT extractor (channel/search/stream/
playlist/linkhandler) imports them, and the strawcore wrapper crate
that consumes us doesn't re-export them either. Per the round-2
audit's cruft inventory, this is ~195 LOC of dead surface shipping
to every Android APK.
* page.rs — Page continuation carrier; continuation tokens flow
through the codebase as plain Strings.
* metainfo.rs — NPE MetaInfo info-card struct; no extractor
populates it.
* service.rs — StreamingService trait + ServiceInfo + LinkType;
zero impls exist anywhere.
Wrapper does not need to change — none of the pub use re-exports
crossed the crate boundary.
Channels on the newer pageHeaderRenderer layout (most channels with a
2024+ refreshed header, e.g. WTYP) were getting empty avatars and
banners because parse_channel_browse only extracted those from the
older c4TabbedHeaderRenderer branch.
Two fixes layered:
1. parse_page_header_avatar() — walks the deep ViewModel nest:
header.content.pageHeaderViewModel.image
.decoratedAvatarViewModel.avatar.avatarViewModel.image.sources[]
Falls back to a couple of shallower nestings YT has used on this
path historically. Returns ImageSet sorted by height ascending so
.last() still picks the largest source.
2. metadata.channelMetadataRenderer.avatar.thumbnails[] backfill.
Present whether the header is c4Tabbed or pageHeader, and the most
reliable single avatar source. Used only when both header branches
come back empty, so we don't override a higher-quality header avatar.
Description-from-metadata extraction folded into the same metadata
walk to avoid walking the JSON tree twice.
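A minimal sketch of the selection order described above, assuming a simplified `Image` type and a hypothetical `pick_avatar` helper (neither is the crate's real API). Trying pageHeader before c4Tabbed is also an assumption; a given response is expected to carry only one of the two.

```rust
/// Illustrative image candidate; the crate's real ImageSet type may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Image {
    pub url: String,
    pub height: u32,
}

/// Sort sources by height ascending so `.last()` is the largest one.
pub fn sort_by_height(mut sources: Vec<Image>) -> Vec<Image> {
    sources.sort_by_key(|img| img.height);
    sources
}

/// Hypothetical helper: prefer a header avatar, and fall back to the
/// channelMetadataRenderer thumbnails only when both header branches
/// came back empty.
pub fn pick_avatar(
    page_header: Vec<Image>,
    c4_header: Vec<Image>,
    metadata: Vec<Image>,
) -> Vec<Image> {
    let chosen = if !page_header.is_empty() {
        page_header
    } else if !c4_header.is_empty() {
        c4_header
    } else {
        metadata
    };
    sort_by_height(chosen)
}
```

With this ordering a metadata thumbnail never displaces a header avatar, even when the metadata image happens to be larger.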
Found via an emulator smoke test: channelInfo was returning an empty
recent_videos list, breaking the subscriptions feed.
Two root causes:
1. First browse of a channel by browseId lands on the HOME tab in
2026 YT, not Videos. Home uses sectionListRenderer, not the
richGridRenderer my parser expected. The Videos tab in the
response carries an empty content block (you need a SECOND
browse with the params token to populate it).
2. Channel video items on the Videos tab migrated from
videoRenderer to lockupViewModel (YT made the switch ~2024).
My old parser only handled videoRenderer.
Fix:
* fetch_channel_browse now does TWO browses — first for Home
(header + metadata), second with params='EgZ2aWRlb3PyBgQKAjoA'
for the Videos tab. Same magic constant NPE uses (audit Track
A §2.4).
* parse_videos_tab handles BOTH videoRenderer (legacy/fallback)
AND lockupViewModel (current). lockupViewModel parse extracts:
- contentId → video ID
- metadata.lockupMetadataViewModel.title.content → title
- metadataRows[].metadataParts[].text.content → view-count
('1.1m views') + relative-age ('2 years ago') + uploader
- contentImage.thumbnailViewModel.overlays[]
.thumbnailBottomOverlayViewModel.badges[]
.thumbnailBadgeViewModel.text → duration ('3:14:08')
- contentImage.thumbnailViewModel.image.sources[] → thumbnails
* parse_videos_continuation pulls the continuation token from the
Videos tab grid for pagination.
Second browse is best-effort: if it fails, recent_videos stays
empty and the channel header still populates from the first.
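The best-effort behaviour can be sketched roughly as follows. The `ChannelInfo` shape, the closure-based signature, and the name `fetch_channel_sketch` are assumptions made for illustration; only the params constant is taken from the text above.

```rust
/// Illustrative result type; the real channelInfo carries more fields.
#[derive(Debug, Default, PartialEq)]
pub struct ChannelInfo {
    pub name: String,
    pub recent_videos: Vec<String>,
}

/// Params token for the Videos tab, as quoted in this commit message.
pub const VIDEOS_TAB_PARAMS: &str = "EgZ2aWRlb3PyBgQKAjoA";

/// Hypothetical two-browse flow. `browse_home` stands in for the first
/// browse (header + metadata), `browse_videos` for the second one.
pub fn fetch_channel_sketch<H, V>(
    browse_id: &str,
    browse_home: H,
    browse_videos: V,
) -> Result<ChannelInfo, String>
where
    H: FnOnce(&str) -> Result<String, String>,
    V: FnOnce(&str, &str) -> Result<Vec<String>, String>,
{
    // The first browse is mandatory: without it there is no header.
    let name = browse_home(browse_id)?;
    // The second browse is best-effort: on failure the list stays empty.
    let recent_videos = browse_videos(browse_id, VIDEOS_TAB_PARAMS).unwrap_or_default();
    Ok(ChannelInfo { name, recent_videos })
}
```

A failure in the first closure propagates as `Err`, while a failure in the second only degrades the result to a header with no videos.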
Verified the YT response shape by probing live channel
UCwwtUfy0-CqN50HfaFDzL0w (NCS Spektrem) — got 30+ lockup-style
video items with the expected fields.
Caught during the cargo-ndk cross-compile — strawcore-core was
emitting its own libstrawcore_core.so (~306 KB per ABI) into Straw's
jniLibs. That .so is never loaded by Android; the wrapper crate's
libstrawcore.so is the only entry point.
Consumer crates only need the rlib.
Straw's wrapper crate already owns the name 'strawcore' (and that name
is baked into the Android .so file + Kotlin's System.loadLibrary call).
Renaming this extractor crate to 'strawcore-core' resolves the cargo
package-name collision so both can live in the same workspace dep tree.
Repo name on Gitea stays Sulkta-Coop/strawcore.
Mirrors NPE PoTokenProvider.java + PoTokenResult.java; defines the
host-injection surface for BotGuard attestation. The Rust crate stays
out of the BotGuard business — embedders (Straw on Android, future
Sulkta CLI via Browserless, etc.) supply their own impl.
src/youtube/potoken/mod.rs
* PoTokenResult { player_request_po_token, streaming_data_po_token,
visitor_data } + ::new + ::single constructors
* PoTokenError (Unavailable, MintFailed) — FIX vs NPE: split 'declined'
(Ok(None)) from 'errored' (Err) so callers can react differently
* trait PoTokenProvider with 4 client-scoped methods; default impl
returns Ok(None) so embedders can override just what they support
* set_po_token_provider / clear_po_token_provider / po_token_provider
static registration via RwLock<Option<Arc<dyn PoTokenProvider>>>
src/youtube/potoken/noop.rs
* NoopPoTokenProvider — safe default
src/youtube/stream_extractor.rs
* resolve_po_token via options-first-then-provider helper
(options_or_provider)
* Android branch: pulls player_request_po_token + visitor_data into
/player body, streams streaming_data_po_token through to URL &pot=
* iOS branch: same shape, gated on fetch_ios_client AND non-empty
provider result
Kotlin side (PoTokenWebView lift into Straw via UniFFI's foreign-trait
bridge) is separate work — strawcore just owns the contract.
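A rough sketch of the contract described above, using only the standard library. The field types, the unit-variant error enum, the four method names, and the `options_or_provider` signature are assumptions for illustration; the ::new and ::single constructors are omitted.

```rust
use std::sync::{Arc, RwLock};

/// Field types here are assumed, not taken from the crate.
#[derive(Debug, Clone, PartialEq)]
pub struct PoTokenResult {
    pub player_request_po_token: String,
    pub streaming_data_po_token: Option<String>,
    pub visitor_data: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum PoTokenError {
    Unavailable,
    MintFailed,
}

type PoTokenOutcome = Result<Option<PoTokenResult>, PoTokenError>;

/// Ok(None) means "declined"; Err means "errored". Every method has a
/// default body so embedders override only the clients they support.
/// Method names are hypothetical.
pub trait PoTokenProvider: Send + Sync {
    fn web_client_po_token(&self, _video_id: &str) -> PoTokenOutcome { Ok(None) }
    fn web_embed_client_po_token(&self, _video_id: &str) -> PoTokenOutcome { Ok(None) }
    fn android_client_po_token(&self, _video_id: &str) -> PoTokenOutcome { Ok(None) }
    fn ios_client_po_token(&self, _video_id: &str) -> PoTokenOutcome { Ok(None) }
}

/// Safe default: declines every request via the trait defaults.
pub struct NoopPoTokenProvider;
impl PoTokenProvider for NoopPoTokenProvider {}

static PROVIDER: RwLock<Option<Arc<dyn PoTokenProvider>>> = RwLock::new(None);

pub fn set_po_token_provider(provider: Arc<dyn PoTokenProvider>) {
    *PROVIDER.write().unwrap() = Some(provider);
}

pub fn clear_po_token_provider() {
    *PROVIDER.write().unwrap() = None;
}

pub fn po_token_provider() -> Option<Arc<dyn PoTokenProvider>> {
    PROVIDER.read().unwrap().clone()
}

/// Options-first-then-provider: an explicit token from the caller's
/// options wins; otherwise ask the registered provider, if any.
/// Only the Android path is shown.
pub fn options_or_provider(from_options: Option<PoTokenResult>, video_id: &str) -> PoTokenOutcome {
    if from_options.is_some() {
        return Ok(from_options);
    }
    match po_token_provider() {
        Some(provider) => provider.android_client_po_token(video_id),
        None => Ok(None),
    }
}
```

Cloning the `Arc` out of the lock keeps the read guard short-lived, so a slow provider call does not block a concurrent `set_po_token_provider`.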
Tests: 77 lib unit pass (+4 since Phase 4) + 7 Phase 2 offline + 7
Phase 4 offline = 91 green.
Port NewPipeExtractor's JS pipeline: player.js fetch + cache, sig and
nsig function extraction, deobfuscation, sticky-error caching.
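One way to picture sticky-error caching is a slot that stores the first outcome, success or failure, and replays it on later calls. The `StickyCache` type below is a hypothetical stand-in, not the player manager's actual structure, and it uses `String` as a placeholder error type.

```rust
use std::sync::Mutex;

/// Caches the first result of an expensive extraction, including errors,
/// so a broken player.js is not re-parsed on every request.
pub struct StickyCache<T> {
    slot: Mutex<Option<Result<T, String>>>,
}

impl<T: Clone> StickyCache<T> {
    pub fn new() -> Self {
        Self { slot: Mutex::new(None) }
    }

    /// Runs `init` only if nothing is cached yet; otherwise returns a
    /// clone of the cached value or the cached error.
    pub fn get_or_init<F>(&self, init: F) -> Result<T, String>
    where
        F: FnOnce() -> Result<T, String>,
    {
        let mut slot = self.slot.lock().unwrap();
        if slot.is_none() {
            *slot = Some(init());
        }
        slot.as_ref().unwrap().clone()
    }

    /// Drops the cached outcome so the next call retries.
    pub fn clear(&self) {
        *self.slot.lock().unwrap() = None;
    }
}
```

In this sketch a failure stays cached until `clear` is called; when the real cache is reset is not stated in this message.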
src/youtube/js/
* runtime.rs — rquickjs wrapper (mirrors utils/JavaScript.java)
compile_or_throw + run(snippet, name, parameter)
* lexer.rs — match_to_closing_brace via the `ress` JS scanner
(NPE's lexer is derived from the same crate
upstream)
* extractor.rs — iframe_api → embed page fallback for player.js
URL, regex-driven hash extraction, clean-and-fetch
* signature.rs — 6 sig fn name regexes (front-most-recent),
deobf-function-body via lexer w/ regex fallback,
helper-object + global-string-array extraction,
signatureTimestamp, snippet assembler
* nsig.rs — 8 nsig fn name regexes (incl. array-indirection),
body via lexer w/ regex fallback, fixupFunction
early-return strip
* player_manager.rs — orchestrator + sticky-error cache mirroring
YoutubeJavaScriptPlayerManager
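To illustrate what brace matching has to do, here is a deliberately naive, std-only scanner. It is not the `ress`-based implementation in lexer.rs: it tracks quoted strings and nesting depth only, and it would be fooled by regex literals, comments, or template-literal expressions, which is the reason a real JS scanner is used. The function name and signature are hypothetical.

```rust
/// Returns the text from the first `{` at or after byte offset `start`
/// through its matching `}`, or None if the braces never balance.
pub fn match_to_closing_brace_naive(src: &str, start: usize) -> Option<&str> {
    let open = start + src.get(start..)?.find('{')?;
    let mut depth = 0usize;
    let mut quote: Option<char> = None; // Some(q) while inside a string
    let mut escaped = false;

    for (i, c) in src[open..].char_indices() {
        if let Some(q) = quote {
            // Inside a string literal: only look for the closing quote.
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == q {
                quote = None;
            }
            continue;
        }
        match c {
            '"' | '\'' | '`' => quote = Some(c),
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(&src[open..open + i + 1]);
                }
            }
            _ => {}
        }
    }
    None
}
```

Braces inside string literals are skipped, so a body containing something like `"}{"` is still matched to the correct closing brace.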
PORT DEVIATIONS from NPE (each flagged in code):
* replaced the 6th sig fn name regex (it used the Java backref \2;
Rust's `regex` crate does not support backreferences, so we
substitute a loose form of a pattern that NPE itself half-broke per
audit Track B §2.1)
* dropped the Java atomic group `(?>...)` from the helper-object regex;
the `regex` crate has no atomic groups and already guarantees
linear-time matching
* nsig fixup substitutes `(?:"undefined"|'undefined')` for the
\1 backref; harmless loosening
* sig and nsig assembled snippets prepend `var` — QuickJS rejects
bare-assignment to undeclared identifiers; NPE relied on Rhino's
non-strict mode
Tests:
* 43 lib unit tests (up from 7 in Phase 1)
* 7 Phase 2 offline integration tests against a hand-crafted
minified synthetic player.js — exercises the full sig pipeline
(build_deobfuscator → runtime::run) and nsig fixup_function
* 7 Phase 1 live smoke tests still green
57/57 total green.