vc=69: audit-fix sprint round 2 (regressions on round 1)

Round-2 audit caught four real regressions on round-1 fixes plus a
handful of MEDs. This sprint fixes them.

H1 — FeedRefreshWorker exception class
  Round 1 wrapped subscriptionFeed in try/catch IOException, but
  UniFFI generates StrawcoreException (kotlin.Exception, not
  IOException). The retry path was dead code. Catch
  StrawcoreException.Network instead — the variant our error.rs
  maps NetworkError::Transport into.

H2 — enrichJob terminal emit cancellation race
  withContext(Dispatchers.Default) { mergeFromCache(...) } has no
  suspension points so a cancel arriving mid-merge isn't observed
  until the next suspending call. Without a guard, the non-suspending
  _ui.update lands AFTER clearInMemoryCache() and resurrects the
  cleared items. Add coroutineContext.ensureActive() after each
  withContext hop, before the emit. Applied on both the refresh
  terminal emit and the enrich terminal emit.

H6 — enrichVisibleItems shows stale subscriptions
  The channelsSnapshot captured at refresh-end is ~2s stale by the
  time the enrich terminal emit runs. If the user unsubscribed from
  X in that window, X's items still appear on the feed for one
  frame. Re-read Subscriptions at the terminal step and intersect
  with the snapshot.

R-H3 — extract_channel_id substring match
  Round 1 used trimmed_lower.find(prefix) which matches ANY position.
  evil.com/?redir=https://www.youtube.com/channel/UCxxx silently
  rewrote to the embedded channel ID. strip_prefix() anchors at byte
  0. ASCII-only prefix means byte indices align in trimmed_lower vs
  trimmed.

R-H2 — String::from_utf8 silent-drop
  YouTube ships mojibake titles in the wild. Strict from_utf8
  returned None on any bad byte, dropping the entire channel from the
  feed with only a quiet None. Switch to from_utf8_lossy — quick-xml
  tolerates U+FFFD replacement chars and the per-entry skip-on-empty
  handles broken entries.

R-H1 — read_capped_body per-chunk size sanity
  HTTP allows arbitrarily large single chunks. Reject any chunk
  exceeding the whole body cap before adding it to the buffer, so a
  hostile server can't get us to allocate a hyper Bytes larger than
  the cap.

M3 — Avatar URL validation
  ch.avatar is extractor-emitted; a poisoned channel page could ship
  data:image/svg+xml,<svg>...<script> or javascript: URLs. Validate
  http(s):// scheme before persisting to Subscriptions and before
  surfacing via VideoDetail.uploaderAvatar.

M4 — ChannelViewModel dual loadedUrl
  Same shape VideoDetail's round-1 fix declared unsafe. Move
  loadedUrl into ChannelUiState, drop the field, use _ui.value
  snapshot at top of load() and _ui.value.loadedUrl for the fence.
  Rejected-URL path also stamps loadedUrl so the gate is coherent.
This commit is contained in:
Kayos 2026-05-26 21:31:07 -07:00
parent 5f2ba264b0
commit 23fb6f52b0
6 changed files with 98 additions and 26 deletions

View file

@ -55,6 +55,6 @@ const val NEWPIPE_APPLICATION_ID_NEW = "net.newpipe.app"
// vc=19 / 0.1.0-AE — rust pipeline cutover. Extraction via
// strawcore-core (Sulkta-Coop/strawcore) via the UniFFI wrapper; no
// NewPipeExtractor in the runtime path.
const val STRAW_VERSION_CODE = 68
const val STRAW_VERSION_NAME = "0.1.0-CB"
const val STRAW_VERSION_CODE = 69
const val STRAW_VERSION_NAME = "0.1.0-CC"
const val STRAW_APPLICATION_ID = "com.sulkta.straw"

View file

@ -164,6 +164,16 @@ async fn read_capped_body(resp: reqwest::Response) -> Option<String> {
let mut stream = resp.bytes_stream();
while let Some(chunk_result) = stream.next().await {
let chunk = chunk_result.ok()?;
// Defense-in-depth: a single hostile chunk can be arbitrarily
// large (HTTP allows multi-GiB chunks). Reject any one chunk
// bigger than the whole body cap before we even add it to the
// running total — protects against hyper having already
// allocated the chunk on our behalf. Round-68 audit
// rust-HIGH-1.
if chunk.len() > RSS_MAX_BYTES {
log::warn!("strawcore::rss single chunk {} exceeds cap; aborting", chunk.len());
return None;
}
total = total.saturating_add(chunk.len());
if total > RSS_MAX_BYTES {
log::warn!("strawcore::rss body exceeded {RSS_MAX_BYTES} bytes; aborting");
@ -171,7 +181,12 @@ async fn read_capped_body(resp: reqwest::Response) -> Option<String> {
}
buf.extend_from_slice(&chunk);
}
String::from_utf8(buf).ok()
// Lossy decode — round-68 audit rust-HIGH-2. A strict from_utf8
// returns None on any invalid byte, so a single mojibake title
// would silently drop the entire channel from the feed. quick-xml
// tolerates U+FFFD replacement chars and the per-entry skip-on-
// empty handles broken entries downstream.
Some(String::from_utf8_lossy(&buf).into_owned())
}
/// Extract the `UCxxx` channel ID from a channel URL. Accepts the
@ -194,7 +209,11 @@ fn extract_channel_id(input: &str) -> Option<String> {
let trimmed_lower = trimmed.to_lowercase();
// Match the "<scheme>://<host>/channel/" prefix in a single sweep
// so we accept http/https + www./m. variants without four-way
// string-strip ladders.
// string-strip ladders. ANCHORED at the start of the string —
// round-68 audit rust-HIGH-3: prior `find()` accepted any input
// containing the prefix as a substring, so a pasted
// `evil.com/?redir=https://www.youtube.com/channel/UCxxx` would
// silently rewrite to the wrong channel.
const PREFIXES: &[&str] = &[
"https://www.youtube.com/channel/",
"https://youtube.com/channel/",
@ -204,9 +223,13 @@ fn extract_channel_id(input: &str) -> Option<String> {
"http://m.youtube.com/channel/",
];
for p in PREFIXES {
if let Some(idx) = trimmed_lower.find(p) {
let rest = &trimmed[idx + p.len()..];
let id = rest.split(|c: char| c == '/' || c == '?' || c == '#').next()?;
if let Some(rest) = trimmed_lower.strip_prefix(p) {
// Bytes match 1:1 with `trimmed` since the prefix is ASCII
// and case-folding ASCII doesn't change byte length.
let rest_in_original = &trimmed[p.len()..p.len() + rest.len()];
let id = rest_in_original
.split(|c: char| c == '/' || c == '?' || c == '#')
.next()?;
return validate_channel_id(id);
}
}

View file

@ -29,6 +29,16 @@ data class ChannelUiState(
val avatar: String? = null,
val videos: List<StreamItem> = emptyList(),
val error: String? = null,
/**
* Tracks which channel URL the current state belongs to. Same
* activity-scoped-VM hazard as VideoDetail: a fresh nav to
* channel B sees the PREVIOUS channel's state for one composition
* frame before vm.load(B) clears it. Without this field, any
* caller that derives "this is the channel we want" from
* `state.name` (or other display fields) is reading channel A's
* data while believing it's B. Round-68 audit MED-4.
*/
val loadedUrl: String? = null,
)
class ChannelViewModel : ViewModel() {
@ -40,10 +50,11 @@ class ChannelViewModel : ViewModel() {
// the late-arriving older fetch is cancelled. Round-4 audit
// HIGH-2 / MED-1.
private var inFlight: Job? = null
private var loadedUrl: String? = null
fun load(channelUrl: String) {
if (loadedUrl == channelUrl && _ui.value.videos.isNotEmpty()) return
// Snapshot _ui once so the two reads agree. Round-68 audit MED-4.
val snap = _ui.value
if (snap.loadedUrl == channelUrl && snap.videos.isNotEmpty()) return
// Round-5 audit MED-3: extractor-emitted uploaderUrl can be
// attacker-controlled if the YT response is poisoned upstream.
// Refuse non-YT hosts at the entry point so we don't even
@ -53,12 +64,17 @@ class ChannelViewModel : ViewModel() {
if (!isAllowedYtUrl(channelUrl)) {
inFlight?.cancel()
inFlight = null
_ui.update { ChannelUiState(loading = false, error = "Unsupported URL") }
_ui.update {
ChannelUiState(
loading = false,
error = "Unsupported URL",
loadedUrl = channelUrl,
)
}
return
}
inFlight?.cancel()
loadedUrl = channelUrl
_ui.update { ChannelUiState(loading = true) }
_ui.update { ChannelUiState(loading = true, loadedUrl = channelUrl) }
inFlight = viewModelScope.launch {
try {
val ch = uniffi.strawcore.channelInfo(channelUrl)
@ -74,7 +90,7 @@ class ChannelViewModel : ViewModel() {
uploadDateRelative = v.uploadDateRelative,
)
}
if (loadedUrl != channelUrl) return@launch
if (_ui.value.loadedUrl != channelUrl) return@launch
_ui.update {
ChannelUiState(
loading = false,
@ -83,11 +99,12 @@ class ChannelViewModel : ViewModel() {
banner = ch.banner,
avatar = ch.avatar,
videos = videos,
loadedUrl = channelUrl,
)
}
} catch (t: Throwable) {
if (t is CancellationException) throw t
if (loadedUrl != channelUrl) return@launch
if (_ui.value.loadedUrl != channelUrl) return@launch
_ui.update {
ChannelUiState(
loading = false,
@ -98,6 +115,7 @@ class ChannelViewModel : ViewModel() {
error = com.sulkta.straw.util.LogDump.scrubLine(
t.message ?: t.javaClass.simpleName,
),
loadedUrl = channelUrl,
)
}
}

View file

@ -232,15 +232,25 @@ class VideoDetailViewModel : ViewModel() {
// subscribed and our stored avatar is stale or
// missing, push the fresh one back to the store
// so the subs feed picks it up too.
//
// Validate the scheme before persisting — the
// extractor surfaces the URL string verbatim
// and a poisoned channel page could ship
// `data:image/svg+xml,<svg>...<script>` or
// `javascript:`. Round-68 audit MED-3.
val fresh = ch.avatar
if (!fresh.isNullOrBlank()) {
val safeFresh = if (!fresh.isNullOrBlank() &&
(fresh.startsWith("https://") || fresh.startsWith("http://"))) {
fresh
} else null
if (safeFresh != null) {
runCatchingCancellable {
com.sulkta.straw.data.Subscriptions
.get().updateAvatar(uploaderUrl, fresh)
.get().updateAvatar(uploaderUrl, safeFresh)
}
}
ChannelExtras(
avatar = fresh,
avatar = safeFresh,
subscriberCount = ch.subscriberCount,
videos = ch.videos
.filter { it.url != streamUrl }

View file

@ -28,7 +28,6 @@ import com.sulkta.straw.data.Subscriptions
import com.sulkta.straw.feature.search.StreamItem
import com.sulkta.straw.util.strawLogI
import com.sulkta.straw.util.strawLogW
import java.io.IOException
class FeedRefreshWorker(
context: Context,
@ -44,14 +43,17 @@ class FeedRefreshWorker(
// a flat list; we group by uploaderUrl to rebuild the per-
// channel cache shape FeedCacheStore expects.
//
// Round-67 audit MED-5: distinguish transient failures
// (network down, timeout) from parse failures. The former
// wants Result.retry() so WorkManager re-attempts within the
// current window with exponential backoff; without this, a
// 30-second offline blip eats a full 6-hour refresh cycle.
// Distinguish transient failures (network down, timeout) from
// parse failures. The former wants Result.retry() so
// WorkManager re-attempts within the current window with
// exponential backoff; without this, a 30-second offline blip
// eats a full 6-hour refresh cycle. Round-68 audit HIGH-1:
// earlier `IOException` catch was dead code — UniFFI throws
// `uniffi.strawcore.StrawcoreException.Network` for transport
// errors, which does NOT extend IOException.
val flat = try {
uniffi.strawcore.subscriptionFeed(subs.map { it.url })
} catch (e: IOException) {
} catch (e: uniffi.strawcore.StrawcoreException.Network) {
strawLogW("FeedRefresh") { "transient network failure, retrying: ${e.message}" }
return Result.retry()
} catch (e: Throwable) {

View file

@ -34,6 +34,7 @@ import kotlinx.coroutines.Job
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.ensureActive
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
@ -214,8 +215,15 @@ class SubscriptionFeedViewModel : ViewModel() {
// Move flatMap + per-item regex + sort off Main —
// viewModelScope.launch runs on Main by default and
// mergeFromCache is non-trivial on a 500-item merge.
// Round-67 audit HIGH-1.
// Round-67 audit HIGH-1. ensureActive() AFTER the
// withContext hop is round-68 audit HIGH-2: a
// synchronous Default body doesn't observe
// cancellation until the next suspension; without
// this check, a cancel that landed mid-merge would
// still let the terminal _ui.update fire and clobber
// a fresher state.
val freshItems = withContext(Dispatchers.Default) { mergeFromCache(channels) }
coroutineContext.ensureActive()
_ui.update {
SubscriptionFeedUiState(
loading = false,
@ -382,9 +390,20 @@ class SubscriptionFeedViewModel : ViewModel() {
// + sort over up to 500 items is too much for the UI
// thread. Then hop to Main only for the StateFlow emit.
// Round-67 audit HIGH-1.
//
// Re-read subs at the terminal step (round-68 audit
// HIGH-6): the snapshot captured at refresh-end may
// include channels the user has since unsubscribed from
// in the ~2s enrich window. Intersect so a freshly-
// unsubscribed channel doesn't briefly re-appear in the
// feed after the enrich emit.
val mergeChannels = Subscriptions.get().subs.value
.filter { it.url in channelsSnapshot.map { c -> c.url }.toSet() }
val merged = withContext(Dispatchers.Default) {
mergeFromCache(channelsSnapshot)
mergeFromCache(mergeChannels)
}
// Honor cancellation post-merge — round-68 audit HIGH-2.
coroutineContext.ensureActive()
_ui.update { it.copy(items = merged) }
}
}