audit-fix sprint: 12 findings from the max-effort adversarial pass

Threats closed (CRIT/HIGH):

- CRIT-1 (mail_move folder injection via uid_copy fallback):
  validate_mailbox() rejects CR/LF/NUL/"/\\ on every folder arg
  (list/read/search/thread/move). async-imap's uid_copy doesn't quote
  the destination — quoting metacharacters would have smuggled COPY
  targets. We refuse the characters outright rather than escape.

- HIGH-1 (mail_thread message_id backslash bypass): seed Message-ID
  rejection set extended from {", CR, LF} to {", \\, CR, LF, {}.
  A bare \\ inside the IMAP quoted-string would escape the closing
  quote and confuse the server's parser. { also opener of literal-form.

- CRIT-2 / HIGH (search literal-form): mail_search now rejects the
  IMAP {N} literal-form opener via has_imap_literal(). CR/LF were
  already blocked.

- HIGH-3 (strip_quotes asymmetric strip): only strips matching pairs.
  A password starting with " but lacking a closing " no longer
  silently loses its leading char.

- HIGH-4 (no attachment size cap): new MAX_ATTACHMENT_BYTES (25 MB
  decoded, matches Gmail), MAX_ATTACHMENTS (25), MAX_BODY_BYTES
  (5 MB on body + body_html), MAX_TOTAL_RECIPIENTS (100). Pre-decode
  bound on encoded base64 length prevents giant-payload OOM before
  the decode buffer allocates.

- HIGH-5 (raw_eml fetch unbounded): RFC822.SIZE pre-flight on
  mail_inbox_read refuses messages > MAX_RAW_EML_BYTES (20 MB) before
  the body transfer.

- HIGH-6 (flat headers map empties for structured variants):
  switched from h.value().as_text() (which returns None for Address /
  DateTime / ContentType / Received) to Message::header_raw(name)
  which returns the un-decoded header value as &str uniformly across
  all variants. Date / From / To / Subject / Content-Type /
  DKIM-Signature etc. all populate correctly now.

- HIGH-9 (password resolved after TLS handshake): resolve_password()
  now runs at the top of open_session(), before TCP connect, so a
  missing/unreadable credential errors before the IMAP server logs
  an unauthenticated session that fail2ban could pattern on.

MED/LOW:

- MED-8 mail_search tool description: clarifies that CR/LF + {N}
  literal-form are rejected but the query is otherwise raw — caller
  must not pass untrusted input.

- MED-10 ServerHandler instructions: lists all 7 tools (not just the
  original 3) and explains UID stability + BODY.PEEK posture.

- LOW-2 snippet_unused dead code: deleted.

Smoke verified 2026-05-21:
- send -> land -> read round trip clean
- headers map now shows Date / From / To / Subject / Message-ID /
  User-Agent / Content-Type / DKIM-Signature populated
- 4 injection probes all cleanly rejected: CR in folder, {5}hello
  search literal, message_id with \\, folder with "
- mail_move INBOX <-> Junk round trip clean

Findings explicitly verified NOT-exploitable by the audit (no code
change needed): lettre CR/LF filter on Subject/Message-Id, lettre
mailbox rfc2822 parser, MIME-boundary randomness, rustls hostname
verification, password leakage in error paths, MIME smuggling via
filename, format_imap_since negative-year bypass.

Deferred (separate follow-ups): session pool (MED-6), partial-body
fetch in mail_inbox_read (MED-9), canonical Flag display rendering
(LOW-3), JSON schema 0=default sentinel (LOW-5), config chmod check
(LOW-6), proper unit/integration test suite (INFO-3).
This commit is contained in:
Kayos 2026-05-21 07:38:43 -07:00
parent 4251f514e6
commit f4b3199e86
4 changed files with 191 additions and 25 deletions

View file

@ -117,11 +117,20 @@ impl Account {
} }
} }
/// Strip ONE matching pair of leading+trailing quote chars, never a half.
///
/// `"foo"` -> `foo`; `'foo'` -> `foo`. `"foo` -> `"foo` (unchanged — a
/// password with a leading `"` and no trailing close stays intact).
/// Mixed quotes (`"foo'`) stay untouched.
fn strip_quotes(s: &str) -> &str { fn strip_quotes(s: &str) -> &str {
let s = s.strip_prefix('"').unwrap_or(s); if s.len() >= 2 {
let s = s.strip_suffix('"').unwrap_or(s); let first = s.as_bytes()[0];
let s = s.strip_prefix('\'').unwrap_or(s); let last = s.as_bytes()[s.len() - 1];
s.strip_suffix('\'').unwrap_or(s) if (first == b'"' || first == b'\'') && first == last {
return &s[1..s.len() - 1];
}
}
s
} }
fn config_path() -> Result<PathBuf> { fn config_path() -> Result<PathBuf> {

View file

@ -74,7 +74,35 @@ pub struct FolderEntry {
const DEFAULT_LIMIT: u32 = 50; const DEFAULT_LIMIT: u32 = 50;
const MAX_LIMIT: u32 = 500; const MAX_LIMIT: u32 = 500;
const SNIPPET_LEN: usize = 240; /// Cap on raw_eml fetch size in `mail_inbox_read`. Anything larger refuses
/// the read with a hint to use `format=text` (which still pulls the body,
/// but doesn't double-buffer through `String::from_utf8_lossy`).
const MAX_RAW_EML_BYTES: u64 = 20 * 1024 * 1024;
/// Mailbox-name guard: reject any byte that would force us into IMAP
/// quoted-string or literal-form territory we don't control. CR/LF/NUL
/// could split commands; `\`/`"` would need escaping that async-imap's
/// `uid_copy` does not perform on the destination argument. We refuse
/// the small set rather than try to quote-and-escape on the wire.
fn validate_mailbox(name: &str) -> Result<()> {
if name.is_empty() {
return Err(anyhow!("mailbox name cannot be empty"));
}
for b in name.bytes() {
match b {
b'\r' | b'\n' | 0 => {
return Err(anyhow!("mailbox name contains CR/LF/NUL"));
}
b'"' | b'\\' => {
return Err(anyhow!(
"mailbox name contains `\"` or `\\` — refused; rename the folder server-side if needed"
));
}
_ => {}
}
}
Ok(())
}
// ============================================================================= // =============================================================================
// list // list
@ -82,6 +110,7 @@ const SNIPPET_LEN: usize = 240;
pub async fn list(account: &Account, opts: ListOpts) -> Result<Vec<ListEntry>> { pub async fn list(account: &Account, opts: ListOpts) -> Result<Vec<ListEntry>> {
let folder = opts.folder.as_deref().unwrap_or("INBOX"); let folder = opts.folder.as_deref().unwrap_or("INBOX");
validate_mailbox(folder)?;
let limit = match opts.limit { let limit = match opts.limit {
0 => DEFAULT_LIMIT, 0 => DEFAULT_LIMIT,
n if n > MAX_LIMIT => MAX_LIMIT, n if n > MAX_LIMIT => MAX_LIMIT,
@ -209,6 +238,7 @@ pub async fn read(
format: &str, format: &str,
) -> Result<ReadOutput> { ) -> Result<ReadOutput> {
let folder = folder.unwrap_or("INBOX"); let folder = folder.unwrap_or("INBOX");
validate_mailbox(folder)?;
let format = match format { let format = match format {
"text" | "html" | "raw_eml" => format, "text" | "html" | "raw_eml" => format,
other => { other => {
@ -224,6 +254,30 @@ pub async fn read(
.await .await
.with_context(|| format!("SELECT {folder}"))?; .with_context(|| format!("SELECT {folder}"))?;
// Pre-flight size check via RFC822.SIZE so we can refuse oversized fetches
// before we transfer the body. Mail-parser allocates O(message size); a
// 1 GB MIME message would OOM us.
{
let mut size_stream = session
.uid_fetch(uid.to_string(), "(UID RFC822.SIZE)")
.await
.with_context(|| format!("UID FETCH SIZE {uid}"))?;
let size_msg = size_stream
.next()
.await
.ok_or_else(|| anyhow!("no message at UID {uid} in {folder}"))?
.context("UID FETCH SIZE stream")?;
let size = size_msg.size.unwrap_or(0) as u64;
drop(size_stream);
if size > MAX_RAW_EML_BYTES {
session.logout().await.ok();
return Err(anyhow!(
"message UID {uid} is {size} bytes — refusing to fetch (cap is {MAX_RAW_EML_BYTES}). \
Use a more specific tool when we add partial-fetch in Phase C."
));
}
}
// BODY[] = full RFC822 message. We parse with mail-parser, then either // BODY[] = full RFC822 message. We parse with mail-parser, then either
// return the text part, html part, or raw. // return the text part, html part, or raw.
let mut stream = session let mut stream = session
@ -277,17 +331,26 @@ pub async fn read(
}) })
.collect(); .collect();
// Headers as a flat JSON map (last-write-wins on duplicates is fine for v0.1). // Headers as a flat JSON map. mail-parser's `HeaderValue::as_text()`
// returns None for structured variants (Address / DateTime / ContentType
// / Received) which would leave most "interesting" headers empty. We use
// `Message::header_raw(name)` instead — that returns the un-decoded
// header value as &str, uniform across all header types.
let mut headers = serde_json::Map::new(); let mut headers = serde_json::Map::new();
let mut seen = std::collections::HashSet::new();
for h in parsed.headers() { for h in parsed.headers() {
let name = h.name(); let name = h.name().to_string();
let val = h.value().as_text().map(|s| s.to_string()).unwrap_or_default(); if !seen.insert(name.clone()) {
headers.insert(name.to_string(), serde_json::Value::String(val)); continue; // duplicate header — keep the first occurrence
}
let raw = parsed
.header_raw(name.as_str())
.map(|s| s.trim().to_string())
.unwrap_or_default();
headers.insert(name, serde_json::Value::String(raw));
} }
let subject = parsed.subject().unwrap_or_default().to_string(); let subject = parsed.subject().unwrap_or_default().to_string();
let snippet_unused: String = body.chars().take(SNIPPET_LEN).collect();
let _ = snippet_unused; // suppress unused (kept structure-wise for symmetry)
Ok(ReadOutput { Ok(ReadOutput {
uid, uid,
@ -357,7 +420,17 @@ pub async fn search(
if query.contains('\r') || query.contains('\n') { if query.contains('\r') || query.contains('\n') {
return Err(anyhow!("search query must not contain CR or LF")); return Err(anyhow!("search query must not contain CR or LF"));
} }
// Reject IMAP literal-form `{N}` / `{N+}` patterns. These would let a
// caller switch the wire parser into literal mode and consume the
// following bytes raw — not a CRLF-bounded injection but still a
// command-shape surprise we don't want to enable.
if has_imap_literal(query) {
return Err(anyhow!(
"search query contains IMAP literal-form `{{N}}` syntax — refused"
));
}
let folder = folder.unwrap_or("INBOX"); let folder = folder.unwrap_or("INBOX");
validate_mailbox(folder)?;
let limit = match limit { let limit = match limit {
0 => DEFAULT_LIMIT, 0 => DEFAULT_LIMIT,
n if n > MAX_LIMIT => MAX_LIMIT, n if n > MAX_LIMIT => MAX_LIMIT,
@ -400,10 +473,18 @@ pub async fn thread(
if id_unbraced.is_empty() { if id_unbraced.is_empty() {
return Err(anyhow!("message_id is empty")); return Err(anyhow!("message_id is empty"));
} }
if id_unbraced.contains('"') || id_unbraced.contains('\r') || id_unbraced.contains('\n') { // Reject `"` (IMAP quoted-string terminator), `\` (IMAP quoted-string
return Err(anyhow!("message_id must not contain quotes, CR, or LF")); // escape), CR/LF (command terminators), and `{` (literal-form opener).
if id_unbraced
.chars()
.any(|c| matches!(c, '"' | '\\' | '\r' | '\n' | '{'))
{
return Err(anyhow!(
r#"message_id must not contain ", \, CR, LF, or {{"#
));
} }
let folder = folder.unwrap_or("INBOX"); let folder = folder.unwrap_or("INBOX");
validate_mailbox(folder)?;
let limit = match limit { let limit = match limit {
0 => DEFAULT_LIMIT, 0 => DEFAULT_LIMIT,
n if n > MAX_LIMIT => MAX_LIMIT, n if n > MAX_LIMIT => MAX_LIMIT,
@ -460,12 +541,8 @@ pub async fn move_msg(
to_folder: &str, to_folder: &str,
) -> Result<()> { ) -> Result<()> {
let from = from_folder.unwrap_or("INBOX"); let from = from_folder.unwrap_or("INBOX");
if to_folder.is_empty() { validate_mailbox(from)?;
return Err(anyhow!("to_folder cannot be empty")); validate_mailbox(to_folder)?;
}
if to_folder.contains('\r') || to_folder.contains('\n') {
return Err(anyhow!("to_folder must not contain CR or LF"));
}
let mut session = open_session(account).await?; let mut session = open_session(account).await?;
session session
@ -594,6 +671,11 @@ async fn open_session(
"plain-IMAP (no TLS) not supported in v0.1 — set imap_tls=true and imap_port=993" "plain-IMAP (no TLS) not supported in v0.1 — set imap_tls=true and imap_port=993"
)); ));
} }
// Resolve the password BEFORE the TCP+TLS handshake. If the credential
// is missing or unreadable we want the error before the server logs an
// unauthenticated session (which fail2ban could rate-limit on).
let password = account.resolve_password()?;
let addr = format!("{}:{}", account.imap_host, account.imap_port); let addr = format!("{}:{}", account.imap_host, account.imap_port);
let tcp = TcpStream::connect(&addr) let tcp = TcpStream::connect(&addr)
.await .await
@ -614,12 +696,28 @@ async fn open_session(
let client = async_imap::Client::new(tls); let client = async_imap::Client::new(tls);
// greeting was consumed by Client::new in async-imap >= 0.10 // greeting was consumed by Client::new in async-imap >= 0.10
let session = client let session = client
.login(&account.username, account.resolve_password()?) .login(&account.username, password)
.await .await
.map_err(|(e, _client)| anyhow!("imap login failed: {e}"))?; .map_err(|(e, _client)| anyhow!("imap login failed: {e}"))?;
Ok(session) Ok(session)
} }
/// Detect the IMAP literal-form `{N}` / `{N+}` / `{N+}\r\n` opener pattern.
/// We don't try to validate fully — any `{<digits>` substring is enough
/// signal to refuse a user-supplied search query.
fn has_imap_literal(s: &str) -> bool {
let mut chars = s.chars().peekable();
while let Some(c) = chars.next() {
if c == '{' {
// The next char must be an ASCII digit for it to be a literal opener.
if matches!(chars.peek(), Some(d) if d.is_ascii_digit()) {
return true;
}
}
}
false
}
fn rustls_roots() -> rustls::RootCertStore { fn rustls_roots() -> rustls::RootCertStore {
let mut roots = rustls::RootCertStore::empty(); let mut roots = rustls::RootCertStore::empty();
roots.extend(webpki_roots::TLS_SERVER_ROOTS.iter().cloned()); roots.extend(webpki_roots::TLS_SERVER_ROOTS.iter().cloned());

View file

@ -52,10 +52,58 @@ pub struct SendOutput {
const USER_AGENT: &str = concat!("mail-mcp/", env!("CARGO_PKG_VERSION")); const USER_AGENT: &str = concat!("mail-mcp/", env!("CARGO_PKG_VERSION"));
/// Hard caps to keep an MCP-driver-gone-wrong from OOM-ing the box.
/// Match Gmail's effective 25 MB per-message ceiling — any single attachment
/// past this is almost certainly unintended. Body caps are generous; HTML
/// bodies past 5 MB are a smell.
const MAX_ATTACHMENT_BYTES: usize = 25 * 1024 * 1024;
const MAX_ATTACHMENTS: usize = 25;
const MAX_BODY_BYTES: usize = 5 * 1024 * 1024;
const MAX_TOTAL_RECIPIENTS: usize = 100;
pub async fn send(account: &Account, input: SendInput) -> Result<SendOutput> { pub async fn send(account: &Account, input: SendInput) -> Result<SendOutput> {
if input.to.is_empty() { if input.to.is_empty() {
return Err(anyhow!("at least one `to` address required")); return Err(anyhow!("at least one `to` address required"));
} }
let total_recipients = input.to.len() + input.cc.len() + input.bcc.len();
if total_recipients > MAX_TOTAL_RECIPIENTS {
return Err(anyhow!(
"too many recipients: {total_recipients} (cap {MAX_TOTAL_RECIPIENTS})"
));
}
if input.body.len() > MAX_BODY_BYTES {
return Err(anyhow!(
"body too large: {} bytes (cap {MAX_BODY_BYTES})",
input.body.len()
));
}
if let Some(html) = &input.body_html {
if html.len() > MAX_BODY_BYTES {
return Err(anyhow!(
"body_html too large: {} bytes (cap {MAX_BODY_BYTES})",
html.len()
));
}
}
if input.attachments.len() > MAX_ATTACHMENTS {
return Err(anyhow!(
"too many attachments: {} (cap {MAX_ATTACHMENTS})",
input.attachments.len()
));
}
// Coarse pre-decode size estimate — base64 expands the payload ~33%,
// so the encoded string is ~4*N/3 of the decoded bytes. Refuse early
// if the encoded form alone exceeds 4/3 of MAX, saving a giant decode.
let max_encoded = (MAX_ATTACHMENT_BYTES * 4 + 2) / 3;
for att in &input.attachments {
if att.content_base64.len() > max_encoded {
return Err(anyhow!(
"attachment `{}` encoded length {} exceeds limit (decoded cap {MAX_ATTACHMENT_BYTES})",
att.filename,
att.content_base64.len()
));
}
}
// Build From, To, Cc, Bcc. // Build From, To, Cc, Bcc.
let from_str = format!("{} <{}>", account.from_name, account.from_addr); let from_str = format!("{} <{}>", account.from_name, account.from_addr);
@ -173,6 +221,13 @@ fn build_body(input: &SendInput) -> Result<MultiPart> {
let bytes = base64::engine::general_purpose::STANDARD let bytes = base64::engine::general_purpose::STANDARD
.decode(&att.content_base64) .decode(&att.content_base64)
.with_context(|| format!("attachment `{}`: invalid base64", att.filename))?; .with_context(|| format!("attachment `{}`: invalid base64", att.filename))?;
if bytes.len() > MAX_ATTACHMENT_BYTES {
return Err(anyhow!(
"attachment `{}` decoded to {} bytes — exceeds cap {MAX_ATTACHMENT_BYTES}",
att.filename,
bytes.len()
));
}
let content_type: ContentType = att.mime_type.parse().with_context(|| { let content_type: ContentType = att.mime_type.parse().with_context(|| {
format!( format!(
"attachment `{}`: invalid mime_type `{}`", "attachment `{}`: invalid mime_type `{}`",

View file

@ -280,7 +280,7 @@ impl MailService {
#[tool( #[tool(
name = "mail_search", name = "mail_search",
description = "Raw IMAP SEARCH passthrough against a folder (default INBOX). Examples: `SUBJECT \"invoice\"`, `FROM \"cobb@sulkta.com\"`, `SINCE 21-May-2026 UNSEEN`, `OR SUBJECT \"foo\" SUBJECT \"bar\"`. CR/LF in the query are rejected (anti-injection). Returns the same summary shape as mail_inbox_list, newest UID first." description = "Raw IMAP SEARCH passthrough against a folder (default INBOX). Examples: `SUBJECT \"invoice\"`, `FROM \"cobb@sulkta.com\"`, `SINCE 21-May-2026 UNSEEN`, `OR SUBJECT \"foo\" SUBJECT \"bar\"`. CR/LF and IMAP `{N}` literal-form are rejected, but the query is otherwise passed raw — do not pass untrusted input (an unbalanced `\"` can change which UIDs match). Returns the same summary shape as mail_inbox_list, newest UID first."
)] )]
async fn mail_search( async fn mail_search(
&self, &self,
@ -388,10 +388,14 @@ impl ServerHandler for MailService {
ServerInfo { ServerInfo {
capabilities: ServerCapabilities::builder().enable_tools().build(), capabilities: ServerCapabilities::builder().enable_tools().build(),
instructions: Some( instructions: Some(
"mail-mcp — Rust MCP server for Sulkta-hosted email. \ "mail-mcp — Rust MCP server for Sulkta-hosted email. Tools: \
Tools: mail_send, mail_inbox_list, mail_inbox_read. \ mail_send, mail_inbox_list, mail_inbox_read, mail_folder_list, \
Default account from config; pass `account` to switch. \ mail_search, mail_thread, mail_move. Default account from \
Reads use BODY.PEEK so they don't toggle the \\Seen flag." config; pass `account` to switch. Reads use BODY.PEEK so they \
don't toggle the \\Seen flag. UID is stable across SELECT; \
sequence numbers are not always address by UID. mail_search \
takes a raw IMAP SEARCH query; mail_thread walks the \
References + In-Reply-To chain."
.into(), .into(),
), ),
..Default::default() ..Default::default()