v0.1 wave 3 (steps 9+10): autonomous patch loop + production recipes

Step 9 — autonomous patch loop:
- patcher.py: clawdforge session → unified diff → worktree apply → verify recipe → push branch → open Gitea PR
- migration 007: patch_attempts (UNIQUE per finding+attempt, max 3 attempts)
- runner.py: post-parse hook fires patcher.maybe_draft_for_job when notify.auto_patch=true
- server.py: POST /jobs/{id}/patches, GET /patches, GET /patches/{id}
- digest.py: patch-drafted lines + open-follow-up count via Gitea PR state check
- mcp: crafting_table_draft_patch stub replaced with real implementation
- tests/test_patcher.py + tests/test_patches_api.py: 27 new tests

No auto-merge — patches stop at PR-open. Cobb merges.

Step 10 — production recipes:
- examples/recipes/clawdforge.json: 14 subprojects across all SDKs, audit nightly
- examples/recipes/cauldron.json: single Flask subproject, audit nightly
- examples/recipes/tradecraft.json: nightly audit, auto_patch=false (manual review)
- examples/register-all.sh: bulk-register helper with GITEA_TOKEN substitution
- README "Autonomous patch loop" + "First production recipes" sections

Tests: server 116→143, mcp 65→67. All green.

Spec: memory/spec-crafting-table.md
Kayos 2026-04-29 09:04:48 -07:00
parent ecb9d76e6d
commit 4eab869df0
17 changed files with 2752 additions and 78 deletions


@@ -42,3 +42,14 @@ CRAFTING_GC_AGE=86400
# CRAFTING_SMTP_PASS=
# CRAFTING_SMTP_FROM=crafting-table@sulkta.com
# CRAFTING_SMTP_TLS=1
# --- Autonomous patch loop (wave 3, optional) ------------------------------
# All four CRAFTING_CLAWDFORGE_* + CRAFTING_GITEA_* must be set for the
# patcher to come up. Missing any → patcher disabled, /jobs/{id}/patches
# returns 503. Runner hook silently no-ops.
# CRAFTING_CLAWDFORGE_URL=http://192.168.0.5:8800
# CRAFTING_CLAWDFORGE_TOKEN=cf_...
# CRAFTING_GITEA_URL=http://192.168.0.5:3001
# CRAFTING_GITEA_TOKEN=
# CRAFTING_PATCHER_MAX_ATTEMPTS=3
# CRAFTING_PATCHER_BRANCH_PREFIX=crafting-table/auto/

112
README.md

@@ -15,7 +15,7 @@ through clawdforge.
Spec: `Sulkta-Coop/openclaw-workspace/memory/spec-crafting-table.md` (LAN-only).
## Status — v0.1 step 7 of 10
## Status — v0.1 complete (10 of 10)
- [x] Step 1: Dockerfile + per-language smoke
- [x] Step 2: SQLite ledger + project registry
@@ -25,8 +25,8 @@ Spec: `Sulkta-Coop/openclaw-workspace/memory/spec-crafting-table.md` (LAN-only).
- [x] Step 6: Findings extraction + storage
- [x] Step 7: MCP server (stdio JSON-RPC, 8 tools) — see [mcp/README.md](mcp/README.md)
- [x] Step 8: Email digest scheduler
- [ ] Step 9: Autonomous patch loop (clawdforge integration)
- [ ] Step 10: Production recipes — clawdforge, cauldron, tradecraft
- [x] Step 9: Autonomous patch loop (clawdforge integration → unified diff → worktree apply → verify recipe → push branch → Gitea PR)
- [x] Step 10: Production recipes — clawdforge, cauldron, tradecraft (see [examples/recipes/](examples/recipes/))
## Toolchains in v0.1
@@ -71,6 +71,9 @@ override via `CRAFTING_LAN_CIDRS`.
| GET | `/jobs/{id}` | owner | State + last 200 log lines |
| GET | `/jobs/{id}/log` | owner | Full log (file stream) |
| GET | `/jobs/{id}/findings` | owner | Structured findings (see Findings) |
| POST | `/jobs/{id}/patches` | owner | Trigger an auto-patch attempt (wave 3) |
| GET | `/patches?project=&status=&limit=`| any | List own patch attempts |
| GET | `/patches/{id}` | owner | Patch attempt detail |
Cross-token access returns **404, not 403** — same existence-leak guard as
clawdforge sessions.
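The 404-instead-of-403 rule can be sketched as a single guard that raises the same error for "missing" and "not yours". This is a minimal illustration, not the server's code: `NotFound`, `Token`, and `ensure_visible` are hypothetical names introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass


class NotFound(Exception):
    """Stand-in for an HTTP 404 response (hypothetical)."""


@dataclass
class Token:
    name: str
    is_admin: bool = False


def ensure_visible(resource_owner: str | None, token: Token) -> None:
    """Raise NotFound for a missing resource AND for a foreign one.

    Using one error for both cases means a caller cannot probe which
    job or patch ids exist under other tokens.
    """
    if resource_owner is None:
        raise NotFound("not found")
    if not token.is_admin and resource_owner != token.name:
        raise NotFound("not found")
```

A caller holding token `bob` gets the same `NotFound` for a job owned by `alice` as for a job id that was never created.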
@@ -207,9 +210,15 @@ without lock contention. The runner is the only mutator of `jobs`/
│ ├── runner.py # async job pool + subprocess exec
│ ├── workspace.py # bare clone + worktree materialization + gc
│ ├── models.py # Pydantic schemas
│ ├── digest.py # email digest scheduler
│ ├── patcher.py # autonomous patch loop (clawdforge → diff → verify → PR)
│ ├── parsers/ # per-language Finding extractors
│ └── config.py # env-driven config
├── tests/ # pytest suite (~60 tests)
├── tests/ # pytest suite (143 tests)
├── mcp/ # crafting-table-mcp — MCP stdio bridge (separate pip install)
├── examples/
│ ├── recipes/ # production recipes — clawdforge, cauldron, tradecraft
│ └── register-all.sh # bulk-register helper
├── pyproject.toml
├── requirements.txt
└── .env.example
@@ -332,6 +341,101 @@ curl -sH "Authorization: Bearer $ADMIN" \
Idempotency: `digest_runs` table holds `UNIQUE(date, project_name)`, so the
06:00 loop is safe to re-fire on the same day — only the first call sends.
## Autonomous patch loop
Wave 3 wires crafting-table into clawdforge so a project with
`notify.auto_patch=true` gets an automatic patch attempt on every
actionable finding (lint with file/line; cve with a known fix). Lifecycle:
1. Runner finishes a job + parsers populate findings.
2. Post-job hook fires: pulls the highest-severity actionable finding,
reads ±20 lines of context from the worktree.
3. Patcher opens a clawdforge session (`POST /sessions`), sends one
turn with the finding + source context + project metadata, expects
`{"diff": ..., "explanation": ..., "confidence": ...}` back.
4. Diff applied to a fresh worktree on `crafting-table/auto/<job_id>-<finding_id>`.
Apply failure → status `apply_failed`.
5. Recipe re-runs against the patched worktree (the **verify** step).
Fail → `verify_failed`.
6. Pass → commit + push + open Gitea PR. Status `pr_opened`.
7. clawdforge session always closed.
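Steps 4 to 7 above can be sketched as a small control flow. This is an illustration under stated assumptions, not the contents of `patcher.py`: `run_patch_attempt` and its four callables are hypothetical stand-ins for the real apply, verify, PR, and session-close operations.

```python
from __future__ import annotations

from typing import Callable


def run_patch_attempt(
    apply_diff: Callable[[], bool],
    verify: Callable[[], bool],
    open_pr: Callable[[], str],
    close_session: Callable[[], None],
) -> dict:
    """Walk steps 4-7 and return the resulting status (hypothetical helper)."""
    try:
        if not apply_diff():  # step 4: diff did not apply to the worktree
            return {"status": "apply_failed", "pr_url": None}
        if not verify():  # step 5: recipe still fails on the patched tree
            return {"status": "verify_failed", "pr_url": None}
        # step 6: push + PR are folded into one callable in this sketch
        return {"status": "pr_opened", "pr_url": open_pr()}
    finally:
        close_session()  # step 7: runs on every path, including exceptions
```

Putting the session close in `finally` is one way to get the "always closed" guarantee from step 7 without repeating the call on each failure branch.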
Configuration (env vars):
```
CRAFTING_CLAWDFORGE_URL=http://192.168.0.5:8800
CRAFTING_CLAWDFORGE_TOKEN=cf_...
CRAFTING_GITEA_URL=http://192.168.0.5:3001
CRAFTING_GITEA_TOKEN=<gitea PAT>
CRAFTING_PATCHER_MAX_ATTEMPTS=3
CRAFTING_PATCHER_BRANCH_PREFIX=crafting-table/auto/
```
If any of the four required vars is missing, the patcher stays disabled
and `POST /jobs/{id}/patches` returns 503. The runner hook silently no-ops
in that case so existing job flow is unaffected.
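The all-or-nothing rule above can be sketched as a loader that returns `None` when any required variable is unset. This is a sketch only: `PatcherSettings` and `settings_from_env` are hypothetical names, and the real `PatcherConfig.from_env()` may differ in shape.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = (
    "CRAFTING_CLAWDFORGE_URL",
    "CRAFTING_CLAWDFORGE_TOKEN",
    "CRAFTING_GITEA_URL",
    "CRAFTING_GITEA_TOKEN",
)


@dataclass(frozen=True)
class PatcherSettings:  # hypothetical; not the project's PatcherConfig
    clawdforge_url: str
    clawdforge_token: str
    gitea_url: str
    gitea_token: str
    max_attempts: int = 3
    branch_prefix: str = "crafting-table/auto/"


def settings_from_env(env: Mapping[str, str] | None = None) -> PatcherSettings | None:
    """Return settings, or None if any of the four required vars is empty."""
    env = os.environ if env is None else env
    values = [env.get(key, "").strip() for key in REQUIRED]
    if not all(values):
        return None  # caller treats None as "patcher disabled"
    return PatcherSettings(
        *values,
        max_attempts=int(env.get("CRAFTING_PATCHER_MAX_ATTEMPTS", "3")),
        branch_prefix=env.get("CRAFTING_PATCHER_BRANCH_PREFIX", "crafting-table/auto/"),
    )
```

With a `None` result, the endpoint can answer 503 and the runner hook can return early, which matches the behaviour described above.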
**Verification cost matters.** The verify step re-runs the failing recipe
on the patched worktree, so for projects with multi-minute builds it
roughly doubles end-to-end latency. Set `notify.auto_patch=true` only for
projects whose audit/test recipe runs in under 5 minutes, or accept the
added latency. A v0.2 candidate is a "fast verify" mode that re-runs only
the specific lint that fired.
`patch_attempts` table holds every attempt with `UNIQUE(finding_id, attempt_number)`;
the loop early-exits at `max_attempts_per_finding` (default 3). No
auto-merge; PRs land for human review.
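A minimal sketch of how the `UNIQUE(finding_id, attempt_number)` constraint and the attempt cap interact, using an in-memory SQLite table with only the columns needed for the illustration. `make_db`, `record_attempt`, and `MAX_ATTEMPTS` are hypothetical names; the real table has more columns (see migration 007).

```python
from __future__ import annotations

import sqlite3

MAX_ATTEMPTS = 3  # mirrors the documented default


def make_db() -> sqlite3.Connection:
    """Create a trimmed-down patch_attempts table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE patch_attempts (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            finding_id INTEGER NOT NULL,
            attempt_number INTEGER NOT NULL,
            status TEXT NOT NULL,
            UNIQUE(finding_id, attempt_number)
        )
        """
    )
    return conn


def record_attempt(conn: sqlite3.Connection, finding_id: int, status: str) -> int | None:
    """Insert the next attempt for a finding; return None once the cap is hit."""
    used = conn.execute(
        "SELECT COUNT(*) FROM patch_attempts WHERE finding_id=?", (finding_id,)
    ).fetchone()[0]
    if used >= MAX_ATTEMPTS:
        return None  # early exit: no further attempts for this finding
    conn.execute(
        "INSERT INTO patch_attempts (finding_id, attempt_number, status) VALUES (?, ?, ?)",
        (finding_id, used + 1, status),
    )
    return used + 1
```

The count check enforces the cap, while the UNIQUE constraint is the backstop: two writers racing to record the same attempt number would leave one of them with `sqlite3.IntegrityError` instead of a duplicate row.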
Manual trigger:
```bash
curl -sH "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-X POST http://192.168.0.5:8810/jobs/$JOB/patches \
-d '{"finding_id": 42}' | jq .
# → {"ok": true, "attempt": {"status": "pr_opened", "pr_url": "...", ...}}
```
## First production recipes
Three recipes ship in `examples/recipes/`:
| Recipe | Subprojects | Schedule (audit) | auto_patch |
|---------------|-------------|------------------|------------|
| `clawdforge` | 14 (one per SDK + root) | nightly 02:00 | **true** |
| `cauldron` | 1 (Flask app, `.`) | nightly 02:00 | **true** |
| `tradecraft` | 1 (`.`) | nightly 02:00 | **false** (manual review) |
Each ships with a placeholder `REPLACE_WITH_GITEA_TOKEN` in `git_url`;
`examples/register-all.sh` substitutes `$GITEA_TOKEN` at register time so
no real token ever lands in the repo.
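The placeholder substitution can also be sketched in Python. This is an illustration, not a port of `register-all.sh`: the script runs `sed` over the whole file, whereas this hypothetical `render_recipe` touches only `git_url` and percent-encodes the token, which is an added assumption about what a URL-safe substitution would look like.

```python
import json
from urllib.parse import quote

PLACEHOLDER = "REPLACE_WITH_GITEA_TOKEN"


def render_recipe(raw: str, gitea_token: str) -> dict:
    """Return the recipe with the placeholder in git_url replaced.

    The token is percent-encoded so characters such as '/' or '@'
    cannot break the userinfo part of the URL.
    """
    if not gitea_token:
        raise ValueError("gitea_token must be non-empty")
    recipe = json.loads(raw)
    recipe["git_url"] = recipe["git_url"].replace(
        PLACEHOLDER, quote(gitea_token, safe="")
    )
    return recipe
```

The substituted recipe exists only in memory and in the request body, so the file on disk keeps its placeholder.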
Smoke procedure (post-deploy):
```
1. docker compose up -d
2. TOKEN=$(cat /mnt/user/appdata/crafting-table/data/admin-bearer.txt)
3. CRAFTING_TABLE_TOKEN=$TOKEN GITEA_TOKEN=<your-pat> bash examples/register-all.sh
4. curl -H "Authorization: Bearer $TOKEN" http://192.168.0.5:8810/projects \
→ expect 3 projects (clawdforge, cauldron, tradecraft)
5. curl -X POST -H "Authorization: Bearer $TOKEN" \
http://192.168.0.5:8810/projects/clawdforge/jobs \
-d '{"recipe":"test","subproject":"clients/python"}'
→ expect job_id
6. Poll GET /jobs/{job_id} until status terminal → expect succeeded
```
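Step 6 of the smoke procedure (poll until terminal) can be sketched as a loop with a deadline. The terminal status names here are an assumption: the text above only confirms `succeeded`, so `failed` is a guess, and `wait_for_job` with its injected `fetch_status` callable is hypothetical.

```python
import time
from typing import Callable

# Assumed terminal states; only "succeeded" is confirmed by the README.
TERMINAL = {"succeeded", "failed"}


def wait_for_job(
    fetch_status: Callable[[], str],
    *,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status until it reports a terminal state or time runs out."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout}s")
        sleep(interval)
```

In practice `fetch_status` would wrap `GET /jobs/{job_id}` and pull the status field out of the response; injecting `sleep` and `clock` keeps the loop testable without real waiting.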
Per-recipe smoke status (today, pre-deploy):
- `clawdforge`: 14 subprojects. `clients/python`, `clients/typescript`,
`clients/go`, and `clients/rust` are known clean from existing CI. ruby /
php / kotlin / java / csharp / swift compile cleanly today, but the step 1
smoke verified only toolchain availability inside the crafting-table
image. The bash subproject's `test/run.sh` may not exist (manual check
needed post-deploy).
- `cauldron`: single Flask subproject; pip-audit and pytest are known to run
cleanly from the cauldron repo's own CI history.
- `tradecraft`: single subproject; auto_patch is **off** by design
(production app, manual PR review only).
## MCP bridge
The `mcp/` subdirectory ships a self-contained `crafting-table-mcp` Python


@@ -127,6 +127,32 @@ MIGRATIONS: list[tuple[str, str]] = [
CREATE INDEX IF NOT EXISTS idx_digest_runs_date ON digest_runs(date);
""",
),
(
"007_patch_attempts",
"""
CREATE TABLE IF NOT EXISTS patch_attempts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
finding_id INTEGER NOT NULL,
job_id TEXT NOT NULL,
project_name TEXT NOT NULL,
attempt_number INTEGER NOT NULL,
status TEXT NOT NULL,
branch_name TEXT,
pr_url TEXT,
diff_excerpt TEXT,
session_id TEXT,
error TEXT,
created_at INTEGER NOT NULL,
finished_at INTEGER,
UNIQUE(finding_id, attempt_number),
FOREIGN KEY (finding_id) REFERENCES findings(id) ON DELETE CASCADE,
FOREIGN KEY (job_id) REFERENCES jobs(id) ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS idx_patch_attempts_status ON patch_attempts(status);
CREATE INDEX IF NOT EXISTS idx_patch_attempts_project ON patch_attempts(project_name);
CREATE INDEX IF NOT EXISTS idx_patch_attempts_finding ON patch_attempts(finding_id);
""",
),
]
# fmt: on
@@ -550,6 +576,132 @@ class DB:
).fetchall()
return [dict(r) for r in rows]
# ---------- patch attempts ----------------------------------------------
def get_finding(self, finding_id: int) -> dict | None:
with self._conn() as c:
row = c.execute(
"SELECT * FROM findings WHERE id=?", (int(finding_id),)
).fetchone()
return dict(row) if row else None
def list_findings_for_job(self, job_id: str) -> list[dict]:
"""Alias matching list_findings — kept for callers that prefer the
more explicit name."""
return self.list_findings(job_id)
def count_patch_attempts(self, finding_id: int) -> int:
with self._conn() as c:
row = c.execute(
"SELECT COUNT(*) AS n FROM patch_attempts WHERE finding_id=?",
(int(finding_id),),
).fetchone()
return int(row["n"]) if row else 0
def insert_patch_attempt(
self,
*,
finding_id: int,
job_id: str,
project_name: str,
attempt_number: int,
status: str,
branch_name: str | None = None,
pr_url: str | None = None,
diff_excerpt: str | None = None,
session_id: str | None = None,
error: str | None = None,
) -> int:
with self._conn() as c:
cur = c.execute(
"""
INSERT INTO patch_attempts
(finding_id, job_id, project_name, attempt_number, status,
branch_name, pr_url, diff_excerpt, session_id, error,
created_at, finished_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""",
(
int(finding_id),
job_id,
project_name,
int(attempt_number),
status,
branch_name,
pr_url,
diff_excerpt,
session_id,
error,
int(time.time()),
int(time.time()),
),
)
return int(cur.lastrowid)
def get_patch_attempt(self, attempt_id: int) -> dict | None:
with self._conn() as c:
row = c.execute(
"SELECT * FROM patch_attempts WHERE id=?", (int(attempt_id),)
).fetchone()
return dict(row) if row else None
def list_patch_attempts(
self,
*,
project_name: str | None = None,
status: str | None = None,
finding_id: int | None = None,
owner_token: str | None = None,
limit: int = 100,
) -> list[dict]:
sql = """
SELECT pa.* FROM patch_attempts pa
JOIN projects p ON p.name = pa.project_name
WHERE 1=1
"""
params: list = []
if project_name is not None:
sql += " AND pa.project_name=?"
params.append(project_name)
if status is not None:
sql += " AND pa.status=?"
params.append(status)
if finding_id is not None:
sql += " AND pa.finding_id=?"
params.append(int(finding_id))
if owner_token is not None:
sql += " AND p.owner_token=?"
params.append(owner_token)
sql += " ORDER BY pa.created_at DESC LIMIT ?"
params.append(int(limit))
with self._conn() as c:
rows = c.execute(sql, params).fetchall()
return [dict(r) for r in rows]
def list_patch_attempts_in_window(
self,
*,
window_start: int,
window_end: int,
project_name: str | None = None,
statuses: tuple[str, ...] | None = None,
) -> list[dict]:
"""Patch attempts created within [window_start, window_end]. Used by
the email digest to surface drafted patches in the daily summary."""
sql = "SELECT * FROM patch_attempts WHERE created_at >= ? AND created_at <= ?"
params: list = [int(window_start), int(window_end)]
if project_name is not None:
sql += " AND project_name=?"
params.append(project_name)
if statuses:
placeholders = ",".join("?" for _ in statuses)
sql += f" AND status IN ({placeholders})"
params.extend(statuses)
sql += " ORDER BY created_at"
with self._conn() as c:
rows = c.execute(sql, params).fetchall()
return [dict(r) for r in rows]
# ---------- async wrappers ----------------------------------------------
async def arun(self, fn, *args, **kwargs):


@@ -82,6 +82,25 @@ class SmtpConfig:
# --- helpers ----------------------------------------------------------------
def _parse_pr_url(pr_url: str) -> tuple[str, str, int] | None:
"""Pull (owner, repo, number) out of a Gitea-style PR URL.
Accepts URLs like ``http://192.168.0.5:3001/Sulkta-Coop/clawdforge/pulls/42``.
Returns None if the URL doesn't look right — caller treats that as
"can't determine state, assume open".
"""
try:
from urllib.parse import urlparse
u = urlparse(pr_url)
parts = [p for p in u.path.split("/") if p]
# owner/repo/pulls/N
if len(parts) >= 4 and parts[-2] in ("pulls", "issues"):
return parts[-4], parts[-3], int(parts[-1])
except (ValueError, TypeError):
return None
return None
def _job_event_tags(job: dict, findings: list[dict]) -> set[str]:
"""Map a job + its findings to notify.on event tags.
@@ -167,10 +186,16 @@ def _filter_for_project(jobs_with_findings: list[tuple[dict, list[dict]]], notif
# --- rendering --------------------------------------------------------------
def _render_text(date_str: str, sections: list[dict], full_log_url: str) -> str:
def _render_text(
date_str: str,
sections: list[dict],
full_log_url: str,
*,
open_followups: int = 0,
) -> str:
"""Build the text body. Matches the worked example in the spec."""
total_runs = sum(len(s["runs"]) for s in sections)
total_drafted = 0 # placeholder, wave 3
total_drafted = sum(len(s.get("patches", [])) for s in sections)
total_cves = sum(s["cves"] for s in sections)
subj_summary = f"{total_runs} build" + ("s" if total_runs != 1 else "")
lines = []
@@ -187,19 +212,32 @@ def _render_text(date_str: str, sections: list[dict], full_log_url: str) -> str:
lines.append(
f" {glyph} {proj_sub:<32s} {run['recipe']:<6s} {run['status']:<5s} ({run['summary']})"
)
for patch in s.get("patches", []):
if patch.get("branch_name"):
lines.append(
f" → patch drafted: branch {patch['branch_name']}"
)
if patch.get("pr_url"):
lines.append(f" → PR: {patch['pr_url']}")
lines.append("")
lines.append("Open follow-ups:")
lines.append(" - 0 unmerged auto-patches")
lines.append(f" - {open_followups} unmerged auto-patches")
lines.append(" - 0 manual review tickets in bugs.sulkta.com")
lines.append("")
lines.append(f"Full log: {full_log_url}")
return "\n".join(lines) + "\n"
def _render_html(date_str: str, sections: list[dict], full_log_url: str) -> str:
def _render_html(
date_str: str,
sections: list[dict],
full_log_url: str,
*,
open_followups: int = 0,
) -> str:
"""Build the HTML body. Same content, table styling, monospace font."""
total_runs = sum(len(s["runs"]) for s in sections)
total_drafted = 0
total_drafted = sum(len(s.get("patches", [])) for s in sections)
total_cves = sum(s["cves"] for s in sections)
rows = []
@@ -211,6 +249,16 @@ def _render_html(date_str: str, sections: list[dict], full_log_url: str) -> str:
f"<td>{run['recipe']}</td><td>{run['status']}</td>"
f"<td>{run['summary']}</td></tr>"
)
for patch in s.get("patches", []):
cell = ""
if patch.get("branch_name"):
cell += f"branch <code>{patch['branch_name']}</code>"
if patch.get("pr_url"):
cell += f" — <a href=\"{patch['pr_url']}\">PR</a>"
if cell:
rows.append(
f'<tr><td>↳</td><td colspan="4">{cell}</td></tr>'
)
if not rows:
rows.append('<tr><td colspan="5"><i>(no activity)</i></td></tr>')
@@ -234,7 +282,7 @@ tr td:first-child {{ font-size: 1.2em; }}
</table>
<h3>Open follow-ups</h3>
<ul>
<li>0 unmerged auto-patches</li>
<li>{open_followups} unmerged auto-patches</li>
<li>0 manual review tickets in bugs.sulkta.com</li>
</ul>
<p class="foot">Full log: <a href="{full_log_url}">{full_log_url}</a></p>
@@ -267,6 +315,7 @@ class DigestScheduler:
hour: int = 6,
minute: int = 0,
full_log_base_url: str = "http://192.168.0.5:8810/digests",
gitea_pr_state_check=None,
):
self.db = db
self.smtp = smtp
@@ -274,6 +323,10 @@ class DigestScheduler:
self.hour = hour
self.minute = minute
self.full_log_base_url = full_log_base_url
# Optional callable: (owner, repo, number) -> "open" | "closed" | None.
# Used to count open follow-ups across all PR-opened patches in the
# window. Tests inject a stub so we don't make real network calls.
self.gitea_pr_state_check = gitea_pr_state_check
self._loop_task: asyncio.Task | None = None
self._stopping = False
@@ -389,6 +442,8 @@ class DigestScheduler:
per_project_sections: list[dict] = []
per_project_meta: list[dict] = []
full_log_url = f"{self.full_log_base_url}/{date_str}"
# Total open follow-ups across all projects in the window.
open_followups_total = 0
for prow in projects:
recipe = json.loads(prow.get("recipe_json") or "{}")
@@ -421,10 +476,44 @@ class DigestScheduler:
})
cves += sum(1 for f in findings if f.get("kind") == "cve")
# Patch attempts for this project in the same window.
patch_rows = self.db.list_patch_attempts_in_window(
window_start=window_start,
window_end=window_end,
project_name=prow["name"],
statuses=("pushed", "pr_opened"),
)
patch_entries: list[dict] = []
for pa in patch_rows:
patch_entries.append({
"branch_name": pa.get("branch_name"),
"pr_url": pa.get("pr_url"),
"status": pa.get("status"),
})
# Count open follow-ups via Gitea state check (when configured).
if pa.get("status") == "pr_opened" and pa.get("pr_url"):
if self.gitea_pr_state_check is not None:
owner_repo_n = _parse_pr_url(pa["pr_url"])
if owner_repo_n is not None:
owner, repo, n = owner_repo_n
try:
state = self.gitea_pr_state_check(owner, repo, n)
except Exception as e:
log.warning(
"digest: gitea PR state check failed: %s", e
)
state = None
if state in (None, "open"):
open_followups_total += 1
else:
# Without a checker, treat all pr_opened rows as still open.
open_followups_total += 1
section = {
"project": prow["name"],
"runs": section_runs,
"cves": cves,
"patches": patch_entries,
}
meta = {
@@ -446,7 +535,7 @@ class DigestScheduler:
meta["skipped_reason"] = "no_recipients"
per_project_meta.append(meta)
continue
if not section_runs and not wants_summary:
if not section_runs and not patch_entries and not wants_summary:
meta["skipped_reason"] = "zero_activity"
per_project_meta.append(meta)
continue
@@ -454,8 +543,18 @@ class DigestScheduler:
per_project_sections.append(section)
per_project_meta.append(meta)
text_body = _render_text(date_str, per_project_sections, full_log_url)
html_body = _render_html(date_str, per_project_sections, full_log_url)
text_body = _render_text(
date_str,
per_project_sections,
full_log_url,
open_followups=open_followups_total,
)
html_body = _render_html(
date_str,
per_project_sections,
full_log_url,
open_followups=open_followups_total,
)
# Per-project send loop. Idempotency check via digest_runs UNIQUE.
for meta, section in zip(
@@ -470,11 +569,22 @@ class DigestScheduler:
continue
# Build a per-project-scoped body.
proj_text = _render_text(date_str, [section], full_log_url)
proj_html = _render_html(date_str, [section], full_log_url)
proj_text = _render_text(
date_str,
[section],
full_log_url,
open_followups=open_followups_total,
)
proj_html = _render_html(
date_str,
[section],
full_log_url,
open_followups=open_followups_total,
)
n_patches = len(section.get("patches", []))
subject = (
f"crafting-table digest — {date_str} "
f"({len(section['runs'])} runs, 0 patches drafted, {section['cves']} CVEs)"
f"({len(section['runs'])} runs, {n_patches} patches drafted, {section['cves']} CVEs)"
)
if dry_run or self.smtp is None:

1102
crafting_table/patcher.py Normal file

File diff suppressed because it is too large


@@ -48,6 +48,7 @@ from .models import (
Project,
TokenCreateRequest,
)
from .patcher import Patcher, PatcherConfig
from .runner import Runner
from .workspace import WorkspaceManager
@@ -77,6 +78,45 @@ runner: Runner = Runner(
_smtp_cfg: SmtpConfig | None = SmtpConfig.from_env()
digest_scheduler: DigestScheduler = DigestScheduler(db=db, smtp=_smtp_cfg)
# Patcher (wave 3): clawdforge + Gitea creds env-driven; if any required env
# var is missing, the patcher stays None and the runner hook short-circuits.
_patcher_cfg: PatcherConfig | None = PatcherConfig.from_env()
patcher: Patcher | None = (
Patcher(db=db, workspace=workspace, config=_patcher_cfg, runner=runner)
if _patcher_cfg is not None
else None
)
# Wire the patcher into the runner's post-job hook. The runner already runs
# the parser pipeline before this hook fires, so by the time we land here
# the findings rows for `job_id` are committed and pickable.
async def _maybe_auto_patch_hook(event: dict) -> None:
if patcher is None:
return
if event.get("findings_count", 0) <= 0:
return
project_row = await db.arun(db.get_project, event["project_name"])
if project_row is None:
return
try:
recipe = json.loads(project_row.get("recipe_json") or "{}")
except json.JSONDecodeError:
return
notify = recipe.get("notify") or {}
if not bool(notify.get("auto_patch")):
return
job = await db.arun(db.get_job, event["job_id"])
if job is None:
return
try:
await patcher.maybe_draft_for_job(job)
except Exception as e:
log.warning("patcher hook failed for job %s: %s", event["job_id"], e)
runner.add_hook(_maybe_auto_patch_hook)
# ---------- lifespan --------------------------------------------------------
@@ -473,6 +513,103 @@ async def get_job_findings(
return {"ok": True, "findings": findings}
# ---- /patches --------------------------------------------------------------
@app.post("/jobs/{id}/patches")
async def trigger_patch(
id: str,
request: Request,
authorization: Annotated[str | None, Header()] = None,
body: dict | None = None,
):
"""Manually trigger a patch attempt against a job.
body: {"finding_id": int | null}. If finding_id is null/absent we pick
the highest-severity actionable finding on the job.
Returns the resulting PatchAttempt as a dict. 503 if the patcher is
not configured (CRAFTING_CLAWDFORGE_URL/TOKEN/GITEA_URL/TOKEN missing).
"""
tok = auth.require_app(request, authorization)
job_row = await db.arun(db.get_job, id)
_job_visible(job_row, tok)
if patcher is None:
raise HTTPException(503, "patcher not configured")
body = body or {}
finding_id = body.get("finding_id")
if finding_id is not None and not isinstance(finding_id, int):
raise HTTPException(400, "finding_id must be an integer or null")
try:
attempt = await patcher.maybe_draft(id, finding_id=finding_id)
except Exception as e:
log.exception("patch trigger failed: %s", e)
raise HTTPException(500, f"patch attempt errored: {type(e).__name__}")
if attempt is None:
return {"ok": True, "attempt": None, "reason": "no_actionable_finding"}
return {"ok": True, "attempt": _patch_attempt_to_api(attempt)}
@app.get("/patches")
async def list_patches(
request: Request,
authorization: Annotated[str | None, Header()] = None,
project: str | None = None,
status: str | None = None,
limit: int = 100,
):
tok = auth.require_app(request, authorization)
owner = None if tok.is_admin else tok.name
rows = await db.arun(
db.list_patch_attempts,
project_name=project,
status=status,
owner_token=owner,
limit=max(1, min(limit, 500)),
)
return {"ok": True, "patches": rows}
@app.get("/patches/{id}")
async def get_patch(
id: int,
request: Request,
authorization: Annotated[str | None, Header()] = None,
):
tok = auth.require_app(request, authorization)
row = await db.arun(db.get_patch_attempt, int(id))
if row is None:
raise HTTPException(404, "patch attempt not found")
# Visibility-gate via the underlying project.
project_row = await db.arun(db.get_project, row["project_name"])
if project_row is None:
raise HTTPException(404, "patch attempt not found")
if not tok.is_admin and project_row["owner_token"] != tok.name:
raise HTTPException(404, "patch attempt not found")
return {"ok": True, "patch": row}
def _patch_attempt_to_api(attempt) -> dict:
"""Serialize a PatchAttempt dataclass to the wire shape."""
return {
"id": attempt.id,
"finding_id": attempt.finding_id,
"job_id": attempt.job_id,
"project_name": attempt.project_name,
"attempt_number": attempt.attempt_number,
"status": attempt.status,
"branch_name": attempt.branch_name,
"pr_url": attempt.pr_url,
"diff_excerpt": attempt.diff_excerpt,
"session_id": attempt.session_id,
"error": attempt.error,
}
# ---- /digests --------------------------------------------------------------


@@ -0,0 +1,19 @@
{
"name": "cauldron",
"git_url": "http://kayos:REPLACE_WITH_GITEA_TOKEN@192.168.0.5:3001/Sulkta-Coop/cauldron.git",
"default_branch": "main",
"languages": ["python"],
"subprojects": [
{
"path": ".",
"language": "python",
"build": "pip install -e .[test]",
"test": "pytest tests/",
"lint": "ruff check .",
"audit": "pip-audit",
"timeout_secs": 600
}
],
"schedule": {"audit": "0 2 * * *", "test": "0 */6 * * *"},
"notify": {"email": ["cobb@sulkta.com"], "on": ["audit_fail", "test_fail", "cve_found", "patch_drafted"], "auto_patch": true}
}


@@ -0,0 +1,24 @@
{
"name": "clawdforge",
"git_url": "http://kayos:REPLACE_WITH_GITEA_TOKEN@192.168.0.5:3001/Sulkta-Coop/clawdforge.git",
"default_branch": "main",
"languages": ["python", "rust", "go", "ruby", "php", "java", "csharp", "swift", "kotlin", "c", "cpp", "bash", "typescript", "mcp"],
"subprojects": [
{"path": "clients/python", "language": "python", "build": "pip install -e .[test]", "test": "pytest tests/", "lint": "ruff check . && mypy --strict src/", "audit": "pip-audit", "timeout_secs": 600},
{"path": "clients/rust", "language": "rust", "build": "cargo build --release", "test": "cargo test --all", "lint": "cargo clippy --all-targets -- -D warnings && cargo fmt --check", "audit": "cargo audit", "timeout_secs": 1200},
{"path": "clients/go", "language": "go", "build": "go build ./...", "test": "go test ./...", "lint": "go vet ./...", "audit": "govulncheck ./...", "timeout_secs": 600},
{"path": "clients/typescript", "language": "typescript", "build": "npm install --no-audit", "test": "node --test --import tsx tests/*.test.ts", "lint": "npx tsc --noEmit", "audit": "npm audit", "timeout_secs": 600},
{"path": "clients/ruby", "language": "ruby", "build": "bundle install", "test": "bundle exec rake test", "lint": null, "audit": "bundler-audit", "timeout_secs": 600},
{"path": "clients/php", "language": "php", "build": "composer install", "test": "vendor/bin/phpunit", "lint": null, "audit": "composer audit", "timeout_secs": 600},
{"path": "clients/java", "language": "java", "build": "mvn package -DskipTests", "test": "mvn test", "lint": "mvn javadoc:javadoc -Dquiet=false", "audit": null, "timeout_secs": 1200},
{"path": "clients/csharp", "language": "csharp", "build": "dotnet build -c Release", "test": "dotnet test -c Release", "lint": null, "audit": "dotnet list package --vulnerable --include-transitive", "timeout_secs": 900},
{"path": "clients/c", "language": "c", "build": "cmake -S . -B build && cmake --build build", "test": "ctest --test-dir build --output-on-failure", "lint": null, "audit": null, "timeout_secs": 900},
{"path": "clients/cpp", "language": "cpp", "build": "cmake -S . -B build && cmake --build build", "test": "ctest --test-dir build --output-on-failure", "lint": null, "audit": null, "timeout_secs": 900},
{"path": "clients/kotlin", "language": "kotlin", "build": "./gradlew --no-daemon build", "test": "./gradlew --no-daemon test", "lint": null, "audit": null, "timeout_secs": 1800},
{"path": "clients/bash", "language": "bash", "build": null, "test": "bash test/run.sh", "lint": "shellcheck cf", "audit": null, "timeout_secs": 300},
{"path": "clients/mcp", "language": "python", "build": "pip install -e .", "test": "pytest tests/", "lint": null, "audit": null, "timeout_secs": 300},
{"path": ".", "language": "python", "build": "pip install -e .", "test": "pytest tests/", "lint": null, "audit": null, "timeout_secs": 600}
],
"schedule": {"audit": "0 2 * * *", "test": "0 8 * * *"},
"notify": {"email": ["cobb@sulkta.com"], "on": ["audit_fail", "test_fail", "cve_found", "patch_drafted"], "auto_patch": true}
}


@@ -0,0 +1,19 @@
{
"name": "tradecraft",
"git_url": "http://kayos:REPLACE_WITH_GITEA_TOKEN@192.168.0.5:3001/TradeCraft/tradecraft.git",
"default_branch": "main",
"languages": ["python"],
"subprojects": [
{
"path": ".",
"language": "python",
"build": "pip install -e .",
"test": "pytest tests/",
"lint": "ruff check .",
"audit": "pip-audit",
"timeout_secs": 900
}
],
"schedule": {"audit": "0 2 * * *"},
"notify": {"email": ["cobb@sulkta.com"], "on": ["audit_fail", "cve_found"], "auto_patch": false}
}

48
examples/register-all.sh Executable file

@@ -0,0 +1,48 @@
#!/bin/bash
# Register all example recipes against a running crafting-table instance.
#
# Reads the bearer token from $CRAFTING_TABLE_TOKEN, falling back to
# /data/admin-bearer.txt (the path inside the container) if unset. The
# admin bearer file is also bind-mounted at
# /mnt/user/appdata/crafting-table/data/admin-bearer.txt on the Lucy host
# — that's the recommended source on the host side.
#
# IMPORTANT: the recipe JSON files in recipes/ ship with a placeholder
# git_url containing "REPLACE_WITH_GITEA_TOKEN". This script substitutes
# $GITEA_TOKEN into each recipe before posting, so the real token is
# never committed to the repo.
set -euo pipefail
BASE_URL=${CRAFTING_TABLE_URL:-http://192.168.0.5:8810}
TOKEN=${CRAFTING_TABLE_TOKEN:-$(cat /data/admin-bearer.txt 2>/dev/null || echo "")}
GITEA_TOKEN=${GITEA_TOKEN:-}
if [ -z "$TOKEN" ]; then
echo "no crafting-table token (set CRAFTING_TABLE_TOKEN or ensure /data/admin-bearer.txt exists)" >&2
exit 1
fi
if [ -z "$GITEA_TOKEN" ]; then
echo "no Gitea token (set GITEA_TOKEN to substitute into recipe git_url)" >&2
exit 1
fi
DIR="$(dirname "$0")/recipes"
for recipe in "$DIR"/*.json; do
name="$(basename "$recipe" .json)"
echo "registering $name from $recipe..."
body="$(sed "s|REPLACE_WITH_GITEA_TOKEN|$GITEA_TOKEN|g" "$recipe")"
code=$(printf '%s' "$body" | curl -s -o /tmp/register-resp.json \
-w "%{http_code}" \
-X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
--data-binary @- \
"$BASE_URL/projects" || true)
if [ "$code" = "200" ]; then
echo " ok"
elif [ "$code" = "409" ]; then
echo " already exists — use PUT /projects/$name to update"
else
echo " FAILED ($code): $(cat /tmp/register-resp.json 2>/dev/null || echo no-body)"
fi
done


@@ -374,3 +374,30 @@ class CraftingTableClient:
raise ValueError("job_id must be non-empty")
slug = quote(job_id, safe="")
return self._get_text(f"/jobs/{slug}/log")
def trigger_patch(
self, job_id: str, finding_id: int | None = None
) -> dict:
"""POST /jobs/{id}/patches — autonomous patch loop trigger.
Returns the wire shape ``{"ok": bool, "attempt": <PatchAttempt>}``
from the server. ``attempt`` may be ``None`` when the job has no
actionable findings.
"""
if not job_id:
raise ValueError("job_id must be non-empty")
if finding_id is not None and not isinstance(finding_id, int):
raise ValueError("finding_id must be an integer or None")
slug = quote(job_id, safe="")
body: dict[str, Any] = {}
if finding_id is not None:
body["finding_id"] = int(finding_id)
payload = self._request(
"POST", f"/jobs/{slug}/patches", json_body=body
)
if not isinstance(payload, dict):
raise CraftingTableError(
f"unexpected POST /jobs/{{id}}/patches response type: "
f"{type(payload).__name__}"
)
return payload


@ -9,8 +9,9 @@ Eight tools are exposed (per spec ``memory/spec-crafting-table.md``):
- ``crafting_table_run_test`` kick off a ``test`` recipe job.
- ``crafting_table_get_job`` fetch job state + log tail.
- ``crafting_table_get_findings`` fetch structured findings.
- ``crafting_table_draft_patch`` wave-3 stub; returns "not yet
implemented" so the tool surface is stable but no work happens.
- ``crafting_table_draft_patch`` autonomous patch loop trigger
(wave 3); calls ``POST /jobs/{id}/patches`` and returns the resulting
``PatchAttempt``.
Admin endpoints (``/admin/tokens``) are intentionally NOT exposed. Token
minting is a human-gated operation; an LLM client has no business poking at
@ -279,14 +280,17 @@ def _tool_definitions() -> list[types.Tool]:
types.Tool(
name=TOOL_DRAFT_PATCH,
description=(
"Draft a patch (unified diff) addressing one or more "
"findings on a job. WAVE 2B STUB — full implementation "
"lands in wave 3 / step 9 of the v0.1 plan. Today this tool "
"is callable but only returns a 'not yet implemented' "
"message; the surface exists so tool catalogues stay stable "
"across waves. Once shipped, the patch will be drafted via "
"clawdforge and applied to a worktree, with a Gitea PR "
"opened on the configured branch."
"Draft a patch (unified diff) addressing one finding on a "
"job. The server opens a clawdforge session, asks the model "
"for a unified diff, applies it to a fresh worktree, "
"re-runs the failing recipe to verify, and on success "
"pushes a branch and opens a Gitea PR. No auto-merge — "
"review and merge manually. Returns a PatchAttempt with "
"{status, branch_name, pr_url, error}; status ranges over "
"drafted/apply_failed/verify_failed/pushed/pr_opened/"
"max_attempts_exceeded. 503 if the patcher isn't "
"configured. v0.1 supports lint and cve findings; "
"test_fail is v0.2."
),
inputSchema={
"type": "object",
@ -301,8 +305,8 @@ def _tool_definitions() -> list[types.Tool]:
"minimum": 1,
"description": (
"Optional specific finding id. If omitted, "
"drafts patches for all open findings on the "
"job."
"the server picks the highest-severity "
"actionable finding on the job."
),
},
},
@ -555,9 +559,6 @@ async def _dispatch(
)
if name == TOOL_DRAFT_PATCH:
# Wave-2B stub: validate args lightly, return a stable message.
# Once wave-3 lands this whole branch becomes a real call to a
# /jobs/{id}/patch endpoint that drafts via clawdforge.
job_id = args.get("job_id")
if not isinstance(job_id, str) or not job_id:
return _err_content("missing or empty 'job_id' argument"), True
@ -567,23 +568,35 @@ async def _dispatch(
and not isinstance(finding_id, bool)
):
return _err_content("'finding_id' must be an integer"), True
return (
_ok_content(
{
"ok": False,
"pending": True,
"message": (
"draft patch — not yet implemented (lands in "
"wave 3 / step 9). The tool surface is stable; "
"callers can keep referencing it. Today no "
"patch is drafted."
),
"job_id": job_id,
"finding_id": finding_id,
}
),
False,
try:
payload = await asyncio.to_thread(
ct.trigger_patch, job_id, finding_id
)
except ValueError as ve:
return _err_content(str(ve)), True
attempt = payload.get("attempt") if isinstance(payload, dict) else None
if attempt is None:
prose = (
f"no actionable finding on job {job_id} — patcher "
f"declined to draft. Check "
f"crafting_table_get_findings to confirm or pass an "
f"explicit finding_id."
)
return _two_block_content(prose, payload), False
status = attempt.get("status", "?")
branch = attempt.get("branch_name") or "(no branch)"
pr_url = attempt.get("pr_url") or "(no PR)"
err = attempt.get("error") or ""
prose_parts = [
f"patch attempt #{attempt.get('attempt_number')} for finding "
f"{attempt.get('finding_id')} on job {job_id}: status={status}",
f"branch={branch}",
f"pr={pr_url}",
]
if err:
prose_parts.append(f"error: {err}")
prose = "\n".join(prose_parts)
return _two_block_content(prose, payload), False
return _err_content(f"unknown tool: {name}"), True


@ -572,43 +572,99 @@ class TestGetFindings(unittest.TestCase):
self.assertIn("not found", content[0].text)
class TestDraftPatchStub(unittest.TestCase):
"""Wave 2B stub: tool surface present, but returns a 'pending' message."""
class TestDraftPatch(unittest.TestCase):
"""Wave 3: real call to POST /jobs/{id}/patches; two-block return."""
def test_returns_pending_message(self) -> None:
@responses.activate
def test_pr_opened_two_block_return(self) -> None:
"""Server returns a pr_opened attempt → MCP returns prose + JSON."""
responses.add(
responses.POST,
f"{BASE_URL}/jobs/j-1/patches",
json={
"ok": True,
"attempt": {
"id": 7,
"finding_id": 42,
"job_id": "j-1",
"project_name": "demo",
"attempt_number": 1,
"status": "pr_opened",
"branch_name": "crafting-table/auto/j-1-42",
"pr_url": "http://192.168.0.5:3001/X/Y/pulls/9",
"diff_excerpt": "--- a/x\n+++ b/x",
"session_id": "s-1",
"error": None,
},
},
status=200,
)
c = _client()
try:
content, is_error = _run(
_dispatch(c, TOOL_DRAFT_PATCH, {"job_id": "j-1"})
)
finally:
c.close()
self.assertFalse(is_error)
# Two-content-block return: prose + JSON.
self.assertEqual(len(content), 2)
prose = content[0].text
self.assertIn("pr_opened", prose)
self.assertIn("crafting-table/auto/j-1-42", prose)
self.assertIn("/pulls/9", prose)
body = json.loads(content[1].text)
self.assertTrue(body["ok"])
self.assertEqual(body["attempt"]["status"], "pr_opened")
@responses.activate
def test_no_actionable_finding(self) -> None:
responses.add(
responses.POST,
f"{BASE_URL}/jobs/j-1/patches",
json={"ok": True, "attempt": None, "reason": "no_actionable_finding"},
status=200,
)
c = _client()
try:
content, is_error = _run(
_dispatch(c, TOOL_DRAFT_PATCH, {"job_id": "j-1"})
)
finally:
c.close()
self.assertFalse(is_error)
self.assertIn("no actionable finding", content[0].text)
@responses.activate
def test_with_finding_id_passes_through(self) -> None:
responses.add(
responses.POST,
f"{BASE_URL}/jobs/j-1/patches",
json={
"ok": True,
"attempt": {
"id": 1, "finding_id": 42, "job_id": "j-1",
"project_name": "demo", "attempt_number": 1,
"status": "drafted", "branch_name": None, "pr_url": None,
"diff_excerpt": None, "session_id": None,
"error": "malformed_response",
},
},
status=200,
)
c = _client()
try:
content, is_error = _run(
_dispatch(
c,
TOOL_DRAFT_PATCH,
{"job_id": "j-1"},
c, TOOL_DRAFT_PATCH, {"job_id": "j-1", "finding_id": 42}
)
)
finally:
c.close()
self.assertFalse(is_error)
body = json.loads(content[0].text)
self.assertFalse(body["ok"])
self.assertTrue(body["pending"])
self.assertIn("not yet implemented", body["message"])
self.assertIn("wave 3", body["message"])
def test_with_finding_id(self) -> None:
c = _client()
try:
content, is_error = _run(
_dispatch(
c,
TOOL_DRAFT_PATCH,
{"job_id": "j-1", "finding_id": 42},
)
)
finally:
c.close()
self.assertFalse(is_error)
body = json.loads(content[0].text)
self.assertEqual(body["finding_id"], 42)
body = json.loads(content[1].text)
self.assertEqual(body["attempt"]["finding_id"], 42)
self.assertEqual(body["attempt"]["status"], "drafted")
def test_rejects_bool_finding_id(self) -> None:
# bool is a subclass of int — defense-in-depth.
@ -616,9 +672,7 @@ class TestDraftPatchStub(unittest.TestCase):
try:
content, is_error = _run(
_dispatch(
c,
TOOL_DRAFT_PATCH,
{"job_id": "j-1", "finding_id": True},
c, TOOL_DRAFT_PATCH, {"job_id": "j-1", "finding_id": True}
)
)
finally:
@ -635,6 +689,24 @@ class TestDraftPatchStub(unittest.TestCase):
self.assertTrue(is_error)
self.assertIn("job_id", content[0].text)
@responses.activate
def test_503_when_patcher_disabled(self) -> None:
responses.add(
responses.POST,
f"{BASE_URL}/jobs/j-1/patches",
json={"detail": "patcher not configured"},
status=503,
)
c = _client()
try:
content, is_error = _run(
_dispatch(c, TOOL_DRAFT_PATCH, {"job_id": "j-1"})
)
finally:
c.close()
self.assertTrue(is_error)
self.assertIn("503", content[0].text)
class TestUnknownTool(unittest.TestCase):
def test_unknown_tool_returns_error(self) -> None:


@ -16,6 +16,7 @@ dependencies = [
"fastapi>=0.115,<1.0",
"uvicorn[standard]>=0.30,<1.0",
"pydantic>=2.7,<3.0",
"httpx>=0.27,<1.0",
]
[project.optional-dependencies]


@ -1,3 +1,4 @@
fastapi==0.115.5
uvicorn[standard]==0.32.1
pydantic==2.9.2
httpx>=0.27,<1.0

tests/test_patcher.py (new file, 545 lines)

@ -0,0 +1,545 @@
"""Patcher unit tests — drafted/apply_failed/verify_failed/pushed/pr_opened
status transitions plus the runner hook integration.
We mock the clawdforge + Gitea wires (no real network calls) and stub the
runner._exec_recipe so the verify step is deterministic. Diff applying
uses real git in a temp worktree; this catches the wire-up issues that
pure unit tests miss.
"""
from __future__ import annotations
import asyncio
import json
import shutil
import subprocess
import time
from pathlib import Path
from unittest.mock import AsyncMock, MagicMock
import pytest
from crafting_table.db import DB
from crafting_table.patcher import (
ClawdforgeClient,
GiteaClient,
Patcher,
PatcherConfig,
extract_diff_json,
findings_were_actionable,
turn_text,
)
from crafting_table.workspace import WorkspaceManager
# ---------- helpers ---------------------------------------------------------
def _make_origin_repo(root: Path, *, file_text: str = "hello\nworld\n") -> str:
"""Create a bare-cloneable origin repo with a tracked file the patch
will rewrite."""
if shutil.which("git") is None:
pytest.skip("git binary not present")
origin = root / "origin.git"
work = root / "origin-work"
work.mkdir()
subprocess.run(["git", "init", "-q", "-b", "main"], cwd=work, check=True)
subprocess.run(["git", "config", "user.email", "test@example"], cwd=work, check=True)
subprocess.run(["git", "config", "user.name", "test"], cwd=work, check=True)
subprocess.run(["git", "config", "commit.gpgsign", "false"], cwd=work, check=True)
(work / "src").mkdir()
(work / "src" / "app.py").write_text(file_text)
subprocess.run(["git", "add", "."], cwd=work, check=True)
subprocess.run(["git", "commit", "-q", "-m", "init"], cwd=work, check=True)
# Bare clone so push works.
subprocess.run(
["git", "clone", "--bare", str(work), str(origin)],
check=True,
capture_output=True,
)
# Add the bare clone as a remote named "bare" on the work repo so subsequent fetches in tests work.
subprocess.run(
["git", "remote", "add", "bare", str(origin)],
cwd=work, check=True, capture_output=True,
)
return str(origin)
def _seed_project_and_job(
db: DB,
*,
project_name: str,
git_url: str,
findings: list[dict] | None = None,
auto_patch: bool = True,
) -> tuple[str, int | None]:
"""Insert a project + a job + (optionally) one finding. Returns
(job_id, finding_id_or_None)."""
# Project
db.insert_token(name="alpha", bearer="ct_alpha", is_admin=False, ip_cidrs=None)
recipe = {
"languages": ["python"],
"subprojects": [
{
"path": ".",
"language": "python",
"lint": "echo 'lint ok'",
"timeout_secs": 30,
}
],
"schedule": {},
"notify": {"email": ["x@y"], "on": [], "auto_patch": auto_patch},
}
db.upsert_project(
name=project_name,
git_url=git_url,
default_branch="main",
recipe_json=json.dumps(recipe),
owner_token="alpha",
)
# Job
snapshot = {
"git_url": git_url,
"default_branch": "main",
"languages": ["python"],
"subprojects": recipe["subprojects"],
}
job_id = "job-1"
db.insert_job(
job_id=job_id,
project_name=project_name,
subproject_path=".",
recipe="lint",
branch="main",
log_path="/tmp/_x.log",
recipe_snapshot_json=json.dumps(snapshot),
)
db.mark_job_finished(job_id=job_id, status="failed", exit_code=1)
finding_id = None
for f in findings or []:
finding_id = db.insert_finding(
job_id=job_id,
kind=f.get("kind", "lint"),
severity=f.get("severity", "warn"),
message=f.get("message", "msg"),
fingerprint=f.get("fingerprint", "abcdef0123456789"),
file=f.get("file"),
line=f.get("line"),
code=f.get("code"),
suggested_fix=f.get("suggested_fix"),
raw_json=None,
)
return job_id, finding_id
def _patcher_with_mocks(db: DB, workspace: WorkspaceManager, *, runner=None):
"""Build a Patcher with mocked clawdforge + Gitea clients. Returns
(patcher, claw_mock, gitea_mock) so tests can assert on call counts.
"""
cfg = PatcherConfig(
clawdforge_base_url="http://cf.local",
clawdforge_token="cf_x",
gitea_base_url="http://gitea.local",
gitea_token="gt_x",
max_attempts_per_finding=3,
)
claw = MagicMock(spec=ClawdforgeClient)
claw.create_session = AsyncMock(return_value={"session_id": "s-1"})
claw.turn = AsyncMock()
claw.close_session = AsyncMock()
gitea = MagicMock(spec=GiteaClient)
gitea.open_pr = AsyncMock(
return_value={"html_url": "http://192.168.0.5:3001/X/Y/pulls/1"}
)
p = Patcher(
db=db,
workspace=workspace,
config=cfg,
runner=runner,
clawdforge=claw,
gitea=gitea,
)
return p, claw, gitea
def _diff_for(file_rel: str, *, old: str, new: str) -> str:
"""Build a unified diff that real git apply will accept against a
file containing exactly `old`. Format matches `git diff` output."""
return (
f"diff --git a/{file_rel} b/{file_rel}\n"
f"--- a/{file_rel}\n"
f"+++ b/{file_rel}\n"
f"@@ -1,{len(old.splitlines())} +1,{len(new.splitlines())} @@\n"
+ "\n".join(f"-{l}" for l in old.splitlines()) + "\n"
+ "\n".join(f"+{l}" for l in new.splitlines()) + "\n"
)
# ---------- helper-fn unit tests ------------------------------------------
def test_findings_were_actionable_lint_with_locator():
assert findings_were_actionable([
{"kind": "lint", "file": "x.py", "line": 1}
])
def test_findings_were_actionable_lint_without_locator():
assert not findings_were_actionable([
{"kind": "lint", "file": None, "line": None}
])
def test_findings_were_actionable_test_fail_skipped():
# test_fail is NOT actionable in v0.1
assert not findings_were_actionable([
{"kind": "test_fail", "file": "x.py", "line": 1}
])
def test_findings_were_actionable_cve():
assert findings_were_actionable([
{"kind": "cve", "code": "RUSTSEC-1", "suggested_fix": "bump"}
])
def test_extract_diff_json_plain():
obj = extract_diff_json('{"diff": "x", "explanation": "y"}')
assert obj == {"diff": "x", "explanation": "y"}
def test_extract_diff_json_fenced():
obj = extract_diff_json('```json\n{"diff": "x", "explanation": "y"}\n```')
assert obj is not None
assert obj["diff"] == "x"
def test_extract_diff_json_returns_none_on_garbage():
assert extract_diff_json("not even json") is None
def test_turn_text_concatenates_text_events():
assert turn_text({"events": [
{"type": "text", "content": "hello "},
{"type": "tool_call"},
{"type": "text", "content": "world"},
]}) == "hello world"
# ---------- patcher pipeline tests -----------------------------------------
@pytest.mark.asyncio
async def test_drafts_via_clawdforge_session(db_only, tmp_path):
"""First-light test: malformed JSON from the model leaves the attempt
in status=drafted with error=malformed_response."""
git_url = _make_origin_repo(tmp_path)
workspace = WorkspaceManager(tmp_path / "ws")
job_id, finding_id = _seed_project_and_job(
db_only,
project_name="demo",
git_url=git_url,
findings=[{
"kind": "lint", "severity": "warn", "code": "F401",
"file": "src/app.py", "line": 1, "message": "bad",
}],
)
p, claw, gitea = _patcher_with_mocks(db_only, workspace)
# Model returns prose without JSON.
claw.turn.return_value = {
"events": [{"type": "text", "content": "I cannot help with that"}]
}
attempt = await p.maybe_draft(job_id, finding_id=finding_id)
assert attempt is not None
assert attempt.status == "drafted"
assert attempt.error == "malformed_response"
assert claw.create_session.await_count == 1
assert claw.close_session.await_count == 1
@pytest.mark.asyncio
async def test_apply_failed_when_diff_rejects(db_only, tmp_path):
git_url = _make_origin_repo(tmp_path)
workspace = WorkspaceManager(tmp_path / "ws")
job_id, finding_id = _seed_project_and_job(
db_only, project_name="demo", git_url=git_url,
findings=[{
"kind": "lint", "severity": "warn", "code": "F401",
"file": "src/app.py", "line": 1, "message": "x",
}],
)
p, claw, gitea = _patcher_with_mocks(db_only, workspace)
# Diff with wrong line numbers (the file is 2 lines, this hits line 999).
bad_diff = (
"diff --git a/src/app.py b/src/app.py\n"
"--- a/src/app.py\n"
"+++ b/src/app.py\n"
"@@ -999,1 +999,1 @@\n"
"-nonexistent\n"
"+something else\n"
)
claw.turn.return_value = {
"events": [{"type": "text", "content": json.dumps({
"diff": bad_diff, "explanation": "x", "confidence": "high"
})}]
}
attempt = await p.maybe_draft(job_id, finding_id=finding_id)
assert attempt is not None
assert attempt.status == "apply_failed"
assert claw.close_session.await_count == 1
@pytest.mark.asyncio
async def test_verify_failed_when_recipe_still_fails(db_only, tmp_path):
git_url = _make_origin_repo(tmp_path)
workspace = WorkspaceManager(tmp_path / "ws")
job_id, finding_id = _seed_project_and_job(
db_only, project_name="demo", git_url=git_url,
findings=[{
"kind": "lint", "severity": "warn", "code": "F401",
"file": "src/app.py", "line": 1, "message": "x",
}],
)
# Stub runner that fails verify.
fake_runner = MagicMock()
fake_runner._exec_recipe = AsyncMock(return_value=(1, False))
p, claw, gitea = _patcher_with_mocks(db_only, workspace, runner=fake_runner)
# Valid diff that DOES apply (replace 'hello' with 'goodbye')
good_diff = _diff_for("src/app.py", old="hello\nworld", new="goodbye\nworld")
claw.turn.return_value = {
"events": [{"type": "text", "content": json.dumps({
"diff": good_diff, "explanation": "x", "confidence": "high"
})}]
}
attempt = await p.maybe_draft(job_id, finding_id=finding_id)
assert attempt is not None
assert attempt.status == "verify_failed"
assert fake_runner._exec_recipe.await_count == 1
@pytest.mark.asyncio
async def test_pushed_and_pr_opened_on_success(db_only, tmp_path):
git_url = _make_origin_repo(tmp_path)
workspace = WorkspaceManager(tmp_path / "ws")
job_id, finding_id = _seed_project_and_job(
db_only, project_name="demo", git_url=git_url,
findings=[{
"kind": "lint", "severity": "warn", "code": "F401",
"file": "src/app.py", "line": 1, "message": "x",
}],
)
fake_runner = MagicMock()
fake_runner._exec_recipe = AsyncMock(return_value=(0, False))
p, claw, gitea = _patcher_with_mocks(db_only, workspace, runner=fake_runner)
good_diff = _diff_for("src/app.py", old="hello\nworld", new="goodbye\nworld")
claw.turn.return_value = {
"events": [{"type": "text", "content": json.dumps({
"diff": good_diff, "explanation": "tiny fix", "confidence": "high"
})}]
}
attempt = await p.maybe_draft(job_id, finding_id=finding_id)
assert attempt is not None, "expected a PatchAttempt"
assert attempt.status == "pr_opened", f"unexpected: {attempt.status} / {attempt.error}"
assert attempt.pr_url == "http://192.168.0.5:3001/X/Y/pulls/1"
assert attempt.branch_name and "crafting-table/auto/" in attempt.branch_name
assert gitea.open_pr.await_count == 1
@pytest.mark.asyncio
async def test_max_attempts_per_finding(db_only, tmp_path):
git_url = _make_origin_repo(tmp_path)
workspace = WorkspaceManager(tmp_path / "ws")
job_id, finding_id = _seed_project_and_job(
db_only, project_name="demo", git_url=git_url,
findings=[{
"kind": "lint", "severity": "warn", "code": "F401",
"file": "src/app.py", "line": 1, "message": "x",
}],
)
# Pre-seed three failed attempts so the 4th early-exits.
for i in range(1, 4):
db_only.insert_patch_attempt(
finding_id=finding_id, job_id=job_id, project_name="demo",
attempt_number=i, status="apply_failed",
)
p, claw, gitea = _patcher_with_mocks(db_only, workspace)
attempt = await p.maybe_draft(job_id, finding_id=finding_id)
assert attempt is not None
assert attempt.status == "max_attempts_exceeded"
assert claw.create_session.await_count == 0
@pytest.mark.asyncio
async def test_clawdforge_session_always_closes_on_exception(db_only, tmp_path):
git_url = _make_origin_repo(tmp_path)
workspace = WorkspaceManager(tmp_path / "ws")
job_id, finding_id = _seed_project_and_job(
db_only, project_name="demo", git_url=git_url,
findings=[{
"kind": "lint", "severity": "warn", "code": "F401",
"file": "src/app.py", "line": 1, "message": "x",
}],
)
p, claw, gitea = _patcher_with_mocks(db_only, workspace)
claw.turn.side_effect = RuntimeError("simulated network blip")
attempt = await p.maybe_draft(job_id, finding_id=finding_id)
assert attempt is not None
assert attempt.status == "failed"
# Session was created and then closed even though turn raised.
assert claw.create_session.await_count == 1
assert claw.close_session.await_count == 1
@pytest.mark.asyncio
async def test_runner_invokes_patcher_when_auto_patch_true(client, tmp_path):
"""Integration: the runner's post-job hook calls patcher.maybe_draft_for_job
when project.notify.auto_patch=true and there are actionable findings.
"""
tc, ctx = client
server = ctx["server"]
# Build + inject a stub patcher BEFORE we kick the job. The real
# _maybe_auto_patch_hook closes over server.patcher at call time.
stub_patcher = MagicMock()
stub_patcher.maybe_draft_for_job = AsyncMock(return_value=[])
server.patcher = stub_patcher
# Make a tiny git repo so the runner can clone+worktree.
if shutil.which("git") is None:
pytest.skip("git not available")
repo = tmp_path / "fixture-repo"
repo.mkdir()
subprocess.run(["git", "init", "-q", "-b", "main"], cwd=repo, check=True)
subprocess.run(["git", "config", "user.email", "t@e"], cwd=repo, check=True)
subprocess.run(["git", "config", "user.name", "t"], cwd=repo, check=True)
subprocess.run(["git", "config", "commit.gpgsign", "false"], cwd=repo, check=True)
(repo / "README.md").write_text("hi\n")
subprocess.run(["git", "add", "."], cwd=repo, check=True)
subprocess.run(["git", "commit", "-q", "-m", "init"], cwd=repo, check=True)
git_url = str(repo)
# Register a project with notify.auto_patch=true and a lint that emits
# ruff-shaped JSON so the parser picks up an actionable finding.
ruff_stub = json.dumps([{
"code": "F401",
"message": "'os' imported",
"filename": "src/app.py",
"location": {"row": 3, "column": 1},
}])
payload = {
"name": "ct-autopatch-on",
"git_url": git_url,
"default_branch": "main",
"languages": ["python"],
"subprojects": [{
"path": ".",
"language": "python",
"lint": f"echo '{ruff_stub}'; exit 1",
"timeout_secs": 20,
}],
"schedule": {},
"notify": {"email": ["x@y"], "on": [], "auto_patch": True},
}
r = tc.post(
"/projects",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json=payload,
)
assert r.status_code == 200, r.text
r2 = tc.post(
"/projects/ct-autopatch-on/jobs",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json={"recipe": "lint"},
)
assert r2.status_code == 200, r2.text
job_id = r2.json()["job_id"]
# Wait for terminal.
deadline = time.monotonic() + 30
while time.monotonic() < deadline:
rr = tc.get(
f"/jobs/{job_id}",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
)
if rr.json()["job"]["status"] in ("succeeded", "failed", "timed_out", "cancelled"):
break
time.sleep(0.1)
# Hook fan-out is fire-and-forget; let the loop turn once more.
time.sleep(0.2)
# Patcher.maybe_draft_for_job should have been called at least once.
assert stub_patcher.maybe_draft_for_job.await_count >= 1
@pytest.mark.asyncio
async def test_runner_skips_patcher_when_auto_patch_false(client, tmp_path):
tc, ctx = client
server = ctx["server"]
stub_patcher = MagicMock()
stub_patcher.maybe_draft_for_job = AsyncMock(return_value=[])
server.patcher = stub_patcher
if shutil.which("git") is None:
pytest.skip("git not available")
repo = tmp_path / "fixture-repo-off"
repo.mkdir()
subprocess.run(["git", "init", "-q", "-b", "main"], cwd=repo, check=True)
subprocess.run(["git", "config", "user.email", "t@e"], cwd=repo, check=True)
subprocess.run(["git", "config", "user.name", "t"], cwd=repo, check=True)
subprocess.run(["git", "config", "commit.gpgsign", "false"], cwd=repo, check=True)
(repo / "README.md").write_text("hi\n")
subprocess.run(["git", "add", "."], cwd=repo, check=True)
subprocess.run(["git", "commit", "-q", "-m", "init"], cwd=repo, check=True)
git_url = str(repo)
ruff_stub = json.dumps([{
"code": "F401", "message": "x",
"filename": "src/app.py", "location": {"row": 3, "column": 1},
}])
payload = {
"name": "ct-autopatch-off",
"git_url": git_url,
"default_branch": "main",
"languages": ["python"],
"subprojects": [{
"path": ".",
"language": "python",
"lint": f"echo '{ruff_stub}'; exit 1",
"timeout_secs": 20,
}],
"schedule": {},
"notify": {"email": ["x@y"], "on": [], "auto_patch": False},
}
r = tc.post(
"/projects",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json=payload,
)
assert r.status_code == 200, r.text
r2 = tc.post(
"/projects/ct-autopatch-off/jobs",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json={"recipe": "lint"},
)
assert r2.status_code == 200, r2.text
job_id = r2.json()["job_id"]
deadline = time.monotonic() + 30
while time.monotonic() < deadline:
rr = tc.get(
f"/jobs/{job_id}",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
)
if rr.json()["job"]["status"] in ("succeeded", "failed", "timed_out", "cancelled"):
break
time.sleep(0.1)
time.sleep(0.2)
assert stub_patcher.maybe_draft_for_job.await_count == 0
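
The `_diff_for` helper in this file builds a whole-file replacement hunk: every line of `old` is emitted as a removal and every line of `new` as an addition, with the hunk header counts taken from the line counts. The standalone sketch below reproduces that construction so the output can be inspected; `diff_for` is a renamed copy for illustration only, and it matches the helper for non-empty inputs.

```python
def diff_for(file_rel: str, *, old: str, new: str) -> str:
    """Build a unified diff that replaces all of `old` with all of `new`."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    return (
        f"diff --git a/{file_rel} b/{file_rel}\n"
        f"--- a/{file_rel}\n"
        f"+++ b/{file_rel}\n"
        # Hunk starts at line 1 and spans the whole file on both sides.
        f"@@ -1,{len(old_lines)} +1,{len(new_lines)} @@\n"
        + "".join(f"-{line}\n" for line in old_lines)
        + "".join(f"+{line}\n" for line in new_lines)
    )


if __name__ == "__main__":
    print(diff_for("src/app.py", old="hello\nworld", new="goodbye\nworld"), end="")
    # diff --git a/src/app.py b/src/app.py
    # --- a/src/app.py
    # +++ b/src/app.py
    # @@ -1,2 +1,2 @@
    # -hello
    # -world
    # +goodbye
    # +world
```

Note that the unchanged `world` line is still removed and re-added. That keeps the helper trivial at the cost of a diff that only applies when the target file contains exactly `old`, which is what the tests rely on.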

tests/test_patches_api.py (new file, 289 lines)

@ -0,0 +1,289 @@
"""HTTP API tests for the wave-3 patches surface.
Covers POST /jobs/{id}/patches (manual trigger), GET /patches (list with
filters), GET /patches/{id} (detail with cross-token guards). The patcher
itself is stubbed so we don't make real clawdforge / Gitea calls.
"""
from __future__ import annotations
import json
import time
from unittest.mock import AsyncMock, MagicMock
import pytest
from crafting_table.patcher import PatchAttempt
from tests.conftest import sample_project_payload
def _install_stub_patcher(server, *, attempt: PatchAttempt | None = None):
"""Replace server.patcher with a stub that returns the given attempt.
``attempt=None`` simulates the "no actionable finding" code path.
Returns the stub for assertion-side access.
"""
stub = MagicMock()
stub.maybe_draft = AsyncMock(return_value=attempt)
server.patcher = stub
return stub
def _register_demo_project(tc, bearer: str, *, name: str = "demo") -> None:
payload = sample_project_payload(name=name)
r = tc.post(
"/projects",
headers={"Authorization": f"Bearer {bearer}"},
json=payload,
)
assert r.status_code == 200, r.text
def _seed_job_row(server, *, project_name: str = "demo", job_id: str = "j-1") -> None:
snapshot = {
"git_url": "/dev/null",
"default_branch": "main",
"subprojects": [{
"path": ".", "language": "python", "lint": "echo x"
}],
"languages": ["python"],
}
server.db.insert_job(
job_id=job_id,
project_name=project_name,
subproject_path=".",
recipe="lint",
branch="main",
log_path="/tmp/_x.log",
recipe_snapshot_json=json.dumps(snapshot),
)
server.db.mark_job_finished(job_id=job_id, status="failed", exit_code=1)
def test_post_patches_with_finding_id(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
fake = PatchAttempt(
finding_id=42,
job_id="j-1",
project_name="demo",
attempt_number=1,
status="pr_opened",
branch_name="crafting-table/auto/j-1-42",
pr_url="http://192.168.0.5:3001/X/Y/pulls/9",
diff_excerpt="--- a/x\n+++ b/x",
session_id="s-1",
)
fake.id = 7
stub = _install_stub_patcher(server, attempt=fake)
r = tc.post(
"/jobs/j-1/patches",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json={"finding_id": 42},
)
assert r.status_code == 200, r.text
body = r.json()
assert body["ok"] is True
assert body["attempt"]["status"] == "pr_opened"
assert body["attempt"]["pr_url"].endswith("/9")
# Patcher was called with the explicit finding_id.
assert stub.maybe_draft.await_count == 1
args, kwargs = stub.maybe_draft.call_args
assert kwargs.get("finding_id") == 42 or (len(args) > 1 and args[1] == 42)
def test_post_patches_without_finding_id_auto_picks(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
fake = PatchAttempt(
finding_id=99, job_id="j-1", project_name="demo",
attempt_number=1, status="drafted",
)
fake.id = 1
_install_stub_patcher(server, attempt=fake)
r = tc.post(
"/jobs/j-1/patches",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json={},
)
assert r.status_code == 200, r.text
assert r.json()["attempt"]["finding_id"] == 99
def test_post_patches_no_actionable_returns_attempt_none(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
_install_stub_patcher(server, attempt=None)
r = tc.post(
"/jobs/j-1/patches",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json={},
)
assert r.status_code == 200, r.text
body = r.json()
assert body["ok"] is True
assert body["attempt"] is None
assert body.get("reason") == "no_actionable_finding"
def test_post_patches_503_when_patcher_disabled(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
server.patcher = None
r = tc.post(
"/jobs/j-1/patches",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json={},
)
assert r.status_code == 503
def test_post_patches_cross_token_returns_404(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
_install_stub_patcher(server, attempt=None)
# bravo cannot see alpha's job → 404 (existence-leak guard).
r = tc.post(
"/jobs/j-1/patches",
headers={"Authorization": f"Bearer {ctx['bravo_bearer']}"},
json={},
)
assert r.status_code == 404
def test_post_patches_rejects_non_int_finding_id(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
_install_stub_patcher(server, attempt=None)
r = tc.post(
"/jobs/j-1/patches",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
json={"finding_id": "not-an-int"},
)
assert r.status_code == 400
def test_get_patches_filtered_by_project_and_status(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
# Insert two attempts with different statuses so the status filter
# actually has to exclude one of them.
fid = server.db.insert_finding(
job_id="j-1", kind="lint", severity="warn", message="m",
fingerprint="f", file="x.py", line=1, code="X",
)
server.db.insert_patch_attempt(
finding_id=fid, job_id="j-1", project_name="demo",
attempt_number=1, status="pr_opened",
pr_url="http://gitea/X/Y/pulls/1",
)
server.db.insert_patch_attempt(
finding_id=fid, job_id="j-1", project_name="demo",
attempt_number=2, status="apply_failed",
)
r = tc.get(
"/patches?project=demo&status=pr_opened",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
)
assert r.status_code == 200, r.text
rows = r.json()["patches"]
assert len(rows) == 1
assert rows[0]["status"] == "pr_opened"
def test_get_patches_owner_scoped(client):
"""Bravo's /patches list never includes Alpha's attempts."""
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
fid = server.db.insert_finding(
job_id="j-1", kind="lint", severity="warn", message="m",
fingerprint="f", file="x.py", line=1, code="X",
)
server.db.insert_patch_attempt(
finding_id=fid, job_id="j-1", project_name="demo",
attempt_number=1, status="pr_opened",
)
r = tc.get(
"/patches",
headers={"Authorization": f"Bearer {ctx['bravo_bearer']}"},
)
assert r.status_code == 200
assert r.json()["patches"] == []
def test_get_patches_detail_owner(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
fid = server.db.insert_finding(
job_id="j-1", kind="lint", severity="warn", message="m",
fingerprint="f", file="x.py", line=1, code="X",
)
pid = server.db.insert_patch_attempt(
finding_id=fid, job_id="j-1", project_name="demo",
attempt_number=1, status="pr_opened",
)
r = tc.get(
f"/patches/{pid}",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
)
assert r.status_code == 200, r.text
assert r.json()["patch"]["id"] == pid
def test_get_patches_detail_other_token_404(client):
tc, ctx = client
server = ctx["server"]
_register_demo_project(tc, ctx["alpha_bearer"])
_seed_job_row(server)
fid = server.db.insert_finding(
job_id="j-1", kind="lint", severity="warn", message="m",
fingerprint="f", file="x.py", line=1, code="X",
)
pid = server.db.insert_patch_attempt(
finding_id=fid, job_id="j-1", project_name="demo",
attempt_number=1, status="pr_opened",
)
r = tc.get(
f"/patches/{pid}",
headers={"Authorization": f"Bearer {ctx['bravo_bearer']}"},
)
assert r.status_code == 404
def test_get_patches_detail_missing(client):
tc, ctx = client
r = tc.get(
"/patches/999999",
headers={"Authorization": f"Bearer {ctx['alpha_bearer']}"},
)
assert r.status_code == 404
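
The tests above pin down how `POST /jobs/{id}/patches` answers in each case: 200 with an attempt, 200 with `attempt: null` and a `reason`, 503 when the patcher is disabled, 404 for a job the token cannot see, and 400 for a non-integer `finding_id`. A caller might fold those cases into one label, roughly as sketched here. `interpret_patch_response` and the label strings other than the server's own `status` and `reason` values are hypothetical, chosen for illustration.

```python
from typing import Any, Optional


def interpret_patch_response(
    status_code: int, body: Optional[dict[str, Any]]
) -> str:
    """Map a POST /jobs/{id}/patches response to a short outcome label."""
    if status_code == 503:
        return "patcher_disabled"
    if status_code == 404:
        # The API answers 404 both for missing jobs and for jobs owned by
        # another token, so the two cases cannot be told apart here.
        return "job_not_found_or_not_owned"
    if status_code == 400:
        return "bad_request"
    if status_code != 200 or not isinstance(body, dict):
        return "unexpected_response"
    attempt = body.get("attempt")
    if attempt is None:
        # The server reports why it declined, e.g. "no_actionable_finding".
        return body.get("reason") or "no_attempt"
    return str(attempt.get("status", "unknown"))
```

Passing the server's `status` through unchanged keeps the sketch from having to enumerate the status set, which may grow beyond the values listed in the tool description.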