Public-flip audit: generalize internal hosts/paths + drop Sulkta-internal refs

URLs, mount paths, and LAN host bindings parameterized via env or relative paths
so the repo stands up from a clean clone anywhere. Drop cross-codebase refs
("mirrors clawdforge's pattern"), Sulkta-Coop client/merchant test fixtures,
and audit-changelog scaffolding from comments. README terser, technical content
preserved.
Cobb Hayes 2026-05-27 11:25:47 -07:00
parent 8b1774130b
commit b335405c02
23 changed files with 238 additions and 266 deletions

256
README.md

@ -1,42 +1,27 @@
# crafting-table
Polyglot dev/build/audit container — the build farm for the Sulkta ecosystem.
Polyglot dev/build/audit container. HTTP API + async job runner + autonomous
patch loop + email digest.
## What this is
A single Docker image carries every toolchain in the matrix below. You hand
it a git URL + a per-language recipe (build / test / lint / audit), it
materializes a worktree, runs the recipe, parses the tool output into
structured findings, and stores everything in a SQLite ledger. Optional
patch loop drafts a unified diff via an external Claude-style agent, verifies
it by re-running the recipe, and opens a PR.
A single Docker container with every toolchain we work with, fronted by a
FastAPI HTTP API + async job runner. Used as a reliable place to compile /
test / audit any Sulkta repo regardless of where the caller is — agents,
Claude sessions, ad-hoc curl, scheduled cron.
Use it as a replacement for ad-hoc per-repo build environments. One image,
one runner, every language.
Eventual surface (v0.1 full): HTTP API + MCP server + project registry +
job runner + structured findings + email digest + autonomous patch loop
through clawdforge.
Spec: `Sulkta-Coop/openclaw-workspace/memory/spec-crafting-table.md` (LAN-only).
## Status — v0.1 complete (10 of 10)
- [x] Step 1: Dockerfile + per-language smoke
- [x] Step 2: SQLite ledger + project registry
- [x] Step 3: HTTP API skeleton (FastAPI, port 8810)
- [x] Step 4: Job runner core (asyncio worker pool, git worktree, subprocess)
- [x] Step 5: Per-language parsers (Rust / Python / Go / TS first)
- [x] Step 6: Findings extraction + storage
- [x] Step 7: MCP server (stdio JSON-RPC, 8 tools) — see [mcp/README.md](mcp/README.md)
- [x] Step 8: Email digest scheduler
- [x] Step 9: Autonomous patch loop (clawdforge integration → unified diff → worktree apply → verify recipe → push branch → Gitea PR)
- [x] Step 10: Production recipes — clawdforge, cauldron, tradecraft (see [examples/recipes/](examples/recipes/))
## Toolchains in v0.1
## Toolchains
| Lang | Versions / extras |
|----------|--------------------------------------------------------------------|
| Python | 3.11 (Debian default) + uv, pipx, pip-audit, ruff, mypy, pytest, semgrep |
| Node | 22.11.0 LTS + npm, pnpm, tsx, eslint, typescript |
| Bun | latest (rolling) |
| Go | 1.22.10 + govulncheck, staticcheck |
| Rust | stable (rustup) + clippy, rustfmt, cargo-audit, cargo-deny |
| Go | 1.25.9 + govulncheck, staticcheck |
| Rust | stable (rustup) + clippy, rustfmt |
| Ruby | 3.1 (Debian default) + bundler, bundler-audit, rubocop |
| PHP | 8.2 (Debian default) + composer, phpstan, phpunit |
| JDK | 17 (default) + 21 (Temurin, alongside via `JAVA_HOME_21`) |
@ -47,13 +32,18 @@ Spec: `Sulkta-Coop/openclaw-workspace/memory/spec-crafting-table.md` (LAN-only).
| Kotlin | 1.9.25 (compiler) |
| C/C++ | clang + lld + cmake + ninja + valgrind |
| Bash | bash + shellcheck + bats + shfmt |
| Nix | single-user install with cache.nixos.org + cache.iog.io substituters |
| Generic | git, jq, yq, ripgrep, fd, gh-cli, curl, wget |
## HTTP API surface
`cargo-audit` and `cargo-deny` are NOT baked into the image — both flaked
during build (libgit2-sys C bindings + GitHub release download flakiness).
Install at runtime with `cargo install cargo-audit cargo-deny` if you need them.
LAN-only. Every request needs `Authorization: Bearer <token>`. The default
LAN allowlist is `10/8`, `172.16/12`, `192.168/16`, `127/8`, `::1/128`;
override via `CRAFTING_LAN_CIDRS`.
## HTTP API
LAN-only. Every request needs `Authorization: Bearer <token>`. Default IP
allowlist is `10/8`, `172.16/12`, `192.168/16`, `127/8`, `::1/128`; override
via `CRAFTING_LAN_CIDRS`.
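A minimal sketch of how an allowlist check like this could work, using only the standard library. The comma-separated format for `CRAFTING_LAN_CIDRS` and the helper names `parse_cidrs` / `ip_allowed` are assumptions for illustration, not the project's actual code. The shorthand `10/8` is written out as `10.0.0.0/8` because `ipaddress` expects full network addresses.

```python
import ipaddress

# Default allowlist from the README, written out in full CIDR form.
DEFAULT_LAN_CIDRS = "10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,127.0.0.0/8,::1/128"


def parse_cidrs(raw: str) -> list:
    """Parse a comma-separated CIDR list (the env format is an assumption)."""
    return [ipaddress.ip_network(part.strip()) for part in raw.split(",") if part.strip()]


def ip_allowed(client_ip: str, networks) -> bool:
    """Return True when client_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    # Only compare against networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in networks)
```

For example, `ip_allowed("192.168.0.5", parse_cidrs(DEFAULT_LAN_CIDRS))` is true, while a public address such as `8.8.8.8` is rejected.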
| Method | Path | Who | What |
|--------|-----------------------------------|--------|--------------------------------|
@ -70,36 +60,36 @@ override via `CRAFTING_LAN_CIDRS`.
| GET | `/jobs?project=&status=&limit=` | any | List own (or all if admin) |
| GET | `/jobs/{id}` | owner | State + last 200 log lines |
| GET | `/jobs/{id}/log` | owner | Full log (file stream) |
| GET | `/jobs/{id}/findings` | owner | Structured findings (see Findings) |
| POST | `/jobs/{id}/patches` | owner | Trigger an auto-patch attempt (wave 3) |
| GET | `/jobs/{id}/findings` | owner | Structured findings |
| POST | `/jobs/{id}/patches` | owner | Trigger an auto-patch attempt |
| GET | `/patches?project=&status=&limit=`| any | List own patch attempts |
| GET | `/patches/{id}` | owner | Patch attempt detail |
Cross-token access returns **404, not 403** — same existence-leak guard as
clawdforge sessions.
Cross-token access returns **404, not 403** — existence-leak guard.
## Quickstart
After build + first boot, the admin bearer is written to
`/data/admin-bearer.txt` (chmod 600 inside the container — readable from
the bind-mounted appdata path on the host).
the bind-mounted host path).
```bash
ADMIN=$(cat /mnt/user/appdata/crafting-table/data/admin-bearer.txt)
HOST=http://localhost:8810
ADMIN=$(cat ./data/admin-bearer.txt)
# Mint a project-scoped token
TOKEN=$(curl -s http://192.168.0.5:8810/admin/tokens \
TOKEN=$(curl -s "$HOST/admin/tokens" \
-H "Authorization: Bearer $ADMIN" \
-H "content-type: application/json" \
-d '{"name":"clawdforge","is_admin":false,"ip_cidrs":[]}' | jq -r .bearer)
-d '{"name":"alpha","is_admin":false,"ip_cidrs":[]}' | jq -r .bearer)
# Register the project
curl -s http://192.168.0.5:8810/projects \
# Register a project
curl -s "$HOST/projects" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
-d '{
"name": "clawdforge",
"git_url": "http://192.168.0.5:3001/Sulkta-Coop/clawdforge.git",
"name": "alpha",
"git_url": "http://git.example.com/org/alpha.git",
"default_branch": "main",
"languages": ["python", "rust"],
"subprojects": [
@ -108,38 +98,35 @@ curl -s http://192.168.0.5:8810/projects \
"timeout_secs": 600},
{"path": "clients/rust", "language": "rust",
"build": "cargo build --release", "test": "cargo test --all",
"audit": "cargo audit", "timeout_secs": 1800}
"timeout_secs": 1800}
]
}'
# Kick off a test job
JOB=$(curl -s http://192.168.0.5:8810/projects/clawdforge/jobs \
JOB=$(curl -s "$HOST/projects/alpha/jobs" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
-d '{"recipe":"test","subproject":"clients/python"}' | jq -r .job_id)
# Poll status
watch -n2 "curl -s http://192.168.0.5:8810/jobs/$JOB \
-H 'Authorization: Bearer $TOKEN' | jq '.job.status, .log_tail[-5:]'"
curl -s "$HOST/jobs/$JOB" -H "Authorization: Bearer $TOKEN" | jq '.job.status, .log_tail[-5:]'
# Stream the full log
curl http://192.168.0.5:8810/jobs/$JOB/log \
-H "Authorization: Bearer $TOKEN"
curl "$HOST/jobs/$JOB/log" -H "Authorization: Bearer $TOKEN"
```
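The poll step can also be scripted. The sketch below assumes the response shape implied by the `jq` filter in the quickstart (`.job.status`) and treats `succeeded` and `failed` as the terminal statuses; the real API may define more. `fetch_job` and `wait_for_job` are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: these are the terminal statuses; extend if the API has others.
TERMINAL = {"succeeded", "failed"}


def fetch_job(host: str, token: str, job_id: str) -> dict:
    """GET /jobs/{id} with the bearer token and decode the JSON body."""
    req = urllib.request.Request(
        f"{host}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def wait_for_job(fetch: Callable[[], dict], interval: float = 2.0, max_polls: int = 300) -> str:
    """Call fetch() until the job reaches a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch()["job"]["status"]
        if status in TERMINAL:
            return status
        time.sleep(interval)
    raise TimeoutError("job did not reach a terminal status")
```

Typical use would be `wait_for_job(lambda: fetch_job(HOST, TOKEN, JOB))`. Passing the fetcher as a callable keeps the loop testable without a running server.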
## Build + smoke
```bash
docker network inspect sulkta >/dev/null 2>&1 || docker network create sulkta
docker compose build
# Run the per-toolchain hello-world smoke
# Per-toolchain hello-world
docker compose run --rm crafting-table /usr/local/bin/smoke.sh
# expect: "=== ALL TOOLCHAINS GREEN ==="
# Bring up the API
docker compose up -d
curl http://192.168.0.5:8810/healthz
curl http://localhost:8810/healthz
```
## Image notes
@ -149,23 +136,22 @@ curl http://192.168.0.5:8810/healthz
- Runs as non-root user `crafter` (uid 1000) with passwordless sudo. The
API server runs as `crafter`. Recipe commands run as `crafter` too —
never elevated to root.
- Volume mount points (production):
- Volume mount points:
- `/data` — SQLite ledger, admin bearer file, per-job logs
- `/workspace` — bare clones + per-job worktrees
- `/caches` — cargo / maven / gradle / npm / pip / bun caches
- Network: external `sulkta` bridge (same one clawdforge + cauldron use).
Create with `docker network create sulkta` if missing.
- Image size baseline is large (8-15 GB expected). Per spec: that's fine.
- `/nix` — pre-seeded single-user Nix store (Plutarch / IOG flakes)
- Image size baseline is large (8-15 GB). By design.
## Architecture notes
## Architecture
### Recipe security
Recipe commands run via `/bin/sh -c` so any shell metachar works. This is
**by design** — admins set them. Recipes like `cargo build && cargo test`
work as expected; `; rm -rf /` would too if an admin set it. The container
Recipe commands run via `/bin/sh -c` so any shell metachar works. **By
design** — admins set them. Recipes like `cargo build && cargo test` work
as expected; `; rm -rf /` would too if an admin set it. The container
sandbox + `crafter` user are the safety net, not the recipe parser.
### Workspace strategy
### Workspace
- `/workspace/<project>/.cache/` — bare clone of the upstream
- `/workspace/<project>/<job_id>/` — git worktree pointing at the requested
branch+sha, removed after the job ends
@ -175,26 +161,26 @@ sandbox + `crafter` user are the safety net, not the recipe parser.
### Recipe immutability
Every job snapshots the project's recipe at run-time
(`recipe_snapshot_json`). Editing a project's recipe doesn't retcon the
view of in-flight or already-finished jobs.
view of in-flight or finished jobs.
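As an illustration of the idea, a snapshot can be as simple as serializing the command and timeout at enqueue time. The exact contents of `recipe_snapshot_json` are not documented here, so the field names below are assumptions; the project dict follows the shape used in the quickstart.

```python
import json


def snapshot_recipe(project: dict, subproject_path: str, recipe: str) -> str:
    """Freeze the command and timeout a job will run with, as a JSON string."""
    for sub in project["subprojects"]:
        if sub["path"] == subproject_path:
            return json.dumps(
                {
                    "subproject": subproject_path,
                    "recipe": recipe,
                    "command": sub[recipe],
                    "timeout_secs": sub["timeout_secs"],
                },
                sort_keys=True,
            )
    raise KeyError(f"unknown subproject: {subproject_path}")
```

Because the snapshot is a string stored with the job, later edits to the project's recipe cannot change what the job row reports.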
### Process group + timeout
Recipe subprocesses spawn in their own process group via
`start_new_session=True`. On timeout we `os.killpg(pgid, SIGTERM)` and wait a
10s grace period before escalating to `SIGKILL`. Without process-group kill,
multi-process recipes (cargo build spawning rustc, etc.) could leave
orphans that hold the stdout pipe open.
multi-process recipes (cargo build spawning rustc, etc.) leave orphans that
hold the stdout pipe open.
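A minimal, synchronous sketch of the pattern described above. The real runner is asyncio-based, so this is a simplified stand-in and `run_recipe` is a hypothetical name. With `start_new_session=True` the child's pid is also its process group id, which is why `proc.pid` can be passed to `os.killpg`.

```python
import os
import signal
import subprocess


def run_recipe(command: str, timeout_secs: float, grace_secs: float = 10.0) -> int:
    """Run a recipe through /bin/sh -c in its own process group; return its exit code."""
    proc = subprocess.Popen(
        ["/bin/sh", "-c", command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # child leads a new session and process group
    )
    try:
        proc.communicate(timeout=timeout_secs)
    except subprocess.TimeoutExpired:
        # Signal the whole group so grandchildren (rustc, test binaries) exit too.
        os.killpg(proc.pid, signal.SIGTERM)
        try:
            proc.communicate(timeout=grace_secs)
        except subprocess.TimeoutExpired:
            # Grace period elapsed: escalate to SIGKILL for the whole group.
            os.killpg(proc.pid, signal.SIGKILL)
            proc.communicate()
    return proc.returncode
```

A negative return code means the shell was terminated by that signal number, which is how a timed-out recipe can be told apart from one that exited on its own.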
### Restart resilience
Jobs marked `running` but stranded by a process crash are NOT auto-resumed.
On startup the runner sweeps them to `failed` with `exit_code=-1` and
appends a synthetic log line `[crafting-table] runner restart, job
orphaned`. Callers can re-enqueue if they want.
On startup the runner sweeps them to `failed` with `exit_code=-1` and a
synthetic log line `[crafting-table] runner restart, job orphaned`. Callers
can re-enqueue if they want.
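The startup sweep could look roughly like the sketch below. The `jobs` column names (`id`, `status`, `exit_code`) and the one-log-file-per-job layout under `log_dir` are assumptions for illustration.

```python
import sqlite3
from pathlib import Path

ORPHAN_LINE = "[crafting-table] runner restart, job orphaned"


def sweep_orphans(conn: sqlite3.Connection, log_dir: Path) -> list:
    """Mark every job still 'running' as failed; return the swept job ids."""
    rows = conn.execute("SELECT id FROM jobs WHERE status = 'running'").fetchall()
    job_ids = [row[0] for row in rows]
    with conn:  # one transaction for the whole sweep
        conn.executemany(
            "UPDATE jobs SET status = 'failed', exit_code = -1 WHERE id = ?",
            [(job_id,) for job_id in job_ids],
        )
    for job_id in job_ids:
        # Append the synthetic line so the log explains the failure.
        with open(log_dir / f"{job_id}.log", "a", encoding="utf-8") as fh:
            fh.write(ORPHAN_LINE + "\n")
    return job_ids
```

Returning the swept ids lets the caller log them or notify whoever might want to re-enqueue.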
### SQLite + WAL
Single host, single process — SQLite is plenty. WAL mode (`PRAGMA
journal_mode=WAL` + `synchronous=NORMAL`) gives many readers + one writer
without lock contention. The runner is the only mutator of `jobs`/
`findings`; HTTP workers mostly read.
Single host, single process. WAL mode (`PRAGMA journal_mode=WAL` +
`synchronous=NORMAL`) gives many readers + one writer without lock
contention. The runner is the only mutator of `jobs`/`findings`; HTTP
workers mostly read.
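For reference, enabling this mode from Python's `sqlite3` module takes two pragmas at connection time. `open_ledger` is a hypothetical helper name, not the project's actual function.

```python
import sqlite3


def open_ledger(path: str) -> sqlite3.Connection:
    """Open the SQLite ledger with WAL journaling and relaxed fsync."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while the single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # NORMAL syncs at checkpoints rather than on every commit.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn
```

`journal_mode=WAL` is stored in the database file, so it persists across connections; `synchronous` is per-connection and has to be set each time.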
## Layout
@ -211,13 +197,13 @@ without lock contention. The runner is the only mutator of `jobs`/
│ ├── workspace.py # bare clone + worktree materialization + gc
│ ├── models.py # Pydantic schemas
│ ├── digest.py # email digest scheduler
│ ├── patcher.py # autonomous patch loop (clawdforge → diff → verify → PR)
│ ├── patcher.py # autonomous patch loop
│ ├── parsers/ # per-language Finding extractors
│ └── config.py # env-driven config
├── tests/ # pytest suite (143 tests)
├── tests/ # pytest suite
├── mcp/ # crafting-table-mcp — MCP stdio bridge (separate pip install)
├── examples/
│ ├── recipes/ # production recipes — clawdforge, cauldron, tradecraft
│ ├── recipes/ # example recipe JSONs
│ └── register-all.sh # bulk-register helper
├── pyproject.toml
├── requirements.txt
@ -234,12 +220,12 @@ pytest tests/
## Findings
After every job, the runner reads the captured log and hands it to a
After every job the runner reads the captured log and hands it to a
per-language parser. The parser turns native tool output (clippy JSON,
ruff JSON, govulncheck NDJSON, eslint JSON, tsc human errors, etc.) into
structured rows in the `findings` table.
### Parsers in v0.1
### Parsers
| Language | Recipes parsed | Tool output expected |
|-------------|------------------------------|------------------------------------------------------|
@ -250,10 +236,9 @@ structured rows in the `findings` table.
| javascript | (alias of typescript) | same |
| _other_ | (any) | falls back to `GenericParser` — emits one `recipe_fail` row when exit_code != 0, else nothing |
Resolution order in the registry: exact `language` match (parsers
self-gate on recipe via `Parser.matches`), then `GenericParser`. Adding a
new language is a single file in `crafting_table/parsers/` plus an entry
in `PARSERS`.
Resolution order: exact `language` match (parsers self-gate on recipe via
`Parser.matches`), then `GenericParser`. Adding a new language is a single
file in `crafting_table/parsers/` plus an entry in `PARSERS`.
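The resolution order can be sketched as follows. The class shapes, the `matches` signature, and `RustParser` are assumptions for illustration; only the names `Parser.matches`, `GenericParser`, and `PARSERS` come from the text above.

```python
from typing import Dict, List


class Parser:
    """Base class: a parser decides for itself which recipes it handles."""
    recipes: tuple = ()

    def matches(self, recipe: str) -> bool:
        return recipe in self.recipes

    def parse(self, log: str, exit_code: int) -> List[dict]:
        raise NotImplementedError


class GenericParser(Parser):
    """Fallback: one recipe_fail row on non-zero exit, otherwise nothing."""

    def matches(self, recipe: str) -> bool:
        return True

    def parse(self, log: str, exit_code: int) -> List[dict]:
        if exit_code == 0:
            return []
        return [{"kind": "recipe_fail", "code": f"exit_{exit_code}"}]


class RustParser(Parser):
    """Hypothetical language parser; real extraction logic is omitted."""
    recipes = ("lint", "audit", "test")

    def parse(self, log: str, exit_code: int) -> List[dict]:
        return []


PARSERS: Dict[str, Parser] = {"rust": RustParser()}


def resolve(language: str, recipe: str) -> Parser:
    """Exact language match that also accepts the recipe, else GenericParser."""
    parser = PARSERS.get(language)
    if parser is not None and parser.matches(recipe):
        return parser
    return GenericParser()
```

Under these assumptions, `resolve("rust", "build")` falls through to `GenericParser` because the Rust parser does not claim the `build` recipe.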
### Finding kinds
@ -262,8 +247,7 @@ in `PARSERS`.
`code` = advisory id (RUSTSEC-..., GO-..., PYSEC-...) and a
`suggested_fix` of the form "bump <pkg> to <version>" when patched
versions are known.
- `test_fail` — failed test name extracted from `cargo test` /
`pytest` output.
- `test_fail` — failed test name extracted from `cargo test` / `pytest` output.
- `recipe_fail` — fallback when no language-specific parser fired and the
recipe exited non-zero. `code` = `exit_<n>`, message names the recipe.
@ -271,8 +255,8 @@ in `PARSERS`.
Every finding row carries a 16-char `fingerprint` hash over
`kind|file|line|code` (NOT the message — tool wording drifts). The same
lint reappearing across nightly runs produces the same fingerprint, so a
later wave can dedup digest output and surface only "new since last run."
lint reappearing across nightly runs produces the same fingerprint, so
later passes can dedup digest output and surface only "new since last run."
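A sketch of one way to compute such a fingerprint. The hash function is not stated above, so SHA-256 truncated to 16 hex characters is an assumption, as are the helper names.

```python
import hashlib
from typing import Iterable, Optional, Set


def fingerprint(kind: str, file: Optional[str], line: Optional[int], code: Optional[str]) -> str:
    """16-char hash over kind|file|line|code; the message is deliberately excluded."""
    parts = [kind, file or "", "" if line is None else str(line), code or ""]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]


def new_since_last_run(previous: Iterable[str], current: Iterable[str]) -> Set[str]:
    """Fingerprints present in the current run but absent from the previous one."""
    return set(current) - set(previous)
```

Two runs that report the same lint at the same location yield the same fingerprint even if the tool rewords its message, so `new_since_last_run` only surfaces findings that are actually new.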
### Consuming findings
@ -287,14 +271,14 @@ GET /jobs/{job_id}/findings
]}
```
Authorization is project-token-scoped (same model as `/jobs/{id}`). The
matching `job` row's `findings_count` mirrors the array length so
callers can decide whether to fetch the detail.
Authorization is project-token-scoped (same as `/jobs/{id}`). The matching
`job` row's `findings_count` mirrors the array length so callers can decide
whether to fetch the detail.
## Digest
Daily 06:00 PT email digest. One message per project per day; aggregates the
last 24h of jobs per recipient and sends via SMTP relay (Lucy postfix).
last 24h of jobs per recipient and sends via SMTP.
Set the SMTP block in `.env` to enable — leaving `CRAFTING_SMTP_HOST` unset
keeps the scheduler off and logs `digest disabled — CRAFTING_SMTP_HOST not set`
@ -302,11 +286,11 @@ at startup. The `/digests` and `/admin/digest/run-now` endpoints still work
in dry-run mode regardless.
```bash
CRAFTING_SMTP_HOST=postfix.sulkta.com
CRAFTING_SMTP_HOST=smtp.example.com
CRAFTING_SMTP_PORT=587
CRAFTING_SMTP_USER=crafting-table@sulkta.com
CRAFTING_SMTP_USER=crafting-table@example.com
CRAFTING_SMTP_PASS=...
CRAFTING_SMTP_FROM=crafting-table@sulkta.com
CRAFTING_SMTP_FROM=crafting-table@example.com
CRAFTING_SMTP_TLS=1
```
@ -319,7 +303,7 @@ Each project's `notify.email` + `notify.on` fields control delivery:
| `test_fail` | a failing test job |
| `lint_warn` | a lint job with warning-severity findings |
| `cve_found` | any job whose findings include a `cve` |
| `patch_drafted` | (wave 3 / step 9) auto-patch was drafted |
| `patch_drafted` | an auto-patch was drafted |
| `nightly_summary` | catch-all — show ALL jobs in the project's section |
Empty `notify.on` defaults to `["audit_fail", "cve_found", "patch_drafted"]`.
@ -330,12 +314,12 @@ Manual trigger from the LAN admin token:
```bash
# Render today's digest without sending
curl -sH "Authorization: Bearer $ADMIN" \
-X POST http://192.168.0.5:8810/admin/digest/run-now \
-X POST http://localhost:8810/admin/digest/run-now \
-H "content-type: application/json" \
-d '{"dry_run": true}' | jq .
# Render an arbitrary date as JSON
curl -sH "Authorization: Bearer $ADMIN" \
http://192.168.0.5:8810/digests/2026-04-29 | jq .
http://localhost:8810/digests/2026-04-29 | jq .
```
Idempotency: `digest_runs` table holds `UNIQUE(date, project_name)`, so the
@ -343,37 +327,37 @@ Idempotency: `digest_runs` table holds `UNIQUE(date, project_name)`, so the
## Autonomous patch loop
Wave 3 wires crafting-table into clawdforge so a project with
`notify.auto_patch=true` gets an automatic patch attempt on every
actionable finding (lint with file/line; cve with a known fix). Lifecycle:
Wires crafting-table into an external Claude-agent host (default: clawdforge)
so a project with `notify.auto_patch=true` gets an automatic patch attempt
on every actionable finding (lint with file/line; cve with a known fix).
Lifecycle:
1. Runner finishes a job + parsers populate findings.
2. Post-job hook fires: pulls the highest-severity actionable finding,
reads ±20 lines of context from the worktree.
3. Patcher opens a clawdforge session (`POST /sessions`), sends one
turn with the finding + source context + project metadata, expects
3. Patcher opens an agent session, sends one turn with the finding + source
context + project metadata, expects
`{"diff": ..., "explanation": ..., "confidence": ...}` back.
4. Diff applied to a fresh worktree on `crafting-table/auto/<job_id>-<finding_id>`.
Apply failure → status `apply_failed`.
5. Recipe re-runs against the patched worktree (the **verify** step).
Fail → `verify_failed`.
6. Pass → commit + push + open Gitea PR. Status `pr_opened`.
7. clawdforge session always closed.
7. Agent session always closed.
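The status transitions in steps 3 to 7 can be sketched as a small function. Each step is passed in as a callable so the control flow is visible without any network or git access; the function name and the callables are hypothetical, only the status strings come from the list above.

```python
from typing import Callable


def attempt_patch(
    draft_diff: Callable[[], str],
    apply_diff: Callable[[str], bool],
    verify: Callable[[], bool],
    open_pr: Callable[[], str],
    close_session: Callable[[], None],
) -> dict:
    """Run draft -> apply -> verify -> PR and report the resulting status."""
    try:
        diff = draft_diff()
        if not apply_diff(diff):
            return {"status": "apply_failed"}
        if not verify():
            return {"status": "verify_failed"}
        return {"status": "pr_opened", "pr_url": open_pr()}
    finally:
        # Step 7: the agent session is closed on every path, including errors.
        close_session()
```

Putting the close in a `finally` block is what makes "always closed" hold even when a step raises.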
Configuration (env vars):
```
CRAFTING_CLAWDFORGE_URL=http://192.168.0.5:8800
CRAFTING_CLAWDFORGE_URL=http://clawdforge.internal:8800
CRAFTING_CLAWDFORGE_TOKEN=cf_...
CRAFTING_GITEA_URL=http://192.168.0.5:3001
CRAFTING_GITEA_URL=http://gitea.internal:3000
CRAFTING_GITEA_TOKEN=<gitea PAT>
CRAFTING_PATCHER_MAX_ATTEMPTS=3
CRAFTING_PATCHER_BRANCH_PREFIX=crafting-table/auto/
```
If any of the four required vars is missing, the patcher stays disabled
and `POST /jobs/{id}/patches` returns 503. The runner hook silently no-ops
in that case so existing job flow is unaffected.
and `POST /jobs/{id}/patches` returns 503. The runner hook silently no-ops.
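A sketch of the enablement check. It assumes the four required variables are the two URLs and two tokens listed above, with the attempt limit and branch prefix falling back to defaults; the text does not name them explicitly, and `missing_patcher_vars` is a hypothetical helper.

```python
import os
from typing import Mapping

# Assumption: these are the "four required vars"; the other two have defaults.
REQUIRED_PATCHER_VARS = (
    "CRAFTING_CLAWDFORGE_URL",
    "CRAFTING_CLAWDFORGE_TOKEN",
    "CRAFTING_GITEA_URL",
    "CRAFTING_GITEA_TOKEN",
)


def missing_patcher_vars(env: Mapping[str, str] = os.environ) -> list:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_PATCHER_VARS if not env.get(name)]
```

An empty result means the patcher can be enabled; anything else is a reasonable thing to log once at startup so the 503 is easy to diagnose.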
**Verification cost matters.** The verify step re-runs the failing recipe
on the patched worktree — for projects with multi-minute builds this
@ -389,69 +373,53 @@ Manual trigger:
```bash
curl -sH "Authorization: Bearer $TOKEN" \
-X POST http://192.168.0.5:8810/jobs/$JOB/patches \
-X POST http://localhost:8810/jobs/$JOB/patches \
-H "content-type: application/json" \
-d '{"finding_id": 42}' | jq .
# → {"ok": true, "attempt": {"status": "pr_opened", "pr_url": "...", ...}}
```
## First production recipes
## Example recipes
Three recipes ship in `examples/recipes/`:
Three example recipes ship in `examples/recipes/`:
| Recipe | Subprojects | Schedule (audit) | auto_patch |
|---------------|-------------|------------------|------------|
| `clawdforge` | 14 (one per SDK + root) | nightly 02:00 | **true** |
| `cauldron` | 1 (Flask app, `.`) | nightly 02:00 | **true** |
| `tradecraft` | 1 (`.`) | nightly 02:00 | **false** (manual review) |
| Recipe | Subprojects | auto_patch |
|---------------|-------------|------------|
| `alpha` | 14 (one per SDK + root) | **true** |
| `beta` | 1 (Flask app, `.`) | **true** |
| `gamma` | 1 (`.`) | **false** (manual review) |
Each ships with a placeholder `REPLACE_WITH_GITEA_TOKEN` in `git_url`;
`examples/register-all.sh` substitutes `$GITEA_TOKEN` at register time so
no real token ever lands in the repo.
Each ships with a placeholder `REPLACE_WITH_GITEA_TOKEN` in `git_url` and
`REPLACE_WITH_GIT_HOST` for the host. `examples/register-all.sh` substitutes
`$GITEA_TOKEN` + `$GIT_HOST` at register time so no real token or host ever
lands in the repo.
Smoke procedure (post-deploy):
```
1. docker compose up -d
2. TOKEN=$(cat /mnt/user/appdata/crafting-table/data/admin-bearer.txt)
3. CRAFTING_TABLE_TOKEN=$TOKEN GITEA_TOKEN=<your-pat> bash examples/register-all.sh
4. curl -H "Authorization: Bearer $TOKEN" http://192.168.0.5:8810/projects \
→ expect 3 projects (clawdforge, cauldron, tradecraft)
2. TOKEN=$(cat ./data/admin-bearer.txt)
3. CRAFTING_TABLE_TOKEN=$TOKEN GITEA_TOKEN=<your-pat> \
GIT_HOST=git.example.com bash examples/register-all.sh
4. curl -H "Authorization: Bearer $TOKEN" http://localhost:8810/projects
5. curl -X POST -H "Authorization: Bearer $TOKEN" \
http://192.168.0.5:8810/projects/clawdforge/jobs \
http://localhost:8810/projects/alpha/jobs \
-H "content-type: application/json" \
-d '{"recipe":"test","subproject":"clients/python"}'
→ expect job_id
6. Poll GET /jobs/{job_id} until status terminal → expect succeeded
6. Poll GET /jobs/{job_id} until status terminal
```
Per-recipe smoke status (today, pre-deploy):
- `clawdforge` — 14 subprojects; `clients/python` & `clients/typescript`
& `clients/go` & `clients/rust` known clean from existing CI; ruby /
php / kotlin / java / csharp / swift compile-cleanly today but
toolchain availability inside the crafting-table image is what step 1
smoke verified. Bash subproject's `test/run.sh` may not exist (manual
check needed post-deploy).
- `cauldron` — single Flask subproject; pip-audit & pytest known to run
cleanly from the cauldron repo's own CI history.
- `tradecraft` — single subproject; auto_patch is **off** by design
(production app, manual PR review only).
## MCP bridge
The `mcp/` subdirectory ships a self-contained `crafting-table-mcp` Python
package that exposes the HTTP API to MCP-aware clients (Claude Desktop, Claude
Code, Cursor, Zed, custom agents). See [mcp/README.md](mcp/README.md) for the
tool surface, installation, and configuration.
Quickstart:
`mcp/` ships a self-contained `crafting-table-mcp` Python package that
exposes the HTTP API to MCP-aware clients (Claude Desktop, Claude Code,
Cursor, Zed, etc.). See [mcp/README.md](mcp/README.md).
```bash
pip install -e mcp
export CRAFTING_TABLE_BASE_URL=http://192.168.0.5:8810
export CRAFTING_TABLE_BASE_URL=http://localhost:8810
export CRAFTING_TABLE_TOKEN=ct_...
crafting-table-mcp # stdio JSON-RPC server
```
## License
MIT
MIT — see [LICENSE](LICENSE).