URLs, mount paths, and LAN host bindings parameterized via env or relative paths
so the repo stands up from a clean clone anywhere. Drop cross-codebase refs
("mirrors clawdforge's pattern"), Sulkta-Coop client/merchant test fixtures,
and audit-changelog scaffolding from comments. README terser, technical content
preserved.
A docker-managed named volume lives at /var/lib/docker/volumes/,
which is INSIDE docker.img (a 200 GB loop file shared with all
images, container layers, and every other docker volume on the
host). The Plutarch + haskell-nix closure for Liqwid-Labs/agora
is tens of GB.
Running nix develop against agora ONCE was enough to fill docker.img
to 100% (196/200 GB used, 2 GB free). Every container on Lucy was
about to start failing writes. Recovery: kill the nix process, run
`docker compose down`, free 66 GB of BuildKit cache via
`docker builder prune -a`, then switch /nix to a bind mount under
/mnt/cache (88+ GB free on that pool, completely separate from docker.img).
Bind mount caveat: a bare bind to an empty host dir shadows the
image's /nix install (the same bug the earlier named-volume fix
addressed). One-time seed required:
    mkdir -p /mnt/cache/appdata/crafting-table/nix
    chown 1000:1000 /mnt/cache/appdata/crafting-table/nix
    docker create --name ct-seed crafting-table:local
    docker cp ct-seed:/nix/. /mnt/cache/appdata/crafting-table/nix/
    docker rm ct-seed
After seed, the bind mount works because the host path has the
nix tree already populated. Subsequent docker compose up -d picks
up the populated /nix and `nix --version` works in-container.
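A pre-flight check along these lines could guard against bringing the container up on an unseeded bind mount. This is a minimal sketch, not part of the repo: the function name `nix_tree_is_seeded` is hypothetical, and it assumes that a non-empty `store` directory under the host path is a good enough signal that the seed step has run.

```python
from pathlib import Path


def nix_tree_is_seeded(host_dir: str) -> bool:
    """Return True when host_dir looks like a populated /nix tree.

    Assumption: a seeded tree contains a non-empty `store` directory,
    mirroring /nix/store inside the image.
    """
    store = Path(host_dir) / "store"
    # A missing or empty store means the bind mount would shadow the
    # image's /nix install with nothing.
    return store.is_dir() and any(store.iterdir())


if __name__ == "__main__":
    path = "/mnt/cache/appdata/crafting-table/nix"
    print(f"{path}: seeded={nix_tree_is_seeded(path)}")
```

Running it before `docker compose up -d` gives a quick yes/no without starting the container.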
The previous ca-derivations attempt didn't actually fix the schema
issue — Nix 2.34.7's v10 → v11 migration (which adds the
Realisations table) doesn't fire cleanly even with the feature
pre-enabled at install time. First nix develop against a flake
that requests ca-derivations crashes with
`Assertion 'stmt.stmt' failed in nix::SQLiteStmt::Use::Use`.
Workaround: set accept-flake-config=false. Flake nixConfig blocks
that try to add ca-derivations to our experimental-features are then
ignored, so realisation queries never fire. Builds use the default
input-addressed path and work fine.
Substituters are now in our base nix.conf (cache.nixos.org +
cache.iog.io) so we don't lose the IOG binary cache by ignoring
the flake's substituter additions. mlabs.cachix.org dropped —
it's a private cache returning 401 to anonymous reads.
Verified live: nix develop against github:Liqwid-Labs/agora
proceeds past the previous crash point, pulling haskell-nix
closure from cache.iog.io.
Two coupled fixes in section 19.5 of the Dockerfile:
1. Add ca-derivations to experimental-features. Without it, the
SQLite store is initialized at schema v10 (no Realisations
table). Plutarch / Liqwid Agora / IOG flakes request
ca-derivations via nixConfig; the first realisation query then crashes with
`Assertion 'stmt.stmt' failed in nix::SQLiteStmt::Use::Use(SQLiteStmt&)`.
Pre-enabling at install time means
store init creates schema v11 with the table. Self-inflicted
wound caught in the first nix develop attempt against
github:Liqwid-Labs/agora.
2. Add cache.iog.io + mlabs.cachix.org as substituters with their
public keys. Without these, every Cardano/Plutarch dep gets
built from source — hours of GHC compile vs minutes of binary
cache pull.
Also: write nix.conf BEFORE running the Nix install script,
because the installer reads the user's nix.conf during init to
decide schema. Order-dependent.
Set accept-flake-config = true so flake nixConfig blocks (which add
their own substituters / experimental features) work without
re-prompting per command.
Bind-mount to an empty host dir was shadowing the image's
pre-installed /nix tree at runtime — `nix --version` returned
"sh: nix: not found" inside the live container even though the
binary was baked into the image at build time.
Docker auto-populates a fresh named volume from the image's
content on first mount. So the named-volume version preserves
the install AND persists across container recreations.
Volume name `crafting-table-nix`. Lives at the docker default
volume path on Lucy. Backups/migration-out:
    docker run --rm -v crafting-table-nix:/src -v /tmp:/dst alpine tar cf /dst/nix.tar /src
Latent bug: the post-loop check used `command -v` to verify that
govulncheck and staticcheck were installed. `command -v` only walks
PATH, but at this layer PATH does NOT include $GOPATH/bin
(/home/crafter/go/bin) — that's only added in the canonical
final PATH at the bottom of the Dockerfile (line 314). At
runtime the binaries work fine via the bottom PATH; only the
build-time verify was broken.
The bug was masked by stale Docker layer caching from earlier
Dockerfile shapes. Adding the new Nix layer above this step
invalidated the cache and surfaced it.
Switch to direct binary path checks (test -x "$GOPATH/bin/...")
which work regardless of PATH state at the layer.
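The difference between the two checks can be reproduced outside Docker. The sketch below is illustrative only: the helper names are hypothetical, and `shutil.which` stands in for `command -v` while `os.access(..., os.X_OK)` stands in for `test -x`. It places a fake `govulncheck` in a temp directory that is deliberately not on the search path.

```python
import os
import shutil
import tempfile


def found_on_path(name: str, path: str) -> bool:
    """Roughly what `command -v name` answers for the given PATH string."""
    return shutil.which(name, path=path) is not None


def is_executable_at(directory: str, name: str) -> bool:
    """Roughly what `test -x directory/name` answers; PATH is irrelevant."""
    candidate = os.path.join(directory, name)
    return os.path.isfile(candidate) and os.access(candidate, os.X_OK)


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as gobin, \
            tempfile.TemporaryDirectory() as empty_path_dir:
        tool = os.path.join(gobin, "govulncheck")
        with open(tool, "w") as fh:
            fh.write("#!/bin/sh\nexit 0\n")
        os.chmod(tool, 0o755)
        # The search path excludes gobin, like PATH at that build layer.
        return (
            found_on_path("govulncheck", empty_path_dir),
            is_executable_at(gobin, "govulncheck"),
        )


if __name__ == "__main__":
    print(demo())
```

The PATH-based lookup misses a binary that is present and executable, which is the shape of the build-time failure described above.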
Two coupled changes:
1. Add a single-user Nix install at section 19.5 so the container can
`nix develop` / `nix run` / `nix build` for the Cardano smart-
contract toolchain stack (Plutarch, plutus-core, Liqwid Agora's
`agora-scripts` exporter — all ship as IOG haskell-nix flakes
with pinned GHC). Without Nix, building any of those is a manual
version-pinning fight.
Single-user mode (no daemon), sandbox=false (containers can't nest
sandboxes cleanly), flakes + nix-command experimental features
enabled. /nix is owned by `crafter` and bind-mounted from
/mnt/user/appdata/crafting-table/nix in compose so the multi-GB
haskell-nix downloads survive container rebuilds.
2. Bump GO_VERSION 1.22.10 → 1.25.9. govulncheck@latest (v1.3.0) and
staticcheck@latest (v0.7.0) both now require Go ≥ 1.25 — building
with 1.22 hits "requires go >= 1.25.0" and the per-step retry loop
exhausts. Go's auto-toolchain-switch tries to download 1.25.9 on
the fly but staticcheck's parent build then runs in 1.22 and
re-fails. Pinning to 1.25.9 (current Go release) sidesteps the
wedge.
PATH bump: prepend /home/crafter/.nix-profile/bin so nix-installed
binaries (cabal, ghc inside dev shells, cardano-cli, etc) take
precedence over system tooling without per-recipe prefixing.
Build invocation unchanged — nothing required at the docker run /
docker compose layer beyond the new /nix bind mount in compose.yml.
The 4 patcher-fired-but-malformed_response failures showed extract_diff_json
was too strict: it required {"diff": "..."} as the top-level JSON shape
with at most 1 brace nesting depth (regex-based). Real model output
varies more.
Now handles:
1. Bare JSON {"diff", "explanation", "confidence"}
2. Fenced JSON: ```json {…} ```
3. Fenced diff + prose: ```diff …unified diff… ``` + loose explanation
4. Bare unified diff (no JSON wrapper, no fence)
5. JSON with deeply-nested {} inside the diff string (struct literals,
function bodies)
Fixes:
- Replaced regex-based balanced-{} matcher (capped at depth 1) with a
string-aware depth-tracking generator that handles arbitrary nesting
+ skips brace chars inside JSON string literals
- Walk all fenced blocks not just the first; recognize ```diff and
```patch language tags
- Fall back to fenced-diff-with-prose construction when no JSON form
matches — synthetic payload with surrounding text as explanation
- Final fallback for bare unified diffs (no fence, no wrapper) using a
simple line-prefix detector
- Normalize alternate keys (patch, content, diff_text → diff)
- Always set confidence (defaults to medium when absent, low for bare
diffs that have no model commentary)
Tests: 16 → 20 (5 new shape coverage tests). All green.
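For illustration, a string-aware depth-tracking scanner of the kind described above might look like the following. This is a minimal sketch, not the actual `extract_diff_json` code: the names `iter_json_objects`, `normalize_payload`, and `first_diff_payload` are hypothetical, and it covers only the JSON shapes plus the key and confidence normalization (no fence handling, no bare-diff fallback).

```python
import json
from typing import Iterator, Optional

ALTERNATE_DIFF_KEYS = ("patch", "content", "diff_text")


def iter_json_objects(text: str) -> Iterator[str]:
    """Yield each top-level balanced {...} span in text.

    Braces inside JSON string literals are skipped, so nesting depth in
    the diff body does not matter.
    """
    depth = 0
    start = -1
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        # Only track strings inside an object, so quotes in the
        # surrounding prose do not confuse the scanner.
        if ch == '"' and depth > 0:
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                yield text[start:i + 1]


def normalize_payload(obj: dict) -> dict:
    """Map alternate keys onto `diff` and default the confidence."""
    out = dict(obj)
    if "diff" not in out:
        for key in ALTERNATE_DIFF_KEYS:
            if key in out:
                out["diff"] = out.pop(key)
                break
    out.setdefault("confidence", "medium")
    return out


def first_diff_payload(text: str) -> Optional[dict]:
    """Return the first parseable object that carries a diff, else None."""
    for candidate in iter_json_objects(text):
        try:
            obj = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            payload = normalize_payload(obj)
            if "diff" in payload:
                return payload
    return None
```

One known limitation of this sketch: a stray unbalanced `{` in the prose before the JSON swallows everything after it, so a real implementation would likely retry from later offsets.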
Two recipe-shape gaps caught by the all-SDK lint+audit dogfood:
1. `cargo install --root /caches/cargo cargo-audit cargo-deny` lost its
binaries at runtime because /caches/cargo is volume-shadowed by the
host bind mount. Fix: install with `--root /usr/local` so the bins
land in /usr/local/bin (root-owned, not volume-shadowed). Required
USER root briefly to write to /usr/local; reverts to crafter after.
2. `mypy --strict` against any project that imports requests/PyYAML/
setuptools fails with "Library stubs not installed" exit 1 because
pipx-installed mypy lives in its own venv and doesn't see the
stubs. Fix: `pipx inject mypy types-requests types-PyYAML
types-setuptools` so the stubs land in mypy's venv.
The agent-generated Dockerfile accumulated PATH via 6+ layered ENV
PATH= statements, and my own GOPATH-fix edit (commit 6cd5990) wrote
a literal-expanded PATH that clobbered the swift/kotlin/gradle/bun/
cargo entries. Result: cargo unreachable from crafter user (caught
by the 14-SDK queue dogfood — exit 127 'Permission denied' on cargo
build).
Fix: a final ENV PATH= line right before the CMD that sets PATH to
a clean, comprehensive list of every toolchain bin. Overrides any
drift above. Includes:
- /home/crafter/.local/bin (pipx tools: ruff, mypy, pytest, pip-audit, uv, semgrep)
- /home/crafter/.composer/vendor/bin (phpstan, phpunit)
- /home/crafter/.local/share/gem/ruby/3.1.0/bin (bundler-audit, rubocop)
- /home/crafter/.bun/bin (bun)
- /home/crafter/go/bin (govulncheck, staticcheck)
- /home/crafter/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin (cargo, rustc, clippy, rustfmt)
- /caches/cargo/bin (cargo install artifacts; volume-mounted)
- /opt/swift/usr/bin (swift)
- /opt/kotlin/bin (kotlinc)
- /opt/gradle/bin (gradle)
- /usr/local/go/bin (go)
- system bins
Once this rebuild lands, the rust recipes can drop the per-recipe
PATH= prefix the workaround used.
Two bugs caught by the first real concurrent dogfood (15 SDK build jobs
queued at once for clawdforge):
1. Concurrent fetch race: multiple jobs from the same project all called
`git fetch +refs/heads/*:refs/heads/*` simultaneously. Git refuses
to fetch into a local branch ref that another worktree has checked
out, so jobs after the first failed fast with:
    fatal: refusing to fetch into branch 'refs/heads/main' checked
    out at '/workspace/clawdforge/<other-job-id>'
2. Worktree-from-local-branch — `worktree add ... main` reserved the
local branch ref to the worktree, blocking subsequent fetches even
when the lock above wasn't held.
Fix:
- Per-project asyncio.Lock around the materialize() body (clone +
fetch + worktree-add). Different projects still parallelize; same
project serializes through the cache dir.
- Fetch into remote-tracking refs only:
    +refs/heads/*:refs/remotes/origin/*
Local refs/heads/* are never written, so no worktree can hold them.
- worktree add uses `--detach origin/<branch>` so each worktree is at
the remote-tracking ref. Multiple detached worktrees share the same
remote ref without conflict.
The recipe itself runs outside the lock, so concurrency=4 still
parallelizes the actual build/test work — only the brief
materialize step (clone or fetch + worktree create) serializes.
Tests still 6/6 green on test_runner.py (uses _StubWorkspace, not the
real WorkspaceManager, so it's unaffected — the real exercise is the
live re-queue against fixed code).
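The locking shape described above can be sketched as follows. This is not the real WorkspaceManager: `ProjectLocks`, `git_commands`, and `run_job` are hypothetical names, the sleeps stand in for the git and recipe work, and the argv lists are only built, never executed.

```python
import asyncio
from collections import defaultdict


class ProjectLocks:
    """One asyncio.Lock per project name, created lazily."""

    def __init__(self) -> None:
        self._locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    def for_project(self, project: str) -> asyncio.Lock:
        return self._locks[project]


def git_commands(branch: str, worktree_path: str) -> list[list[str]]:
    """Argv lists for the fetch + detached worktree steps (not executed here)."""
    return [
        ["git", "fetch", "origin", "+refs/heads/*:refs/remotes/origin/*"],
        ["git", "worktree", "add", "--detach", worktree_path, f"origin/{branch}"],
    ]


async def run_job(locks: ProjectLocks, project: str, job_id: str, events: list) -> None:
    async with locks.for_project(project):
        # Serialized per project: clone or fetch, then worktree add.
        events.append(("materialize-start", project, job_id))
        await asyncio.sleep(0.01)
        events.append(("materialize-end", project, job_id))
    # The recipe runs outside the lock, so builds still overlap.
    await asyncio.sleep(0.01)


async def demo() -> list:
    locks = ProjectLocks()
    events: list = []
    await asyncio.gather(
        run_job(locks, "clawdforge", "job-1", events),
        run_job(locks, "clawdforge", "job-2", events),
        run_job(locks, "other-project", "job-3", events),
    )
    return events


if __name__ == "__main__":
    for event in asyncio.run(demo()):
        print(event)
```

In the printed trace, the two clawdforge jobs never interleave their materialize steps, while the other project's job starts without waiting.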
Code-work prompts (read CVE/lint context, draft a unified diff that
verifies cleanly against the failing recipe) reward Opus's longer
context + careful reasoning. Cauldron-style high-frequency Sonnet calls
are unaffected — this only changes what crafting-table's patcher asks
clawdforge to use for its drafted-patch sessions.
Pairs with clawdforge dbbead2 which adds the optional `model` field on
POST /sessions and propagates ANTHROPIC_MODEL into the acpx subprocess
env on both create + turn.
Override knob: CRAFTING_PATCHER_MODEL env (e.g. "sonnet" when cost matters more than quality).
Tests: 16/16 patcher tests still green.
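A minimal sketch of how the override knob could be resolved on the crafting-table side. The default value "opus", the helper names, and the `prompt` field in the request body are assumptions for illustration; only the `CRAFTING_PATCHER_MODEL` variable and the optional `model` field on POST /sessions come from the change described above.

```python
import os
from typing import Mapping, Optional

# Assumption: the patcher defaults to an Opus model alias when unset.
DEFAULT_PATCHER_MODEL = "opus"


def patcher_model(env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the model name, honoring CRAFTING_PATCHER_MODEL if set."""
    env = os.environ if env is None else env
    value = env.get("CRAFTING_PATCHER_MODEL", "").strip()
    # Empty or whitespace-only values fall back to the default.
    return value or DEFAULT_PATCHER_MODEL


def session_request_body(prompt: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a hypothetical POST /sessions body carrying the model field."""
    return {"prompt": prompt, "model": patcher_model(env)}
```

Passing the environment as a parameter keeps the lookup testable without mutating `os.environ`.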
- Dockerfile: pip-install requirements.txt and copy crafting_table/ into
/app, switch CMD from /bin/bash to uvicorn server (port 8810). pip lands
in /usr/local/bin so the crafter user runs uvicorn without elevation.
- compose.yml: replace smoke.sh entrypoint with the API server command;
bind 192.168.0.5:8810:8810 (LAN-only); switch named volumes to real
Lucy appdata paths so /data + /workspace + /caches survive recreate.
env_file marked optional so a fresh checkout boots without copying
.env.example.
- README.md: tick steps 1-4 done, document API surface table, add
curl-based quickstart (mint token → register project → kick off job →
poll → stream log), and an architecture-notes section covering the
recipe-immutability snapshot, process-group SIGTERM/SIGKILL escalation,
WAL+single-writer trade-off, and the recipe-security stance.
Smoke remains runnable on demand:
    docker compose run --rm crafting-table /usr/local/bin/smoke.sh