Commit graph

3 commits

Author SHA1 Message Date
3578c9321b workspace: explicit fetch right after bare clone (populates remote-tracking refs that --bare doesn't) 2026-04-29 13:58:28 -07:00
129d630391 workspace: per-project lock + fetch into remote-tracking refs
Two bugs caught by the first real concurrent dogfood (15 SDK build jobs
queued at once for clawdforge):

1. Concurrent fetch race — multiple jobs from same project all called
   `git fetch +refs/heads/*:refs/heads/*` simultaneously. Git refuses
   to fetch into a local branch ref that another worktree has checked
   out, so jobs after the first failed fast with:
       fatal: refusing to fetch into branch 'refs/heads/main' checked
       out at '/workspace/clawdforge/<other-job-id>'

2. Worktree-from-local-branch — `worktree add ... main` reserved the
   local branch ref to the worktree, blocking subsequent fetches even
   when no other fetch was running.

Fix:
- Per-project asyncio.Lock around the materialize() body (clone +
  fetch + worktree-add). Different projects still parallelize; same
  project serializes through the cache dir.
- Fetch into remote-tracking refs only:
    +refs/heads/*:refs/remotes/origin/*
  Local refs/heads/* are never written, so no worktree can hold them.
- worktree add uses `--detach origin/<branch>` so each worktree checks
  out the remote-tracking ref's commit with a detached HEAD. Multiple
  detached worktrees can share the same remote ref without conflict.
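
As a rough illustration of the two git invocations described in the fix, the sketch below builds the argument lists in Python. The helper names (`fetch_argv`, `worktree_add_argv`), the `cache_dir` and `worktree_path` parameters, and the use of `git -C` are assumptions made for this example; the actual workspace.py may assemble its commands differently.

```python
from typing import List

# Refspec from the commit message: remote branches land under
# refs/remotes/origin/*, so local refs/heads/* are never written.
REMOTE_TRACKING_REFSPEC = "+refs/heads/*:refs/remotes/origin/*"


def fetch_argv(cache_dir: str) -> List[str]:
    """Argv for fetching into remote-tracking refs of the bare cache clone."""
    return ["git", "-C", cache_dir, "fetch", "origin", REMOTE_TRACKING_REFSPEC]


def worktree_add_argv(cache_dir: str, worktree_path: str, branch: str) -> List[str]:
    """Argv for a detached worktree at origin/<branch> (no local branch reserved)."""
    return [
        "git", "-C", cache_dir,
        "worktree", "add", "--detach",
        worktree_path, f"origin/{branch}",
    ]
```

Because neither command names a local branch, no worktree ends up holding `refs/heads/<branch>`, which is the condition that triggered the "refusing to fetch into branch" error.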

The recipe itself runs outside the lock, so concurrency=4 still
parallelizes the actual build/test work — only the brief
materialize step (clone or fetch + worktree create) serializes.

Tests still 6/6 green on test_runner.py (uses _StubWorkspace, not the
real WorkspaceManager, so it's unaffected — the real exercise is the
live re-queue against fixed code).
2026-04-29 13:56:44 -07:00
0ec3a04676 v0.1 wave 1 (steps 2+3+4): SQLite ledger + FastAPI skeleton + async job runner
- db.py: migrations + DAOs for tokens / projects / jobs / findings (SQLite WAL)
- auth.py: SHA-256 bearer hashing + LAN-CIDR allowlist + admin/app token tiers
- models.py: Pydantic shapes (Project, Subproject, Schedule, Notify, Job, CreateJobRequest)
- server.py: FastAPI on port 8810; /healthz, /admin/tokens/*, /projects/*, /jobs, /jobs/{id}, /jobs/{id}/log, /jobs/{id}/findings
- runner.py: bounded asyncio pool, per-job timeout with process-group SIGTERM→SIGKILL escalation, orphaned-job recovery on boot
- workspace.py: bare-clone + worktree materialization, gc
- config.py: env-driven
- 62 tests across db / auth / projects / jobs / runner / e2e — all green
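
For the auth.py entry in the list above, a minimal sketch of SHA-256 token hashing plus a CIDR allowlist check might look like this. The function names and the constant-time comparison via `hmac.compare_digest` are assumptions of this example; the commit does not show how auth.py actually stores or compares hashes.

```python
import hashlib
import hmac
import ipaddress
from typing import Iterable


def hash_token(token: str) -> str:
    """Hex SHA-256 digest of a bearer token; only this value is stored."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def token_matches(presented: str, stored_hash: str) -> bool:
    """Compare hashes in constant time to avoid leaking prefix matches."""
    return hmac.compare_digest(hash_token(presented), stored_hash)


def ip_allowed(client_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """True if client_ip falls inside any allowlisted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```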

Cross-token project access returns 404 (not 403) — existence-leak guard.
Bearer tokens hashed at rest; admin token bootstrapped on first boot.
Recipe subprocess uses start_new_session=True so killpg targets the
whole process tree on timeout — child processes can't escape SIGKILL.
Pump task guarded with wait_for(2s) + cancel fallback against any
orphan that survives the group kill.
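
The timeout handling described above can be sketched as follows (POSIX only). `run_with_timeout` and its `grace` parameter are hypothetical; this shows the general SIGTERM then SIGKILL escalation against a process group, not the exact code in runner.py, and it omits the log pump task.

```python
import asyncio
import os
import signal
from typing import Sequence


async def run_with_timeout(argv: Sequence[str], timeout: float, grace: float = 5.0) -> int:
    """Run argv in its own session; on timeout, signal the whole group."""
    # start_new_session=True makes the child a session and group leader,
    # so its process-group id equals its pid.
    proc = await asyncio.create_subprocess_exec(*argv, start_new_session=True)
    try:
        return await asyncio.wait_for(proc.wait(), timeout)
    except asyncio.TimeoutError:
        pass

    # Escalate: polite SIGTERM first, then SIGKILL if the grace period expires.
    for sig, wait_s in ((signal.SIGTERM, grace), (signal.SIGKILL, None)):
        try:
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            break  # the group is already gone
        try:
            return await asyncio.wait_for(proc.wait(), wait_s)
        except asyncio.TimeoutError:
            continue
    return await proc.wait()
```

A negative return code reports the signal that ended the recipe, for example `-signal.SIGTERM` when the first signal was enough.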

Wave 2 (parsers + findings extraction + MCP + email digest) pending.

Spec: memory/spec-crafting-table.md
2026-04-29 08:17:41 -07:00