v0.1 wave 2A (steps 5+6): per-language parsers + findings extraction

- parsers/ package: rust / python / go / typescript / generic
- parser registry with language+recipe -> fallback resolution
- fingerprint hash (kind+file+line+code) for cross-run dedup
- runner.py post-exec hook: parse log, persist findings, count on job row
  (extraction runs before mark_job_finished so callers polling on terminal
  status see findings_count populated atomically)
- db.insert_finding / list_findings / increment_findings_count DAOs already
  shipped in wave 1; wired here
- GET /jobs/{id}/findings now returns real data (server route already
  shipped; was returning empty list because nothing populated the table)
- tests/test_parsers/: 6 modules + 11 fixtures (rust/python/go/typescript)
- tests/test_runner_findings.py: 3 integration tests
- README: tick steps 2-6, add Findings section

Suite: 108 passing (62 wave-1 + 46 new).
Spec: memory/spec-crafting-table.md
Kayos 2026-04-29 08:32:56 -07:00
parent 98306ca2e0
commit d467b2f5be
30 changed files with 1968 additions and 5 deletions


@ -15,14 +15,14 @@ through clawdforge.
Spec: `Sulkta-Coop/openclaw-workspace/memory/spec-crafting-table.md` (LAN-only).
## Status — v0.1
## Status — v0.1 step 6 of 10
- [x] Step 1: Dockerfile + per-language smoke
- [x] Step 2: SQLite ledger + project registry
- [x] Step 3: HTTP API skeleton (FastAPI, port 8810)
- [x] Step 4: Job runner core (asyncio worker pool, git worktree, subprocess)
- [ ] Step 5: Per-language parsers (Rust / Python / Go / TS first)
- [ ] Step 6: Findings extraction + storage
- [x] Step 5: Per-language parsers (Rust / Python / Go / TS first)
- [x] Step 6: Findings extraction + storage
- [ ] Step 7: MCP server (stdio JSON-RPC, 8 tools)
- [x] Step 8: Email digest scheduler
- [ ] Step 9: Autonomous patch loop (clawdforge integration)
@ -70,7 +70,7 @@ override via `CRAFTING_LAN_CIDRS`.
| GET | `/jobs?project=&status=&limit=` | any | List own (or all if admin) |
| GET | `/jobs/{id}` | owner | State + last 200 log lines |
| GET | `/jobs/{id}/log` | owner | Full log (file stream) |
| GET | `/jobs/{id}/findings` | owner | Structured findings (wave 1: empty) |
| GET | `/jobs/{id}/findings` | owner | Structured findings (see Findings) |
Cross-token access returns **404, not 403** — same existence-leak guard as
clawdforge sessions.
@ -222,6 +222,65 @@ pip install -e '.[test]'
pytest tests/
```
## Findings
After every job, the runner reads the captured log and hands it to a
per-language parser. The parser turns native tool output (clippy JSON,
ruff JSON, govulncheck NDJSON, eslint JSON, human-readable tsc errors,
etc.) into structured rows in the `findings` table.
### Parsers in v0.1
| Language | Recipes parsed | Tool output expected |
|-------------|------------------------------|------------------------------------------------------|
| rust | audit, lint, test, build | `cargo audit --json`, `cargo clippy --message-format=json`, `cargo test` (human) |
| python | audit, lint, test, build | `pip-audit -f json`, `ruff --output-format=json`, `mypy --output=json`, `pytest --tb=line` |
| go | audit, lint, build, test | `govulncheck -json`, `go vet -json` |
| typescript | lint, build, test, audit | `eslint -f json`, `tsc --noEmit` (stderr) |
| javascript | (alias of typescript) | same |
| _other_ | (any) | falls back to `GenericParser` — emits one `recipe_fail` row when exit_code != 0, else nothing |
Resolution order in the registry: exact `language` match (parsers
self-gate on recipe via `Parser.matches`), then `GenericParser`. Adding a
new language is a single file in `crafting_table/parsers/` plus an entry
in `PARSERS`.
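
The resolution order above can be sketched roughly as follows. This is a minimal illustration, not the code in `crafting_table/parsers/`: only the names `Parser.matches`, `GenericParser`, and `PARSERS` come from this README, while the method signatures, the dict shape of `PARSERS`, the `recipes` attribute, and the `resolve_parser` helper are assumptions made for the example.

```python
class Parser:
    """Hypothetical base class; only the name `Parser.matches` is from the README."""

    language = ""
    recipes = frozenset()

    def matches(self, recipe):
        # Self-gate: a parser only claims the recipes it knows how to read.
        return recipe in self.recipes

    def parse(self, log, exit_code, recipe):
        return []


class RustParser(Parser):
    language = "rust"
    recipes = frozenset({"audit", "lint", "test", "build"})


class GenericParser(Parser):
    def matches(self, recipe):
        return True

    def parse(self, log, exit_code, recipe):
        # One recipe_fail row on a non-zero exit, otherwise nothing.
        if exit_code == 0:
            return []
        return [{
            "kind": "recipe_fail",
            "code": f"exit_{exit_code}",
            "message": f"recipe '{recipe}' exited with status {exit_code}",
        }]


# Assumed shape: language name -> parser instance.
PARSERS = {"rust": RustParser()}
_GENERIC = GenericParser()


def resolve_parser(language, recipe):
    """Exact language match that also accepts the recipe, else the generic fallback."""
    candidate = PARSERS.get(language)
    if candidate is not None and candidate.matches(recipe):
        return candidate
    return _GENERIC
```

Under these assumptions, `resolve_parser("rust", "lint")` returns the Rust parser, while an unknown language or a recipe the Rust parser does not claim falls through to `GenericParser`.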
### Finding kinds
- `lint` — clippy / ruff / mypy / eslint / tsc / go vet diagnostic.
- `cve` — vulnerability from cargo-audit / pip-audit / govulncheck. Carries
`code` = advisory id (RUSTSEC-..., GO-..., PYSEC-...) and a
`suggested_fix` of the form "bump <pkg> to <version>" when patched
versions are known.
- `test_fail` — failed test name extracted from `cargo test` /
`pytest` output.
- `recipe_fail` — fallback when no language-specific parser fired and the
recipe exited non-zero. `code` = `exit_<n>`, message names the recipe.
### Fingerprints + dedup
Every finding row carries a 16-char `fingerprint` hash over
`kind|file|line|code` (NOT the message — tool wording drifts). The same
lint reappearing across nightly runs produces the same fingerprint, so a
later wave can dedup digest output and surface only "new since last run."
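
A minimal sketch of how such a fingerprint and the "new since last run" filter could work. The README only states that the hash is 16 characters over `kind|file|line|code`; the choice of SHA-256 truncated to 16 hex characters, the handling of missing fields, and both function names are assumptions for illustration.

```python
import hashlib


def fingerprint(kind, file, line, code):
    """Hash kind|file|line|code; the message is deliberately left out."""
    parts = [
        kind,
        file or "",
        "" if line is None else str(line),
        code or "",
    ]
    key = "|".join(parts)
    # Assumption: SHA-256 hex digest, truncated to 16 characters.
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def new_since_last_run(previous, current):
    """Return findings in `current` whose fingerprint was not in `previous`."""
    seen = {f["fingerprint"] for f in previous}
    return [f for f in current if f["fingerprint"] not in seen]
```

Because the message is not part of the key, a tool upgrade that rewords a diagnostic leaves the fingerprint unchanged, while the same diagnostic moving to another line produces a different one.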
### Consuming findings
```
GET /jobs/{job_id}/findings
→ {"ok": true, "findings": [
{"id": 1, "job_id": "...", "kind": "lint", "severity": "warn",
"file": "src/app.py", "line": 3, "code": "F401",
"message": "...", "suggested_fix": "...", "fingerprint": "...",
"raw_json": "{...}", "created_at": ...},
...
]}
```
Authorization is project-token-scoped (same model as `/jobs/{id}`). The
matching `job` row's `findings_count` mirrors the array length so
callers can decide whether to fetch the detail.
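
A client could consume the endpoint along these lines. This is a sketch using only the Python standard library; the `Authorization: Bearer` header scheme and both function names are assumptions, since this README does not specify how the project token is sent.

```python
import json
import urllib.request
from collections import Counter


def fetch_findings(base_url, job_id, token):
    """GET /jobs/{job_id}/findings and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{base_url}/jobs/{job_id}/findings",
        # Assumption: the token travels as a bearer token.
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def count_by_kind(payload):
    """Summarise a findings response as {kind: count}; empty if not ok."""
    if not payload.get("ok"):
        return {}
    return dict(Counter(f["kind"] for f in payload.get("findings", [])))
```

For example, `count_by_kind(fetch_findings("http://localhost:8810", job_id, token))` would give a per-kind tally for one job, assuming the API is reachable on the port listed under Status (8810) and the host is local.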
## Digest
Daily 06:00 PT email digest. One message per project per day; aggregates the