Fix BEE-ACCESS-PLAN: correct tunnel topology, remove wlp1s0f1 kill

Kayos 2026-03-22 07:37:03 -07:00
parent a67247df68
commit 726241e1e8


@ -3,201 +3,165 @@ _Updated 2026-03-22 — read this before touching the Bee_
---
## Network Topology (important)
```
Cobb's phone ──── Bee AP (wlp1s0f0, 192.168.0.10) ──── phone gets 192.168.0.x
ROUTING CONFLICT on phone
(both AP and zerocool are 192.168.0.0/24)
Bee WiFi client (wlp1s0f1) ──── zerocool ──── Lucy (192.168.0.5) ──── OpenClaw
```
**Key facts:**
- sshd on Bee binds ONLY to `192.168.0.10` (AP interface) — confirmed from recon
- The reverse tunnel OUTBOUND uses `wlp1s0f1` (zerocool) — do NOT kill this interface
- Routing conflict is on Cobb's PHONE (both AP and zerocool are 192.168.0.0/24)
- Killing wlp1s0f1 kills zerocool → kills the tunnel path → wrong
---
## Step 1 — Get Clean Access
**Physical requirement:** Be near the truck with your phone. ### 1a. Phone connects to Bee AP
Connect to `dashcam-4A928016A02C1046`, SSH to `root@192.168.0.10`
### 1a. Connect before zerocool does ### 1b. Start reverse tunnel FROM THE BEE immediately
Boot the Bee and **immediately** connect your phone to `dashcam-4A928016A02C1046` before it associates with zerocool. If zerocool connects first, we get routing hell again. Run this on the Bee before the routing conflict kicks you off:
### 1b. Set up reverse tunnel correctly
From the Bee (via your phone SSH session on the AP):
```bash
ssh -R 2222:192.168.0.10:22 -N -o StrictHostKeyChecking=no root@192.168.0.5 &
```
> **Critical:** must be `192.168.0.10:22` not `localhost:22` — sshd only binds to the AP interface. - Goes OUTBOUND via wlp1s0f1 (zerocool) to Lucy
- AP interface not involved in this path
- Even if your phone SSH session dies, tunnel stays alive
- Lucy will show `127.0.0.1:2222` listening
**Auth question to resolve:** What key does the Bee use to connect to Lucy? Either: **Auth requirement:** Bee needs a key that Lucy's `root` authorized_keys accepts.
- Check `/root/.ssh/` on the Bee for existing keys - Check what's in `/root/.ssh/` on the Bee
- Or add the Bee's pubkey to Lucy's `authorized_keys` during this session - If nothing: during the phone session, generate a key on Bee and add pubkey to Lucy's authorized_keys
### 1c. Verify tunnel from OpenClaw ### 1c. Verify from OpenClaw
Once tunnel is up, I'll verify with:
```bash ```bash
# Lucy should show port 2222 listening ssh lucy "ss -tlnp | grep 2222"
ss -tlnp | grep 2222 # Should show: LISTEN 0 128 127.0.0.1:2222
``` ```
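The listener check can also be scripted instead of read by eye. The following is a minimal sketch, assuming the output of `ss -tln` has been captured as text on Lucy; the function name `has_loopback_listener` and the sample line are illustrative and not part of the plan. It matches the local address exactly, so a port such as `22222` does not count as a hit the way it would with a plain `grep 2222`.

```python
def has_loopback_listener(ss_output: str, port: int = 2222) -> bool:
    """Return True if `ss -tln` style output shows LISTEN on 127.0.0.1:<port>."""
    target = f"127.0.0.1:{port}"
    for line in ss_output.splitlines():
        fields = line.split()
        # Columns: State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port, ...
        if len(fields) >= 4 and fields[0] == "LISTEN" and fields[3] == target:
            return True
    return False


# Illustrative sample line only, not captured from Lucy.
sample = 'LISTEN 0 128 127.0.0.1:2222 0.0.0.0:* users:(("sshd",pid=1234,fd=9))'
print(has_loopback_listener(sample))
```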
Then I connect via `127.0.0.1:2222` on Lucy.
### 1d. Optional: disconnect Bee from zerocool
To avoid routing conflict entirely, kill the WiFi client connection while we work:
```bash
ip link set wlp1s0f1 down
```
Re-enable after: `ip link set wlp1s0f1 up`
---
## Step 2 — Read-Only Recon (NO WRITES)
Once I'm in via the tunnel, I run these in order. Read only.
### 2a. Storage inventory ### 2a. Storage inventory
```bash ```bash
df -h df -h
du -sh /data/recording/*/ du -sh /data/recording/*/
ls -la /data/recording/ml_metadata/ | head -20 sqlite3 /data/odc-api.db ".tables"
ls -la /data/recording/unprocessed_framekm/ | head -5 sqlite3 /data/odc-api.db ".schema framekms"
sqlite3 /data/odc-api.db ".schema"
sqlite3 /data/odc-api.db "SELECT COUNT(*) FROM framekms;" sqlite3 /data/odc-api.db "SELECT COUNT(*) FROM framekms;"
sqlite3 /data/odc-api.db "SELECT * FROM framekms LIMIT 3;" ls /data/recording/ml_metadata/ | head -20
cat "/data/recording/ml_metadata/$(ls /data/recording/ml_metadata/ | head -1)" 2>/dev/null
``` ```
**Goal:** Understand how much data is stored and in what state.
### 2b. Redis key scan (live detections) ### 2b. Redis — find detection keys
```bash ```bash
redis-cli keys "*" redis-cli keys "*"
redis-cli type GNSSFusion30Hz # Look specifically for detection/landmark keys:
redis-cli zrevrange GNSSFusion30Hz 0 2
# Look for detection/landmark keys map-ai publishes to:
redis-cli keys "*landmark*" redis-cli keys "*landmark*"
redis-cli keys "*detection*" redis-cli keys "*detection*"
redis-cli keys "*sign*" redis-cli keys "*sign*"
redis-cli keys "*map*" redis-cli keys "*ai*"
# Check GNSSFusion30Hz (may not appear until Bee has been running a while):
redis-cli type GNSSFusion30Hz 2>/dev/null
redis-cli zrevrange GNSSFusion30Hz 0 2 2>/dev/null
``` ```
**Goal:** Find the exact Redis key(s) map-ai writes detections to.
### 2c. Read odc-api source — find detection key ### 2c. Find detection key in odc-api source
```bash ```bash
grep -i "landmark\|detection\|redis\|publish\|set\|zadd" /opt/odc-api/odc-api-bee.js | head -50 grep -in "redis\|landmark\|detection\|zadd\|rpush\|publish" /opt/odc-api/odc-api-bee.js | head -40
``` ```
**Goal:** Confirm exactly how odc-api reads detections from Redis so we know what key to poll.
### 2d. Read map-ai source — confirm write pattern ### 2d. Find detection key in map-ai source
```bash ```bash
grep -i "redis\|set\|zadd\|publish\|landmark\|detection" /opt/map-ai/map-ai.py | head -50
# Also check if there's a compiled version or if it's pure Python:
ls /opt/map-ai/ ls /opt/map-ai/
grep -in "redis\|landmark\|detection\|zadd\|rpush\|publish" /opt/map-ai/map-ai.py 2>/dev/null | head -40
``` ```
**Goal:** Confirm what Redis key map-ai writes detections to after inference.
### 2e. Check ml_metadata contents ### 2e. Frame storage format
```bash
ls -la /data/recording/ml_metadata/ | tail -20
# Look at a sample file:
cat $(ls /data/recording/ml_metadata/ | head -1)
```
**Goal:** Understand if detection metadata is also written to disk files (backup to Redis).
### 2f. Check frame storage
```bash ```bash
ls /tmp/recording/pics/ | head -5 ls /tmp/recording/pics/ | head -5
ls /tmp/recording/pics/ | wc -l ls /tmp/recording/pics/ | wc -l
# Filename format:
ls /tmp/recording/pics/ | head -1
``` ```
**Goal:** Confirm frame filename format for detection-to-image correlation.
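Once the filename format is known, matching a detection to a frame comes down to a nearest-timestamp lookup. The sketch below assumes, purely for illustration, that frames are named `<epoch_ms>.jpg`; the real format is exactly what this recon step is meant to establish, so the parsing line will likely need to change. The function name `nearest_frame` and the `max_gap_ms` tolerance are also hypothetical.

```python
import bisect


def nearest_frame(frame_names, detection_ts_ms, max_gap_ms=500):
    """Return the frame whose timestamp is closest to the detection, or None.

    ASSUMPTION: filenames look like '<epoch_ms>.jpg'. Adjust the parsing
    once the real format under /tmp/recording/pics/ is confirmed.
    """
    stamped = sorted(
        (int(name.rsplit(".", 1)[0]), name)
        for name in frame_names
        if name.rsplit(".", 1)[0].isdigit()  # skip names that do not parse
    )
    if not stamped:
        return None
    times = [ts for ts, _ in stamped]
    i = bisect.bisect_left(times, detection_ts_ms)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = stamped[max(i - 1, 0): i + 1]
    best_ts, best_name = min(candidates, key=lambda p: abs(p[0] - detection_ts_ms))
    return best_name if abs(best_ts - detection_ts_ms) <= max_gap_ms else None


frames = ["1709920000000.jpg", "1709920000100.jpg", "notes.txt"]
print(nearest_frame(frames, 1709920000080))
```

Returning `None` when the gap exceeds the tolerance avoids attaching an unrelated frame to a detection after the frame has been rotated out of `/tmp`.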
### 2g. Check existing SSH keys on Bee ### 2f. SSH keys on Bee
```bash ```bash
ls -la /home/root/.ssh/ 2>/dev/null || ls -la /root/.ssh/ 2>/dev/null ls -la /root/.ssh/ 2>/dev/null
cat /root/.ssh/authorized_keys 2>/dev/null cat /root/.ssh/authorized_keys 2>/dev/null
ls /root/.ssh/id_* 2>/dev/null ls /root/.ssh/id_* 2>/dev/null
``` ```
**Goal:** Know what keys exist for tunnel auth and for our post-liberation access.
### 2h. Check service file for map-ai dependency ### 2g. map-ai service file
```bash ```bash
cat /lib/systemd/system/map-ai.service 2>/dev/null || \
systemctl cat map-ai.service systemctl cat map-ai.service
``` ```
**Goal:** Confirm the `Requires=odc-api.service` line so we know what to override in the drop-in.
---
## Step 3 — Decisions Based on Recon
After recon, we decide: | Finding | Decision |
|---------|----------|
### 3a. Detection key confirmed? | Detection Redis key found | Forwarder polls Redis directly |
- **Yes:** Write forwarder to poll that Redis key directly | No detection Redis key | Use ml_metadata files or poll odc-api at low frequency |
- **No Redis key found:** Use ml_metadata files OR keep polling odc-api endpoints (low frequency, not localhost) | ml_metadata has detection files | Primary source, tail by mtime |
| Large stored framekm backlog | Plan backfill to ADAMaps before liberation |
### 3b. ml_metadata has useful files?
- **Yes:** Primary source for detections — tail by mtime, parse directly
- **No:** Redis is the only path
### 3c. How much data is stored?
- Estimate backfill time/volume to ADAMaps
- Decide if we do a one-time backfill before liberation or after
---
## Step 4 — Liberation Plan (v0.6) ## Step 4 — Liberation (v0.6)
Based on recon findings, update `liberate-v0.5.sh` to `v0.6`: ### Kill list
### Kill list (services to stop + disable)
``` ```
hivemapper-data-logger ← the uploader, MUST kill hivemapper-data-logger ← MUST — the uploader
mitmproxy ← Hivemapper proxy, MUST kill mitmproxy ← MUST — Hivemapper proxy
beekeeper-plugin ← Hivemapper telemetry/HW comms, kill beekeeper-plugin ← kill — Hivemapper telemetry
here-plugin ← HERE Maps integration, kill here-plugin ← kill — HERE Maps
mender-client ← OTA update client, kill (recovery via USB still works) mender-client ← kill — OTA updates (USB recovery still works)
odc-api ← Node.js REST layer, kill (we read from Redis/files directly) odc-api ← kill — Node.js bloat (we read Redis/files directly)
lte.service ← Kill LTE upload path (no SIM = irrelevant, but block anyway)
``` ```
### Keep list (services that stay running) ### Keep list
``` ```
redis ← IPC backbone, keep redis ← IPC backbone
depthai_gate ← Camera hardware init, keep depthai_gate ← Camera hardware init
map-ai ← ML inference (sign detection), keep ← THIS IS THE VALUE map-ai ← ML inference — THIS IS THE VALUE
jpeg-recorder ← Frame storage, keep jpeg-recorder ← Frame storage
video-processor ← Frame pipeline, keep video-processor ← Frame pipeline
RedisHandler ← Sensor fusion, keep RedisHandler ← Sensor fusion
datalogger ← GPS/IMU logging, keep datalogger ← GPS/IMU logging to SQLite
hostapd ← AP, keep (how we connect) hostapd / dnsmasq ← AP + DHCP
dnsmasq ← DHCP on AP, keep
``` ```
### New service to install: adacam-forwarder (rewritten) ### New service: adacam-forwarder (rewritten)
- Polls detection Redis key for new entries
- Grabs JPEG from `/tmp/recording/pics/` by timestamp
- POSTs to ADAMaps `/api/ingest` + `/api/images`
- Tracks state in `/data/adacam/forwarder-state.json`
- Runs every 30s, pure Python, minimal CPU
Lightweight Python service that: ### systemd drop-in (removes odc-api dep from map-ai)
1. Polls the detection Redis key (found in step 2b/2c) for new entries since last ID
2. Grabs corresponding JPEG from `/tmp/recording/pics/` by timestamp match
3. POSTs to ADAMaps `/api/ingest` with correct payload:
```json
{
"device_id": "bee-{SERIAL}",
"detections": [{
"ts": 1709920000000,
"lat": 34.05357,
"lon": -118.24545,
"class_label": "speed_limit_35",
"overall_confidence": 0.88
}]
}
```
4. Uploads image via `POST /api/images` (multipart)
5. Tracks last processed ID in `/data/adacam/forwarder-state.json`
6. Runs every 30s — low overhead, no Node.js
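The state tracking and payload steps of the forwarder could look roughly like this. It is a sketch under stated assumptions, not the forwarder itself: the payload field names come from the JSON example above, but the per-entry `id` field, the `last_id` key in the state file, and all function names are hypothetical, and the Redis read and HTTP POST are left out because the detection key is not yet known.

```python
import json
import os
import tempfile

PAYLOAD_FIELDS = ("ts", "lat", "lon", "class_label", "overall_confidence")


def load_last_id(state_path):
    """Read the last processed ID; treat a missing or corrupt file as 0."""
    try:
        with open(state_path, "r", encoding="utf-8") as fh:
            return int(json.load(fh).get("last_id", 0))
    except (FileNotFoundError, ValueError):
        return 0


def save_last_id(state_path, last_id):
    """Write state atomically so a power cut cannot leave a half-written file."""
    directory = os.path.dirname(state_path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"last_id": last_id}, fh)
    os.replace(tmp_path, state_path)  # atomic rename on the same filesystem


def build_payload(serial, entries, last_id):
    """Return (payload, new_last_id) for entries newer than last_id.

    ASSUMPTION: each entry is a dict carrying a monotonically increasing 'id'.
    """
    fresh = [e for e in entries if e["id"] > last_id]
    detections = [{k: e[k] for k in PAYLOAD_FIELDS} for e in fresh]
    new_last_id = max((e["id"] for e in fresh), default=last_id)
    return {"device_id": f"bee-{serial}", "detections": detections}, new_last_id
```

Saving the new ID only after a successful POST would mean a failed upload is retried on the next 30s cycle rather than silently dropped.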
### systemd drop-in for map-ai (removes odc-api dep)
```ini ```ini
# /etc/systemd/system/map-ai.service.d/override.conf # /etc/systemd/system/map-ai.service.d/no-odc-api.conf
[Unit] [Unit]
Requires=redis.service Requires=redis.service
# Remove: Requires=odc-api.service # Caveat: drop-ins can only add dependencies; this does not remove Requires=odc-api.service from the base unit (that needs a full unit override)
``` ```
### SSH key installation ### SSH keys
Drop OpenClaw pubkey to `/root/.ssh/authorized_keys`: Drop to `/root/.ssh/authorized_keys`:
```
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOQxwJU91TCxds34P18D3xRbu7rxlrgTUoml/H8nxeDK kayos@openclaw
```
### Domain blocks (append to /etc/hosts) ### Domain blocks (/etc/hosts)
``` ```
0.0.0.0 data.api.hivemapper.com 0.0.0.0 data.api.hivemapper.com
0.0.0.0 api.hivemapper.com 0.0.0.0 api.hivemapper.com
@ -205,30 +169,26 @@ ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOQxwJU91TCxds34P18D3xRbu7rxlrgTUoml/H8nxeDK
0.0.0.0 direct.data.api.platform.here.com 0.0.0.0 direct.data.api.platform.here.com
0.0.0.0 account.api.here.com 0.0.0.0 account.api.here.com
0.0.0.0 mender.io 0.0.0.0 mender.io
0.0.0.0 s3.amazonaws.com
``` ```
### What we do NOT touch ### Do NOT touch
- `/etc/ssh/sshd_config` — no changes, password auth stays - `/etc/ssh/sshd_config`
- AP config (`/var/hostapd.conf`) — no changes - AP config / SSID / IP
- IP (`192.168.0.10`) — stays forever - Firewall (no rules yet)
- Firewall — no changes yet
---
## Step 5 — Test Before Commit ## Step 5 — Test
Before calling liberation complete: 1. `redis-cli get MAP_AI_READY` → should be "True"
1. Verify `map-ai` still starts and `MAP_AI_READY` appears in Redis 2. Check forwarder log → detections posting to ADAMaps
2. Verify our forwarder receives detections and posts successfully to ADAMaps 3. Verify no Hivemapper traffic (`nmap` or check `/etc/hosts` working)
3. Verify `depthai-device-kb` process still spawns (ML inference running)
4. Check `/data/adacam/forwarder-state.json` updating 4. Check `/data/adacam/forwarder-state.json` updating
5. Confirm no Hivemapper upload traffic (check hosts block is working)
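The hosts-block check in this list can be done mechanically. The following is a minimal sketch, assuming the contents of `/etc/hosts` have been read into a string; the function name `unblocked_domains` and the sample text are illustrative. It reports which of the expected domains are not mapped to `0.0.0.0`.

```python
def unblocked_domains(hosts_text, domains):
    """Return the domains from `domains` that hosts_text does not map to 0.0.0.0."""
    blocked = set()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        if address == "0.0.0.0":
            blocked.update(names)
    return [d for d in domains if d not in blocked]


# Illustrative sample, not the Bee's actual /etc/hosts.
sample_hosts = """127.0.0.1 localhost
0.0.0.0 api.hivemapper.com  # blocked
0.0.0.0 mender.io
"""
expected = ["api.hivemapper.com", "mender.io", "data.api.hivemapper.com"]
print(unblocked_domains(sample_hosts, expected))
```

Hosts entries match exact names only, so an entry for a parent domain does not cover its subdomains; each hostname the services actually contact needs its own line.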
---
## Step 6 — Build bee-tunnel.service (permanent tunnel) ## Step 6 — bee-tunnel.service (permanent remote access)
After liberation, install a persistent reverse tunnel service so we never need physical access again:
```ini ```ini
[Unit] [Unit]
@ -238,7 +198,8 @@ Wants=network-online.target
[Service] [Service]
Type=simple Type=simple
ExecStart=/usr/bin/ssh -N -R 2222:192.168.0.10:22 \ ExecStart=/usr/bin/ssh -N \
-R 2222:192.168.0.10:22 \
-o StrictHostKeyChecking=no \ -o StrictHostKeyChecking=no \
-o ServerAliveInterval=30 \ -o ServerAliveInterval=30 \
-o ServerAliveCountMax=3 \ -o ServerAliveCountMax=3 \
@ -250,18 +211,20 @@ RestartSec=30
[Install] [Install]
WantedBy=multi-user.target WantedBy=multi-user.target
``` ```
Goes outbound via wlp1s0f1 (zerocool) — AP interface not involved.
---
## Known Issues / Gotchas ## Known Issues
| Issue | Notes | | Issue | Notes |
|-------|-------| |-------|-------|
| sshd binds to `192.168.0.10` only | Never use `localhost:22` in tunnel | | sshd binds to 192.168.0.10 only | Reverse tunnel must use `192.168.0.10:22` not `localhost:22` |
| depthai-device-kb runs at 98% CPU | Normal — that's the VPU doing ML inference | | Routing conflict on phone | AP + zerocool both 192.168.0.0/24 — start tunnel BEFORE conflict kicks you |
| rngd at 20% CPU | Suspicious — investigate if it's needed | | wlp1s0f1 must stay UP | It's the zerocool path — tunnel dies without it |
| Redis is localhost:6379 only | Need to be on Bee to query it | | depthai-device-kb at 98% CPU | Normal — VPU running ML inference |
| GNSSFusion30Hz not in recon redis-keys | Recon was only 5min post-boot — key appears later | | rngd at 20% CPU | Investigate — may be killable |
| map-ai Requires=odc-api in systemd | Must add drop-in override before killing odc-api | | Redis localhost:6379 only | Must be on Bee or tunneled in to query |
| ml_metadata limited to 20MB | Small — Redis is likely primary detection source | | GNSSFusion30Hz absent at boot | Appears after GNSS warms up (~5-10 min) |
| Lots of unprocessed data on disk | Backfill to ADAMaps before or after liberation TBD | | Bee has stored framekm backlog | Plan backfill to ADAMaps |
| Bee-to-Lucy auth key unknown | Resolve during first phone session |