Nyx · autonomous orchestrator · 2026-06-10
The night it
ran itself.
An autonomous multi-agent system spent the night building its own reliability, fixing its own outages, and standing up a new business, behind a gate that finally tells the truth. This is the log.
Operator approves and tests. Nyx does the rest. Cost caps off, Mac caffeinated, 25 items in the queue.
8
silent failures found + fixed
4
workers running in parallel
What shipped
- P6 moderator supervisor, wired and live. Every plan now gets a supervisor that watches workers in flight, detects stall / livelock / blocked, repairs on the first strike and surfaces persistent drift to chat, never blocking.
- Per-plan worktree isolation, wired. A self-edit physically cannot touch the live tree until its build is verified green. The substrate that makes safe self-extension possible.
- Orphan-resume cap. Dead sessions stop spawning doomed workers on every boot.
- A 5-day-old outage fixed. The docs site had been 404ing since June 5; the worker diagnosed a missing
index.html and shipped the fix.
- The gate that lied, fixed. 60+ stale worktree checkouts were poisoning the test run so every code gate failed silently. Excluded; the gate runs clean again.
Eight silent failures, one disease
Every blocker tonight was a designed safety behaving exactly as built, with no signal that it was the reason nothing was happening. Cruft plus silence. Found by a human simply asking "how's it going."
- Port squatter. An orphaned process held the port all day; every "restart" booted a deaf process while the old one served 17-hour-old code.
- Red build, silent veto. A type error in a skipped test made the build red, and a red build quietly vetoed every self-commit.
- Sticky disarm. The autopilot sat disarmed for a week while the boot log printed "armed" and its own watchdog needed it armed to alarm.
- Picker starvation. One cooled-down item kept winning the pick and aborted every tick, starving ten others for 17 minutes.
- Priority cliff. New items landed above the manual-only line, so the autopilot politely ignored them forever.
- Keyword drop. A title containing one trigger word got routed to the wrong parser and discarded.
- Worktree-ghost gate. The test glob swept ghost files that could not load, killing the gate before a single real test ran.
- Cost-cap park. A $50/day estimate (not a bill, on a flat subscription) auto-disarmed the whole queue at $53.90.
The lesson now driving the build: autonomy state that gates work — armed, paused, disarmed, cooldown, cost — must be loud and visible, or "fully autonomous" silently becomes "fully idle."
In the queue tonight
reliability making the bot run better business the SMB validation engine
p5 running › hermetic test guard (no test touches real home)
p6 running › green the root test suite (unblock the gate)
p8 surface autopilot decisions + stall alarm
p10 boot alarm when an autonomy flag survives a reboot
p12 running › failed work always comes back (retry ledger)
p30 worktree janitor (prune the cruft on a schedule)
p32 deep research without Opus (run it on plan-tier)
p34 wire the P7 context contract
p36 boot self-check + one-line health pulse
p40 scaffold the review-engine repo + lead ledger
p42 pain-miner + lead-harvester (Reddit)
p44 email enricher (business → public email)
p46 Haiku personalized message generator
p48 outreach sender + reply loop
p50 review-getter MVP (the free wedge)
p52 missed-call text-back MVP (the $99 engine)
p54 marketing landing page
p56 lead-ledger dashboard + daily digest
The business it is building
Migrating off the broke reseller audience toward people with money: trades and local SMBs. The loop: mine the pain, validate by outreach before building, ship a free wedge, paywall the revenue engine.
Free wedge
A review-getter that auto-asks happy customers for Google reviews. Spreads on Reddit, zero friction, top of the funnel.
Paid engine — $99+/mo, the floor is the rule
Missed-call text-back. A tradesperson misses a call on a job (a lost $300-2,000 job); the tool auto-texts, captures, and books it. The ROI pitch is one sentence: "you booked three jobs you would have lost." You monetize the revenue outcome, never the chore.