OpenClaw 3.31 shipped with a trio of production-killing bugs: gateways hung silently, approval workflow state was wiped on every restart, and the config subsystem entered a restart loop. The team cut an emergency 4.1 release within days.
OpenClaw 3.31's release broke three critical subsystems simultaneously. First, the gateway would enter a silent hang — no error log, no health-check failure, just no responses. Operators only noticed when downstream agents stopped producing output. Second, the approval workflow stored pending approvals in process memory instead of disk, so every restart wiped the queue; users would approve an action, the gateway would restart (for any reason), and the approval would vanish. Third, the config subsystem had a bootstrap dependency loop: on start it tried to read a config field that only existed after the first successful start, triggering a restart loop that only broke when someone manually created the field. Emergency 4.1 was cut within 4 days, adding /tasks and SearXNG search as consolation.
The Actual Culprit
Three independent regressions shipped in one release because there was no smoke-test pass against a clean install, a restart scenario, or a non-trivial approval workflow.
A process can be 'up' while its event loop is dead. Health checks must probe actual request-handling capability.
Any queue that doesn't survive a restart is effectively a memory leak. Treat 'what happens on restart' as a release gate.
Loading comments...