Failure-injection recipes
These are the standard scenarios every bot must pass before it can leave shadow. The chaos suite runs them nightly in staging and weekly in production-canary.
Five standard scenarios
- Stale market data. Freeze the CLOB book channel for the bot's longest staleness threshold + 5 seconds. Expect: HARD_REJECT with STALE_MARKET_DATA on every decision; no silent approvals.
- Dependency outage. Hard-stop a declared dependency (PortfolioGuard, KillSwitch, etc.) for 2 minutes. Expect: bot enters safe-fallback (per its Failure-mode block); no decisions emitted that depend on the missing service.
- Schema drift. Inject a payload with an extra unknown field and a renamed-but-similar field on every input. Expect: bot tolerates the extra field (forward-compatible) but rejects the renamed field with a clear schema error.
- Time-skew. Skew local clock by ±90 seconds vs WebSocket server clock. Expect: bot uses server-time when computing staleness; alerts on local-vs-server skew > 30s.
- Replay storm. Replay the same intent 1000× within 5s. Expect: idempotency holds — exactly one effect per intent_id; no duplicated orders; no duplicated GovernanceLog entries.
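The time-skew and replay-storm expectations above can be written as plain assertions. The following is a minimal sketch, not the chaos suite's actual harness: is_stale, skew_alert and IdempotentExecutor are hypothetical names introduced for illustration, timestamps are plain floats in seconds, and the 5-second pacing of the storm is left out.

```python
from dataclasses import dataclass, field

SKEW_ALERT_SECONDS = 30.0  # alert threshold taken from the time-skew scenario


def is_stale(last_update_server_ts: float, server_now: float, threshold_s: float) -> bool:
    # Staleness is computed purely from server timestamps,
    # so a skewed local clock cannot change the result.
    return (server_now - last_update_server_ts) > threshold_s


def skew_alert(local_now: float, server_now: float) -> bool:
    # Alert when local and server clocks disagree by more than 30 s.
    return abs(local_now - server_now) > SKEW_ALERT_SECONDS


@dataclass
class IdempotentExecutor:
    """Hypothetical stand-in for a bot's order path (illustration only)."""
    seen: set = field(default_factory=set)
    orders: list = field(default_factory=list)
    governance_log: list = field(default_factory=list)

    def submit(self, intent_id: str, payload: dict) -> bool:
        # A repeated intent_id produces no order and no log entry.
        if intent_id in self.seen:
            return False
        self.seen.add(intent_id)
        self.orders.append((intent_id, payload))
        self.governance_log.append(f"ACCEPTED {intent_id}")
        return True


def replay_storm(executor: IdempotentExecutor, intent_id: str, payload: dict, times: int = 1000) -> int:
    """Replay one intent `times` times; return how many submissions took effect."""
    return sum(executor.submit(intent_id, payload) for _ in range(times))


# Time-skew: local clock is 90 s ahead, data is 2 s old in server time.
server_now = 1_000.0
assert not is_stale(998.0, server_now, threshold_s=5.0)
assert skew_alert(server_now + 90.0, server_now)
assert skew_alert(server_now - 90.0, server_now)

# Replay storm: 1000 replays, exactly one effect.
executor = IdempotentExecutor()
assert replay_storm(executor, "intent-42", {"side": "BUY", "size": 10}) == 1
assert len(executor.orders) == 1
assert len(executor.governance_log) == 1
```

A real test would drive the bot through its actual input channels; the point here is only that each "Expect" clause maps to a concrete assertion.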
Bot-specific scenarios
Each bot lists its own additional scenarios in the Failure-injection recipes section of its page. Examples:
- OracleRiskMonitor: Inject a UMA proposal with a too-low bond ($500 pUSD instead of $750); expect ORACLE_PROPOSER_BOND_BELOW_MIN and pause on the affected market.
- SmartRouter: Mid-route, swap the tick size from 0.001 to 0.01; expect SMART_ROUTER_TICK_SIZE_CHANGED + replan.
- BuilderAttribution: Send a maker-side fill with a different builderCode than the taker-side; expect both attributed correctly per V2 rules.
- ContractAddressGuard: Submit a request to a V1 contract address; expect CONTRACT_ADDRESS_NOT_ALLOWED with V2 hint.
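As an illustration of how a bot-specific scenario turns into a test, here is a rough sketch of the OracleRiskMonitor bond case. It assumes that the $750 figure in the example is the minimum acceptable proposer bond; check_proposal, the "OK" return value and the paused-market set are hypothetical and do not describe the bot's real interface.

```python
from decimal import Decimal

# Assumption: the scenario's $750 figure is the minimum acceptable proposer bond.
MIN_PROPOSER_BOND_PUSD = Decimal("750")


def check_proposal(market_id: str, bond_pusd: Decimal, paused_markets: set) -> str:
    """Return a reason code; pause the market when the bond is below the minimum."""
    if bond_pusd < MIN_PROPOSER_BOND_PUSD:
        paused_markets.add(market_id)
        return "ORACLE_PROPOSER_BOND_BELOW_MIN"
    return "OK"  # hypothetical pass-through code


paused = set()

# Injection: a proposal with a $500 pUSD bond.
assert check_proposal("market-a", Decimal("500"), paused) == "ORACLE_PROPOSER_BOND_BELOW_MIN"
assert "market-a" in paused

# Control: a proposal at the minimum is accepted and nothing is paused.
assert check_proposal("market-b", Decimal("750"), paused) == "OK"
assert "market-b" not in paused
```

Pairing each injection with a control case keeps the test from passing on a bot that simply rejects everything.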
Authoring rules
- Each scenario must have a deterministic injection (seeded RNG, fixed payload).
- Each scenario must have a measurable expected behaviour — an assertion, not a vibe.
- Recovery must be tested too: after the injection ends, the bot must return to normal within its declared recovery window.
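The three rules can be sketched in a few lines. This is an illustration under stated assumptions rather than the suite's real API: build_injection and recovered_within are hypothetical helpers, the payload fields are made up, and health is modelled as a time-sorted list of (timestamp, healthy) samples.

```python
import random


def build_injection(seed: int, n: int = 5) -> list:
    """Build a fixed payload sequence from a seeded RNG (deterministic injection)."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    return [{"seq": i, "delay_ms": rng.randint(0, 500)} for i in range(n)]


def recovered_within(health_samples: list, injection_end: float, window_s: float) -> bool:
    """True if the first healthy sample at or after injection_end falls inside the window.

    health_samples: (timestamp_seconds, healthy) tuples, sorted by timestamp.
    """
    for ts, healthy in health_samples:
        if ts >= injection_end and healthy:
            return (ts - injection_end) <= window_s
    return False  # the bot never returned to normal


# Rule 1: same seed, same injection.
assert build_injection(7) == build_injection(7)

# Rules 2 and 3: recovery is an assertion against the declared window.
samples = [(0.0, True), (10.0, False), (20.0, False), (25.0, True)]
assert recovered_within(samples, injection_end=15.0, window_s=15.0)
assert not recovered_within(samples, injection_end=15.0, window_s=5.0)
```

Using a dedicated random.Random instance rather than the module-level functions keeps the scenario reproducible even when other code in the same process also draws random numbers.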