Polytraders Dev Guide
internal
v3 spine Phase 1 · Shared contracts · 9 demo-wired · 0 shadow-ready · 0 production-live · 100 pending · 109 total · 15/33 infra tasks

Incident playbooks

If one of these alerts fires, follow the playbook. Do not improvise.

Kill switch fires automatically

Trigger: Drawdown breach, reject-rate spike, feed loss, or wallet-funding shortfall.

  1. Confirm the trigger source from the KillSwitchEvent.trigger field.
  2. Page the on-call risk engineer.
  3. Confirm no orders are in flight; check the Order Lifecycle Manager.
  4. If trigger was drawdown: publish a positions snapshot to the incident channel.
  5. Run replay-last-hour against the simulator to confirm that the KillSwitch fired correctly.
  6. Reset only after the trigger condition has cleared and a second approver signs off.
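
The reset rule in step 6 can be sketched as a small gate. This is a minimal illustration, not the actual KillSwitch reset path: ResetRequest, can_reset, and every field name are hypothetical, and it assumes "second approver" means at least one approver other than the person requesting the reset.

```python
from dataclasses import dataclass, field


@dataclass
class ResetRequest:
    """Hypothetical record of a kill-switch reset request."""
    trigger: str           # value copied from KillSwitchEvent.trigger
    trigger_cleared: bool  # True once the trigger condition no longer holds
    requested_by: str
    approvers: set[str] = field(default_factory=set)


def can_reset(req: ResetRequest) -> bool:
    """Allow a reset only if the trigger has cleared AND someone other
    than the requester has signed off (assumed meaning of step 6)."""
    second_approvers = req.approvers - {req.requested_by}
    return req.trigger_cleared and len(second_approvers) >= 1
```

Under these assumptions, a requester approving their own reset is rejected, as is any reset requested while the trigger condition still holds.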

Polymarket API degraded

Trigger: API Degradation Monitor emits GOV_API_DEGRADED at the warning or hard level.

  1. Switch all Strategy bots from enforced to advisory.
  2. Cancel resting orders on markets whose data freshness we cannot verify.
  3. Page the trading-ops on-call.
  4. Watch the rolling error rate; promote back to enforced only after 10 minutes of green metrics.
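
The promotion rule in step 4 can be sketched as a check for an unbroken green streak. This is an illustrative sketch only: the function names and the sample format are hypothetical, and the 1% error-rate threshold for "green" is an assumption, since this playbook does not define it.

```python
from datetime import datetime, timedelta

GREEN_WINDOW = timedelta(minutes=10)  # from step 4 of the playbook


def green_since(samples, max_error_rate):
    """Return the timestamp at which the current unbroken green streak
    began, or None if the latest sample is not green.

    samples: list of (timestamp, error_rate) tuples, oldest first.
    """
    start = None
    for timestamp, error_rate in samples:
        if error_rate > max_error_rate:
            start = None          # any red sample resets the streak
        elif start is None:
            start = timestamp     # first green sample after a reset
    return start


def ready_to_promote(samples, now, max_error_rate=0.01):
    """True only after GREEN_WINDOW of continuously green samples.
    The 0.01 default is an assumed threshold, not a documented one."""
    start = green_since(samples, max_error_rate)
    return start is not None and now - start >= GREEN_WINDOW
```

A single red sample restarts the clock, so a flapping error rate never accumulates enough green time to promote the bots back to enforced.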

On-chain reconcile mismatch

Trigger: Reconciler emits GOV_RECONCILE_MISMATCH.

  1. Halt all Trade-authority bots immediately (the KillSwitch handles this automatically).
  2. Capture the exact divergence: internal state, on-chain state, last-seen block, last-applied event.
  3. Page the security on-call.
  4. Do not re-enable trading until the divergence is explained and the divergent state is reconciled by hand.
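
The four items listed in step 2 can be captured in one immutable record, sketched below. DivergenceRecord and its field types are hypothetical; the real Reconciler payload may be shaped differently. Both states are assumed to be flat dictionaries keyed by asset or position identifier.

```python
import json
from dataclasses import asdict, dataclass

_MISSING = object()  # marks a key that is absent on one side


@dataclass(frozen=True)
class DivergenceRecord:
    """Hypothetical capture of a reconcile mismatch (step 2)."""
    internal_state: dict
    onchain_state: dict
    last_seen_block: int
    last_applied_event: str

    def differing_keys(self):
        """Keys whose values differ, or that exist on one side only."""
        keys = set(self.internal_state) | set(self.onchain_state)
        return sorted(
            k for k in keys
            if self.internal_state.get(k, _MISSING)
            != self.onchain_state.get(k, _MISSING)
        )

    def to_json(self):
        """Serialise for pasting into the incident channel."""
        return json.dumps(asdict(self), sort_keys=True)
```

Freezing the record means the captured evidence cannot be mutated while the divergence is being investigated.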

Config drift detected

Trigger: Config Drift Detector emits GOV_CONFIG_DRIFT.

  1. Identify the drifted bot and field.
  2. If the drift is unauthorised: revert to the approved config and audit who changed it.
  3. If the drift is intentional but not yet approved: keep the operator override active only for its documented time bound, and open an approval PR before the override expires.
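
Step 1 amounts to a field-by-field comparison of the approved config against the running one. The sketch below shows one way to do that. The function name and config keys are hypothetical, configs are assumed to be flat dictionaries, and this is not the Config Drift Detector's own logic.

```python
MISSING = "<missing>"  # placeholder for a field present on one side only


def drifted_fields(approved: dict, running: dict) -> dict:
    """Return {field: (approved_value, running_value)} for every field
    whose value differs between the approved and running configs."""
    drift = {}
    for name in sorted(set(approved) | set(running)):
        expected = approved.get(name, MISSING)
        actual = running.get(name, MISSING)
        if expected != actual:
            drift[name] = (expected, actual)
    return drift


# Example with made-up field names:
approved = {"max_position": 1000, "mode": "enforced"}
running = {"max_position": 2500, "mode": "enforced", "debug": True}
report = drifted_fields(approved, running)
```

Here report names max_position (changed) and debug (added), which is the information steps 2 and 3 need to decide between reverting and opening an approval PR.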

Universal first three steps

  1. Confirm the alert is real (not a flapping metric).
  2. Capture state before doing anything: positions, open orders, the last 100 ReportEnvelopes, and the on-chain block height.
  3. Open the incident channel and assign an incident commander before taking any remedial action.
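
The state capture in step 2 can be sketched as a single snapshot object taken before any remedial action. IncidentSnapshot, capture_snapshot, and the argument shapes are all hypothetical; how positions, orders, and ReportEnvelopes are actually fetched is outside this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

ENVELOPE_LIMIT = 100  # "last 100 ReportEnvelopes" from step 2


@dataclass(frozen=True)
class IncidentSnapshot:
    """Hypothetical pre-remediation snapshot."""
    captured_at: datetime
    positions: dict
    open_orders: list
    report_envelopes: list
    block_height: int


def capture_snapshot(positions, open_orders, envelopes, block_height, now=None):
    """Copy the inputs so later changes to live state cannot alter the
    snapshot, and keep only the most recent ENVELOPE_LIMIT envelopes
    (envelopes are assumed to be ordered oldest first)."""
    return IncidentSnapshot(
        captured_at=now or datetime.now(timezone.utc),
        positions=dict(positions),
        open_orders=list(open_orders),
        report_envelopes=list(envelopes)[-ENVELOPE_LIMIT:],
        block_height=block_height,
    )
```

Copying the inputs, rather than holding references to live objects, keeps the snapshot stable while the incident is being worked.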