1.2 KillSwitch
v3 readiness
A bot is done when all four scores are.
risk.killswitch
Hard-stop on drawdown, reject rate, feed loss, manual flag. SEARCH_SPACE declared. Fixture pack pending.
Source: @polytraders/bots · src/risk/killswitch.js · Impl 11/15 · Backtest 3/4
1. Bot Identity
| Layer | Risk |
|---|---|
| Bot class | Guardrail |
| Authority | PauseReject |
| Status | LIVE |
| Readiness | General live |
| Runs before | All other guardrails and ExecutionPlan emit |
| Runs after | Any triggering event (drawdown breach, reject-rate spike, manual flag, feed loss) |
| Applies to | All OrderIntents — checked first in the guardrail pipeline |
| Default mode | general_live |
| User-visible | Yes |
| Developer owner | Polytraders core — Risk pod |
Operational profile
| Modes supported | quarantine |
|---|
2. Purpose
KillSwitch is the top-level emergency stop for the entire trading system. It can be triggered automatically when intraday or weekly drawdown exceeds a threshold, when the order-reject rate spikes above a circuit-breaker level, or when a market data feed is lost with open positions. It can also be triggered manually via the Admin UI. Once active, KillSwitch rejects every incoming OrderIntent without exception until a manual reset is performed (if require_manual_reset=true) or the trigger condition clears. It does not modify orders — it only blocks them entirely.
3. Why This Bot Matters
Runaway loss not stopped automatically
Without a drawdown circuit breaker, a strategy that enters a losing streak continues to trade, compounding losses until a human intervenes — which may be too late.
API or wallet misbehaviour not detected
A spike in order rejections from the exchange or a wallet desync can indicate a broken execution path. Continuing to submit orders under these conditions risks unintended positions or double-orders.
Data feed lost with open positions
If the CLOB WebSocket feed dies and positions remain open, the system cannot monitor or hedge those positions. The safe action is to halt new orders until the feed is restored.
No single manual override path
In an incident, teams need a single, reliable mechanism to stop all trading immediately. Without a centralised kill signal, individual strategy shutdowns may be missed.
4. Required Polymarket Inputs
| Input | Source | Required? | Use |
|---|---|---|---|
| CLOB WebSocket connection status | WebSocket | Yes | Detect feed loss; trigger KillSwitch if WebSocket has been disconnected for more than the allowed dead window while positions are open. |
| Order reject events from the exchange | CLOB | Yes | Count reject events over a rolling 5-minute window; if the reject rate exceeds the reject_rate_circuit threshold, trigger KillSwitch. |
5. Required Internal Inputs
| Input | Source | Required? | Use |
|---|---|---|---|
| Rolling intraday and weekly P&L | PortfolioGuard | Yes | Monitor drawdown against intraday_drawdown_pct and weekly_drawdown_pct thresholds to trigger the circuit breaker. |
| Manual kill flag from operator dashboard | Admin UI | Yes | Accept a manual activation signal that overrides all automatic conditions and immediately halts trading. |
| Open position count | PortfolioGuard | No | Condition the WebSocket-dead trigger on whether any positions are currently open; if no positions exist, feed loss alone does not trigger KillSwitch. |
6. Parameter Guide
| Parameter | Default | Warning | Hard | What it controls |
|---|---|---|---|---|
| intraday_drawdown_pct | 12 | 8 | 12 | Intraday drawdown (as a percentage of start-of-day balance) at which KillSwitch triggers automatically. |
| weekly_drawdown_pct | 20 | 15 | 20 | Rolling 7-day drawdown (as a percentage of start-of-week balance) at which KillSwitch triggers automatically. |
| reject_rate_circuit | 30 | 20 | 30 | Order reject rate (as a percentage of submitted orders in a rolling 5-minute window) at which KillSwitch triggers automatically. |
| require_manual_reset | true | n/a | n/a | When true, KillSwitch remains active after the triggering condition clears and can only be deactivated by a manual action in the Admin UI. |
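The Warning and Hard columns imply a three-way classification for each numeric parameter. The following is a minimal sketch of that classification, assuming a hypothetical `classifyMetric` helper and a `{ warning, hard }` shape that mirrors the table; neither name comes from the source. Missing data is reported as `UNKNOWN` so the caller can apply the staleness fallback separately.

```javascript
// Hypothetical helper: maps a metric value onto the Warning / Hard columns.
// The THRESHOLDS shape is an assumption made for illustration.
const THRESHOLDS = {
  intraday_drawdown_pct: { warning: 8, hard: 12 },
  weekly_drawdown_pct: { warning: 15, hard: 20 },
  reject_rate_circuit: { warning: 20, hard: 30 },
};

function classifyMetric(name, value) {
  const limits = THRESHOLDS[name];
  if (!limits) throw new Error(`unknown parameter: ${name}`);
  // No reading available: let the caller decide how to handle stale data.
  if (value === null || value === undefined || Number.isNaN(value)) return 'UNKNOWN';
  if (value > limits.hard) return 'ACTIVATE'; // strictly above the hard limit
  if (value > limits.warning) return 'WARN'; // above warning, at or below hard
  return 'NONE';
}
```

Under this reading a value exactly equal to the hard limit still only warns, which matches the strict `>` comparisons used in the developer checks.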
7. Detailed Parameter Instructions
intraday_drawdown_pct
What it means
Intraday drawdown (as a percentage of start-of-day balance) at which KillSwitch triggers automatically.
Default
{ "intraday_drawdown_pct": 12 }
Why this default matters
A 12% intraday drawdown is a significant adverse move that most strategies are not designed to continue trading through. Stopping at this level limits the maximum single-day loss.
Threshold logic
| Condition | Action |
|---|---|
| Intraday drawdown ≤ 8% | No action |
| 8–12% | WARN — log alert, no block yet |
| > 12% | ACTIVATE KillSwitch — reject all new orders |
Developer check
if (intradayDrawdownPct > p.hard) killswitch.activate('INTRADAY_DRAWDOWN_EXCEEDED');
User-facing English
Trading has been paused because today's losses reached the daily safety limit. No new orders will be placed until the limit is reset.
weekly_drawdown_pct
What it means
Rolling 7-day drawdown (as a percentage of start-of-week balance) at which KillSwitch triggers automatically.
Default
{ "weekly_drawdown_pct": 20 }
Why this default matters
A 20% weekly drawdown indicates a sustained losing period. Halting trading at this level prevents a single bad week from damaging the account irreparably.
Threshold logic
| Condition | Action |
|---|---|
| Weekly drawdown ≤ 15% | No action |
| 15–20% | WARN — log alert, no block yet |
| > 20% | ACTIVATE KillSwitch — reject all new orders |
Developer check
if (weeklyDrawdownPct > p.hard) killswitch.activate('WEEKLY_DRAWDOWN_EXCEEDED');
User-facing English
Trading has been paused because this week's total losses reached the weekly safety limit.
reject_rate_circuit
What it means
Order reject rate (as a percentage of submitted orders in a rolling 5-minute window) at which KillSwitch triggers automatically.
Default
{ "reject_rate_circuit": 30 }
Why this default matters
A 30% reject rate is far outside normal operating conditions and indicates a systemic problem — wallet desync, nonce collision, or exchange-side issue — that requires investigation before trading continues.
Threshold logic
| Condition | Action |
|---|---|
| Reject rate ≤ 20% over 5 min | No action |
| 20–30% | WARN — alert monitoring |
| > 30% | ACTIVATE KillSwitch — reject all new orders |
Developer check
if (rejectRate5min > p.hard) killswitch.activate('ORDER_BOOK_UNAVAILABLE');
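The check above presumes a `rejectRate5min` value is already available. The following is a minimal sketch of one way to derive it, assuming a hypothetical `RejectRateWindow` class that receives one event per submitted order in time order; the class and its methods are illustrative and do not come from the source.

```javascript
// Hypothetical rolling-window reject-rate tracker; all names are illustrative.
class RejectRateWindow {
  constructor(windowMs = 5 * 60 * 1000) {
    this.windowMs = windowMs;
    this.events = []; // { at: epoch ms, rejected: boolean }, oldest first
  }

  // Record the outcome of one submitted order.
  record(rejected, at = Date.now()) {
    this.events.push({ at, rejected });
  }

  // Reject rate as a percentage of the orders submitted inside the window.
  ratePct(now = Date.now()) {
    const cutoff = now - this.windowMs;
    while (this.events.length > 0 && this.events[0].at <= cutoff) {
      this.events.shift(); // drop events that have aged out of the window
    }
    if (this.events.length === 0) return 0; // no submissions, nothing to measure
    const rejected = this.events.filter((e) => e.rejected).length;
    return (rejected * 100) / this.events.length;
  }
}
```

Returning 0 when no orders were submitted is a design choice of this sketch; it keeps an idle system from tripping the circuit on an undefined ratio.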
User-facing English
A high number of orders are being rejected by the exchange. Trading has been paused while the issue is investigated.
require_manual_reset
What it means
When true, KillSwitch remains active after the triggering condition clears and can only be deactivated by a manual action in the Admin UI.
Default
{ "require_manual_reset": true }
Why this default matters
Automatic reactivation after a drawdown or reject-rate breach may resume trading before the root cause is understood. Requiring a manual reset ensures a human reviews the situation first.
Threshold logic
| Condition | Action |
|---|---|
| require_manual_reset=true | KillSwitch stays active until Admin UI reset; condition clearing alone is insufficient |
| require_manual_reset=false | KillSwitch deactivates automatically when the triggering condition falls below the warning threshold |
Developer check
if (!p.require_manual_reset && triggerConditionCleared) killswitch.deactivate();
User-facing English
Trading has been stopped for safety and requires a manual review before it can resume.
8. Default Configuration
{
"bot_id": "risk.kill_switch",
"version": "1.0.0",
"mode": "hard_guard",
"defaults": {
"intraday_drawdown_pct": 12,
"weekly_drawdown_pct": 20,
"reject_rate_circuit": 30,
"require_manual_reset": true
},
"locked": {
"require_manual_reset": {
"immutable": true
},
"intraday_drawdown_pct": {
"max": 20
},
"weekly_drawdown_pct": {
"max": 30
}
}
}
9. Implementation Flow
- On each OrderIntent received, check the KillSwitch active flag in shared state before any other processing.
- If active flag is set, immediately return REJECT with reason_code matching the trigger (e.g. STRATEGY_BUDGET_EXCEEDED for drawdown, ORDER_BOOK_UNAVAILABLE for reject-rate or feed loss) — do not consult any other data.
- In a parallel monitoring loop, continuously read rolling intraday and weekly drawdown from PortfolioGuard.
- If intraday_drawdown_pct or weekly_drawdown_pct hard limit is breached, atomically set the active flag and record the trigger reason, timestamp, and metric value.
- In the same monitoring loop, count order reject events from the CLOB over a rolling 5-minute window; if reject_rate_circuit hard limit is breached, set the active flag.
- Monitor the CLOB WebSocket connection status; if the connection has been dead for more than 30 seconds and open positions exist, set the active flag with reason ORDER_BOOK_UNAVAILABLE.
- Listen for manual kill signal from Admin UI; on receipt, set the active flag immediately with reason MANUAL_KILL regardless of metric levels.
- When KillSwitch is active, emit a system-level alert to the monitoring stack on every rejected order.
- If require_manual_reset=true, ignore all automatic condition-clearing events; wait for explicit Admin UI reset.
- On reset, clear the active flag, log the reset timestamp and operator action, and resume the normal guardrail pipeline.
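The flow above can be sketched with an in-memory object standing in for the shared state. This is an illustration under stated assumptions, not the code in src/risk/killswitch.js: `createKillSwitch`, `check`, `activate`, `onConditionCleared`, and `manualReset` are hypothetical names, and the returned vote carries only a subset of the RiskVote fields.

```javascript
// In-memory stand-in for the shared KillSwitch state (the spec uses Redis).
// Every function name here is hypothetical.
function createKillSwitch({ requireManualReset = true } = {}) {
  let state = { active: false };
  const auditLog = [];

  return {
    // Fast path: consult only the active flag, nothing else.
    check(intent) {
      if (state.active) {
        return {
          intent_id: intent.intent_id,
          decision: 'HARD_REJECT',
          reason_code: 'KILL_SWITCH_ACTIVE',
          trigger_reason: state.trigger_reason,
          activated_at: state.activated_at,
        };
      }
      return { intent_id: intent.intent_id, decision: 'APPROVE' };
    },

    // Set the flag and record the reason, timestamp, and metric value.
    activate(triggerReason, metricValue = null, now = new Date()) {
      state = {
        active: true,
        trigger_reason: triggerReason,
        trigger_metric: metricValue,
        activated_at: now.toISOString(),
      };
      return state;
    },

    // Automatic clearing is ignored while a manual reset is required.
    onConditionCleared() {
      if (!requireManualReset) state = { active: false };
      return state.active;
    },

    // Explicit operator reset: clear the flag and log who did it.
    manualReset(operator, now = new Date()) {
      state = { active: false };
      auditLog.push({ operator, reset_at: now.toISOString() });
    },

    auditLog,
  };
}
```

With `requireManualReset` left at its default, `onConditionCleared()` keeps returning `true` after an activation until `manualReset` is called.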
10. Reference Implementation
Runs two concurrent loops: a fast per-intent check that reads a single Redis key, and a background monitor that watches P&L, reject rate, and WS feed health. The Redis key is the single source of truth for the active flag.
Pseudocode is language-agnostic. FETCH = read input. EMIT = produce output. Translate to TS/Python/Go/Rust.
// ---- FAST PATH: called on every OrderIntent ----
FUNCTION checkKillSwitch(intent):
  state = FETCH redis.GET('killswitch:state')  // O(1) Redis read
  IF state.active:
    EMIT RiskVote(decision=HARD_REJECT,
                  reason=KILL_SWITCH_ACTIVE,
                  trigger_reason=state.trigger_reason,
                  activated_at=state.activated_at)
    RETURN
  // Pass through; other guardrails take over
  EMIT RiskVote(decision=APPROVE)

// ---- BACKGROUND MONITOR: runs every 5s ----
FUNCTION monitorLoop():
  WHILE true:
    // 1. Intraday drawdown
    drawdown = FETCH internal.portfolio_guard.drawdown_pct()
    IF drawdown > params.intraday_drawdown_pct.hard:
      activate('INTRADAY_DRAWDOWN_EXCEEDED')
    // 2. Weekly drawdown
    weeklyDrawdown = FETCH internal.portfolio_guard.weekly_drawdown_pct()
    IF weeklyDrawdown > params.weekly_drawdown_pct.hard:
      activate('WEEKLY_DRAWDOWN_EXCEEDED')
    // 3. Order reject rate
    rejectRate = FETCH internal.clob_reject_counter.rate_5min()
    IF rejectRate > params.reject_rate_circuit.hard:
      activate('ORDER_BOOK_UNAVAILABLE')
    // 4. WebSocket feed health
    wsDead = FETCH ws_market.seconds_since_last_message()
    openPositions = FETCH internal.portfolio_guard.open_position_count()
    IF wsDead > 30 AND openPositions > 0:
      activate('ORDER_BOOK_UNAVAILABLE')
    // 5. Data feed staleness
    IF drawdown IS NULL AND wsDead > 60:
      activate('STALE_MARKET_DATA')  // fail-safe: unknown system health
    SLEEP 5

// ---- ACTIVATE / DEACTIVATE ----
FUNCTION activate(triggerReason):
  state = { active: true, trigger_reason: triggerReason, activated_at: now_iso() }
  redis.SET('killswitch:state', state)  // atomic write
  EMIT alert(P0, 'KillSwitchActivated', state)

FUNCTION deactivate(operator):
  IF params.require_manual_reset:
    // Only admin UI can call this
    redis.SET('killswitch:state', { active: false, reset_by: operator, reset_at: now_iso() })
  ELSE:
    IF triggerConditionCleared():
      redis.SET('killswitch:state', { active: false })
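One way to make the monitor loop testable is to separate trigger evaluation from I/O. The sketch below assumes a hypothetical `evaluateTriggers` function and snapshot field names that do not appear in the source. It checks the same conditions in the same order as the pseudocode, but returns the first matching trigger reason instead of calling `activate` for every match.

```javascript
// Hypothetical pure function over one snapshot of the monitor inputs.
// Returns the first matching trigger reason, or null when nothing fires.
const DEFAULT_PARAMS = {
  intraday_drawdown_pct: 12,
  weekly_drawdown_pct: 20,
  reject_rate_circuit: 30,
  ws_dead_window_s: 30, // taken from the pseudocode; not a declared parameter
  stale_window_s: 60, // taken from the pseudocode; not a declared parameter
};

function evaluateTriggers(snapshot, params = DEFAULT_PARAMS) {
  const {
    intradayDrawdownPct,
    weeklyDrawdownPct,
    rejectRatePct,
    wsDeadSeconds,
    openPositions,
  } = snapshot;

  // Comparisons against null or undefined are false, so missing values fall through.
  if (intradayDrawdownPct > params.intraday_drawdown_pct) return 'INTRADAY_DRAWDOWN_EXCEEDED';
  if (weeklyDrawdownPct > params.weekly_drawdown_pct) return 'WEEKLY_DRAWDOWN_EXCEEDED';
  if (rejectRatePct > params.reject_rate_circuit) return 'ORDER_BOOK_UNAVAILABLE';
  // Feed loss only matters while positions are open.
  if (wsDeadSeconds > params.ws_dead_window_s && openPositions > 0) return 'ORDER_BOOK_UNAVAILABLE';
  // Fail-safe: no drawdown reading and a long-silent feed.
  const drawdownMissing = intradayDrawdownPct === null || intradayDrawdownPct === undefined;
  if (drawdownMissing && wsDeadSeconds > params.stale_window_s) return 'STALE_MARKET_DATA';
  return null;
}
```

A monitor loop built on this would fetch the five inputs each cycle, call `evaluateTriggers`, and activate only when the result is not `null`.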
Helpers used
| Helper | Signature | Purpose |
|---|---|---|
| isStale | isStale(snapshot: any, maxAgeS: int) -> bool | Returns true if snapshot was produced more than maxAgeS seconds ago; used for data-feed staleness gate. |
| toUsdcUnits | toUsdcUnits(rawUsd: float) -> int | Not used directly; imported for consistency with Risk pod SDK imports. |
| fetchClobPublic | fetchClobPublic(path: str) -> JSON | Used by the background monitor to sanity-check CLOB reachability. |
| buildOrderTypedData | buildOrderTypedData(intent, domain) -> TypedData | Not called by KillSwitch; referenced here because SmartRouter downstream calls it — KillSwitch's HARD_REJECT prevents it from running. |
SDK calls used
- redis.GET('killswitch:state')
- redis.SET('killswitch:state', state)
- internal.portfolio_guard.drawdown_pct()
- internal.portfolio_guard.weekly_drawdown_pct()
- ws_market.seconds_since_last_message()
Complexity: O(1) per intent (single Redis read); background monitor O(1) per cycle
11. Wire Examples
Input — what arrives on the wire
OrderIntent arriving while KillSwitch is active — internal
{
"intent_id": "int_8e9f0a1b2c3d4e5f",
"market_id": "0x4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d",
"side": "BUY",
"size_usd": 500,
"generated_at": "2026-05-09T09:11:00Z"
}
Redis KillSwitch state (active) — internal
{
"active": true,
"trigger_reason": "INTRADAY_DRAWDOWN_EXCEEDED",
"trigger_metric": 0.132,
"activated_at": "2026-05-09T09:10:00Z",
"require_manual_reset": true
}
Output — what the bot emits
RiskVote — HARD_REJECT (KillSwitch active, drawdown breach)
{
"guard_id": "risk.kill_switch",
"decision": "HARD_REJECT",
"severity": "HARD",
"reason_code": "KILL_SWITCH_ACTIVE",
"message": "KillSwitch is active. Intraday drawdown of 13.2% exceeded the 12% circuit breaker. All new orders are rejected. Manual reset required.",
"constraints": {},
"trigger_reason": "INTRADAY_DRAWDOWN_EXCEEDED",
"trigger_metric": 0.132,
"activated_at": "2026-05-09T09:10:00Z",
"inputs_used": [
"redis.killswitch_state"
],
"checked_at": "2026-05-09T09:11:05Z"
}
RiskVote — APPROVE (KillSwitch not active)
{
"guard_id": "risk.kill_switch",
"decision": "APPROVE",
"severity": "INFO",
"reason_code": null,
"inputs_used": [
"redis.killswitch_state"
],
"checked_at": "2026-05-09T10:00:00Z"
}
12. Decision Logic
APPROVE
KillSwitch active flag is false — no triggering conditions are met. All orders pass through to the rest of the guardrail pipeline.
RESHAPE_REQUIRED
Not applicable — KillSwitch does not reshape orders. It either passes them through or rejects them entirely.
REJECT
KillSwitch active flag is true for any reason: intraday drawdown exceeded, weekly drawdown exceeded, order reject rate too high, CLOB feed dead with open positions, or manual kill from Admin UI.
WARNING_ONLY
Not used — KillSwitch has reject and pause authority. Warning-level metrics (approaching thresholds) are logged as alerts but do not activate the switch until the hard threshold is breached.
13. Standard Decision Output
This bot returns a RiskVote object. See RiskVote schema.
{
"guard_id": "risk.kill_switch",
"decision": "REJECT",
"severity": "HARD",
"reason_code": "STRATEGY_BUDGET_EXCEEDED",
"message": "KillSwitch is active. Intraday drawdown of 13.2% exceeded the 12% circuit breaker. All new orders are rejected. Manual reset required.",
"constraints": {},
"trigger_reason": "INTRADAY_DRAWDOWN_EXCEEDED",
"trigger_metric": 0.132,
"activated_at": "2026-05-09T09:10:00Z",
"inputs_used": [
"portfolio_guard.drawdown",
"admin_ui.kill_flag"
],
"checked_at": "2026-05-09T09:11:05Z"
}
14. Reason Codes
| Code | Severity | Meaning | Action | User-facing message |
|---|---|---|---|---|
| KILL_SWITCH_ACTIVE | HARD_REJECT | The KillSwitch active flag is set; all orders are rejected regardless of other conditions. | Return HARD_REJECT immediately on every intent. | Trading is currently paused. Please review the settings panel for details. |
| STRATEGY_BUDGET_EXCEEDED | HARD_REJECT | Intraday or weekly drawdown has exceeded the hard circuit-breaker threshold. | Activate KillSwitch with trigger_reason=INTRADAY_DRAWDOWN_EXCEEDED or WEEKLY_DRAWDOWN_EXCEEDED. | All trading has been paused because today's or this week's losses reached the safety limit. A manual review is required before trading can resume. |
| STALE_MARKET_DATA | HARD_REJECT | P&L or WS feed data is unavailable for more than 60s; system health is unknown. | Activate KillSwitch with trigger_reason=STALE_MARKET_DATA. | Trading has been paused because market data became unavailable. It will resume once the connection is restored. |
| KILL_SWITCH_INTRADAY_DRAWDOWN | HARD_REJECT | Intraday drawdown exceeded intraday_drawdown_pct hard threshold. | Activate KillSwitch; set trigger_reason=INTRADAY_DRAWDOWN_EXCEEDED. | Today's losses reached the daily limit. Trading has been paused. |
| KILL_SWITCH_WEEKLY_DRAWDOWN | HARD_REJECT | Rolling 7-day drawdown exceeded weekly_drawdown_pct hard threshold. | Activate KillSwitch; set trigger_reason=WEEKLY_DRAWDOWN_EXCEEDED. | This week's losses reached the weekly limit. Trading has been paused. |
| KILL_SWITCH_REJECT_RATE | HARD_REJECT | Order reject rate from the CLOB exceeded reject_rate_circuit threshold in a 5-minute window. | Activate KillSwitch; set trigger_reason=ORDER_BOOK_UNAVAILABLE. | A high number of orders were rejected by the exchange. Trading has been paused while the issue is investigated. |
| KILL_SWITCH_FEED_DEAD | HARD_REJECT | CLOB WebSocket feed has been dead for > 30s with open positions. | Activate KillSwitch; set trigger_reason=ORDER_BOOK_UNAVAILABLE. | The live market data connection was interrupted while positions were open. Trading has been paused. |
| KILL_SWITCH_MANUAL | HARD_REJECT | Operator manually activated KillSwitch via Admin UI. | Activate KillSwitch; set trigger_reason=MANUAL_KILL. | Trading was manually paused. It can be resumed from the settings panel. |
15. Metrics & Logs
Metrics emitted
| Metric | Type | Unit | Labels | Meaning |
|---|---|---|---|---|
| polytraders_risk_killswitch_active | gauge | count | trigger_reason | 1 if KillSwitch is active, 0 otherwise. Label carries the triggering reason. |
| polytraders_risk_killswitch_activations_total | counter | count | trigger_reason | Cumulative count of KillSwitch activations by trigger reason. |
| polytraders_risk_killswitch_rejections_total | counter | count | trigger_reason | Cumulative count of OrderIntents rejected by KillSwitch while active. |
| polytraders_risk_killswitch_active_duration_seconds | histogram | seconds | trigger_reason | Duration of each KillSwitch activation from trigger to manual reset. |
| polytraders_risk_killswitch_check_latency_ms | histogram | milliseconds | none | Wall-clock latency of the Redis read on the fast-path check. |
Alerts
| Alert | Condition | Severity | Runbook |
|---|---|---|---|
| KillSwitchActivated | polytraders_risk_killswitch_active > 0 | P0 | #runbook-killswitch-activated |
| KillSwitchActivatedManual | polytraders_risk_killswitch_active{trigger_reason='MANUAL_KILL'} > 0 | P0 | #runbook-killswitch-manual |
| KillSwitchRedisUnreachable | up{job='killswitch_redis'} == 0 | P0 | #runbook-killswitch-redis |
| KillSwitchHighCheckLatency | histogram_quantile(0.99, rate(polytraders_risk_killswitch_check_latency_ms_bucket[5m])) > 10 | P1 | #runbook-killswitch-latency |
Dashboards
- Grafana — Risk overview / KillSwitch status
- Grafana — Incident timeline / KillSwitch activations
Log levels
| Level | What gets logged |
|---|---|
| DEBUG | Redis read result on every fast-path check. |
| INFO | KillSwitch activation and deactivation events with trigger_reason and metric values. |
| WARN | Drawdown or reject rate approaching warning threshold. |
| ERROR | Redis unreachable; drawdown data feed unavailable. |
16. Developer Reporting
{
"bot_id": "risk.kill_switch",
"decision": "REJECT",
"reason_code": "STRATEGY_BUDGET_EXCEEDED",
"trigger_reason": "INTRADAY_DRAWDOWN_EXCEEDED",
"inputs_used": [
"portfolio_guard.drawdown"
],
"metrics": {
"intraday_drawdown_pct": 13.2,
"weekly_drawdown_pct": 8.4,
"reject_rate_5min": 4.1,
"ws_dead_seconds": 0,
"open_positions": 3
},
"activated_at": "2026-05-09T09:10:00Z",
"require_manual_reset": true,
"checked_at": "2026-05-09T09:11:05Z"
}
17. Plain-English Reporting
| Situation | User-facing explanation |
|---|---|
| Trading stopped — daily loss limit | All trading has been paused because today's total losses reached the daily safety limit. No new orders will be placed until you manually review and restart from the settings panel. |
| Trading stopped — weekly loss limit | All trading has been paused because this week's total losses reached the weekly safety limit. A manual review is required before trading can resume. |
| Trading stopped — order rejection spike | A large number of orders were rejected by the exchange in a short period. Trading has been paused while the issue is investigated to prevent placing orders that may not behave as expected. |
| Trading stopped — market data connection lost | The live market data connection was interrupted while positions were open. Trading has been paused until the connection is restored to ensure decisions are made on current information. |
| Trading stopped — manual action | Trading was manually paused. No new orders will be placed until the pause is lifted from the settings panel. |
18. Failure-Mode Block
| main_failure_mode | KillSwitch failing to activate during a genuine incident — for example, if the drawdown data feed from PortfolioGuard is itself unavailable, the circuit breaker never fires. |
|---|---|
| false_positive_risk | Triggering on a transient spike in the reject rate caused by a brief exchange-side issue that resolves in seconds, halting trading unnecessarily. |
| false_negative_risk | Not activating if the drawdown calculation is stale due to a P&L feed lag, allowing losses to accumulate beyond the stated hard limit before the trigger fires. |
| safe_fallback | If the drawdown or reject-rate data feed is unavailable for more than 60 seconds, KillSwitch activates automatically with reason STALE_MARKET_DATA. Uncertainty about system health defaults to halt, not continue. |
| required_dependencies | PortfolioGuard rolling P&L feed, CLOB order reject event stream, CLOB WebSocket connection health check, Admin UI kill-flag channel |
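The safe_fallback row needs some record of when each required feed last produced data. The following is a minimal sketch of such a tracker, assuming a hypothetical `FeedStalenessGate` class and feed names chosen for illustration; the 60-second window is the one stated in the table.

```javascript
// Hypothetical tracker for the safe fallback: report a halt once any required
// feed has produced no data for longer than the allowed window.
class FeedStalenessGate {
  constructor(maxAgeMs = 60000) {
    this.maxAgeMs = maxAgeMs;
    this.lastSeen = new Map(); // feed name -> epoch ms of the last good reading
  }

  // Call at startup for every feed that must stay fresh.
  register(feed, now = Date.now()) {
    this.lastSeen.set(feed, now);
  }

  // Call whenever a feed delivers a usable value.
  markFresh(feed, now = Date.now()) {
    this.lastSeen.set(feed, now);
  }

  // Names of the feeds that have been silent for more than maxAgeMs.
  staleFeeds(now = Date.now()) {
    return [...this.lastSeen]
      .filter(([, seenAt]) => now - seenAt > this.maxAgeMs)
      .map(([feed]) => feed);
  }

  // True when at least one required feed is stale.
  shouldHalt(now = Date.now()) {
    return this.staleFeeds(now).length > 0;
  }
}
```

A monitor cycle could call `shouldHalt()` and, when it returns `true`, activate with reason STALE_MARKET_DATA, which addresses the main_failure_mode of a silent PortfolioGuard feed.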
19. Failure-Injection Recipes
| Scenario | How to inject | Expected behaviour | Recovery |
|---|---|---|---|
| REDIS_UNREACHABLE | Block TCP to Redis cluster | KillSwitch activates immediately with trigger_reason=STALE_MARKET_DATA; all intents return HARD_REJECT | KillSwitch returns to normal (active=false) only after manual reset once Redis is reachable. |
| DRAWDOWN_BREACH | Set portfolio_guard mock to return drawdown_pct=13 (above the 12% hard limit) | Background monitor activates KillSwitch within one 5s cycle; next intent returns HARD_REJECT(KILL_SWITCH_ACTIVE) | Manual reset after drawdown is confirmed resolved. |
| REJECT_RATE_SPIKE | Inject rejects for 35% of submitted orders in a 5-minute window (for example, 35 of 100) | Background monitor activates KillSwitch with trigger_reason=ORDER_BOOK_UNAVAILABLE | Manual reset after root cause is investigated. |
| WS_FEED_DEAD_WITH_POSITIONS | Disconnect WS market feed while open_positions > 0 | Background monitor activates KillSwitch within 30s | Reconnect WS; manual reset. |
| MANUAL_KILL | Submit kill flag from Admin UI | Redis state set to active=true with trigger_reason=MANUAL_KILL immediately | Manual reset from Admin UI only. |
20. State & Persistence
Stores a single active flag and trigger metadata in Redis. This key is the authoritative source of truth for all guardrails.
State stores
| Name | Kind | Key | Value shape | TTL | Durability |
|---|---|---|---|---|---|
| killswitch_state | redis | killswitch:state | { active: bool, trigger_reason: str, trigger_metric: float, activated_at: iso_ts, require_manual_reset: bool } | none | strong |
Cold-start recovery
On cold start, read Redis key. If key is missing (first deploy or Redis flush), default to active=false and log a WARNING.
On restart
Redis state is re-read immediately. If Redis is unreachable on startup, the bot activates with trigger_reason=STALE_MARKET_DATA until Redis is reachable.
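The two startup rules above (missing key versus unreachable store) can be expressed as one small routine. This sketch assumes a hypothetical `resolveStartupState` function and a `readState` callback that returns the stored object, returns `null` when the key is missing, or throws when the store is unreachable. It is written synchronously for brevity; with an asynchronous Redis client the same branching would follow an `await`.

```javascript
// Hypothetical startup routine; `readState` stands in for the Redis read.
function resolveStartupState(readState, log = console) {
  try {
    const stored = readState();
    if (stored === null || stored === undefined) {
      // First deploy or flushed store: start inactive, but leave a trace.
      log.warn('killswitch:state missing on cold start; defaulting to active=false');
      return { active: false };
    }
    return stored; // honour whatever was persisted, including active=true
  } catch (err) {
    // Fail safe: without the store, system health is unknown, so halt.
    log.error(`killswitch state store unreachable: ${err.message}`);
    return {
      active: true,
      trigger_reason: 'STALE_MARKET_DATA',
      activated_at: new Date().toISOString(),
    };
  }
}
```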
21. Concurrency & Idempotency
| Aspect | Specification |
|---|---|
| Execution model | single-threaded event loop (fast path) + background monitor loop |
| Max in-flight | 1000 |
| Idempotency key | intent_id |
| Replay-safe | True |
| Deduplication | by intent_id within a 24h window |
| Ordering guarantees | no ordering — KillSwitch check is stateless per intent |
| Per-call timeout (ms) | 10 |
| Backpressure strategy | drop newest |
| Locking / mutual exclusion | none (Redis atomic SET) |
22. Dependencies
Depends on (must run first)
| Bot | Why | Contract |
|---|---|---|
| risk.portfolio_guard | Provides drawdown data that drives the automatic circuit-breaker trigger. | Monitor reads portfolio_guard.drawdown_pct() and weekly_drawdown_pct(). |
Emits to (downstream consumers)
| Bot | Why | Contract |
|---|---|---|
| risk.liquidity_guard | Every Risk bot consults KillSwitch first. | HARD_REJECT(KILL_SWITCH_ACTIVE) short-circuits LiquidityGuard. |
| risk.oracle_risk_monitor | Every Risk bot consults KillSwitch first. | HARD_REJECT(KILL_SWITCH_ACTIVE) short-circuits OracleRiskMonitor. |
| risk.portfolio_guard | PortfolioGuard checks KillSwitch first. | HARD_REJECT(KILL_SWITCH_ACTIVE) short-circuits PortfolioGuard. |
| exec.smart_router | SmartRouter checks KillSwitch before emitting any ExecutionPlan. | No ExecutionPlan emitted while KillSwitch is active. |
Used by (auto-aggregated)
0.1 0.2 0.3 0.4 0.5 0.6 2.1 2.2 2.3 2.4 2.5 2.6 2.8 4.1 4.10 4.11 4.12 4.13 4.14 4.15 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 1.1 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.3 1.4 1.5 1.6 1.7 1.8 1.9 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 3.1 3.10 3.11 3.12 3.13 3.14 3.15 3.16 3.17 3.18 3.19 3.2 3.20 3.21 3.22 3.23 3.24 3.3 3.4 3.5 3.6 3.7 3.8 3.9
External services
| Service | Endpoint | SLA assumed | On failure |
|---|---|---|---|
| Redis | internal Redis cluster | 99.99% / <5ms p99 | Fail-safe: if Redis is unreachable, activate KillSwitch immediately. |
| WS market feed | wss://ws-subscriptions-clob.polymarket.com/ws/market | best-effort | If feed dead > 30s with open positions, activate KillSwitch. |
23. Security Surfaces
KillSwitch holds no secrets and makes no signed calls. Its only write target is the internal Redis key. The Admin UI kill flag channel must be authenticated and audit-logged.
Signing surface
This bot does NOT sign anything.
Abuse vectors considered
- Unauthorized Redis write to clear the active flag and resume trading without operator review
- Race condition where two activations compete and one clears the trigger_reason
Mitigations
- Redis SET is atomic; activation is monotonic when require_manual_reset=true
- Admin UI kill-flag channel requires authenticated operator session and is audit-logged
- Deactivation via Admin UI requires explicit confirmation and logs operator identity
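The trigger_reason race listed under abuse vectors can be closed by making activation first-writer-wins. The sketch below uses a `Map` as a stand-in for Redis, and `activateOnce` is a hypothetical name. Within a single JavaScript process the read and write run without interleaving; across processes the store itself would have to provide the compare-and-set, which this sketch does not show.

```javascript
// Map stands in for the shared store; `activateOnce` is a hypothetical name.
const store = new Map();
const STATE_KEY = 'killswitch:state';

function activateOnce(triggerReason, now = new Date()) {
  const current = store.get(STATE_KEY);
  if (current && current.active) {
    // Already active: keep the original trigger_reason and timestamp.
    return { written: false, state: current };
  }
  const state = {
    active: true,
    trigger_reason: triggerReason,
    activated_at: now.toISOString(),
  };
  store.set(STATE_KEY, state);
  return { written: true, state };
}
```

A second activation while the switch is already active then becomes a no-op, so the incident record keeps the reason that fired first.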
24. Polymarket V2 Compatibility
| Aspect | Value |
|---|---|
| CLOB version | v2 |
| Collateral asset | pUSD |
| EIP-712 Exchange domain version | 2 |
| Aware of builderCode field | no |
| Aware of negative-risk markets | no |
| Multi-chain ready | no |
| SDK used | @polymarket/clob-client-v2 ^2.x |
| Settlement contract | CTFExchangeV2 on Polygon |
| Notes | KillSwitch logic is V2-agnostic — it operates above the order-construction layer. The reject_rate_circuit trigger now counts V2 order rejections from CTFExchangeV2 matchOrders() calls. |
API surfaces declared
Networks supported
25. Versioning & Migration
| Field | Value |
|---|---|
| spec | 2.0.0 |
| implementation | 2.1.3 |
| schema | 2 |
| released | 2026-04-28 |
Migration history
| Date | From | To | Reason | Action taken |
|---|---|---|---|---|
| 2026-04-28 | v1 (USDC.e + HMAC builder) | v2 (pUSD + builderCode field) | Polymarket V2 cutover | Updated reject-rate counter to track V2 CTFExchangeV2 rejections. No structural change to the kill-switch logic or Redis schema. |
26. Acceptance Tests
Unit Tests
| Test | Setup | Expected result |
|---|---|---|
| Reject all orders when active flag is true | killswitch.active=true, order=any | REJECT with appropriate reason_code on every order |
| Activate on intraday drawdown breach | intraday_drawdown_pct=13, threshold=12 | active flag set to true, trigger_reason=INTRADAY_DRAWDOWN_EXCEEDED |
| Activate on weekly drawdown breach | weekly_drawdown_pct=22, threshold=20 | active flag set to true, trigger_reason=WEEKLY_DRAWDOWN_EXCEEDED |
| Activate on reject-rate circuit breach | reject_rate_5min=35, threshold=30 | active flag set to true, trigger_reason=ORDER_BOOK_UNAVAILABLE |
| Activate on manual kill signal | admin_ui.kill_flag=true | active flag set to true, trigger_reason=MANUAL_KILL |
| Does not auto-reset when require_manual_reset=true even after condition clears | active=true, drawdown drops below warning, require_manual_reset=true | active flag remains true; orders continue to be rejected |
Integration Tests
| Test | Expected result |
|---|---|
| End-to-end: drawdown breach in PortfolioGuard activates KillSwitch and blocks next OrderIntent | Next OrderIntent after drawdown breach returns REJECT without reaching LiquidityGuard or OracleRiskMonitor |
| WebSocket dead with open positions activates KillSwitch within 30 seconds | KillSwitch active flag set within 30 s of WebSocket disconnection when open_positions > 0 |
| Admin UI reset clears active flag and allows orders through again | OrderIntents approved by other guardrails pass through normally after Admin UI reset |
Property Tests
| Property | Required behaviour |
|---|---|
| No order is ever approved when KillSwitch active flag is true | Always true — active flag is checked atomically before any other processing |
| KillSwitch activation is monotonic when require_manual_reset=true | Once active, the flag cannot be cleared by any automated process; only a manual reset counts |
| Missing drawdown data triggers activation, never approval | Always true — data feed unavailability defaults to halt |
27. Operational Runbook
KillSwitch is the most critical bot in the system. Every activation is a P0 incident requiring immediate operator attention. Never auto-reset without confirming root cause.
On-call actions
| Alert | First step | Diagnosis | Mitigation | Escalate to |
|---|---|---|---|---|
| KillSwitchActivated | Identify trigger_reason from the alert label or Redis state. | For INTRADAY_DRAWDOWN_EXCEEDED: check PortfolioGuard drawdown gauge. For ORDER_BOOK_UNAVAILABLE: check WS feed and CLOB status. For STALE_MARKET_DATA: check Redis and Data API connectivity. | Do NOT reset until root cause is confirmed. Notify Risk pod lead. | Risk pod lead immediately on every KillSwitch activation. |
| KillSwitchRedisUnreachable | Check Redis cluster health immediately. | If Redis is down, KillSwitch is in fail-safe mode (active). All trading is halted. | Restore Redis connectivity. KillSwitch will resume normal operation once Redis is reachable. | Infra on-call immediately. |
| KillSwitchHighCheckLatency | Check Redis p99 latency and network path. | High latency on the fast-path Redis read will delay HARD_REJECT on every intent. | Switch to replica Redis node if primary is degraded. | Infra on-call if latency > 10ms sustained. |
Manual overrides
- polytraders bot reset killswitch: Clears the active flag in Redis after operator confirms root cause resolved. Requires Risk pod lead sign-off and is audit-logged.
- polytraders bot status killswitch: Prints current Redis state including trigger_reason, activated_at, and require_manual_reset flag.
Healthcheck
GET /health → 200 if Redis is reachable and key read latency < 5ms.
28. Promotion Gates
A bot does not advance to the next readiness state until every gate below is green. Gates are observable from production data — no subjective sign-off.
Promote to Shadow
| Gate | How measured | Threshold |
|---|---|---|
| Unit tests pass for all trigger conditions and deactivation logic | CI test run | 100% pass |
| Redis integration test: atomic SET and GET round-trip verified | Integration test | Pass |
Promote to Limited live
| Gate | How measured | Threshold |
|---|---|---|
| Fast-path check latency p99 < 10ms over 24h | polytraders_risk_killswitch_check_latency_ms histogram | p99 < 10ms |
| Background monitor fires drawdown activation in staging within 5s of breach | Failure injection test | Activation within 5s |
Promote to General live
| Gate | How measured | Threshold |
|---|---|---|
| Manual kill flow verified end-to-end including Admin UI auth and audit log | E2E test in staging | Pass |
| Redis-unavailable fail-safe activates within 1s of Redis going down | Failure injection test | Pass |
29. Developer Checklist
Ready-to-ship score: 27/27 sections complete · 100%
| Requirement | Status |
|---|---|
| Purpose defined | ✓ done |
| Required inputs listed | ✓ done |
| Parameters defined | ✓ done |
| Defaults defined | ✓ done |
| Warning thresholds defined | ✓ done |
| Hard thresholds defined | ✓ done |
| Safe fallback defined | ✓ done |
| Structured output defined | ✓ done |
| Developer log defined | ✓ done |
| Plain-English explanation | ✓ done |
| Unit tests defined | ✓ done |
| Integration tests defined | ✓ done |
| Property tests defined | ✓ done |
| Failure-mode block complete | ✓ done |
| Reference implementation pseudocode | ✓ done |
| Wire examples (input + output) | ✓ done |
| Reason codes listed | ✓ done |
| Metrics & logs defined | ✓ done |
| State & persistence defined | ✓ done |
| Concurrency & idempotency defined | ✓ done |
| Dependencies declared | ✓ done |
| Security surfaces declared | ✓ done |
| Polymarket V2 compatibility declared | ✓ done |
| Version & migration history declared | ✓ done |
| Operational runbook defined | ✓ done |
| Promotion gates defined | ✓ done |
| Failure-injection recipes defined | ✓ done |