Polytraders Dev Guide
internal

6.19 ConfigDriftDetector

Governance Governance Observe PLANNED Spec ready capital · Direct P7 · Governance & replay pending stub

Compares the running BotConfig of every live bot against the latest committed config in the config repo. Any drift (running != committed) is surfaced as a ConfigDriftReport naming the bot, the field, and the drift amount. Operators are forced to either commit the change or revert it.

v3 readiness

Docs: 27/27 (done)
Impl: 0/15 (pending)
Backtest: 0/4 (pending)
Runtime: 0/8 (pending)

A bot is done when all four scores are complete.

1. Bot Identity

Layer: Governance
Bot class: Governance
Authority: Observe
Status: PLANNED
Readiness: Spec ready
Runs before: (none listed)
Runs after: (none listed)
Applies to: Continuous
Default mode: shadow
User-visible: Yes
Developer owner: Governance pod

Operational profile

Ownership: Governance pod · on-call gov-oncall · #polytraders-gov · escalates to Head of Governance · P2
Latency budget: 5000 ms
Modes supported: off, shadow, advisory, enforced
Data freshness: max_market_data_age_ms=900000 · max_orderbook_age_ms=900000 · max_external_feed_age_ms=900000 · on stale: emit status=UNKNOWN
Human override: yes · by Governance on-call · logs GOV_CONFIG_DRIFT_ACK · time-bound: until next check · scope: single bot_slug · second approval required

2. Purpose

Compares the running BotConfig of every live bot against the latest committed config in the config repo. Any drift (running != committed) is surfaced as a ConfigDriftReport naming the bot, the field, and the drift amount. Operators are forced to either commit the change or revert it.

3. Why This Bot Matters

  • Untracked tuning

    An on-call engineer who tweaks a threshold via the Admin UI without committing the change leaves no audit trail; the next incident review cannot reconstruct the system state.

  • Drift between staging and prod

    Without an explicit comparison, prod can silently run an ancient config while staging is updated.

  • Compliance evidence

    Auditors require evidence that the running configuration matches a reviewed and signed-off version.

No worked examples on this bot yet. Worked examples are optional but strongly recommended — they turn an abstract failure mode into something a developer can verify in a fixture.

4. Required Polymarket Inputs

— not yet authored —

5. Required Internal Inputs

Input | Source | Required? | Use
Running BotConfig per bot | Bot runtime | Yes | Effective config in process memory, including any live-edited fields.
Committed BotConfig per bot | Config repo (Git) | Yes | Source-of-truth, signed-off configuration.

6. Parameter Guide

Parameter | Default | Warning | Hard | What it controls
check_interval_minutes | 15 | 30 | 60 | How often the drift comparison runs.
tolerance_for_numeric_drift | 0 | 0 | 0.001 | Tolerance for numeric fields before a drift is flagged.
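
One way to read the Default / Warning / Hard columns is as escalating limits on an operator-supplied value. The sketch below assumes that reading (a value above Warning is flagged, a value above Hard is refused); the spec does not state these semantics explicitly, and the names `LIMITS` and `classify_param` are hypothetical.

```python
# Assumed semantics (not stated in the spec): value <= warning is fine,
# warning < value <= hard is flagged, value > hard is refused.
LIMITS = {
    # parameter: (default, warning, hard), copied from the Parameter Guide table
    "check_interval_minutes": (15, 30, 60),
    "tolerance_for_numeric_drift": (0, 0, 0.001),
}


def classify_param(name: str, value: float) -> str:
    """Return 'ok', 'warning', or 'hard' for a proposed parameter value."""
    _default, warning, hard = LIMITS[name]
    if value > hard:
        return "hard"
    if value > warning:
        return "warning"
    return "ok"
```

Under this reading, any non-zero tolerance already lands in the warning band, which matches the "strict by default" intent of the parameter.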

7. Detailed Parameter Instructions

check_interval_minutes

What it means

How often the drift comparison runs.

Default

{ "check_interval_minutes": 15 }

Why this default matters

Quarter-hourly is frequent enough to catch live edits before they outlive the on-call shift.

Threshold logic

Condition | Action
15 | Default

Developer check

schedule.every(p.check_interval_minutes).minutes.do(check);

User-facing English

(Internal.)

tolerance_for_numeric_drift

What it means

Tolerance for numeric fields before a drift is flagged.

Default

{ "tolerance_for_numeric_drift": 0 }

Why this default matters

Zero — there is no acceptable silent drift in production. Use `human_override` to record an intentional change.

Threshold logic

Condition | Action
0 | Default (strict)

Developer check

if (abs(running - committed) > p.tolerance_for_numeric_drift) flag(field);

User-facing English

(Internal.)

8. Default Configuration

{
  "check_interval_minutes": 15,
  "tolerance_for_numeric_drift": 0
}

9. Implementation Flow

— not yet authored —

10. Reference Implementation

Pseudocode is language-agnostic. FETCH = read input. EMIT = produce output. IF/THEN/ELSE = decision. Translate directly to TypeScript, Python, Go, or Rust.

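for bot in registry.live_bots():
  running = bot.runtime_config()                    # effective in-memory config, including live edits
  committed = repo.config(bot.slug, branch='main')  # signed-off source of truth
  # field-by-field diff; numeric fields within the tolerance are not reported
  diffs = canonical_diff(running, committed, p.tolerance_for_numeric_drift)
  if diffs:
    emit('ConfigDriftReport', bot.slug, diffs)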
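
The pseudocode leaves `canonical_diff` undefined. A minimal Python sketch of one possible implementation follows. It assumes a flat (non-nested) config dict, applies the tolerance only when both sides are numeric, and treats a field present on one side only as drift; the output shape follows the wire example in section 11. None of these choices are confirmed by the spec.

```python
from numbers import Real
from typing import Any

_MISSING = object()  # sentinel so that "absent" differs from an explicit None


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python; treat it as an enum-like value instead
    return isinstance(value, Real) and not isinstance(value, bool)


def canonical_diff(running: dict[str, Any], committed: dict[str, Any],
                   tolerance: float = 0) -> list[dict[str, Any]]:
    """List the top-level fields whose values differ between the two configs."""
    drifts = []
    for field in sorted(running.keys() | committed.keys()):
        r = running.get(field, _MISSING)
        c = committed.get(field, _MISSING)
        if _is_number(r) and _is_number(c):
            drifted = abs(r - c) > tolerance  # numeric tolerance
        else:
            # strict equality on everything else; missing-vs-present is drift
            drifted = type(r) is not type(c) or r != c
        if drifted:
            drifts.append({
                "field": field,
                "running": None if r is _MISSING else r,
                "committed": None if c is _MISSING else c,
            })
    return drifts
```

Sorting the field names keeps the report order stable between runs, which makes consecutive reports easier to compare.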

11. Wire Examples

Input — what arrives on the wire

{
  "bot_slug": "risk.killswitch",
  "running": {
    "intraday_drawdown_pct": 10
  },
  "committed": {
    "intraday_drawdown_pct": 12
  }
}

Output — what the bot emits

{
  "kind": "ConfigDriftReport",
  "bot_slug": "risk.killswitch",
  "drifts": [
    {
      "field": "intraday_drawdown_pct",
      "running": 10,
      "committed": 12
    }
  ]
}

12. Decision Logic

APPROVE

Strict equality on enums and strings. Numeric tolerance applied via `tolerance_for_numeric_drift`. Drift latched until either a commit or revert resolves it.

RESHAPE_REQUIRED

This bot does not reshape orders.

REJECT

No reject path defined for this bot — it is observe-only.

WARNING_ONLY

No warn-only path defined.
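
The latching rule under APPROVE ("drift latched until either a commit or revert resolves it") can be sketched as a small in-memory latch. This is an illustration only: the class name `DriftLatch`, its methods, and the detected/resolved return shape are hypothetical, and the first-seen timestamp is assumed to correspond to the `since_ts_ms` field shown in section 13.

```python
import time
from typing import Optional


class DriftLatch:
    """Remembers when each (bot, field) drift was first seen until it clears."""

    def __init__(self) -> None:
        self._since: dict[tuple[str, str], int] = {}

    def update(self, bot_slug: str, drifted_fields: set[str],
               now_ms: Optional[int] = None) -> dict[str, list[str]]:
        now_ms = int(time.time() * 1000) if now_ms is None else now_ms
        previous = {f for (b, f) in self._since if b == bot_slug}
        newly_detected = drifted_fields - previous
        newly_resolved = previous - drifted_fields
        for field in newly_detected:
            self._since[(bot_slug, field)] = now_ms  # latch the first-seen time
        for field in newly_resolved:
            del self._since[(bot_slug, field)]  # a commit or revert cleared it
        return {"detected": sorted(newly_detected),
                "resolved": sorted(newly_resolved)}

    def since_ts_ms(self, bot_slug: str, field: str) -> Optional[int]:
        return self._since.get((bot_slug, field))
```

A field that stays drifted across checks keeps its original timestamp, so the report shows how long the drift has been open rather than when it was last observed.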

13. Standard Decision Output

This bot emits an OperationsReport with kind=ConfigDriftReport (see 22. Dependencies).

{
  "kind": "ConfigDriftReport",
  "bot_slug": "risk.killswitch",
  "drifts": [
    {
      "field": "intraday_drawdown_pct",
      "running": 10,
      "committed": 12,
      "since_ts_ms": 1715260000000
    }
  ]
}

14. Reason Codes

Code | Severity | Meaning | Action | User-facing message
GOV_CONFIG_DRIFT_DETECTED | P3 | The running config differs from the committed config in at least one field. | See decision output and developer log for context. | The running configuration of one of the system's safeties differs from the version on file. Operators must reconcile.
GOV_CONFIG_DRIFT_RESOLVED | P3 | A previously reported drift has been cleared by a commit or a revert. | See decision output and developer log for context. | The running configuration of the affected safety matches the version on file again.
GOV_CONFIG_DRIFT_UNKNOWN | P3 | The committed config could not be fetched or the inputs were stale, so drift status is unknown. | See decision output and developer log for context. | The system could not confirm that the running configuration of one of its safeties matches the version on file. Operators must investigate.

15. Metrics & Logs

Metrics emitted

Metric | Type | Unit | Labels | Meaning
drift_reports_total | counter | event | bot_id | Total drift reports emitted.
bots_in_drift | gauge | bots | bot_id | Bots currently in drift.
drift_resolution_minutes_histogram | histogram | minutes | bot_id | Time from drift detection to resolution.

Dashboards

  • 6.19 overview dashboard

16. Developer Reporting

"Per check: bot_slug, drifts_count, fields_drifted."

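The three per-check fields named above could be emitted as one structured log line per bot. The sketch below is one possible shape, assuming JSON log lines and the drift list format from the wire example; the function name `check_log_record` is hypothetical.

```python
import json


def check_log_record(bot_slug: str, drifts: list[dict]) -> str:
    """Serialise one per-check log line with the three fields named above."""
    fields = sorted(d["field"] for d in drifts)
    return json.dumps({
        "bot_slug": bot_slug,
        "drifts_count": len(fields),
        "fields_drifted": fields,
    }, sort_keys=True)
```

A check that finds nothing still produces a line with `drifts_count` of 0, which lets log queries distinguish "checked, clean" from "never checked".
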
17. Plain-English Reporting

Situation | User-facing explanation
When this bot acts | The running configuration of one of the system's safeties differs from the version on file. Operators must reconcile.

18. Failure-Mode Block

main_failure_mode: Comparing against the wrong committed revision (e.g. wrong branch).
false_positive_risk: Differences in field ordering or default-equivalence falsely flagged; mitigation: canonicalise both sides through the JSON Schema before diffing.
false_negative_risk: Bot's running config object is missing a field the committed version added; mitigation: schema-validate both sides and treat missing-vs-present as drift.
safe_fallback: If the committed config cannot be fetched, emit ConfigDriftReport with status=UNKNOWN; never silently report 'no drift'.
required_dependencies: (none listed)
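
The two mitigations above pull in slightly different directions: filling schema defaults removes false positives, while missing-vs-present must still count as drift. The sketch below shows one way to reconcile them, under the assumption that only fields known to the schema receive defaults. `SCHEMA_DEFAULTS` is a hypothetical stand-in for the BotConfig JSON Schema, which this guide does not reproduce, and both function names are illustrative.

```python
from typing import Any, Mapping

# Hypothetical defaults for illustration; in practice they would come from the
# BotConfig JSON Schema.
SCHEMA_DEFAULTS: dict[str, Any] = {"mode": "shadow", "intraday_drawdown_pct": 12}


def canonicalise(config: Mapping[str, Any],
                 defaults: Mapping[str, Any] = SCHEMA_DEFAULTS) -> dict[str, Any]:
    """Fill schema defaults and order keys so equivalent configs compare equal."""
    merged = {**defaults, **config}  # explicit values win over defaults
    # Key order matters once the result is serialised or hashed; plain dict
    # equality already ignores it.
    return {key: merged[key] for key in sorted(merged)}


def fields_that_differ(running: Mapping[str, Any],
                       committed: Mapping[str, Any]) -> list[str]:
    a, b = canonicalise(running), canonicalise(committed)
    missing = object()  # a field outside the schema stays absent, so it is drift
    return [k for k in sorted(a.keys() | b.keys())
            if a.get(k, missing) != b.get(k, missing)]
```

With this approach an omitted field and an explicitly written default compare equal, while a field the schema does not know about is reported whenever only one side carries it.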

19. Failure-Injection Recipes

Scenario | How to inject | Expected behaviour | Recovery
Config repo unreachable | Block the config repo and assert UNKNOWN is emitted. | Bot detects within its latency budget and emits the corresponding reason code. | Remove the injected fault; bot returns to healthy state within one debounce window.
Single-field drift | Drift one field and assert the report contains exactly that field. | Bot detects within its latency budget and emits the corresponding reason code. | Remove the injected fault; bot returns to healthy state within one debounce window.
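
Both recipes can be exercised against a fake repo in a unit-level harness. The sketch below is an assumption-laden illustration: `check_bot`, `RepoUnavailable`, and `blocked_repo` are hypothetical names, the diff is a simplified stand-in (it treats a missing field and an explicit null alike), and the `status` field follows the safe_fallback wording in section 18.

```python
from typing import Any, Callable, Optional


class RepoUnavailable(Exception):
    """Raised by the fake repo to simulate a blocked config repo."""


def check_bot(bot_slug: str, running: dict[str, Any],
              fetch_committed: Callable[[str], dict[str, Any]]
              ) -> Optional[dict[str, Any]]:
    try:
        committed = fetch_committed(bot_slug)
    except RepoUnavailable:
        # Safe fallback: never treat an unreadable repo as "no drift".
        return {"kind": "ConfigDriftReport", "bot_slug": bot_slug,
                "status": "UNKNOWN", "drifts": []}
    drifts = [
        {"field": f, "running": running.get(f), "committed": committed.get(f)}
        for f in sorted(running.keys() | committed.keys())
        if running.get(f) != committed.get(f)
    ]
    if not drifts:
        return None  # nothing to report, matching the reference pseudocode
    return {"kind": "ConfigDriftReport", "bot_slug": bot_slug, "drifts": drifts}


def blocked_repo(bot_slug: str) -> dict[str, Any]:
    """Fault injection: every fetch fails as if the repo were unreachable."""
    raise RepoUnavailable(bot_slug)
```

Passing `blocked_repo` as the fetcher covers the first recipe; passing a fetcher that returns a config with one changed field covers the second.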

20. State & Persistence

Last drift report per bot. Persisted to KV.

State stores

Name | Kind | Key | Value shape | TTL | Durability
config_drift_detector_state | in-memory + fast KV mirror | bot_id | Last drift report per bot | 24h | crash-safe via KV mirror

Cold-start recovery

Cold-start hydrates from fast KV; missing keys default to safe fallback.

On restart

All in-flight decisions are re-evaluated; no bot decision is trusted across restart without re-emit.
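
The cold-start behaviour can be sketched with a plain mapping standing in for the fast KV store. Everything named here (`DriftState`, `unknown_report`, the report shape) is hypothetical, the 24h TTL is left out for brevity, and "safe fallback" is assumed to mean an UNKNOWN status as described in section 18.

```python
from typing import Any, MutableMapping


def unknown_report() -> dict[str, Any]:
    """Assumed safe-fallback value for a bot with no persisted report."""
    return {"status": "UNKNOWN", "drifts": []}


class DriftState:
    """In-memory view of the last report per bot, mirrored to a KV store."""

    def __init__(self, kv: MutableMapping[str, dict[str, Any]]) -> None:
        self._kv = kv
        self._memory: dict[str, dict[str, Any]] = dict(kv)  # cold-start hydrate

    def put(self, bot_id: str, report: dict[str, Any]) -> None:
        self._kv[bot_id] = report  # write the mirror first so a crash cannot lose it
        self._memory[bot_id] = report

    def get(self, bot_id: str) -> dict[str, Any]:
        # Missing key: fall back to UNKNOWN rather than implying "no drift".
        return self._memory.get(bot_id) or unknown_report()
```

After a restart, a new `DriftState` built on the same KV sees the previously stored reports, and any bot without an entry reads as UNKNOWN until the next check re-emits.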

21. Concurrency & Idempotency

Aspect | Specification
Execution model | Single scheduled checker; no per-bot fan-out.
Max in-flight | 32
Idempotency key | order_intent_id
Replay-safe | True
Deduplication | By idempotency_key within a 60s window.
Ordering guarantees | Per-market_id FIFO; cross-market unordered.
Per-call timeout (ms) | 250
Backpressure strategy | Bounded queue; oldest-dropped with metric increment when full.
Locking / mutual exclusion | Per-market_id mutex; no global locks.

22. Dependencies

Consumes: BotConfigRunning, BotConfigCommitted
Emits: OperationsReport(kind=ConfigDriftReport)
Blocks orders: no

23. Security Surfaces

Read-only access to config repo. Read-only RPC into bot runtime.

Signing surface

None — bot does not sign or submit.

Mitigations

  • Rate-limit per source
  • Audit-log every override
  • Require role-based authz on admin paths

24. Polymarket V2 Compatibility

Aspect | Value
CLOB version | V2
Collateral asset | pUSD
EIP-712 Exchange domain version | 2
Aware of builderCode field | yes
Aware of negative-risk markets | yes
Multi-chain ready | yes
SDK used | Polymarket CLOB V2 SDK
Settlement contract | CTFExchangeV2
Notes | Operates on V2 BotConfig schema only.

25. Versioning & Migration

Field | Value
current | 0.1.0
contract_version | 1.0.0
last_breaking_change | none
deprecation_window_days | 30

26. Acceptance Tests

Unit Tests

Test | Setup | Expected result
Identical configs report no drift. | Synthetic fixture per template. | Behaviour matches the rule described in the test name.
One numeric field changed by 1 reports the drift exactly. | Synthetic fixture per template. | Behaviour matches the rule described in the test name.

Integration Tests

Test | Expected result
Bump a running config via Admin UI without committing; the next check emits a drift report within one interval. | End-to-end behaviour matches the spec without manual intervention.

Property Tests

Property | Required behaviour
For any (running, committed), the drift list contains exactly the fields whose canonicalised values differ. | Always true across all generated inputs.
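
The property above can be checked with randomly generated configs. The sketch below uses only the standard library and a minimal stand-in diff (`drift_fields`, hypothetical) in place of the real detector; the generator, field names, and trial count are arbitrary choices for illustration. The oracle patches the running config with committed values for the reported fields only, which must reproduce the committed config exactly.

```python
import random


def drift_fields(running: dict, committed: dict) -> set:
    """Minimal stand-in for the detector's diff: names of differing fields."""
    return {f for f in running.keys() | committed.keys()
            if running.get(f, "<missing>") != committed.get(f, "<missing>")}


def random_config(rng: random.Random) -> dict:
    # Each field is present about 80% of the time with a small integer value.
    return {f: rng.randint(0, 3) for f in ["a", "b", "c", "d"]
            if rng.random() < 0.8}


def check_property(trials: int = 500, seed: int = 0) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        running, committed = random_config(rng), random_config(rng)
        reported = drift_fields(running, committed)
        # Completeness: fixing only the reported fields must yield `committed`.
        patched = {f: v for f, v in running.items() if f not in reported}
        patched.update({f: committed[f] for f in reported if f in committed})
        if patched != committed:
            return False
        # Soundness: no reported field may hold equal values on both sides.
        if any(f in running and f in committed and running[f] == committed[f]
               for f in reported):
            return False
    return True
```

Swapping `drift_fields` for the production diff would turn the same harness into the property test listed in the table.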

27. Operational Runbook

If multiple bots drift simultaneously: confirm the config repo branch the checker is reading from is the production branch.

On-call actions

Alert | First step | Diagnosis | Mitigation | Escalate to
6.19_anomaly | Open the bot's reporting page and confirm the alert is real (not a metric hiccup). | Inspect developer log entries for the affected market_id over the last 30 minutes. | Force-clear via Admin UI if the rule is clearly stale; otherwise leave engaged and notify owner. | Governance pod

Manual overrides

  • polytraders bot pause 6.19 — Disables the bot's enforcement layer; downstream consumers fall back to safe defaults.

Healthcheck

GET /healthz/config_drift_detector → 200 if last successful evaluation < 60s ago.
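
The healthcheck rule reduces to a single age comparison. A minimal sketch follows; the function name `healthz_status` is hypothetical, and returning 503 when the check is stale is an assumption, since the guide only specifies the 200 case. The window is a parameter so it can be set relative to `check_interval_minutes` if needed.

```python
def healthz_status(last_success_ts: float, now: float,
                   max_age_s: float = 60.0) -> int:
    """Return 200 if the last successful evaluation is recent enough."""
    age_s = now - last_success_ts
    # 503 for the stale case is an assumption; the guide only defines 200.
    return 200 if age_s < max_age_s else 503
```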

28. Promotion Gates

A bot does not advance to the next readiness state until every gate below is green. Gates are observable from production data — no subjective sign-off.

Promote to Shadow

Gate | How measured | Threshold
Stub | Against synthetic drifts. | Documented threshold met for the full window.

Promote to Limited live

Gate | How measured | Threshold
Shadow | 14 days; reports compared by Governance on-call. | Documented threshold met for the full window.
Advisory | 7 days. | Documented threshold met for the full window.

Promote to General live

Gate | How measured | Threshold
Enforced | Drift reports break the daily ops digest. | Documented threshold met for the full window.

29. Developer Checklist

Ready-to-ship score: 27/27 sections complete · 100%

Requirement | Status
Purpose defined | ✓ done
Required inputs listed | ✓ done
Parameters defined | ✓ done
Defaults defined | ✓ done
Warning thresholds defined | ✓ done
Hard thresholds defined | ✓ done
Safe fallback defined | ✓ done
Structured output defined | ✓ done
Developer log defined | ✓ done
Plain-English explanation | ✓ done
Unit tests defined | ✓ done
Integration tests defined | ✓ done
Property tests defined | ✓ done
Failure-mode block complete | ✓ done
Reference implementation pseudocode | ✓ done
Wire examples (input + output) | ✓ done
Reason codes listed | ✓ done
Metrics & logs defined | ✓ done
State & persistence defined | ✓ done
Concurrency & idempotency defined | ✓ done
Dependencies declared | ✓ done
Security surfaces declared | ✓ done
Polymarket V2 compatibility declared | ✓ done
Version & migration history declared | ✓ done
Operational runbook defined | ✓ done
Promotion gates defined | ✓ done
Failure-injection recipes defined | ✓ done