Polytraders Dev Guide
internal

2.10 LatencyProfiler

Execution Execution Utility Reshape PLANNED Spec started capital · Direct P5 · Execution rails pending stub

v3 readiness

Docs: 27/27 (done)
Impl: 0/15 (pending)
Backtest: 0/4 (pending)
Runtime: 0/8 (pending)

A bot is done when all four scores are done.

1. Bot Identity

Layer: Execution
Bot class: Execution Utility
Authority: Reshape
Status: PLANNED
Readiness: Spec started
Runs before: Any exec bot that uses latency data for routing decisions
Runs after: Order submission and fill events from ws_user
Applies to: All CLOB V2 order submission and ws feed routes continuously
Default mode: shadow_only
User-visible: summary-only
Developer owner: Polytraders core, Execution pod

Operational profile

Modes supported: quarantine

2. Purpose

LatencyProfiler continuously measures round-trip order submission latency by route and surfaces regressions. It probes each configured route at probe_interval_s and emits ObservationReports when p95 or p99 thresholds are breached.

3. Why This Bot Matters

  • Latency regression undetected

    Strategy signals age past their TTL in transit, causing stale-signal discards and missed opportunities without a clear root cause.

  • Route not profiled per endpoint

    A degraded CLOB endpoint continues to receive orders because the routing layer lacks per-route latency data.

  • WebSocket lag not tracked

    ws_user fill events arrive late, causing order lifecycle state to be updated with significant delay.

No worked examples on this bot yet. Worked examples are optional but strongly recommended — they turn an abstract failure mode into something a developer can verify in a fixture.

4. Required Polymarket Inputs

Input | Source | Required? | Use
CLOB V2 order submission endpoint (probe orders) | clob_auth | Yes | Measure submit-to-ack latency per endpoint.
WebSocket user feed heartbeat | ws_user | Yes | Measure ws feed lag by comparing heartbeat timestamp to local clock.

5. Required Internal Inputs

Input | Source | Required? | Use
Probe trigger from scheduler | internal scheduler | Yes | Trigger a latency probe on each configured route every probe_interval_s.

6. Parameter Guide

Parameter | Default | Warning | Hard | What it controls
warn_p95_ms | 150 | 200 | 500 | p95 round-trip latency in milliseconds above which a WARN ObservationReport is emitted.
fail_p99_ms | 500 | 750 | 1000 | p99 round-trip latency in milliseconds above which HARD_REJECT is raised and the route is flagged as degraded.
probe_interval_s | 30 | 60 | 120 | How often to send a probe request to each configured route to measure latency.
routes_to_probe | ['clob_auth', 'ws_user'] |  |  | List of route identifiers to probe. Each entry corresponds to a configured CLOB V2 endpoint or WebSocket feed.

7. Detailed Parameter Instructions

warn_p95_ms

What it means

p95 round-trip latency in milliseconds above which a WARN ObservationReport is emitted.

Default

{ "warn_p95_ms": 150 }

Why this default matters

150ms p95 is the target for acceptable order routing latency; above 200ms strategies begin experiencing signal-age issues.

Threshold logic

Condition | Action
p95_ms <= 150 | No alert
150 < p95_ms <= 200 | WARN: LATENCY_WARN emitted
p95_ms > 500 (hard) | HARD_REJECT: LATENCY_HARD_BREACH; alert fired

Developer check

if p95 > params.warn_p95_ms: emit(LATENCY_WARN)

User-facing English

Exchange connection speed is being monitored.

fail_p99_ms

What it means

p99 round-trip latency in milliseconds above which HARD_REJECT is raised and the route is flagged as degraded.

Default

{ "fail_p99_ms": 500 }

Why this default matters

500ms p99 is the threshold at which GTD signal TTLs begin expiring in transit; above this, order submission must be suspended on the degraded route.

Threshold logic

Condition | Action
p99_ms <= 500 | Healthy
500 < p99_ms <= 750 | WARN: LATENCY_P99_ELEVATED
p99_ms > 1000 (hard) | HARD_REJECT: flag route degraded; notify exec bots

Developer check

if p99 > params.fail_p99_ms: flagRoute(route, 'degraded')

User-facing English

— not yet authored —

probe_interval_s

What it means

How often to send a probe request to each configured route to measure latency.

Default

{ "probe_interval_s": 30 }

Why this default matters

30s provides frequent enough sampling to detect latency regressions within one minute while consuming minimal rate-limit budget.

Threshold logic

Condition | Action
interval <= 30s | Normal probe cadence
interval > 60s | WARN: latency regressions may go undetected for > 1 minute
interval > 120s (hard) | Reject config

Developer check

assert params.probe_interval_s <= locked.probe_interval_s.max  # hard limit: 120

User-facing English

— not yet authored —

routes_to_probe

What it means

List of route identifiers to probe. Each entry corresponds to a configured CLOB V2 endpoint or WebSocket feed.

Default

{ "routes_to_probe": ["clob_auth", "ws_user"] }

Why this default matters

Probing both REST auth and WebSocket feeds captures the two most latency-sensitive paths for order execution.

Threshold logic

Condition | Action
includes both clob_auth and ws_user | Full coverage
missing ws_user | WARN: WebSocket lag not monitored

Developer check

if 'ws_user' not in params.routes_to_probe: emit(WARN)

User-facing English

— not yet authored —

8. Default Configuration

{
  "bot_id": "exec.latencyprofiler",
  "version": "0.1.0",
  "mode": "shadow_only",
  "defaults": {
    "warn_p95_ms": 150,
    "fail_p99_ms": 500,
    "probe_interval_s": 30,
    "routes_to_probe": [
      "clob_auth",
      "ws_user"
    ]
  },
  "locked": {
    "warn_p95_ms": {
      "max": 500
    },
    "fail_p99_ms": {
      "max": 1000
    },
    "probe_interval_s": {
      "max": 120
    }
  }
}
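
The "locked" block reads as a set of upper bounds that no override may exceed. A minimal sketch of how that check could look, assuming overrides are merged on top of "defaults" before validation; `validate_overrides` is a hypothetical helper, not part of the guide or of any SDK.

```python
from typing import Any

# Values copied from the default configuration above.
CONFIG: dict[str, Any] = {
    "defaults": {
        "warn_p95_ms": 150,
        "fail_p99_ms": 500,
        "probe_interval_s": 30,
        "routes_to_probe": ["clob_auth", "ws_user"],
    },
    "locked": {
        "warn_p95_ms": {"max": 500},
        "fail_p99_ms": {"max": 1000},
        "probe_interval_s": {"max": 120},
    },
}


def validate_overrides(config: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Merge overrides onto defaults; raise ValueError if a locked max is exceeded.

    Hypothetical helper: the guide does not specify how overrides are applied.
    """
    params = {**config["defaults"], **overrides}
    for name, bounds in config["locked"].items():
        if params[name] > bounds["max"]:
            raise ValueError(f"{name}={params[name]} exceeds locked max {bounds['max']}")
    return params
```

Calling `validate_overrides(CONFIG, {"probe_interval_s": 60})` returns the merged parameters, while an interval of 121 raises `ValueError`, which corresponds to the "Reject config" row for probe_interval_s.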

9. Implementation Flow

  1. Every probe_interval_s, for each route in routes_to_probe: send a probe request and record send_ms.
  2. For clob_auth: issue a lightweight GET /time or authenticated OPTIONS; record ack_ms.
  3. For ws_user: compare heartbeat ts_ms to local now_ms; record feed_lag_ms.
  4. Maintain a rolling window of the last 100 probe round-trip times per route.
  5. Compute p50, p95, p99 from the rolling window.
  6. If p95 > warn_p95_ms: emit ObservationReport(LATENCY_WARN) for the route.
  7. If p99 > fail_p99_ms: emit ObservationReport(LATENCY_HARD_BREACH); flag route as degraded in internal state store.
  8. Publish per-route latency histogram metrics every probe cycle.

10. Reference Implementation

Pseudocode is language-agnostic. FETCH = read input. EMIT = produce output. IF/THEN/ELSE = decision. Translate directly to TypeScript, Python, Go, or Rust.

FUNCTION probeRoute(route):
  sendMs = now_ms()
  IF route == 'clob_auth':
    result = clob_auth.GET('/time')  // lightweight probe
    ackMs = now_ms()
    rtt = ackMs - sendMs
    IF result IS NULL OR result.error:
      rtt = 1000  // count as max latency
  ELIF route == 'ws_user':
    hb = ws_user.lastHeartbeat()
    rtt = now_ms() - hb.ts_ms

  // Update rolling window
  windows[route].append(rtt)
  IF len(windows[route]) > 100:
    windows[route].pop(0)

  // Compute percentiles
  sorted_w = sorted(windows[route])
  p50 = sorted_w[int(0.50 * len(sorted_w))]
  p95 = sorted_w[int(0.95 * len(sorted_w))]
  p99 = sorted_w[int(0.99 * len(sorted_w))]

  // Threshold checks
  IF p99 > params.fail_p99_ms:
    routeState[route] = 'degraded'
    EMIT ObservationReport(route, p50, p95, p99, LATENCY_HARD_BREACH)
  ELIF p95 > params.warn_p95_ms:
    EMIT ObservationReport(route, p50, p95, p99, LATENCY_WARN)

SCHEDULE probeRoute FOR EACH route IN params.routes_to_probe
         EVERY params.probe_interval_s

SDK calls used

  • clob_auth.GET('/time')
  • ws_user.lastHeartbeat()

Complexity: O(W log W) where W = rolling window size (100)
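
As one possible Python translation of the window and percentile steps in the pseudocode, the sketch below keeps the same `int(q * n)` indexing and the same window size of 100. `RouteWindow` and `percentile` are illustrative names, and the probe itself is left out because the real `clob_auth` and `ws_user` client calls are not shown in this guide.

```python
from collections import deque

WINDOW_SIZE = 100  # rolling window size from the implementation flow


def percentile(sorted_samples: list[float], q: float) -> float:
    """Return the sample at index int(q * n), mirroring the pseudocode."""
    if not sorted_samples:
        raise ValueError("no samples recorded yet")
    return sorted_samples[int(q * len(sorted_samples))]


class RouteWindow:
    """Holds the last WINDOW_SIZE round-trip times for one route."""

    def __init__(self) -> None:
        # deque(maxlen=...) evicts the oldest sample automatically on overflow,
        # replacing the explicit pop(0) in the pseudocode.
        self.samples: deque = deque(maxlen=WINDOW_SIZE)

    def record(self, rtt_ms: float) -> None:
        self.samples.append(rtt_ms)

    def percentiles(self) -> tuple[float, float, float]:
        """Return (p50, p95, p99) over the current window."""
        ordered = sorted(self.samples)
        return (
            percentile(ordered, 0.50),
            percentile(ordered, 0.95),
            percentile(ordered, 0.99),
        )
```

With this indexing and a full window of 100 samples, p99 is the largest sample in the window, so the choice of percentile method directly affects how sensitive the hard threshold is to a single slow probe.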

11. Wire Examples

Input — what arrives on the wire

Probe trigger (internal scheduler) · internal

{
  "route": "clob_auth",
  "trigger_ts_ms": 1746770300000
}

Output — what the bot emits

ObservationReport — LATENCY_WARN

{
  "report_id": "rep_5e6f7a8b9c0d1e2f",
  "bot_id": "exec.latencyprofiler",
  "route": "clob_auth",
  "p50_ms": 45,
  "p95_ms": 160,
  "p99_ms": 280,
  "verdict": "LATENCY_WARN",
  "measured_at_ms": 1746770300000
}

12. Decision Logic

APPROVE

p95 and p99 within thresholds; route healthy; no ObservationReport emitted.

RESHAPE_REQUIRED

Not applicable — LatencyProfiler is observation-only; it does not reshape orders.

REJECT

p99 exceeds fail_p99_ms; route flagged degraded; LATENCY_HARD_BREACH emitted.

WARNING_ONLY

p95 exceeds warn_p95_ms but p99 within threshold; LATENCY_WARN emitted.
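
The three applicable outcomes can be expressed as a single function. This is a sketch under the assumption that the p99 check takes precedence when both thresholds are breached, as in the reference pseudocode; `classify` is a hypothetical name and the defaults are the values from the parameter guide.

```python
def classify(
    p95_ms: float,
    p99_ms: float,
    warn_p95_ms: float = 150,
    fail_p99_ms: float = 500,
) -> str:
    """Map one route's percentiles to a decision outcome.

    The p99 check runs first, so a route breaching both thresholds is REJECT.
    """
    if p99_ms > fail_p99_ms:
        return "REJECT"        # LATENCY_HARD_BREACH; route flagged degraded
    if p95_ms > warn_p95_ms:
        return "WARNING_ONLY"  # LATENCY_WARN
    return "APPROVE"           # route healthy; no ObservationReport
```

For the wire example values (p95 of 160 ms, p99 of 280 ms) this returns `"WARNING_ONLY"`. Both comparisons are strict, so values exactly at a threshold count as healthy.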

13. Standard Decision Output

This bot returns an ObservationReport object. See the ObservationReport schema.

{
  "report_id": "rep_5e6f7a8b9c0d1e2f",
  "trace_id": "trc_4d5e6f7a8b9c0d1e",
  "bot_id": "exec.latencyprofiler",
  "route": "clob_auth",
  "p50_ms": 45,
  "p95_ms": 160,
  "p99_ms": 280,
  "verdict": "LATENCY_WARN",
  "window_size": 100,
  "measured_at_ms": 1746770300000
}
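
A typed container makes the report shape easier to check in tests. The field names below are taken from the example above; the field types and the `to_json` helper are assumptions, since the ObservationReport schema itself is not reproduced in this section.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ObservationReport:
    """Field names follow the example payload; types are assumed."""

    report_id: str
    trace_id: str
    bot_id: str
    route: str
    p50_ms: int
    p95_ms: int
    p99_ms: int
    verdict: str
    window_size: int
    measured_at_ms: int

    def to_json(self) -> str:
        # asdict preserves declaration order, so keys match the example layout.
        return json.dumps(asdict(self))


example = ObservationReport(
    report_id="rep_5e6f7a8b9c0d1e2f",
    trace_id="trc_4d5e6f7a8b9c0d1e",
    bot_id="exec.latencyprofiler",
    route="clob_auth",
    p50_ms=45,
    p95_ms=160,
    p99_ms=280,
    verdict="LATENCY_WARN",
    window_size=100,
    measured_at_ms=1746770300000,
)
```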

14. Reason Codes

Code | Severity | Meaning | Action | User-facing message
LATENCY_OK | INFO | All probed routes within p95 and p99 thresholds. | No alert; emit metrics only. | 
LATENCY_WARN | WARN | p95 latency exceeded warn_p95_ms on a probed route. | Emit ObservationReport with WARN; do not block orders. | Exchange connection is slightly slower than normal.
LATENCY_HARD_BREACH | HARD_REJECT | p99 latency exceeded fail_p99_ms; route flagged as degraded. | Flag route degraded; notify exec bots; alert ops. | Exchange connection has degraded. Order submission may be affected.
PROBE_TIMEOUT | WARN | Probe request timed out; recorded as max latency (1000ms) in rolling window. | Record max latency; check for 3 consecutive timeouts before HARD_REJECT. | 
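
The PROBE_TIMEOUT row implies a per-route counter that resets on any successful probe. A minimal sketch of that escalation, assuming the third consecutive timeout is the one that escalates; `TimeoutTracker` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Optional

MAX_LATENCY_MS = 1000       # value recorded in the window for a timed-out probe
TIMEOUTS_BEFORE_BREACH = 3  # consecutive timeouts before escalation


class TimeoutTracker:
    """Counts consecutive probe timeouts per route."""

    def __init__(self) -> None:
        self.consecutive = defaultdict(int)

    def on_probe(self, route: str, timed_out: bool) -> Optional[str]:
        """Return the reason code to emit for this probe, or None on success."""
        if not timed_out:
            self.consecutive[route] = 0  # any success resets the streak
            return None
        self.consecutive[route] += 1
        if self.consecutive[route] >= TIMEOUTS_BEFORE_BREACH:
            return "LATENCY_HARD_BREACH"
        return "PROBE_TIMEOUT"
```

The caller would still append `MAX_LATENCY_MS` to the rolling window on every timeout; the tracker only decides which reason code accompanies it.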

15. Metrics & Logs

Metrics emitted

Metric | Type | Unit | Labels | Meaning
polytraders_exec_latencyprofiler_rtt_ms | histogram | ms | route | Round-trip latency histogram per probed route.
polytraders_exec_latencyprofiler_degraded_routes | gauge | count | (none) | Number of routes currently flagged as degraded.
polytraders_exec_latencyprofiler_probe_errors_total | counter | count | route | Total probe timeouts or errors per route.

Alerts

Alert | Condition | Severity | Runbook
LatencyProfilerRoutesDegraded | polytraders_exec_latencyprofiler_degraded_routes > 0 | P1 | #runbook-latencyprofiler-degraded
LatencyProfilerHighP99 | histogram_quantile(0.99, rate(polytraders_exec_latencyprofiler_rtt_ms_bucket[5m])) > 500 | P2 | #runbook-latencyprofiler-p99

16. Developer Reporting

{
  "route": "clob_auth",
  "p50_ms": 45,
  "p95_ms": 160,
  "p99_ms": 280,
  "warn_p95_ms": 150,
  "fail_p99_ms": 500,
  "samples": 100,
  "route_degraded": false
}

17. Plain-English Reporting

Situation | User-facing explanation
Latency warning on submission route | The connection to the exchange is slightly slower than normal. Orders may take a moment longer to be processed.
Route flagged degraded | The exchange connection speed has degraded significantly. Order submission may be suspended until conditions improve.

18. Failure-Mode Block

main_failure_mode: Probe requests consume rate-limit budget on a congested connection, making actual order submission slower.
false_positive_risk: A single slow probe response inflates p99, triggering LATENCY_HARD_BREACH when the route is actually healthy.
false_negative_risk: With a 100-sample rolling window and 30s probe intervals, the window takes up to 50 minutes (100 × 30s) to turn over, so a sudden latency spike takes that long to fully propagate through the p99 estimate.
safe_fallback: If the probe itself times out, record it as max latency (1000ms) in the rolling window; emit LATENCY_HARD_BREACH after 3 consecutive timeouts.
required_dependencies: clob_auth endpoint, ws_user heartbeat, internal scheduler for probe triggers
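
Both risks can be checked with a little arithmetic. The sketch below assumes the `int(0.99 * n)` indexing from the reference pseudocode and the default parameters; the sample values (40 ms healthy, one 1000 ms timeout) are made up for illustration.

```python
WINDOW_SIZE = 100
PROBE_INTERVAL_S = 30
FAIL_P99_MS = 500


def p99(samples: list[float]) -> float:
    ordered = sorted(samples)
    return ordered[int(0.99 * len(ordered))]  # same indexing as the reference pseudocode


# False-positive risk: 99 healthy probes plus a single timed-out probe.
healthy = [40.0] * WINDOW_SIZE
one_slow = [40.0] * (WINDOW_SIZE - 1) + [1000.0]

p99_healthy = p99(healthy)
p99_one_slow = p99(one_slow)
breach = p99_one_slow > FAIL_P99_MS

# False-negative risk: time for the window to turn over completely.
turnover_minutes = WINDOW_SIZE * PROBE_INTERVAL_S / 60
```

Here `p99_healthy` is 40.0, `p99_one_slow` is 1000.0 and `breach` is `True`, because with 100 samples index 99 is the window maximum. `turnover_minutes` is 50.0, matching the figure quoted for false_negative_risk.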

19. Failure-Injection Recipes

Scenario | How to inject | Expected behaviour | Recovery
CLOB_AUTH_HIGH_LATENCY | Add 600ms artificial delay to clob_auth GET /time responses |  | Delay removed; next probe cycle shows improved p99; route unflagged after 3 healthy probes
WS_USER_HEARTBEAT_STALE | Stop ws_user heartbeat for 10s |  | Heartbeat resumes; lag drops; route unflagged
PROBE_RATE_LIMIT_EXHAUSTION | Reduce probe_interval_s to 1s and increase routes_to_probe to 10 entries |  | Config corrected; probes resume at safe interval

20. State & Persistence

Cold-start recovery

Window cleared on restart; first probe cycle rebuilds estimates from scratch.

21. Concurrency & Idempotency

Aspect | Specification
Execution model | scheduled coroutine per route
Max in-flight | 10
Idempotency key | route + probe_trigger_ts_ms
Per-call timeout (ms) | 1000
Backpressure strategy | Drop probe if previous probe for same route still in flight
Locking / mutual exclusion | per-route mutex for rolling window writes
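
The backpressure and timeout rows can be sketched with asyncio. This is an illustration under assumptions: `ProbeRunner` is a hypothetical class, the `probe` argument stands in for whatever coroutine performs the real request, and the rolling-window mutex is omitted because the window is not part of this fragment.

```python
import asyncio


class ProbeRunner:
    """Runs at most one probe per route at a time and drops overlapping triggers."""

    def __init__(self, probe, timeout_s: float = 1.0) -> None:
        self._probe = probe          # hypothetical async callable: route -> rtt in ms
        self._timeout_s = timeout_s  # per-call timeout (1000 ms in the table)
        self._in_flight = set()
        self.dropped = 0

    async def trigger(self, route: str):
        # No await between the check and the add, so this is atomic in asyncio.
        if route in self._in_flight:
            self.dropped += 1        # backpressure: drop, do not queue
            return None
        self._in_flight.add(route)
        try:
            return await asyncio.wait_for(self._probe(route), self._timeout_s)
        except asyncio.TimeoutError:
            return 1000.0            # timeout counted as max latency
        finally:
            self._in_flight.discard(route)
```

Two triggers for the same route issued back to back yield one measurement and one dropped probe, while a trigger for a different route proceeds independently.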

22. Dependencies

Depends on (must run first)

Bot | Why | Contract
internal.scheduler | Provides probe triggers every probe_interval_s. | Probe fires within ±5s of scheduled interval.

Emits to (downstream consumers)

Bot | Why | Contract
exec.orderlifecyclemanager | Degraded route flags inform lifecycle manager to escalate stuck-order thresholds. | ObservationReport with route_degraded=true consumed by exec bots.

External services

Service | Endpoint | SLA assumed | On failure
CLOB V2 auth API | https://clob.polymarket.com | 99.95% / 200ms p99 | Probe timeout counted as 1000ms in rolling window.
WS user feed | wss://ws-subscriptions-clob.polymarket.com/ws/user | best-effort | If heartbeat absent > 5s, feed_lag recorded as 5000ms.
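
The WS user feed failure rule amounts to a capped lag computation. A minimal sketch, assuming timestamps in epoch milliseconds; treating a missing heartbeat as absent and clamping negative lag to zero are assumptions not stated in the table, and `feed_lag_ms` is a hypothetical helper.

```python
from typing import Optional

HEARTBEAT_ABSENT_MS = 5000  # lag recorded once no heartbeat has arrived for > 5 s


def feed_lag_ms(last_heartbeat_ts_ms: Optional[int], now_ms: int) -> int:
    """Lag between the last ws_user heartbeat and the local clock, capped at 5000 ms."""
    if last_heartbeat_ts_ms is None:
        return HEARTBEAT_ABSENT_MS  # assumption: no heartbeat yet counts as absent
    lag = now_ms - last_heartbeat_ts_ms
    if lag > HEARTBEAT_ABSENT_MS:
        return HEARTBEAT_ABSENT_MS
    return max(lag, 0)  # assumption: clamp negative lag caused by clock skew
```

A heartbeat stamped 120 ms before the local clock yields 120, and one that is 6 seconds old yields 5000.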

23. Security Surfaces

Abuse vectors considered

  • Flooding probe scheduler to exhaust rate-limit budget with unnecessary latency checks
  • Injecting fake degraded-route state to suppress order submission on healthy routes

Mitigations

  • Probe rate capped at 1/probe_interval_s per route; scheduler enforces minimum interval
  • Route degraded state writable only by LatencyProfiler process; read by other exec bots via internal read-only API

24. Polymarket V2 Compatibility

Aspect | Value
CLOB version | v2
Collateral asset | pUSD
EIP-712 Exchange domain version | 2
Aware of builderCode field | no
Aware of negative-risk markets | no
Multi-chain ready | no
SDK used | py-clob-client-v2
Settlement contract | CTFExchangeV2
Notes | LatencyProfiler probes CLOB V2 auth endpoint latency only; it does not sign or submit real orders. All measurements are in milliseconds from the local system clock.

API surfaces declared

clob_auth · ws_user · internal

Networks supported

polygon

25. Versioning & Migration

Field | Value
spec | 2.0.0
implementation | 0.1.0
schema | 2
released | None
planned_release | Q4-2026

Migration history

Date | From | To | Reason | Action taken
2026-04-28 | n/a | v2-spec | Spec drafted post-CLOB-V2 cutover; bot not yet implemented | Designed against V2 schema (pUSD, builder codes, V2 EIP-712 domain)

26. Acceptance Tests

Unit Tests

Test | Setup | Expected result
p95 computation from rolling window | Inject 100 samples with 95th sample = 180ms | p95_ms=180 > warn_p95_ms=150; LATENCY_WARN emitted
Route flagged degraded when p99 > fail_p99_ms | p99=600ms, fail_p99_ms=500 | route_degraded=true; LATENCY_HARD_BREACH emitted
No alert when both p95 and p99 within thresholds | p95=100ms, p99=200ms | No ObservationReport emitted

Integration Tests

Test | Expected result
Probe cycle: send probe → receive ack → compute latency → update rolling window | Rolling window updated; metrics emitted; alert fired only if threshold breached
ws_user lag detection via heartbeat comparison | feed_lag_ms computed; LATENCY_WARN if lag > warn_p95_ms

Property Tests

Property | Required behaviour
Rolling window always contains <= 100 samples per route | Always true: oldest sample evicted on overflow
p99 >= p95 >= p50 always holds | Always true
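
Both properties can be exercised with randomized inputs using only the standard library. The sketch assumes the `int(q * n)` percentile indexing from the reference pseudocode and a `deque` with `maxlen` as the window; `check_properties` is a hypothetical test helper, not an existing fixture.

```python
import random
from collections import deque

WINDOW_SIZE = 100


def percentiles(samples):
    """Return (p50, p95, p99) using the pseudocode's int(q * n) indexing."""
    ordered = sorted(samples)
    n = len(ordered)
    return tuple(ordered[int(q * n)] for q in (0.50, 0.95, 0.99))


def check_properties(trials: int = 200, seed: int = 0) -> None:
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    for _ in range(trials):
        window = deque(maxlen=WINDOW_SIZE)
        for _ in range(rng.randint(1, 300)):
            window.append(rng.uniform(1.0, 1000.0))
            # Property 1: the window never exceeds 100 samples.
            assert len(window) <= WINDOW_SIZE
            # Property 2: percentiles are ordered at every step.
            p50, p95, p99 = percentiles(window)
            assert p50 <= p95 <= p99
```

The ordering property holds by construction here, since all three values are read from the same sorted list at non-decreasing indices; the test mainly guards against a future change to the percentile method breaking it.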

27. Operational Runbook

LatencyProfiler incidents are always route degradations. Check CLOB status page and ws_user heartbeat freshness first.

On-call actions

Alert | First step / Diagnosis / Mitigation | Escalate to
LatencyProfilerRoutesDegraded | Check Polymarket status page; check CLOB auth endpoint health. If degraded, pause order submission until route recovers. | Infra on-call if CLOB unreachable > 2 min
LatencyProfilerHighP99 | Check p99 histogram by route; identify which route is degraded. Cross-reference with ExchangeStatusMonitor. | Exec pod lead if p99 > 750ms sustained

Manual overrides

  • polytraders bot unflag-route exec.latencyprofiler --route clob_auth: use when a route was incorrectly flagged degraded due to a probe anomaly; confirm the route is healthy first.

Healthcheck

GET /internal/health/latencyprofiler -> 200 if all probed routes are healthy (degraded_routes=0 and p99 < fail_p99_ms on every route). Red: degraded_routes > 0, probe_errors_total spiking, or scheduler not firing.

28. Promotion Gates

A bot does not advance to the next readiness state until every gate below is green. Gates are observable from production data — no subjective sign-off.

Promote to Shadow

Gate | How measured | Threshold
p95/p99 computation unit tests pass with known input windows | CI test run | 100% pass

Promote to Limited live

Gate | How measured | Threshold
No false-positive route-degraded flags over 48h shadow run | degraded_routes gauge cross-referenced with CLOB status page | Zero false positives

Promote to General live

Gate | How measured | Threshold
Latency breach detected within 2 probe cycles of actual CLOB degradation over 7-day limited-live | Correlation of LATENCY_HARD_BREACH events with CLOB incident log | Detection within 2 × probe_interval_s

29. Developer Checklist

Ready-to-ship score: 27/27 sections complete · 100%

Requirement | Status
Purpose defined | ✓ done
Required inputs listed | ✓ done
Parameters defined | ✓ done
Defaults defined | ✓ done
Warning thresholds defined | ✓ done
Hard thresholds defined | ✓ done
Safe fallback defined | ✓ done
Structured output defined | ✓ done
Developer log defined | ✓ done
Plain-English explanation | ✓ done
Unit tests defined | ✓ done
Integration tests defined | ✓ done
Property tests defined | ✓ done
Failure-mode block complete | ✓ done
Reference implementation pseudocode | ✓ done
Wire examples (input + output) | ✓ done
Reason codes listed | ✓ done
Metrics & logs defined | ✓ done
State & persistence defined | ✓ done
Concurrency & idempotency defined | ✓ done
Dependencies declared | ✓ done
Security surfaces declared | ✓ done
Polymarket V2 compatibility declared | ✓ done
Version & migration history declared | ✓ done
Operational runbook defined | ✓ done
Promotion gates defined | ✓ done
Failure-injection recipes defined | ✓ done