docs(runbooks): flatKV↔memIAVL parity via sharded historical replay by bdchatham · Pull Request #448 · sei-protocol/sei-k8s-controller

bdchatham · 2026-07-01T20:56:53Z

Adds an agent-first runbook to .agent/runbooks/ for driving a flatKV-vs-memIAVL storage-engine correctness validation on harbor at scale (50+ shards), plus its README index row.

What it captures

The method a cold Claude session (or operator) needs to run the validation end-to-end, with the load-bearing correctness traps front-loaded:

Compare the two replay nodes to each other, never to the archive — comparing a re-executing shadow to the archive's stored pre-v6.5 results conflates the storage engine with version drift and manufactures false divergences. The archive is the block source only.
Verify the flatKV node's migration is complete before trusting any result (sei_chain_seidb_migration_version == target) — migrate_evm is a boundary-split router that serves un-migrated EVM reads from memIAVL, so a premature comparison is silently vacuous (memIAVL-vs-memIAVL). Free at EVM genesis; high-height shards need evm_migrated/flatkv_only/forced completion.
historical_replay build tag for pre-v6.5 non-canonical tx bodies (else the strict decoder skips their execution).
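
The migration-complete gate in the second trap can be sketched as a small check over a Prometheus text scrape. This is a minimal illustration, not the runbook's actual tooling: only the metric name `sei_chain_seidb_migration_version` comes from the description above, while the helper names, the way the scrape is obtained, and the target value are assumptions.

```python
import re

# Metric name taken from the PR description; everything else here is hypothetical.
METRIC = "sei_chain_seidb_migration_version"

# Matches `name value` or `name{labels} value`, but not longer names sharing the prefix.
_SAMPLE = re.compile(r"^" + re.escape(METRIC) + r"(\{[^}]*\})?\s+(\S+)")


def migration_versions(metrics_text):
    """Return every sample value of the migration-version metric in a text scrape."""
    values = []
    for raw in metrics_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match:
            values.append(float(match.group(2)))
    return values


def migration_complete(metrics_text, target):
    """True only if the metric is present and every sample equals the target."""
    values = migration_versions(metrics_text)
    return bool(values) and all(v == target for v in values)
```

Treating a missing metric as "not complete" is a deliberate choice in this sketch: a scrape that does not expose the metric gives no evidence that flatKV is serving the reads, so the comparison should not be trusted.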

Then: the replay-pair topology (same binary + snapshot, blocks from a shared full-history archive), standing up a pair, the seictl result-export shadow comparator (L1 + L2), result aggregation, the Notion report, the 50+ shard fan-out, and a failure-modes table.
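
The pair-not-archive comparison and the per-shard aggregation can be sketched as follows. This is an illustration under stated assumptions: it does not reproduce the seictl comparator or its output format, and the per-height result strings, the `Divergence` type, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Divergence:
    height: int
    flatkv: Optional[str]   # None means the flatKV node has no result at this height
    memiavl: Optional[str]  # None means the memIAVL node has no result at this height


def compare_pair(flatkv: Dict[int, str], memiavl: Dict[int, str]) -> List[Divergence]:
    """Compare the two replay nodes to each other, height by height.

    The archive is never consulted here; it is only the block source.
    """
    divergences = []
    for height in sorted(set(flatkv) | set(memiavl)):
        a, b = flatkv.get(height), memiavl.get(height)
        if a != b:
            divergences.append(Divergence(height, a, b))
    return divergences


def aggregate(shards: Dict[str, List[Divergence]]) -> dict:
    """Roll per-shard divergence lists up into one summary."""
    return {
        "shards": len(shards),
        "divergent_shards": sorted(name for name, d in shards.items() if d),
        "divergences": sum(len(d) for d in shards.values()),
    }
```

A height present on only one node is reported as a divergence with `None` on the missing side; whether that should instead be classed as indeterminate is left open in this sketch.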

Provenance

Distilled from an end-to-end run of this validation on harbor eng-fromtherain. Every technical claim was accuracy-reviewed against the controller + sei-db + seictl source and the live cluster (a systems-engineer on storage/comparator/metrics, a kubernetes-specialist on the CRD/GitOps surface), and legibility-reviewed for agent-first execution (a prose-steward) before this PR.

🤖 Generated with Claude Code

Agent-first runbook for driving a flatKV-vs-memIAVL storage-engine correctness validation on harbor at scale: the replay-pair topology (same binary + snapshot, blocks from a shared archive), the load-bearing correctness gates (compare pair-not-archive; verify migration complete; historical_replay build), the seictl shadow comparator, result aggregation + Notion report, and the 50+ shard fan-out.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

cursor · 2026-07-01T20:56:59Z

PR Summary

Low Risk
Documentation-only change under .agent/runbooks/; no controller, CRD, or runtime behavior is modified.

Overview
Adds .agent/runbooks/validating-flatkv-memiavl-parity-via-sharded-replay.md, an agent-first operator runbook for flatKV vs memIAVL correctness validation on harbor, and a matching row in .agent/runbooks/README.md.

The new doc encodes the differential replay method (flatKV + memIAVL SeiNode pairs, shared archive, same image/snapshot) and front-loads traps that make naïve runs look green while measuring nothing: compare replay nodes to each other, not the archive; gate on sei_chain_seidb_migration_version before trusting flatKV reads; use mock_chain_validation + historical_replay builds (with the documented sei-chain PR dependency). It also covers SeiNode overrides, Flux/harbor-dev rollout, seictl result-export comparator params, S3 aggregation, Notion reporting, 50+ shard fan-out, and a failure-modes table—with L1 as the flatKV verdict and L2 explicitly as same-history sanity only.

Reviewed by Cursor Bugbot for commit b34fdbf.

… dependency, §7 invariant-first

- §3 + §1 trap3: state honestly that historical_replay is not on sei-chain main (lands via PR #3691); build from that branch ref, not main; correct the decoder-symbol framing (main has only DefaultTxDecoderWithoutBodyBloatRejection, evmrpc-trace only). Convergent DISSENT (systems-engineer + prose-steward).
- §7: lead with the canonical invariant (canonical* must hold the compared heights; resolve behind/ahead at submit time). prose-steward DISSENT.
- §0: add <you>, <archive>, <task-uuid> placeholders.
- cosmetic (kubernetes-specialist RATIFY nits): CEL vs planner for the snapshot requirement; finalizer-driven PVC delete; sei.io/node selector clarification; block-level layer2 indeterminate wording; §6 step-N cross-refs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ical reads, not flatKV)

Dissenter (sei-network-specialist) caught a correctness-grade mislead the other lenses missed, verified against sei-chain main 1c66d878: CacheMultiStoreWithVersion serves ALL historical IAVL reads from the State Store (pebbledb) when SS is enabled, so the SC layer, where write_mode picks flatKV vs memIAVL, is bypassed. L2 (historical eth_getStorageAt) therefore compares SS↔SS on both nodes and is vacuous for flatKV.

- §1: L1 (execution results) is the flatKV verdict; L2 does NOT exercise flatKV on SS-enabled nodes (same-history sanity check only); flatKV read path needs SS-off + latest-height, committed root is the seidb-digest track.
- §2: scope the migration-complete gate to the latest-version SC path; it cannot make L2 meaningful while SS is on.
- §8/§9: L1 is the reported verdict; L2 reported as sanity check, not parity.
- §7/§11: layer indeterminate wording; boundary-block = parent-validators lookup; new failure-mode row for the L2-SS-vacuous door.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
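
The reporting rule this commit describes (L1 is the flatKV verdict, L2 is a same-history sanity check on SS-enabled nodes) can be sketched as below. The function name, the report keys, and the divergence counts as inputs are all hypothetical; the runbook's real report format is not shown in this PR page.

```python
def parity_report(l1_divergences: int, l2_divergences: int, ss_enabled: bool) -> dict:
    """Build a report in which only L1 decides the flatKV verdict."""
    return {
        # L1 (execution results) is the only input to the verdict.
        "flatkv_verdict": "pass" if l1_divergences == 0 else "fail",
        # L2 is reported separately, as history agreement rather than parity.
        "l2_history_agreement": "agree" if l2_divergences == 0 else "disagree",
        # With SS enabled, historical reads bypass the SC layer on both nodes,
        # so L2 does not exercise flatKV. The SS-off case is left unknown here.
        "l2_exercises_flatkv": False if ss_enabled else None,
    }
```

Keeping L2 out of the verdict field means an L2 disagreement still surfaces in the report without being misread as a flatKV failure.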

…ss, L2-determinism rationale

Round 3 unanimous RATIFY (k8s, systems-engineer, prose-steward, sei-network dissenter). Style/advisory polish:

- §1: expand SC -> 'SC (State Commit)' at first use.
- §7: move the 'same-history sanity check, not a flatKV signal' gloss onto layer2 (was misattached to the indeterminate flag).
- §4: note L2-determinism still earns its keep as the SS history-agreement check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

bdchatham and others added 3 commits July 1, 2026 14:16

bdchatham merged commit 56eea3b into main Jul 1, 2026
5 checks passed

bdchatham deleted the feat/runbook-flatkv-parity-replay branch July 1, 2026 21:45

bdchatham merged 4 commits into main from feat/runbook-flatkv-parity-replay
