docs(runbooks): flatKV↔memIAVL parity via sharded historical replay #448
Agent-first runbook for driving a flatKV-vs-memIAVL storage-engine correctness validation on harbor at scale: the replay-pair topology (same binary + snapshot, blocks from a shared archive), the load-bearing correctness gates (compare pair-not-archive; verify migration complete; historical_replay build), the seictl shadow comparator, result aggregation + Notion report, and the 50+ shard fan-out.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PR Summary
Low Risk
Overview: The new doc encodes the differential replay method (flatKV + memIAVL …
Reviewed by Cursor Bugbot for commit b34fdbf.
… dependency, §7 invariant-first
- §3 + §1 trap3: state honestly that historical_replay is not on sei-chain main (lands via PR #3691); build from that branch ref, not main; correct the decoder-symbol framing (main has only DefaultTxDecoderWithoutBodyBloatRejection, evmrpc-trace only). Convergent DISSENT (systems-engineer + prose-steward).
- §7: lead with the canonical invariant (canonical* must hold the compared heights; resolve behind/ahead at submit time). prose-steward DISSENT.
- §0: add <you>, <archive>, <task-uuid> placeholders.
- cosmetic (kubernetes-specialist RATIFY nits): CEL vs planner for the snapshot requirement; finalizer-driven PVC delete; sei.io/node selector clarification; block-level layer2 indeterminate wording; §6 step-N cross-refs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ical reads, not flatKV)
Dissenter (sei-network-specialist) caught a correctness-grade mislead the other lenses missed, verified against sei-chain main 1c66d878: CacheMultiStoreWithVersion serves ALL historical IAVL reads from the State Store (pebbledb) when SS is enabled; the SC layer, where write_mode picks flatKV vs memIAVL, is bypassed. So L2 (historical eth_getStorageAt) compares SS↔SS on both nodes and is vacuous for flatKV.
- §1: L1 (execution results) is the flatKV verdict; L2 does NOT exercise flatKV on SS-enabled nodes (same-history sanity check only); flatKV read path needs SS-off + latest-height, committed root is the seidb-digest track.
- §2: scope the migration-complete gate to the latest-version SC path; it cannot make L2 meaningful while SS is on.
- §8/§9: L1 is the reported verdict; L2 reported as sanity check, not parity.
- §7/§11: layer indeterminate wording; boundary-block = parent-validators lookup; new failure-mode row for the L2-SS-vacuous door.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ss, L2-determinism rationale
Round 3 unanimous RATIFY (k8s, systems-engineer, prose-steward, sei-network dissenter). Style/advisory polish:
- §1: expand SC -> 'SC (State Commit)' at first use.
- §7: move the 'same-history sanity check, not a flatKV signal' gloss onto layer2 (was misattached to the indeterminate flag).
- §4: note L2-determinism still earns its keep as the SS history-agreement check.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds an agent-first runbook to .agent/runbooks/ for driving a flatKV-vs-memIAVL storage-engine correctness validation on harbor at scale (50+ shards), plus its README index row.

What it captures
The method a cold Claude session (or operator) needs to run the validation end-to-end, with the load-bearing correctness traps front-loaded:
- … sei_chain_seidb_migration_version == target): migrate_evm is a boundary-split router that serves un-migrated EVM reads from memIAVL, so a premature comparison is silently vacuous (memIAVL-vs-memIAVL). Free at EVM genesis; high-height shards need evm_migrated/flatkv_only/forced completion.
- historical_replay build tag for pre-v6.5 non-canonical tx bodies (else the strict decoder skips their execution).

Then: the replay-pair topology (same binary + snapshot, blocks from a shared full-history archive), standing up a pair, the seictl result-export shadow comparator (L1 + L2), result aggregation, the Notion report, the 50+ shard fan-out, and a failure-modes table.

Provenance
Distilled from an end-to-end run of this validation on harbor eng-fromtherain. Every technical claim was accuracy-reviewed against the controller + sei-db + seictl source and the live cluster (a systems-engineer on storage/comparator/metrics, a kubernetes-specialist on the CRD/GitOps surface), and legibility-reviewed for agent-first execution (a prose-steward) before this PR.

🤖 Generated with Claude Code
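
As an illustration of the migration-complete gate listed under "What it captures", here is a minimal sketch of a check on sei_chain_seidb_migration_version == target before a pair is compared. It assumes the metric is exposed in standard Prometheus text exposition format and that the text has already been fetched from each node; the function names, the label shown in the usage lines, and the target value are hypothetical and not taken from the runbook.

```python
import re

# Metric name as quoted in the PR description.
METRIC = "sei_chain_seidb_migration_version"

# Matches "name{labels} value" or "name value" in Prometheus text exposition.
_SAMPLE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)")


def migration_versions(metrics_text: str) -> list[float]:
    """Return every sample value of METRIC found in the scraped text."""
    values = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match and match.group(1) == METRIC:
            values.append(float(match.group(3)))
    return values


def migration_complete(metrics_text: str, target: int) -> bool:
    """True only if the metric is present and every sample equals target.

    A missing metric is treated as 'not complete', so a comparison is never
    started on a node whose migration state is unknown.
    """
    values = migration_versions(metrics_text)
    return bool(values) and all(v == target for v in values)
```

A hypothetical use, with a made-up label and target:

```python
scraped = 'sei_chain_seidb_migration_version{node="flatkv-0"} 100\n'
if not migration_complete(scraped, target=100):
    raise SystemExit("migration incomplete: comparison would be vacuous")
```

Treating an absent metric as a failure is a deliberate choice in this sketch: the PR text describes a premature comparison as silently vacuous, so the gate fails closed.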