feat(replay): historical_replay build tag for lenient tx decoder#3691
feat(replay): historical_replay build tag for lenient tx decoder#3691bdchatham wants to merge 4 commits into
Conversation
Add a build-tag-gated lenient tx-decoder path so a historical_replay-tagged build can replay pre-v6.5 blocks whose protobuf tx bodies are non-canonical. The strict rejectBodyBloat decoder otherwise rejects them (code 2 "tx parse error"), which skips execution and diverges replayed state from history. The default (untagged) build is unchanged: the strict decoder stays on all mempool/CheckTx/DeliverTx paths (a lenient validator would accept txs strict peers reject — a consensus hazard), so the lenient decoder is reachable ONLY via the build tag, never at runtime. - sei-cosmos/x/auth/tx/config.go: NewTxConfigWithoutBodyBloatRejection wiring the existing DefaultTxDecoderWithoutBodyBloatRejection. - app/params: //go:build split of MakeEncodingConfig + MakeLegacyEncodingConfig (proto.go gated !historical_replay; proto_historical_replay.go the lenient variant). This single seam covers both the BaseApp decoder and app.txDecoder. Refs PLT-782. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3691 +/- ##
==========================================
- Coverage 59.29% 58.34% -0.95%
==========================================
Files 2272 2185 -87
Lines 188210 178423 -9787
==========================================
- Hits 111594 104104 -7490
+ Misses 66565 65071 -1494
+ Partials 10051 9248 -803
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
PR SummaryMedium Risk Overview Default builds are unchanged: Lenient API is compile-time isolated: Reviewed by Cursor Bugbot for commit ce1c84f. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
A small, well-scoped change that adds a historical_replay build tag swapping only the block-execution encoding config to the already-existing lenient (no body-bloat rejection) tx decoder, so tagged builds can replay pre-v6.5 pacific-1 history. The design is sound — a build tag rather than a runtime flag correctly avoids shipping a live binary that could be misconfigured into consensus-unsafe leniency — and the untagged build is unchanged. No blockers; a few minor non-blocking notes.
Findings: 0 blocking | 5 non-blocking | 2 posted inline
Blockers
- None at the file/PR level.
Non-blocking
- The second-opinion passes were largely empty:
REVIEW_GUIDELINES.mdis empty (no repo-specific standards applied), andcursor-review.mdis empty (Cursor produced no output). Only Codex produced a finding. - No direct automated coverage for the new tagged seam.
TestDefaultTxDecoderWithoutBodyBloatRejectioncovers the underlying decoder, but neitherNewTxConfigWithoutBodyBloatRejectionnor theMakeEncodingConfig/MakeLegacyEncodingConfigbuild-tag split is exercised by a test that asserts a bloated-body tx decodes under the tag and is rejected without it. Build-tagged code is awkward to test, but a//go:build historical_replaytest asserting the tagged EncodingConfig accepts a non-canonical body would lock in the intended behavior and guard against future regressions. - Consider documenting the
historical_replaytag in the replay/tooling docs (or a Makefile target) so operators know it must never be used for validator/mempool-serving nodes, matching the in-code warnings. - 2 suggestion(s)/nit(s) flagged inline on specific lines.
…ase refs) Describe what the historical_replay lenient decoder is FOR — replaying historical blocks whose tx bodies predate strict body-bloat rejection — rather than naming a specific release/chain. No behavior change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
A well-scoped, well-documented change that adds a historical_replay build tag swapping only the block-execution encoding config to the existing lenient tx decoder, leaving the default/untagged build byte-for-byte unchanged. No blocking issues; a couple of minor defensive notes.
Findings: 0 blocking | 5 non-blocking | 1 posted inline
Blockers
- None at the file/PR level.
Non-blocking
- Cursor's second-opinion review (cursor-review.md) was empty — no findings from that pass. Codex flagged the exported constructor (addressed inline below).
- No automated test accompanies the change. It is config wiring behind a build tag and hard to unit-test, but a small test that asserts a
-tags historical_replaybuild'sMakeEncodingConfig().TxConfigdecodes a known non-canonical/body-bloated tx body that the default build rejects would lock in the intended behavior and guard against future regressions in the decoder wiring. - Edge case: with both
test_aminoandhistorical_replayset,amino.go(gated only ontest_amino) wins and the lenient decoder is silently not used. This combination is nonsensical for replay, but worth being aware of; no change strictly required. - The doc comments on the new
MakeEncodingConfig/MakeLegacyEncodingConfig("...for an amino based test configuration") are inaccurate copy-paste boilerplate inherited from proto.go — these are neither amino-based nor test-only. Pre-existing nit; could be cleaned up while here. - 1 suggestion(s)/nit(s) flagged inline on specific lines.
…doc comments seidroid[bot] review on #3691: - [suggestion] NewTxConfigWithoutBodyBloatRejection was compiled into untagged builds, weakening the 'lenient decoder reachable only from historical_replay artifacts' invariant. Move it into a //go:build historical_replay file (config_historical_replay.go) so an untagged/production binary never compiles it — a future caller cannot accidentally wire it into a live path. Default and tagged builds both verified. - [nit] Replace the stale copy-pasted 'amino based test configuration' doc comments on MakeEncodingConfig / MakeLegacyEncodingConfig with the lenient/replay purpose. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
A well-scoped, consensus-safe change that adds a historical_replay build tag to swap the block-execution tx decoder to the existing lenient (no body-bloat rejection) variant for replaying pre-v6.5 history; the code is correct and the untagged/production build is unchanged. Only minor non-blocking notes.
Findings: 0 blocking | 3 non-blocking | 0 posted inline
Blockers
- None at the file/PR level.
Non-blocking
- Cursor's second-opinion review file was empty (no output produced); Codex's review reported "no material findings" but noted it could not run
go test(sandbox had Go 1.24.13 vs required 1.25.6, no network). Consider confirming both tagged and untagged builds compile in CI. - No automated test accompanies the change. A build-tag-gated smoke test (e.g. a
historical_replay-tagged test assertingMakeEncodingConfig().TxConfig.TxDecoder()decodes a non-canonical/body-bloated tx body that the default decoder rejects) would guard against future regressions in the decoder wiring, though this is inherently hard to unit-test. - The new
proto_historical_replay.godoc comments are more accurate than the originalproto.gocomments ("for an amino based test configuration"), which are now stale/misleading for both variants — a minor cleanup opportunity in the base file.
Problem
Replaying pre-v6.5 pacific-1 history on a current build records historical transactions with non-canonical protobuf bodies as failed (
code 2,tx decode error: tx parse error) instead of executing them. Empirically: at height79,205,095tx 6, a full-historyv6.5.1archive executes the tx (code 0), while a replayer on a newer build rejects it. The tx isn't dropped (block tx-counts match), but it gets a failure result with no state mutation, so replayed state diverges from history and compounds downstream.Root cause: the strict decoder is hardcoded —
NewTxConfigWithHandler→DefaultTxDecoder(protoCodec)→defaultTxDecoder(cdc, /*rejectBodyBloat=*/true).rejectBloatedBody()re-marshals theTxBodyand rejects on any byte-length delta (a post-v6.5 ADR-027 strictness). Pre-v6.5 blocks were committed by a more lenient binary, so their bodies carry non-canonical protobuf padding the newer decoder rejects. There is no runtime/config/build-tag override today; everyapp.Newfunnels throughMakeEncodingConfig(), which feeds both the BaseApp decoder andapp.txDecoder. The lenientDefaultTxDecoderWithoutBodyBloatRejectionalready exists but is wired only intoevmrpctrace.Change
A
historical_replaybuild tag that swaps only the block-execution encoding config to the lenient decoder:sei-cosmos/x/auth/tx/config.go— newNewTxConfigWithoutBodyBloatRejection(...)wiring the existingDefaultTxDecoderWithoutBodyBloatRejection(no bool added to the strict constructor — strict stays the only accidentally-reachable default;config.decoderis unexported so a named constructor is required).app/params/proto.go(gated!test_amino && !historical_replay) + newapp/params/proto_historical_replay.go(gated!test_amino && historical_replay) — a//go:buildsplit ofMakeEncodingConfigandMakeLegacyEncodingConfig. This single seam covers both the BaseApp decoder andapp.txDecoder.Consensus safety
Build tag, not a runtime flag — deliberately. A runtime flag would leave one binary that could be misconfigured to accept non-canonical bodies on live CheckTx/DeliverTx (a lenient validator accepts txs its strict peers reject → fork risk). The lenient decoder is reachable only in a
historical_replay-tagged artifact, intended purely for historical replay / analysis — never a validator or mempool-serving node. The default/untagged build is byte-for-byte unchanged.Verification
-tags historical_replaybuilds both compile (app/params,sei-cosmos/x/auth/tx,app); combination withmock_chain_validationalso builds.🤖 Generated with Claude Code