Skip to content

feat(autobahn): Add epoch.Registry to maintain committee/stake (CON-358)#3632

Open
wen-coding wants to merge 71 commits into
mainfrom
wen/autobahn_epoch_registry
Open

feat(autobahn): Add epoch.Registry to maintain committee/stake (CON-358)#3632
wen-coding wants to merge 71 commits into
mainfrom
wen/autobahn_epoch_registry

Conversation

@wen-coding

@wen-coding wen-coding commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Introduces epoch.Registry which consolidates firstBlock and genesisTimestamp — genesis config that doesn't change across epochs — alongside the committee
  • Bakes firstBlock into Proposal so GlobalRange() returns absolute block numbers directly, removing it as a call-site parameter throughout the stack. The proto message is now self-contained on the wire, making it much easier to interpret during emergency debugging. Also fixes a reproposal bug where TimeoutQC.reproposal() was double-counting firstBlock
  • All consumers (consensus, avail, data) now route committee lookups through registry.CommitteeFor(roadIndex). The registry currently returns the genesis committee for all RoadIndexes — we wire the read side first so that any bug introduced when implementing dynamic epochs in follow-up PRs is immediately visible at the call sites rather than requiring a cascading refactor to find it

🤖 Generated with Claude Code

@cursor

cursor Bot commented Jun 24, 2026

Copy link
Copy Markdown

PR Summary

High Risk
Changes consensus wire format and verification (required proto fields, epoch-scoped QCs), which is security-critical; incompatible with pre-epoch messages and touches avail persistence and global block numbering semantics.

Overview
Autobahn consensus messages now carry epoch binding end-to-end: Proposal, AppProposal, and TimeoutVote include epoch_index (and proposals include first_block), with proto decode hard-rejecting missing fields. Committee no longer holds genesis firstBlock / timestamp; that lives on a new local Epoch (road range, committee, first block, first timestamp). GlobalRange() and related helpers no longer take a committee argument—global block numbers are computed inside the proposal from embedded first_block.

Prepare/Commit/FullCommit/Timeout QCs verify against *Epoch (proposal epoch match, road index within epoch roads, then quorum checks on the epoch committee). ViewSpec gains an Epoch; NewProposal reads the leader from viewSpec.Epoch instead of a separate committee arg, and continuity checks use NextGlobalBlock() / epoch timestamps rather than Committee.FirstBlock().

Avail and data are wired to epoch.Registry: newInner/NewState use LatestEpoch(), commit/app QCs resolve epoch by index from the registry, lane vote QC formation uses inner.ep, and WaitForLaneQCs takes an explicit epoch. Block DB / blocksim tests and helpers follow the parameterless GlobalRange() and BuildCommitQC / simplified NewRoundRobinElection.

Minor: .gitignore narrows ignored litt bench binaries to *.test only.

Reviewed by Cursor Bugbot for commit 40877c0. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJul 2, 2026, 1:01 AM

@github-actions

Copy link
Copy Markdown

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedJun 24, 2026, 2:30 AM

Comment thread sei-tendermint/internal/autobahn/avail/inner.go
@codecov

codecov Bot commented Jun 24, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 74.07407% with 119 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.71%. Comparing base (1c66d87) to head (40877c0).

Files with missing lines Patch % Lines
sei-tendermint/internal/autobahn/avail/state.go 49.05% 13 Missing and 14 partials ⚠️
sei-tendermint/internal/autobahn/data/state.go 60.65% 8 Missing and 16 partials ⚠️
sei-tendermint/autobahn/types/proposal.go 78.12% 6 Missing and 8 partials ⚠️
...dermint/internal/autobahn/pb/autobahn.wireguard.go 36.84% 12 Missing ⚠️
sei-tendermint/internal/autobahn/epoch/registry.go 81.81% 9 Missing and 1 partial ⚠️
sei-tendermint/autobahn/types/timeout.go 57.14% 4 Missing and 5 partials ⚠️
sei-tendermint/autobahn/types/commit_qc.go 53.33% 2 Missing and 5 partials ⚠️
...ei-tendermint/internal/autobahn/consensus/inner.go 61.53% 0 Missing and 5 partials ⚠️
sei-tendermint/autobahn/types/app_proposal.go 70.00% 1 Missing and 2 partials ⚠️
...ei-tendermint/internal/autobahn/consensus/state.go 78.57% 0 Missing and 3 partials ⚠️
... and 3 more
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #3632       +/-   ##
===========================================
+ Coverage   59.29%   74.71%   +15.42%     
===========================================
  Files        2272       87     -2185     
  Lines      188210     7646   -180564     
===========================================
- Hits       111594     5713   -105881     
+ Misses      66565     1475    -65090     
+ Partials    10051      458     -9593     
Flag Coverage Δ
sei-chain ?
sei-chain-pr 74.99% <74.07%> (+18.35%) ⬆️
sei-db 70.41% <ø> (ø)
sei-db-state-db ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-db/ledger_db/block/blocksim/block_generator.go 80.64% <100.00%> (ø)
sei-tendermint/autobahn/types/committee.go 98.48% <100.00%> (-0.09%) ⬇️
sei-tendermint/autobahn/types/epoch.go 100.00% <100.00%> (ø)
sei-tendermint/autobahn/types/opt.go 100.00% <ø> (ø)
sei-tendermint/autobahn/types/prepare_qc.go 93.75% <100.00%> (+1.75%) ⬆️
sei-tendermint/autobahn/types/testonly.go 95.32% <100.00%> (+0.87%) ⬆️
...-tendermint/internal/autobahn/avail/block_votes.go 90.90% <100.00%> (+0.43%) ⬆️
...ternal/autobahn/consensus/persist/fullcommitqcs.go 78.75% <100.00%> (ø)
...nternal/autobahn/consensus/persist/globalblocks.go 77.77% <100.00%> (ø)
...int/internal/autobahn/consensus/persisted_inner.go 97.70% <100.00%> (+0.02%) ⬆️
... and 17 more

... and 2190 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread sei-tendermint/internal/autobahn/data/state.go
Comment thread sei-tendermint/internal/autobahn/avail/state.go Outdated
Comment thread sei-tendermint/internal/autobahn/avail/state.go Outdated
@wen-coding wen-coding changed the title feat(autobahn): introduce epoch.Registry as single source of truth for committee/stake feat(autobahn): Add epoch.Registry to maintain committee/stake (CON-358) Jun 24, 2026
Comment thread sei-tendermint/internal/autobahn/avail/state_test.go Outdated
Comment thread sei-tendermint/internal/autobahn/epoch/registry.go Outdated
@wen-coding wen-coding force-pushed the wen/autobahn_epoch_registry branch from d764443 to 705d8c9 Compare June 24, 2026 03:59
Comment thread sei-tendermint/internal/autobahn/avail/inner.go Outdated
@wen-coding wen-coding force-pushed the wen/autobahn_epoch_registry branch 2 times, most recently from ab4c67c to 9cce5d2 Compare June 24, 2026 17:32
Comment thread sei-tendermint/autobahn/types/proposal.go Outdated
@wen-coding wen-coding force-pushed the wen/autobahn_epoch_registry branch 3 times, most recently from f1f1d55 to 09e677e Compare June 25, 2026 00:46
@wen-coding wen-coding requested a review from pompon0 June 25, 2026 02:36
TimeoutQC utils.Option[*TimeoutQC]
CommitQC utils.Option[*CommitQC]
TimeoutQC utils.Option[*TimeoutQC]
FirstBlock GlobalBlockNumber // genesis InitialHeight; added to lane-relative block numbers to produce absolute global block numbers

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that ViewSpec should have the whole EpochConfig instead.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently treatment of firstblock and genesisTimestamp is inconsistent

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put Epoch into ViewSpec.

Comment thread sei-tendermint/internal/autobahn/autobahn.proto Outdated

// Verify verifies the FullProposal against the current view.
func (m *FullProposal) Verify(c *Committee, vs ViewSpec) error {
func (m *FullProposal) Verify(c *Committee, vs ViewSpec, genesisTimestamp time.Time) error {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

genesisTimestamp seems kind of random in this context and that's why IMO ViewSpec should be self-contained.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Timestamp: TimeConv.Encode(m.timestamp),
LaneRanges: LaneRangeConv.EncodeSlice(laneRanges),
App: AppProposalConv.EncodeOpt(m.app),
GlobalFirst: &globalFirst,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: utils.Alloc(uint64(m.firstBlock))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why inconsistent naming (globalFirst vs firstBlock)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

app,
)
if m.GlobalFirst == nil {
return nil, fmt.Errorf("global_first is required")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "global_first: missing" for consistency

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// GenProposal generates a random Proposal.
func GenProposal(rng utils.Rng) *Proposal {
return newProposal(GenView(rng), time.Now(), utils.GenSlice(rng, GenLaneRange), utils.Some(GenAppProposal(rng)))
return newProposal(GenView(rng), time.Now(), utils.GenSlice(rng, GenLaneRange), utils.Some(GenAppProposal(rng)), 0)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

randomize the generated data

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Number: GenViewNumber(rng) % view.Number,
}
p := newProposal(pView, time.Now(), utils.GenSlice(rng, GenLaneRange), utils.Some(GenAppProposal(rng)))
p := newProposal(pView, time.Now(), utils.GenSlice(rng, GenLaneRange), utils.Some(GenAppProposal(rng)), 0)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

//
// Currently the committee is fixed at genesis. Dynamic committee support
// will be wired up when the execution layer is ready.
type Registry struct {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to have a type representing a complete context for an epoch (perhaps just named "Epoch"):
committee,
firstBlock, (or globalRange?)
firstTimestamp,
epochID (?)
...

I currently don't understand what "Registry" represents, but that's perhaps because it is not implemented yet.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

types.Epoch added.

Registry is basically a list of Epochs protected by rwlock, so we can keep generating new Epochs and throwing into it, it will return Epoch from EpochId for readers and purge old Epochs.

// EpochWindow returns the epoch→committee map for message acceptance across
// epoch transitions. With a fixed genesis committee it always returns a
// single entry.
func (r *Registry) EpochWindow() map[Index]*types.Committee {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "window"? I don't see it in the doc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly for the Epoch transition, we need to accept lane blocks for the current Epoch and possibly the Epoch before (repair blocks) or after it (others have got the commitQC to transition into new Epoch before me).

// EpochFor returns the epoch index for the given RoadIndex.
// Currently always returns 0 (genesis epoch); dynamic lookup will be added
// with the execution layer.
func (r *Registry) EpochFor(_ types.RoadIndex) Index {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If feasible, I'd like the all the messages to include the EpochID, so that we don't have to map RoadIndex to EpochID. Perhaps we could even make the RoadIndex start from 0 at every epoch start.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added EpochID whenever we don't have a proposal attached.

wen-coding and others added 24 commits July 1, 2026 19:47
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire epoch_index into AppProposal, LaneQC, and TimeoutVote so each
message self-declares its epoch. Cross-validate in FullProposal.Verify
(LaneQC epoch vs proposal epoch) and TimeoutQC/FullTimeoutVote.Verify
(vote epoch vs expected epoch).

Replace EpochFor(roadIndex) in PushAppVote with EpochByIndex from the
proposal's self-declared epoch_index. Thread *Epoch (instead of
*Committee) through block_votes.pushVote and inner.laneQC so the epoch
index is available when constructing LaneQCs.

Fix nil-deref in data.State.PushBlock: drop pruned-block entries
silently rather than dereferencing a nil QC map entry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove Registry.EpochFor (road-index lookup) now that all messages
carry epoch_index and callers use EpochByIndex directly. Test helpers
that built ViewSpec now set Epoch from registry.LatestEpoch() and read
vs.Epoch.Committee() directly.

Fix nil map lookup in data.State.PushBlock: drop blocks whose QC entry
was already pruned rather than dereferencing a nil pointer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oposal.Verify

The field is not part of the signed LaneVote payload; it is cross-checked
against the signed proposal's epoch_index in FullProposal.Verify.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ushAppVote

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verify AppProposal.epoch_index matches the enclosing proposal's
epoch_index in FullProposal.Verify. Reject AppProposal and LaneQC
on decode when epoch_index is missing rather than silently defaulting
to zero.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…paths

- AppProposal.Verify: check epoch_index matches CommitQC's proposal
- PushAppQC: check AppProposal epoch_index matches CommitQC epoch_index
- FullProposal.Verify: check proposal road_index is within Epoch.Roads()
- Add OpenRoadRange() helper; use it in registry, testonly, and tests
  instead of RoadRange{} (zero Last) or bare math.MaxUint64

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… in sync

The advisory note on LaneQC.epoch_index belongs in the Go struct comment
in lane_qc.go, not the proto file which would require regenerating pb.go.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dy checked pre-lock

VerifyInWindow now returns all epochs whose committee accepted the block.
PushBlock skips the in-lock re-verify when blockEp is already in that set,
avoiding duplicate crypto work in the common single-epoch case.
Regenerate autobahn.pb.go to match current proto.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed set

Removes in-lock block.Verify entirely. If blockEp is not in the set returned
by VerifyInWindow, the block was signed by the wrong epoch and is rejected.
No crypto inside the lock.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…FullCommitQC

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…orter; rename insertQC param ep -> registry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d cutover in Proposal decoder

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Restores range-bounds and max-length validation as defense-in-depth.
Lane membership is omitted (needs committee context, enforced by FullProposal.Verify).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…f hardcoding 0

TestCommitQC now takes *types.Epoch instead of committee+firstBlock+genesisTimestamp,
ensuring laneQCs and appQC use the same epochIndex as the proposal being built.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
qc.Verify(c) already binds the committee, so epochIndex in LaneQC
is redundant. Drop it from the struct, proto, encode/decode, and all
callers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Follow-up to the LaneQC epochIndex removal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When AddEpoch is wired, recovery must resolve the epoch from the
persisted QC/proposal rather than assuming LatestEpoch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cover wrong epochIndex, wrong firstBlock, and out-of-roads road index.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@wen-coding wen-coding force-pushed the wen/autobahn_epoch_registry branch from 69bfc5f to 86c7599 Compare July 2, 2026 00:05

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A large, mostly-mechanical refactor introducing epoch.Registry and baking first_block/epoch_index into proposals, with solid added epoch-binding verification and comprehensive test updates. No blocking correctness issues in the currently single-epoch code paths; the notable findings are forward-looking hazards for when dynamic epochs (AddEpoch) are wired, which are already documented with TODOs.

Findings: 0 blocking | 7 non-blocking | 2 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • REVIEW_GUIDELINES.md is empty (no repo-specific standards applied) and cursor-review.md produced no output — only the Codex second-opinion pass contributed findings, both of which are forward-looking.
  • Codex P1 (registry.go AddEpoch): closing an epoch replaces the slice entry, but consensus/avail cache *types.Epoch pointers (inner.ep, myView.Epoch) taken from LatestEpoch(). Those cached epochs retain Roads().Last=MaxUint64, so once AddEpoch is wired a QC for a post-boundary road could still be accepted under the old committee until every cached pointer is rotated. Not exploitable in this PR (AddEpoch has no production callers; a single open epoch exists) and TODOs in consensus/inner.go and avail/state.go already flag the rotation gap — worth resolving in the same PR that wires AddEpoch.
  • Codex P1 (registry.go VerifyInWindow / data.PushBlock): VerifyInWindow only checks the latest epoch, while data.PushBlock rejects a block whose QC resolves to an epoch not in that returned set ("signed by epoch X, not in window"). After a future epoch transition, syncing blocks finalized under the previous epoch would fail. Again inert today (one epoch) and covered by the VerifyInWindow TODO to expand to neighbor epochs — track alongside dynamic-epoch enablement.
  • Proto migration hard-rejects messages missing first_block/epoch_index with no backward-compat path; the code comments justify this as acceptable because Autobahn is pre-production. Confirm no persisted state or in-flight peers predate these fields before deploying, since this is an app-hash/wire-incompatible change despite the non-app-hash-breaking label.
  • avail.PushAppVote resolves the committee for signature verification from the vote's own (attacker-controlled) proposal.EpochIndex() before cross-checking it against the finalizing CommitQC's epoch; harmless with one epoch and the later equality check makes it consistent, but once multiple epochs exist consider ordering the epoch/CommitQC agreement check before trusting the declared epoch's committee.
  • 2 suggestion(s)/nit(s) flagged inline on specific lines.

// retroactively reassigning already-finalized roads to the new committee.
// Requires the caller to pass the commit watermark once the execution bridge is wired.
// Replace latest with a closed version (Roads().Last = startRoad-1).
r.epochs[len(r.epochs)-1] = types.NewEpoch(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[suggestion] AddEpoch closes the previous epoch by allocating a new *types.Epoch and replacing the slice entry, but consumers cache the *types.Epoch pointer they obtained earlier from LatestEpoch() (e.g. consensus inner.ep, myView.Epoch, avail inner.ep). Those cached pointers keep Roads().Last = MaxUint64, so a QC for a road past the boundary could still verify under the old committee until each cached pointer is rotated. Inert today because AddEpoch has no production callers, but when it is wired ensure all cached epochs are refreshed (the TODOs in consensus/inner.go and avail/state.go PushCommitQC track this).

// Returns a slice of all matching epochs so callers can skip re-verification for any
// epoch already checked here.
// TODO: expand to neighbor epochs (previous and next) once multi-epoch transitions are wired up.
func (r *Registry) VerifyInWindow(fn func(*types.Committee) error) ([]*types.Epoch, error) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[suggestion] VerifyInWindow only verifies against the latest epoch and returns just that epoch. data.PushBlock then rejects any block whose QC resolves to an epoch not in this returned set ("signed by epoch X, not in window"). After a future epoch transition, still-needed blocks finalized under the previous epoch would fail to sync. Not reachable while only one epoch exists; resolve together with the AddEpoch wiring per the existing TODO to include neighbor epochs.

EpochByIndex(Index(p.EpochIndex())) is the direct call; the wrapper
adds no logic beyond an inline-able error message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 40877c0. Configure here.

return fmt.Errorf("laneRange[%v]: non-zero hash for empty range", r.Lane())
}
if got := r.Len(); got > MaxLaneRangeInProposal {
return fmt.Errorf("laneRange[%v].Len() = %d, want <= %d", r.Lane(), got, MaxLaneRangeInProposal)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid lanes crash full QC build

High Severity

Proposal.Verify no longer ensures each lane range belongs to the committee; that check moved to FullProposal.Verify only. CommitQC.Verify still calls Proposal.Verify, so a commit QC whose proposal includes a non-committee lane can pass verification and be stored. When availability later assembles a FullCommitQC, header collection walks only committee lanes while GlobalRange() counts every lane in the proposal, so NewFullCommitQC can panic on length mismatch.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 40877c0. Configure here.

vs := ViewSpec{
CommitQC: prev,
Epoch: NewEpoch(0, OpenRoadRange(), genesisTimestamp, committee, firstBlock),
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BuildCommitQC hardcodes epoch zero

Low Severity

BuildCommitQC always builds its ViewSpec with NewEpoch(0, …) regardless of the caller’s epoch. Callers such as data.TestCommitQC pass ep.FirstBlock() and attach app proposals using ep.EpochIndex(), so for any non-genesis epoch index the resulting commit QC proposal and app metadata disagree and epoch binding checks fail or build misleading test fixtures.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 40877c0. Configure here.

@seidroid seidroid Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A large but internally consistent refactor introducing epoch.Registry and baking firstBlock/epochIndex into consensus messages so GlobalRange() is committee-free; no live correctness bugs found. The Codex-flagged multi-epoch issues are real future concerns but are dead-code-guarded (AddEpoch is never called in production) and explicitly documented with TODOs.

Findings: 0 blocking | 5 non-blocking | 1 posted inline

Blockers

  • None at the file/PR level.

Non-blocking

  • Second-opinion coverage: cursor-review.md is empty (Cursor produced no output); Codex ran but could not execute tests (Go 1.25.6 unavailable in its sandbox). Review rests on the Claude pass plus Codex's static findings.
  • Latent multi-epoch limitation (from Codex, kept but downgraded to non-blocking): the signature-verification path VerifyInWindow (registry.go:117) and consensus's inner.ep pinned to registry.LatestEpoch() (inner.go:117, pushCommitQC inner.go:132/141) only ever consult the latest epoch. This is NOT a live bug — AddEpoch is never invoked in production (NewRegistry always builds a single genesis epoch) and the code carries explicit TODOs (registry.go:116, inner.go:114-116). Flagging so the follow-up PR that wires AddEpoch resolves the epoch from the QC/road index and rotates inner.ep on epoch transitions/restart before enabling dynamic epochs; otherwise valid older-epoch blocks/votes would be rejected and consensus could use the wrong committee.
  • AddEpoch is exported and reachable but unused, and its guard only rejects startRoad <= latest.Roads().First without rejecting startRoad <= the highest already-committed road (its own TODO at registry.go:90-92). Consider keeping it unexported or panicking until the execution bridge is wired, to prevent a future caller from enabling it before the safety checks exist.
  • Minor redundancy: FullProposal.Verify runs proposal.Verify(vs.Epoch) (which now does structural lane-range bound/len checks) and then separately re-loops proposal.laneRanges to call r.Verify(c) plus an identical Len(...) > MaxLaneRangeInProposal check (proposal.go). The max-length check in particular is duplicated; could be consolidated.
  • 1 suggestion(s)/nit(s) flagged inline on specific lines.

// Returns a slice of all matching epochs so callers can skip re-verification for any
// epoch already checked here.
// TODO: expand to neighbor epochs (previous and next) once multi-epoch transitions are wired up.
func (r *Registry) VerifyInWindow(fn func(*types.Committee) error) ([]*types.Epoch, error) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[suggestion] Non-blocking / forward-looking: VerifyInWindow only verifies against LatestEpoch(). Callers (data.State.PushBlock, avail.State.PushBlock/PushVote) verify the block/vote signature here before resolving the authoritative epoch from the QC. Today this is fine because AddEpoch is never called in production, so the registry always holds exactly one epoch — but once AddEpoch is wired, a valid block/vote signed by an older-but-unpruned epoch's committee would be rejected (data.State.PushBlock even fails the epochInSet(blockEp, preEps) check with "not in window"). The existing TODO here acknowledges this; please ensure the neighbor-epoch expansion lands together with AddEpoch.

@wen-coding wen-coding requested a review from pompon0 July 2, 2026 01:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants