feat(sequencer): record L1 inclusion window and gas-price ladder in failed-tx store #24427

Open
aminsammara wants to merge 6 commits into merge-train/spartan from as/failed-tx-store-gas-pricing

aminsammara commented Jul 1, 2026

Contributor

Why

When a sequencer's propose transaction fails to land, the most common cause is underpricing during an L1 fee spike — but until now the failed-tx store recorded almost nothing to confirm or quantify that. Operators couldn't answer "was I underpriced, and by how much?" from the stored records.

The subtlety is that underpriced txs don't revert — they time out. The publisher escalates the priority fee (a ladder of speed-ups at the same nonce) and eventually gives up. So diagnosing underpricing needs two things side by side: what you paid (the whole escalation ladder, since the intermediate txs are replaced and evicted, so they can't be recovered after the fact) and what it cost to get in (the fee distribution of the actual L1 blocks you were competing for).

This PR records both, so an operator (or a script over the store) can compare their bid against the real inclusion bar of the blocks in their slot's window.

What changed

  • Inclusion window, not "next block." Replaces the previous poll-for-next-block capture with captureWindowBlockFees, which reads (historically, once per slot) the L1 blocks whose timestamps fall in the target L2 slot's inclusion window and records per-block baseFeePerGas, p75PriorityFee, minIncludedPriorityFee, blob fees, and blockBlobsFull. Historical reads only — no waiting on the chain, so no RPC-amplification during a spike.
  • Gas-price ladder. L1TxUtils retains the escalating prices (initial send + each speed-up) on tx state and surfaces them on timeout via a new L1TxTimeoutError (subclass of TimeoutError, so existing instanceof checks are unaffected). The sequencer writes them to the record as sentGasPriceLadder + attempts. Retention is gated by the existing L1_TX_FAILED_STORE flag, so the shared tx-monitor hot path is unchanged when the store is disabled — no new operator config.
  • Per-attempt records + correct gating. Send-error/timeout records now get a per-attempt id (retries no longer overwrite each other), and all capture is gated on the resolved store so no fee-read RPC runs when the store is off (fixes a latent guard that checked an always-truthy promise).
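
As a rough illustration of the per-block statistics named in the first bullet, the sketch below filters blocks by timestamp and derives minIncludedPriorityFee and p75PriorityFee from each block's transactions. It is not the PR's captureWindowBlockFees: the TxFees and BlockWithTxs shapes, the summarizeWindow name, the inclusive window bounds, the skipping of empty blocks, the EIP-1559-only fee fields, and the nearest-rank percentile are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the PR's real types are not shown here.
interface TxFees {
  maxFeePerGas: bigint;
  maxPriorityFeePerGas: bigint;
}

interface BlockWithTxs {
  number: bigint;
  timestamp: bigint; // seconds since epoch
  baseFeePerGas: bigint;
  transactions: TxFees[];
}

interface WindowBlockStats {
  blockNumber: bigint;
  baseFeePerGas: bigint;
  minIncludedPriorityFee: bigint;
  p75PriorityFee: bigint;
}

// EIP-1559 tip actually paid: the tip cap, limited by the headroom left above the base fee.
function effectivePriorityFee(tx: TxFees, baseFee: bigint): bigint {
  const headroom = tx.maxFeePerGas - baseFee;
  return tx.maxPriorityFeePerGas < headroom ? tx.maxPriorityFeePerGas : headroom;
}

function summarizeWindow(
  blocks: BlockWithTxs[],
  windowStartS: bigint,
  windowEndS: bigint,
): WindowBlockStats[] {
  return blocks
    // Assumption: both window bounds are inclusive.
    .filter(b => b.timestamp >= windowStartS && b.timestamp <= windowEndS)
    // Assumption: empty blocks are skipped because they carry no inclusion signal.
    .filter(b => b.transactions.length > 0)
    .map(b => {
      const tips = b.transactions
        .map(tx => effectivePriorityFee(tx, b.baseFeePerGas))
        .sort((x, y) => (x < y ? -1 : x > y ? 1 : 0)); // bigint-safe ascending sort
      // Nearest-rank 75th percentile over transactions (unweighted).
      const p75Index = Math.ceil(0.75 * tips.length) - 1;
      return {
        blockNumber: b.number,
        baseFeePerGas: b.baseFeePerGas,
        minIncludedPriorityFee: tips[0],
        p75PriorityFee: tips[p75Index],
      };
    });
}
```

The effective tip, rather than the raw maxPriorityFeePerGas, is what a competing bid has to beat, which is why the helper caps it by the headroom above the base fee.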

Sample record (timeout/data-<id>.json)

All wei values are decimal strings (gwei = ÷1e9).

{
  "failureType": "timeout",
  "l1BlockNumber": "21050006",
  "error": { "message": "L1 transaction 0x… timed out", "name": "TimeoutError" },
  "context": { "actions": ["propose"], "slot": 12345, "sender": "0xYourAttester…" },
  "timing": { "targetL2Slot": 12345, "msUntilSlotDeadline": -8000 },
  "gasInfo": {
    "attempts": 2,
    "gasLimit": "2100000",
    "nonce": 812,
    "sentGasPriceLadder": [
      { "maxFeePerGas": "31000000000", "maxPriorityFeePerGas": "1000000000" },
      { "maxFeePerGas": "33000000000", "maxPriorityFeePerGas": "2000000000" }
    ],
    "windowBlocks": [
      { "blockNumber": "21050001", "baseFeePerGas": "29000000000",
        "minIncludedPriorityFee": "2500000000", "p75PriorityFee": "3200000000",
        "blockBlobsFull": false, "includedBlobTxCount": 3, "includedBlobCount": 5 },
      { "blockNumber": "21050002", "baseFeePerGas": "29500000000",
        "minIncludedPriorityFee": "2400000000", "p75PriorityFee": "3100000000",
        "blockBlobsFull": false, "includedBlobTxCount": 2, "includedBlobCount": 4 }
    ]
  }
}

Reading it: the top of sentGasPriceLadder was a 2 gwei priority fee; the window blocks' minIncludedPriorityFee was 2.4–2.5 gwei (and p75 was 3.1–3.2 gwei), so even the escalated bid was below the minimum that got in anywhere in the slot: underpriced by ~0.4–0.5 gwei to clear and ~1.1–1.2 gwei to be competitive. The 33 gwei maxFeePerGas still covered the 29–29.5 gwei base fee plus that tip, so the fee cap was not the binding limit. msUntilSlotDeadline: -8000 confirms it missed the slot. blockBlobsFull: false rules out blob-space contention (if it were true, the loss would be blob space, not tip).
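
The same reading can be scripted over stored records. The sketch below is one possible way to do it, assuming the record fields shown in the sample above and assuming the last ladder entry is the highest bid; compareBidToWindow, BlockShortfall, and toGwei are hypothetical names, not part of the PR.

```typescript
// Field names follow the sample record; helper and result names are hypothetical.
interface SentGasPrice {
  maxFeePerGas: string;
  maxPriorityFeePerGas: string;
}

interface WindowBlock {
  blockNumber: string;
  baseFeePerGas: string;
  minIncludedPriorityFee: string;
  p75PriorityFee: string;
  blockBlobsFull: boolean;
}

interface BlockShortfall {
  blockNumber: string;
  bidTipWei: bigint;
  shortToClearWei: bigint; // > 0: the bid was below the cheapest included tx
  shortToP75Wei: bigint; // > 0: the bid was below the block's p75 tip
  blockBlobsFull: boolean;
}

function compareBidToWindow(ladder: SentGasPrice[], windowBlocks: WindowBlock[]): BlockShortfall[] {
  if (ladder.length === 0) return [];
  // Assumption: the ladder is ordered, so the last rung is the highest bid.
  const top = ladder[ladder.length - 1];
  const tipCap = BigInt(top.maxPriorityFeePerGas);
  const feeCap = BigInt(top.maxFeePerGas);
  return windowBlocks.map(b => {
    // The tip a block producer would actually receive at this block's base fee.
    const headroom = feeCap - BigInt(b.baseFeePerGas);
    const bidTipWei = tipCap < headroom ? tipCap : headroom;
    return {
      blockNumber: b.blockNumber,
      bidTipWei,
      shortToClearWei: BigInt(b.minIncludedPriorityFee) - bidTipWei,
      shortToP75Wei: BigInt(b.p75PriorityFee) - bidTipWei,
      blockBlobsFull: b.blockBlobsFull,
    };
  });
}

// Display helper only; loses precision for very large values.
const toGwei = (wei: bigint): number => Number(wei) / 1e9;
```

A record counts as underpriced in this sketch when shortToClearWei is positive for every window block and blockBlobsFull is false throughout.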

Testing

  • Unit: window-walk logic; L1TxTimeoutError instanceof TimeoutError contract.
  • Integration (anvil): real send → speed-up → timeout captures & surfaces the ladder, and the flag gates retention.
  • Integration: sendRequests timeout → ladder written to a real file-backed store → record read back off disk.
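
For readers unfamiliar with the pattern, the instanceof contract and the flag-gated ladder can be sketched as follows. This is an assumption-laden stand-in, not the PR's code: the TimeoutError here is a local placeholder for the codebase's existing class, and the GasPrice shape, the recordAttempt and ladderFromError helpers, and the constructor signature are invented for the example.

```typescript
interface GasPrice {
  maxFeePerGas: bigint;
  maxPriorityFeePerGas: bigint;
}

// Placeholder for the existing TimeoutError (assumed to be a plain Error subclass).
class TimeoutError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'TimeoutError';
    // Keeps instanceof working even if the code is compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Subclass that carries the escalation ladder; existing `instanceof TimeoutError`
// checks keep matching because the prototype chain still includes TimeoutError.
class L1TxTimeoutError extends TimeoutError {
  readonly gasPriceHistory: GasPrice[] | undefined;

  constructor(message: string, gasPriceHistory: GasPrice[] | undefined) {
    super(message);
    this.gasPriceHistory = gasPriceHistory;
  }
}

// Appends a rung only when retention is enabled (history is undefined when it is off).
function recordAttempt(history: GasPrice[] | undefined, price: GasPrice): GasPrice[] | undefined {
  return history ? [...history, price] : undefined;
}

// Catch-site helper: pull the ladder out of whatever was thrown, if it has one.
function ladderFromError(err: unknown): GasPrice[] | undefined {
  return err instanceof L1TxTimeoutError ? err.gasPriceHistory : undefined;
}
```

Whether the real subclass overrides the error name is not stated in the description, so the sketch leaves it inherited.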

Follow-up

Capturing the cancellation tx (hash + price) is a planned follow-up — it fires fire-and-forget, so it needs its own hook rather than a throw-time snapshot.

aminsammara and others added 6 commits June 30, 2026 19:36
Adds gas pricing, L1 fee environment, and slot timing data to the
FailedL1Tx records so operators can diagnose underpriced L1 transactions.

Captures: sent gas prices, pending pool p75 priority fees, next mined
block inclusion thresholds, and time remaining until slot deadline.
Also adds backup calls for send failures, on-chain reverts, and
timeouts which were previously not recorded.
…ailed-tx store

Enriches failed-L1-tx records so operators can diagnose underpricing:

- Capture the fee data of the L1 inclusion window (the blocks the tx could
  have landed in for its L2 slot) via historical reads, once per slot, instead
  of polling for a single "next" block.
- Retain the escalating gas-price ladder (initial send + speed-ups) on tx state
  and surface it on timeout via L1TxTimeoutError, so timeout records show what
  was actually paid across retries. Gated by the existing L1_TX_FAILED_STORE
  flag, so the tx-monitor hot path is unchanged when the store is disabled.
- Give send-error/timeout records a per-attempt id so retries no longer
  overwrite each other; gate all capture on the resolved store so no fee-read
  RPC runs when the store is disabled.

Adds unit and anvil integration tests for the window capture, the ladder
capture/surfacing, and the sequencer writing the ladder into a real store.
…s for fee data

The mined inclusion-window blocks already record the p75 and min-included
priority fees of the txs that actually got in, which is the authoritative
underpricing signal. The separate pending-pool snapshot only added a stale,
post-deadline view (captured after the timeout) plus the single heaviest RPC
call in the flow (pending block with full transactions). Remove
captureFeeSnapshot and its eight gasInfo fields; windowBlocks is now the sole
fee-environment source. Early send-errors whose window is not yet mined simply
carry no window data.
…t compiles

Multicall3.forward's return type gained a `state` field, but the existing
forwardSpy mocks in sequencer-publisher and checkpoint_voter tests weren't
updated — latent since the yarn-project build was previously blocked by
ungenerated l1-artifacts. Regenerating the artifacts surfaced the errors.

spalladino left a comment

Contributor

Looks good! No errors I could find, just some nits or configuration decisions.

txConfigOverrides: gasConfigOverrides ?? {},
sentAtL1Ts: now,
lastSentAtL1Ts: now,
gasPriceHistory: this.config.captureGasPriceHistory ? [baseState.gasPrice] : undefined,

For simplicity's sake, I'd remove the captureGasPriceHistory and save it always. It's pretty cheap to just keep track of this extra gasPrice field per attempt.

Comment on lines +344 to +356
return {
  windowBlocks: windowBlocks.map(b => ({
    blockNumber: b.blockNumber.toString(),
    timestamp: b.timestamp.toString(),
    baseFeePerGas: b.baseFeePerGas.toString(),
    p75PriorityFee: b.p75PriorityFee.toString(),
    minIncludedPriorityFee: b.minIncludedPriorityFee.toString(),
    minIncludedBlobPriorityFee: b.minIncludedBlobPriorityFee.toString(),
    blockBlobsFull: b.blockBlobsFull,
    includedBlobTxCount: b.includedBlobTxCount,
    includedBlobCount: b.includedBlobCount,
  })),
};

Why do we need these mappings? Why are we casting everything to string here? If it's just for serialization purposes, we have a custom jsonStringify that knows how to deal with bigints properly.
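
For context on this point: plain JSON.stringify throws a TypeError on bigint values, which is why code often stringifies them by hand. A bigint-aware serializer removes that need. The helper below is a generic sketch with a hypothetical name (stringifyWithBigInts); it is not the repository's jsonStringify, whose exact output format is not shown in this thread.

```typescript
// Hypothetical helper: serialize bigints as decimal strings via a JSON.stringify replacer.
function stringifyWithBigInts(value: unknown, indent?: number): string {
  return JSON.stringify(
    value,
    (_key, v) => (typeof v === 'bigint' ? v.toString() : v),
    indent,
  );
}

// With such a helper, the record can keep bigint fields and skip the manual mapping.
const exampleBlock = {
  blockNumber: BigInt('21050001'),
  baseFeePerGas: BigInt('29000000000'),
  blockBlobsFull: false,
};
const serializedBlock = stringifyWithBigInts(exampleBlock);
```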

Comment on lines +590 to +595
let l1BlockNumber = 0n;
try {
  l1BlockNumber = await this.l1TxUtils.getBlockNumber();
} catch {
  // ignore - back up without the block number
}

Suggested change
- let l1BlockNumber = 0n;
- try {
-   l1BlockNumber = await this.l1TxUtils.getBlockNumber();
- } catch {
-   // ignore - back up without the block number
- }
+ const l1BlockNumber = await this.l1TxUtils.getBlockNumber().catch(() => 0n);

Sorry for the OCD

Comment on lines +311 to +313
const feeSummary =
  opts?.sharedFeeSummary ??
  (opts?.captureFeeSummary ? await this.captureFeeEnvironment(opts.targetSlot) : undefined);

I'd wrap the captureFeeEnvironment in a try/catch (or .catch(() => undefined)), so an error while retrieving fee env does not break the saving of the failed tx.
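
A minimal sketch of that best-effort pattern, under the assumption that the fee summary is optional on the record. The bestEffort helper, the FeeSummary and FailedTxRecord shapes, and buildRecord are hypothetical stand-ins for the publisher's backup path.

```typescript
// Run an optional async step; on failure, report the error and fall back to undefined.
async function bestEffort<T>(
  step: () => Promise<T>,
  onError: (err: unknown) => void = () => {},
): Promise<T | undefined> {
  try {
    return await step();
  } catch (err) {
    onError(err);
    return undefined;
  }
}

// Hypothetical shapes standing in for the real record types.
interface FeeSummary {
  windowBlocks: unknown[];
}

interface FailedTxRecord {
  failureType: string;
  feeSummary?: FeeSummary;
}

// The record is built whether or not the fee capture succeeds.
async function buildRecord(
  failureType: string,
  captureFeeEnvironment: () => Promise<FeeSummary>,
): Promise<FailedTxRecord> {
  const feeSummary = await bestEffort(captureFeeEnvironment);
  return feeSummary ? { failureType, feeSummary } : { failureType };
}
```

Passing a logger as onError keeps the capture failure visible without letting it abort the save.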

Comment on lines +288 to +289
* When captureFeeSummary is true, captures L1 fee environment and waits for the next
* mined block (~12s) to record the definitive inclusion threshold before saving.

I can't find where we're waiting for the next block

this.failedTxStore = createL1TxFailedStore(config.l1TxFailedStore, this.log);

// Only retain the gas-price ladder on publishers when we'll actually store failures with it.
this.captureGasPriceHistory = !!config.l1TxFailedStore;

I'd argue it's useful to capture gas price history (or the "fee env" as called here) even if you don't have a failed tx store. Outputting that data in logs under a warn (or error) can help diagnosing, even when there's no failed tx store configured.

Comment on lines +830 to +834
export async function captureWindowBlockFees(
  client: ViemClient,
  windowStartS: bigint,
  windowEndS: bigint,
): Promise<WindowBlockFees[]> {

IIRC there's an eth RPC call that returns gas price history. Can you check if we get enough data from it, so we don't have to download every single tx in the window to compute the stats? Maybe that call plus the block headers (ie includeTransactions: false) are good enough?
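
The call in question is presumably eth_feeHistory, which returns per-block base fees and priority-fee percentiles without transaction bodies. The sketch below shows how a decoded response might map onto some of the window fields, assuming percentiles [0, 75] were requested. The FeeHistory interface mirrors the JSON-RPC field names after hex decoding; HeaderOnlyStats and windowStatsFromFeeHistory are hypothetical, and treating blobGasUsedRatio of 1 as "blobs full" is an assumption.

```typescript
// Decoded eth_feeHistory result (field names follow the JSON-RPC response).
interface FeeHistory {
  oldestBlock: bigint;
  baseFeePerGas: bigint[]; // one entry per block, plus one for the next block
  gasUsedRatio: number[];
  reward?: bigint[][]; // reward[i][j]: tip at the j-th requested percentile in block i
  blobGasUsedRatio?: number[]; // present on nodes that report blob data
}

interface HeaderOnlyStats {
  blockNumber: bigint;
  baseFeePerGas: bigint;
  minIncludedPriorityFee: bigint;
  p75PriorityFee: bigint;
  blockBlobsFull: boolean | undefined;
}

// Assumes the request used rewardPercentiles [0, 75], so r[0] is the lowest tip and r[1] the p75.
function windowStatsFromFeeHistory(h: FeeHistory): HeaderOnlyStats[] {
  const rewards = h.reward ?? [];
  return rewards.map((r, i) => ({
    blockNumber: h.oldestBlock + BigInt(i),
    baseFeePerGas: h.baseFeePerGas[i],
    minIncludedPriorityFee: r[0],
    p75PriorityFee: r[1],
    // Assumption: a ratio of 1 means the block used its full blob capacity.
    blockBlobsFull: h.blobGasUsedRatio ? h.blobGasUsedRatio[i] >= 1 : undefined,
  }));
}
```

Two caveats if this route is taken: the percentiles in eth_feeHistory are weighted by gas used, so p75 may differ from a per-transaction percentile, and the response does not appear to cover minIncludedBlobPriorityFee, includedBlobTxCount, or includedBlobCount, which would still need another source.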

Do you think we could move the logic for backing up failed txs and capturing fee env to a separate component? Could be a wrapper of the L1TxUtils, or a dependency, or something completely different, but outside the publisher itself. All this failed tx management is starting to pollute the publisher a lot. No need to do on this PR though.

2 participants