otel metrics for autobahn/avail and p2p/mux#3682
Conversation
PR SummaryLow Risk Overview Autobahn avail gets a new metrics package with gauges for latest app/commit road and global block indices, a histogram for proposal-to-commit latency, and commit-to-commit interval counters labeled by capped view timeouts. ObserveAppQC and ObserveCommitQC run from prune and from PushCommitQC, with mutex-backed dedup so stale or duplicate QCs are ignored. P2P mux records per-stream open/close latency, in-flight streams, and send/recv message and byte counts, broken out by role (accept vs connect) and rpc_name. StreamKindConfig.Name is populated from RPC registration (request type name) when building the mux config. Stream lifecycle hooks live on open, send, recv, and close. Also simplifies the legacy P2P metricsLabelCache mutex (embedded sync.RWMutex, any instead of interface{}) and a small rpc Serve refactor plus a client-side rate-limit comment typo fix. Reviewed by Cursor Bugbot for commit ba94963. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3682 +/- ##
==========================================
- Coverage 58.97% 58.04% -0.94%
==========================================
Files 2263 2181 -82
Lines 187223 177649 -9574
==========================================
- Hits 110421 103117 -7304
+ Misses 66858 65384 -1474
+ Partials 9944 9148 -796
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Adds OpenTelemetry metrics for autobahn/avail consensus QCs and p2p/mux stream I/O, plus minor refactors. The instrumentation is reasonable, but the new mux metrics file is not gofmt-clean (will fail the lint CI), and several monotonic totals are declared as UpDownCounters.
Findings: 2 blocking | 4 non-blocking | 2 posted inline
Blockers
sei-tendermint/internal/p2p/mux/metrics/metrics.gois not gofmt-clean, which violates the repo's formatting requirement and will fail golangci/make lintin CI (see inline comment). Rungofmt -s -won the file.- 1 blocking issue(s) flagged inline on specific lines.
Non-blocking
- Repository review guidelines (
REVIEW_GUIDELINES.md) were empty/missing, so no repo-specific standards were applied beyond AGENTS.md. - The Cursor second-opinion pass (
cursor-review.md) produced no output. - In
avail/metrics.goObserveCommitQC,proposalToCommitLatency.Record(...)is called before the ordering guardlast.val.Index() >= qc.Index(), so a re-observed/stale CommitQC still records a latency sample. Consider moving the record after the guard if only fresh QCs should be measured (minor, metrics-only). - 1 suggestion(s)/nit(s) flagged inline on specific lines.
| "go.opentelemetry.io/otel" | ||
| "go.opentelemetry.io/otel/attribute" | ||
| "go.opentelemetry.io/otel/metric" | ||
| prometheus "github.com/prometheus/client_golang/prometheus" |
There was a problem hiding this comment.
[blocker] This file is not gofmt-clean and will fail the lint CI. Two issues: (1) within this import group the prometheus "github.com/prometheus/client_golang/prometheus" line is out of order — gofmt sorts by path, so github.com/... must come before the go.opentelemetry.io/... imports; (2) the following blank line (line 11) contains a trailing tab. Run gofmt -s -w to fix. (Also flagged by Codex.)
| var meter = otel.Meter("tendermint_internal_p2p_mux") | ||
| var latency = utils.OrPanic1(meter.Float64Histogram("latency", metric.WithUnit("s"), buckets(0.001, 1.3, 30))) | ||
| var inFlight = utils.OrPanic1(meter.Int64UpDownCounter("inflight")) | ||
| var sendMsgs = utils.OrPanic1(meter.Int64UpDownCounter("send_msgs")) |
There was a problem hiding this comment.
[suggestion] sendMsgs, sendBytes, recvMsgs, and recvBytes are only ever incremented (Add(+n)), so they are monotonic totals and should be Int64Counter, not Int64UpDownCounter. UpDownCounter exports as a non-monotonic gauge, which loses the counter/_total semantics that rate() and similar queries expect. (inFlight correctly stays an UpDownCounter.) Also flagged by Codex.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit ba94963. Configure here.
| commitToCommitLatencySum.Add(ctx, now.Sub(last.time).Seconds(), attrs) | ||
| commitToCommitLatencyCount.Add(ctx, 1, attrs) | ||
| } | ||
| proposalToCommitLatency.Record(ctx, now.Sub(qc.Proposal().Timestamp()).Seconds()) |
There was a problem hiding this comment.
Restart skews commit latency histogram
Medium Severity
On startup, observing the persisted prune anchor records proposal_to_commit_latency as wall time since the anchor proposal’s original timestamp. That can be hours or days of downtime, producing a bogus latency sample that distorts histograms and SLO views after every restart.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit ba94963. Configure here.
| ) | ||
|
|
||
| func buckets(start float64, factor float64, count int) metric.HistogramOption { | ||
| return metric.WithExplicitBucketBoundaries(prometheus.ExponentialBuckets(start, factor, count)...) |
There was a problem hiding this comment.
Histogram buckets use wrong helper
Low Severity
New OTel histograms build bucket boundaries via prometheus.ExponentialBuckets in a local helper. Repo convention for new OTel metrics is ExponentialBucketsRange so bucket spans stay consistent with instruments like mempool compact duration.
Additional Locations (1)
Triggered by learned rule: sei-tendermint: new metrics must be OTel native with established conventions
Reviewed by Cursor Bugbot for commit ba94963. Configure here.
There was a problem hiding this comment.
Observability-only PR adding OpenTelemetry metrics for autobahn/avail consensus QCs and p2p/mux stream traffic. The changes are correct and well-structured (dedup guards, bounded label cardinality, safe mutex-by-value refactor); the only substantive concern is that the in-flight stream gauge can drift upward when streams are abandoned without a local Close.
Findings: 0 blocking | 4 non-blocking | 1 posted inline
Blockers
- None at the file/PR level.
Non-blocking
- No tests were added for the new metrics helpers (ObserveCommitQC/ObserveAppQC dedup logic and the Stream metric lifecycle). A small unit test around the monotonic dedup guards (out-of-order QCs are ignored) and Open/Close idempotency would guard against regressions; acceptable to skip for an observability-only change, but worth noting.
- Cursor's second-opinion pass (cursor-review.md) produced no output, and REVIEW_GUIDELINES.md was empty on the base branch, so no repo-specific standards were applied. Codex's single medium finding (the inFlight counter leak) is incorporated as an inline comment.
- Minor: the
Sendmetric is recorded when a message is enqueued to the frame queue (stream.go:95) rather than when it is actually written to the connection, so send_bytes/send_msgs slightly lead the wire. This is a reasonable approximation but worth a comment if precise byte accounting is ever expected. - 1 suggestion(s)/nit(s) flagged inline on specific lines.
| return | ||
| } | ||
| inner.closed.local = true | ||
| inner.metrics.Close() |
There was a problem hiding this comment.
[suggestion] The inFlight up/down counter is only decremented here in the local close() path. Open() increments it once a stream's recv end is established, but the decrement never runs if a stream is abandoned without a local Close() — e.g. Mux.Run exits on a connection/read/write error, or a caller observes remote close (tryPrune in runRecv only fully prunes once both local and remote are closed) but never calls Close(). Because inFlight is a process-global counter shared across all connections, this drifts permanently upward across peer churn, making the gauge unreliable over a long-running node. Consider decrementing symmetrically on prune/teardown (e.g. from tryPrune or a runner shutdown hook) so an opened stream is always balanced by exactly one decrement. (Matches Codex's medium finding.)
Superseded: latest AI review found no blocking issues.
bdchatham
left a comment
There was a problem hiding this comment.
xreview summary (opentelemetry / sei-networking / systems / idiom lenses). Nice, well-scoped instrumentation — most of it is clean. Three points worth resolving before merge, plus a few smaller ones inline.
🔴 MeterProvider wiring — where is the global MeterProvider set up? otel.Meter() is a no-op unless a provider with a Prometheus/OTLP reader and a service.name resource is registered in the binary, and I don't see it in this diff. If it isn't already wired in main, every instrument here silently records nothing — worth confirming before merge.
On your weighted-histogram question: you don't need to call Record more than once. You have exactly one latency value per commit, so it's a single Record into a Float64Histogram — the sum+count counter split was solving a problem the code doesn't have. Details in the inline comment on commitToCommitLatency.
| "tendermint_internal_autobahn_avail__app_global_block_number", | ||
| metric.WithDescription("global block number of the highest observed appQC"), | ||
| )) | ||
| var proposalToCommitLatency = utils.OrPanic1(meter.Float64Histogram( |
There was a problem hiding this comment.
🟡 Add metric.WithUnit("s") on the avail latency instruments — the mux-side latency sets it but these don't, so the exporter won't append _seconds and the two packages end up with inconsistently-named latency series.
| return &RPC[API, Req, Resp]{kind, limit, req, resp} | ||
| service[kind] = &rpcConfig{ | ||
| limit: limit, | ||
| name: fmt.Sprintf("%T", utils.Zero[Req]()), |
There was a problem hiding this comment.
🟡 fmt.Sprintf("%T", ...) bakes the fully-qualified Go type into the rpc_name label, so renaming or moving the request type silently breaks dashboard continuity. Prefer an explicit, stable name string.
| f.Header.PayloadSize = utils.Alloc(uint64(len(msg))) | ||
| f.Header.MsgEnd = utils.Alloc(true) | ||
| } | ||
| inner.metrics.Send(len(msg)) |
There was a problem hiding this comment.
🟡 Each .Add allocates ~16 B via the variadic AddConfig, and all same-kind streams share one aggregator mutex that's taken here while the stream lock is held — ~64 B / 4 allocs / 4 shared-lock acquisitions per Send+Recv round-trip. Fine at today's rates; if a mux ever gets hot, batch these or move the Add outside the stream lock.
Weighted histograms in OTel — summary of the threadTL;DR on "how do I migrate the weighted histograms without calling There's no weighted 1. Looping recordOpts := metric.WithAttributeSet(attrSet) // build once, outside the loop
for i := uint64(0); i < blocks; i++ {
seqLatency.Record(ctx, latencySec, recordOpts)
}2. But I'd reconsider the weighted histogram itself. Replaying one shared per-commit measurement into N identical points inflates the statistical support and quietly redefines seqLatency.Record(ctx, latencySec, opts) // per-commit, unweighted → commit-health p99
blocksSequenced.Add(ctx, int64(blocks), opts) // blocks_sequenced_total → throughput
// optional: blocksPerCommit.Record(ctx, float64(blocks), opts) // batch-size distributionNow 3. If you genuinely need a pre-aggregated / weighted histogram in OTel, it's doable natively via a custom func (p *seqLatencyProducer) Observe(latencySec float64, blocks uint64) {
p.mu.Lock(); defer p.mu.Unlock()
i := sort.SearchFloat64s(p.bounds, latencySec) // first bound >= value
p.bucketCounts[i] += blocks // weight by N, O(1)
p.count += blocks
p.sum += latencySec * float64(blocks)
}
func (p *seqLatencyProducer) Produce(ctx context.Context) ([]metricdata.ScopeMetrics, error) {
p.mu.Lock()
bc := append([]uint64(nil), p.bucketCounts...) // snapshot under lock
dp := metricdata.HistogramDataPoint[float64]{
Attributes: p.attrs, StartTime: p.start, Time: time.Now(),
Count: p.count, Bounds: p.bounds, BucketCounts: bc, Sum: p.sum,
}
p.mu.Unlock()
return []metricdata.ScopeMetrics{{
Scope: instrumentation.Scope{Name: "consensus"},
Metrics: []metricdata.Metrics{{
Name: "consensus_block_sequencing_latency", Unit: "s",
Data: metricdata.Histogram[float64]{
Temporality: metricdata.CumulativeTemporality,
DataPoints: []metricdata.HistogramDataPoint[float64]{dp},
},
}},
}}, nil
}The catch: you now own the cumulative-temporality invariants (bucket counts monotonically non-decreasing + a stable On the API-ergonomics point: yes, client_golang's |
| } | ||
| // Constructed once per CommitQC, which we should afford. | ||
| attrs := metric.WithAttributeSet(attribute.NewSet( | ||
| // Timeouts capped: 20 means [20,inf) |
There was a problem hiding this comment.
I suppose we almost never have 20+ timeouts on one RoadIndex? What happens if the cluster gets stuck?
| } | ||
|
|
||
| var observedCommitQC = newObserved[*types.CommitQC]() | ||
| var observedAppQC = newObserved[*types.AppQC]() |
There was a problem hiding this comment.
Do we need the full QC? We are only using RoadIndex, View Index, and timestamp I suppose?
| return false, nil | ||
| } | ||
| i.latestAppQC = utils.Some(appQC) | ||
| metrics.ObserveAppQC(appQC) |
There was a problem hiding this comment.
This could be much later than we actually observe AppQC for this RoadIndex right? Because the block proposal packing AppQC is optional.
| i.commitQCs.prune(idx) | ||
| if i.commitQCs.next == idx { | ||
| i.commitQCs.pushBack(commitQC) | ||
| metrics.ObserveCommitQC(c, commitQC) |
There was a problem hiding this comment.
Sorry I'm confused, why do we observe these QCs inside prune() instead of right after the QCs are verified?


It will give us insight into consensus state and rpc performance.