proxy_h2: acknowledge downstream RST_STREAM while the upstream request-body write is blocked#911
Open
molocule wants to merge 7 commits into
Open
proxy_h2: acknowledge downstream RST_STREAM while the upstream request-body write is blocked#911molocule wants to merge 7 commits into
molocule wants to merge 7 commits into
Conversation
andrewhavck
reviewed
Jun 19, 2026
| client_body.send_reset(h2::Reason::CANCEL); | ||
| // Mark the underlying H2 connection for shutdown so it's not used | ||
| // for new streams in case it is hung. | ||
| client_session.conn.mark_shutdown(); |
Collaborator
There was a problem hiding this comment.
I don't think we want to mark the entire upstream connection for shutdown on a downstream error. Sending a CANCEL seems fine on the impacted stream.
andrewhavck
reviewed
Jun 19, 2026
| } | ||
| // ignore downstream error so that upstream can continue to write cache | ||
| downstream_state.to_errored(); | ||
| warn!( |
Collaborator
There was a problem hiding this comment.
Let's use the same pattern of checking suppress_proxy_warn_log before emitting a warn here.
andrewhavck
reviewed
Jun 20, 2026
| /// For HTTP/2 this resolves when the client resets the stream (RST_STREAM) or the | ||
| /// stream errors. Other protocols have no out-of-band abort signal (detecting a | ||
| /// close would require consuming reads), so this future is pending forever for them. | ||
| pub async fn watch_h2_stream_reset(&mut self) -> Result<h2::Reason> { |
Collaborator
There was a problem hiding this comment.
I'd prefer us return the future here Option<Idle<'_>>, we then avoid the future and select for downstream HTTP/1.1 clients in proxy_h2. We also avoid the case where a caller mistakenly invokes this for something other than H2 outside of select and stays pending.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR closes #910
Problem
In
bidirection_down_to_up, the downstream-read select arm awaitssend_body_to2()inside the arm body. While that write is parked on upstream flow control, the downstreamRecvStreamis not polled, so a downstreamRST_STREAMis never observed.The task keeps holding the downstream stream handles, and since the
h2crate only returns a reset stream's connection-window credit once all handles drop (release_closed_capacityatref_count == 0), the cancelled stream pins its share of the downstream connection window for as long as the upstream write stays blocked (potentially indefinitely, because there is no default write timeout).This is dangerous because cancelling a stream immediately frees its stream slot (max_concurrent_streams) on both ends, so the connection appears to have plenty of capacity and the client (e.g. Envoy or any gRPC client) keeps multiplexing new requests onto it. But the shared connection send window does not come back: credit the client spent can only be restored by the receiver's connection-level WINDOW_UPDATEs, and pingora never sends them because the cancelled streams' unconsumed credit stays pinned to their zombie streams. Flow-control accounting is correct, but clients will not be able to send data upstream.
Changes
pingora-core: addSession::watch_h2_stream_reset()to the HTTP server session enum. For H2 it resolves when the client resets the stream, reusing the existingIdle/poll_resetfuture; for H1/other protocols it is pending forever (there is no out-of-band abort signal — detecting an H1 close would require destructively reading the socket).proxy_h2::send_body_to2: racewrite_bodyagainstwatch_h2_stream_reset(). A reset surfaces as anH2ErrorwithErrorSource::Downstream.bidirection_down_to_up: on a downstream-sourced error fromsend_body_to2, fail so the stream handles drop and the window credit is reclaimed immediately, except when a cache fill is in progress, in which case the downstream error is swallowed and the upstream response continues to be admitted to cache, mirroring the existing policy for downstream read errors.proxy_down_to_up: on downstream-sourced errors, sendRST_STREAM CANCELon the upstream stream so the upstream peer also releases its stream resources promptly (previously only done for upstream read timeouts).Notes
The race is placed around the raw
write_body(after the request-body filters) rather than around all ofsend_body_to2or at the call site. This is because it is the only spot where the borrows are disjoint (&mut SendStreamvs.&mut Session), and it means a reset only ever cancels anh2frame write on a stream that is being torn down, it never cancels user-defined body filters mid-await, which were never required to be cancel-safe.Testing
Two integration tests (plus an
h2clistener for the cache test service on :6154, since rawh2frame control is needed):test_h2_downstream_rst_while_upstream_write_blocked: upstream never grants window updates; client fills the upstream stream window until the proxy parks, then sendsRST_STREAM. Asserts the upstream stream is cancelled (RST_STREAM CANCELreceived) within a bounded time. Hangs without this fix.test_h2_downstream_rst_during_cache_fill: same blocked-write reset, but with a cacheable response mid-admission; the upstream withholds the response tail until after the reset. Asserts the fill still completes: a follow-up request is a cache hit with the full body.