fix: parallelize per-object HEAD on list-objects to stop recursive-list stalls by ServerSideHannes · Pull Request #88 · ServerSideHannes/s3proxy-python

ServerSideHannes · 2026-06-24T09:27:53Z

Problem

ClickHouse remote backups (web-production-platform) hang at the S3 list step. The backup tool issues a recursive list (list-type=2&delimiter=) against the bucket via this proxy; the proxy never returns, the client aborts at ~60s (HAProxy CH--, timing …/-1/60034), and the backup pod sits idle. clickhouse-backup list remote also times out. Folder-level lists (delimiter=/) return fast — that split is the tell.

Root cause

_process_list_objects in s3proxy/handlers/buckets.py issues one head_object per key, sequentially, to recover the SSE plaintext size/etag:

for obj in contents:
    head = await client.head_object(bucket, obj["Key"])   # serial, no cache

A recursive list returns up to max-keys (1000) objects per page → up to 1000 serial HEAD round-trips → latency grows linearly and trips the 60s client timeout. Delimited lists return few objects → fast. Borderline timing is why most hourly backups pass and some hang as the bucket grows.

Fix

Run the per-object HEADs concurrently, bounded by LIST_HEAD_CONCURRENCY = 50 (half the client max_pool_connections=100). Output order and the per-object fallback (listed size/etag on HEAD failure) are preserved — purely a latency change, identical output.

Also in this PR: unbreak the package import

The suite could not even collect: 8 sites used Py2 except A, B: syntax (utils.py, concurrency.py, request_handler.py, dashboard/collectors.py, dashboard/stats_store.py) which does not parse under Py3, so import s3proxy failed and CI was red on everything. Parenthesized them to catch both types as intended. This means a fresh import s3proxy was broken on main — worth understanding how that shipped.

Verification

tests/unit/test_list_objects_parallel.py (new) proves HEADs run concurrently (max_inflight > 1, capped at the limit), order preserved, internal keys dropped, failed HEAD falls back to listed size/etag.
Full unit suite now collects and runs: 435 passed, 1 failed. The single failure — test_dashboard.py::test_session_cookie_authenticates_api (Queue ... bound to a different event loop) — is a pre-existing, unrelated dashboard test-infra bug; it fails identically in isolation and was only hidden by the import break. Not touched here; should be a separate fix.

…st stalls ListObjects resolved the SSE plaintext size/etag with one sequential head_object per key. A recursive (empty-delimiter) list of up to max-keys objects stacked into a multi-second stall that tripped client timeouts, hanging ClickHouse/Postgres remote backups at the S3 list step. Run the HEADs concurrently, bounded by LIST_HEAD_CONCURRENCY (50), preserving output order and the per-object fallback to the listed size/etag.

…mports except A, B: is Py2 syntax (catch A, bind to name B) and fails to parse under Py3 — it broke 'import s3proxy', so the whole pytest suite couldn't collect (conftest imports s3proxy.concurrency). Parenthesize the 8 sites across utils, concurrency, request_handler and dashboard/* to catch both types as intended. Lets the list-objects self-check import the real module directly (drops the sibling-stubbing workaround).

ServerSideHannes added 2 commits June 24, 2026 11:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: parallelize per-object HEAD on list-objects to stop recursive-list stalls#88

fix: parallelize per-object HEAD on list-objects to stop recursive-list stalls#88
ServerSideHannes wants to merge 2 commits into
mainfrom
fix/parallel-list-head-fanout

ServerSideHannes commented Jun 24, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ServerSideHannes commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Root cause

Fix

Also in this PR: unbreak the package import

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ServerSideHannes commented Jun 24, 2026 •

edited

Loading