Skip to content

fix: unify HashSet and HashMap usages#7159

Merged
hanabi1224 merged 5 commits into
mainfrom
hm/eth-filter-hashset-contains
Jun 10, 2026
Merged

fix: unify HashSet and HashMap usages#7159
hanabi1224 merged 5 commits into
mainfrom
hm/eth-filter-hashset-contains

Conversation

@hanabi1224

@hanabi1224 hanabi1224 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary of changes

There're quite a few contains in Matcher implementations, where HashSet should be more efficient than Vec

Partially ports 7df7237 from #6720

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Chores
    • Added linting rules to restrict usage of specific hash types and guide developers toward alternatives.
    • Refactored imports across multiple modules to use a centralized prelude module, reducing code duplication.
    • Simplified import patterns by consolidating common traits and types into the crate's prelude, improving maintainability.

@hanabi1224 hanabi1224 added the RPC requires calibnet RPC checks to run on CI label Jun 10, 2026
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

This PR enforces stricter ahash usage via Clippy rules, centralizes extension traits in the crate prelude, removes redundant imports throughout the codebase, and standardizes empty container initialization using Default::default() in filter modules.

Changes

Import consolidation and Clippy enforcement

Layer / File(s) Summary
Clippy enforcement and prelude trait exports
.clippy.toml, src/lib.rs
Clippy configuration now disallows ahash::AHashMap and ahash::AHashSet, guiding developers to use standard variants. Extension traits HashMapExt and HashSetExt are added to the crate's internal prelude for module-wide availability.
Filter module import refactoring and Default initialization
src/rpc/methods/eth/filter/*, src/rpc/methods/eth/filter/mod.rs
Filter submodules (event.rs, mempool.rs, store.rs, tipset.rs) now use crate::prelude::* and direct ahash::HashMap imports. The filter/mod.rs file switches from explicit empty constructors (vec![], Default::new()) to Default::default() for container initialization, and removes aliased import patterns.
RPC methods and eth trace module imports
src/rpc/methods/eth.rs, src/rpc/methods/eth/trace/*, src/rpc/auth_layer.rs, src/rpc/methods/eth/utils.rs
RPC and trace modules now import from crate::prelude::*, remove extension trait imports, use direct ahash::HashMap instead of aliases, and narrow trait imports while relying on prelude for resolution.
Core infrastructure and storage module imports
src/chain/store/base_fee.rs, src/db/blockstore_with_write_buffer.rs, src/interpreter/vm.rs, src/libp2p_bitswap/request_manager.rs
Core modules update imports to use prelude, remove explicit Arc imports, eliminate extension trait imports, and standardize ahash type imports.
Message pool, registry, state, and utility module imports
src/message_pool/msgpool/*, src/rpc/registry/*, src/rpc/methods/state.rs, src/state_manager/message_search.rs, src/tool/subcommands/api_cmd/report.rs
Remaining modules consolidate imports to use prelude, remove extension trait imports, and standardize ahash usage across the codebase.

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • akaladarshi
  • LesnyRumcajs
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'fix: unify HashSet and HashMap usages' accurately summarizes the main refactoring changes across the codebase, which involve consolidating import patterns and removing redundant extension trait imports.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch hm/eth-filter-hashset-contains
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch hm/eth-filter-hashset-contains

Comment @coderabbitai help to get the list of available commands and usage tips.

@hanabi1224 hanabi1224 marked this pull request as ready for review June 10, 2026 03:11
@hanabi1224 hanabi1224 requested a review from a team as a code owner June 10, 2026 03:11
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team June 10, 2026 03:11

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/rpc/methods/eth/types.rs`:
- Around line 488-493: The public RPC enum EthHashList should preserve
wire-level list semantics: change the List variant back to List(Vec<EthHash>)
(keep Single(Option<EthHash>) as-is) so serialized JSON retains order and
duplicates; then update call sites that need set-operations to convert that Vec
into a HashSet for internal matching/parsing (e.g., when building topic_set use
hashes.iter().copied().collect::<HashSet<_>>() and use .contains(&hash)),
ensuring only internal logic uses HashSet while the EthHashList type stays
Vec-based for RPC compatibility.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 722679c1-3de2-4e42-a867-bd84b60d7588

📥 Commits

Reviewing files that changed from the base of the PR and between 7bafcfb and d5ab85c.

📒 Files selected for processing (25)
  • .clippy.toml
  • src/chain/store/base_fee.rs
  • src/db/blockstore_with_write_buffer.rs
  • src/interpreter/vm.rs
  • src/lib.rs
  • src/libp2p_bitswap/request_manager.rs
  • src/message_pool/msgpool/republish.rs
  • src/message_pool/msgpool/selection.rs
  • src/rpc/auth_layer.rs
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/event.rs
  • src/rpc/methods/eth/filter/mempool.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/rpc/methods/eth/filter/store.rs
  • src/rpc/methods/eth/filter/tipset.rs
  • src/rpc/methods/eth/trace/geth.rs
  • src/rpc/methods/eth/trace/state_diff.rs
  • src/rpc/methods/eth/types.rs
  • src/rpc/methods/eth/utils.rs
  • src/rpc/methods/mpool.rs
  • src/rpc/methods/state.rs
  • src/rpc/registry/actors_reg.rs
  • src/rpc/registry/methods_reg.rs
  • src/state_manager/message_search.rs
  • src/tool/subcommands/api_cmd/report.rs
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • filecoin-project/lotus (manual)

Comment thread src/rpc/methods/eth/types.rs
@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.31%. Comparing base (f178b5b) to head (59b0801).
✅ All tests successful. No failed tests found.

Additional details and impacted files
Files with missing lines Coverage Δ
src/chain/store/base_fee.rs 98.57% <ø> (ø)
src/db/blockstore_with_write_buffer.rs 95.16% <ø> (ø)
src/interpreter/vm.rs 82.03% <ø> (ø)
src/libp2p_bitswap/request_manager.rs 74.74% <ø> (ø)
src/message_pool/msgpool/republish.rs 87.09% <ø> (ø)
src/message_pool/msgpool/selection.rs 86.28% <ø> (ø)
src/rpc/auth_layer.rs 78.78% <ø> (ø)
src/rpc/methods/eth.rs 65.52% <ø> (ø)
src/rpc/methods/eth/filter/event.rs 84.48% <ø> (ø)
src/rpc/methods/eth/filter/mempool.rs 95.91% <ø> (ø)
... and 12 more

... and 3 files with indirect coverage changes


Continue to review full report in Codecov by Harness.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f178b5b...59b0801. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@hanabi1224 hanabi1224 changed the title fix: perfer HashSet::contains over Vec::contains in eth filters fix: unify HashSet and HashMap usages Jun 10, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/rpc/methods/eth/filter/event.rs`:
- Line 20: EventFilter currently exposes pub addresses: Vec<Address> but uses
Vec::contains in its Matcher/ParsedFilter match logic (linear scan); to keep
external Vec shape for compatibility while improving hot-path membership checks,
add an internal HashSet<Address> constructed once when creating the matcher
(e.g., in the Matcher::new or ParsedFilter::new path) and switch the membership
checks in the Matcher impl and ParsedFilter match methods from
self.addresses.contains(...) to this internal_set.contains(...); keep the public
addresses: Vec<Address> as-is so serialization/compatibility doesn't change, but
use the HashSet for all hot-path .contains lookups and ensure duplicates are
deduplicated when building the set.

In `@src/rpc/methods/eth/filter/mod.rs`:
- Line 596: The hot-path call EthHashList::List(vec) uses vec.contains(&hash)
inside matches(), causing O(n) checks per event; change the membership test to
use a HashSet or a cached lookup structure to make it O(1): either modify
EthHashList to hold a HashSet for List variants or, inside the matches() method,
convert the Vec to a HashSet once (or maintain a cached HashSet per filter)
before iterating events; update references to EthHashList::List and matches()
accordingly and ensure any conversion cost is amortized or benchmarked for
typical topic-list sizes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9d8cb73a-543d-4c6f-b6e5-78970e260235

📥 Commits

Reviewing files that changed from the base of the PR and between d5ab85c and 4b71aa6.

⛔ Files ignored due to path filters (3)
  • src/rpc/snapshots/forest__rpc__tests__rpc__v0.snap is excluded by !**/*.snap
  • src/rpc/snapshots/forest__rpc__tests__rpc__v1.snap is excluded by !**/*.snap
  • src/rpc/snapshots/forest__rpc__tests__rpc__v2.snap is excluded by !**/*.snap
📒 Files selected for processing (3)
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/event.rs
  • src/rpc/methods/eth/filter/mod.rs
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • filecoin-project/lotus (manual)
✅ Files skipped from review due to trivial changes (1)
  • src/rpc/methods/eth.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/rpc/methods/eth/filter/event.rs`:
- Line 20: EventFilter currently exposes pub addresses: Vec<Address> but uses
Vec::contains in its Matcher/ParsedFilter match logic (linear scan); to keep
external Vec shape for compatibility while improving hot-path membership checks,
add an internal HashSet<Address> constructed once when creating the matcher
(e.g., in the Matcher::new or ParsedFilter::new path) and switch the membership
checks in the Matcher impl and ParsedFilter match methods from
self.addresses.contains(...) to this internal_set.contains(...); keep the public
addresses: Vec<Address> as-is so serialization/compatibility doesn't change, but
use the HashSet for all hot-path .contains lookups and ensure duplicates are
deduplicated when building the set.

In `@src/rpc/methods/eth/filter/mod.rs`:
- Line 596: The hot-path call EthHashList::List(vec) uses vec.contains(&hash)
inside matches(), causing O(n) checks per event; change the membership test to
use a HashSet or a cached lookup structure to make it O(1): either modify
EthHashList to hold a HashSet for List variants or, inside the matches() method,
convert the Vec to a HashSet once (or maintain a cached HashSet per filter)
before iterating events; update references to EthHashList::List and matches()
accordingly and ensure any conversion cost is amortized or benchmarked for
typical topic-list sizes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9d8cb73a-543d-4c6f-b6e5-78970e260235

📥 Commits

Reviewing files that changed from the base of the PR and between d5ab85c and 4b71aa6.

⛔ Files ignored due to path filters (3)
  • src/rpc/snapshots/forest__rpc__tests__rpc__v0.snap is excluded by !**/*.snap
  • src/rpc/snapshots/forest__rpc__tests__rpc__v1.snap is excluded by !**/*.snap
  • src/rpc/snapshots/forest__rpc__tests__rpc__v2.snap is excluded by !**/*.snap
📒 Files selected for processing (3)
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/event.rs
  • src/rpc/methods/eth/filter/mod.rs
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • filecoin-project/lotus (manual)
✅ Files skipped from review due to trivial changes (1)
  • src/rpc/methods/eth.rs
🛑 Comments failed to post (2)
src/rpc/methods/eth/filter/event.rs (1)

20-20: ⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for EthFilterSpec and ParsedFilter construction patterns to estimate typical address list sizes
rg -nC3 'EthFilterSpec\s*\{' --type=rs | head -50
rg -nC3 'ParsedFilter\s*\{' --type=rs | head -50

Repository: ChainSafe/forest

Length of output: 114


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="src/rpc/methods/eth/filter/event.rs"
ls -la "$FILE"

# Print around the addresses field definition and the Matcher::matches implementation.
rg -n "pub addresses|struct EventFilter|trait Matcher|fn matches" "$FILE" || true

echo "---- addresses field context ----"
nl -ba "$FILE" | sed -n '1,120p'

echo "---- Matcher::matches context (search for matches fn) ----"
MATCH_LINE=$(rg -n "fn matches" "$FILE" | head -n1 | cut -d: -f1)
echo "matches line: $MATCH_LINE"
nl -ba "$FILE" | sed -n "$((MATCH_LINE-40)),$((MATCH_LINE+80))p"

echo "---- Look for HashSet usage in this module ----"
rg -n "HashSet|contains\(" src/rpc/methods/eth/filter "$FILE" || true

Repository: ChainSafe/forest

Length of output: 340


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="src/rpc/methods/eth/filter/event.rs"

echo "---- serialization/derive info for EventFilter ----"
rg -n "struct EventFilter|derive\(|Serialize|Deserialize|serde" "$FILE" || true

echo "---- grep for vec->set/revert markers in repo ----"
rg -n "vec->set|revert vec|HashSet<Address>|HashSet<.*Address" src || true

echo "---- locate where EventFilter.addresses is constructed/used ----"
rg -n "EventFilter\s*\{|addresses\s*:" src/rpc/methods/eth/filter src/rpc/methods/eth -S || true

Repository: ChainSafe/forest

Length of output: 4492


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- find EthFilterSpec and ParsedFilter definitions/usages ----"
rg -n "EthFilterSpec\s*\{" --glob '*.rs' | head -n 50
rg -n "ParsedFilter\s*\{" --glob '*.rs' | head -n 50

echo "---- look for construction sites of address lists in eth filter code ----"
rg -n "addresses\s*:\s*vec!\[|addresses\s*:\s*|address(es)?\s*:\s*vec!\[" src/rpc/methods/eth --glob '*.rs' || true

echo "---- search for any HashSet usage for addresses in eth filter code ----"
rg -n "HashSet<.*Address|HashSet" src/rpc/methods/eth/filter --glob '*.rs' || true

Repository: ChainSafe/forest

Length of output: 6095


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="src/rpc/methods/eth/filter/event.rs"
ls -la "$FILE"

rg -n "pub addresses|struct EventFilter|trait Matcher|fn matches" "$FILE" || true

echo "---- addresses field context ----"
nl -ba "$FILE" | sed -n '1,140p'

echo "---- Matcher::matches context (search for matches fn) ----"
MATCH_LINES=$(rg -n "fn matches" "$FILE" | cut -d: -f1 | head -n 5 || true)
echo "matches lines: $MATCH_LINES"

FIRST_LINE=$(echo "$MATCH_LINES" | head -n1)
if [[ -n "${FIRST_LINE:-}" ]]; then
  nl -ba "$FILE" | sed -n "$((FIRST_LINE-40)),$((FIRST_LINE+100))p"
fi

echo "---- Look for HashSet usage in eth/filter subtree ----"
rg -n "HashSet|contains\(" src/rpc/methods/eth/filter --glob '*.rs' || true

Repository: ChainSafe/forest

Length of output: 340


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="src/rpc/methods/eth/filter/event.rs"

echo "---- serialization/derive info for EventFilter ----"
rg -n "struct EventFilter|derive\(|Serialize|Deserialize|serde" "$FILE" || true

echo "---- grep for vec->set/revert markers in repo ----"
rg -n "vec->set|revert vec|HashSet<Address>|HashSet<.*Address|HashSet" src || true

echo "---- locate where EventFilter.addresses is constructed/used ----"
rg -n "EventFilter\s*\{|addresses\s*:" src/rpc/methods/eth/filter src/rpc/methods/eth -S --glob '*.rs' || true

Repository: ChainSafe/forest

Length of output: 22870


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- find EthFilterSpec and ParsedFilter definitions/usages ----"
rg -n "EthFilterSpec\s*\{" --glob '*.rs' | head -n 50
rg -n "ParsedFilter\s*\{" --glob '*.rs' | head -n 50

echo "---- look for construction sites of address lists in eth filter code ----"
rg -n "addresses\s*:\s*vec!\[|addresses\s*:\s*|address(es)?\s*:\s*vec!\[" src/rpc/methods/eth --glob '*.rs' || true

echo "---- search for any HashSet usage for addresses in eth filter code ----"
rg -n "HashSet<.*Address|HashSet" src/rpc/methods/eth/filter --glob '*.rs' || true

Repository: ChainSafe/forest

Length of output: 6095


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show key parts of EventFilter and Matcher logic
EVENT_FILE="src/rpc/methods/eth/filter/event.rs"
MOD_FILE="src/rpc/methods/eth/filter/mod.rs"

echo "==== event.rs: EventFilter struct + impls (lines 1-140) ===="
sed -n '1,160p' "$EVENT_FILE" | nl -ba | sed -n '1,160p' || sed -n '1,160p' "$EVENT_FILE"

echo "==== mod.rs: find impl Matcher for EventFilter and print nearby ===="
# locate exact "impl Matcher for EventFilter"
MATCHER_LINE=$(rg -n "impl Matcher for EventFilter" "$MOD_FILE" | cut -d: -f1 | head -n1 || true)
echo "impl Matcher for EventFilter line: ${MATCHER_LINE:-<none>}"

if [[ -n "${MATCHER_LINE:-}" ]]; then
  start=$((MATCHER_LINE-40))
  end=$((MATCHER_LINE+120))
  sed -n "${start},${end}p" "$MOD_FILE" | nl -ba | sed -n "1,200p" || sed -n "${start},${end}p" "$MOD_FILE"
fi

echo "==== mod.rs: search for .contains( with addresses ===="
rg -n "addresses\.contains|\.addresses\.contains|contains\(\s*resolved|contains\(" "$MOD_FILE" | head -n 50 || true

echo "==== sample addresses list constructions (Vec sizes) ===="
# Count simple patterns of vec![...] used for addresses in filter code/tests
rg -n "addresses:\s*vec!\[" "$MOD_FILE" "$EVENT_FILE" --glob '*.rs' | head -n 50

echo "==== sample sizes in constructed address vectors across repo (limit output) ===="
rg -n "addresses:\s*vec!\[" src --glob '*.rs' | head -n 50

Repository: ChainSafe/forest

Length of output: 11386


🏁 Script executed:

#!/bin/bash
set -euo pipefail

EVENT_FILE="src/rpc/methods/eth/filter/event.rs"
MOD_FILE="src/rpc/methods/eth/filter/mod.rs"

echo "==== event.rs: EventFilter struct + impls (lines 1-160) ===="
awk 'NR<=160{printf "%5d:%s\n", NR, $0}' "$EVENT_FILE"

echo "==== mod.rs: impl Matcher for EventFilter ===="
MATCHER_LINE=$(rg -n "impl Matcher for EventFilter" "$MOD_FILE" | cut -d: -f1 | head -n1 || true)
echo "impl Matcher for EventFilter line: ${MATCHER_LINE:-<none>}"

if [[ -n "${MATCHER_LINE:-}" ]]; then
  start=$((MATCHER_LINE-60))
  end=$((MATCHER_LINE+160))
  awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%5d:%s\n", NR, $0}' "$MOD_FILE"
fi

echo "==== mod.rs: where membership uses contains ===="
rg -n "addresses\.contains|\.addresses\.contains|contains\(\s*resolved|contains\(" "$MOD_FILE" | head -n 80 || true

echo "==== sample addresses: vec![...] in filter code/tests ===="
rg -n "addresses:\s*vec!\[" "$MOD_FILE" "$EVENT_FILE" --glob '*.rs' | head -n 80 || true

echo "==== sample addresses: vec![...] across src ===="
rg -n "addresses:\s*vec!\[" src --glob '*.rs' | head -n 80 || true

Repository: ChainSafe/forest

Length of output: 15447


🏁 Script executed:

#!/bin/bash
set -euo pipefail
MOD_FILE="src/rpc/methods/eth/filter/mod.rs"

echo "=== rg 'addresses.contains' ==="
rg -n "addresses\.contains" "$MOD_FILE" --glob '*.rs' || true

echo "=== rg 'resolved' and 'contains' in Matcher::matches ==="
rg -n "fn matches|resolved|addresses\.contains" "$MOD_FILE" | head -n 120 || true

Repository: ChainSafe/forest

Length of output: 1062


🏁 Script executed:

#!/bin/bash
set -euo pipefail
MOD_FILE="src/rpc/methods/eth/filter/mod.rs"

echo "== locations of 'fn matches' in mod.rs (first 30) =="
rg -n "fn matches" "$MOD_FILE" | head -n 30

echo "== print each 'fn matches' block context around EventFilter/ParsedFilter matchers =="
# For each line with fn matches, print ~30 lines after to capture addresses matching.
# Limit to first 6 occurrences.
COUNT=0
while read -r line; do
  COUNT=$((COUNT+1))
  lineno=$(echo "$line" | cut -d: -f1)
  echo "--- fn matches at $lineno ---"
  start=$((lineno-20))
  end=$((lineno+60))
  awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%5d:%s\n", NR, $0}' "$MOD_FILE" | head -n 140
  if [[ $COUNT -ge 6 ]]; then break; fi
done < <(rg -n "fn matches" "$MOD_FILE" | head -n 6)

Repository: ChainSafe/forest

Length of output: 353


🏁 Script executed:

#!/bin/bash
set -euo pipefail
MOD_FILE="src/rpc/methods/eth/filter/mod.rs"

echo "=== first 80 occurrences of 'contains(' in filter/mod.rs ==="
rg -n "\.contains\(" "$MOD_FILE" | head -n 80 || true

echo "=== occurrences containing 'addresses' and 'contains' together (same line) ==="
rg -n "addresses.*contains|contains.*addresses" "$MOD_FILE" || true

echo "=== occurrences where contains is used near 'addresses' (within 5 lines) ==="
rg -n -C5 "addresses" "$MOD_FILE" | rg -n "contains\(" | head -n 40 || true

Repository: ChainSafe/forest

Length of output: 821


Fix PR title / clarify Vec vs HashSet for EventFilter.addresses membership

EventFilter defines pub addresses: Vec<Address> and Matcher for EventFilter performs membership via self.addresses.contains(...) (linear scan); the same Vec::contains pattern is used in Matcher for ParsedFilter. This contradicts the PR title claiming to prefer HashSet::contains.

Either update the PR title/description to reflect the intentional Vec choice (for Lotus-compatible behavior/serialization) or switch to building an internal HashSet for the hot-path membership check while keeping the external Vec shape.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/eth/filter/event.rs` at line 20, EventFilter currently
exposes pub addresses: Vec<Address> but uses Vec::contains in its
Matcher/ParsedFilter match logic (linear scan); to keep external Vec shape for
compatibility while improving hot-path membership checks, add an internal
HashSet<Address> constructed once when creating the matcher (e.g., in the
Matcher::new or ParsedFilter::new path) and switch the membership checks in the
Matcher impl and ParsedFilter match methods from self.addresses.contains(...) to
this internal_set.contains(...); keep the public addresses: Vec<Address> as-is
so serialization/compatibility doesn't change, but use the HashSet for all
hot-path .contains lookups and ensure duplicates are deduplicated when building
the set.

Source: Linked repositories

src/rpc/methods/eth/filter/mod.rs (1)

596-596: ⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

Performance concern: Vec.contains() in topic matching hot path.

This line performs vec.contains(&hash) where EthHashList::List now uses Vec instead of HashSet. The matches() method is invoked per entry per event, making this an O(n) membership check in a hot path. If topic lists contain many hashes, this could create a performance bottleneck.

The linked repositories research indicates Lotus defines EthHashList as a slice, explaining the Vec choice for interoperability. However, the performance tradeoff should be measured, especially for filters with large topic lists.

Consider: if topic lists are typically small (1-5 items), the overhead may be negligible. If they can be large (dozens or hundreds), converting to HashSet for membership checks would provide significant speedup.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/eth/filter/mod.rs` at line 596, The hot-path call
EthHashList::List(vec) uses vec.contains(&hash) inside matches(), causing O(n)
checks per event; change the membership test to use a HashSet or a cached lookup
structure to make it O(1): either modify EthHashList to hold a HashSet for List
variants or, inside the matches() method, convert the Vec to a HashSet once (or
maintain a cached HashSet per filter) before iterating events; update references
to EthHashList::List and matches() accordingly and ensure any conversion cost is
amortized or benchmarked for typical topic-list sizes.

Source: Linked repositories

@hanabi1224 hanabi1224 enabled auto-merge June 10, 2026 10:21
@hanabi1224 hanabi1224 added this pull request to the merge queue Jun 10, 2026
Merged via the queue into main with commit 0ad035e Jun 10, 2026
32 of 33 checks passed
@hanabi1224 hanabi1224 deleted the hm/eth-filter-hashset-contains branch June 10, 2026 11:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants