fix: unify HashSet and HashMap usages#7159
Conversation
WalkthroughThis PR enforces stricter ChangesImport consolidation and Clippy enforcement
🎯 2 (Simple) | ⏱️ ~12 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/rpc/methods/eth/types.rs`:
- Around line 488-493: The public RPC enum EthHashList should preserve
wire-level list semantics: change the List variant back to List(Vec<EthHash>)
(keep Single(Option<EthHash>) as-is) so serialized JSON retains order and
duplicates; then update call sites that need set-operations to convert that Vec
into a HashSet for internal matching/parsing (e.g., when building topic_set use
hashes.iter().copied().collect::<HashSet<_>>() and use .contains(&hash)),
ensuring only internal logic uses HashSet while the EthHashList type stays
Vec-based for RPC compatibility.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 722679c1-3de2-4e42-a867-bd84b60d7588
📒 Files selected for processing (25)
.clippy.tomlsrc/chain/store/base_fee.rssrc/db/blockstore_with_write_buffer.rssrc/interpreter/vm.rssrc/lib.rssrc/libp2p_bitswap/request_manager.rssrc/message_pool/msgpool/republish.rssrc/message_pool/msgpool/selection.rssrc/rpc/auth_layer.rssrc/rpc/methods/eth.rssrc/rpc/methods/eth/filter/event.rssrc/rpc/methods/eth/filter/mempool.rssrc/rpc/methods/eth/filter/mod.rssrc/rpc/methods/eth/filter/store.rssrc/rpc/methods/eth/filter/tipset.rssrc/rpc/methods/eth/trace/geth.rssrc/rpc/methods/eth/trace/state_diff.rssrc/rpc/methods/eth/types.rssrc/rpc/methods/eth/utils.rssrc/rpc/methods/mpool.rssrc/rpc/methods/state.rssrc/rpc/registry/actors_reg.rssrc/rpc/registry/methods_reg.rssrc/state_manager/message_search.rssrc/tool/subcommands/api_cmd/report.rs
🔗 Linked repositories identified
CodeRabbit considers these linked repositories for cross-repo context during reviews:
filecoin-project/lotus(manual)
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files
... and 3 files with indirect coverage changes Continue to review full report in Codecov by Harness.
🚀 New features to boost your workflow:
|
HashSet::contains over Vec::contains in eth filtersHashSet and HashMap usages
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/rpc/methods/eth/filter/event.rs`:
- Line 20: EventFilter currently exposes pub addresses: Vec<Address> but uses
Vec::contains in its Matcher/ParsedFilter match logic (linear scan); to keep
external Vec shape for compatibility while improving hot-path membership checks,
add an internal HashSet<Address> constructed once when creating the matcher
(e.g., in the Matcher::new or ParsedFilter::new path) and switch the membership
checks in the Matcher impl and ParsedFilter match methods from
self.addresses.contains(...) to this internal_set.contains(...); keep the public
addresses: Vec<Address> as-is so serialization/compatibility doesn't change, but
use the HashSet for all hot-path .contains lookups and ensure duplicates are
deduplicated when building the set.
In `@src/rpc/methods/eth/filter/mod.rs`:
- Line 596: The hot-path call EthHashList::List(vec) uses vec.contains(&hash)
inside matches(), causing O(n) checks per event; change the membership test to
use a HashSet or a cached lookup structure to make it O(1): either modify
EthHashList to hold a HashSet for List variants or, inside the matches() method,
convert the Vec to a HashSet once (or maintain a cached HashSet per filter)
before iterating events; update references to EthHashList::List and matches()
accordingly and ensure any conversion cost is amortized or benchmarked for
typical topic-list sizes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9d8cb73a-543d-4c6f-b6e5-78970e260235
⛔ Files ignored due to path filters (3)
src/rpc/snapshots/forest__rpc__tests__rpc__v0.snapis excluded by!**/*.snapsrc/rpc/snapshots/forest__rpc__tests__rpc__v1.snapis excluded by!**/*.snapsrc/rpc/snapshots/forest__rpc__tests__rpc__v2.snapis excluded by!**/*.snap
📒 Files selected for processing (3)
src/rpc/methods/eth.rssrc/rpc/methods/eth/filter/event.rssrc/rpc/methods/eth/filter/mod.rs
🔗 Linked repositories identified
CodeRabbit considers these linked repositories for cross-repo context during reviews:
filecoin-project/lotus(manual)
✅ Files skipped from review due to trivial changes (1)
- src/rpc/methods/eth.rs
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/rpc/methods/eth/filter/event.rs`:
- Line 20: EventFilter currently exposes pub addresses: Vec<Address> but uses
Vec::contains in its Matcher/ParsedFilter match logic (linear scan); to keep
external Vec shape for compatibility while improving hot-path membership checks,
add an internal HashSet<Address> constructed once when creating the matcher
(e.g., in the Matcher::new or ParsedFilter::new path) and switch the membership
checks in the Matcher impl and ParsedFilter match methods from
self.addresses.contains(...) to this internal_set.contains(...); keep the public
addresses: Vec<Address> as-is so serialization/compatibility doesn't change, but
use the HashSet for all hot-path .contains lookups and ensure duplicates are
deduplicated when building the set.
In `@src/rpc/methods/eth/filter/mod.rs`:
- Line 596: The hot-path call EthHashList::List(vec) uses vec.contains(&hash)
inside matches(), causing O(n) checks per event; change the membership test to
use a HashSet or a cached lookup structure to make it O(1): either modify
EthHashList to hold a HashSet for List variants or, inside the matches() method,
convert the Vec to a HashSet once (or maintain a cached HashSet per filter)
before iterating events; update references to EthHashList::List and matches()
accordingly and ensure any conversion cost is amortized or benchmarked for
typical topic-list sizes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9d8cb73a-543d-4c6f-b6e5-78970e260235
⛔ Files ignored due to path filters (3)
src/rpc/snapshots/forest__rpc__tests__rpc__v0.snapis excluded by!**/*.snapsrc/rpc/snapshots/forest__rpc__tests__rpc__v1.snapis excluded by!**/*.snapsrc/rpc/snapshots/forest__rpc__tests__rpc__v2.snapis excluded by!**/*.snap
📒 Files selected for processing (3)
src/rpc/methods/eth.rssrc/rpc/methods/eth/filter/event.rssrc/rpc/methods/eth/filter/mod.rs
🔗 Linked repositories identified
CodeRabbit considers these linked repositories for cross-repo context during reviews:
filecoin-project/lotus(manual)
✅ Files skipped from review due to trivial changes (1)
- src/rpc/methods/eth.rs
🛑 Comments failed to post (2)
src/rpc/methods/eth/filter/event.rs (1)
20-20:
⚠️ Potential issue | 🟠 Major🧩 Analysis chain
🏁 Script executed:
#!/bin/bash # Check for EthFilterSpec and ParsedFilter construction patterns to estimate typical address list sizes rg -nC3 'EthFilterSpec\s*\{' --type=rs | head -50 rg -nC3 'ParsedFilter\s*\{' --type=rs | head -50Repository: ChainSafe/forest
Length of output: 114
🏁 Script executed:
#!/bin/bash set -euo pipefail FILE="src/rpc/methods/eth/filter/event.rs" ls -la "$FILE" # Print around the addresses field definition and the Matcher::matches implementation. rg -n "pub addresses|struct EventFilter|trait Matcher|fn matches" "$FILE" || true echo "---- addresses field context ----" nl -ba "$FILE" | sed -n '1,120p' echo "---- Matcher::matches context (search for matches fn) ----" MATCH_LINE=$(rg -n "fn matches" "$FILE" | head -n1 | cut -d: -f1) echo "matches line: $MATCH_LINE" nl -ba "$FILE" | sed -n "$((MATCH_LINE-40)),$((MATCH_LINE+80))p" echo "---- Look for HashSet usage in this module ----" rg -n "HashSet|contains\(" src/rpc/methods/eth/filter "$FILE" || trueRepository: ChainSafe/forest
Length of output: 340
🏁 Script executed:
#!/bin/bash set -euo pipefail FILE="src/rpc/methods/eth/filter/event.rs" echo "---- serialization/derive info for EventFilter ----" rg -n "struct EventFilter|derive\(|Serialize|Deserialize|serde" "$FILE" || true echo "---- grep for vec->set/revert markers in repo ----" rg -n "vec->set|revert vec|HashSet<Address>|HashSet<.*Address" src || true echo "---- locate where EventFilter.addresses is constructed/used ----" rg -n "EventFilter\s*\{|addresses\s*:" src/rpc/methods/eth/filter src/rpc/methods/eth -S || trueRepository: ChainSafe/forest
Length of output: 4492
🏁 Script executed:
#!/bin/bash set -euo pipefail echo "---- find EthFilterSpec and ParsedFilter definitions/usages ----" rg -n "EthFilterSpec\s*\{" --glob '*.rs' | head -n 50 rg -n "ParsedFilter\s*\{" --glob '*.rs' | head -n 50 echo "---- look for construction sites of address lists in eth filter code ----" rg -n "addresses\s*:\s*vec!\[|addresses\s*:\s*|address(es)?\s*:\s*vec!\[" src/rpc/methods/eth --glob '*.rs' || true echo "---- search for any HashSet usage for addresses in eth filter code ----" rg -n "HashSet<.*Address|HashSet" src/rpc/methods/eth/filter --glob '*.rs' || trueRepository: ChainSafe/forest
Length of output: 6095
🏁 Script executed:
#!/bin/bash set -euo pipefail FILE="src/rpc/methods/eth/filter/event.rs" ls -la "$FILE" rg -n "pub addresses|struct EventFilter|trait Matcher|fn matches" "$FILE" || true echo "---- addresses field context ----" nl -ba "$FILE" | sed -n '1,140p' echo "---- Matcher::matches context (search for matches fn) ----" MATCH_LINES=$(rg -n "fn matches" "$FILE" | cut -d: -f1 | head -n 5 || true) echo "matches lines: $MATCH_LINES" FIRST_LINE=$(echo "$MATCH_LINES" | head -n1) if [[ -n "${FIRST_LINE:-}" ]]; then nl -ba "$FILE" | sed -n "$((FIRST_LINE-40)),$((FIRST_LINE+100))p" fi echo "---- Look for HashSet usage in eth/filter subtree ----" rg -n "HashSet|contains\(" src/rpc/methods/eth/filter --glob '*.rs' || trueRepository: ChainSafe/forest
Length of output: 340
🏁 Script executed:
#!/bin/bash set -euo pipefail FILE="src/rpc/methods/eth/filter/event.rs" echo "---- serialization/derive info for EventFilter ----" rg -n "struct EventFilter|derive\(|Serialize|Deserialize|serde" "$FILE" || true echo "---- grep for vec->set/revert markers in repo ----" rg -n "vec->set|revert vec|HashSet<Address>|HashSet<.*Address|HashSet" src || true echo "---- locate where EventFilter.addresses is constructed/used ----" rg -n "EventFilter\s*\{|addresses\s*:" src/rpc/methods/eth/filter src/rpc/methods/eth -S --glob '*.rs' || trueRepository: ChainSafe/forest
Length of output: 22870
🏁 Script executed:
#!/bin/bash set -euo pipefail echo "---- find EthFilterSpec and ParsedFilter definitions/usages ----" rg -n "EthFilterSpec\s*\{" --glob '*.rs' | head -n 50 rg -n "ParsedFilter\s*\{" --glob '*.rs' | head -n 50 echo "---- look for construction sites of address lists in eth filter code ----" rg -n "addresses\s*:\s*vec!\[|addresses\s*:\s*|address(es)?\s*:\s*vec!\[" src/rpc/methods/eth --glob '*.rs' || true echo "---- search for any HashSet usage for addresses in eth filter code ----" rg -n "HashSet<.*Address|HashSet" src/rpc/methods/eth/filter --glob '*.rs' || trueRepository: ChainSafe/forest
Length of output: 6095
🏁 Script executed:
#!/bin/bash set -euo pipefail # Show key parts of EventFilter and Matcher logic EVENT_FILE="src/rpc/methods/eth/filter/event.rs" MOD_FILE="src/rpc/methods/eth/filter/mod.rs" echo "==== event.rs: EventFilter struct + impls (lines 1-140) ====" sed -n '1,160p' "$EVENT_FILE" | nl -ba | sed -n '1,160p' || sed -n '1,160p' "$EVENT_FILE" echo "==== mod.rs: find impl Matcher for EventFilter and print nearby ====" # locate exact "impl Matcher for EventFilter" MATCHER_LINE=$(rg -n "impl Matcher for EventFilter" "$MOD_FILE" | cut -d: -f1 | head -n1 || true) echo "impl Matcher for EventFilter line: ${MATCHER_LINE:-<none>}" if [[ -n "${MATCHER_LINE:-}" ]]; then start=$((MATCHER_LINE-40)) end=$((MATCHER_LINE+120)) sed -n "${start},${end}p" "$MOD_FILE" | nl -ba | sed -n "1,200p" || sed -n "${start},${end}p" "$MOD_FILE" fi echo "==== mod.rs: search for .contains( with addresses ====" rg -n "addresses\.contains|\.addresses\.contains|contains\(\s*resolved|contains\(" "$MOD_FILE" | head -n 50 || true echo "==== sample addresses list constructions (Vec sizes) ====" # Count simple patterns of vec![...] used for addresses in filter code/tests rg -n "addresses:\s*vec!\[" "$MOD_FILE" "$EVENT_FILE" --glob '*.rs' | head -n 50 echo "==== sample sizes in constructed address vectors across repo (limit output) ====" rg -n "addresses:\s*vec!\[" src --glob '*.rs' | head -n 50Repository: ChainSafe/forest
Length of output: 11386
🏁 Script executed:
#!/bin/bash set -euo pipefail EVENT_FILE="src/rpc/methods/eth/filter/event.rs" MOD_FILE="src/rpc/methods/eth/filter/mod.rs" echo "==== event.rs: EventFilter struct + impls (lines 1-160) ====" awk 'NR<=160{printf "%5d:%s\n", NR, $0}' "$EVENT_FILE" echo "==== mod.rs: impl Matcher for EventFilter ====" MATCHER_LINE=$(rg -n "impl Matcher for EventFilter" "$MOD_FILE" | cut -d: -f1 | head -n1 || true) echo "impl Matcher for EventFilter line: ${MATCHER_LINE:-<none>}" if [[ -n "${MATCHER_LINE:-}" ]]; then start=$((MATCHER_LINE-60)) end=$((MATCHER_LINE+160)) awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%5d:%s\n", NR, $0}' "$MOD_FILE" fi echo "==== mod.rs: where membership uses contains ====" rg -n "addresses\.contains|\.addresses\.contains|contains\(\s*resolved|contains\(" "$MOD_FILE" | head -n 80 || true echo "==== sample addresses: vec![...] in filter code/tests ====" rg -n "addresses:\s*vec!\[" "$MOD_FILE" "$EVENT_FILE" --glob '*.rs' | head -n 80 || true echo "==== sample addresses: vec![...] across src ====" rg -n "addresses:\s*vec!\[" src --glob '*.rs' | head -n 80 || trueRepository: ChainSafe/forest
Length of output: 15447
🏁 Script executed:
#!/bin/bash set -euo pipefail MOD_FILE="src/rpc/methods/eth/filter/mod.rs" echo "=== rg 'addresses.contains' ===" rg -n "addresses\.contains" "$MOD_FILE" --glob '*.rs' || true echo "=== rg 'resolved' and 'contains' in Matcher::matches ===" rg -n "fn matches|resolved|addresses\.contains" "$MOD_FILE" | head -n 120 || trueRepository: ChainSafe/forest
Length of output: 1062
🏁 Script executed:
#!/bin/bash set -euo pipefail MOD_FILE="src/rpc/methods/eth/filter/mod.rs" echo "== locations of 'fn matches' in mod.rs (first 30) ==" rg -n "fn matches" "$MOD_FILE" | head -n 30 echo "== print each 'fn matches' block context around EventFilter/ParsedFilter matchers ==" # For each line with fn matches, print ~30 lines after to capture addresses matching. # Limit to first 6 occurrences. COUNT=0 while read -r line; do COUNT=$((COUNT+1)) lineno=$(echo "$line" | cut -d: -f1) echo "--- fn matches at $lineno ---" start=$((lineno-20)) end=$((lineno+60)) awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%5d:%s\n", NR, $0}' "$MOD_FILE" | head -n 140 if [[ $COUNT -ge 6 ]]; then break; fi done < <(rg -n "fn matches" "$MOD_FILE" | head -n 6)Repository: ChainSafe/forest
Length of output: 353
🏁 Script executed:
#!/bin/bash set -euo pipefail MOD_FILE="src/rpc/methods/eth/filter/mod.rs" echo "=== first 80 occurrences of 'contains(' in filter/mod.rs ===" rg -n "\.contains\(" "$MOD_FILE" | head -n 80 || true echo "=== occurrences containing 'addresses' and 'contains' together (same line) ===" rg -n "addresses.*contains|contains.*addresses" "$MOD_FILE" || true echo "=== occurrences where contains is used near 'addresses' (within 5 lines) ===" rg -n -C5 "addresses" "$MOD_FILE" | rg -n "contains\(" | head -n 40 || trueRepository: ChainSafe/forest
Length of output: 821
Fix PR title / clarify Vec vs HashSet for
EventFilter.addressesmembership
EventFilterdefinespub addresses: Vec<Address>andMatcher for EventFilterperforms membership viaself.addresses.contains(...)(linear scan); the sameVec::containspattern is used inMatcher for ParsedFilter. This contradicts the PR title claiming to preferHashSet::contains.Either update the PR title/description to reflect the intentional
Vecchoice (for Lotus-compatible behavior/serialization) or switch to building an internalHashSetfor the hot-path membership check while keeping the externalVecshape.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rpc/methods/eth/filter/event.rs` at line 20, EventFilter currently exposes pub addresses: Vec<Address> but uses Vec::contains in its Matcher/ParsedFilter match logic (linear scan); to keep external Vec shape for compatibility while improving hot-path membership checks, add an internal HashSet<Address> constructed once when creating the matcher (e.g., in the Matcher::new or ParsedFilter::new path) and switch the membership checks in the Matcher impl and ParsedFilter match methods from self.addresses.contains(...) to this internal_set.contains(...); keep the public addresses: Vec<Address> as-is so serialization/compatibility doesn't change, but use the HashSet for all hot-path .contains lookups and ensure duplicates are deduplicated when building the set.Source: Linked repositories
src/rpc/methods/eth/filter/mod.rs (1)
596-596:
⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoffPerformance concern: Vec.contains() in topic matching hot path.
This line performs
vec.contains(&hash)whereEthHashList::Listnow usesVecinstead ofHashSet. Thematches()method is invoked per entry per event, making this an O(n) membership check in a hot path. If topic lists contain many hashes, this could create a performance bottleneck.The linked repositories research indicates Lotus defines
EthHashListas a slice, explaining the Vec choice for interoperability. However, the performance tradeoff should be measured, especially for filters with large topic lists.Consider: if topic lists are typically small (1-5 items), the overhead may be negligible. If they can be large (dozens or hundreds), converting to HashSet for membership checks would provide significant speedup.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rpc/methods/eth/filter/mod.rs` at line 596, The hot-path call EthHashList::List(vec) uses vec.contains(&hash) inside matches(), causing O(n) checks per event; change the membership test to use a HashSet or a cached lookup structure to make it O(1): either modify EthHashList to hold a HashSet for List variants or, inside the matches() method, convert the Vec to a HashSet once (or maintain a cached HashSet per filter) before iterating events; update references to EthHashList::List and matches() accordingly and ensure any conversion cost is amortized or benchmarked for typical topic-list sizes.Source: Linked repositories
Summary of changes
There're quite a fewcontainsinMatcherimplementations, whereHashSetshould be more efficient thanVecPartially ports 7df7237 from #6720
Changes introduced in this pull request:
Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Outside contributions
Summary by CodeRabbit