fix: stream large Keras HDF5 inspection#1663
Conversation
|
@codex review |
Performance BenchmarksCompared
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e8da917608
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Independent review of exact head
Strengthen pinned real-model QA to assert actual Keras layout/roots and an immutable digest, not only a passing generic format check. Full report: @codex address these blockers, add regressions and bounded-memory proof, rerun pinned QA + focused/broad validation, push, and request exact-head review. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9ece3b123f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex incorporate this additional pinned large-HDF5 regression before merge:
|
|
Summary
Testing
Committed changes on the current branch with commit |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14c0248cb0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
PR #1663 independent re-reviewMerge dispositionREQUEST CHANGES — DO NOT MERGE exact head 14c0248. Two security/correctness blockers remain after the remediation push:
A separate same-file VDS false positive causes benign aliases to emit CVE-2026-1669 and exit 1. The prior generic SoftLink and changelog blockers are fixed. Review identity and live state
FindingsP1 — Keras 3 /vars and /optimizer/vars external references still pass cleanAffected control: modelaudit/scanners/keras_h5_scanner.py:911-1039, especially the Keras 3 root enumeration at lines 1008-1039. Traversal only examines the returned roots at lines 1368-1431. The remediation adds direct VDS inspection, but its Keras 3 root discovery only enumerates layers//vars. It does not include:
Official Keras v3.13.1 evidence is in keras/src/saving/saving_lib.py: H5IOStore.get reads /vars for an empty path and path/vars for child saveables, while _load_state recursively visits children such as optimizers. I reproduced two independent 536,871,936-byte exact-head artifacts. Both contained an ordinary layers/dense/vars/0 dataset to establish Keras 3 ownership.
The exact base rejects the same large Keras 3 shape with max_file_read_size_exceeded and exit 2. This PR therefore changes a fail-closed result into a clean false negative on loader-consumed paths. This validates the newest live P1 thread, “Scan all Keras 3 vars roots before lifting size cap.” It is not fixed at the requested head. Required fix: derive all proven Keras 3 loader-consumed saveable roots under a bounded ownership strategy, at minimum root /vars and /optimizer/vars, and test ExternalLink, external storage, external VDS, same-file VDS, mixed VDS, truncation, and clean controls for each independently reachable root. P1 — Oversized name metadata remains native-memory-unboundedAffected control: modelaudit/scanners/keras_h5_scanner.py:1075-1115. Commit f9444f6 correctly avoids attrs[name] and Python list conversion after the dataspace/storage checks. However, the helper first performs attribute membership and attrs.get_id. With dense fixed-width attributes, the native HDF5/h5py layer allocates memory proportional to the encoded attribute before the code can reject its point count or storage size. Exact-head isolated subprocess measurements for valid HDF5 files sparsely inflated to 536,871,936 bytes:
For the 64 MiB attribute, exact base stayed at 75,464,704 bytes because it rejected before h5py metadata access; exact head reached 202,866,688 bytes. A component probe placed h5py.File open near 39 MiB but attribute membership/get-id near 167 MiB, proving the residual growth occurs before Python decoding. The scan now returns inconclusive rather than materializing the full Python array, which is an improvement, but a sufficiently large attribute can still OOM a worker. The production behavior is not bounded by the advertised 10 MiB/4,096-item limits. The added regression at tests/scanners/test_keras_h5_scanner.py:980-1015 uses a tiny two-element attribute and monkeypatches AttributeManager.getitem. It proves only that Python-level conversion is skipped; it cannot detect native allocation during membership/get_id. This prior P1 blocker is partially addressed but remains merge-blocking. Required fix: retain a conservative cap, inspect risky HDF5 parsing in a hard memory-limited subprocess, or add a pre-h5py mechanism that can reject oversized attribute storage without opening it through the allocating native path. Add a real dense-attribute RSS/memory-limit regression. P2 — Same-file VDS is falsely reported as CVE-2026-1669Affected control: modelaudit/scanners/keras_h5_scanner.py:1271-1313. record_virtual_dataset_sources increments external_reference_count for every virtual dataset before checking source filenames. HDF5 filename . means the virtual source is in the same file and does not cross the external-file boundary. A 536,871,936-byte Keras HDF5 with model_weights/dense/kernel mapped to VirtualSource(".", "/source") produced:
An equivalent mapping to external.h5 also produced exit 1, as expected. The unchanged Keras ZIP implementation already has the correct distinction at modelaudit/scanners/keras_zip_scanner.py:3724-3727: it skips source_filename == "." before incrementing the external count. This validates the newest live P2 thread, “Ignore same-file virtual dataset sources.” Required fix: filter same-file VDS mappings before counting or reporting external sources, keep the dataset clean when all mappings are same-file, preserve external findings for mixed mappings, and fail closed if the bounded source walk cannot determine whether unvisited mappings are external. P3 validation gap — pinned real Hugging Face QA is still incompleteAffected tests: tests/scanners/test_keras_h5_scanner.py:1018-1055. The XLM-R test is stronger than the prior head: it now asserts exact root keys, exact root attributes, bounded layer_names decoding, and layer_names == ["roberta"]. That closes the prior concern that a generic HDF5 pass alone could satisfy the test. Two requested pieces are still absent at exact head:
The connector comment references commit 7290161f9927df61c2422d6992ac3c6cb66dee66, but that commit is not in the exact four-commit PR head. The current branch contains only e8da917, 9ece3b1, f9444f6, and 14c0248. The opt-in XLM-R test was skipped locally because no cached 2.24 GB artifact was present and huggingface.co is unavailable from this environment. This gap does not supersede the reproduced source bugs above, but it means realistic multi-GB Hugging Face compatibility is not independently closed. Prior-blocker reconciliation
Review-thread reconciliationLive review threads inspected at exact head:
No GitHub comments or reviews were posted by this re-review. Exact-head validationStatic checks
Focused tests
Adversarial end-to-end probes
CI snapshotAt the final snapshot, passing checks included CodeQL, CodeQL Security Analysis, lint/format, type check, dependency audit, build/package, Docker lightweight/TensorFlow images and aggregate Docker success, benchmarks, documentation formatting, vendored-proto tests/reproducibility, PR title, and Python 3.13. Still pending at the snapshot:
Skipped matrix jobs were reported as skipped rather than failed. CI therefore was not fully complete, but the merge disposition is already request-changes based on deterministic exact-head reproductions. Merge recommendationDo not merge 14c0248. Before another merge-gate review:
|
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep them coming! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep them coming! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
Independent exact-head re-review of
Prior |
|
Pinned real-model QA from Hugging Face rank 255 reproduces the large-file coverage blocker on current main.
Positive control: the 6,173,264,923-byte |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7f2efbfdc5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep it up! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
|
Codex Review: Didn't find any major issues. 👍 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 91bcec711d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep it up! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
Independent review of exact head
Validation: 322 Keras HDF5 tests passed with three opt-in HF skips. Custom QA covered malformed configs, Lambda code, userblock HDF5, extensionless/HF paths, link cycles, VDS, sparse/chunked/compressed datasets, and direct/archive parity. Exact-head CI is green and threads are resolved, but the P1 false-clean behavior blocks merge. |
…t33-large-keras-h5-streaming-20260610 # Conflicts: # modelaudit/core.py
…t33-large-keras-h5-streaming-20260610
…t33-large-keras-h5-streaming-20260610 # Conflicts: # modelaudit/core.py
…t33-large-keras-h5-streaming-20260610
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aeae89f400
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…t33-large-keras-h5-streaming-20260610
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dda569c5f1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2a49afb026
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 08924c1c79
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…t33-large-keras-h5-streaming-20260610
Summary
h5pymetadata/link traversal instead of the generic 512 MiB whole-file read gate.Root Cause
KerasH5Scannerinherited the defaultBaseScannermax_file_read_sizeof 512 MiB, and core also attempted top-level hashing before scanner dispatch. The pinned 2.24 GB valid HDF5 model was therefore rejected beforeh5py.File(...)could perform file-backed inspection.Baseline on
origin/mainat8d6c4864fe2ea833ceaef1b9803d225afb1e8d69:{"success": false, "exit_code": 2, "bytes_scanned": 0, "scan_outcome_reasons": ["max_file_read_size_exceeded"], "file_size": 2240076248}Security Tradeoffs
Large HDF5 files now skip whole-file integrity hashing and report
content_hash=nullat the aggregate level, because hashing would reintroduce the same full-file read. Small H5 files keep the existing file integrity hash. Keras H5 inspection remains bounded by explicit HDF5 JSON/name attribute, link traversal, SoftLink resolution, virtual dataset source, and external-reference reporting limits; ambiguous or malformed evidence remains inconclusive unless a security finding is already present.Pinned Real-Model QA
Pinned model:
Opt-in regression:
Outcome:
Direct terminal scan outcome on the cached pinned file:
{"content_hash": null, "elapsed_seconds": 0.869, "exit_code": 0, "file_size": 2240076248, "files_scanned": 1, "max_rss_kb": 96920, "scan_outcome": null, "scan_outcome_reasons": null, "scanner_names": ["keras_h5"], "success": true}The real-model test also asserts the bounded HDF5 metadata shape: root keys
["roberta", "top_level_model_weights"], attrs{"backend", "keras_version", "layer_names"}, and boundedlayer_names == ["roberta"].Malicious And Malformed Controls
CVE-2025-9905and exits with security findings.model_configfails closed as inconclusive withoutmax_file_read_size_exceeded.CVE-2026-1669.CVE-2026-1669.CVE-2026-1669.varsandoptimizer/varsexternal-link/VDS controls still reportCVE-2026-1669.layer_names/weight_namesattributes fail closed before materialization./layersSoftLinks stay clean instead of being treated as Keras weights.Validation
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/ uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/ uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/ PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_keras_h5_scanner.py -q MODELAUDIT_RUN_REAL_HF_H5=1 MODELAUDIT_HF_CACHE_DIR=../hf-cache PROMPTFOO_DISABLE_TELEMETRY=1 HF_HUB_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_keras_h5_scanner.py::test_real_hf_xlm_roberta_large_h5_reaches_keras_scan_without_read_cap -q PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest -n auto -m "not slow and not integration" --maxfail=1 git diff --checkResults: