Skip to content

Fenrir fixes 2026 06 09#791

Merged
mattia-moffa merged 12 commits into
wolfSSL:masterfrom
danielinux:fenrir-fixes-2026-06-09
Jun 9, 2026
Merged

Fenrir fixes 2026 06 09#791
mattia-moffa merged 12 commits into
wolfSSL:masterfrom
danielinux:fenrir-fixes-2026-06-09

Conversation

@danielinux

Copy link
Copy Markdown
Member

3743b00 F-4789: fix uint64_t overflow in squashelf range filter segment end
ae5ee14 F-4790: clamp OTP pubkey_size in keystore_get_size to prevent OOB read
d37dc6c F-4973: clamp TPM-supplied nvPublic.dataSize before NV read-back in rot.c
191f7c8 F-5129: fix stm32h7 hal_flash_erase Bank 2 sector underflow
2affb7d F-5131: fix stale delta inverse-patch offset when cert chain expands header
8c52619 F-5132: complete mpusize() table so MPU stays enabled for >64KB wolfBoot
5044d6f F-5352: emit 4-byte delta size TLVs from Python signer
bfa7311 F-5356: bound i.MX RT flash write copy to caller buffer length
09fef8f F-5674: cap PCI IO BAR allocator at the 16-bit IO ceiling

Copilot AI review requested due to automatic review settings June 9, 2026 08:46

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR is a batch of security/correctness fixes across wolfBoot tooling and HALs, with accompanying regression tests to prevent reintroducing overflow/OOB and sizing issues.

Changes:

  • Hardened multiple size/offset computations (TPM NV read-back size, squashelf range filter end computation, PCI IO BAR allocation ceiling, i.MX RT flash write source bounds, STM32H7 dual-bank erase sector math, OTP keystore pubkey_size validation, MPU region sizing table completeness).
  • Fixed delta-signing metadata issues (Python signer emits 4-byte delta size TLVs; C signer avoids stale inverse-patch offset when header expands).
  • Added/expanded unit/regression tests for the above behaviors (C unit tests + Python integration-style tests for signers/squashelf).

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tools/unit-tests/unit-sign-delta-tlv.py New regression test ensuring sign.py emits 4-byte delta TLV lengths compatible with bootloader parsing.
tools/unit-tests/unit-sign-delta-cert-inv-off.py New regression test for C signer inverse-patch offset consistency when cert chain expands the delta header.
tools/unit-tests/unit-rot-auth.c Extends TPM ROT-auth unit test harness to validate NV dataSize clamping behavior.
tools/unit-tests/unit-pci.c Adds coverage ensuring PCI IO allocation never programs bridge IO windows past 16-bit IO space.
tools/unit-tests/unit-otp-keystore.c New unit tests for keystore_get_size() rejecting oversized OTP pubkey_size.
tools/unit-tests/unit-mpusize.c New unit tests for mpusize() mapping above 64KB so MPU remains enabled for larger images.
tools/unit-tests/unit-flash-erase-h7.c New unit tests validating STM32H7 Bank2 erase sector arithmetic across multiple sectors.
tools/unit-tests/Makefile Adds new unit test binaries and runs new Python regression tests.
tools/tpm/rot.c Clamps TPM-provided NV public dataSize before NV read-back into a fixed digest buffer.
tools/squashelf/test-range-overflow.py New regression test for crafted ELF64 segment end overflow in range filtering.
tools/squashelf/squashelf.c Prevents uint64 overflow when computing PT_LOAD segment end for range filtering.
tools/squashelf/Makefile Adds test target to run the new squashelf regression test.
tools/keytools/sign.py Emits delta size TLVs with 4-byte value lengths to match bootloader expectations.
tools/keytools/sign.c Pre-resolves header growth due to cert chain before capturing inverse patch offset for delta images.
src/pci.c Caps IO BAR allocation at the 16-bit IO ceiling to avoid bridge base/limit truncation.
src/flash_otp_keystore.c Rejects OTP pubkey_size larger than the fixed cache buffer to prevent OOB reads.
src/boot_arm.c Completes mpusize() table for >64KB regions and adds a unit-test-only compilation guard.
hal/stm32h7.c Fixes Bank2 sector index computation without corrupting loop variable; adds unit-test-only guards.
hal/imx_rt.c Prevents over-reading caller buffer on sub-page writes by bounding memcpy and padding with 0xFF.
.gitignore Updates ignores/un-ignores to include newly added unit test artifacts and scripts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tools/keytools/sign.c
Comment thread tools/tpm/rot.c
Comment thread tools/unit-tests/unit-sign-delta-cert-inv-off.py
Comment thread tools/squashelf/test-range-overflow.py
Comment thread tools/squashelf/test-range-overflow.py
danielinux added 12 commits June 9, 2026 15:50
A device advertising a 64KB IO BAR pushed the IO allocator cursor
(info->io) past 0x10000 because pci_program_bar used 0xffffffff as the
IO BAR limit. The PCI-to-PCI bridge IO base/limit registers are 8-bit
and only carry address bits [15:8], so io_start >> 8 silently narrows a
0x20000 cursor to 0x00, programming a bogus bridge IO window
0x0000-0x0FFF that forwards legacy IO (8259A PIC, 8254 PIT, MC146818
RTC) to the secondary bus.

x86 IO space is 16-bit, so IO BARs must never be allocated above 0xFFFF.
Cap the IO BAR allocator limit at PCI_IO32_LIMIT (0x10000): oversized IO
BARs are now skipped at allocation time and the bridge IO window is
programmed from the real cursor, never narrowing onto legacy IO.

Add test_program_bridge_io_64k_no_narrow proving the bridge IO window is
no longer mis-programmed. The post-enum IO OOM case is removed as the
cap makes that wrap unreachable.
hal_flash_write() programs one full CONFIG_FLASH_PAGE_SIZE (256/512 byte)
page per loop iteration but unconditionally memcpy'd a whole page out of
the caller's buffer regardless of len. Sub-page writes - notably the
1-byte trailer updates from set_trailer_at()/trailer_write() on the
non-NVM_FLASH_WRITEONCE i.MX RT configs - therefore overread the source
buffer (e.g. 255 bytes past a 1-byte stack value) and could program
adjacent stack/RAM contents into flash.

Bound the copy to min(page, len - i) and pad the rest of the page buffer
with the erased value (0xFF). 0xFF is a no-op for NOR programming, so the
existing flash contents of the rest of the page are preserved.
sign.py encoded HDR_IMG_DELTA_SIZE and HDR_IMG_DELTA_INVERSE_SIZE with a
2-byte length via struct.pack("<H", ...), but wolfBoot_get_delta_info()
accepts those tags only when wolfBoot_find_header() returns
sizeof(uint32_t). Delta images produced by sign.py were therefore signed
with parseable TLVs yet rejected by the bootloader before the patch was
applied. Encode both size TLVs as 4-byte little-endian values, matching
sign.c (header_append_tag_u32) and the bootloader parser.

Add a regression test that signs a real delta image with sign.py and
asserts the bootloader-side parse recovers each delta TLV with the
required 4-byte length.
The mpusize() lookup in boot_arm.c only covered sizes up to 64KB and
returned MPUSIZE_ERR for anything larger. mpu_init() passes the wolfBoot
.text+.rodata span (_stored_data - _start_text) to mpusize() and bails
out at the MPUSIZE_ERR guard before reaching mpu_on(). Any build whose
bootloader image exceeds 64KB (TrustZone, PQC, delta-update, or several
crypto algorithms) therefore left MPU_CTRL clear, silently disabling all
five MPU regions for the lifetime of the bootloader.

Fill in the missing power-of-two entries from 128KB through 128MB
(ARMv7-M SIZE field = log2(bytes)-1, shifted into the RASR layout) so
the flash region size is resolved and mpu_on() is reached.

Add tools/unit-tests/unit-mpusize.c, which includes the real mpusize()
from boot_arm.c (guarded to its host-portable MPU helpers via
WOLFBOOT_UNIT_TEST_MPU) and checks that sizes above 64KB no longer map
to MPUSIZE_ERR. The test fails before this fix and passes after.
…header

base_diff() captured patch_inv_off = len3 + CMD.header_sz before calling
make_header_delta(), which signs the delta image via make_header_ex(is_diff=1).
When a certificate chain is present, the delta (is_diff=1) header needs ~72
more bytes than the non-delta header for the four delta TLVs plus the base-hash
TLV. For a window of cert-chain sizes, header_required_size(is_diff=0) still fit
the current CMD.header_sz while header_required_size(is_diff=1) did not, so
make_header_ex(is_diff=1) grew CMD.header_sz to the next power of two *after*
patch_inv_off was captured. The HDR_IMG_DELTA_INVERSE TLV then encoded a stale,
too-small offset; the bootloader (update_flash.c) uses it as a raw byte offset
into the update partition to locate the inverse patch, so rollback read from the
wrong offset and failed.

Resolve the is_diff=1 header-size expansion (same logic as make_header_ex)
before computing patch_inv_off. Add unit-sign-delta-cert-inv-off.py, which signs
an ed25519 delta with a 300-byte chain (inside the triggering window) and
asserts the inverse patch is the trailing HDR_IMG_DELTA_INVERSE_SIZE bytes of
the file; it fails before this fix.
The Bank 2 branch of hal_flash_erase() subtracted the absolute base
FLASH_BANK2_BASE (0x08100000) from the bank-relative loop offset p,
instead of the relative FLASH_BANK2_BASE_REL (0x00100000) used by every
other comparison in the function. For a Bank 2 offset (p >= 0x00100000)
this underflowed uint32_t to ~0xF8000000, with two effects:

  1. the SNB sector index programmed into FLASH_CR2 came from the
     underflowed value (always sector 0, with stray high bits leaking
     into other CR2 fields), and
  2. the subtraction mutated the loop variable p itself, so after
     p += FLASH_PAGE_SIZE the offset jumped past any end_address and the
     loop exited after a single iteration.

The combined result: any multi-sector Bank 2 erase touched only one
sector with the wrong index, silently leaving the remaining requested
sectors (e.g. the SWAP partition at 0x081C0000) unerased.

Compute the sector index in a temporary from FLASH_BANK2_BASE_REL so the
loop variable is preserved, and mask it with FLASH_CR_SNB_MASK before
shifting into FLASH_CR2.

Adds unit-flash-erase-h7, which compiles hal_flash_erase() in isolation
(guarded by WOLFBOOT_UNIT_TEST_FLASH_ERASE, mirroring the unit-mpusize
approach for boot_arm.c) against mocked flash registers and asserts that
a two-sector Bank 2 erase programs sectors 6 and 7 across two iterations.
…ot.c

The TPM-bus-supplied UINT16 nvPublic.dataSize was assigned to digestSz and
forwarded to wolfTPM2_NVReadAuth as the read byte count with no bound check.
wolfTPM2_NVReadAuthPolicy uses that count as the XMEMCPY length into the
caller's buffer with no separate capacity argument, so a malicious/emulated
TPM (or a pre-existing NV index larger than the hash) reporting dataSize > 64
overflows the 64-byte digest[WC_MAX_DIGEST_SIZE] stack buffer.

Clamp digestSz to sizeof(digest) before the read. Stored values are key-hash
digests (<= WC_MAX_DIGEST_SIZE), so the clamp never truncates valid data.

Add a unit-rot-auth case driving nvPublic.dataSize=1000 through the existing
mocked harness, asserting the requested read count is clamped to the buffer.
keystore_get_size() returned slot->pubkey_size verbatim from the OTP
keystore slot with no upper bound. A corrupted or mis-provisioned slot
with pubkey_size > KEYSTORE_PUBKEY_SIZE produces a positive value that
passes every caller guard (pubkey_sz < 0 / <= 0). The callers in image.c
(key_sha256/key_sha384/key_sha3_384 and the ECC verify y-coordinate
offset) then read past otp_slot_item_cache, which only holds
KEYSTORE_PUBKEY_SIZE pubkey bytes.

Reject an out-of-range pubkey_size by returning -1, matching the existing
defensive validation of item_count in keystore_num_pubkeys() and the -1
error convention the callers already handle.

Add unit-otp-keystore, which compiles flash_otp_keystore.c in isolation
and verifies keystore_get_size() rejects oversized slots.
In the PT_LOAD range filter, segmentEnd was computed as
p_paddr + p_memsz - 1 in uint64_t with no overflow check. Both fields
come straight from the (possibly crafted) ELF64 program header. When
p_paddr + p_memsz exceeds 2^64 the result wraps below p_paddr: the
wrapped end can land back inside the requested range, so a segment whose
true span lies entirely outside the range is spuriously included (and the
converse can silently drop a valid segment). squashelf would then emit a
squashed image that wolfBoot signs and boots.

Detect the overflow before computing segmentEnd and treat such a segment
as out-of-range. ELF32 cannot overflow a uint64 sum and is unaffected.

Adds tools/squashelf/test-range-overflow.py (run via `make test`) which
fails before the fix and passes after.
@danielinux danielinux force-pushed the fenrir-fixes-2026-06-09 branch from 44a9d89 to 091756c Compare June 9, 2026 13:52
@mattia-moffa mattia-moffa merged commit b1ad603 into wolfSSL:master Jun 9, 2026
535 of 538 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants