[P3][kernel] Stream recovery record-by-record to bound open() memory #64

@cevheri

Summary

Roadmap item. recover() currently reads the ENTIRE WAL into one Uint8Array (file.read(0, file.size())) and replays from that buffer. Two ceilings follow, both hit before any OS or browser storage quota does:

  • A single-allocation ceiling: JS engines cap one ArrayBuffer/TypedArray at ~8 GiB on 64-bit (~2 GiB on 32-bit), so a file beyond that cannot even be opened.
  • A transient ~2x memory cost at open: the file buffer and the parsed store coexist.

In browsers this is the binding constraint on database size (OPFS quotas are tens of GB in Chrome; the single-buffer read gives up far earlier) — measured and discussed in the browser-storage docs issue.

Why this is cheap architecturally

The WalFile seam ALREADY supports positional reads (read(offset, length)), and the v0.2.0 adapters honor it properly (node-fs does fd/offset reads; OPFS loops reads). recover() simply chooses to issue one whole-file read. Streaming is a recover()-local change:

  1. Read the 8-byte file header (or probe the legacy first record).
  2. Loop: read one 12-byte record header (validate its self-checksum), then read exactly the payload it promises, verify the payload CRC, replay, advance.
  3. Torn-tail / mid-log corruption classification carries over unchanged — the same decisions, made per-record instead of over a full buffer.
  4. A short read that is not end-of-file remains INCOMPLETE_READ.
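The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the kernel's actual code: the synchronous `WalFile` shape, the 12-byte header layout (payload length, payload CRC-32, header self-checksum, all little-endian u32), the `recoverStreaming` and `encodeRecord` names, and the simplified "torn-tail" / "corrupt" outcomes are all hypothetical. File-header validation and the legacy first-record probe are left out.

```typescript
// Assumed seam: the issue only names read(offset, length) and size().
interface WalFile {
  size(): number;
  read(offset: number, length: number): Uint8Array;
}

const FILE_HEADER_BYTES = 8;
const RECORD_HEADER_BYTES = 12; // assumed layout: u32 length | u32 payload CRC | u32 header checksum

// Plain bitwise CRC-32 (reflected polynomial 0xEDB88320); a stand-in for the real checksum.
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc ^= bytes[i];
    for (let bit = 0; bit < 8; bit++) crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Returns the bytes, or null when the file ends before the requested range does.
// A short read inside the file is not end-of-file, so it stays INCOMPLETE_READ.
function readExact(file: WalFile, offset: number, length: number, fileSize: number): Uint8Array | null {
  const available = Math.min(length, Math.max(0, fileSize - offset));
  const chunk = file.read(offset, available);
  if (chunk.length !== available) throw new Error("INCOMPLETE_READ");
  return available === length ? chunk : null;
}

type RecoveryEnd = "clean" | "torn-tail" | "corrupt";

function recoverStreaming(
  file: WalFile,
  replay: (payload: Uint8Array) => void,
): { end: RecoveryEnd; offset: number } {
  const size = file.size();
  let offset = FILE_HEADER_BYTES; // header validation / legacy probe omitted in this sketch
  while (offset < size) {
    const header = readExact(file, offset, RECORD_HEADER_BYTES, size);
    if (header === null) return { end: "torn-tail", offset };
    const view = new DataView(header.buffer, header.byteOffset, header.byteLength);
    if (view.getUint32(8, true) !== crc32(header.subarray(0, 8))) return { end: "corrupt", offset };
    const payloadLength = view.getUint32(0, true);
    const payload = readExact(file, offset + RECORD_HEADER_BYTES, payloadLength, size);
    if (payload === null) return { end: "torn-tail", offset };
    if (crc32(payload) !== view.getUint32(4, true)) return { end: "corrupt", offset };
    replay(payload); // only this one record is held in memory at a time
    offset += RECORD_HEADER_BYTES + payloadLength;
  }
  return { end: "clean", offset };
}

// Writer-side helper for the assumed layout, included so the sketch can be exercised.
function encodeRecord(payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(RECORD_HEADER_BYTES + payload.length);
  const view = new DataView(out.buffer);
  view.setUint32(0, payload.length, true);
  view.setUint32(4, crc32(payload), true);
  view.setUint32(8, crc32(out.subarray(0, 8)), true);
  out.set(payload, RECORD_HEADER_BYTES);
  return out;
}
```

The sketch stops at the first bad record and reports its offset. Deciding whether that position is a torn tail or mid-log corruption is where the existing classification rules would plug in, unchanged.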

Peak memory drops from (file size + store) to (largest single record + store).

Scope notes

Acceptance criteria

  • open() peak memory no longer includes the whole file (demonstrated with a large-file test or instrumented fake fs).
  • All existing recovery/DST tests stay green unchanged; a new chunked-read DST profile is added.
  • Changeset (minor) if the change is observable; gate green.
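One possible shape for the "instrumented fake fs" in the first criterion is a wrapper that records the largest single read it is asked for, so a test can assert that open() never requests the whole file. This is a sketch only: `MeteredWalFile`, `syntheticFile`, and the synchronous `WalFile` shape are hypothetical names and assumptions, not existing test utilities.

```typescript
// Assumed seam: the issue only names read(offset, length) and size().
interface WalFile {
  size(): number;
  read(offset: number, length: number): Uint8Array;
}

// Wraps any WalFile and tracks how it is read.
class MeteredWalFile implements WalFile {
  largestRead = 0; // biggest length ever requested in one call
  readCount = 0;   // number of read() calls
  inner: WalFile;

  constructor(inner: WalFile) {
    this.inner = inner;
  }

  size(): number {
    return this.inner.size();
  }

  read(offset: number, length: number): Uint8Array {
    this.readCount++;
    this.largestRead = Math.max(this.largestRead, length);
    return this.inner.read(offset, length);
  }
}

// A zero-filled file of a given logical size that allocates only what each
// read asks for, so a "large file" test does not need a large fixture.
function syntheticFile(size: number): WalFile {
  return {
    size: () => size,
    read: (offset: number, length: number) =>
      new Uint8Array(Math.max(0, Math.min(length, size - offset))),
  };
}
```

A test would wrap the file handed to recover() in `MeteredWalFile` and assert that `largestRead` stays at or below the largest record plus header size, rather than equal to `size()`. The zero-filled synthetic file is not a valid WAL, so a real test would still need record-shaped content; it only illustrates the metering.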
