interactive: port the data model from flat [i64] to a Value ADT + Term scalar language #760
Merged
frankmcsherry merged 6 commits on Jun 16, 2026
Replace the flat [i64]/FieldExpr data model with the Value ADT (Int/Tuple/
Variant/List) and the Term scalar language, on master-next's scope-tree IR +
substrate-generic backend.
- ir.rs: Value + the tree-walking Term interpreter (eval); LinearOp gains
FlatMap, Filter/EnterAt now carry Term. Drops RowLike/FieldExpr eval and the
arity transfer functions (those were explain-only).
- parse: Projection is now {key: Term, val: Term}; Reducer gains Collect; Expr
gains FlatMap. Both front-ends parse the full Term grammar (tuples/lists/
spread, proj, inject/case, fold, builtins) plus named constructors + pattern
`case` (pipe), reconciled with master-next's import/export syntax.
- backend/vec.rs: Row = Value; render_linear/join/reduce evaluate Terms;
Collect NEST reducer. Value derives serde (ExchangeData bound).
- gen_row produces (Tuple[Int;arity], unit); ddir_vec gains EDGES_FILE input.
Deferred to later stages: explain + its folded helper (need RowModel for
Value/Term), and the col substrate (needs a Columnar story for Value).
Verified: lib tests pass; reach.ddp (root 0, chain 0-1-2-3) -> 4 reachable;
scc.ddp (cycle 0-1-2 + trivial 3-4) -> 3 cycle edges.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
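To make the shape of the new data model concrete, here is a minimal sketch of a `Value` ADT with a tree-walking `Term` evaluator. It is only an illustration under assumptions: the variant sets, field layouts, and the `eval` signature below are hypothetical and are not taken from the crate's `ir.rs`; the real `Term` grammar (spread, fold, builtins, named constructors) is considerably larger.

```rust
// Hypothetical sketch: these definitions are assumptions, not the crate's ir.rs.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Tuple(Vec<Value>),
    Variant(usize, Box<Value>), // (tag, payload)
    List(Vec<Value>),
}

#[allow(dead_code)]
#[derive(Clone, Debug)]
enum Term {
    Row,                        // the input row itself
    Lit(i64),                   // integer literal
    Proj(Box<Term>, usize),     // field access into a tuple
    Add(Box<Term>, Box<Term>),  // integer addition
    Tuple(Vec<Term>),           // product construction
    Inject(usize, Box<Term>),   // sum construction
    Case(Box<Term>, Vec<Term>), // arm `tag` is evaluated with the payload as its row
}

/// Evaluate `term` against `row`; `None` signals a shape mismatch or overflow.
fn eval(term: &Term, row: &Value) -> Option<Value> {
    match term {
        Term::Row => Some(row.clone()),
        Term::Lit(n) => Some(Value::Int(*n)),
        Term::Proj(t, i) => match eval(t, row)? {
            Value::Tuple(fields) => fields.get(*i).cloned(),
            _ => None,
        },
        Term::Add(a, b) => match (eval(a, row)?, eval(b, row)?) {
            (Value::Int(x), Value::Int(y)) => Some(Value::Int(x.checked_add(y)?)),
            _ => None,
        },
        Term::Tuple(ts) => ts
            .iter()
            .map(|t| eval(t, row))
            .collect::<Option<Vec<_>>>()
            .map(Value::Tuple),
        Term::Inject(tag, t) => Some(Value::Variant(*tag, Box::new(eval(t, row)?))),
        Term::Case(scrutinee, arms) => match eval(scrutinee, row)? {
            Value::Variant(tag, payload) => eval(arms.get(tag)?, &payload),
            _ => None,
        },
    }
}

fn main() {
    // Sum the two fields of the row (3, 4).
    let row = Value::Tuple(vec![Value::Int(3), Value::Int(4)]);
    let field = |i| Box::new(Term::Proj(Box::new(Term::Row), i));
    let sum = Term::Add(field(0), field(1));
    println!("{:?}", eval(&sum, &row));
}
```

Returning `Option` rather than panicking is a choice made for this sketch; the crate's interpreter may well handle ill-shaped terms differently.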
Port the Value/ADT example programs onto the scope-tree base (old `result …;` -> `export "result" = …;`), exercising the new scalar language end-to-end:
- unnest.ddp: flatmap (UNNEST) / collect (NEST) list round-trip
- binders.ddp: fold with named pattern-`case` binders
- adt.ddp: named constructors + pattern `case`
- congruence.ddp / eqsat.ddp: variable-arity e-node congruence and the full equality-saturation fixpoint
- cse_tree.ddp: common-subexpression sharing over expression trees
Verified on master-next: eqsat reproduces both scenarios (pure congruence 5~1 then mul(5,2)~mul(1,2); and the a~b cascade collapsing all three muls); unnest round-trips position-ordered; adt yields the same 98/102 buckets.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
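The UNNEST/NEST round trip that unnest.ddp exercises can be sketched over plain Rust collections. This is an illustration only: the `unnest` and `collect` functions and the `(key, pos, elem)` tuple layout are assumptions made for the example, not the `.ddp` operators or the backend's reducer.

```rust
use std::collections::BTreeMap;

/// flatmap (UNNEST): explode (key, list) into one (key, pos, elem) row per element.
fn unnest(rows: &[(i64, Vec<i64>)]) -> Vec<(i64, usize, i64)> {
    rows.iter()
        .flat_map(|(k, list)| list.iter().enumerate().map(move |(pos, e)| (*k, pos, *e)))
        .collect()
}

/// collect (NEST): group by key and rebuild each list in position order.
fn collect(rows: &[(i64, usize, i64)]) -> Vec<(i64, Vec<i64>)> {
    let mut groups: BTreeMap<i64, Vec<(usize, i64)>> = BTreeMap::new();
    for &(k, pos, e) in rows {
        groups.entry(k).or_default().push((pos, e));
    }
    groups
        .into_iter()
        .map(|(k, mut members)| {
            members.sort(); // order by position, whatever order the rows arrived in
            (k, members.into_iter().map(|(_, e)| e).collect())
        })
        .collect()
}

fn main() {
    let input = vec![(1, vec![10, 20, 30]), (2, vec![7])];
    let exploded = unnest(&input);
    println!("{:?}", exploded);
    println!("{:?}", collect(&exploded));
}
```

Carrying the position alongside each element is what lets the list be rebuilt in order from an unordered collection. In this sketch an empty list produces no rows and therefore does not survive the round trip.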
Re-enable the explanation rewrite on the Value data model by implementing the decoupled `RowModel`/`Dataflow` traits for `Value`/`Term`.
- explain/mod.rs: a `Val` RowModel whose demand envelope is a flat value tuple `[V | chain (innermost-first) | q]`, matching the host lift's `append_iter`. Each rule builds `Term`-based projections/predicates over field indices (replacing the flat `[i64]` `FieldExpr` column ranges); `time_le`/`strip` are inlined (the `folded` algebra), and a `Spread`-bounding `expand_value_fields` keeps bare-row refs from pulling in chain coords. `Sb`'s `Dataflow` predicate is now `Term`. The clone/resolve/shape machinery is unchanged; the shape pass is `Term`-arity.
- Count now yields a one-field tuple `(count)`, keeping "a value is a tuple" so `$1[0]` and the explain envelope hold uniformly.
- decouple.rs: drop the flat executable contract; the `nested_contract` model-agnostic proof remains the runnable spec. `folded.rs` retired.
- tests/explain.rs restored, ported to Value rows + the flat query envelope.
Verified: all 8 sufficiency tests pass, plus the heavy --ignored sweeps (scc 100/110, the join partner-time regression at 1000/1100, tc/reach fuzz).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
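The flat envelope layout `[V | chain (innermost-first) | q]` can be pictured with a small packing/unpacking sketch. Everything here is an assumption made for illustration: the `pack`/`unpack` helpers are hypothetical, chain coordinates are modelled as integers, and `q` is treated as a single trailing field; none of this is the `Val` RowModel's actual code.

```rust
// Hypothetical sketch of a flat `[V | chain | q]` envelope; not the crate's API.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Tuple(Vec<Value>),
}

/// Lay out value fields, then chain coordinates (innermost-first), then `q`.
fn pack(value: &[Value], chain: &[i64], q: Value) -> Value {
    let mut fields = value.to_vec();
    fields.extend(chain.iter().map(|c| Value::Int(*c)));
    fields.push(q);
    Value::Tuple(fields)
}

/// Split an envelope into (value fields, chain coords, q), given the value arity.
fn unpack(env: &Value, arity: usize) -> Option<(&[Value], &[Value], &Value)> {
    let Value::Tuple(fields) = env else { return None };
    let (q, rest) = fields.split_last()?;
    if rest.len() < arity {
        return None;
    }
    let (value, chain) = rest.split_at(arity);
    Some((value, chain, q))
}

fn main() {
    let env = pack(&[Value::Int(7), Value::Int(8)], &[2, 0], Value::Int(99));
    println!("{:?}", env);
    println!("{:?}", unpack(&env, 2));
}
```

Because the layout is flat, a reference to "the row" has to be bounded by the value arity, otherwise it would also sweep up the chain coordinates and `q`; `unpack` returning only the first `arity` fields as the value region is meant to mirror that concern.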
- dump_explain: re-enabled (prints the scope-tree IR before/after the rewrite); it has no data-model dependencies and works as-is now that explain is online.
- ddir_vec --explain / --query=K:V[,q] / --debug-demand: re-enabled. The query input is seeded with the flat demand envelope `(key ; val ++ q)`; demand collections can be tapped with --debug-demand. The CLI assigns every source the uniform shape (arity, 0), so --explain is for single-input-arity programs (e.g. scc); mixed-arity programs (reach's arity-1 roots) need explicit per-input shapes, as the integration tests use.
Verified: scc.ddp --explain demands the cycle edges that produced the queried output.
The columnar substrate (ddir_col / backend::col) stays deferred: it needs a Columnar story for Value.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…(stage 5) Restore a unit-level by-example spec for the reverse rules — but over the model the crate actually evaluates. The removed `[i64]` `contract` tested `Flat` via `eval_fields`/`eval_condition`; this `value_contract` runs the same six specs on real `Value` rows in `Val`'s flat envelope `[V | chain | q]`, through an in-memory `Value` dataflow against `explain::Val`. `nested_contract` (a different, nested layout over a toy model) stays as the proof that the *rules* are model-agnostic; `value_contract` pins the *model* the backend runs, closing the unit-coverage gap the deletion opened. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Consolidate the front-end language docs into one reference, on the `pipe` module (the .ddp front-end): the collection language (sources, pipe operators incl. flatmap/collect, statements, `con` decls) and the scalar `Term` language (row/field access, arithmetic, products/lists/sums, named constructors, pattern `case`, `fold` with `^0`/`^1`, binders, `if`). Doc-only; previously this had to be teased out of the `Term` variants, `build_builtin`, and example programs. `Term`'s doc now points here for the concrete syntax. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
b650165 to 7bbaa06
frankmcsherry added a commit that referenced this pull request on Jun 17, 2026
* decouple: re-ground with the universal-backstop flatmap test
Re-land the proof-of-concept (removed from #760 as PR-scoped) on the follow-up branch where it belongs: the universal backstop reverses `flatmap` (the op the live rewrite still panics on) via the existing Dataflow primitives (forward pair table, join on the output, REFORM the whole input). Grounds the inverse work before the real rule + wiring.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* explain(value): reverse FlatMap (UNNEST); close the explainability regression
Replace the `panic!` on `LinearOp::FlatMap` in the reverse walk with a real rule, `emit_lookup_flatmap`. FlatMap is same-depth (it doesn't touch iteration time) and its list rides as one opaque value, so no envelope change is needed: build the (output -> input) pair table by running the op forward on the input side, join the demand on the packed output, and recover the whole input (the `None`-inverse endpoint). The one wrinkle (a plain flatmap drops the source row and the key isn't unique) is handled by re-keying the input by itself before exploding (the source rides through in the join key) and re-projecting to (k, pos, elem) after; a chain_in <= chain_out filter keeps it sound in iterating scopes. No new primitive; uses the existing project/flatmap/join. With this, `writable => explainable` is restored for flatmap programs.
Verified: a flatmap sufficiency test passes, and the full suite (incl. the heavy --ignored relational sweeps) stays green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* explain(value): confirm Collect reverses via the non-min keyed path
Collect (NEST) is a Reducer, so the reverse walk already routes it through the non-min keyed lookup ("demand all same-key inputs"), which is exactly the demand for a collected list (all its members). A sufficiency test over a `| collect` program confirms the existing path handles a List-valued reducer output; no new rule needed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
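The reverse-flatmap idea described above (run the op forward to build an output -> input pair table, then join the demand against it to recover whole inputs) can be sketched in memory as follows. This is a simplified illustration under assumptions: the `Input`/`Output` aliases and the `flatmap`/`explain` functions are hypothetical, plain `HashMap`s stand in for the dataflow join, and iteration time (the `chain_in <= chain_out` filter) is ignored entirely. It is not `emit_lookup_flatmap`.

```rust
use std::collections::{BTreeSet, HashMap};

type Input = (i64, Vec<i64>);    // (key, list)
type Output = (i64, usize, i64); // (key, pos, elem)

/// Forward op: explode one input row into its outputs.
fn flatmap(input: &Input) -> Vec<Output> {
    let (k, list) = input;
    list.iter().enumerate().map(|(pos, e)| (*k, pos, *e)).collect()
}

/// Reverse rule: for each demanded output, recover every whole input that produces it.
fn explain(inputs: &[Input], demand: &[Output]) -> BTreeSet<Input> {
    // Pair table built by running the op forward. The whole source row rides
    // along, since the key alone need not identify it.
    let mut pairs: HashMap<Output, Vec<&Input>> = HashMap::new();
    for input in inputs {
        for out in flatmap(input) {
            pairs.entry(out).or_default().push(input);
        }
    }
    // Join the demand against the table on the output.
    let mut needed = BTreeSet::new();
    for d in demand {
        if let Some(sources) = pairs.get(d) {
            for src in sources {
                needed.insert((**src).clone());
            }
        }
    }
    needed
}

fn main() {
    let inputs = vec![(1, vec![10, 20]), (1, vec![10]), (2, vec![30])];
    // (1, 0, 10) is produced by two distinct inputs sharing key 1.
    println!("{:?}", explain(&inputs, &[(1, 0, 10)]));
}
```

No per-op inverse is needed here, only the ability to run the op forward, which is why this kind of rule can serve as a backstop for ops without a closed-form inverse.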
Replaces the flat [i64]/FieldExpr data model with a Value ADT (Int/Tuple/Variant/List) and a Term scalar language, on top of master-next's scope-tree IR +
substrate-generic backend, and brings the explanation rewrite back online over the new model. Four reviewable commits:
- …both parsers parse the full Term grammar (tuples/lists/spread, proj, inject/case, fold, builtins, named constructors). backend/vec.rs evaluates Terms over Value rows. Existing programs verified (reach → 4 reachable; scc → 3 cycle edges).
- …congruence and the full equality-saturation fixpoint), cse_tree.
- …lift; time_le/strip are inlined and folded retired. All sufficiency tests pass, including the --ignored sweeps (scc 100/110, the join partner-time regression at 1000/1100, tc/reach fuzz).
Deferred: the columnar substrate (backend::col/ddir_col) needs a Columnar story for Value.
🤖 Generated with Claude Code