Several visitor/evaluator edge cases appear unsafe or inconsistent:
- `_StrictMetricsEvaluator.visit_not_equal` / `visit_not_in` return `ROWS_MUST_MATCH` whenever a file can contain nulls or NaNs. Example: stats for `[null, 5]` or `[NaN, 5.0]`, with lower and upper bounds both `5`, evaluate to true for `NotEqualTo("x", 5)` / `NotIn("x", {5})`, even though the row holding `5` does not match. This can incorrectly mark whole files as deleted.
- `_StrictMetricsEvaluator.eval` returns `ROWS_MUST_MATCH` for `record_count <= 0`. `record_count=0` is vacuously true, but `record_count=-1` means unknown per the local comment; even `AlwaysFalse()` returns true in that case.
- `ResidualVisitor` comparison methods compare partition values to literals directly. A nullable identity partition value of `None` with `LessThan("x", 1)` raises `TypeError`, while row evaluation returns false.
- `ResidualVisitor.visit_not_nan(None)` returns `AlwaysFalse`, while expression evaluation treats `NotNaN(None)` as true. Existing tests encode both behaviors, so the semantics are inconsistent.
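
The first two items can be made concrete with a small standalone sketch. This is not the repository's evaluator: `ColumnStats`, `strict_not_equal`, and `record_count_guard` are hypothetical names, and the rules shown are one conservative reading of the expected behavior (a file strictly matches `x != literal` only if it holds nothing but nulls/NaNs or the literal lies outside the bounds; a negative record count is treated as unknown).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical per-column stats; value_count is assumed to include nulls and NaNs."""
    value_count: int
    null_count: int
    nan_count: int
    lower: Optional[float]
    upper: Optional[float]


def strict_not_equal(stats: ColumnStats, literal: float) -> bool:
    """Return True only if every row is guaranteed to satisfy `x != literal`."""
    # No ordinary values at all: every row is null or NaN, so none equals the literal.
    if stats.null_count + stats.nan_count == stats.value_count:
        return True
    # Without bounds nothing can be proven about the ordinary values.
    if stats.lower is None or stats.upper is None:
        return False
    # Ordinary values lie in [lower, upper]; they all differ from the
    # literal only when the literal falls outside that range.
    return literal < stats.lower or literal > stats.upper


def record_count_guard(record_count: int) -> Optional[bool]:
    """Early exit for strict evaluation; None means 'keep evaluating the expression'."""
    if record_count == 0:
        return True   # vacuously true: there are no rows to violate the predicate
    if record_count < 0:
        return False  # assumed to mean 'unknown', so a match cannot be proven
    return None
```

With these rules, the `[null, 5]` and `[NaN, 5.0]` examples no longer count as strict matches for the literal `5`, while an all-null column still does.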
Validated against the current tree; examples use stats/partition shapes already supported by the repo tests.
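
For the `ResidualVisitor` comparison case, a null-safe guard might look like the sketch below. `less_than_residual` is a hypothetical helper, not a function from the repository, and the plain `bool` it returns stands in for `AlwaysTrue` / `AlwaysFalse`. It assumes a null partition value should yield false so that the residual agrees with row evaluation as described above.

```python
from typing import Optional


def less_than_residual(partition_value: Optional[int], literal: int) -> bool:
    """Evaluate `partition_value < literal` without raising on a null partition."""
    # In Python 3, `None < 1` raises TypeError, so the null case is handled first.
    if partition_value is None:
        return False  # assumption: mirror row evaluation, which returns false
    return partition_value < literal
```

The same guard would apply to the other ordering comparisons; whether `NotNaN(None)` should resolve to true or false is a separate decision that this sketch does not take.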