Skip to content

DM-55232: fix zstd failure in aggregate-graph due to huge logs#565

Open
TallJimbo wants to merge 2 commits into
mainfrom
tickets/DM-55232
Open

DM-55232: fix zstd failure in aggregate-graph due to huge logs#565
TallJimbo wants to merge 2 commits into
mainfrom
tickets/DM-55232

Conversation

@TallJimbo

@TallJimbo TallJimbo commented Jun 17, 2026

Copy link
Copy Markdown
Member

Checklist

  • ran Jenkins
  • ran and inspected package-docs build
  • added a release note for user-visible changes to doc/changes

@codecov

codecov Bot commented Jun 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 40.62500% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.67%. Comparing base (85e053f) to head (b017815).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...lsst/pipe/base/quantum_graph/aggregator/_writer.py 36.66% 17 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #565      +/-   ##
==========================================
- Coverage   88.79%   88.67%   -0.13%     
==========================================
  Files         160      160              
  Lines       22120    22150      +30     
  Branches     2625     2627       +2     
==========================================
  Hits        19641    19641              
- Misses       1843     1866      +23     
- Partials      636      643       +7     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

@MichelleGower MichelleGower left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple comments. Merge approved.

break
if training_sample_size >= self.comms.config.zstd_dict_input_max_bytes:
self.comms.log.warning(
"Truncating compression dict training sample after %d predicted quanta.",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it would be quicker to understand if instead of saying "Truncating" saying something like "Reached compression dict training sample limit after..."

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do you want a similar warning with the above break for counts?

pred_quanta_sum = sum(len(block) for block in training_inputs[:n_pred_quanta])
prov_quanta_sum = sum(len(block) for block in training_inputs[n_pred_quanta::3])
metadata_sum = sum(len(block) for block in training_inputs[n_pred_quanta + 1 :: 3])
logs_sum = sum(len(block) for block in training_inputs[n_pred_quanta + 2 :: 3])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I possibly would set the step (3) in a variable to make sure future changes get them all. But the lines are all here together so...

If it is easy to write a unit test to catch if we add/remove something to/from the training_inputs but forget to change the step, that could avoid trying to track down a bug in the exception handling code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants