fix: downloads count for ai report (IN-1182) #4228
Open
joanagmaia wants to merge 10 commits into
Signed-off-by: Joana Maia <jmaia@contractor.linuxfoundation.org>
Pull request overview
This PR updates the Tinybird pipe that computes per-project package metrics for the Agentic AI report, aiming to correct how cumulative package download metrics are deduplicated and rolled up to the project level.
Changes:
- Replaces `argMax(..., date)` reads from `packageDownloads FINAL` with a deduplication subquery using `argMax(..., updatedAt)` per `(insightsProjectId, date, ecosystem, repo, name)`.
- Switches the rollup logic for downloads and Docker pulls to use `max`/`maxIf` when computing totals and 30-day deltas.
- Tightens the AI project filter to exclude empty `insightsProjectId` values from `ai_repos`.
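For readers unfamiliar with the `argMax(..., updatedAt)` deduplication pattern, here is a minimal Python sketch of the idea: for each `(insightsProjectId, date, ecosystem, repo, name)` key, keep only the row with the greatest `updatedAt`. The `Row` type, the `downloads` column name, and the sample values are assumptions made for illustration; they are not taken from the pipe itself.

```python
from typing import NamedTuple


class Row(NamedTuple):
    insightsProjectId: str
    date: str        # ISO date, so string order matches chronological order
    ecosystem: str
    repo: str
    name: str
    downloads: int   # cumulative count; hypothetical column name
    updatedAt: str   # ISO timestamp


def dedupe_latest(rows):
    """Keep, per (project, date, ecosystem, repo, name), the row with the greatest updatedAt."""
    latest = {}
    for row in rows:
        key = row[:5]  # the five grouping columns
        if key not in latest or row.updatedAt > latest[key].updatedAt:
            latest[key] = row
    return list(latest.values())


# Illustrative data only: the first two rows share a key and differ in updatedAt.
rows = [
    Row("p1", "2024-05-01", "npm", "org/repo", "pkg-a", 100, "2024-05-01T01:00:00"),
    Row("p1", "2024-05-01", "npm", "org/repo", "pkg-a", 120, "2024-05-01T09:00:00"),
    Row("p1", "2024-05-02", "npm", "org/repo", "pkg-a", 150, "2024-05-02T01:00:00"),
]
deduped = dedupe_latest(rows)
```

With this sample input, `deduped` holds two rows: the 09:00 version for 2024-05-01 (120 downloads) and the single row for 2024-05-02 (150 downloads).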
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 80d2424.
…(IN-1182) Signed-off-by: Joana Maia <jmaia@contractor.linuxfoundation.org>

Fixes incorrect download and Docker pull metrics on the Agentic AI projects list by rewriting the `ai_package_metrics` node in `agentic_ai_projects_list_copy.pipe`.

What changed:

- `ai_repos` now excludes repositories with an empty `insightsProjectId`, so all downstream nodes (activity, contributor, PR, issue metrics) automatically skip those repos.
- `ai_package_metrics` is restructured as a three-level pipeline:
  1. Applies `argMax(..., updatedAt)` per `(insightsProjectId, date, ecosystem, repo, name)` to collapse duplicate/versioned rows in `packageDownloads`.
  2. Groups by `(insightsProjectId, ecosystem, repo, name)` and applies `argMax(..., date)` / `argMaxIf(..., date, ...)` independently per package, so each package contributes its own latest cumulative value and its own 30-day delta regardless of when it was last updated.
  3. Applies `sum` across packages per `insightsProjectId` for downloads and Docker pulls, and `max` for dependent repo/package counts.

The previous single-pass query either picked one package's value per project (original) or summed per calendar date and silently dropped packages whose last row predated the project's latest date (intermediate version).
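To make the difference between the three approaches concrete, here is a small Python sketch that models them on already-deduplicated rows. It is an illustration under assumptions, not the pipe's SQL: the tuple layout, function names, and sample numbers are hypothetical, and the 30-day delta is left out for brevity.

```python
from collections import defaultdict

# (project, package, date, cumulative_downloads); hypothetical, already deduplicated.
rows = [
    ("p1", "pkg-a", "2024-05-01", 100),
    ("p1", "pkg-a", "2024-05-03", 130),
    ("p1", "pkg-b", "2024-05-01", 40),
    ("p1", "pkg-b", "2024-05-02", 50),  # pkg-b's last row predates pkg-a's
]


def single_argmax(rows):
    """Original behaviour as described: one value per project, taken at its latest date."""
    best = {}
    for project, _pkg, date, total in rows:
        if project not in best or date > best[project][0]:
            best[project] = (date, total)
    return {p: total for p, (_d, total) in best.items()}


def sum_at_latest_date(rows):
    """Intermediate behaviour: sum per calendar date, then read the project's latest date."""
    per_date = defaultdict(int)
    for project, _pkg, date, total in rows:
        per_date[(project, date)] += total
    latest = {}
    for (project, date), total in per_date.items():
        if project not in latest or date > latest[project][0]:
            latest[project] = (date, total)
    return {p: total for p, (_d, total) in latest.items()}


def per_package_then_sum(rows):
    """New behaviour: latest cumulative value per package, then sum across packages."""
    latest = {}
    for project, pkg, date, total in rows:
        key = (project, pkg)
        if key not in latest or date > latest[key][0]:
            latest[key] = (date, total)
    totals = defaultdict(int)
    for (project, _pkg), (_d, total) in latest.items():
        totals[project] += total
    return dict(totals)
```

On the sample rows, `single_argmax` and `sum_at_latest_date` both report 130 for `p1`, because only `pkg-a` has a row on the project's latest date, while `per_package_then_sum` reports 180 (130 from `pkg-a` plus 50 from `pkg-b`).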
Note
Medium Risk
Changes analytics SQL that feeds the daily `agentic_ai_projects_list_ds` refresh; metrics will shift for multi-package projects and repos without `insightsProjectId`, but scope is limited to the Agentic AI collection pipe.

Overview

Fixes incorrect download and Docker pull totals on the Agentic AI projects list by changing how `agentic_ai_projects_list_copy.pipe` builds package metrics and which repos feed the pipeline.

`ai_repos` now drops repositories with an empty `insightsProjectId`, so every downstream node (activity, contributors, PRs, issues, packages) no longer includes those repos.

`ai_package_metrics` is rewritten as a three-stage rollup instead of a single `argMax` over `packageDownloads`: dedupe rows per `(insightsProjectId, date, ecosystem, repo, name)` with `argMax(..., updatedAt)`, compute each package’s latest cumulative counts and 30-day deltas, then `sum` downloads/Docker metrics per project (and `max` for dependent repo/package counts). That fixes under-counting when a project has multiple packages or when packages were last updated on different dates.

Reviewed by Cursor Bugbot for commit 398029a. Bugbot is set up for automated code reviews on this repo.