Skip to content

OCPBUGS-88450: Use fixed artifacts directory to prevent stale temp dir accumulation#16606

Merged
openshift-merge-bot[bot] merged 2 commits into
openshift:mainfrom
matzew:OCPBUGS-88450
Jun 17, 2026
Merged

OCPBUGS-88450: Use fixed artifacts directory to prevent stale temp dir accumulation#16606
openshift-merge-bot[bot] merged 2 commits into
openshift:mainfrom
matzew:OCPBUGS-88450

Conversation

@matzew

@matzew matzew commented Jun 12, 2026

Copy link
Copy Markdown
Member

The downloads server previously created a new random temp directory via os.MkdirTemp on each startup. When Kubernetes restarts the container (without deleting the pod), the emptyDir volume persists and each restart leaves behind ~3.1G of orphaned artifacts. The defer-based cleanup in main() only runs on graceful exit, not on SIGKILL.

Switch to a fixed, well-known path (/tmp/artifacts) and clean it on startup with os.RemoveAll before re-creating it. This guarantees at most one copy of the artifacts exists at any time.

Summary by CodeRabbit

  • Improvements
    • Downloads now use a consistent, fixed working directory for storing artifacts rather than transient random locations.
    • On startup, the service automatically cleans up stale or outdated artifact data to prevent conflicts and improve reliability.
    • Artifact data is no longer proactively removed during shutdown, reducing unexpected cleanup during service restarts.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d2e38a79-a3e5-43af-b73a-7039116227ae

📥 Commits

Reviewing files that changed from the base of the PR and between f192faf and 1bb68e7.

📒 Files selected for processing (2)
  • cmd/downloads/config/downloads_config.go
  • cmd/downloads/main.go
💤 Files with no reviewable changes (1)
  • cmd/downloads/main.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/downloads/config/downloads_config.go

Walkthrough

The downloads server now uses a fixed /tmp/artifacts directory. Initialization deletes any existing contents and recreates the directory; the previous deferred removal on shutdown was removed.

Changes

Downloads Config Artifact Directory

Layer / File(s) Summary
Fixed artifacts directory initialization & startup cleanup
cmd/downloads/config/downloads_config.go, cmd/downloads/main.go
Adds defaultArtifactsDir = "/tmp/artifacts", updates NewDownloadsServerConfig to os.RemoveAll and os.MkdirAll that path (recreating with permissions), and removes the deferred os.RemoveAll(downloadsServerConfig.TempDir) from main.go.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Description check ⚠️ Warning The pull request description provides clear analysis of the root cause and solution, but most required template sections (Screenshots, Test setup, Test cases, Browser conformance, Reviewers) are missing or incomplete. Fill in the missing template sections including test setup, test cases, and reviewer assignments to meet the repository's pull request requirements.
✅ Passed checks (14 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: using a fixed artifacts directory to prevent stale temp directory accumulation, matching the primary objective of the pull request.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed Repo contains no Ginkgo tests (no onsi/ginkgo imports and no It(/Describe(/Context(/When( in *_test.go), so PR introduces no dynamic Ginkgo test titles.
Test Structure And Quality ✅ Passed No Ginkgo test code present: repo search found 0 *_test.go importing github.com/onsi/ginkgo; only standard testing in cmd/downloads/config/downloads_config_test.go.
Microshift Test Compatibility ✅ Passed PR #16606 only modifies cmd/downloads/config/downloads_config.go and cmd/downloads/main.go; no new Ginkgo e2e tests or MicroShift-incompatible API/resource usage detected.
Single Node Openshift (Sno) Test Compatibility ✅ Passed Checked PR commit file list and scanned all .go files for ginkgo imports/usage (e.g., "onsi/ginkgo"); 0 matches found, so no Ginkgo e2e tests were added/affected.
Topology-Aware Scheduling Compatibility ✅ Passed PR only updates cmd/downloads/config/downloads_config.go to use/clean /tmp/artifacts; no changes to pod scheduling constraints (affinity/anti-affinity/topologySpread/nodeSelector/PDB) were introduced.
Ote Binary Stdout Contract ✅ Passed cmd/downloads/main.go and downloads_config.go contain no fmt.Print/Printf to stdout; they only use klog.Infof/Info. klog defaults to writing to stderr.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR #16606 changes only cmd/downloads/config/downloads_config.go (no e2e/Ginkgo test files touched), so no IPv4/disconnected network assumptions to flag. citeturn1view0
No-Weak-Crypto ✅ Passed Scanned cmd/downloads (incl. downloads_config.go + main.go) for MD5/SHA1/DES/RC4/3DES/Blowfish/ECB and crypto-token compare patterns; none found—changes are only /tmp/artifacts cleanup logic.
Container-Privileges ✅ Passed Repo manifests/fixtures contain no privileged indicators (hostPID/hostNetwork/hostIPC, SYS_ADMIN, privileged:true, allowPrivilegeEscalation:true, or runAsUser:0/runAsNonRoot:false) in current conte...
No-Sensitive-Data-In-Logs ✅ Passed Reviewed cmd/downloads/config/downloads_config.go and cmd/downloads/main.go logs: only cleanup/config messages with /tmp artifacts path and port; no passwords/tokens/API keys/PII appear.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 12, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@matzew: This pull request references Jira Issue OCPBUGS-88450, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

The downloads server previously created a new random temp directory via os.MkdirTemp on each startup. When Kubernetes restarts the container (without deleting the pod), the emptyDir volume persists and each restart leaves behind ~3.1G of orphaned artifacts. The defer-based cleanup in main() only runs on graceful exit, not on SIGKILL.

Switch to a fixed, well-known path (/tmp/artifacts) and clean it on startup with os.RemoveAll before re-creating it. This guarantees at most one copy of the artifacts exists at any time.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot requested review from TheRealJon and jhadvig June 12, 2026 08:30
@openshift-ci openshift-ci Bot added the component/backend Related to backend label Jun 12, 2026
…r accumulation

The downloads server previously created a new random temp directory via
os.MkdirTemp on each startup. When Kubernetes restarts the container
(without deleting the pod), the emptyDir volume persists and each
restart leaves behind ~3.1G of orphaned artifacts. The defer-based
cleanup in main() only runs on graceful exit, not on SIGKILL.

Switch to a fixed, well-known path (/tmp/artifacts) and clean it on
startup with os.RemoveAll before re-creating it. This guarantees at
most one copy of the artifacts exists at any time.

Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
@matzew

matzew commented Jun 12, 2026

Copy link
Copy Markdown
Member Author

/cherry-pick release-4.22

@openshift-cherrypick-robot

Copy link
Copy Markdown

@matzew: once the present PR merges, I will cherry-pick it on top of release-4.22 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-4.22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@matzew

matzew commented Jun 12, 2026

Copy link
Copy Markdown
Member Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jun 12, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@matzew: This pull request references Jira Issue OCPBUGS-88450, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@matzew

matzew commented Jun 12, 2026

Copy link
Copy Markdown
Member Author

/assign @jhadvig
mind take a look ?

praveenkumar added a commit to praveenkumar/snc that referenced this pull request Jun 15, 2026
The downloads server creates a new temporary directory (via Python's `tempfile.mkdtemp()`)
each time it starts up and extracts the `oc` CLI artifacts for all platforms into it (~3.1G per directory).
The `emptyDir` volume backing `/tmp` persists across container restarts by Kubernetes design,
it is only cleared when the pod is deleted from the node.

The downloads server does not clean up old temporary directories from previous runs on startup.
Each container restart leaves behind a full copy of the CLI artifacts, causing linear disk growth.

- https://redhat.atlassian.net/browse/OCPBUGS-88450

This is used as workaround and once
openshift/console#16606 is merged and part of
OCP payload, will revert this one.
@praveenkumar

Copy link
Copy Markdown

This is kind of urgent for us since it is increasing the size of disk can we prioritize it?

@matzew

matzew commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

/test e2e-playwright

@jhadvig

jhadvig commented Jun 17, 2026

Copy link
Copy Markdown
Member

Nice fix! Thanks @matzew
The defer cleanup was effectively dead code since klog.Fatal calls os.Exit(1) which skips defers, and SIGTERM/SIGKILL from Kubernetes also won't run them. Cleanup-on-startup is the right approach.

Suggestion: clean up pre-existing random temp dirs from the old code

The first deployment with this fix won't clean up stale random temp dirs created by the old os.MkdirTemp("", "artifacts") code (e.g. /tmp/artifacts1234567890). Consider adding a glob cleanup before recreating the fixed directory:

// Clean up stale temp dirs from previous code versions and current fixed path
matches, _ := filepath.Glob("/tmp/artifacts*")
for _, match := range matches {
    os.RemoveAll(match)
}
if err := os.MkdirAll(tempDir, 0755); err != nil {
    return nil, fmt.Errorf("failed to create artifacts directory: %w", err)
}

This handles both the old random dirs and the new fixed path in one pass, and the existing os.RemoveAll(tempDir) call becomes unnecessary.

Request: add unit tests for NewDownloadsServerConfig directory handling

The existing tests in downloads_config_test.go all use a newTestConfig helper that bypasses NewDownloadsServerConfig entirely, so the changed code path (directory creation and cleanup) has no test coverage. It would be good to add tests verifying:

  1. A clean start creates the directory and populates it
  2. A restart with stale data from a previous run cleans up and recreates correctly
  3. Old random temp dirs (/tmp/artifacts* from the pre-fix code) are also cleaned up

Address review feedback:

- Use filepath.Glob("/tmp/artifacts*") to also remove stale random
  temp directories left by the old os.MkdirTemp code (e.g.
  /tmp/artifacts1234567890). The first deployment with the fixed path
  would otherwise leave those orphaned dirs behind.

- Add unit tests for NewDownloadsServerConfig directory handling:
  1. Clean start creates the directory and populates it
  2. Restart with stale data cleans up and recreates correctly
  3. Old random temp dirs (/tmp/artifacts*) are also cleaned up

Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
@matzew

matzew commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

@jhadvig thx!
I updated this PR

@openshift-ci

openshift-ci Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

@matzew: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jhadvig

jhadvig commented Jun 17, 2026

Copy link
Copy Markdown
Member

/lgtm
/approve
/verified by CI

@jhadvig jhadvig added the plugin-api-approved Indicates a PR with plugin API changes has been approved by an API reviewer label Jun 17, 2026
@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jun 17, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@jhadvig: This PR has been marked as verified by CI.

Details

In response to this:

/lgtm
/approve
/verified by CI

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 17, 2026
@openshift-ci

openshift-ci Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jhadvig, matzew

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 17, 2026
@openshift-merge-bot openshift-merge-bot Bot merged commit 7bd93e0 into openshift:main Jun 17, 2026
9 checks passed
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@matzew: Jira Issue Verification Checks: Jira Issue OCPBUGS-88450
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-88450 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Details

In response to this:

The downloads server previously created a new random temp directory via os.MkdirTemp on each startup. When Kubernetes restarts the container (without deleting the pod), the emptyDir volume persists and each restart leaves behind ~3.1G of orphaned artifacts. The defer-based cleanup in main() only runs on graceful exit, not on SIGKILL.

Switch to a fixed, well-known path (/tmp/artifacts) and clean it on startup with os.RemoveAll before re-creating it. This guarantees at most one copy of the artifacts exists at any time.

Summary by CodeRabbit

  • Improvements
  • Downloads now use a consistent, fixed working directory for storing artifacts rather than transient random locations.
  • On startup, the service automatically cleans up stale or outdated artifact data to prevent conflicts and improve reliability.
  • Artifact data is no longer proactively removed during shutdown, reducing unexpected cleanup during service restarts.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-cherrypick-robot

Copy link
Copy Markdown

@matzew: #16606 failed to apply on top of branch "release-4.22":

Applying: OCPBUGS-88450: Use fixed artifacts directory to prevent stale temp dir accumulation
Using index info to reconstruct a base tree...
M	cmd/downloads/config/downloads_config.go
M	cmd/downloads/main.go
Falling back to patching base and 3-way merge...
Auto-merging cmd/downloads/config/downloads_config.go
Auto-merging cmd/downloads/main.go
CONFLICT (content): Merge conflict in cmd/downloads/main.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0001 OCPBUGS-88450: Use fixed artifacts directory to prevent stale temp dir accumulation

Details

In response to this:

/cherry-pick release-4.22

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@matzew matzew deleted the OCPBUGS-88450 branch June 17, 2026 14:19
@openshift-merge-robot

Copy link
Copy Markdown
Contributor

Fix included in release 5.0.0-0.nightly-2026-06-18-000016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. component/backend Related to backend jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. plugin-api-approved Indicates a PR with plugin API changes has been approved by an API reviewer verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants