feat: add kfutil upgrade command by spbsoluble · Pull Request #330 · Keyfactor/kfutil

spbsoluble · 2026-06-19T22:35:43Z

Summary

Adds kfutil upgrade to atomically replace the running binary with any published GitHub release
Supports Linux, macOS, FreeBSD, and Windows (via minio/selfupdate rename-swap — no restart required)
Verifies SHA-256 checksum against the goreleaser-published SHA256SUMS file before applying

Usage

kfutil upgrade                        # install latest release
kfutil upgrade --version 1.9.2-rc.14 # install any valid GitHub tag
kfutil upgrade --dry-run              # preview without replacing the binary
kfutil upgrade --force                # upgrade even if already at target version

Changes

| File | Description |
| --- | --- |
| `pkg/upgrade/upgrade.go` | GitHub API fetch, zip extraction, SHA-256 verification, atomic binary replace |
| `cmd/upgrade.go` | Cobra command with `--version`, `--dry-run`, `--force` flags |
| `pkg/upgrade/upgrade_test.go` | 22 unit tests (asset name resolution, checksum, zip extraction, mock HTTP, size limits, URL sanitization) |
| `go.mod` / `go.sum` | Adds `github.com/minio/selfupdate v0.6.0` |

Test plan

go test ./pkg/upgrade/... passes (22/22)
kfutil upgrade --dry-run reports "already at latest" when current
kfutil upgrade --dry-run --version <tag> resolves correct platform asset URL
kfutil upgrade --dry-run --version <nonexistent> returns clear error
kfutil upgrade --force --dry-run shows download target even when already current
Manual upgrade on Linux replaces binary and kfutil version reflects new version
Manual upgrade on Windows replaces binary without requiring a restart

Adds 'kfutil upgrade' to atomically replace the running binary with any GitHub-tagged release, including pre-releases and older versions.

- pkg/upgrade/upgrade.go: fetches release metadata from the GitHub API, resolves the goreleaser zip asset for the current GOOS/GOARCH, verifies SHA-256 against the published SUMS file, and applies the update via minio/selfupdate (handles Windows rename-swap natively)
- cmd/upgrade.go: cobra command wiring with --version, --dry-run, --force
- pkg/upgrade/upgrade_test.go: 13 unit tests covering asset name resolution, checksum verification, zip extraction, and GitHub API responses via httptest mock server

Flags:
  --version  any valid GitHub tag (e.g. v1.9.0, v1.9.2-rc.14)
  --dry-run  show what would happen without replacing the binary
  --force    upgrade even when already at the target version

Security (P0):
- SEC-1: verifyChecksum now hashes the zip archive, not the extracted binary, matching what goreleaser records in SHA256SUMS; a missing SUMS entry is now an error instead of a silent pass
- SEC-2: GITHUB_TOKEN is only forwarded to hosts in an explicit allowlist (api.github.com, github.com, objects.githubusercontent.com) to prevent token exfiltration via a tampered BrowserDownloadURL
- SEC-3: upgrade aborts with an error when no SHA256SUMS asset is present in the release, making integrity verification mandatory

Compliance (P1/P2):
- AUD-6: import zerolog; all audit log entries now use structured fields
- AUD-1: log.Info events before and after binary replacement with from_version, to_version, executable, operator, and source_url fields
- AUD-2: resolveOperator() captures os/user.Current().Username and includes it in every structured log entry
- AUD-3: log.Error before each error return in Run() with a stable event name
- AUD-4: log.Warn when --force bypasses the version-match safety guard
- AUD-5: log.Debug after each HTTP response in fetchReleaseFrom and download with url, method, and status_code fields

Tests: inverted TestVerifyChecksum_AssetNotInSums to require.Error; added TestDownload_TokenNotSentToUntrustedHost to verify the allowlist behaviour.
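The SEC-1 and SEC-3 behaviour can be sketched roughly as follows. This is an illustration, not the PR's code: `verifyArchiveChecksum` and the asset name `kfutil_linux_amd64.zip` are hypothetical, and the sketch assumes the SUMS file uses the usual `<hex digest>  <file name>` layout.

```go
package main

import (
	"bufio"
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// verifyArchiveChecksum is a hypothetical stand-in for the PR's verifyChecksum.
// It hashes the downloaded archive (not the extracted binary) and compares the
// digest with the entry for assetName in the SUMS file.
func verifyArchiveChecksum(archive, sums []byte, assetName string) error {
	digest := sha256.Sum256(archive)
	got := hex.EncodeToString(digest[:])

	scanner := bufio.NewScanner(bytes.NewReader(sums))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		// sha256sum marks binary-mode entries with a leading '*'.
		if strings.TrimPrefix(fields[1], "*") != assetName {
			continue
		}
		if !strings.EqualFold(fields[0], got) {
			return fmt.Errorf("checksum mismatch for %s: expected %s, got %s", assetName, fields[0], got)
		}
		return nil
	}
	if err := scanner.Err(); err != nil {
		return fmt.Errorf("reading checksum file: %w", err)
	}
	// A missing entry is an error, never a silent pass.
	return fmt.Errorf("no checksum entry for %s", assetName)
}

func main() {
	archive := []byte("pretend zip bytes")
	digest := sha256.Sum256(archive)
	sums := []byte(hex.EncodeToString(digest[:]) + "  kfutil_linux_amd64.zip\n")

	fmt.Println(verifyArchiveChecksum(archive, sums, "kfutil_linux_amd64.zip"))
	fmt.Println(verifyArchiveChecksum([]byte("tampered"), sums, "kfutil_linux_amd64.zip"))
	fmt.Println(verifyArchiveChecksum(archive, sums, "kfutil_windows_amd64.zip"))
}
```

Returning an error for the missing-entry case is what turns verification from best-effort into mandatory.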

- Add upgrade.run_started event at Run() entry with operator, versions, force, and dry_run fields; establishes a baseline for anomaly detection
- Add log.Info for the already-current early exit (upgrade.already_current)
- Add log.Info for the dry-run exit path (upgrade.dry_run)
- Add log.Error for archiveURL-not-found and sumsURL-missing returns (upgrade.asset_not_found, upgrade.sums_missing)
- Expand --force logging to fire whenever the flag is set, not only when the version-equality guard would have blocked; captures --version usage
- Thread operator through fetchReleaseFrom and download so all HTTP-level log entries carry identity context
- Log a warning when resolveOperator() fails so auditors can distinguish the "unknown" sentinel from a real user named "unknown"

- cmd/upgrade.go: explicitly set zerolog level to InfoLevel (non-debug) or DebugLevel (--debug) instead of relying on the unset default; ensures audit events are durably emitted regardless of future changes to the informDebug initialization pattern used by other commands
- upgrade.fetch_release_failed: add tag field so the attempted release tag is always captured in the forensic record
- upgrade.extract_failed, upgrade.apply_failed: add source_url field for full binary provenance on failed privileged operations
- upgrade.dry_run: rename archive_url → source_url for consistency with applying/applied/apply_failed events (single SIEM query field)
- upgrade.checksum_mismatch: add source_url field so the archive provenance is captured on the security event
- upgrade.rollback_failed: add dedicated event inside apply() for the rollback-also-failed sub-path so on-call engineers can distinguish a broken binary state from an ordinary upgrade failure
- fetchReleaseFrom, download: add latency_ms field to all HTTP log events (SOC2 CC9.2 vendor API response metadata)
- TestDownload_TokenSentToTrustedHost: add positive allowlist test verifying token IS forwarded to hosts in allowedTokenHosts

Logging is disabled when --debug is not passed, consistent with all other kfutil commands. The compliance gap (audit trail silently dropped) is an accepted risk — noted in project memory.

- apply(): add operator param so upgrade.rollback_failed carries identity
- upgrade.download_failed: rename url → source_url for SIEM field consistency
- upgrade.checksum_download_failed: rename url → sums_url (distinct resource)
- upgrade.github_api_response, upgrade.http_response: promote Debug → Info so external API interactions appear in production log pipelines (SOC2 CC9.2)
- upgrade.apply_failed: add failure_reason field ("permission_denied" or "apply_error") so incident responders can distinguish failure modes from structured fields rather than error message strings
- normalizeTag(): log "latest" instead of "" when no --version was passed so upgrade.run_started and upgrade.fetch_release_failed are unambiguous
- TestDownload_NonOKStatus: test that download() errors on non-200 responses

- fetchReleaseFrom, download: log upgrade.github_api_network_error / upgrade.http_network_error (with latency_ms) on transport failure so network errors are captured even when the HTTP response is never received
- fetchReleaseFrom: log upgrade.release_parse_failed when JSON decode fails so the failure category (network vs application) is distinguishable
- upgrade.github_api_response, upgrade.http_response: emit at log.Warn when status_code >= 400 so SIEM error thresholds trigger on GitHub API errors (previously always log.Info regardless of status)
- apiClient / downloadClient: replace http.DefaultClient (no timeout) with explicit http.Client{Timeout}: 30s for API calls, 5m for binary downloads; ensures timed-out requests produce an error event rather than silently hanging
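For reference, the client split described in the last bullet might look something like this. The variable names and timeout values come from the commit message; `isTimeout` is a hypothetical helper added here only to show how a timed-out request surfaces as an error that can be classified.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// http.DefaultClient has no timeout, so a stalled connection can hang forever.
// Explicit clients bound the whole request, including reading the body.
var (
	apiClient      = &http.Client{Timeout: 30 * time.Second}
	downloadClient = &http.Client{Timeout: 5 * time.Minute}
)

// isTimeout (hypothetical) reports whether err is a network timeout, which is
// what a request that exceeds http.Client.Timeout returns.
func isTimeout(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

func main() {
	fmt.Println(apiClient.Timeout, downloadClient.Timeout, isTimeout(nil))
}
```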

- upgrade.rollback_succeeded: log Info event when apply() fails but rollback succeeds, so auditors can distinguish safe vs corrupted binary outcomes
- sanitizeURL(): strip query-string params before logging any URL field; prevents presigned CDN query-string credentials from appearing in log storage
- github_token_present: add Bool field to all HTTP log events so credential use is traceable without logging the token value
- fetchReleaseFrom: url.PathEscape(tag) before appending to URL so user-supplied --version values cannot alter the request URL structure
- extractBinary: wrap rc with io.LimitReader(maxBinaryBytes=100MiB) to bound memory use and produce an explicit error rather than OOM on malformed or zip-bomb archives
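To illustrate the url.PathEscape change, here is a minimal sketch of building the release lookup URL. `releaseURL` and `releasesAPI` are hypothetical names, and using the `/releases/latest` and `/releases/tags/{tag}` endpoints of the GitHub REST API is an assumption about how fetchReleaseFrom is wired.

```go
package main

import (
	"fmt"
	"net/url"
)

// releasesAPI is an assumed base URL for this repository's releases.
const releasesAPI = "https://api.github.com/repos/Keyfactor/kfutil/releases"

// releaseURL (hypothetical) maps an optional tag to a GitHub API URL.
// PathEscape encodes '/' and '?', so a user-supplied tag stays a single
// path segment and cannot add path components or a query string.
func releaseURL(tag string) string {
	if tag == "" {
		return releasesAPI + "/latest"
	}
	return releasesAPI + "/tags/" + url.PathEscape(tag)
}

func main() {
	fmt.Println(releaseURL(""))
	fmt.Println(releaseURL("v1.9.2-rc.14"))
	fmt.Println(releaseURL("../../other/repo?x=1"))
}
```

Ordinary tags such as `v1.9.2-rc.14` pass through unchanged, since letters, digits, `.` and `-` are never escaped.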

- sanitizeURL applied to reqURL in fetchReleaseFrom (all 3 log sites: github_api_network_error, github_api_response, release_parse_failed); consistent with download() which already sanitized its URL fields
- download(): wrap resp.Body with io.LimitReader(maxBinaryBytes) before io.ReadAll so the raw HTTP response (archive + SUMS) is bounded; closes the gap where the zip-level limit in extractBinary was not yet applied to the network layer
- upgrade.applying: add Bool("force", force) field so the event is self-contained for change management audit without requiring cross-event lookup to upgrade.run_started
- currentExecutable(): emit upgrade.executable_resolution_failed warning when os.Executable() fails so auditors can distinguish real path from the "kfutil" fallback sentinel in log records

Skipped (accepted):
- C-1: logs silent without --debug; user explicitly accepted this risk
- H-1: ConsoleWriter destroys structured fields; codebase-wide infra concern
- M-1: duplicate fetch_release_failed events; intentional layering, since removing the Run-level log would leave HTTP status error paths with no audit record
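The zip-level limit in extractBinary mentioned above could be sketched like this. The signature, the `buildZip` demo helper, and reading one byte past the limit to detect oversized entries are all assumptions; the PR's actual detection logic is not shown in the commit messages.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

const maxBinaryBytes = 100 << 20 // 100 MiB, as stated in the commit message

// extractBinary (sketch) returns the named entry from an in-memory zip,
// refusing entries that decompress to more than limit bytes.
func extractBinary(archive []byte, name string, limit int64) ([]byte, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, fmt.Errorf("opening archive: %w", err)
	}
	for _, f := range zr.File {
		if f.Name != name {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return nil, fmt.Errorf("opening %s: %w", name, err)
		}
		defer rc.Close()
		// Read at most limit+1 bytes so an oversized entry is detected
		// instead of being silently truncated.
		data, err := io.ReadAll(io.LimitReader(rc, limit+1))
		if err != nil {
			return nil, fmt.Errorf("reading %s: %w", name, err)
		}
		if int64(len(data)) > limit {
			return nil, fmt.Errorf("%s exceeds the %d-byte limit", name, limit)
		}
		return data, nil
	}
	return nil, fmt.Errorf("%s not found in archive", name)
}

// buildZip is a demo helper that creates a one-entry zip in memory.
func buildZip(name string, content []byte) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create(name)
	if err != nil {
		panic(err)
	}
	if _, err := w.Write(content); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	archive := buildZip("kfutil", []byte("fake binary"))
	data, err := extractBinary(archive, "kfutil", maxBinaryBytes)
	fmt.Println(len(data), err)
	_, err = extractBinary(archive, "kfutil", 4)
	fmt.Println(err)
}
```

Bounding the decompressed size rather than trusting the zip header matters because the declared size in a malformed archive need not match what the stream actually yields.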

- dry-run fmt.Printf: use sanitizeURL() for archiveURL and sumsURL so presigned CDN query-string credentials never reach console output
- download() error string: use sanitizeURL(rawURL) in the HTTP status error so query-string credentials are stripped from stderr / error aggregators
- fetchReleaseFrom: log upgrade.github_api_request_build_failed on http.NewRequest error so malformed-URL failures leave an audit trace
- download: log upgrade.http_request_build_failed on http.NewRequest error

Skipped (accepted / out of scope):
- M-1: force_override test requires Run() end-to-end mock harness

- H1: add from_version, to_version, executable to rollback_failed and rollback_succeeded by threading those values into apply(); all three are in scope in Run() where the call site lives
- H2: add operator param to currentExecutable() so upgrade.executable_resolution_failed carries operator for identity correlation
- M1: emit upgrade.checksum_verified Info event (with operator, asset, source_url) after verifyChecksum returns nil, closing the gap in the checksum success audit trail
- M2: add TestSanitizeURL_StripsQueryParams and TestSanitizeURL_ParseErrorFallback to cover presigned-URL stripping and parse-error fallback behaviour

- H1: emit upgrade.github_api_rejected Error event before each non-200 return in fetchReleaseFrom switch; gives SIEM rules an outcome-specific event name distinct from upgrade.github_api_response
- H2: emit upgrade.http_request_failed Error event before the non-200 return in download; same outcome-specificity fix for asset fetches
- M1: upgrade.rollback_succeeded changed from Info to Warn; a rollback is a degraded-state recovery action, not a routine operation
- M2: emit upgrade.apply_started Info inside apply() before selfupdate.Apply() is called; records the exact point the filesystem write commences
- M5: add TestExtractBinary_ExceedsMaxSize to assert the maxBinaryBytes size cap returns an error rather than an oversized slice
- Low: strip URL fragments in sanitizeURL(); add TestSanitizeURL_StripsFragment to cover the new behaviour
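With fragment stripping added, sanitizeURL() plausibly ends up along these lines. This is a sketch under assumptions: the parse-error fallback shown here (cutting at the first `?` or `#`) is a guess at what TestSanitizeURL_ParseErrorFallback exercises, and the CDN host in the demo is made up.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sanitizeURL (sketch) removes the query string and fragment so presigned
// credentials never reach logs, console output, or error strings.
func sanitizeURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		// Assumed fallback: drop everything from the first '?' or '#'.
		if i := strings.IndexAny(raw, "?#"); i >= 0 {
			return raw[:i]
		}
		return raw
	}
	u.RawQuery = ""
	u.ForceQuery = false
	u.Fragment = ""
	u.RawFragment = ""
	return u.String()
}

func main() {
	fmt.Println(sanitizeURL("https://cdn.example/kfutil.zip?X-Amz-Signature=abc#part"))
	fmt.Println(sanitizeURL("://bad?token=x"))
}
```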

- H1: add source_url to upgrade.rollback_failed and upgrade.rollback_succeeded by adding sourceURL param to apply(); forensic reconstruction of a failed upgrade can now trace the artifact without cross-event correlation
- H2: detect silently-truncated downloads: after io.ReadAll+LimitReader, check len(data) >= maxBinaryBytes and emit upgrade.download_size_limit_reached Error before returning an error; prevents a 100 MiB-capped payload from reaching checksum verification with a misleading mismatch event
- M1: add TestDownload_BodyTruncatedAtLimit to assert the HTTP body size cap rejects oversized payloads with the correct error message
- M2: add comments on the paired Warn/Error emit blocks in fetchReleaseFrom and download explaining why both events are intentional (latency_ms lives in the Warn, the control decision in the Error)
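The truncation check in H2 amounts to something like the sketch below. `readBounded` and `readBoundedString` are hypothetical names; the `>=` comparison follows the commit message, and logging is left out to keep the example dependency-free.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readBounded (sketch) reads at most limit bytes and treats hitting the cap
// as an error, because io.LimitReader truncates silently.
func readBounded(r io.Reader, limit int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit))
	if err != nil {
		return nil, fmt.Errorf("reading body: %w", err)
	}
	if int64(len(data)) >= limit {
		// The body may have been cut off here; fail before checksum
		// verification reports a misleading mismatch.
		return nil, fmt.Errorf("response reached the %d-byte size limit", limit)
	}
	return data, nil
}

// readBoundedString is a demo convenience wrapper around readBounded.
func readBoundedString(body string, limit int64) ([]byte, error) {
	return readBounded(strings.NewReader(body), limit)
}

func main() {
	data, err := readBoundedString("small payload", 1024)
	fmt.Println(len(data), err)
	_, err = readBoundedString("this body is too large", 8)
	fmt.Println(err)
}
```

One consequence of the `>=` check is that a payload of exactly the limit is rejected too, which is a reasonable trade for not needing a second read to tell "exactly at the cap" from "truncated".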

- M1: emit upgrade.http_body_read_error Error before returning a mid-body read failure; distinguishes a truncated transfer from a refused connection in forensic reconstruction (SOC2 CC7.3)
- M3: add captureLog() test helper that redirects the global zerolog logger to a buffer for test duration; use it to assert upgrade.http_request_failed, upgrade.download_size_limit_reached, and upgrade.github_api_rejected are actually emitted, so test coverage now proves audit events fire, not just that errors propagate
- M4: add .Err(err) to upgrade.rollback_succeeded so the root cause of the apply failure is in the structured log record alongside the rollback outcome (SOC1 completeness and accuracy)

- H1: capture applyErr before RollbackError unwraps it; add apply_error string field to upgrade.rollback_failed so forensic reconstruction has the root cause alongside the rollback failure (SOC1 completeness)
- M3: split github_token_present into github_token_present (env var set?) and github_token_forwarded (actually sent in header?) in both fetchReleaseFrom and download; log events for untrusted-host downloads now correctly record that a token existed but was withheld, rather than masking token availability (SOC2 CC6.6 boundary protection telemetry)
- M4: add comment to TestDownload_TokenSentToTrustedHost explaining why these two tests must not run in parallel (allowedTokenHosts mutation)
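The present/forwarded distinction is easiest to see next to the allowlist check itself. A minimal sketch, assuming `allowedTokenHosts` is a map and the token is sent as a Bearer header; `newAssetRequest` is a hypothetical helper and the untrusted host is made up.

```go
package main

import (
	"fmt"
	"net/http"
)

// allowedTokenHosts lists the hosts named in SEC-2. The map type is an
// assumption; the PR only says the allowlist is explicit and mutable in tests.
var allowedTokenHosts = map[string]bool{
	"api.github.com":                true,
	"github.com":                    true,
	"objects.githubusercontent.com": true,
}

// newAssetRequest (hypothetical) builds a GET request and attaches the token
// only for allowlisted hosts. present mirrors github_token_present and
// forwarded mirrors github_token_forwarded.
func newAssetRequest(rawURL, token string) (req *http.Request, present, forwarded bool, err error) {
	req, err = http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, false, false, fmt.Errorf("building request: %w", err)
	}
	present = token != ""
	if present && allowedTokenHosts[req.URL.Hostname()] {
		req.Header.Set("Authorization", "Bearer "+token)
		forwarded = true
	}
	return req, present, forwarded, nil
}

func main() {
	for _, u := range []string{
		"https://objects.githubusercontent.com/kfutil.zip",
		"https://untrusted.example/kfutil.zip",
	} {
		_, present, forwarded, err := newAssetRequest(u, "example-token")
		fmt.Println(u, present, forwarded, err)
	}
}
```

Logging both booleans means an untrusted-host download shows `present=true, forwarded=false`, which is the withheld-token case the M3 item describes.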

spbsoluble added 17 commits June 19, 2026 15:35

docs: generate CLI docs for upgrade command

9823890

revert(upgrade): restore informDebug(debugFlag) pattern

b6af038


spbsoluble wants to merge 17 commits into main from feat/self-upgrade
