Skip to content

NO-ISSUE: Add opt-in kube-rbac-proxy sidecar for pprof endpoint (AWS)#1502

Open
perdasilva wants to merge 1 commit into
openshift:mainfrom
perdasilva:fix/pprof-container-port
Open

NO-ISSUE: Add opt-in kube-rbac-proxy sidecar for pprof endpoint (AWS)#1502
perdasilva wants to merge 1 commit into
openshift:mainfrom
perdasilva:fix/pprof-container-port

Conversation

@perdasilva

@perdasilva perdasilva commented May 28, 2026

Copy link
Copy Markdown

Description

Add an --enable-pprof flag to machine-api-operator that, when set, configures the AWS machine controller with a pprof profiling endpoint and a kube-rbac-proxy sidecar to securely expose it.

When --enable-pprof is passed to MAO and the platform is AWS:

  • The machine controller receives --enable-pprof and --pprof-bind-address=127.0.0.1:6061 to serve plaintext pprof on loopback
  • A kube-rbac-proxy-pprof sidecar is added, listening on port 6060 with centralized TLS, proxying to the upstream pprof server

The flag is off by default and only meaningful on AWS (the only platform whose machine controller supports pprof). On all other platforms, the flag is accepted but has no effect.

Changes

  • cmd/machine-api-operator/start.go — new --enable-pprof CLI flag (default: false, help: "AWS only"), passed through to operator.New
  • pkg/operator/config.goEnablePprof field added to OperatorConfig
  • pkg/operator/operator.goenablePprof field on Operator struct, threaded through New()OperatorConfig
  • pkg/operator/sync.go — when EnablePprof && PlatformType == AWS: adds --enable-pprof + --pprof-bind-address args to machine controller, adds kube-rbac-proxy-pprof sidecar
  • pkg/operator/sync_test.go — new test case verifying pprof proxy + args when enabled; existing AWS tests verify pprof is absent by default

Dependencies

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added a --enable-pprof CLI flag to optionally enable a profiling endpoint for the machine controller (AWS-only), exposed via an added proxy container and bound to a local upstream.
  • Tests
    • Extended pod template and proxy container argument tests to validate both AWS pprof enabled and disabled scenarios.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 28, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@perdasilva: This pull request explicitly references no jira issue.

Details

In response to this:

Add an informational containerPort (6060, named "pprof") to the machine-controller container spec so that the pprof endpoint added in machine-api-provider-aws is discoverable.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown

Walkthrough

Adds conditional pprof profiling endpoint to the machine-controller deployment. When the --enable-pprof CLI flag is enabled and the platform is AWS, the operator extends machine-controller arguments and wires a kube-rbac-proxy container that exposes the profiling endpoint on port 6060, forwarding to the machine-controller's upstream port 6061.

Changes

pprof Endpoint Exposure on AWS

Layer / File(s) Summary
Configuration flag and operator initialization
cmd/machine-api-operator/start.go, pkg/operator/config.go, pkg/operator/operator.go
CLI flag --enable-pprof is registered, stored in OperatorConfig.EnablePprof, and threaded through the Operator constructor as an enablePprof field for use during reconciliation.
Pod template pprof proxy container setup
pkg/operator/sync.go
Package constants define pprof ports (6060 for proxy exposure, 6061 for upstream bind). Pod template construction conditionally enables pprof on AWS by extending machine-controller args with --enable-pprof and binding address, updating newKubeProxyContainers to accept a withPprofProxy flag, and appending a pprof kube-rbac-proxy container that forwards from 6060 to the upstream port.
Test coverage for pprof enablement
pkg/operator/sync_test.go
Test invocations updated to match new function signature. AWS pprof test case added to validate kube-rbac-proxy-pprof container presence and --enable-pprof flag in machine-controller args when enabled, with assertions for their absence when disabled.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (14 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main change: adding an opt-in kube-rbac-proxy sidecar for pprof endpoint with AWS support, which is the primary objective across all modified files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test names in pkg/operator/sync_test.go are stable and deterministic. The new pprof test case and all 26 other test names contain only static, descriptive strings with no dynamic values, timest...
Test Structure And Quality ✅ Passed The test file uses standard Go testing (testing.T) with table-driven patterns and Gomega assertions, not Ginkgo. The custom check is specifically for "Ginkgo test code" and does not apply to these...
Microshift Test Compatibility ✅ Passed PR does not add any Ginkgo e2e tests. Only standard Go unit tests are modified (TestXxx functions in sync_test.go), which are not subject to the MicroShift compatibility check.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No new Ginkgo e2e tests added. The PR modifies pkg/operator/sync_test.go (standard Go unit tests with testing.T), not e2e tests. Check not applicable.
Topology-Aware Scheduling Compatibility ✅ Passed PR makes only additive changes to pprof container configuration. It does not introduce or modify any scheduling constraints (affinity, topology spreads, node selectors, PDBs, replicas, or toleratio...
Ote Binary Stdout Contract ✅ Passed All logging uses klog configured to stderr via logtostderr flag and logs.InitLogs(). No fmt.Print*/os.Stdout writes found in main(), init(), or process-level code.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed Tests use IPv6-compatible functions (net.ParseIP/ParseCIDR/JoinHostPort), have no hardcoded IPv4, and are platform-specific (VSphere/Nutanix) so don't apply to IPv6-only disconnected environments.
No-Weak-Crypto ✅ Passed PR adds pprof container port configuration without introducing any weak cryptography. No MD5, SHA1, DES, RC4, 3DES, Blowfish, or ECB usage found; only standard crypto/tls imports used; no insecure...
Container-Privileges ✅ Passed No privileged container settings (privileged, hostPID, hostNetwork, hostIPC, SYS_ADMIN, allowPrivilegeEscalation) are introduced. The PR adds pprof support via a kube-rbac-proxy sidecar with standa...
No-Sensitive-Data-In-Logs ✅ Passed No logging of sensitive data (passwords, tokens, API keys, PII, session IDs, internal hostnames, or customer data) found. The PR adds pprof configuration with localhost binding (127.0.0.1:6061); po...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from damdo and nrb May 28, 2026 09:17
@openshift-ci

openshift-ci Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign radekmanak for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/operator/sync.go (1)

805-808: LGTM!

Operational note: If this pprof port is later exposed via a Kubernetes Service, ensure proper access controls are in place. Profiling endpoints can expose sensitive runtime information (memory dumps, goroutine stacks, CPU profiles). Consider restricting access to cluster-internal traffic only or implementing authentication/authorization mechanisms.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/operator/sync.go` around lines 805 - 808, The pprof container port entry
(Name: "pprof", ContainerPort: pprofPort) exposes a profiling endpoint that must
be protected if you create a Kubernetes Service for it; update any
Service/Ingress that exposes this port to use ClusterIP (no external exposure),
add NetworkPolicy rules to restrict traffic to trusted pods/namespaces, and/or
place an auth-proxy or mTLS ingress in front of the endpoint so only
authenticated users can reach it; locate the pprof port definition (Name
"pprof", variable pprofPort) and ensure associated Service/NetworkPolicy/Ingress
manifests or code paths enforce these access controls before exposing the port.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/operator/sync.go`:
- Around line 805-808: The pprof container port entry (Name: "pprof",
ContainerPort: pprofPort) exposes a profiling endpoint that must be protected if
you create a Kubernetes Service for it; update any Service/Ingress that exposes
this port to use ClusterIP (no external exposure), add NetworkPolicy rules to
restrict traffic to trusted pods/namespaces, and/or place an auth-proxy or mTLS
ingress in front of the endpoint so only authenticated users can reach it;
locate the pprof port definition (Name "pprof", variable pprofPort) and ensure
associated Service/NetworkPolicy/Ingress manifests or code paths enforce these
access controls before exposing the port.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a34fb9da-94e1-4760-bdb0-50dea8a56586

📥 Commits

Reviewing files that changed from the base of the PR and between d7772c6 and 31f027b.

📒 Files selected for processing (1)
  • pkg/operator/sync.go

@perdasilva perdasilva marked this pull request as draft May 28, 2026 09:57
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 28, 2026
@perdasilva

Copy link
Copy Markdown
Author

/retest

@perdasilva

Copy link
Copy Markdown
Author

/retest

@perdasilva

Copy link
Copy Markdown
Author

/unhold

@perdasilva perdasilva marked this pull request as ready for review June 10, 2026 09:44
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 10, 2026
@openshift-ci openshift-ci Bot requested a review from theobarberbany June 10, 2026 09:44
@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

@perdasilva: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment thread pkg/operator/sync.go Outdated
Comment on lines +48 to +808
@@ -801,6 +802,10 @@ func newContainers(config *OperatorConfig, features map[string]bool, tlsArgs []s
Name: "healthz",
ContainerPort: defaultMachineHealthPort,
},
{
Name: "pprof",
ContainerPort: pprofPort,
},

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on where we go with: https://github.com/openshift/machine-api-provider-aws/pull/189/changes#r3434353653, this may need changing

Add an --enable-pprof flag to machine-api-operator that, when set,
configures the AWS machine controller with a pprof profiling endpoint
and a kube-rbac-proxy sidecar to securely expose it.

When --enable-pprof is passed to MAO and the platform is AWS:
- The machine controller receives --enable-pprof and
  --pprof-bind-address=127.0.0.1:6061 to serve plaintext pprof on
  loopback
- A kube-rbac-proxy-pprof sidecar is added, listening on port 6060
  with centralized TLS, proxying to the upstream pprof server

The flag is off by default and only meaningful on AWS (the only
platform whose machine controller supports pprof). On all other
platforms, the flag is accepted but has no effect.

Signed-off-by: Per G. da Silva <pegoncal@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@perdasilva perdasilva force-pushed the fix/pprof-container-port branch from 31f027b to 641d58e Compare June 18, 2026 09:12
@perdasilva perdasilva changed the title NO-ISSUE: Declare pprof containerPort for machine-controller NO-ISSUE: Add opt-in kube-rbac-proxy sidecar for pprof endpoint (AWS) Jun 18, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/operator/sync_test.go (1)

666-666: ⚡ Quick win

Cover the new withPprofProxy branch in this unit test.

TestNewKubeProxyContainers always passes false for the new parameter, so the pprof proxy path is not directly validated here.

Proposed diff
 func TestNewKubeProxyContainers(t *testing.T) {
 	testCases := []struct {
 		name                       string
 		image                      string
 		withMHCProxy               bool
+		withPprofProxy             bool
 		tlsProfile                 configv1.TLSProfileSpec
 		expectedCipherSuitesInArgs bool
 		expectedPorts              map[string]int32
 	}{
 		{
 			name:         "TLS 1.2 Intermediate profile with cipher suites",
 			image:        "test-image:latest",
 			withMHCProxy: true,
+			withPprofProxy: false,
 			tlsProfile: configv1.TLSProfileSpec{
 				Ciphers: []string{
 					"TLS_AES_128_GCM_SHA256",
@@
 			expectedPorts: map[string]int32{
 				"kube-rbac-proxy-machineset-mtrc": machineSetExposeMetricsPort,
 				"kube-rbac-proxy-machine-mtrc":    machineExposeMetricsPort,
 				"kube-rbac-proxy-mhc-mtrc":        machineHealthCheckExposeMetricsPort,
 			},
 		},
+		{
+			name:           "TLS 1.2 Intermediate profile with pprof proxy",
+			image:          "test-image:latest",
+			withMHCProxy:   false,
+			withPprofProxy: true,
+			tlsProfile: configv1.TLSProfileSpec{
+				Ciphers: []string{
+					"ECDHE-ECDSA-AES128-GCM-SHA256",
+					"ECDHE-RSA-AES128-GCM-SHA256",
+				},
+				MinTLSVersion: configv1.VersionTLS12,
+			},
+			expectedCipherSuitesInArgs: true,
+			expectedPorts: map[string]int32{
+				"kube-rbac-proxy-machineset-mtrc": machineSetExposeMetricsPort,
+				"kube-rbac-proxy-machine-mtrc":    machineExposeMetricsPort,
+				"kube-rbac-proxy-pprof":           pprofExposePort,
+			},
+		},
@@
-			containers := newKubeProxyContainers(tc.image, tc.withMHCProxy, false, getTLSArgs(tc.tlsProfile))
+			containers := newKubeProxyContainers(tc.image, tc.withMHCProxy, tc.withPprofProxy, getTLSArgs(tc.tlsProfile))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/operator/sync_test.go` at line 666, The test `TestNewKubeProxyContainers`
currently passes `false` for the `withPprofProxy` parameter in the
`newKubeProxyContainers` function call, which means the code path for pprof
proxy is never tested. Add test coverage for the `withPprofProxy` parameter by
either adding additional test cases with `withPprofProxy` set to `true` or
modifying the existing test structure to validate behavior in both the true and
false scenarios for this parameter.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/operator/sync_test.go`:
- Line 666: The test `TestNewKubeProxyContainers` currently passes `false` for
the `withPprofProxy` parameter in the `newKubeProxyContainers` function call,
which means the code path for pprof proxy is never tested. Add test coverage for
the `withPprofProxy` parameter by either adding additional test cases with
`withPprofProxy` set to `true` or modifying the existing test structure to
validate behavior in both the true and false scenarios for this parameter.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a2ba5be1-b66f-4f34-911c-9cc18a0bc7e0

📥 Commits

Reviewing files that changed from the base of the PR and between 31f027b and 641d58e.

📒 Files selected for processing (5)
  • cmd/machine-api-operator/start.go
  • pkg/operator/config.go
  • pkg/operator/operator.go
  • pkg/operator/sync.go
  • pkg/operator/sync_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants