OCPEDGE-2746: Add MutableTopology feature gated infra spec.controlPlaneTopology#2891

Open
jeff-roche wants to merge 3 commits into
openshift:masterfrom
jeff-roche:controlPlaneTopologySpec

Conversation

@jeff-roche

Uses the MutableTopology feature gate to enable spec.controlPlaneTopology on the Infrastructure resource, allowing the cluster topology to be set to HighlyAvailable or SingleReplica.

CRD manifests are now split per cluster profile (Hypershift, SelfManagedHA) so the field is only present in the appropriate profile/feature-set combinations.

Includes integration tests verifying:

  • Accepted values when the gate is enabled (MutableTopology.yaml)
  • Field pruning when the gate is disabled (AAA_ungated.yaml)

Implements API changes needed for enhancements#2008
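
As a rough illustration of the pruning case listed above, here is a minimal Go sketch. It is not code from this PR: `gatedSpec`, `ungatedSpec`, and `roundTrip` are hypothetical names, and plain `encoding/json` decoding only stands in for the API server's schema-based pruning.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gatedSpec is a hypothetical stand-in for the Infrastructure spec schema
// when MutableTopology is enabled: the field is part of the schema.
type gatedSpec struct {
	ControlPlaneTopology string `json:"controlPlaneTopology,omitempty"`
}

// ungatedSpec is a hypothetical stand-in for the schema when the gate is
// disabled: the field does not exist at all.
type ungatedSpec struct{}

// roundTrip decodes raw JSON into schema and encodes it again. encoding/json
// drops unknown fields, which loosely mimics pruning of unknown fields.
func roundTrip(raw string, schema interface{}) (string, error) {
	if err := json.Unmarshal([]byte(raw), schema); err != nil {
		return "", err
	}
	out, err := json.Marshal(schema)
	return string(out), err
}

func main() {
	const raw = `{"controlPlaneTopology":"SingleReplica"}`

	kept, err := roundTrip(raw, &gatedSpec{})
	if err != nil {
		panic(err)
	}
	pruned, err := roundTrip(raw, &ungatedSpec{})
	if err != nil {
		panic(err)
	}

	fmt.Println(kept)   // {"controlPlaneTopology":"SingleReplica"}
	fmt.Println(pruned) // {}
}
```

The real tests exercise the generated CRD manifests rather than Go structs; the sketch only shows the shape of the expectation, namely that the same input keeps the field under one schema and loses it under the other.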

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 15, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 15, 2026

@jeff-roche: This pull request references OCPEDGE-2746 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented Jun 15, 2026

Contributor

Hello @jeff-roche! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 4e9250df-165a-4599-810b-58a0006b72d0

📥 Commits

Reviewing files that changed from the base of the PR and between eaf0036 and 4dee825.

⛔ Files ignored due to path filters (8)
  • config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_infrastructures-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_infrastructures-SelfManagedHA-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1/zz_generated.crd-manifests/0000_10_config-operator_01_infrastructures-SelfManagedHA-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1/zz_generated.featuregated-crd-manifests/infrastructures.config.openshift.io/MutableTopology.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • machineconfiguration/v1/zz_generated.crd-manifests/0000_80_machine-config_01_controllerconfigs-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • machineconfiguration/v1/zz_generated.crd-manifests/0000_80_machine-config_01_controllerconfigs-SelfManagedHA-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • machineconfiguration/v1/zz_generated.crd-manifests/0000_80_machine-config_01_controllerconfigs-SelfManagedHA-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • machineconfiguration/v1/zz_generated.featuregated-crd-manifests/controllerconfigs.machineconfiguration.openshift.io/MutableTopology.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
📒 Files selected for processing (8)
  • config/v1/types_infrastructure.go
  • config/v1/types_infrastructure_test.go
  • payload-manifests/crds/0000_10_config-operator_01_infrastructures-Hypershift-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_infrastructures-SelfManagedHA-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_infrastructures-SelfManagedHA-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_80_machine-config_01_controllerconfigs-Hypershift-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_80_machine-config_01_controllerconfigs-SelfManagedHA-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_80_machine-config_01_controllerconfigs-SelfManagedHA-DevPreviewNoUpgrade.crd.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • config/v1/types_infrastructure.go

📝 Walkthrough

Walkthrough

The PR adds an optional ControlPlaneTopology field to InfrastructureSpec, gated by the MutableTopology OpenShift feature gate with kubebuilder enum validation restricting values to HighlyAvailable and SingleReplica. A new Infrastructure CRD test suite validates field creation with omitted/allowed values, update transitions between allowed values, and negative cases for unsupported values. A ControllerConfig test fixture confirms downstream consumption with spec set to SingleReplica and status defaulting to HighlyAvailable. CRD payload manifests have annotations removed for self-managed-high-availability and ibm-cloud-managed inclusion, and feature-set values are updated from CustomNoUpgrade to TechPreviewNoUpgrade in applicable variants. The test configuration path is updated to reference the new SelfManagedHA variant. Existing ungated test expectations are reformatted to use single-quoted YAML scalar strings for consistency.

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding a MutableTopology feature-gated spec.controlPlaneTopology field to Infrastructure, which aligns with the primary modifications in the changeset. |
| Description check | ✅ Passed | The description accurately relates to the changeset, explaining the MutableTopology feature gate, the new spec.controlPlaneTopology field, profile-specific CRD splits, and included integration tests that match the actual changes made. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR contains no Ginkgo tests and all test names in declarative YAML configs are stable/deterministic with no dynamic information (UUIDs, timestamps, pod/node names, IPs, or generated suffixes). |
| Test Structure And Quality | ✅ Passed | This PR adds no Ginkgo tests. The only new test file (types_infrastructure_test.go) is a standard Go test using the testing package with t.Fatalf assertions, not Ginkgo. The custom check is not app... |
| Microshift Test Compatibility | ✅ Passed | PR does not add any new Ginkgo e2e tests (using It(), Describe(), etc.). Changes consist only of YAML-based CRD validation tests, API type definitions, and CRD manifests. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added in this PR. The changes include only CRD validation test configurations (YAML files) and a Go unit test, neither of which use Ginkgo framework. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds infrastructure API support for topology-aware scheduling via the MutableTopology feature gate, enabling ControlPlaneTopology field. No deployment manifests, operator code, or scheduling con... |
| Ote Binary Stdout Contract | ✅ Passed | PR adds only field definitions, test constants, and YAML test files. No stdout writes in process-level code detected; logging properly uses GinkgoWriter. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added. The PR adds YAML test specifications for API validation that are declarative, run in local test environment, contain no IPv4 assumptions, and require no external con... |
| No-Weak-Crypto | ✅ Passed | No weak cryptographic algorithms (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementations, or non-constant-time secret comparisons found in this PR. Changes are limited to topology... |
| Container-Privileges | ✅ Passed | No container privilege escalation settings (privileged, hostPID/IPC/Network, SYS_ADMIN, allowPrivilegeEscalation) found. PR contains API type definitions, CRD schemas, and test fixtures, not contain... |
| No-Sensitive-Data-In-Logs | ✅ Passed | The PR contains no logging statements or code that outputs sensitive data. The code changes are limited to: (1) Adding a new ControlPlaneTopology field to the Infrastructure CRD type definition w... |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Error: build linters: unable to load custom analyzer "kubeapilinter": tools/_output/bin/kube-api-linter.so, plugin: not implemented
The command is terminated due to an error: build linters: unable to load custom analyzer "kubeapilinter": tools/_output/bin/kube-api-linter.so, plugin: not implemented


@openshift-ci openshift-ci Bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 15, 2026
@openshift-ci openshift-ci Bot requested review from JoelSpeed and jkyros June 15, 2026 18:03
@openshift-ci

openshift-ci Bot commented Jun 15, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jeff-roche

Author

The linter failure doesn't make much sense to me; an enum should be a better validator than a string length limit.

Comment thread config/v1/types_infrastructure.go
Comment thread config/v1/tests/infrastructures.config.openshift.io/AAA_ungated.yaml Outdated
Comment on lines +64 to +65
// +openshift:validation:FeatureGateAwareEnum:featureGate="",enum=
// +openshift:validation:FeatureGateAwareEnum:featureGate=MutableTopology,enum=HighlyAvailable;SingleReplica

Contributor

Since the field is gated, you don't need to gate the validation

Suggested change
- // +openshift:validation:FeatureGateAwareEnum:featureGate="",enum=
- // +openshift:validation:FeatureGateAwareEnum:featureGate=MutableTopology,enum=HighlyAvailable;SingleReplica
+ // +kubebuilder:validation:enum=HighlyAvailable;SingleReplica

Author

Switching to kubebuilder validation appears to be less secure, according to the integration tests. The feature-gate-aware enum successfully blocks invalid values in the integration tests, whereas the standard kubebuilder validation does not.

Contributor

Ahh, my bad. That's because the marker is case-sensitive: it needs Enum, not enum, in that final segment.
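
For reference, a minimal Go sketch of the field with the corrected marker casing. The struct is trimmed to a single field, `TopologyMode` is declared locally as a stand-in for the real config/v1 type, and `allowedByEnum` is a hypothetical helper that only mirrors what the generated schema enum is expected to accept; none of this is copied from the PR.

```go
package main

import "fmt"

// TopologyMode is a local stand-in for the config/v1 type (assumption).
type TopologyMode string

// InfrastructureSpec is trimmed to the one field under discussion.
type InfrastructureSpec struct {
	// Note the capital E: marker names are case-sensitive, so a lowercase
	// "enum" segment does not produce the intended validation.
	// +kubebuilder:validation:Enum=HighlyAvailable;SingleReplica
	// +optional
	ControlPlaneTopology TopologyMode `json:"controlPlaneTopology,omitempty"`
}

// allowedByEnum mirrors, in plain Go, the values the enum should accept.
// An empty value models the optional field being omitted.
func allowedByEnum(m TopologyMode) bool {
	switch m {
	case "", "HighlyAvailable", "SingleReplica":
		return true
	}
	return false
}

func main() {
	for _, m := range []TopologyMode{"", "HighlyAvailable", "SingleReplica", "highlyavailable"} {
		fmt.Printf("%q allowed: %v\n", m, allowedByEnum(m))
	}
}
```

This matches what the integration tests observed in the thread: with the lowercase segment, invalid values were not rejected.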

Comment thread config/v1/types_infrastructure.go Outdated
Comment on lines +59 to +62
// controlPlaneTopology expresses the desired topology configuration for control nodes.
// The 'HighlyAvailable' mode represents a "normal", 3 control node cluster.
// The 'SingleReplica' mode represents configuration where there is a single control node.
// If left blank, no change is required and no transitions will be triggered.

Contributor

I'd prefer to explain here what happens when you set the different values. At the moment you're kind of defining the meaning but not saying what the observable changes are.

We also need to think about valid transitions. For example, once the spec is HighlyAvailable, and the status is also HighlyAvailable, moving spec back to SingleReplica wouldn't be supported and we should prevent that with a validation and explain it here

Author

Where would you suggest is the best place to prevent that? The current plan for the new CCO controller was to only react to SNO->HA for now. Do you think we need some sort of CEL validation to prevent changing from HA->SNO if status is HA?

Contributor

Yes, you can add a CEL rule to the infrastructure object to prevent the spec change if status is HA. I don't think that should be super complex.
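
A minimal sketch of the transition rule discussed in this thread, modelled in plain Go. `transitionAllowed` is a hypothetical helper, and the CEL marker in the comment is an untested guess at how the rule might be written on the Infrastructure type; neither is taken from the PR.

```go
package main

import "fmt"

// A CEL marker on the Infrastructure type might look roughly like this
// (untested assumption, shown only to make the idea concrete):
//
//   +kubebuilder:validation:XValidation:rule="!has(self.spec.controlPlaneTopology) || !has(self.status) || !has(self.status.controlPlaneTopology) || self.status.controlPlaneTopology != 'HighlyAvailable' || self.spec.controlPlaneTopology != 'SingleReplica'",message="spec.controlPlaneTopology cannot be SingleReplica once status is HighlyAvailable"

// transitionAllowed models the same rule: once status reports
// HighlyAvailable, spec may not request SingleReplica. An empty spec value
// models the field being omitted, which is always allowed.
func transitionAllowed(spec, status string) bool {
	return !(status == "HighlyAvailable" && spec == "SingleReplica")
}

func main() {
	cases := []struct{ spec, status string }{
		{"HighlyAvailable", "SingleReplica"}, // SNO -> HA
		{"SingleReplica", "HighlyAvailable"}, // HA -> SNO
		{"", "HighlyAvailable"},              // spec omitted
	}
	for _, c := range cases {
		fmt.Printf("spec=%q status=%q allowed=%v\n", c.spec, c.status, transitionAllowed(c.spec, c.status))
	}
}
```

Because the rule compares spec against status, it would have to sit on the object rather than on the field, which is in line with the suggestion to add it to the infrastructure object.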

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
config/v1/types_infrastructure.go (1)

59-66: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Enhance field documentation to explain observable behavior and add transition validation.

The field documentation needs improvement in three areas:

  1. Observable behavior: The comment defines what each mode represents but doesn't explain what observable changes occur when setting these values. For example, the status field (lines 108-115) explains that operators "should not configure the operand for highly-available operation" in SingleReplica mode. The spec field should similarly explain what happens when you request each topology.

  2. Valid transitions: A past review comment indicates that certain transitions (e.g., HighlyAvailable → SingleReplica after status is set) may not be supported. If transition constraints exist, they must be:

    • Documented in the field comment
    • Enforced with +kubebuilder:validation:XValidation CEL rules
  3. Relationship to status: The comment should clarify how spec.controlPlaneTopology relates to status.controlPlaneTopology, especially regarding the behavior when the spec field is omitted.

As per coding guidelines, field relationships or constraints must be enforced with XValidation rules using CEL expressions, and all validation markers must be fully documented in field comments.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/v1/types_infrastructure.go` around lines 59 - 66, The
controlPlaneTopology field comment needs enhancement in three areas: add
documentation explaining what observable changes occur when each mode is set
(e.g., what happens with HighlyAvailable vs SingleReplica), document any valid
topology transitions and add kubebuilder:validation:XValidation CEL rules to
enforce transition constraints (such as preventing HighlyAvailable to
SingleReplica transitions after status is set), and clarify in the comment how
the spec.controlPlaneTopology field relates to the status.controlPlaneTopology
field including the behavior when the spec field is omitted. Update the comment
block above the controlPlaneTopology field definition and add appropriate
XValidation markers with CEL expressions to enforce any documented transition
rules.

Source: Coding guidelines

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: f6a425df-2096-4cea-b08f-b935c6be042c

📥 Commits

Reviewing files that changed from the base of the PR and between 4fa8a9b and 9d6450a.

📒 Files selected for processing (1)
  • config/v1/types_infrastructure.go

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 17, 2026
Introduce the MutableTopology feature gate which enables
spec.controlPlaneTopology on the Infrastructure resource, allowing
cluster topology to be set to HighlyAvailable or SingleReplica.

CRD manifests are now split per cluster profile (Hypershift,
SelfManagedHA) so the field is only present in the appropriate
profile/feature-set combinations.

Includes integration tests verifying:
- Accepted values when the gate is enabled (MutableTopology.yaml)
- Field pruning when the gate is disabled (AAA_ungated.yaml)

Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Jeff Roche <jeroche@redhat.com>
@jeff-roche jeff-roche force-pushed the controlPlaneTopologySpec branch from 6bc4bb9 to eaf0036 Compare June 17, 2026 15:46
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 17, 2026
Signed-off-by: Jeff Roche <jeroche@redhat.com>
@openshift-ci

openshift-ci Bot commented Jun 17, 2026

Contributor

@jeff-roche: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/lint | 4dee825 | link | true | /test lint |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

3 participants