Skip to content

RFC: AWS-LC as TLS Backend for HAProxy#1501

Open
hoffmaen wants to merge 4 commits into
cloudfoundry:mainfrom
sap-contributions:rfc-haproxy-awslc
Open

RFC: AWS-LC as TLS Backend for HAProxy#1501
hoffmaen wants to merge 4 commits into
cloudfoundry:mainfrom
sap-contributions:rfc-haproxy-awslc

Conversation

@hoffmaen

@hoffmaen hoffmaen commented May 19, 2026

Copy link
Copy Markdown
Contributor

Link to the document for quick review: rfc-draft-haproxy-awslc.md

Summary

This PR adds the RFC for AWS-LC as TLS Backend for the haproxy-boshrelease.

AWS-LC is independent of the version provided via the BOSH stemcell and can be kept up to date interdependently. The OpenSSL that is currently used via the stemcell is on an outdated version that has severe performance impacts in high-load scenarios. These would be improved in newer OpenSSL versions, but not resolved. HAProxy explicitely recommends against OpenSSL in production (haproxy/haproxy#3086).

AWS-LC alleviates these issues and is proposed to be added as an optional release. For operators nothing changes, unless they want to opt into using AWS-LC.

AWS-LC is FIPS 140-3 compliant.

@hoffmaen hoffmaen force-pushed the rfc-haproxy-awslc branch from b0c5b6d to d570aaf Compare May 19, 2026 09:18
@beyhan

beyhan commented Jun 2, 2026

Copy link
Copy Markdown
Member

Hi @hoffmaen,

thank you for this pr. Is it intentionally in draft status?

@Soha-Albaghdady Soha-Albaghdady left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, proposing some minor formatting and restructuring

Comment thread toc/rfc/rfc-draft-haproxy-awslc.md
Comment thread toc/rfc/rfc-draft-haproxy-awslc.md Outdated
Comment thread toc/rfc/rfc-draft-haproxy-awslc.md Outdated
Comment thread toc/rfc/rfc-draft-haproxy-awslc.md Outdated
Comment thread toc/rfc/rfc-draft-haproxy-awslc.md Outdated
Comment thread toc/rfc/rfc-draft-haproxy-awslc.md Outdated
Comment thread toc/rfc/rfc-draft-haproxy-awslc.md
@hoffmaen hoffmaen marked this pull request as ready for review June 11, 2026 08:50
@beyhan beyhan requested review from a team, Gerg, beyhan, cweibel, rkoster and stephanme and removed request for a team June 17, 2026 11:34
@beyhan beyhan added toc rfc CFF community RFC labels Jun 17, 2026
@beyhan beyhan moved this from Inbox to In Progress in CF Community Jun 23, 2026
@beyhan

beyhan commented Jun 30, 2026

Copy link
Copy Markdown
Member

@hoffmaen could you please ask in your WG (meeting) about feedback.

@b1tamara b1tamara left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rkoster

rkoster commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Code Review

PR #1501: RFC: AWS-LC as TLS Backend for HAProxy

Summary

This RFC proposes adding AWS-LC as an optional TLS backend for the haproxy-boshrelease to address OpenSSL 3.0.2 TLS-handshake performance and the lack of post-quantum key exchange. The problem statement is strong: it is well-evidenced (benchmarks, HAProxy upstream recommendation in haproxy#3086, FIPS 140-3 validation) and the non-breaking, opt-in framing with OpenSSL-as-default is the right compatibility posture. The core AWS-LC value proposition is convincing. The main gap is the packaging design: the RFC introduces a 7-tarball matrix (openssl, openssl-patched, awslc, awslc-patched, awslc-fips, awslc-fips-patched, multi) but only justifies 4 of the 7. The three '-patched' variants are never explained, and there is no analysis of the CI/compilation cost this matrix imposes despite it roughly doubling the build surface. There are also smaller correctness/process issues: the implementation link 404s (the branch lives on a fork as PR #925), the RFC metadata omits fields required by the template, and the document still carries Draft status with an empty RFC Pull Request field.

Critical Issues - Must Fix

  1. The 7-variant release matrix is under-justified: the three '-patched' variants have no stated rationale

    • File: toc/rfc/rfc-draft-haproxy-awslc.md
    • The Release Variants table defines seven tarballs: openssl, openssl-patched, awslc, awslc-patched, awslc-fips, awslc-fips-patched, and multi. The RFC justifies openssl (backward compat), awslc (performance), awslc-fips (FIPS compliance), and multi (migration safety net). It does NOT justify the '-patched' axis at all. 'custom HAProxy patches' appears in the table and Build Architecture sections but the RFC never states what these patches are, why they are needed, which users need them, or why they must ship as separate first-class release variants rather than being folded into the base builds or omitted. This single unexplained axis doubles the variant count from 3 to 6 and doubles the number of HAProxy binaries in the multi release from 3 to 6. Reviewers cannot evaluate the proposal's scope without knowing what problem the patched variants solve.
    • Fix: Add a subsection explaining the custom HAProxy patches: what they change, why they cannot be upstreamed or made default, and who requires them. Then justify (or drop) each patched variant explicitly. If the patches are not required for the initial rollout, consider deferring all three '-patched' variants to a follow-up and shipping only openssl / awslc / awslc-fips / multi.
  2. No analysis of CI and compilation cost for the expanded build matrix

    • File: toc/rfc/rfc-draft-haproxy-awslc.md
    • The proposal significantly expands what haproxy-boshrelease must compile and test: the multi release alone builds 6 HAProxy binaries plus aws-lc, aws-lc-fips, cmake, and a Go toolchain (confirmed against the implementation's packages-multi directory). Six additional single-variant tarballs are also produced. The RFC gives only two isolated build-time figures (~15 min OpenSSL, ~25 min AWS-LC) and no accounting of the total CI impact: how many release tarballs are built and tested per pipeline run, how compilation time and cost scale across the full matrix, what the storage/artifact footprint is, or how the test matrix (BOSH deployment tests per variant) grows. For a shared community release this recurring cost is a first-order concern, not an implementation detail.
    • Fix: Add a 'Costs' or 'CI / Build Impact' section quantifying the added compilation time, pipeline duration, artifact count/size, and test-matrix growth for the full set of variants. Use this to justify the matrix size, and explicitly weigh it against dropping the patched variants and/or the multi release.

Important Issues - Should Fix

  1. Implementation branch link is broken (404)

  2. RFC metadata omits required template fields

    • File: toc/rfc/rfc-draft-haproxy-awslc.md
    • The Meta block is missing the 'Related RFCs' field present in rfc-template.md, and the 'RFC Pull Request' field is left empty. rfc-0023-add-cf-supports-for-fips-stemcell may be relevant to the FIPS discussion and worth cross-linking under Related RFCs.
    • Suggestion: Add the 'Related RFCs' field (linking rfc-0023 if applicable, or leaving empty per template) and fill in 'RFC Pull Request' with this PR's URL, as the RFC README process step requires.
  3. Runtime binary selection via filesystem-path probing is described but its failure modes are not

    • File: toc/rfc/rfc-draft-haproxy-awslc.md
    • Runtime Selection resolves the binary by checking whether /var/vcap/packages/haproxy-<ssl_variant> exists, else falling back to /var/vcap/packages/haproxy. The RFC does not specify behaviour when ssl_variant is set to a valid value on a single-variant (non-multi) release where that package is absent: does it silently fall back to the wrong backend, or fail fast? Silent fallback could mask a misconfiguration where an operator believes they are running AWS-LC FIPS but are actually on OpenSSL.
    • Suggestion: Specify the validation/error behaviour when ssl_variant names a variant not present in the deployed release. Prefer failing fast with a clear error over silent fallback, and state this in the RFC.
  4. No testing strategy for the variants, and the reference implementation adds zero test coverage

    • File: toc/rfc/rfc-draft-haproxy-awslc.md
    • The RFC defines 7 variants with materially different TLS behaviour (no DHE, no X448, no FFDH in AWS-LC; FIPS-mode claims; PQ named groups) and a runtime ssl_variant selector, but says nothing about how any of this is verified. haproxy-boshrelease already has good coverage - rspec template specs and a Go/Ginkgo acceptance suite (https_frontend_test.go, mtls_frontend_test.go, strict_sni_test.go, tcp_frontend_test.go). The linked implementation branch (aws-lc-multi, 36 files changed) modifies NONE of these: zero *_spec.rb and zero *_test.go changes; the only acceptance-tests edits are README.md and run-local.sh. So nothing verifies that awslc negotiates the expected ciphers, that the documented DHE->ECDHE fallback actually works, that SNI/mTLS still pass on the new binaries, that the ssl_variant switch selects the right binary, or that FIPS is genuinely active. These are the exact claims the RFC relies on.
    • Suggestion: Add a Testing section stating which variants receive automated acceptance coverage and what it asserts (cipher/curve negotiation, DHE->ECDHE fallback, SNI + mTLS regression, ssl_variant selection resolves the correct binary, FIPS-active assertion for the FIPS variants), and how that test matrix scales in the pipeline. The implementation PR should extend the existing rspec/acceptance suites accordingly.
  5. cf-deployment integration and testing expectation is undefined

    • File: toc/rfc/rfc-draft-haproxy-awslc.md
    • The only cf-deployment reference is the line-18 disclaimer that this is not about the TCP Router. haproxy-boshrelease is not in the cf-deployment base manifest; it is opt-in via operations/use-haproxy.yml, which pins a single haproxy release version and adds one haproxy instance group. The RFC does not state whether the AWS-LC variants are expected to be exercised at the cf-deployment level - e.g. via a companion ops-file to use-haproxy.yml or an ssl_variant wiring - nor who owns that integration. Without this, it is unclear how an operator using the standard cf-deployment ops-file path would adopt or validate AWS-LC end to end.
    • Suggestion: State the cf-deployment expectation explicitly: either add/describe a companion ops-file (e.g. use-haproxy-awslc.yml) naming which variant it wires and who maintains it, or declare cf-deployment integration testing out of scope with a rationale and point operators at the haproxy-boshrelease-level path instead.

Minor Issues

  1. Filename still uses the rfc-draft- prefix in toc/rfc/rfc-draft-haproxy-awslc.md
  2. Status is Draft while the PR is out of draft and under active review in toc/rfc/rfc-draft-haproxy-awslc.md

Positive Notes

  • Strong, well-evidenced problem statement: benchmark tables for both non-FIPS and FIPS, the HAProxy upstream recommendation (haproxy#3086), and FIPS 140-3 validation together make a compelling case.
  • Correct compatibility posture: OpenSSL remains the default and AWS-LC is strictly opt-in, so existing deployments see no change.
  • The cipher/curve differences section (no DHE, no X448, no FFDH) is thorough and honestly addresses the main client-compatibility risk, correctly noting TLS 1.2-minimum policy filters out affected legacy clients.
  • Including a concrete, phased, reversible migration path with realistic rollback timing (~2 min property switch) shows good operational thinking.
  • Alternatives Considered addresses both other TLS libraries (WolfSSL, BoringSSL) and other proxies (Envoy, nginx) with clear reasons for rejection.

Recommendation

Request changes - critical issues need to be addressed before merging.

@rkoster rkoster left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes - see the detailed review comment above. Main blockers: the -patched variant axis is unjustified and there is no CI/build-cost analysis for the expanded matrix. Important: broken implementation link, missing testing strategy (impl branch changes no tests), and undefined cf-deployment integration.

- [AWS-LC GitHub](https://github.com/aws/aws-lc)
- [AWS-LC FIPS 140-3 certificate #4816](https://csrc.nist.gov/projects/cryptographic-module-validation-program/certificate/4816)
- [oha load testing tool](https://github.com/hatoo/oha)
- [Implementation branch](https://github.com/cloudfoundry/haproxy-boshrelease/tree/aws-lc-multi)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation link 404s - the aws-lc-multi branch does not exist in cloudfoundry/haproxy-boshrelease. The code lives on the fork sap-contributions/haproxy-boshrelease (branch aws-lc-multi) and is proposed upstream as PR #925. Please update this link so reviewers can reach the implementation.

| `awslc` | AWS-LC | High-performance, non-FIPS |
| `awslc-patched` | AWS-LC | With custom HAProxy patches |
| `awslc-fips` | AWS-LC FIPS 3.3.0 | FIPS 140-3 validated |
| `awslc-fips-patched` | AWS-LC FIPS | FIPS with custom HAProxy patches |

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The '-patched' axis (openssl-patched, awslc-patched, awslc-fips-patched) is introduced here but never explained anywhere in the RFC. What are the 'custom HAProxy patches', why are they needed, and who needs them? This axis doubles the variant count from 3 to 6 (and doubles the multi release's HAProxy binaries), which is the main driver of the added build/CI cost. Please justify each patched variant explicitly, or defer them to a follow-up if they are not required for the initial rollout.

aws-lc-fips → AWS-LC FIPS library (requires Go toolchain for delocate)
haproxy-openssl → HAProxy binary linked to system OpenSSL
haproxy-awslc → HAProxy binary linked to AWS-LC
haproxy-awslc-fips → HAProxy binary linked to AWS-LC FIPS

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This build architecture makes the CI cost concrete: the multi release compiles 6 HAProxy binaries plus aws-lc, aws-lc-fips, cmake, and a Go toolchain, and six additional single-variant tarballs are produced on top. The RFC gives only ~15 min / ~25 min per-variant figures but never totals the pipeline time, artifact count/size, or per-variant BOSH test-matrix growth. For a shared community release this recurring cost is a first-order design concern. Please add a Costs / CI-impact section quantifying it and use it to justify the matrix size.

The job wrapper resolves the binary path at startup:

1. If `/var/vcap/packages/haproxy-<ssl_variant>` exists → use variant binary (multi release)
2. Otherwise → fall back to `/var/vcap/packages/haproxy` (single-variant release)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens on a single-variant release if an operator sets ha_proxy.ssl_variant to a valid-but-absent variant? As written, the fallback to /var/vcap/packages/haproxy could silently run a different TLS backend than the operator intended (e.g. they believe they are on awslc-fips but are actually on openssl). Recommend failing fast with a clear error instead of silent fallback, and stating that behaviour here.


In practice this has no client impact: DHE is rarely required by modern clients, and ECDHE provides equivalent or better security with superior performance. Existing cipher configurations that include both DHE and ECDHE ciphers continue to work — HAProxy silently skips DHE entries that AWS-LC cannot fulfill, and clients negotiate ECDHE instead.

**Compatibility note**: A client would only break on AWS-LC if it supports **DHE but not ECDHE**. Such stacks predate ~2010 and also predate TLS 1.2. Under a TLS 1.2-minimum policy (already enforced in this deployment), they are filtered out at the protocol layer regardless of cipher selection — every client that can negotiate TLS 1.2 also supports ECDHE. Clients that currently happen to negotiate DHE will transparently fall back to ECDHE with no visible impact.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These behavioural claims (DHE absent, clients transparently fall back to ECDHE, existing mixed cipher configs keep working) are exactly the kind of thing that needs an automated test - but the implementation branch (aws-lc-multi) changes no acceptance or rspec tests at all (the existing suite has https_frontend_test.go, mtls_frontend_test.go, strict_sni_test.go). Please add a Testing section covering which variants get automated coverage and what it asserts: cipher/curve negotiation, DHE->ECDHE fallback, SNI + mTLS regression, ssl_variant selecting the correct binary, and a FIPS-active assertion for the FIPS variants. Separately, please clarify the cf-deployment expectation - haproxy is opt-in there via operations/use-haproxy.yml (one pinned release), so state whether a companion ops-file / ssl_variant wiring is planned or whether cf-deployment integration is out of scope.

- Author(s): @hoffmaen
- Status: Draft
- RFC Pull Request:
- Affected Component(s): haproxy-boshrelease

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two metadata gaps vs rfc-template.md: the 'Related RFCs' field is missing, and 'RFC Pull Request' is empty. The RFC README process asks you to fill in the PR link. Consider cross-linking rfc-0023-add-cf-supports-for-fips-stemcell under Related RFCs given the FIPS overlap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rfc CFF community RFC toc

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

5 participants