[fs] Support AWS S3 credentials provider mode#3540

Open
litiliu wants to merge 2 commits into apache:main from litiliu:#3493

@litiliu (Contributor) commented Jun 29, 2026

Purpose

Closes #3493.

This PR adds a server-side AWS S3 credentials provider mode for the S3 filesystem. When fs.s3a.aws.credentials.provider is explicitly configured in Fluss configuration, Fluss treats it as the authoritative server-side credential source instead of injecting the client delegated-token provider.

This is intended for deployments that use standard AWS SDK/Hadoop S3A providers such as com.amazonaws.auth.profile.ProfileCredentialsProvider, so rotated long-term credentials can be picked up by the provider without restarting Fluss servers.

The full motivation and design discussion are in #3493. This PR description keeps the reviewer-facing summary and the final implemented behavior.

Brief change log

  • Detect explicitly configured S3 credential providers from Fluss config keys:
    • s3.aws.credentials.provider
    • s3a.aws.credentials.provider
    • fs.s3a.aws.credentials.provider
  • Store an internal Hadoop configuration marker so S3DelegationTokenProvider can distinguish an explicit Fluss provider from Hadoop default resources.
  • In explicit provider mode:
    • do not inject DynamicTemporaryAWSCredentialsProvider;
    • reject provider mode combined with fs.s3a.assumed.role.arn;
    • make the explicit provider win over static AK/SK if both are configured;
    • reuse Hadoop S3A's provider loader for the token path.
  • Resolve credentials from the configured provider on each STS token request instead of freezing AK/SK in the token provider constructor.
  • Reject session credentials before calling STS, so temporary credentials are not used as input for client-token generation.
  • Preserve existing static AK/SK, AssumeRole, and delegated-token client behavior when no explicit server-side provider is configured.
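
The last few bullets (resolve on each request, reject session credentials) can be illustrated with a minimal, SDK-free sketch. Everything here is hypothetical and only mirrors the described behavior: `SketchCredentials` stands in for the AWS SDK credential types, and a plain `Supplier` stands in for the configured provider chain. It is not the PR's actual code.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for AWS credentials; sessionToken is null for long-term keys. */
final class SketchCredentials {
    final String accessKey;
    final String secretKey;
    final String sessionToken;

    SketchCredentials(String accessKey, String secretKey, String sessionToken) {
        this.accessKey = accessKey;
        this.secretKey = secretKey;
        this.sessionToken = sessionToken;
    }

    boolean isSession() {
        return sessionToken != null;
    }
}

/** Hypothetical resolver: asks the provider on every token request instead of caching keys. */
final class PerRequestCredentialResolver {
    private final Supplier<SketchCredentials> provider;

    PerRequestCredentialResolver(Supplier<SketchCredentials> provider) {
        // Only the provider is stored; no AK/SK is frozen at construction time.
        this.provider = provider;
    }

    SketchCredentials resolveForTokenRequest() {
        SketchCredentials credentials = provider.get();
        if (credentials == null) {
            throw new IllegalStateException("The configured provider returned no credentials.");
        }
        if (credentials.isSession()) {
            // Rejected before any STS call would be made.
            throw new IllegalArgumentException(
                    "Session credentials cannot be used to generate client tokens.");
        }
        return credentials;
    }
}
```

Because the provider is consulted on every call, a key rotated in the underlying source (for example a profile file) would be visible on the next token request, which is the operational point of this mode.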

Credential mode resolution:

| Static AK/SK | AssumeRole ARN | Explicit provider | Result |
|---|---|---|---|
| any | set | set | rejected |
| any | unset | set | server-side provider mode; provider wins |
| set | unset | unset | static AK/SK |
| set | set | unset | existing static AK/SK + AssumeRole behavior |
| unset | set | unset | AssumeRole |
| unset | unset | unset | delegated-token client path |
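
For reviewers who prefer code to rows, the resolution rules listed above can be restated as a small decision function. This is a sketch with hypothetical names (`CredentialMode`, `CredentialModeResolver`); only the error message is taken from the diff under review, and the real implementation may structure the checks differently.

```java
/** Hypothetical enum naming the outcomes of the resolution rules. */
enum CredentialMode {
    SERVER_SIDE_PROVIDER,
    STATIC_KEYS,
    STATIC_KEYS_WITH_ASSUME_ROLE,
    ASSUME_ROLE,
    DELEGATED_TOKEN_CLIENT
}

/** Hypothetical helper that maps the three inputs to a credential mode. */
final class CredentialModeResolver {
    private CredentialModeResolver() {}

    static CredentialMode resolve(
            boolean hasStaticKeys, boolean hasRoleArn, boolean hasExplicitProvider) {
        if (hasExplicitProvider) {
            if (hasRoleArn) {
                // Explicit provider + AssumeRole is rejected regardless of static keys.
                throw new IllegalArgumentException(
                        "AssumeRole and a custom AWS credentials provider cannot be configured together.");
            }
            // The explicit provider wins even if static AK/SK is also configured.
            return CredentialMode.SERVER_SIDE_PROVIDER;
        }
        if (hasStaticKeys) {
            return hasRoleArn
                    ? CredentialMode.STATIC_KEYS_WITH_ASSUME_ROLE
                    : CredentialMode.STATIC_KEYS;
        }
        return hasRoleArn ? CredentialMode.ASSUME_ROLE : CredentialMode.DELEGATED_TOKEN_CLIENT;
    }
}
```

Checking the explicit provider first keeps the "provider wins" and "rejected" rows in one branch, so the remaining rows describe behavior that is unchanged by this PR.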

Tests

  • mvn -pl fluss-filesystems/fluss-fs-s3 test -Dtest=S3FileSystemPluginTest,S3DelegationTokenProviderTest

Added/updated coverage for:

  • explicit provider mode does not inject DynamicTemporaryAWSCredentialsProvider;
  • static AK/SK and AssumeRole modes keep existing behavior;
  • explicit provider wins over static AK/SK;
  • explicit provider + AssumeRole is rejected;
  • explicitly configuring DynamicTemporaryAWSCredentialsProvider for server mode is rejected;
  • configured provider credentials are resolved for each token request;
  • session credentials are rejected before STS.

API and Format

No public API or storage format changes.

The PR adds an internal Hadoop configuration marker under fluss.fs.s3.aws.credentials.provider.explicitly.configured. It is not a user-facing option; it only carries whether the provider was explicitly configured through Fluss config.
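
As an illustration of how such a marker could work, the sketch below copies an explicitly configured provider into a Hadoop-style key/value map and sets the marker next to it. The three config keys and the marker key are the ones named in this description; the class name, the use of a plain `Map` instead of a Hadoop `Configuration`, and the precedence among the three keys are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; a plain Map stands in for the Hadoop configuration. */
final class ExplicitProviderMarker {
    // Key names as listed in the PR description; the lookup order is an assumption.
    static final String[] FLUSS_PROVIDER_KEYS = {
        "s3.aws.credentials.provider",
        "s3a.aws.credentials.provider",
        "fs.s3a.aws.credentials.provider"
    };
    static final String HADOOP_PROVIDER_KEY = "fs.s3a.aws.credentials.provider";
    static final String MARKER_KEY =
            "fluss.fs.s3.aws.credentials.provider.explicitly.configured";

    private ExplicitProviderMarker() {}

    /** Copies the first non-blank provider setting and records that it was explicit. */
    static Map<String, String> toHadoopConf(Map<String, String> flussConf) {
        Map<String, String> hadoopConf = new LinkedHashMap<>();
        for (String key : FLUSS_PROVIDER_KEYS) {
            String value = flussConf.get(key);
            if (value != null && !value.trim().isEmpty()) {
                hadoopConf.put(HADOOP_PROVIDER_KEY, value.trim());
                hadoopConf.put(MARKER_KEY, "true");
                break;
            }
        }
        return hadoopConf;
    }

    /** True only when the marker was set, not when a provider came from Hadoop defaults. */
    static boolean isExplicitlyConfigured(Map<String, String> hadoopConf) {
        return Boolean.parseBoolean(hadoopConf.get(MARKER_KEY));
    }
}
```

The marker matters because a provider value can also arrive from Hadoop default resources; reading the provider key alone would not tell the two cases apart.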

Documentation

No separate documentation update in this PR. The user-facing behavior and operational motivation are described in #3493.

Generative AI disclosure

  • Yes: OpenAI Codex was used to help implement and review this change, following the repository AGENTS.md guidance.

@litiliu changed the title from "Support AWS S3 credentials provider mode" to "[fs] Support AWS S3 credentials provider mode" on Jun 29, 2026

@litiliu (Contributor, Author) commented Jun 30, 2026

@fresh-borzoni could you please help review this when you have time? Thanks!

@fresh-borzoni (Member) left a comment

@litiliu Thank you for the PR. It looks good overall; I left a couple of minor comments, PTAL.

            throw new IllegalArgumentException(
                    "AssumeRole and a custom AWS credentials provider cannot be configured together.");
        }
        this.credentialProviderList =

Member

The description says configuring DynamicTemporaryAWSCredentialsProvider for server mode is "rejected", but there's no guard, so it just gets instantiated here and throws NoAwsCredentialsException at the first token request. Can you clarify?
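
For reference, a construction-time guard of the kind this comment is asking about might look roughly like the sketch below. It is an assumption about one possible fix, not code from the PR: the class and method names are hypothetical, the provider setting is treated as a comma-separated list of class names, and matching is done on the simple class name because the fully qualified name is not shown in this thread.

```java
/** Hypothetical guard; not part of the PR under review. */
final class ProviderListGuard {
    static final String DYNAMIC_PROVIDER_SIMPLE_NAME = "DynamicTemporaryAWSCredentialsProvider";

    private ProviderListGuard() {}

    /** Fails fast if the delegated-token provider is configured as a server-side provider. */
    static void rejectDelegatedTokenProvider(String providerList) {
        for (String name : providerList.split(",")) {
            String trimmed = name.trim();
            if (trimmed.equals(DYNAMIC_PROVIDER_SIMPLE_NAME)
                    || trimmed.endsWith("." + DYNAMIC_PROVIDER_SIMPLE_NAME)) {
                throw new IllegalArgumentException(
                        DYNAMIC_PROVIDER_SIMPLE_NAME
                                + " cannot be used as a server-side credentials provider.");
            }
        }
    }
}
```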

    AWSCredentialsProvider createStsCredentialsProvider() {
        if (credentialProviderList != null) {
            AWSCredentials credentials = credentialProviderList.getCredentials();
            checkArgument(

Member

This is effectively the "long-term creds only" gate, and it's the thing people will trip on: instance profiles and IRSA roles all return session creds, so they land here lazily, at token time.

Given that s3.md already has an IRSA section, can we add a short note there for this new mode: long-term creds only, not compatible with AssumeRole? Otherwise it's a bit hidden for operational usage.

import com.amazonaws.services.securitytoken.model.AssumeRoleResult;
import com.amazonaws.services.securitytoken.model.Credentials;
import com.amazonaws.services.securitytoken.model.GetSessionTokenResult;
import org.apache.commons.lang3.StringUtils;

Member

commons-lang3 is only a transitive dependency here; shall we avoid depending on org.apache.commons.lang3.StringUtils directly?

import org.apache.fluss.fs.s3.token.S3DelegationTokenProvider;
import org.apache.fluss.fs.s3.token.S3DelegationTokenReceiver;

import org.apache.commons.lang3.StringUtils;

Member

ditto

                (accessKey == null) == (secretKey == null),
                "S3 access key and secret key must both be set or both be unset.");
        if (accessKey == null) {
            if (hasCredentialProvider && roleArn != null) {

Member

Since we're adding all these checks in this PR: this one fails fast, but the session-credential and empty-credential cases only fail on the first token request.

Could we check those at construction too, so they're consistent? Not blocking, though.

Development

Successfully merging this pull request may close these issues.

[feature] Support AWS S3 credentials provider mode to refresh credentials without restarting servers

2 participants