feat: Add S3 archive FileIO support #8143

Open
Shekharrajak wants to merge 6 commits into apache:master from Shekharrajak:feature/paimon-5510-s3-archive

Conversation

@Shekharrajak

Contributor

Ref #5510 (comment)

Purpose

Implements S3-backed archive, restore, and unarchive operations for Paimon FileIO by mapping StorageType to S3 storage classes and issuing same-key S3 copy/restore requests.

Follow-up PRs will add OSS and the other supported object stores; every unsupported backend will throw an unsupported-operation exception.
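
For illustration only, the StorageType-to-storage-class mapping described above might look roughly like the sketch below. The `Tier` enum is a hypothetical stand-in for Paimon's `StorageType`, and the tier-to-class pairing is an assumption rather than the code in this PR. Only the storage class names (`STANDARD`, `GLACIER`, `DEEP_ARCHIVE`) are real S3 values.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch only: maps a hypothetical storage tier to an S3 storage class name.
final class S3StorageClassMapping {

    // Hypothetical stand-in for Paimon's StorageType; the real constants may differ.
    enum Tier { STANDARD, ARCHIVE, COLD_ARCHIVE, UNKNOWN }

    private static final Map<Tier, String> MAPPING = new EnumMap<>(Tier.class);

    static {
        // Assumed pairing; the values are valid S3 storage class names.
        MAPPING.put(Tier.STANDARD, "STANDARD");
        MAPPING.put(Tier.ARCHIVE, "GLACIER");
        MAPPING.put(Tier.COLD_ARCHIVE, "DEEP_ARCHIVE");
    }

    // Returns the S3 storage class for a tier, or fails loudly for unmapped tiers.
    static String toS3StorageClass(Tier tier) {
        String storageClass = MAPPING.get(tier);
        if (storageClass == null) {
            throw new UnsupportedOperationException("No S3 storage class for tier: " + tier);
        }
        return storageClass;
    }
}
```

Keeping the mapping in one table makes an unmapped tier fail explicitly instead of silently falling back to `STANDARD`.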

Tests

mvn -pl paimon-filesystems/paimon-s3-impl -am -Pfast-build -DfailIfNoTests=false -Dtest=S3ArchiveOperationsTest test

@JingsongLi

Contributor

I found one blocker in the S3 archive implementation: archive/unarchive currently change storage class by issuing a single CopyObject request for the same key. S3 single-copy only supports objects up to 5 GB, while Paimon data files can be larger than that, so this will fail for valid large data files. Please branch on the object size from HeadObjectResponse and use multipart copy / UploadPartCopy for large objects, with a test covering that path.
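
To make the suggestion concrete, here is a rough, SDK-free sketch of the size branch and the byte ranges a multipart copy would need. The class and method names are hypothetical, and the 5 GiB threshold and 10,000-part cap are my reading of the S3 limits, so please verify them against the S3 documentation. Each range string uses the `bytes=first-last` form that UploadPartCopy expects for its copy source range.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: decides between single copy and multipart copy, and plans part ranges.
final class CopyPlanSketch {

    // Assumed single-request CopyObject limit (5 GiB); verify against the S3 docs.
    static final long SINGLE_COPY_LIMIT_BYTES = 5L * 1024 * 1024 * 1024;

    // Assumed maximum number of parts in one multipart upload.
    static final int MAX_PARTS = 10_000;

    // True when the object is too large for a single CopyObject request.
    static boolean needsMultipartCopy(long objectSize) {
        return objectSize > SINGLE_COPY_LIMIT_BYTES;
    }

    // Splits [0, objectSize) into inclusive "bytes=first-last" ranges of at most partSize bytes.
    static List<String> partRanges(long objectSize, long partSize) {
        if (objectSize <= 0 || partSize <= 0) {
            throw new IllegalArgumentException("objectSize and partSize must be positive");
        }
        long partCount = (objectSize + partSize - 1) / partSize;
        if (partCount > MAX_PARTS) {
            throw new IllegalArgumentException("partSize too small: " + partCount + " parts needed");
        }
        List<String> ranges = new ArrayList<>();
        for (long start = 0; start < objectSize; start += partSize) {
            long end = Math.min(start + partSize, objectSize) - 1; // inclusive last byte
            ranges.add("bytes=" + start + "-" + end);
        }
        return ranges;
    }
}
```

In the real implementation the object size would come from the HeadObject response, and the target storage class would be set when the multipart upload is created; each range would then go into one UploadPartCopy call before completing the upload. A test can exercise `partRanges` with small sizes, so covering the large-object path does not require an actual 5 GB file.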
