feat: vendor-pluggable S3 credentials for native scans #4309
mbutrovich wants to merge 34 commits into
# Conflicts: # docs/source/contributor-guide/index.md
b549155 to 0cd8a36
… activation (fs.s3a.comet.credential.provider.class for Parquet, s3.comet.credential.provider.class for Iceberg), so the bridge is opt-in per Spark config rather than implicit on classpath presence.
CC @snmvaughan
karuppayya left a comment
Left some comments. Will do another pass later today
InstanceKey key = new InstanceKey(providerClassName, dispatchKey == null ? "" : dispatchKey);
Map<String, String> props =
    catalogProperties == null ? Collections.emptyMap() : catalogProperties;
INSTANCES.computeIfAbsent(
A vendor whose initialize throws gets re-attempted on every get_credential call from object_store. Should we cache the error per key and back off? Maybe as a follow-up.
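The per-key error caching with backoff raised in this comment could look roughly like the following. This is a minimal sketch only: FailureBackoffCache, its constructor parameters, and the doubling delay policy are hypothetical and do not exist in the PR. It wraps a computeIfAbsent-style instance map and fails fast with the remembered error until the retry time has passed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper (not part of this PR): caches init failures per key with backoff. */
final class FailureBackoffCache<K, V> {
  /** A remembered failure and the earliest time a retry is allowed. */
  private static final class Failure {
    final RuntimeException error;
    final long delayMillis;
    final long retryAtMillis;

    Failure(RuntimeException error, long delayMillis, long retryAtMillis) {
      this.error = error;
      this.delayMillis = delayMillis;
      this.retryAtMillis = retryAtMillis;
    }
  }

  private final Map<K, V> instances = new ConcurrentHashMap<>();
  private final Map<K, Failure> failures = new ConcurrentHashMap<>();
  private final LongSupplier clockMillis; // injectable clock so the policy is testable
  private final long baseDelayMillis;
  private final long maxDelayMillis;

  FailureBackoffCache(LongSupplier clockMillis, long baseDelayMillis, long maxDelayMillis) {
    this.clockMillis = clockMillis;
    this.baseDelayMillis = baseDelayMillis;
    this.maxDelayMillis = maxDelayMillis;
  }

  /** Returns the cached instance, or runs init unless a recent failure is still backing off. */
  V getOrInit(K key, Supplier<V> init) {
    V existing = instances.get(key);
    if (existing != null) {
      return existing;
    }
    long now = clockMillis.getAsLong();
    Failure last = failures.get(key);
    if (last != null && now < last.retryAtMillis) {
      // Fail fast with the remembered error instead of calling the vendor again.
      throw last.error;
    }
    try {
      // computeIfAbsent records no mapping when the function throws.
      V created = instances.computeIfAbsent(key, k -> init.get());
      failures.remove(key);
      return created;
    } catch (RuntimeException e) {
      // Double the previous delay, capped at maxDelayMillis.
      long delay =
          last == null ? baseDelayMillis : Math.min(maxDelayMillis, last.delayMillis * 2);
      failures.put(key, new Failure(e, delay, now + delay));
      throw e;
    }
  }
}
```

The sketch rethrows the same exception instance while backing off, so its stack trace points at the original failure. Concurrent failures for one key may race on the delay bookkeeping, which a real implementation would need to decide how to handle.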
Thanks for the feedback @karuppayya! I think I addressed everything but:
I will update my internal credential provider to align with these SPI changes and test again.
Updated my internal implementation to match the latest SPI changes, and things are working well!
Which issue does this PR close?
Closes #4332.
Rationale for this change
Comet's native scan paths (object_store for raw Parquet, opendal via iceberg-rust for Iceberg) bypass Spark's Hadoop S3A credential infrastructure. Vendors with per-path STS, REST-vended credentials, or other custom mechanisms cannot reach Comet through any existing SPI. AWSCredentialsProvider.getCredentials() is parameterless, Hadoop S3A custom signers never return credentials outside the signing pipeline, and Spark's CloudCredentialsProvider yields one JWT per service name with no path argument.

This PR adds a narrow, S3-specific SPI plus JNI plumbing to call it from native code. Activation is config-driven and modeled on parquet.crypto.factory.class (PME KMS, #2447). The user names one vendor class in a Spark or Hadoop config and the vendor dispatches across backends inside it.

Design rationale (keying, lifecycle, returns-or-throws, no Comet-side cache, property-bag handling, error-fidelity caveats) lives in the contributor guide page s3-credential-provider-design.md. Operator setup and vendor contract live in the user guide page s3-credential-providers.md.

What is in this PR
- org.apache.comet.cloud.s3 (in the spark module, since #4325, "refactor: Move most of comet-common module into comet-spark", collapsed common to a minimal bootstrap): CometS3CredentialProvider (AutoCloseable, default initialize(Map)), CometS3Credentials, CometS3AccessMode, CometS3CredentialContext, and CometS3CredentialDispatcher keyed by (FQCN, dispatchKey, catalogProperties) with ensureInitialized(...) returning a long handle, hot-path getCredentialsForPath(handle, ...), and a JVM shutdown hook that closes every cached provider.
- org.apache.comet.util.ClassLoaders.loadClass prefers the thread context ClassLoader. Both the dispatcher and IcebergReflection.loadClass delegate to it.
- CometS3CredentialBridge (under native/core/src/cloud/s3/) implementing object_store::CredentialProvider and reqsign_core::ProvideCredential, plus a JNI handle in native/jni-bridge.
- fs.s3a.comet.credential.provider.class (with per-bucket override) for Parquet, and s3.comet.credential.provider.class on the Spark catalog property for Iceberg. dispatchKey is the bucket on the Parquet path and the V2 catalog name on the Iceberg path.
- catalog_properties. The storage-prefix filter (s3., gcs., adls., client.) moves native-side to iceberg_scan.rs::load_file_io. IcebergScanExec gets a manual redacting Debug so plan dumps do not leak the property bag.
- iceberg-rust pin bumped to 83b4595 (for reqsign-core 3.0 and CustomAwsCredentialLoader).
- testcontainers bumped to 1.21.4 and docker-java to 3.7.1 for modern Docker daemons.
- IcebergRESTVendedS3Provider (test scope, Spark 4.x build only) wrapping Iceberg's VendedCredentialsProvider. Test scope keeps iceberg-aws and AWS SDK v2 off Comet's runtime classpath.

How are these changes tested?

- CometS3CredentialDispatcherTest: handle round-trip, ensureInitialized idempotence, distinct dispatchKey and catalogProperties isolation, closeAll swallows provider exceptions, missing-class / wrong-interface / no-arg-ctor / empty-FQCN failure modes, get-without-init guard.
- IcebergRESTVendedS3ProviderTest (Spark 4.x).
- CometS3CredentialBridgeSuite (Minio): Parquet on S3, Iceberg on S3, REST plus SPI integration with a sentinel non-storage-prefix key reaching initialize(Map), multi-catalog isolation across two catalogs sharing one FQCN. Added to dev/ci/check-suites.py ignore list (manual, like other Docker-dependent S3 suites).
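
To make the keying and handle behaviour described above concrete, here is a simplified sketch of a dispatcher keyed by (FQCN, dispatchKey, catalogProperties) that hands out a long handle and prefers the thread context ClassLoader. It is an illustration under assumptions, not the PR's code: SketchProvider, StaticSketchProvider, SketchDispatcher, credentialsFor, and the demo.prefix property are all hypothetical stand-ins, and credentials are reduced to a plain String.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the vendor SPI; the real interface differs. */
interface SketchProvider extends AutoCloseable {
  default void initialize(Map<String, String> props) {}

  String credentialsFor(String path);

  @Override
  default void close() {}
}

/** Example vendor class used only by this sketch. */
final class StaticSketchProvider implements SketchProvider {
  private String prefix = "";

  @Override
  public void initialize(Map<String, String> props) {
    prefix = props.getOrDefault("demo.prefix", "");
  }

  @Override
  public String credentialsFor(String path) {
    return prefix + path;
  }
}

/** Hypothetical dispatcher: one provider instance per (FQCN, dispatchKey, properties). */
final class SketchDispatcher {
  private static final Map<List<Object>, Long> HANDLES = new ConcurrentHashMap<>();
  private static final Map<Long, SketchProvider> PROVIDERS = new ConcurrentHashMap<>();
  private static final AtomicLong NEXT_HANDLE = new AtomicLong(1);

  /** Idempotent: the same key always maps to the same handle. */
  static long ensureInitialized(String fqcn, String dispatchKey, Map<String, String> props) {
    Map<String, String> safe = props == null ? Collections.emptyMap() : props;
    // Lists and TreeMaps compare by value, so equal inputs yield equal keys.
    List<Object> key =
        Arrays.asList(fqcn, dispatchKey == null ? "" : dispatchKey, new TreeMap<>(safe));
    return HANDLES.computeIfAbsent(key, k -> {
      SketchProvider provider = instantiate(fqcn);
      provider.initialize(safe);
      long handle = NEXT_HANDLE.getAndIncrement();
      PROVIDERS.put(handle, provider);
      return handle;
    });
  }

  /** Hot path: a single map lookup by handle, no reflection. */
  static String credentialsForPath(long handle, String path) {
    SketchProvider provider = PROVIDERS.get(handle);
    if (provider == null) {
      throw new IllegalStateException("unknown handle " + handle);
    }
    return provider.credentialsFor(path);
  }

  /** Loads the class via the thread context ClassLoader first, then this class's loader. */
  private static SketchProvider instantiate(String fqcn) {
    ClassLoader own = SketchDispatcher.class.getClassLoader();
    ClassLoader ctx = Thread.currentThread().getContextClassLoader();
    try {
      Class<?> cls;
      try {
        cls = Class.forName(fqcn, true, ctx != null ? ctx : own);
      } catch (ClassNotFoundException e) {
        cls = Class.forName(fqcn, true, own);
      }
      Constructor<? extends SketchProvider> ctor =
          cls.asSubclass(SketchProvider.class).getDeclaredConstructor();
      ctor.setAccessible(true);
      return ctor.newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalArgumentException("cannot load provider " + fqcn, e);
    }
  }
}
```

In this sketch, two calls with the same class name, dispatch key, and equal property maps share one provider, while a different dispatch key (for example a second bucket or catalog) gets its own instance and handle.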