[SPARK-55830][SQL] Fix JDBC predicate pushdown dropping driver properties #55408
Draft

yadavay-amzn wants to merge 1 commit into apache:master
…ties

When using `spark.read.jdbc()` with predicates, custom JDBC driver properties (like `socketFactory`, `cloudSqlInstance`) were silently dropped because `CaseInsensitiveMap.iterator` returns lowercased keys. The `JDBCOptions` constructor called the inherited `Map.++` (due to the static typing as `Map[String, String]`), which iterated via `iterator` and lost the original key case. JDBC drivers expect case-sensitive property names, so this caused connection failures.

The fix wraps `parameters` in `CaseInsensitiveMap` first, then uses `CaseInsensitiveMap.++`, which preserves the original key case via the `updated()` method and `originalMap`.

Closes #XXXXX

### Was this patch authored or co-authored using generative AI tooling?
Yes
### What changes were proposed in this pull request?

When using `spark.read.jdbc()` with predicates, custom JDBC driver properties (such as `socketFactory` and `cloudSqlInstance`) were silently dropped, causing connection failures. Without predicates, the same properties worked fine.

#### Root Cause
In `JDBCOptions`, the constructor `this(url, table, parameters: Map[String, String])` merges `parameters` with the URL and table name entries. When `parameters` is a `CaseInsensitiveMap` (which it is in the predicate code path), `parameters ++ Map(...)` calls the inherited `Map.++`, because the static type is `Map[String, String]`. The inherited `Map.++` iterates `this` using `CaseInsensitiveMap.iterator`, which returns lowercased keys from `keyLowerCasedMap`. This creates a plain `HashMap` with lowercased keys, losing the original case.

JDBC drivers expect case-sensitive property names (e.g., `socketFactory`, not `socketfactory`), so `asConnectionProperties` returns properties with wrong-cased keys, and `Properties.getProperty("socketFactory")` returns `null`.
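The case loss can be reproduced with plain Scala collections. This is a hedged sketch: the plain map below stands in for `CaseInsensitiveMap`, and the property names and values are illustrative, not taken from the actual Spark code.

```scala
import java.util.Properties

// Illustrative driver property (assumption: made-up value).
val original = Map("socketFactory" -> "com.example.SocketFactory")

// What CaseInsensitiveMap.iterator effectively exposes: lowercased keys.
val iteratedLowercase = original.map { case (k, v) => (k.toLowerCase, v) }

// The inherited Map.++ rebuilds the map from that iteration, so the
// original key case is gone after the merge:
val merged = iteratedLowercase ++ Map("url" -> "jdbc:postgresql://host/db")

// Downstream, java.util.Properties lookups are case-sensitive, so the
// lookup the JDBC driver performs comes back null:
val props = new Properties()
merged.foreach { case (k, v) => props.setProperty(k, v) }
assert(props.getProperty("socketFactory") == null) // key became "socketfactory"
assert(props.getProperty("socketfactory") != null)
```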
#### Fix

Wrap `parameters` in `CaseInsensitiveMap` first, then use `CaseInsensitiveMap.++`, which preserves the original key case via the `updated()` method and `originalMap`. The change is applied to both `JDBCOptions` and `JdbcOptionsInWrite`.
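The static-dispatch subtlety behind the fix can be sketched in a self-contained way. The `CIMap` class below is an illustrative stand-in, not Spark's actual `CaseInsensitiveMap`; its method bodies are simplified, but it reproduces the two behaviors that matter: an iterator that yields lowercased keys, and a `++` overload that merges on `originalMap`.

```scala
import scala.collection.immutable.Map

// Simplified stand-in for CaseInsensitiveMap (assumption: sketch only).
final class CIMap(val originalMap: Map[String, String])
    extends Map[String, String] {
  private val lowerMap = originalMap.map { case (k, v) => (k.toLowerCase, v) }

  def get(key: String): Option[String] = lowerMap.get(key.toLowerCase)
  def iterator: Iterator[(String, String)] = lowerMap.iterator // lowercased!
  def removed(key: String): Map[String, String] =
    new CIMap(originalMap.filterNot(_._1.equalsIgnoreCase(key)))
  def updated[V1 >: String](key: String, value: V1): Map[String, V1] =
    value match {
      case s: String =>
        new CIMap(originalMap.filterNot(_._1.equalsIgnoreCase(key)) + (key -> s))
      case other => (lowerMap: Map[String, V1]).updated(key, other)
    }

  // Case-preserving merge, analogous to CaseInsensitiveMap.++ in the fix.
  def ++(xs: Map[String, String]): CIMap =
    xs.foldLeft(this)((m, kv) => m.updated(kv._1, kv._2).asInstanceOf[CIMap])
}

val params = new CIMap(Map("socketFactory" -> "com.example.SocketFactory"))

// Before the fix: the static type is Map[String, String], so the inherited
// Map.++ runs and rebuilds a plain map from the lowercased iterator.
val asPlainMap: Map[String, String] = params
val buggy = asPlainMap ++ Map("url" -> "jdbc:postgresql://host/db")

// After the fix: the case-insensitive static type dispatches to the
// case-preserving ++ overload, so originalMap keeps the key case.
val fixed = params ++ Map("url" -> "jdbc:postgresql://host/db")
```

Keeping (or establishing, by wrapping `parameters` first) the case-insensitive static type is what routes the merge through the case-preserving overload; that is the essence of the fix.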
### How was this patch tested?

Added a unit test in `JDBCSuite` that simulates the predicate code path: it creates a `JDBCOptions` from `CaseInsensitiveMap ++ Properties.asScala` and verifies that `asConnectionProperties` preserves the original key case.
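A hedged sketch of the shape of such a test, using only the JDK and standard library (the real test goes through `JDBCOptions.asConnectionProperties`; the property names and values here are illustrative):

```scala
import java.util.Properties
import scala.jdk.CollectionConverters._

// Driver properties as a user would supply them (assumption: made-up values).
val connProps = new Properties()
connProps.setProperty("socketFactory", "com.example.SocketFactory")
connProps.setProperty("cloudSqlInstance", "project:region:instance")

// The predicate code path merges options with Properties.asScala; a
// case-preserving merge must keep the original keys intact.
val merged = Map("url" -> "jdbc:postgresql://host/db") ++ connProps.asScala

// Round-trip back into Properties, as asConnectionProperties does, and
// verify the case-sensitive lookups the driver performs still succeed.
val out = new Properties()
merged.foreach { case (k, v) => out.setProperty(k, v) }
assert(out.getProperty("socketFactory") != null)
assert(out.getProperty("cloudSqlInstance") != null)
```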
### Does this PR introduce any user-facing change?

Yes. Custom JDBC driver properties (e.g., `socketFactory`, `cloudSqlInstance`) now work correctly when using `spark.read.jdbc()` with predicates.

### Was this patch authored or co-authored using generative AI tooling?

Yes.