PECOBLR-1121 Arrow patch to circumvent Arrow issues with JDK 16+. #1156

tejassp-db · 2025-12-22T10:50:38Z

Description

The Databricks server returns query results in Apache Arrow format for cross-language interoperability. On JDK 16 and later, the JDBC driver runs into compatibility issues when processing Arrow results.

The problem arises from the stronger encapsulation of JDK internals in newer Java versions, which breaks the Apache Arrow library code the driver uses to consume Arrow results. The driver is also embedded in partner solutions whose authors do not control the runtime environment, so the workaround of setting JVM arguments is not feasible there.

Testing

Tests are added in other stacked PRs.

Additional Notes to the Reviewer

It's a stacked PR.

Patch Arrow to create a Databricks ArrowBuf which allocates memory on the heap and provides access to it through Java methods. This removes the need to specify "--add-opens=java.base/java.nio=ALL-UNNAMED" as JVM args for JDK 16+.
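To illustrate the idea of a heap-backed buffer, here is a minimal sketch that uses only `java.nio.ByteBuffer`. The class name `HeapBufSketch` and its methods are hypothetical and are not the actual `DatabricksArrowBuf`; little-endian byte order is assumed because that is how Arrow lays out values on common platforms.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical heap-backed buffer; an illustration, not the PR's DatabricksArrowBuf. */
final class HeapBufSketch {
  private final ByteBuffer data;

  HeapBufSketch(int capacity) {
    // ByteBuffer.allocate returns a buffer backed by a byte[] on the Java heap,
    // so no direct memory, Unsafe, or --add-opens flag is involved.
    this.data = ByteBuffer.allocate(capacity).order(ByteOrder.LITTLE_ENDIAN);
  }

  long capacity() {
    return data.capacity();
  }

  // Absolute accessors: they do not move the buffer's position.
  void setLong(long index, long value) {
    data.putLong(Math.toIntExact(index), value);
  }

  long getLong(long index) {
    return data.getLong(Math.toIntExact(index));
  }

  void setInt(long index, int value) {
    data.putInt(Math.toIntExact(index), value);
  }

  int getInt(long index) {
    return data.getInt(Math.toIntExact(index));
  }
}
```

The trade-off in a design like this is that every access goes through bounds-checked Java methods on heap memory rather than raw off-heap addresses.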

Use native Arrow if available. Otherwise, fall back to the patched version.
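The "prefer native, fall back to the patch" selection can be sketched generically as follows. `FallbackSketch` and `firstAvailable` are hypothetical names; in the driver the two suppliers would presumably create the native Arrow allocator and the patched allocator, but that wiring is an assumption here.

```java
import java.util.function.Supplier;

/** Hypothetical helper illustrating a native-first, patch-second selection. */
final class FallbackSketch {
  private FallbackSketch() {}

  static <T> T firstAvailable(Supplier<? extends T> preferred, Supplier<? extends T> fallback) {
    try {
      return preferred.get();
    } catch (RuntimeException | LinkageError e) {
      // Failed static initializers surface as ExceptionInInitializerError or
      // NoClassDefFoundError, both of which are LinkageError subclasses.
      // Reflection blocked by module encapsulation surfaces as a RuntimeException.
      return fallback.get();
    }
  }
}
```

Catching `LinkageError` rather than `Throwable` keeps unrelated failures such as `OutOfMemoryError` from being silently masked by the fallback.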

Remove irrelevant reference counting in the patch code. The patch code uses heap memory for Arrow operations, so reference counting is not required.

Remove redundant todos for accounting.

Patch DecimalUtility to not use unsafe methods to set decimal values on DatabricksArrowBuf.
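As a rough illustration of writing a decimal without unsafe memory access, the sketch below encodes a `BigDecimal` as a 16-byte little-endian two's-complement integer (the Arrow 128-bit decimal layout on little-endian platforms) into a plain `byte[]`. `DecimalBytesSketch` and its methods are hypothetical and do not reproduce the patched `DecimalUtility`.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical encoder: BigDecimal to 16 little-endian bytes using only array writes. */
final class DecimalBytesSketch {
  static final int WIDTH = 16;

  private DecimalBytesSketch() {}

  static void writeDecimal(BigDecimal value, byte[] target, int offset) {
    // toByteArray yields the minimal big-endian two's-complement form.
    byte[] bigEndian = value.unscaledValue().toByteArray();
    if (bigEndian.length > WIDTH) {
      throw new ArithmeticException("unscaled value does not fit in 16 bytes");
    }
    // Sign-extend: negative values are padded with 0xFF, others with 0x00.
    byte pad = (byte) (value.signum() < 0 ? 0xFF : 0x00);
    for (int i = 0; i < WIDTH; i++) {
      int src = bigEndian.length - 1 - i; // walk from the least significant byte
      target[offset + i] = src >= 0 ? bigEndian[src] : pad;
    }
  }

  static BigDecimal readDecimal(byte[] source, int offset, int scale) {
    byte[] bigEndian = new byte[WIDTH];
    for (int i = 0; i < WIDTH; i++) {
      bigEndian[WIDTH - 1 - i] = source[offset + i];
    }
    return new BigDecimal(new BigInteger(bigEndian), scale);
  }
}
```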

Add a notice to all patched Arrow Java code. Mention in the NOTICE file that Arrow has been patched by Databricks.

On static init failure, the MemoryUtil class prints a stack trace to stderr. Remove this print, since we now fall back to DatabricksBufferAllocator when this happens, and the error is logged as well.

vikrantpuppala · 2026-01-02T10:15:53Z

src/main/java/org/apache/arrow/memory/ArrowBuf.java

+
+  // ---- Databricks patch start ----
+  private final HistoricalLog historicalLog =
+      DEBUG ? new HistoricalLog(DEBUG_LOG_LENGTH, "ArrowBuf[%d]", id) : null;


do we even need to worry about the historical log? can we not just set it to null? seems like its usage is null checked everywhere anyway?

+1, add a comment on why this is needed

vikrantpuppala · 2026-01-02T10:29:48Z

src/main/java/org/apache/arrow/memory/DatabricksAllocationReservation.java

+    long currentReservation = reservedSize.get();
+    long newReservation = currentReservation + nBytes;
+    if (newReservation > allocator.getHeadroom() + currentReservation) {
+      return false;
+    }
+    reservedSize.addAndGet(nBytes);


this is not thread safe, should we use compareAndSet?
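For reference, the check-then-add sequence above can race: two threads may both read the same `currentReservation`, both pass the check, and both add. A compare-and-set retry loop makes the check and the update a single atomic step. The sketch below is an illustration under assumptions: `ReservationSketch` is a hypothetical class, and a fixed `limit` stands in for "current reservation plus allocator headroom", which in the real allocator may itself change concurrently.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical reservation counter using a compare-and-set retry loop. */
final class ReservationSketch {
  private final AtomicLong reservedSize = new AtomicLong();
  private final long limit; // assumption: stands in for reserved bytes plus headroom

  ReservationSketch(long limit) {
    this.limit = limit;
  }

  boolean add(long nBytes) {
    if (nBytes < 0) {
      throw new IllegalArgumentException("nBytes must be non-negative");
    }
    while (true) {
      long current = reservedSize.get();
      long updated = current + nBytes;
      // Reject if over the limit or if the addition overflowed.
      if (updated > limit || updated < current) {
        return false;
      }
      // Succeeds only if no other thread changed the value since the read above.
      if (reservedSize.compareAndSet(current, updated)) {
        return true;
      }
      // Lost the race: loop and re-check against the fresh value.
    }
  }

  long reserved() {
    return reservedSize.get();
  }
}
```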

vikrantpuppala · 2026-01-02T10:35:48Z

src/main/java/org/apache/arrow/memory/DatabricksArrowBuf.java

+  @Override
+  public ReferenceManager getReferenceManager() {
+    return referenceManager;
+  }
+
+  @Override
+  public long capacity() {
+    return capacity;
+  }


i think a bunch of these methods do not have a changed override behaviour, can we remove them so that it is easy to review and maintain?

vikrantpuppala · 2026-01-02T10:36:55Z

src/main/java/org/apache/arrow/memory/DatabricksArrowBuf.java

+    if (capacity > Integer.MAX_VALUE) {
+      throw new IllegalArgumentException(
+          "DatabricksArrowBuf does not support capacity > Integer.MAX_VALUE");
+    }


why this limit?

this is missing from the other constructor

can we reuse constructors
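One way to keep the capacity check in a single place is constructor chaining, where every constructor funnels through one canonical constructor. A plausible reason for the limit is that Java arrays are indexed by `int`, so a single heap array cannot hold more than `Integer.MAX_VALUE` elements. The sketch below is hypothetical (`ChainedBufSketch` and its signatures are not the PR's constructors).

```java
/** Hypothetical buffer showing one shared validation path for all constructors. */
final class ChainedBufSketch {
  private final long capacity;
  private final byte[] backing;

  /** Allocates a fresh backing array; validation runs before the allocation. */
  ChainedBufSketch(long capacity) {
    this(capacity, new byte[checkCapacity(capacity)]);
  }

  /** Canonical constructor: all other constructors delegate here. */
  ChainedBufSketch(long capacity, byte[] backing) {
    this.capacity = checkCapacity(capacity);
    if (backing.length < capacity) {
      throw new IllegalArgumentException("backing array is smaller than capacity");
    }
    this.backing = backing;
  }

  private static int checkCapacity(long capacity) {
    if (capacity < 0 || capacity > Integer.MAX_VALUE) {
      throw new IllegalArgumentException(
          "capacity must be between 0 and Integer.MAX_VALUE, got " + capacity);
    }
    return (int) capacity;
  }

  long capacity() {
    return capacity;
  }

  byte getByte(long index) {
    return backing[Math.toIntExact(index)];
  }
}
```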

gopalldb · 2026-01-05T05:18:59Z

src/main/java/com/databricks/jdbc/api/impl/arrow/ArrowBufferAllocator.java

+  static {
+    RootAllocator rootAllocator = null;
+    try {
+      rootAllocator = new RootAllocator();


we were using Integer.MAX_VALUE, not needed now?

gopalldb · 2026-01-05T06:07:13Z

src/main/java/org/apache/arrow/memory/ArrowBuf.java

+  // ---- to avoid unsafe allocation initialization errors.
+  public static final String DEBUG_ALLOCATOR = "arrow.memory.debug.allocator";
+  public static final int DEBUG_LOG_LENGTH = 6;
+  public static final boolean DEBUG;


nit: rename to better name than debug?

tejassp-db added 4 commits December 16, 2025 15:58

PECOBLR-1121 Patch Arrow to circumvent JVM args issue.

37d7d15

Patch Arrow to create a Databricks ArrowBuf which allocates memory on the heap and provides access to it through Java methods. This removes the need to specify "--add-opens=java.base/java.nio=ALL-UNNAMED" as JVM args for JDK 16+.

PECOBLR-1121 Use Arrow patch as fallback.

ffd6c1c

Use native Arrow if available. Otherwise, fall back to the patched version.

PECOBLR-1121 Simplify patch code.

a78a597

Remove irrelevant reference counting in the patch code. The patch code uses heap memory for Arrow operations, so reference counting is not required.

PECOBLR-1121 Minor refactor.

1654f74

tejassp-db self-assigned this Dec 22, 2025

tejassp-db added 5 commits December 23, 2025 16:11

PECOBLR-1121 Fix todos and fixmes.

42422f1

Remove redundant todos for accounting.

PECOBLR-1121 Fix derive buffer

36c2d3d

PECOBLR-1121 Patch DecimalUtility.

dcdc49a

Patch DecimalUtility to not use unsafe methods to set decimal values on DatabricksArrowBuf.

PECOBLR-1121 Add Apache 2 compliant changes.

7c44728

Add a notice to all patched Arrow Java code. Mention in the NOTICE file that Arrow has been patched by Databricks.

PECOBLR-1121 Suppress stack trace print on Arrow class init failure.

44271f9

On static init failure, the MemoryUtil class prints a stack trace to stderr. Remove this print, since we now fall back to DatabricksBufferAllocator when this happens, and the error is logged as well.

tejassp-db requested review from gopalldb, jayantsing-db, jprakash-db, madhav-db, msrathore-db, samikshya-db and vikrantpuppala January 2, 2026 09:47

tejassp-db marked this pull request as ready for review January 2, 2026 09:48

tejassp-db requested a review from sreekanth-db January 2, 2026 09:53

vikrantpuppala reviewed Jan 2, 2026

gopalldb reviewed Jan 5, 2026

4 participants
