Skip to content

Conversation

@loserwang1024
Copy link
Contributor

Purpose

Linked issue: close #2646

Brief change log

Tests

API and Format

Documentation

@loserwang1024
Copy link
Contributor Author

@platinumhamburg @wuchong , CC

// vectorSchemaRootMap,
// for example temporary buffers left behind when an OOM occurs during decompression
// (see https://github.com/apache/fluss/issues/2646).
bufferAllocator.releaseBytes(bufferAllocator.getAllocatedMemory());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! The approach is consistent with RecordAccumulator.close().

One concern: releaseBytes() only adjusts the allocator's accounting but doesn't actually free the underlying direct memory - those buffers will
remain until GC. This fix prevents the IllegalStateException on close, which is good for avoiding cascading failures, but the actual memory may still
leak temporarily.

For a more robust solution, consider adding try-finally in ZstdArrowCompressionCodec.doDecompress() to ensure temp buffers are released on failure.
But the current fix is acceptable as a defensive measure.

Copy link
Contributor Author

@loserwang1024 loserwang1024 Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ZstdArrowCompressionCodec.doDecompress() cannot do this, because as i mention in issue:When fail in ZstdArrowCompressionCodec.doDecompress() , we need to release buffers in in org.apache.arrow.vector.VectorLoader#loadBuffers which is the code of arrow.
image

Copy link
Contributor Author

@loserwang1024 loserwang1024 Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@platinumhamburg , I modified this PR and add a patched version of Arrow's {@code VectorLoader} to fix this problem. Detail see VectorLoaderTest

Copy link
Contributor

@platinumhamburg platinumhamburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work identifying and fixing this Arrow buffer leak! I verified that the upstream Arrow-Java main branch
still has this bug in VectorLoader.loadBuffers() — the decompression loop sits outside the try block, and
buf.close() only runs on the success path. So this patch is definitely necessary.

The fix logic is correct: moving the decompression loop inside the try block and releasing buffers in the finally
clause properly handles both the success path (loadFieldBuffers retains +1, finally close -1) and the error path
(finally close frees all already-decompressed buffers).

Below are only one suggestion to improve maintainability.

*
* @see <a href="https://github.com/apache/fluss/issues/2646">FLUSS-2646</a>
*/
public class VectorLoader {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest changing the approach from class-shadowing to an explicitly named class under a
Fluss-owned package. Specifically: rename this to FlussVectorLoader and move it to
org.apache.fluss.record, then update the call site in ArrowUtils.createArrowReader().

I've verified locally that all Arrow APIs used in loadBuffers() — TypeLayout, FieldVector,
Collections2, Preconditions, etc. — are public in the shaded JAR, so there's no package-visibility
blocker. Compilation and all 3 tests pass with this change.

Why this is better than class-shadowing:

  1. Explicit over implicit. The current approach relies on classpath ordering to silently
    override the shaded Arrow JAR's VectorLoader. With FlussVectorLoader, the call site in
    ArrowUtils makes it obvious that a patched loader is being used — no hidden magic.

  2. Compile-time safety on Arrow upgrades. If Arrow's VectorLoader API changes in a future
    version (e.g. the upstream main branch already added a variadicBufferCounts parameter),
    the shadowed class would silently diverge. With an explicit class, any API incompatibility
    surfaces as a compile error immediately.

  3. No risk of classpath conflicts. Class-shadowing behavior depends on JAR ordering, which
    can vary across build tools, IDEs, and deployment environments. An explicitly named class
    eliminates this uncertainty entirely.

The change is straightforward:

  • Rename VectorLoaderFlussVectorLoader in org.apache.fluss.record
  • Update import + usage in ArrowUtils.createArrowReader()
  • Update import + usage in VectorLoaderTest

Please also add a TODO in the class Javadoc for future cleanup:

// TODO(FLUSS-2646): Remove this patched VectorLoader once Apache Arrow fixes
// the buffer leak in loadBuffers(). Current Arrow version: 15.0.0

Additionally, I'd recommend filing an issue in https://github.com/apache/arrow-java/issues to
push for an upstream fix — that way we have a clear signal for when this patch can be retired.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bufferAllocator of LogRecordReadContext will leak if oom when decompress data.

2 participants