perf: testing iceberg-rust ArrowReader optimizations [iceberg] #3551

Draft
mbutrovich wants to merge 16 commits into apache:main from mbutrovich:reader_perf

@mbutrovich mbutrovich commented Feb 19, 2026

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

  • Point the iceberg-rust dependency at the experimental branch https://github.com/mbutrovich/iceberg-rust/tree/reader_perf, which includes:
    • Parquet metadata caching
    • Serializing the file size from the Iceberg DataFile into the native FileScanTask to avoid a stat() call
    • OpenDAL operator caching
    • Parquet metadata prefetch (512 KB)
  • Add a config that sets iceberg-rust's data_file_concurrency_limit. It defaults to 1 because tests without an ORDER BY fail when the value is increased, presumably because reading files concurrently changes the order in which rows come back.
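
To illustrate how the first two reader changes fit together, here is a minimal sketch and not the code on the branch. Every name in it (`FooterMetadata`, `ScanTask`, `MetadataCache`, `simulated_stat`) is hypothetical, and the remote calls are simulated with counters. It only shows the idea: a scan task that already carries the file size skips the stat() lookup, and a footer that has been loaded once is served from the cache afterwards.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a parsed Parquet footer.
#[derive(Debug, PartialEq)]
pub struct FooterMetadata {
    pub file_size: u64,
    pub num_row_groups: usize,
}

/// Hypothetical stand-in for a scan task. `file_size` is `Some` when the
/// planner already knows the size (for example, from the Iceberg DataFile).
pub struct ScanTask {
    pub path: String,
    pub file_size: Option<u64>,
}

/// Simulated object-store stat() call (assumption: returns a fixed size).
fn simulated_stat(_path: &str) -> u64 {
    4096
}

/// Per-path footer cache that also counts the simulated remote calls.
#[derive(Default)]
pub struct MetadataCache {
    entries: HashMap<String, Arc<FooterMetadata>>,
    pub stat_calls: usize,
    pub footer_reads: usize,
}

impl MetadataCache {
    pub fn get_or_load(&mut self, task: &ScanTask) -> Arc<FooterMetadata> {
        // Cache hit: no stat() and no footer read.
        if let Some(hit) = self.entries.get(&task.path) {
            return Arc::clone(hit);
        }
        // Use the size carried by the task; fall back to stat() otherwise.
        let file_size = match task.file_size {
            Some(size) => size,
            None => {
                self.stat_calls += 1;
                simulated_stat(&task.path)
            }
        };
        // Simulated footer read; a real reader would fetch and parse bytes here.
        self.footer_reads += 1;
        let meta = Arc::new(FooterMetadata { file_size, num_row_groups: 1 });
        self.entries.insert(task.path.clone(), Arc::clone(&meta));
        meta
    }
}
```

In this sketch, calling `get_or_load` twice for a task with a known size leaves `stat_calls` at 0 and `footer_reads` at 1. A task without a size triggers exactly one simulated stat().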

How are these changes tested?

Existing tests. I also benchmarked and saw the flame graph stacks associated with stat() disappear and other stacks shrink.
