Performance concern: fillHoles() method and read buffer expansion efficiency. #599

Hi everyone,

I have the following use case: I’m benchmarking read throughput on data with a large number of non-dictionary string columns (300 columns). According to the profiler output (see the attached picture), a significant amount of time is spent in the fillHoles() method, which is part of the read buffer expansion process.

My question is: why is the buffer filled one element at a time instead of using a bulk operation? Wouldn’t a batch approach be more efficient?
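To make the question concrete, here is a minimal sketch of the two approaches I have in mind. It is not the library's actual code: the class `FillHolesSketch`, both method names, and the plain `int[]` offset array are hypothetical stand-ins, and I am assuming the "holes" are skipped slots in an offset buffer for variable-width values, where every skipped slot has to repeat the end offset of the last written value.

```java
import java.util.Arrays;

public class FillHolesSketch {
    // Assumed layout: offsets[i] is the start of value i, offsets[i + 1] is its end.
    // Slots strictly between lastSet and index were never written, so each one
    // must become an empty value: its end offset equals the previous end offset.

    // Element-at-a-time version: one store per skipped slot.
    static void fillHolesOneByOne(int[] offsets, int lastSet, int index) {
        for (int i = lastSet + 1; i < index; i++) {
            offsets[i + 1] = offsets[i];
        }
    }

    // Bulk version: a single range fill with the last written end offset.
    static void fillHolesBulk(int[] offsets, int lastSet, int index) {
        if (index > lastSet + 1) {
            Arrays.fill(offsets, lastSet + 2, index + 1, offsets[lastSet + 1]);
        }
    }

    public static void main(String[] args) {
        // Slots 0..2 are written; the next write lands at slot 6, leaving 3..5 as holes.
        int[] a = {0, 3, 7, 12, 0, 0, 0, 0};
        int[] b = a.clone();
        fillHolesOneByOne(a, 2, 6);
        fillHolesBulk(b, 2, 6);
        System.out.println(Arrays.toString(a)); // prints [0, 3, 7, 12, 12, 12, 12, 0]
        System.out.println(Arrays.equals(a, b)); // prints true
    }
}
```

Both versions produce the same offsets. Whether the bulk form is actually faster would depend on how the real buffer is stored (for example, an off-heap buffer may not expose a comparable range-fill primitive), so this is only meant to illustrate the question.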

Looking forward to your insights. Thanks!

![Image](https://github.com/user-attachments/assets/57108c7a-126d-4370-9a01-4f0aa85218d9)
