Skip to content

Conversation

@keith-turner
Copy link
Contributor

CachableBlockFile.Reader.getBCFile() always read the RFile root block from the index cache even if would not use it. Modified the method to only read from the index cache when the data is actually needed. Noticed this while working on #6010, saw for every data block read from a rfile it would also read from the index cache.

Adjusted ScanTracingIT because before this change there was an rfile index cache read per rfile data block read. After this change the number of rfile index reads align w/ the number of rfiles opened, which in this case aligns with the number of tablets in the test tables.

The existing behavior was likely not a performance problem if the data was in cache, but could potentially cause thread contention and extra CPU usage. So its probably good to fix, also nice to make the tracing stats align better w/ expectations.

CachableBlockFile.Reader.getBCFile() always read the RFile root block
from the index cache even if would not use it. Modified the method to
only read from the index cache when the data is actually needed.
Noticed this while working on apache#6010, saw for every data block read from
a rfile it would also read from the index cache.

Adjusted ScanTracingIT because before this change there was an rfile
index cache read per rfile data block read.  After this change the
number of rfile index reads align w/ the number of rfiles opened, which
in this case aligns with the number of tablets in the test tables.

The existing behavior was likely not a performance problem if the data
was in cache, but could potentially cause thread contention and extra
CPU usage. So its probably good to fix, also nice to make the tracing
stats align better w/ expectations.
@keith-turner keith-turner added this to the 2.1.5 milestone Dec 16, 2025
Copy link
Member

@kevinrr888 kevinrr888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I don't follow the test changes but seeing as you recently wrote the test, I don't have any problems

@keith-turner
Copy link
Contributor Author

I don't follow the test changes but seeing as you recently wrote the test, I don't have any problems

Yeah those test are hard to follow and not sure how to improve it. They were created from manually inspecting the new stats to see if they made sense and then creating some loose bounds that matched the stats seen.

@keith-turner keith-turner merged commit f407e27 into apache:2.1 Dec 18, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants