Skip to content

Conversation

@rachel88888
Copy link
Contributor

Hello!

I have noticed the same issue as Issue #1506 where the number of results retrieved is inconsistent across reads and traced the issue to the reuse of the same metrics evaluator across threads when reading manifests. Because the metrics evaluator is stateful, this will result in the wrong results being retrieved nondeterministically, depending on the execution order of the threads.

This PR addresses the issue by creating a single metrics evaluator per thread, which I have tested locally. Please let me know if there are any tests I can add, and I am happy to receive feedback.

Thank you!

Closes #1506

@Fokko Fokko added this to the PyIceberg 0.9.0 release milestone Feb 18, 2025
Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @rachel88888, thanks for fixing this 👍

@Fokko Fokko merged commit d26d1e4 into apache:main Feb 18, 2025
7 checks passed
@rachel88888 rachel88888 deleted the bugfix-1506/scan-plan branch February 18, 2025 15:47
@rachel88888
Copy link
Contributor Author

Thank you for the quick review @Fokko!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Explore potential issue with scan returning the incorrect results

2 participants