-
Notifications
You must be signed in to change notification settings - Fork 339
Faster minmax by using fold
#968
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Faster minmax by using fold
#968
Conversation
Do less comparisons by considering pairs is smart but this was done by `loop { it.next(), it.next() }` while `it.fold` is generally faster.
This results in a lot of changes but most of it is merely moving code and indentation corrections.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #968 +/- ##
==========================================
+ Coverage 94.38% 94.67% +0.28%
==========================================
Files 48 49 +1
Lines 6665 7232 +567
==========================================
+ Hits 6291 6847 +556
- Misses 374 385 +11 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I think I get the idea. fold pairwise and see if there's one excess element that has to be processed manually.
Do you have some numbers that show that the proposed implementation indeed is faster? (My gut thought: For a slice, the existing implementation can just call next twice, whereas your proposal has to match acc twice, even though that seems unneeded.)
I'll approve this so that you can merge if you want. But I'd be happy about benchmarks that substantiate that we're taking the right direction.
|
I've also been exploring another approach, using the This gives a concise and elegant implementation (feel free to pick it under the Itertools Apache/MIT license): fn minmax(x: &[u32]) -> MinMaxResult<&u32> {
let mut tuples = x.iter().tuples();
let result = tuples
.by_ref()
.map(|(a, b)| if a < b { (a, b) } else { (b, a) })
.reduce(|(a, b), (c, d)| (a.min(c), b.max(d)));
// Handle the last element, if any.
match (result, tuples.into_buffer().next()) {
(None, None) => MinMaxResult::NoElements,
(None, Some(x)) => MinMaxResult::OneElement(x),
(Some((min, max)), None) => MinMaxResult::MinMax(min, max),
(Some((min, max)), Some(x)) => MinMaxResult::MinMax(min.min(x), max.max(x)),
}
}Intuitively, this should play nice with the amount of branches in the hot loop, and indeed the generated assembly is shorter than the vanilla itertools implementation: https://play.rust-lang.org/?version=stable&mode=release&edition=2024&gist=d7f89e74389221fdb6c89b5a4dfc5b2c Note that I haven't tested nor benchmarked any of these approaches. |
|
That two-phase phrasing it pretty nice 👍 Might want to double-check the edge cases, though. I think .map(|(a, b)| if a < b { (a, b) } else { (b, a) })should probably be .map(|(a, b)| if a > b { (b, a) } else { (a, b) })because if they're ord-equivalent-but-not-actually-identical you want them to stay in the original order. (If there's not already, should have some tests for |
No description provided.