Skip to content

Comments

Add getcpu system call support for getting a NUMA node of the current thread#1575

Open
Eugene-Usachev wants to merge 3 commits intobytecodealliance:mainfrom
Eugene-Usachev:getcpu
Open

Add getcpu system call support for getting a NUMA node of the current thread#1575
Eugene-Usachev wants to merge 3 commits intobytecodealliance:mainfrom
Eugene-Usachev:getcpu

Conversation

@Eugene-Usachev
Copy link

Not much time ago I needed to get a NUMA node of the current thread. Linux provides the getcpu system call for it. But I did not find any Rust library that provides this function. I read this repository code, and I like what you are doing, so I want to help you to extend the functionality of this library.

I implemented the getcpu syscall for Linux only. Also, I develop with WSL, which adds \r\n at the end of lines, so I patched tests for not reading the end of lines.

&buf[..58],
b"//! Utilities for functions that return data via buffers.\n"
&buf[..57],
b"//! Utilities for functions that return data via buffers."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this code changing?

Copy link
Author

@Eugene-Usachev Eugene-Usachev Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I develop with WSL, which adds \r\n at the end of lines, so I patched tests for not reading the end of lines. cargo test fails if we try to read 58 characters (with the end of line char).

/// [Linux]: https://man7.org/linux/man-pages/man2/getcpu.2.html
#[cfg(linux_kernel)]
#[inline]
pub fn getcpu() -> (usize, usize) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the Linux docs, the values written by getcpu have type unsigned int.

   int getcpu(unsigned int *_Nullable cpu, unsigned int *_Nullable node);

Would it be better to reflect them here as u32, rather than usize?

I see that sched_getcpu already returns usize, but that appears to be an error.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do the same as sched_getcpu. To be honest, I am not sure what users prefer more here. I use numa_node: usize in my code, but I can't say everyone does it. If you want me to change it I can do it but I think it is not important.

target_arch = "s390x",
))]
#[cold]
#[inline(never)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for inline(never) here? It's already marked #[cold], which should have the desired effect. It a compiler decides it really wants to inline this, even given what we've told it, that seems fine.

Copy link
Author

@Eugene-Usachev Eugene-Usachev Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still sure that non-inlining provides a better performance, but I agree with both points: the difference is tiny and the compiler can do it as it wants. But I want to explicitly tell the compiler to generate exactly one call method for the calling this function that should be called only once. If you don't like this, I can't roll back this change. After all, the main goal of the PR is adding getcpu.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants