[utest]: Modification of the SMP Threads Auto Assign to Idle Core uTest #10942
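👋 Thank you for your contribution to RT-Thread! To make sure the code follows the RT-Thread coding style, please run the code formatting workflow in your repository by following the steps below (if the formatting CI run fails). 🛠 Steps
Once this is done, the commit will be updated automatically. If you have any questions, feel free to contact us. Thank you again for your contribution! 💐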
📌 Code Review Assignment
🏷️ Tag: kernel
Reviewers: GorrayLi ReviewSun hamburger-os lianux-mm wdfk-prog xu18838022837
📊 Current Review Status (Last Updated: 2025-11-19 16:05 CST)
Force-pushed from 899f857 to 0c3c3ea
Pull Request Overview
This PR improves the SMP (Symmetric Multi-Processing) threads auto-assignment test case to more accurately verify that threads are evenly distributed across idle CPU cores. The key improvement is removing the rt_thread_delay() call that previously caused all threads to be suspended when checking their distribution, making it impossible to observe which cores they were running on.
Key Changes
- Modified thread creation count from RT_CPUS_NR to RT_CPUS_NR - 1 to account for the core running the test itself
- Changed completion detection from a magic number (0xA55A) to a bitmask-based approach that tracks each thread's completion individually
- Removed rt_thread_delay(5) to keep threads in running state for observation
- Added comprehensive test case documentation explaining objectives, scenarios, verification metrics, and expected results
 * - to idle harts under the SMP architecture.
 *
 * Test Scenarios:
 * - Under the SMP architecture, each hart spends most of its time running the
 * - idle thread after the system starts. At this point, create RT_CPUS_NR-1 cyclic
 * - tasks and observe whether these tasks can be evenly distributed across all
 * - harts for execution. Since the thread running the utest occupies one hart, it
 * - is only necessary to observe whether the remaining (RT_CPUS_NR - 1) harts can
Copilot (AI), Nov 19, 2025
[nitpick] [Documentation]: Inconsistent terminology - "harts" vs "cores"
The documentation uses both "harts" (lines 25, 28, 30, 32) and "cores" (lines 24, 78, 95, 103) to refer to the same concept. In the RT-Thread context and general embedded systems terminology, "core" is the more common term. RISC-V uses "hart" (hardware thread), but for consistency with the rest of the codebase and the test name itself ("idle_cores"), it would be better to use "cores" consistently throughout the documentation.
Suggested change:
- Replace "harts" with "cores" in lines 25, 28, 30, 32
-  * - to idle harts under the SMP architecture.
-  *
-  * Test Scenarios:
-  * - Under the SMP architecture, each hart spends most of its time running the
-  * - idle thread after the system starts. At this point, create RT_CPUS_NR-1 cyclic
-  * - tasks and observe whether these tasks can be evenly distributed across all
-  * - harts for execution. Since the thread running the utest occupies one hart, it
-  * - is only necessary to observe whether the remaining (RT_CPUS_NR - 1) harts can
+  * - to idle cores under the SMP architecture.
+  *
+  * Test Scenarios:
+  * - Under the SMP architecture, each core spends most of its time running the
+  * - idle thread after the system starts. At this point, create RT_CPUS_NR-1 cyclic
+  * - tasks and observe whether these tasks can be evenly distributed across all
+  * - cores for execution. Since the thread running the utest occupies one core, it
+  * - is only necessary to observe whether the remaining (RT_CPUS_NR - 1) cores can
int hartid = *(int *)parameter;
while (1)
{
    tick++;
Copilot (AI), Nov 19, 2025
[Concurrency]: Potential race condition on shared variable tick
Multiple threads increment the shared variable tick without synchronization (line 60). In an SMP environment, this creates a race condition where concurrent increments from different cores can lead to lost updates. While this may not affect the test's primary purpose (observing thread distribution), it could cause the tick >= 100 condition to be reached unreliably. Consider using atomic operations (rt_atomic_t and atomic increment) or protecting the increment with a lock.
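One way to address the lost-update concern raised in this comment is an atomic counter. The sketch below uses C11 <stdatomic.h> so that it compiles on its own; in the actual utest, RT-Thread's own atomic API (rt_atomic_t with an atomic add) would presumably take its place. The names shared_tick, TICK_THRESHOLD, and worker_tick are hypothetical and do not appear in the PR.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared `tick` variable in the utest. */
static atomic_uint shared_tick;

/* Assumed threshold, mirroring the `tick >= 100` check shown in the diff. */
#define TICK_THRESHOLD 100u

/*
 * Atomically increment the shared counter and report whether the
 * threshold has been reached. atomic_fetch_add returns the value held
 * before the addition, so adding 1 gives the value this call produced.
 */
static bool worker_tick(void)
{
    unsigned int now = atomic_fetch_add(&shared_tick, 1u) + 1u;
    return now >= TICK_THRESHOLD;
}
```

Because each increment is a single indivisible read-modify-write, concurrent callers on different cores cannot overwrite each other's updates, so the threshold is reached after exactly TICK_THRESHOLD calls in total.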
if (tick >= 100 && (finish_flag & (1 << hartid)) == 0)
{
    /* Output the current core running threads */
    extern long list_thread(void);
    list_thread();
    finsh_flag = 0xA55A;
    finish_flag |= (1 << hartid);
Copilot (AI), Nov 19, 2025
[Concurrency]: Race condition on finish_flag bit operations
The check-and-set operation on finish_flag at lines 61-63 is not atomic. Multiple threads could simultaneously read (finish_flag & (1 << hartid)) == 0 as true, then all execute the finish_flag |= (1 << hartid) operation. While the bitwise OR is idempotent for the same bit position (setting the same bit multiple times has the same effect), the lack of atomicity could still cause issues on some architectures. Consider using atomic operations or ensuring the read-modify-write sequence is atomic.
Updated.
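For illustration, an atomic check-and-set of a completion bit could look like the sketch below. It again uses C11 <stdatomic.h> as a self-contained stand-in; how the PR author actually resolved the comment is not shown in this thread, and RT-Thread's rt_atomic_* API would presumably be used in the real test. The names finish_bits and mark_finished are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared `finish_flag` bitmask. */
static atomic_uint finish_bits;

/*
 * Atomically set the bit for `id` and return true only for the call
 * that changed it from 0 to 1. atomic_fetch_or returns the previous
 * value, so the check and the set happen in one indivisible step.
 * The caller must keep `id` below the bit width of unsigned int.
 */
static bool mark_finished(unsigned int id)
{
    unsigned int bit = 1u << id;
    unsigned int prev = atomic_fetch_or(&finish_bits, bit);
    return (prev & bit) == 0u;
}
```

Besides making the check-and-set indivisible, an atomic OR also prevents lost updates when different threads set different bits of the same word at the same time, which a plain `|=` does not guarantee.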
    finsh_flag = 0xA55A;
    finish_flag |= (1 << hartid);
    uassert_true(1);
}
Copilot (AI), Nov 19, 2025
[Performance]: Busy-wait loop consuming CPU cycles
The thread entry function now contains a tight busy-wait loop without any delay (line 58-66). Previously, rt_thread_delay(5) was called at line 66 (old code), which yielded the CPU and allowed the scheduler to run other threads. Removing this delay causes all threads to continuously spin in busy loops, consuming 100% CPU on all cores. This could interfere with the test's ability to observe proper thread distribution, as the system is now under heavy load rather than demonstrating idle core assignment. Consider adding a small delay or yield operation to make the test more realistic.
    }
}
rt_thread_delay(5);
Adding a delay would put the threads into the suspended state, so it would no longer be possible to observe whether the newly created threads are evenly assigned to idle harts.
> Adding a delay would put the threads into the suspended state, so it would no longer be possible to observe whether the newly created threads are evenly assigned to idle harts.
Yes, this change is reasonable.
Force-pushed from 0c3c3ea to 5facf8f
@Tm-C-mT Please also update the PR title, or split this into two commits; as I see it, the change has two parts: the documentation and the test case improvement.
Currently, this utest cannot determine whether threads are evenly distributed across idle harts by observing the output of list_thread(). This is because rt_thread_delay(5) leaves all other threads in the suspended state when the thread information is printed. For example, if RT_CPUS_NR=4, T0 calls list_thread() to print the information while T1 to T3 are suspended, so it is impossible to observe which hart each of them is running on.
Solution: The completion condition has been modified. For example, when RT_CPUS_NR=4, only RT_CPUS_NR-1 threads are created (T0 to T2), because the thread running the utest occupies one hart. Execution is considered complete when finish_flag=0x0007, at which point the thread running the utest calls list_thread() to print the information. Observe whether T0 to T2 are running on different harts simultaneously.
Signed-off-by: Mengchen Teng <teng_mengchen@163.com>
Force-pushed from 5facf8f to 9319fde
…tion for idle harts Add explanatory comments for the utest of smp_assigned_idle_cores. Signed-off-by: Mengchen Teng <teng_mengchen@163.com>
PR description
[
Why submit this PR
Currently, this utest cannot determine whether threads are evenly distributed across idle harts by observing the output of list_thread(). This is because rt_thread_delay(5) leaves all other threads in the suspended state when the thread information is printed. For example, if RT_CPUS_NR=4, T0 calls list_thread() to print the information while T1 to T3 are suspended, so it is impossible to observe which hart each of them is running on.
What is your solution
The completion condition has been modified. For example, when RT_CPUS_NR=4, only RT_CPUS_NR-1 threads are created (T0 to T2), because the thread running the utest occupies one hart. Execution is considered complete when finish_flag=0x0007, at which point the thread running the utest calls list_thread() to print the information. Observe whether T0 to T2 are running on different harts simultaneously.
In addition, relevant explanations have been added to this modified utest.
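The completion condition described in the solution can be sketched as a pair of small helpers. This is a standalone illustration under stated assumptions, not the code from the PR: expected_finish_mask and all_workers_finished are hypothetical names, and `workers` stands for RT_CPUS_NR - 1.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build a mask with one bit per worker thread (bits 0..workers-1).
 * `workers` corresponds to RT_CPUS_NR - 1 in the PR description; it
 * must be between 0 and 31 for this 32-bit sketch.
 */
static uint32_t expected_finish_mask(unsigned int workers)
{
    return (UINT32_C(1) << workers) - 1u;
}

/* True once every worker covered by the mask has set its own bit. */
static bool all_workers_finished(uint32_t finish_flag, unsigned int workers)
{
    uint32_t mask = expected_finish_mask(workers);
    return (finish_flag & mask) == mask;
}
```

With RT_CPUS_NR = 4 this gives three workers and a mask of 0x0007, matching the value quoted in the description; a flag of 0x0003 would mean T2 has not yet reported.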
Provide the verified BSP and config
]
Intent for your PR
Choose one (Mandatory):
Code Quality:
As part of this pull request, I've considered the following:
All redundant code is removed and cleaned up (no #if 0 code and no commented-out code)