feat(prof): use a trampoline for FLF functions to intercept timings#3595
feat(prof): use a trampoline for FLF functions to intercept timings#3595
Conversation
1976c3c to
222e3a4
Compare
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #3595 +/- ##
==========================================
- Coverage 62.21% 62.11% -0.11%
==========================================
Files 141 141
Lines 13387 13387
Branches 1753 1753
==========================================
- Hits 8329 8315 -14
- Misses 4260 4273 +13
- Partials 798 799 +1 see 3 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
morrisonlevi
left a comment
There was a problem hiding this comment.
I assume this can run afoul of permissions somehow, since we're making executable code at runtime? I'm not well versed here, and yes, if the JIT is enabled you'd have to have that capability anyway, but I'm trying to understand implications of this WIP.
Benchmarks [ profiler ]Benchmark execution time: 2026-02-16 17:25:03 Comparing candidate commit 0dd28b4 in PR branch Found 0 performance improvements and 3 performance regressions! Performance is the same for 26 metrics, 7 unstable metrics. scenario:php-profiler-timeline-memory-control
scenario:php-profiler-timeline-memory-with-profiler
|
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
222e3a4 to
0eb96ee
Compare
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
dc825a7 to
c6bdefc
Compare
|
@morrisonlevi dynasmrt takes care of setting RX permissions after compiling the code. As long as you're not running this under a hardened runtime (like app store apps or android (?)), there's no fundamental problem. Also I currently call |
95f422a to
7471599
Compare
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
7471599 to
4358ab8
Compare
profiling/src/wall_time.rs
Outdated
| dynasm!(assembler | ||
| ; mov rax, QWORD original as i64 | ||
| ; call rax | ||
| ; mov rax, QWORD interrupt_addr as i64 |
There was a problem hiding this comment.
I assume original is not returning anything, because you're writing over RAX.
profiling/src/wall_time.rs
Outdated
| #[cfg(target_arch = "aarch64")] | ||
| dynasm!(assembler | ||
| ; mov x16, original as u64 | ||
| ; blr x16 |
There was a problem hiding this comment.
this overwrites x30/lr, aren't you gonna lose the original return location of the handler? so when br x16 returns, it goes back to calling interrupt_addr
There was a problem hiding this comment.
Ah, right. call on x86_64 pushes %eip to the stack, but blr doesn't.
There was a problem hiding this comment.
Fixed up, is it correct now?
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
a3c79dc to
078acef
Compare
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
e4930fe to
2604b22
Compare
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
2604b22 to
6307860
Compare
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
a246940 to
bcebfb0
Compare
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
b0995bc to
d9d3f43
Compare
ab68993 to
5513636
Compare
b0d9759 to
b2149b6
Compare
514102d to
655fb2d
Compare
realFlowControl
left a comment
There was a problem hiding this comment.
Thanks @bwoebi this is awesome!
Frameless functions (FLF) do not check the
EG(vm_interrupt)flag like it would happen usually on internal function returns. This means that from the point of view of the profiler, those functions do not exist, as the engine never checks the interrupt flag and as such never calls the profilers interrupt handler.This PR adds a trampoline for
aarch64andx84_64to all known frameless functions and a tail call to check theEG(vm_interrupt)flag and in case it is raised, call the interrupt handler.One (acceptable) trade-off is that using these trampolines,
libunwindin the crash tracker is not able to unwind passed the trampoline. Anyway, the impact is minimal:The profiler itself is unaffected of this, as we are unwinding the VM stack, following the
execute_datalinked list.https://datadoghq.atlassian.net/browse/PROF-12085