initial spark launcher instrumentation #10629
gh-worker-dd-mergequeue-cf854d[bot] merged 21 commits into master
Benchmarks
Startup
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 63 metrics, 8 unstable metrics.
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.063 s) : 0, 1063435
Total [baseline] (10.884 s) : 0, 10883665
Agent [candidate] (1.063 s) : 0, 1062621
Total [candidate] (10.908 s) : 0, 10908243
section appsec
Agent [baseline] (1.24 s) : 0, 1239925
Total [baseline] (10.988 s) : 0, 10987529
Agent [candidate] (1.241 s) : 0, 1240676
Total [candidate] (10.979 s) : 0, 10978688
section iast
Agent [baseline] (1.234 s) : 0, 1233604
Total [baseline] (11.185 s) : 0, 11185223
Agent [candidate] (1.245 s) : 0, 1244567
Total [candidate] (11.373 s) : 0, 11372889
section profiling
Agent [baseline] (1.189 s) : 0, 1189461
Total [baseline] (11.043 s) : 0, 11042680
Agent [candidate] (1.192 s) : 0, 1192495
Total [candidate] (10.927 s) : 0, 10926707
gantt
title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.188 ms) : 0, 1188
BytebuddyAgent [baseline] (626.462 ms) : 0, 626462
BytebuddyAgent [candidate] (627.005 ms) : 0, 627005
AgentMeter [baseline] (29.062 ms) : 0, 29062
AgentMeter [candidate] (28.989 ms) : 0, 28989
GlobalTracer [baseline] (257.425 ms) : 0, 257425
GlobalTracer [candidate] (257.129 ms) : 0, 257129
AppSec [baseline] (33.051 ms) : 0, 33051
AppSec [candidate] (32.797 ms) : 0, 32797
Debugger [baseline] (64.349 ms) : 0, 64349
Debugger [candidate] (65.157 ms) : 0, 65157
Remote Config [baseline] (612.201 µs) : 0, 612
Remote Config [candidate] (631.497 µs) : 0, 631
Telemetry [baseline] (10.671 ms) : 0, 10671
Telemetry [candidate] (9.143 ms) : 0, 9143
Flare Poller [baseline] (4.478 ms) : 0, 4478
Flare Poller [candidate] (4.57 ms) : 0, 4570
section appsec
crashtracking [baseline] (1.21 ms) : 0, 1210
crashtracking [candidate] (1.196 ms) : 0, 1196
BytebuddyAgent [baseline] (659.271 ms) : 0, 659271
BytebuddyAgent [candidate] (658.405 ms) : 0, 658405
AgentMeter [baseline] (11.898 ms) : 0, 11898
AgentMeter [candidate] (11.919 ms) : 0, 11919
GlobalTracer [baseline] (257.625 ms) : 0, 257625
GlobalTracer [candidate] (258.818 ms) : 0, 258818
IAST [baseline] (25.37 ms) : 0, 25370
IAST [candidate] (25.346 ms) : 0, 25346
AppSec [baseline] (167.809 ms) : 0, 167809
AppSec [candidate] (168.171 ms) : 0, 168171
Debugger [baseline] (66.82 ms) : 0, 66820
Debugger [candidate] (66.853 ms) : 0, 66853
Remote Config [baseline] (651.257 µs) : 0, 651
Remote Config [candidate] (651.015 µs) : 0, 651
Telemetry [baseline] (9.435 ms) : 0, 9435
Telemetry [candidate] (9.506 ms) : 0, 9506
Flare Poller [baseline] (3.709 ms) : 0, 3709
Flare Poller [candidate] (3.692 ms) : 0, 3692
section iast
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.218 ms) : 0, 1218
BytebuddyAgent [baseline] (796.514 ms) : 0, 796514
BytebuddyAgent [candidate] (806.433 ms) : 0, 806433
AgentMeter [baseline] (11.311 ms) : 0, 11311
AgentMeter [candidate] (11.557 ms) : 0, 11557
GlobalTracer [baseline] (248.136 ms) : 0, 248136
GlobalTracer [candidate] (248.555 ms) : 0, 248555
IAST [baseline] (26.941 ms) : 0, 26941
IAST [candidate] (27.115 ms) : 0, 27115
AppSec [baseline] (34.108 ms) : 0, 34108
AppSec [candidate] (35.631 ms) : 0, 35631
Debugger [baseline] (66.543 ms) : 0, 66543
Debugger [candidate] (65.013 ms) : 0, 65013
Remote Config [baseline] (533.458 µs) : 0, 533
Remote Config [candidate] (542.192 µs) : 0, 542
Telemetry [baseline] (8.724 ms) : 0, 8724
Telemetry [candidate] (8.711 ms) : 0, 8711
Flare Poller [baseline] (3.482 ms) : 0, 3482
Flare Poller [candidate] (3.471 ms) : 0, 3471
section profiling
crashtracking [baseline] (1.184 ms) : 0, 1184
crashtracking [candidate] (1.2 ms) : 0, 1200
BytebuddyAgent [baseline] (680.97 ms) : 0, 680970
BytebuddyAgent [candidate] (682.493 ms) : 0, 682493
AgentMeter [baseline] (8.563 ms) : 0, 8563
AgentMeter [candidate] (8.602 ms) : 0, 8602
GlobalTracer [baseline] (215.977 ms) : 0, 215977
GlobalTracer [candidate] (216.357 ms) : 0, 216357
AppSec [baseline] (32.399 ms) : 0, 32399
AppSec [candidate] (32.39 ms) : 0, 32390
Debugger [baseline] (67.211 ms) : 0, 67211
Debugger [candidate] (67.328 ms) : 0, 67328
Remote Config [baseline] (633.438 µs) : 0, 633
Remote Config [candidate] (644.467 µs) : 0, 644
Telemetry [baseline] (8.983 ms) : 0, 8983
Telemetry [candidate] (9.109 ms) : 0, 9109
Flare Poller [baseline] (3.728 ms) : 0, 3728
Flare Poller [candidate] (3.778 ms) : 0, 3778
ProfilingAgent [baseline] (99.192 ms) : 0, 99192
ProfilingAgent [candidate] (99.877 ms) : 0, 99877
Profiling [baseline] (99.774 ms) : 0, 99774
Profiling [candidate] (100.44 ms) : 0, 100440
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.073 s) : 0, 1072699
Total [baseline] (8.776 s) : 0, 8775549
Agent [candidate] (1.07 s) : 0, 1070437
Total [candidate] (8.749 s) : 0, 8748768
section iast
Agent [baseline] (1.231 s) : 0, 1231216
Total [baseline] (9.404 s) : 0, 9404433
Agent [candidate] (1.231 s) : 0, 1230717
Total [candidate] (9.383 s) : 0, 9383082
gantt
title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.211 ms) : 0, 1211
crashtracking [candidate] (1.22 ms) : 0, 1220
BytebuddyAgent [baseline] (632.156 ms) : 0, 632156
BytebuddyAgent [candidate] (631.373 ms) : 0, 631373
AgentMeter [baseline] (29.379 ms) : 0, 29379
AgentMeter [candidate] (29.214 ms) : 0, 29214
GlobalTracer [baseline] (258.895 ms) : 0, 258895
GlobalTracer [candidate] (258.937 ms) : 0, 258937
AppSec [baseline] (33.195 ms) : 0, 33195
AppSec [candidate] (33.101 ms) : 0, 33101
Debugger [baseline] (61.693 ms) : 0, 61693
Debugger [candidate] (64.584 ms) : 0, 64584
Remote Config [baseline] (634.621 µs) : 0, 635
Remote Config [candidate] (612.066 µs) : 0, 612
Telemetry [baseline] (11.538 ms) : 0, 11538
Telemetry [candidate] (9.062 ms) : 0, 9062
Flare Poller [baseline] (7.579 ms) : 0, 7579
Flare Poller [candidate] (6.034 ms) : 0, 6034
section iast
crashtracking [baseline] (1.207 ms) : 0, 1207
crashtracking [candidate] (1.207 ms) : 0, 1207
BytebuddyAgent [baseline] (795.116 ms) : 0, 795116
BytebuddyAgent [candidate] (795.427 ms) : 0, 795427
AgentMeter [baseline] (11.34 ms) : 0, 11340
AgentMeter [candidate] (11.333 ms) : 0, 11333
GlobalTracer [baseline] (247.658 ms) : 0, 247658
GlobalTracer [candidate] (247.621 ms) : 0, 247621
IAST [baseline] (27.074 ms) : 0, 27074
IAST [candidate] (26.904 ms) : 0, 26904
AppSec [baseline] (33.98 ms) : 0, 33980
AppSec [candidate] (35.333 ms) : 0, 35333
Debugger [baseline] (65.989 ms) : 0, 65989
Debugger [candidate] (64.026 ms) : 0, 64026
Remote Config [baseline] (532.998 µs) : 0, 533
Remote Config [candidate] (539.665 µs) : 0, 540
Telemetry [baseline] (8.63 ms) : 0, 8630
Telemetry [candidate] (8.748 ms) : 0, 8748
Flare Poller [baseline] (3.468 ms) : 0, 3468
Flare Poller [candidate] (3.508 ms) : 0, 3508
Load
Parameters
See matching parameters
Summary
Found 0 performance improvements and 2 performance regressions! Performance is the same for 18 metrics, 16 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section baseline
no_agent (18.999 ms) : 18808, 19191
. : milestone, 18999,
appsec (18.375 ms) : 18190, 18560
. : milestone, 18375,
code_origins (17.824 ms) : 17646, 18003
. : milestone, 17824,
iast (17.755 ms) : 17576, 17933
. : milestone, 17755,
profiling (19.193 ms) : 18996, 19390
. : milestone, 19193,
tracing (17.78 ms) : 17608, 17952
. : milestone, 17780,
section candidate
no_agent (18.698 ms) : 18503, 18894
. : milestone, 18698,
appsec (18.96 ms) : 18770, 19149
. : milestone, 18960,
code_origins (17.808 ms) : 17632, 17984
. : milestone, 17808,
iast (18.444 ms) : 18256, 18632
. : milestone, 18444,
profiling (18.535 ms) : 18350, 18720
. : milestone, 18535,
tracing (17.788 ms) : 17610, 17966
. : milestone, 17788,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section baseline
no_agent (1.174 ms) : 1162, 1185
. : milestone, 1174,
iast (3.237 ms) : 3194, 3280
. : milestone, 3237,
iast_FULL (6.063 ms) : 6001, 6126
. : milestone, 6063,
iast_GLOBAL (3.459 ms) : 3409, 3509
. : milestone, 3459,
profiling (2.284 ms) : 2260, 2308
. : milestone, 2284,
tracing (1.817 ms) : 1802, 1831
. : milestone, 1817,
section candidate
no_agent (1.186 ms) : 1175, 1198
. : milestone, 1186,
iast (3.16 ms) : 3117, 3203
. : milestone, 3160,
iast_FULL (6.063 ms) : 6001, 6125
. : milestone, 6063,
iast_GLOBAL (3.605 ms) : 3544, 3666
. : milestone, 3605,
profiling (2.067 ms) : 2048, 2086
. : milestone, 2067,
tracing (1.752 ms) : 1738, 1766
. : milestone, 1752,
Dacapo
Parameters
See matching parameters
Summary
Found 1 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 0 unstable metrics.
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section baseline
no_agent (15.624 s) : 15624000, 15624000
. : milestone, 15624000,
appsec (14.954 s) : 14954000, 14954000
. : milestone, 14954000,
iast (18.009 s) : 18009000, 18009000
. : milestone, 18009000,
iast_GLOBAL (18.006 s) : 18006000, 18006000
. : milestone, 18006000,
profiling (14.629 s) : 14629000, 14629000
. : milestone, 14629000,
tracing (14.528 s) : 14528000, 14528000
. : milestone, 14528000,
section candidate
no_agent (15.392 s) : 15392000, 15392000
. : milestone, 15392000,
appsec (14.912 s) : 14912000, 14912000
. : milestone, 14912000,
iast (17.819 s) : 17819000, 17819000
. : milestone, 17819000,
iast_GLOBAL (17.785 s) : 17785000, 17785000
. : milestone, 17785000,
profiling (14.95 s) : 14950000, 14950000
. : milestone, 14950000,
tracing (14.569 s) : 14569000, 14569000
. : milestone, 14569000,
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~7c0d7eeded, baseline=1.60.0-SNAPSHOT~a249dd3265
dateFormat X
axisFormat %s
section baseline
no_agent (1.476 ms) : 1465, 1488
. : milestone, 1476,
appsec (3.722 ms) : 3504, 3939
. : milestone, 3722,
iast (2.265 ms) : 2196, 2335
. : milestone, 2265,
iast_GLOBAL (2.309 ms) : 2239, 2379
. : milestone, 2309,
profiling (2.089 ms) : 2034, 2145
. : milestone, 2089,
tracing (2.072 ms) : 2018, 2126
. : milestone, 2072,
section candidate
no_agent (1.473 ms) : 1461, 1485
. : milestone, 1473,
appsec (2.537 ms) : 2481, 2592
. : milestone, 2537,
iast (2.256 ms) : 2187, 2326
. : milestone, 2256,
iast_GLOBAL (2.3 ms) : 2230, 2370
. : milestone, 2300,
profiling (2.097 ms) : 2042, 2153
. : milestone, 2097,
tracing (2.07 ms) : 2016, 2124
. : milestone, 2070,
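To read the charts above numerically, the following is a minimal sketch of two simple comparisons: the relative change between a baseline and a candidate mean, and whether their reported `[CI 0.99]` intervals overlap. The class and method names are hypothetical, and this is not the benchmark tooling's own classification logic, whose criteria are not stated in this report. Values are in microseconds, as in the gantt rows.

```java
final class BenchmarkDelta {
  private BenchmarkDelta() {}

  /** Relative change of candidate vs. baseline; negative means the candidate is faster. */
  static double relativeChange(double baseline, double candidate) {
    if (baseline == 0.0) {
      throw new IllegalArgumentException("baseline must be non-zero");
    }
    return (candidate - baseline) / baseline;
  }

  /** True when the closed intervals [lo1, hi1] and [lo2, hi2] share at least one point. */
  static boolean intervalsOverlap(double lo1, double hi1, double lo2, double hi2) {
    return lo1 <= hi2 && lo2 <= hi1;
  }

  public static void main(String[] args) {
    // tomcat / appsec rows from the report: baseline 3722 [3504, 3939], candidate 2537 [2481, 2592]
    double change = relativeChange(3722, 2537);
    boolean overlap = intervalsOverlap(3504, 3939, 2481, 2592);
    System.out.printf("relative change: %.1f%%, intervals overlap: %b%n", change * 100, overlap);
  }
}
```

Non-overlapping intervals are only a rough signal; several rows in the load section have non-overlapping intervals without necessarily being counted as regressions or improvements in the summary.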
449dd10 to 8981bb1
8981bb1 to 74326e0
c227a36 to f7d45ac
pawel-big-lebowski left a comment
Pushing first round of comments.
Main concern: do we have to call advice from within an advice?
...on/spark/spark-common/src/main/java/datadog/trace/instrumentation/spark/SparkExitAdvice.java
...k-common/src/main/java/datadog/trace/instrumentation/spark/AbstractSparkInstrumentation.java
    Map<String, String> conf = (Map<String, String>) confField.get(builder);
    if (conf != null) {
      for (Map.Entry<String, String> entry : conf.entrySet()) {
        if (SparkConfAllowList.canCaptureJobParameter(entry.getKey())) {
Can't we use datadog.trace.instrumentation.spark.SparkConfAllowList#getRedactedSparkConf here, the same way datadog.trace.instrumentation.spark.AbstractDatadogSparkListener#captureJobParameters does?
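For context, the pattern under discussion (reading a non-public `conf` map reflectively and keeping only allow-listed keys) can be sketched as follows. This is an illustration only: `FakeBuilder`, `ALLOWED`, and `canCapture` are hypothetical stand-ins, not the launcher's builder or the real `SparkConfAllowList` rules.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class LauncherConfCapture {
  private LauncherConfCapture() {}

  /** Hypothetical stand-in for a builder that keeps its configuration in a private field. */
  static final class FakeBuilder {
    private final Map<String, String> conf = new LinkedHashMap<>();

    FakeBuilder set(String key, String value) {
      conf.put(key, value);
      return this;
    }
  }

  // Hypothetical allow-list; the real SparkConfAllowList applies its own rules.
  private static final Set<String> ALLOWED = Set.of("spark.app.name", "spark.master");

  static boolean canCapture(String key) {
    return ALLOWED.contains(key);
  }

  /** Reads the private "conf" field reflectively and returns only allow-listed entries. */
  @SuppressWarnings("unchecked")
  static Map<String, String> capture(Object builder) {
    Map<String, String> captured = new LinkedHashMap<>();
    try {
      Field confField = builder.getClass().getDeclaredField("conf");
      confField.setAccessible(true);
      Map<String, String> conf = (Map<String, String>) confField.get(builder);
      if (conf != null) {
        for (Map.Entry<String, String> entry : conf.entrySet()) {
          if (canCapture(entry.getKey())) {
            captured.put(entry.getKey(), entry.getValue());
          }
        }
      }
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Instrumentation should never break the instrumented code; return what was captured so far.
    }
    return captured;
  }
}
```

Delegating to one shared helper, as the comment suggests, would keep the allow-list and redaction rules in a single place instead of duplicating the loop in the advice.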
...park/spark-common/src/main/java/datadog/trace/instrumentation/spark/SparkLauncherAdvice.java
...rk/spark-common/src/main/java/datadog/trace/instrumentation/spark/SparkLauncherListener.java
    private static final Logger log = LoggerFactory.getLogger(SparkLauncherAdvice.class);

    // Same default pattern as spark.redaction.regex in Spark source
    private static final Pattern CONF_REDACTION_PATTERN =
If you add SparkConfAllowList as helper class in advice, you should be able to use its redaction methods.
Thanks, that makes sense. Addressed!
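As an illustration of the redaction being discussed, here is a minimal sketch of key-based redaction over a configuration map. The regex is assumed to match the default of `spark.redaction.regex` in recent Spark versions and should be checked against the Spark source for the version in use; the `[redacted]` placeholder and the class name are hypothetical, and this is not the `SparkConfAllowList` implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class ConfRedactionSketch {
  private ConfRedactionSketch() {}

  // Assumption: mirrors Spark's default spark.redaction.regex; verify for your Spark version.
  static final Pattern CONF_REDACTION_PATTERN =
      Pattern.compile("(?i)secret|password|token|access[.]key");

  // Hypothetical placeholder text for redacted values.
  static final String REDACTED = "[redacted]";

  /** Returns a copy of conf in which values of sensitive-looking keys are replaced. */
  static Map<String, String> redact(Map<String, String> conf) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : conf.entrySet()) {
      // find() matches anywhere in the key, so "fs.s3a.secret.key" is treated as sensitive.
      boolean sensitive = CONF_REDACTION_PATTERN.matcher(entry.getKey()).find();
      result.put(entry.getKey(), sensitive ? REDACTED : entry.getValue());
    }
    return result;
  }
}
```

This sketch only inspects keys; a fuller implementation might also inspect values.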
390dafa to 559ce6c
f4c59f3 to f57bf18
pawel-big-lebowski left a comment
Keeping the code within a single advice makes it more readable and makes the outcome easier to predict. Thanks for making this change.
...park/spark-common/src/main/java/datadog/trace/instrumentation/spark/SparkLauncherAdvice.java
025bd48 to 7f4844a
      AbstractDatadogSparkListener.listener.finishApplication(
          System.currentTimeMillis(), throwable, 0, null);
    } else {
      SparkLauncherListener.finishSpanWithThrowable(throwable);
I'm leaving this static reference here rather than relying on InstanceStore. InstanceStore lives in the bootstrap classloader, so the values stored in it must also be visible from the bootstrap classloader, but SparkLauncherListener is an agent instrumentation class.
In the rest of the Spark instrumentation, InstanceStore is used for SparkConf and SparkListenerInterface. Both are Spark classes loaded by the application classloader, not agent instrumentation classes.
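The static-reference approach described above can be sketched roughly as follows. All names here are hypothetical stand-ins rather than the PR's actual classes: the point is only that the holder and the listener type live in the same classloader, so no store in another classloader ever needs to see the listener's type.

```java
import java.util.concurrent.atomic.AtomicReference;

final class LauncherListenerHolder {
  private LauncherListenerHolder() {}

  /** Hypothetical stand-in for an agent-side listener that owns the launcher span. */
  static final class Listener {
    private boolean finished;
    private Throwable error;

    synchronized void finish(Throwable throwable) {
      if (finished) {
        return; // finish at most once
      }
      finished = true;
      error = throwable;
    }

    synchronized boolean isFinished() {
      return finished;
    }

    synchronized Throwable error() {
      return error;
    }
  }

  // Static reference held by a class from the same classloader as the listener itself.
  private static final AtomicReference<Listener> CURRENT = new AtomicReference<>();

  static void register(Listener listener) {
    CURRENT.set(listener);
  }

  /** Finishes the currently registered listener, if any, and clears the reference. */
  static void finishSpanWithThrowable(Throwable throwable) {
    Listener listener = CURRENT.getAndSet(null);
    if (listener != null) {
      listener.finish(throwable);
    }
  }
}
```

Clearing the reference on finish avoids keeping the listener reachable after the launch has completed; whether the actual instrumentation does the same is not stated in this thread.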
pawel-big-lebowski left a comment
Code after changes looks good to me.
/merge
View all feedbacks in Devflow UI.
The expected merge time in
/merge
View all feedbacks in Devflow UI.
PR already in the queue with status in_progress
What Does This Do
Motivation
Emit spark.launcher.launch spans for SparkAppHandler so that the launcher, which can fail independently of the app it starts, can be monitored.
Span example: https://ddstaging.datadoghq.com/apm/traces?query=job_flow_id%3A%2A&agg_m=count&agg_m_source=base&agg_t=count&cols=core_service%2Ccore_resource_name%2Clog_duration%2Clog_http.method%2Clog_http.status_code&fromUser=false&graphType=flamegraph&historicalData=true&messageDisplay=inline&query_translation_version=v0&shouldShowLegend=true&sort=desc&spanID=6071379317785568039&spanType=all&spanViewType=metadata&sparkMetricsSections=io%2Cmemory%2CcpuTable&storage=hot&timeHint=1771578766670&trace=AwAAAZx6UrVOWC0CIQAAABhBWng2VXI0dEFBRFp2V0ZWbnpsclVfSjkAAAAkZjE5YzdhNTQtMjA0My00OTg2LTgyZmMtZDNmMjg4ZmY3ZGU0AABLaA&traceID=6998256f000000007df290e2863efdc1&traceQuery=&view=spans&start=1771540655768&end=1771583855768&paused=false
Additional Notes
Contributor Checklist
type: and (comp: or inst:) labels in addition to any other useful labels
close, fix, or any linking keywords when referencing an issue
Use solves instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]
Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.