(pipelines): Add build performance observability pipeline #26316

Open

scottn12 wants to merge 75 commits into main from test/build-perf-observe
Conversation

@scottn12 (Contributor) commented Jan 28, 2026

Description

This PR updates the placeholder pipeline added in #26299. The new pipeline does the following:

  • Collects build metrics from ADO REST APIs for PR and internal builds
  • Generates an HTML dashboard (published as a pipeline artifact) which includes:
    • Summary metrics (total builds, avg duration, trend analysis)
    • Duration trend charts over time
    • Stage and task duration breakdown charts
    • Tables of recent and longest builds, including links to the source commits/PRs
  • Runs on a daily schedule

Note: The pipeline must run in the public and internal projects to fetch PR build data and internal build data, respectively.
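As a rough illustration of the builds-list call behind the first bullet (the organization, project, and definition id are placeholders, and `buildsListUrl` is a hypothetical helper, not code from this PR), the ADO REST request looks roughly like:

```typescript
// Hypothetical helper: constructs the ADO REST URL for listing recent
// builds of a pipeline definition. The query parameters shown
// (definitions, $top, queryOrder, api-version) are documented parameters
// of the ADO Builds - List endpoint.
function buildsListUrl(
	organization: string,
	project: string,
	definitionId: number,
	top: number,
): string {
	const params = new URLSearchParams({
		"api-version": "7.1",
		definitions: String(definitionId),
		$top: String(top),
		queryOrder: "queueTimeDescending",
	});
	return `https://dev.azure.com/${organization}/${project}/_apis/build/builds?${params}`;
}

console.log(buildsListUrl("contoso", "public", 1234, 50));
```

The same endpoint would be hit once per project (public for PR builds, internal for internal builds), which is why the pipeline has to run in both.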

Next Steps/Other Considerations

  • Seek feedback from the team on what other metrics could be useful.
  • It seems ADO only retains a limited number of PR builds. If possible, we should try to increase the limit.
  • Consider adding an external data store (especially if we cannot increase the public build limit).

Misc

Spec
AB#55451

Example Screenshot

dash screenshot

@scottn12 scottn12 marked this pull request as ready for review January 29, 2026 15:42
Copilot AI review requested due to automatic review settings January 29, 2026 15:42
Copilot AI left a comment


Pull request overview

This PR transforms a placeholder build performance observability pipeline into a functional monitoring solution. The pipeline collects build metrics from Azure DevOps REST APIs for both PR and internal builds, processes the data to generate comprehensive performance metrics, and deploys an interactive HTML dashboard to an Azure Static Web App.

Changes:

  • Added daily cron schedules (2 AM UTC for PR builds, 3 AM UTC for internal builds) with logic to ensure each schedule runs in the correct project
  • Implemented data collection via ADO REST APIs with parallel timeline fetching for performance
  • Created metrics processing using jq to generate aggregated statistics, duration trends, and stage/task breakdowns
  • Built an interactive dashboard with Chart.js visualizations, sortable tables, and tabbed navigation for PR vs internal builds

@@ -0,0 +1,105 @@
#!/bin/bash
A Member commented:

I'll bet you know what I am going to say. 😄 Is there any reason this and all the other bash logic can't be TypeScript and wrapped in a flub command? If it's functions in TypeScript, you can test it independently of the pipeline, and the infra will handle most of the concerns that aren't relevant to the job at hand, like reading env variables. E.g. in oclif you just declare that your flag should have an env var associated with it, and then your command's flag value gets picked up from the env var — and there are ways to test it. Anyway, broken record I know, but I would strongly recommend converting these. I'll bet Claude can do it for you, and write some tests using the ones we have as a model.
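To sketch the env-var-backed flag pattern described here: oclif does this declaratively (e.g. `Flags.string({ env: "ADO_PAT" })` in `@oclif/core`), while the standalone helper below — a hypothetical illustration, not code from this PR — just mimics the resolution order so it can be unit-tested:

```typescript
// Toy sketch: an explicit CLI value wins; otherwise the named environment
// variable fills in. oclif performs this same fallback for you when a flag
// declares an `env` option.
function resolveFlag(
	explicit: string | undefined,
	envVarName: string,
	env: Record<string, string | undefined>,
): string | undefined {
	return explicit ?? env[envVarName];
}

console.log(resolveFlag("cli-token", "ADO_PAT", { ADO_PAT: "env-token" })); // "cli-token"
console.log(resolveFlag(undefined, "ADO_PAT", { ADO_PAT: "env-token" })); // "env-token"
```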

The Contributor (PR author) replied:

> Is there any reason this and all the other bash logic can't be typescript and wrapped in a flub command?

Moving it to a flub command is a good idea, either in this PR or in a V2. The main reason it isn't already a flub command is because it was easier to develop/iterate as a standalone utility, especially before I was sure where/how the utility would be used.

<div class="no-data" id="internal-no-data" style="display: none;"><h3>No Internal Build Data Available</h3><p>Internal build metrics will appear here once the internal pipeline runs.</p></div>
</div>
</div>
<script>
A Member commented:

This feels hard to maintain - could it be externalized?

Comment on lines 106 to 107
// const STANDALONE_MODE = 'public'; // or 'internal'
// const INLINED_DATA = {...};
A Member commented:

So this is sort of a template that gets "rendered" by some other scripts in some way? If so, I would recommend using a formal templating language. HTML is notoriously hard to handle with regex-style search and replace or other string parsing. It's generally easier and more maintainable to use something like nunjucks, ejs, etc. I think we have nunjucks usage in build-tools if you want to see an example.
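To illustrate the separation this comment suggests — HTML lives in a template with placeholders, and data is injected by an engine rather than by string surgery on the final page — here is a toy renderer. A real implementation would use nunjucks or ejs as recommended; this `render` helper is a hypothetical stand-in that only handles `{{ key }}` substitution:

```typescript
// Minimal placeholder renderer: replaces {{ key }} tokens with values from
// a data object. Unknown keys render as empty strings. A real template
// engine adds loops, conditionals, and escaping on top of this idea.
function render(template: string, data: Record<string, string>): string {
	return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, key) => data[key] ?? "");
}

const page = render("<h3>{{ title }}</h3><p>{{ count }} builds analyzed</p>", {
	title: "Internal Builds",
	count: "42",
});
console.log(page); // <h3>Internal Builds</h3><p>42 builds analyzed</p>
```

The dashboard script would then combine one template file with the JSON metrics instead of carrying commented-out `STANDALONE_MODE`/`INLINED_DATA` toggles inside the HTML.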

if (dashboardData[mode]) setTimeout(() => renderDashboard(mode, dashboardData[mode]), 50);
}

async function loadData() {
@tylerbutler (Member) commented Feb 9, 2026:

The more I look at this, I think you might be best served by adopting something like Astro. Imagine if this was an HTML template that had placeholders in it that you could fill in with data - basically combine the template with data and produce the HTML output. Astro would formalize this, and provide you with a data "pipeline" of content sources that can be your JSON data. You can then have a clear separation between [loading data from ADO], [massaging/combining raw data into final forms] and [rendering the data into HTML format for viewing]. Something like Astro would make this easier to maintain, and you can get something deployable to Azure or elsewhere but also can spit out a static site if you want.

In other words, what if you thought about this as a single-page web app instead of a data pipeline or reporting pipeline? That sort of addresses the feedback @anthony-murphy had about making this a tool vs. a pipeline. I'm sort of suggesting that you make the tool itself an SPA and then you get the best of both worlds IMO. Regardless, separating the data collection and refinement clearly from the report generation would really help maintenance.

Claude with context7 is excellent at astro, and since you have the visuals worked out already, the changes are mostly mechanical and could be done by the agent.

A Member commented:

You can also use components in such a system, which is another way to improve maintainability, especially of this kind of conditional behavior, which is basically "component state."

@tylerbutler (Member) left a comment:

My biggest feedback is around architecture, and that's driven primarily from maintainability concerns that I have. I think structurally you have a web app that ingests some JSON data, massages it and aggregates it, and then renders some HTML and JS output based on the JSON input.

All the pieces are there, they're individually well-labelled, and they function, so from a prototype or proof of concept perspective this is in good shape. But from a maintenance perspective it's a lot of code to parse and understand, and it spans multiple languages (HTML, JS, bash script), some embedded in others, and the way each piece connects to the other is not super clear just from the names.

One way to split the difference between going full web app and the current state is to fully separate the data collection part from the rendering part. Imagine one pipeline that gathers data and writes it as an artifact, and then a second pipeline that reads the artifact data and generates the HTML. That gets you closer to clean separation of concerns.

That said, that split still runs into Tony's points about data retention, so I'm a huge fan of the SPA approach. Lots of benefits.

@tylerbutler (Member) left a comment:

PR Review: Build Performance Observability Pipeline

Found 3 critical, 7 important, and 6 suggestion-level issues. See inline comments below.

Worst-case scenario: ADO API returns an error response (valid JSON, passes jq empty) → zero build IDs → timeline fetch exits 0 → process-data.cjs produces zero-data dashboard → threshold check passes (0 > 90 is false) → dashboard deployed showing all zeros → users see it and can't distinguish from real data. Pipeline reports success at every stage.
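A hedged sketch of the fail-fast guard that would break this chain at its first link. Field names assume the public ADO builds-list response shape (`{ count, value: [...] }`, with `message` present on ADO error payloads); `assertBuildsPresent` is a hypothetical helper, not code from this PR:

```typescript
// An ADO error payload is valid JSON, so a bare `jq empty` check passes it
// through. This guard instead rejects both error payloads and empty build
// lists, so the pipeline fails loudly rather than publishing an all-zero
// dashboard.
interface BuildsResponse {
	count?: number;
	value?: unknown[];
	message?: string; // present on ADO error responses
}

function assertBuildsPresent(raw: string): unknown[] {
	const parsed: BuildsResponse = JSON.parse(raw);
	if (typeof parsed.message === "string") {
		throw new Error(`ADO API returned an error: ${parsed.message}`);
	}
	if (!Array.isArray(parsed.value) || parsed.value.length === 0) {
		throw new Error("ADO API returned zero builds; refusing to publish an all-zero dashboard");
	}
	return parsed.value;
}

// Error payloads pass `jq empty` but fail here:
try {
	assertBuildsPresent('{"message":"TF400813: not authorized"}');
} catch (e) {
	console.log((e as Error).message);
}
```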

@scottn12 scottn12 requested a review from tylerbutler February 12, 2026 19:57

4 participants