(pipelines): Add build performance observability pipeline #26316

Open

scottn12 wants to merge 75 commits into main from test/build-perf-observe
Conversation

@scottn12 (Contributor) commented Jan 28, 2026

Description

This PR updates the placeholder pipeline added in #26299. The new pipeline does the following:

  • Collects build metrics from ADO REST APIs for PR and internal builds
  • Generates an HTML dashboard (published as a pipeline artifact) which includes:
    • Summary metrics (total builds, avg duration, trend analysis)
    • Duration trend charts over time
    • Stage and task duration breakdown charts
    • Tables of recent and longest builds, including links to the source commits/PRs
  • Runs on a daily schedule

Note: The pipeline must run in the public and internal projects to fetch PR build data and internal build data, respectively.
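As a rough illustration of the builds-list call behind the first bullet (the organization, project, and definition id are placeholders, and `buildsListUrl` is a hypothetical helper, not code from this PR), the ADO REST request looks roughly like:

```typescript
// Hypothetical helper: constructs the ADO REST URL for listing recent
// builds of a pipeline definition. The query parameters shown
// (definitions, $top, queryOrder, api-version) are documented parameters
// of the ADO Builds - List endpoint.
function buildsListUrl(
	organization: string,
	project: string,
	definitionId: number,
	top: number,
): string {
	const params = new URLSearchParams({
		"api-version": "7.1",
		definitions: String(definitionId),
		$top: String(top),
		queryOrder: "queueTimeDescending",
	});
	return `https://dev.azure.com/${organization}/${project}/_apis/build/builds?${params}`;
}

console.log(buildsListUrl("contoso", "public", 1234, 50));
```

The same endpoint would be hit once per project (public for PR builds, internal for internal builds), which is why the pipeline has to run in both.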

Next Steps/Other Considerations

  • Seek feedback from the team on what other metrics could be useful.
  • It seems ADO only retains a limited number of PR builds. If possible, we should try to increase the limit.
  • Consider adding an external data store (especially if we cannot increase the public build limit).

Misc

Spec
AB#55451

Example Screenshot

dash screenshot

@scottn12 scottn12 marked this pull request as ready for review January 29, 2026 15:42
Copilot AI review requested due to automatic review settings January 29, 2026 15:42
Copilot AI left a comment


Pull request overview

This PR transforms a placeholder build performance observability pipeline into a functional monitoring solution. The pipeline collects build metrics from Azure DevOps REST APIs for both PR and internal builds, processes the data to generate comprehensive performance metrics, and deploys an interactive HTML dashboard to an Azure Static Web App.

Changes:

  • Added daily cron schedules (2 AM UTC for PR builds, 3 AM UTC for internal builds) with logic to ensure each schedule runs in the correct project
  • Implemented data collection via ADO REST APIs with parallel timeline fetching for performance
  • Created metrics processing using jq to generate aggregated statistics, duration trends, and stage/task breakdowns
  • Built an interactive dashboard with Chart.js visualizations, sortable tables, and tabbed navigation for PR vs internal builds

@@ -0,0 +1,105 @@
#!/bin/bash
A Member commented:

I'll bet you know what I am going to say. 😄 Is there any reason this and all the other bash logic can't be TypeScript and wrapped in a flub command? If it's functions in TypeScript, you can test it independently of the pipeline, and the infra will handle most of the concerns that aren't relevant to the job at hand, like reading env variables. E.g. in oclif you just declare that your flag should have an env var associated with it, and then your command's flag value gets picked up from the env var — and there are ways to test it. Anyway, broken record I know, but I would strongly recommend converting these. I'll bet Claude can do it for you, and write some tests using the ones we have as a model.
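To sketch the env-var-backed flag pattern described here: oclif does this declaratively (e.g. `Flags.string({ env: "ADO_PAT" })` in `@oclif/core`), while the standalone helper below — a hypothetical illustration, not code from this PR — just mimics the resolution order so it can be unit-tested:

```typescript
// Toy sketch: an explicit CLI value wins; otherwise the named environment
// variable fills in. oclif performs this same fallback for you when a flag
// declares an `env` option.
function resolveFlag(
	explicit: string | undefined,
	envVarName: string,
	env: Record<string, string | undefined>,
): string | undefined {
	return explicit ?? env[envVarName];
}

console.log(resolveFlag("cli-token", "ADO_PAT", { ADO_PAT: "env-token" })); // "cli-token"
console.log(resolveFlag(undefined, "ADO_PAT", { ADO_PAT: "env-token" })); // "env-token"
```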

The Contributor (PR author) replied:

> Is there any reason this and all the other bash logic can't be typescript and wrapped in a flub command?

Moving it to a flub command is a good idea, either in this PR or in a V2. The main reason it isn't already a flub command is because it was easier to develop/iterate as a standalone utility, especially before I was sure where/how the utility would be used.

<div class="no-data" id="internal-no-data" style="display: none;"><h3>No Internal Build Data Available</h3><p>Internal build metrics will appear here once the internal pipeline runs.</p></div>
</div>
</div>
<script>
A Member commented:

This feels hard to maintain - could it be externalized?

Comment on lines 106 to 107
// const STANDALONE_MODE = 'public'; // or 'internal'
// const INLINED_DATA = {...};
A Member commented:

So this is sort of a template that gets "rendered" by some other scripts in some way? If so, I would recommend using a formal templating language. HTML is notoriously hard to handle with regex-style search and replace or other string parsing. It's generally easier and more maintainable to use something like nunjucks, ejs, etc. I think we have nunjucks usage in build-tools if you want to see an example.
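To illustrate the separation this comment suggests — HTML lives in a template with placeholders, and data is injected by an engine rather than by string surgery on the final page — here is a toy renderer. A real implementation would use nunjucks or ejs as recommended; this `render` helper is a hypothetical stand-in that only handles `{{ key }}` substitution:

```typescript
// Minimal placeholder renderer: replaces {{ key }} tokens with values from
// a data object. Unknown keys render as empty strings. A real template
// engine adds loops, conditionals, and escaping on top of this idea.
function render(template: string, data: Record<string, string>): string {
	return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, key) => data[key] ?? "");
}

const page = render("<h3>{{ title }}</h3><p>{{ count }} builds analyzed</p>", {
	title: "Internal Builds",
	count: "42",
});
console.log(page); // <h3>Internal Builds</h3><p>42 builds analyzed</p>
```

The dashboard script would then combine one template file with the JSON metrics instead of carrying commented-out `STANDALONE_MODE`/`INLINED_DATA` toggles inside the HTML.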

if (dashboardData[mode]) setTimeout(() => renderDashboard(mode, dashboardData[mode]), 50);
}

async function loadData() {
@tylerbutler (Member) commented Feb 9, 2026:

The more I look at this, I think you might be best served by adopting something like Astro. Imagine if this was an HTML template that had placeholders in it that you could fill in with data - basically combine the template with data and produce the HTML output. Astro would formalize this, and provide you with a data "pipeline" of content sources that can be your JSON data. You can then have a clear separation between [loading data from ADO], [massaging/combining raw data into final forms] and [rendering the data into HTML format for viewing]. Something like Astro would make this easier to maintain, and you can get something deployable to Azure or elsewhere but also can spit out a static site if you want.

In other words, what if you thought about this as a single-page web app instead of a data pipeline or reporting pipeline? That sort of addresses the feedback @anthony-murphy had about making this a tool vs. a pipeline. I'm sort of suggesting that you make the tool itself an SPA and then you get the best of both worlds IMO. Regardless, separating the data collection and refinement clearly from the report generation would really help maintenance.

Claude with context7 is excellent at astro, and since you have the visuals worked out already, the changes are mostly mechanical and could be done by the agent.

A Member commented:

You can also use components in such a system, which is another way to improve maintainability, especially of this kind of conditional behavior, which is basically "component state."

@tylerbutler (Member) left a comment:

My biggest feedback is around architecture, and that's driven primarily from maintainability concerns that I have. I think structurally you have a web app that ingests some JSON data, massages it and aggregates it, and then renders some HTML and JS output based on the JSON input.

All the pieces are there, they're individually well-labelled, and they function, so from a prototype or proof of concept perspective this is in good shape. But from a maintenance perspective it's a lot of code to parse and understand, and it spans multiple languages (HTML, JS, bash script), some embedded in others, and the way each piece connects to the other is not super clear just from the names.

One way to split the difference between going full web app and the current state is to fully separate the data collection part from the rendering part. Imagine one pipeline that gathers data and writes it as an artifact, and then a second pipeline that reads the artifact data and generates the HTML. That gets you closer to clean separation of concerns.

That said, that split still runs into Tony's points about data retention, so I'm a huge fan of the SPA approach. Lots of benefits.

@tylerbutler (Member) left a comment:

PR Review: Build Performance Observability Pipeline

Found 3 critical, 7 important, and 6 suggestion-level issues. See inline comments below.

Worst-case scenario: ADO API returns an error response (valid JSON, passes jq empty) → zero build IDs → timeline fetch exits 0 → process-data.cjs produces zero-data dashboard → threshold check passes (0 > 90 is false) → dashboard deployed showing all zeros → users see it and can't distinguish from real data. Pipeline reports success at every stage.
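A hedged sketch of the fail-fast guard that would break this chain at its first link. Field names assume the public ADO builds-list response shape (`{ count, value: [...] }`, with `message` present on ADO error payloads); `assertBuildsPresent` is a hypothetical helper, not code from this PR:

```typescript
// An ADO error payload is valid JSON, so a bare `jq empty` check passes it
// through. This guard instead rejects both error payloads and empty build
// lists, so the pipeline fails loudly rather than publishing an all-zero
// dashboard.
interface BuildsResponse {
	count?: number;
	value?: unknown[];
	message?: string; // present on ADO error responses
}

function assertBuildsPresent(raw: string): unknown[] {
	const parsed: BuildsResponse = JSON.parse(raw);
	if (typeof parsed.message === "string") {
		throw new Error(`ADO API returned an error: ${parsed.message}`);
	}
	if (!Array.isArray(parsed.value) || parsed.value.length === 0) {
		throw new Error("ADO API returned zero builds; refusing to publish an all-zero dashboard");
	}
	return parsed.value;
}

// Error payloads pass `jq empty` but fail here:
try {
	assertBuildsPresent('{"message":"TF400813: not authorized"}');
} catch (e) {
	console.log((e as Error).message);
}
```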

@scottn12 scottn12 requested a review from tylerbutler February 12, 2026 19:57

4 participants