(pipelines): Add build performance observability pipeline #26316
Conversation
Pull request overview
This PR transforms a placeholder build performance observability pipeline into a functional monitoring solution. The pipeline collects build metrics from Azure DevOps REST APIs for both PR and internal builds, processes the data to generate comprehensive performance metrics, and deploys an interactive HTML dashboard to an Azure Static Web App.
Changes:
- Added daily cron schedules (2 AM UTC for PR builds, 3 AM UTC for internal builds) with logic to ensure each schedule runs in the correct project
- Implemented data collection via ADO REST APIs with parallel timeline fetching for performance
- Created metrics processing using jq to generate aggregated statistics, duration trends, and stage/task breakdowns
- Built an interactive dashboard with Chart.js visualizations, sortable tables, and tabbed navigation for PR vs internal builds
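The two cron schedules described above would look roughly like this in Azure DevOps pipeline YAML (a sketch; the actual branch filters and display names in the PR may differ, and the "run in the correct project" gating happens in separate pipeline logic, not in the schedule itself):

```yaml
# Sketch of the daily schedules described above; branch filters are assumptions.
schedules:
  - cron: "0 2 * * *" # 2 AM UTC - PR build metrics (public project)
    displayName: Daily PR build metrics
    branches:
      include:
        - main
    always: true # run even if there are no new commits
  - cron: "0 3 * * *" # 3 AM UTC - internal build metrics (internal project)
    displayName: Daily internal build metrics
    branches:
      include:
        - main
    always: true
```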
@@ -0,0 +1,105 @@
#!/bin/bash
I'll bet you know what I am going to say. 😄 Is there any reason this and all the other bash logic can't be TypeScript and wrapped in a flub command? If it's functions in TypeScript, you can test it independently of the pipeline, and the infra will handle most of the concerns that aren't relevant to the job at hand, like reading env variables. E.g., in oclif you just declare that your flag should have an env var associated with it, and then your command's flag var gets that value from the env var - and there are ways to test it. Anyway, broken record I know, but I would strongly recommend converting these. I'll bet Claude can do it for you, and write some tests using the ones we have as a model.
Is there any reason this and all the other bash logic can't be typescript and wrapped in a flub command?
Moving it to a flub command is a good idea, either in this PR or in a V2. The main reason it isn't already a flub command is that it was easier to develop/iterate as a standalone utility, especially before I was sure where/how the utility would be used.
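For context on the env-var point above: oclif's `Flags.string({ env: "..." })` option wires a flag to an environment variable automatically. A dependency-free sketch of the resolution order it gives you (names like `ADO_PAT` and `resolveFlag` are illustrative, not from the PR):

```typescript
// Sketch of env-var-backed flag resolution, mimicking what oclif's
// `Flags.string({ env: "ADO_PAT" })` does for you automatically.
// Precedence: explicit CLI value, then the environment, then the default.
function resolveFlag(
	cliValue: string | undefined,
	envVar: string,
	fallback?: string,
): string | undefined {
	return cliValue ?? process.env[envVar] ?? fallback;
}

// Example: no CLI value given, so the env var supplies it.
process.env["ADO_PAT"] = "token-from-env";
const pat = resolveFlag(undefined, "ADO_PAT");
console.log(pat); // "token-from-env"
```

Because the resolution is a plain function, it can be unit tested without running the pipeline, which is the testability benefit the comment is getting at.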
<div class="no-data" id="internal-no-data" style="display: none;"><h3>No Internal Build Data Available</h3><p>Internal build metrics will appear here once the internal pipeline runs.</p></div>
</div>
</div>
<script>
This feels hard to maintain - could it be externalized?
tools/pipelines/build-performance-observability-utils/generate-html-artifact.sh
// const STANDALONE_MODE = 'public'; // or 'internal'
// const INLINED_DATA = {...};
So this is sort of a template that gets "rendered" by some other scripts in some way? If so, I would recommend using a formal templating language. HTML is notoriously hard to handle with regex-style search and replace or other string parsing. It's generally easier and more maintainable to use something like nunjucks, ejs, etc. I think we have nunjucks usage in build-tools if you want to see an example.
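To illustrate the template-plus-data idea: nunjucks and ejs formalize this (with escaping, loops, and conditionals on top). A toy version of the core mechanism, with made-up template and field names:

```typescript
// Toy illustration of template-plus-data rendering, the pattern that
// nunjucks/ejs formalize. The template and field names are invented
// for this example, not taken from the PR.
const template =
	'<script>const STANDALONE_MODE = "{{ mode }}"; const INLINED_DATA = {{ data }};</script>';

function render(tpl: string, context: Record<string, string>): string {
	// Replace each {{ name }} placeholder with its context value.
	return tpl.replace(/\{\{\s*(\w+)\s*\}\}/g, (_, name) => context[name] ?? "");
}

const html = render(template, {
	mode: "internal",
	data: JSON.stringify({ builds: 42 }),
});
console.log(html);
// <script>const STANDALONE_MODE = "internal"; const INLINED_DATA = {"builds":42};</script>
```

A real templating engine adds what this toy lacks: HTML escaping, conditionals, loops, and clear errors on missing fields, which is exactly where regex-style substitution on HTML falls over.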
if (dashboardData[mode]) setTimeout(() => renderDashboard(mode, dashboardData[mode]), 50);
}

async function loadData() {
The more I look at this, I think you might be best served by adopting something like Astro. Imagine if this was an HTML template that had placeholders in it that you could fill in with data - basically combine the template with data and produce the HTML output. Astro would formalize this, and provide you with a data "pipeline" of content sources that can be your JSON data. You can then have a clear separation between [loading data from ADO], [massaging/combining raw data into final forms] and [rendering the data into HTML format for viewing]. Something like Astro would make this easier to maintain, and you can get something deployable to Azure or elsewhere but also can spit out a static site if you want.
In other words, what if you thought about this as a single-page web app instead of a data pipeline or reporting pipeline? That sort of addresses the feedback @anthony-murphy had about making this a tool vs. a pipeline. I'm sort of suggesting that you make the tool itself an SPA and then you get the best of both worlds IMO. Regardless, separating the data collection and refinement clearly from the report generation would really help maintenance.
Claude with context7 is excellent at astro, and since you have the visuals worked out already, the changes are mostly mechanical and could be done by the agent.
You can also use components in such a system, which is another way to improve maintainability, especially of this kind of conditional behavior, which is basically "component state."
tylerbutler left a comment
My biggest feedback is around architecture, and that's driven primarily from maintainability concerns that I have. I think structurally you have a web app that ingests some JSON data, massages it and aggregates it, and then renders some HTML and JS output based on the JSON input.
All the pieces are there, they're individually well-labelled, and they function, so from a prototype or proof of concept perspective this is in good shape. But from a maintenance perspective it's a lot of code to parse and understand, and it spans multiple languages (HTML, JS, bash script), some embedded in others, and the way each piece connects to the other is not super clear just from the names.
One way to split the difference between going full web app and the current state is to fully separate the data collection part from the rendering part. Imagine one pipeline that gathers data and writes it as an artifact, and then a second pipeline that reads the artifact data and generates the HTML. That gets you closer to clean separation of concerns.
That said, it runs up against Tony's concerns about data retention, so I'm a huge fan of the SPA approach instead. Lots of benefits.
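The separation argued for above can be sketched as three stages with explicit data handoffs, rather than one script that does everything. All names and data shapes here are illustrative, not from the PR:

```typescript
// Sketch of the suggested separation: collect -> aggregate -> render,
// each stage consuming only the previous stage's output shape.
// Everything here is illustrative, not taken from the PR.

interface RawBuild {
	id: number;
	durationMs: number;
}
interface Metrics {
	count: number;
	meanDurationMs: number;
}

// Stage 1: collect raw data (stubbed here; in reality, ADO REST calls)
// and write it out as a pipeline artifact.
function collect(): RawBuild[] {
	return [
		{ id: 1, durationMs: 600_000 },
		{ id: 2, durationMs: 900_000 },
	];
}

// Stage 2: massage/aggregate the raw artifact into final metrics.
function aggregate(builds: RawBuild[]): Metrics {
	const total = builds.reduce((sum, b) => sum + b.durationMs, 0);
	return { count: builds.length, meanDurationMs: total / builds.length };
}

// Stage 3: render metrics into HTML, knowing nothing about ADO.
function renderReport(m: Metrics): string {
	return `<p>${m.count} builds, mean ${Math.round(m.meanDurationMs / 60000)} min</p>`;
}

console.log(renderReport(aggregate(collect())));
// <p>2 builds, mean 13 min</p>
```

The payoff is that each stage can be tested and replaced independently: the renderer could become an Astro/SPA frontend without touching collection, and collection could switch data sources without touching rendering.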
tylerbutler left a comment
PR Review: Build Performance Observability Pipeline
Found 3 critical, 7 important, and 6 suggestion-level issues. See inline comments below.
Worst-case scenario: ADO API returns an error response (valid JSON, passes jq empty) → zero build IDs → timeline fetch exits 0 → process-data.cjs produces zero-data dashboard → threshold check passes (0 > 90 is false) → dashboard deployed showing all zeros → users see it and can't distinguish from real data. Pipeline reports success at every stage.
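A fail-closed guard at the first link of that chain would stop the rest: an ADO error payload is still valid JSON, so a bare "is it JSON?" check (like `jq empty`) passes it. A sketch of rejecting error payloads and empty build lists instead of letting zeros flow downstream (the `count`/`value` field names follow the ADO builds-list response shape, but treat them as assumptions):

```typescript
// Sketch of a fail-closed guard for the failure chain described above.
// An ADO error payload typically carries a `message` field instead of
// a `value` array; an empty `value` almost certainly means a query bug,
// not a genuinely quiet day.
function extractBuildIds(responseJson: string): number[] {
	const parsed = JSON.parse(responseJson);
	if (typeof parsed.message === "string" && !Array.isArray(parsed.value)) {
		throw new Error(`ADO API error: ${parsed.message}`);
	}
	if (!Array.isArray(parsed.value) || parsed.value.length === 0) {
		throw new Error("No builds returned; refusing to publish a zero-data dashboard");
	}
	return parsed.value.map((b: { id: number }) => b.id);
}

console.log(extractBuildIds('{"count":2,"value":[{"id":101},{"id":102}]}'));
// [ 101, 102 ]
```

With this shape, the pipeline fails loudly at collection time rather than deploying a dashboard full of zeros that users can't distinguish from real data.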
Description
This PR updates the placeholder pipeline added in #26299. The new pipeline does the following:
Note: The pipeline must run in the public and internal projects to fetch PR build data and internal build data, respectively.
Next Steps/Other Considerations
Misc
Spec
AB#55451
Example Screenshot