# CI Speed Optimization Results - Updated

## Multiple Run Comparison

### Integration Tests

| Run | maxWorkers | Test Execution | Total Time | Notes |
|-----|------------|----------------|------------|-------|
| Baseline (18827071962) | 200% | 193s | 3m59s | Before changes |
| PR #443 Run 1 (18827186714) | 100% | 183s | 3m53s | -10s (5.2% faster) |
| PR #443 Run 2 (18827282800) | 100% | **173s** | 4m16s | -20s (10.4% faster)* |

\* Total time includes a 37s queue/setup delay.

**Result: Consistent 10-20s improvement in test execution** ✅

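The Notes column can be reproduced from the test-execution times alone. The following is a minimal sketch of that arithmetic using only the numbers in the table; the function name `improvement` is illustrative and not part of any CI tooling.

```python
# Test-execution times in seconds, copied from the Integration Tests table.
BASELINE_S = 193
PR_RUNS_S = {"Run 1": 183, "Run 2": 173}


def improvement(baseline_s: int, run_s: int) -> tuple[int, float]:
    """Return (seconds saved, percent saved) relative to the baseline."""
    saved = baseline_s - run_s
    return saved, round(100 * saved / baseline_s, 1)


for name, seconds in PR_RUNS_S.items():
    saved, pct = improvement(BASELINE_S, seconds)
    print(f"{name}: -{saved}s ({pct}% faster)")
```

Both percentages are measured against the 193s baseline, not against the previous run.
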
### macOS Build

| Run | setup-cmux | Build | Package | Total Time |
|-----|------------|-------|---------|------------|
| Baseline (18827071959) | 87s | 28s | 93s | 3m50s |
| PR #443 Run 1 (18827186715) | 107s | 38s | 113s | 4m50s |
| PR #443 Run 2 (18827290033) | 105s | 36s | 110s | 4m39s |

**Result: Consistent 49-60s slowdown in total time** ❌

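The total-time slowdown can be checked by converting the `XmYYs` durations to seconds. This is a small sketch under the assumption that every duration in the table has the form `NmNNs` or `NNs`; the helper `to_seconds` is illustrative, not something the workflow provides.

```python
import re

# Total job times copied from the macOS Build table.
TOTALS = {"Baseline": "3m50s", "Run 1": "4m50s", "Run 2": "4m39s"}


def to_seconds(duration: str) -> int:
    """Convert a duration such as '3m50s' or '87s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds)


baseline = to_seconds(TOTALS["Baseline"])
for name in ("Run 1", "Run 2"):
    print(f"{name}: +{to_seconds(TOTALS[name]) - baseline}s")
```

This prints +60s for Run 1 and +49s for Run 2.
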
## Analysis Update

### Integration Tests: ✅ CLEAR WIN

Reducing maxWorkers from 200% to 100% shows **consistent improvement** in test execution time:
- Run 1: 10s faster (5.2%)
- Run 2: 20s faster (10.4%)
- Average: **15s improvement**

Both runs support the hypothesis that 64 workers on 32 cores was causing contention. These tests run faster with less parallelization.

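The worker counts behind that hypothesis can be reproduced with a short calculation. This sketch assumes the percentage form of `maxWorkers` is resolved against the runner's core count, which is what "64 workers on 32 cores" implies. The rounding rule (floor, minimum of one worker) is an assumption, and a given test runner may resolve it differently.

```python
import math

CORES = 32  # core count stated in the analysis


def workers_for(percent: int, cores: int) -> int:
    """Resolve a percentage worker setting to a worker count (assumed: floor, min 1)."""
    return max(1, math.floor(cores * percent / 100))


for percent in (200, 100):
    print(f"maxWorkers={percent}% -> {workers_for(percent, CORES)} workers")
```

At 200% the runner is oversubscribed two to one, while 100% gives one worker per core.
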
### macOS Build: ❌ CONSISTENT REGRESSION

The slowdown appears in both PR runs, which argues against random run-to-run variance:
- Run 1: +60s
- Run 2: +49s
- Average: **~55s slower**

**Possible root causes:**

In the baseline run (18827071959), the setup-cmux step took 87s, with the ImageMagick install accounting for roughly 45-70s of that.

In our optimized runs, setup-cmux consistently takes 105-107s. This suggests:

1. **The HOMEBREW_NO_INSTALL_CLEANUP flag may have made things worse**
   - OR it's being ignored and something else changed

2. **Alternate hypothesis: The baseline was unusually fast**
   - Need to check more baseline runs from main branch
   - May have had partial cache or faster network

3. **Build and Package steps are also slower**:
   - Build: 28s → 36-38s (+8-10s)
   - Package: 93s → 110-113s (+17-20s)
   - This suggests overall runner slowness, not just ImageMagick

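To see how much of the slowdown each step accounts for, the per-step deltas can be computed from the macOS Build table. This is a rough sketch using only the table's numbers; the "outside these steps" figure is whatever remains of the total-time delta after the three listed steps are subtracted.

```python
# Step durations in seconds, copied from the macOS Build table.
STEPS = ("setup-cmux", "Build", "Package")
BASELINE = {"setup-cmux": 87, "Build": 28, "Package": 93, "total": 230}
RUNS = {
    "Run 1": {"setup-cmux": 107, "Build": 38, "Package": 113, "total": 290},
    "Run 2": {"setup-cmux": 105, "Build": 36, "Package": 110, "total": 279},
}


def step_deltas(run: dict[str, int]) -> dict[str, int]:
    """Seconds added per step relative to the baseline run."""
    return {step: run[step] - BASELINE[step] for step in STEPS}


for name, run in RUNS.items():
    deltas = step_deltas(run)
    total_delta = run["total"] - BASELINE["total"]
    unexplained = total_delta - sum(deltas.values())
    print(name, deltas, f"total +{total_delta}s, outside these steps +{unexplained}s")
```

setup-cmux contributes 18-20s of the slowdown, while Build and Package together contribute 25-30s. The brew change should not affect Build or Package, which is why the numbers point toward general runner slowness.
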
## Decision

### ✅ Keep Integration Test Optimization
The maxWorkers=100% change is a clear win with no observed downsides.

### ❌ Revert macOS Brew Change
The `HOMEBREW_NO_INSTALL_CLEANUP=1` flag either:
1. Made things slower (counterintuitively)
2. Did nothing, and we're seeing normal variance

Either way, it's not helping. We should revert it and investigate further.

### 🔍 Next Steps

1. **Revert the HOMEBREW_NO_INSTALL_CLEANUP change**
2. **Check if baseline was anomalous** - Run the build workflow on the unmodified main branch 2-3 times
3. **If baseline was typical** - Then the brew flag made things worse; investigate why
4. **Focus on other optimizations** - Brew install may not be optimizable on ephemeral runners

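Once the extra baseline runs from step 2 exist, one way to decide whether the PR runs fall outside normal variance is to compare them against the baseline mean and spread. The sketch below is only one possible check. The function name `outside_baseline_spread` and the threshold `k = 2` standard deviations are assumptions, and no additional baseline timings are available yet.

```python
from statistics import mean, stdev


def outside_baseline_spread(baseline_s: list[float], candidate_s: float, k: float = 2.0) -> bool:
    """True if candidate exceeds the baseline mean by more than k sample standard deviations.

    Needs at least two baseline samples: a single run says nothing about spread.
    """
    if len(baseline_s) < 2:
        raise ValueError("need at least two baseline runs to estimate variance")
    return candidate_s > mean(baseline_s) + k * stdev(baseline_s)


# With only the one recorded baseline (230s total), the check cannot be made yet.
try:
    outside_baseline_spread([230], 290)
except ValueError as exc:
    print(exc)
```

With the single baseline recorded so far the function refuses to answer, which is why step 2 has to happen before step 3.
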
## Summary

**Net Result of This PR (if we keep both changes):**
- Integration tests: **-15s** ✅
- macOS build: **+55s** ❌
- **Net change: +40s overall** ❌

**Recommendation:**
- Keep integration test change only
- Revert brew cleanup change
- Net result: **-15s** ✅