Commit 8c32d7c

Add Nsight Systems profiling to CI
- Run nsys profile with 1000 iterations if nsys is available
- Captures CUDA, NVTX, and OS runtime traces
- Uploads .nsys-rep file as artifact for visual analysis
- continue-on-error: true so CI doesn't fail if nsys unavailable
1 parent 1bfbaf9 commit 8c32d7c
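
For anyone wanting to reproduce the profiling step outside CI, the same "run only if nsys is installed" behavior can be sketched in Python. This is a minimal illustration, not part of the commit: the helper names `build_nsys_command` and `run_profile` are hypothetical, and the flags are simply copied from the workflow step in this diff.

```python
import shutil
import subprocess


def build_nsys_command(output, target, traces=("cuda", "nvtx", "osrt")):
    """Assemble the nsys command line used by the CI step (hypothetical helper)."""
    return [
        "nsys", "profile",
        "-o", output,
        f"--trace={','.join(traces)}",
        "--cuda-memory-usage=true",
        "--stats=true",
        *target,
    ]


def run_profile(output, target):
    """Run the profile if nsys is on PATH; return its exit code, or None if skipped."""
    if shutil.which("nsys") is None:
        print("nsys not found, skipping Nsight profiling")
        return None
    # check=False mirrors continue-on-error: a failed profile does not raise.
    return subprocess.run(build_nsys_command(output, target), check=False).returncode


# Script path and arguments are taken from the workflow step in this commit.
command = build_nsys_command(
    "nsight_profiles/lax_scan_trace",
    ["python", "scripts/profile_lax_scan.py", "--nsys", "-n", "1000"],
)
print(" ".join(command))
```

Calling `run_profile(...)` with the same arguments would execute the command when `nsys` is installed and skip it otherwise.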

1 file changed: +28 -0 lines changed

.github/workflows/ci.yml

Lines changed: 28 additions & 0 deletions
@@ -36,6 +36,34 @@ jobs:
           echo ""
           echo "The diagnostic shows if time scales linearly with iterations,"
           echo "which indicates constant per-iteration CPU-GPU sync overhead."
+      - name: Nsight Systems Profile (if available)
+        shell: bash -l {0}
+        continue-on-error: true
+        run: |
+          echo "=== NVIDIA Nsight Systems Profiling ==="
+          if command -v nsys &> /dev/null; then
+            echo "nsys found, running profile with 1000 iterations..."
+            mkdir -p nsight_profiles
+            nsys profile -o nsight_profiles/lax_scan_trace \
+              --trace=cuda,nvtx,osrt \
+              --cuda-memory-usage=true \
+              --stats=true \
+              python scripts/profile_lax_scan.py --nsys -n 1000
+            echo ""
+            echo "Profile saved to nsight_profiles/lax_scan_trace.nsys-rep"
+            echo "Download artifact and open in Nsight Systems UI to see CPU-GPU sync pattern"
+          else
+            echo "nsys not found, skipping Nsight profiling"
+            echo "Install NVIDIA Nsight Systems to enable this profiling"
+          fi
+      - name: Upload Nsight Profile
+        uses: actions/upload-artifact@v5
+        if: success() || failure()
+        continue-on-error: true
+        with:
+          name: nsight-profile
+          path: nsight_profiles/
+          if-no-files-found: ignore
       # === Benchmark Tests (Bare Metal, Jupyter, Jupyter-Book) ===
       - name: Run Hardware Benchmarks (Bare Metal)
         shell: bash -l {0}
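
The context lines at the top of this hunk describe the diagnostic: if total time grows linearly with the iteration count, each iteration carries a roughly constant overhead. A minimal sketch of that check follows, using an ordinary least-squares fit in plain Python. The function name `fit_per_iteration_overhead` and the timing values are hypothetical placeholders, not measurements from this repository.

```python
def fit_per_iteration_overhead(iterations, seconds):
    """Least-squares fit of seconds = intercept + slope * iterations.

    Returns (slope, intercept, r_squared). The slope estimates the cost per
    iteration; an r_squared near 1 means the scaling is close to linear.
    """
    n = len(iterations)
    mean_x = sum(iterations) / n
    mean_y = sum(seconds) / n
    sxx = sum((x - mean_x) ** 2 for x in iterations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(iterations, seconds))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in seconds)
    ss_res = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(iterations, seconds)
    )
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


# Hypothetical timings (seconds) for increasing iteration counts.
iteration_counts = [100, 200, 400, 800]
elapsed_seconds = [0.06, 0.11, 0.21, 0.41]

slope, intercept, r_squared = fit_per_iteration_overhead(
    iteration_counts, elapsed_seconds
)
print(f"per-iteration cost: {slope * 1e6:.1f} us, fixed cost: {intercept * 1e3:.1f} ms")
print(f"R^2 = {r_squared:.4f}")
```

A slope that stays the same as the iteration count grows, together with an R² close to 1, is consistent with a fixed per-iteration cost such as a CPU-GPU synchronization. The Nsight trace is what would confirm where that time is actually spent.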
