feat(cugraph): upgrade to RAPIDS 25.12 / CUDA 13.1 with comprehensive e2e tests #3709
mattkjames7 merged 5 commits into master
… e2e tests (#710)

## Summary

This PR upgrades MAGE's cuGraph integration from the legacy RAPIDS 22.02 / CUDA 11.5 stack to modern RAPIDS 25.12 / CUDA 13.1, bringing GPU-accelerated graph algorithms up to date with current NVIDIA tooling.

### Key Changes

**Infrastructure Upgrade:**

- CUDA 11.5.2 → 13.1.0
- RAPIDS/cuGraph 22.02 → 25.12
- Ubuntu 20.04 → 24.04
- Python 3.8 → 3.12

**API Migration:**

All 9 cuGraph algorithms updated to use the modern pylibcugraph API:

- `cugraph::pagerank` → `cugraph::pagerank()` with explicit graph view
- `cugraph::betweenness_centrality` → normalized output handling
- `cugraph::hits` → proper hub/authority vector management
- `cugraph::katz_centrality` → updated alpha/beta parameter handling
- `cugraph::louvain` / `cugraph::leiden` → new clustering return types
- `cugraph::personalized_pagerank` → vertex list handling

**Legacy API Preserved:**

Two algorithms remain on `cugraph::ext_raft::` API as they haven't been migrated in RAPIDS 25.x:

- `balanced_cut_clustering`
- `spectral_clustering`

### E2E Tests Added

Comprehensive end-to-end tests for all 9 algorithms following MAGE's existing test framework:

```
e2e/pagerank_test/test_cugraph_networkx_validation/
e2e/betweenness_centrality_test/test_cugraph_networkx_validation/
e2e/hits_test/test_cugraph_networkx_validation/
e2e/katz_test/test_cugraph_networkx_validation/
e2e/louvain_test/test_cugraph_networkx_validation/
e2e/leiden_cugraph_test/test_cugraph_networkx_validation/
e2e/personalized_pagerank_test/test_cugraph_networkx_validation/
e2e/balanced_cut_clustering_test/test_cugraph_networkx_validation/
e2e/spectral_clustering_test/test_cugraph_networkx_validation/
```

Each test uses a 9-node two-community graph topology with expected values validated against NetworkX ground truth (5% tolerance for GPU floating-point variance).

### Validation Script

Added `scripts/validate_cugraph_algorithms.py` - a standalone debugging tool that:

1. Builds identical graph in NetworkX (ground truth)
2. Spins up Memgraph container with cuGraph modules
3. Runs each algorithm and compares against NetworkX
4. Reports pass/fail with detailed value comparisons

This is for developer debugging, not CI.

## Test Plan

- [x] All 9 cuGraph algorithms pass validation against NetworkX ground truth
- [x] Docker image builds successfully with `Dockerfile.cugraph`
- [x] E2E tests follow existing MAGE test conventions
- [ ] CI pipeline runs (pending merge)

## Breaking Changes

None. All algorithm signatures and return types preserved.

---------

Co-authored-by: matt <mattkjames7@gmail.com>
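For reviewers who want a feel for the 5% tolerance check mentioned under "E2E Tests Added", here is a minimal sketch of how per-node scores could be compared against ground truth. It is not the code in `scripts/validate_cugraph_algorithms.py`. The helper name `compare_scores`, its parameters, and the sample numbers are all hypothetical and chosen only for illustration.

```python
import math


def compare_scores(expected, actual, rel_tol=0.05, abs_tol=1e-9):
    """Return (node, expected, actual) tuples that fall outside the tolerance.

    `expected` and `actual` map node id -> score. A node missing from
    `actual` is reported with actual=None. The helper name and defaults
    are assumptions for this sketch, not part of MAGE.
    """
    mismatches = []
    for node, exp in expected.items():
        act = actual.get(node)
        if act is None or not math.isclose(
            exp, act, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatches.append((node, exp, act))
    return mismatches


# Made-up illustrative numbers, not output from NetworkX or cuGraph.
expected = {0: 0.100, 1: 0.120, 2: 0.150}
actual = {0: 0.102, 1: 0.119, 2: 0.170}

for node, exp, act in compare_scores(expected, actual):
    print(f"FAIL node={node} expected={exp} actual={act}")
```

`math.isclose` measures the relative difference against the larger of the two magnitudes, so a small `abs_tol` is included here to keep exact-zero scores (possible for some centrality measures) from failing on tiny floating-point noise. Whether the real tests treat zeros this way is not stated in the PR.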
Documentation checklist
TODO: update docs with changes to module inputs
antejavor left a comment:
This is a huge PR that mostly contains the cuGraph API changes and the updates from the legacy stack.
Since the validate_cugraph_algorithms.py script is included, I assume the changes here are correct.
The only missing step is to update the API docs, since some args have changed in this PR.
Two questions, @mattkjames7: now that MAGE is in the main repo, are we running the MAGE tests anywhere? And did you try this on NVIDIA hardware? I assume you did.
I haven't written the docs for this just yet, but I will link them here and ping once I have. I think the API changes are fairly minimal.
We run the MAGE tests in this repo now as part of the diff workflow, the daily build, and the RC build. We do not run any cuGraph tests in CI, because the image built with this code will only run if an NVIDIA GPU is present, though I did build it and run the tests locally. There will hopefully soon be a cuGraph image that is built regularly too, with this WIP PR: #3723


MAGE PR #710 transferred to this repo. Closes #3564.