Sketch of migration from BGFX to WebGPU (wgpu) #1605
matthargett wants to merge 29 commits into BabylonJS:master
Merge the separate upstream-shim and CanvasWgpu Rust crates into the single GraphicsWgpu workspace crate, eliminating duplicate dependency trees and simplifying the Cargo workspace to one member.
Standards/practices adopted:
- Single-workspace Cargo idiom: one crate, feature-gated backends
- Rust module extraction: compute.rs pulled out of inline module block into upstream_wgpu_native/ subdirectory (standard Rust path resolution)
- catch_unwind at every FFI boundary to prevent Rust panics from unwinding into C++ (undefined behavior per Rust/C++ interop rules)
- Poisoned-lock handling on all Mutex/RwLock guards (std::sync contract)
- WgpuInterop.h extracted as a shared C++ header for Rust FFI types
Remaining gaps:
- lib.rs is still ~1800 lines; further extraction of render, surface, and adapter modules would improve navigability
- CanvasWgpu Rust side still uses raw pointer casts for NVG context handle; a typed opaque handle would be safer
- No Rust-side integration tests yet; testing is end-to-end via Playground smoke test only
Centralize Rust/Cargo build orchestration in Core/GraphicsWgpu/CMakeLists instead of duplicating it across CanvasWgpu and NativeWebGPU. The Cargo build now produces a single static library linked by all downstream targets. Android NDK toolchain integration added for cross-compilation.
Standards/practices adopted:
- Single ExternalProject_Add for Cargo, with platform-specific target triples for macOS/iOS/Android/Windows/Linux
- Gitignore entries for Rust build artifacts (target/, *.d, *.fingerprint)
- FetchContent for third-party dependencies (consistent with JsRuntimeHost pattern already used in the project)
Remaining gaps:
- Windows and Linux CMake paths are defined but not yet tested end-to-end
- No CMake presets file for reproducible configure commands
- CanvasWgpu CMakeLists is now a thin wrapper; it could be further simplified once the polyfill stabilizes
Fix Canvas 2D gaps exposed by the WebGPU smoke test (text rendering,
image data, filter stack) and replace per-frame string allocations with
enum/numeric storage on the rendering hot path.
Hot-path optimizations:
- m_direction: std::string → Direction enum (1 byte vs heap string),
eliminates string.compare("rtl") on every FillText/StrokeText call
- m_lineCap/m_lineJoin: std::string → NVGlineCap enum, parse once on
set, reconstruct string only when JS getter is called
- SetFillStyle/SetFilter: std::move into member instead of copy
- Removed __nativeCanvasReady global — the FIFO WorkQueue guarantee
makes it unnecessary (matches W3C navigator.gpu pattern)
Standards/practices adopted:
- Initialization contract comment on Canvas::Initialize documenting the
AppRuntime FIFO WorkQueue guarantee (no polling needed)
- Enum storage for fixed-vocabulary Canvas properties (standard pattern
in browser Canvas implementations)
Remaining gaps:
- FillText/StrokeText still allocate std::string for text content on
every call; could use string_view for non-RTL paths
- StringToColor() re-parses color strings on every BindFillStyle call;
caching the parsed NVGcolor alongside the string would eliminate this
- RTL text reversal is byte-level (breaks multi-byte UTF-8); needs a
proper bidi algorithm (ICU/HarfBuzz) for correct internationalization
- Font parsing in SetFont is not cached across identical values
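The enum-storage change described in this commit can be illustrated with a minimal sketch. It assumes C++17; `LineCap`, `ParseLineCap`, `LineCapToString`, and `ContextState` are hypothetical names standing in for the NVGlineCap-based members, not the code in this PR.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for the NVGlineCap values stored by the canvas context.
enum class LineCap : unsigned char { Butt, Round, Square };

// Parse once, when the JS setter runs. Unknown values yield nullopt so the
// caller can keep the previous setting (Canvas 2D ignores invalid lineCap values).
constexpr std::optional<LineCap> ParseLineCap(std::string_view s) {
    if (s == "butt") return LineCap::Butt;
    if (s == "round") return LineCap::Round;
    if (s == "square") return LineCap::Square;
    return std::nullopt;
}

// Rebuild the string only when the JS getter asks for it.
constexpr std::string_view LineCapToString(LineCap cap) {
    switch (cap) {
        case LineCap::Butt: return "butt";
        case LineCap::Round: return "round";
        case LineCap::Square: return "square";
    }
    return "butt";
}

// Hypothetical slice of the context state: one byte instead of a heap string.
class ContextState {
public:
    void SetLineCap(std::string_view value) {
        if (auto parsed = ParseLineCap(value)) m_lineCap = *parsed;
    }
    std::string_view GetLineCap() const { return LineCapToString(m_lineCap); }
    LineCap LineCapEnum() const { return m_lineCap; }

private:
    LineCap m_lineCap = LineCap::Butt;
};
```

With this shape, draw calls read `LineCapEnum()` directly and never compare strings; only the getter pays for the string form.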
Remove the non-standard __nativeWebGpuReady global promise. Per the W3C WebGPU spec, navigator.gpu is a synchronous [SameObject] attribute, always present when WebGPU is enabled. The AppRuntime FIFO WorkQueue guarantees navigator.gpu is set before any script executes, matching the Chromium/Dawn and Servo/wgpu-core initialization model.
Hot-path optimizations:
- GetString() fallback: std::string → std::string_view parameter, avoids heap allocation of the fallback value on every call (the common case where the key exists never uses it)
- Texture view cache key: replaced per-frame string construction + Utf8Value() comparison with FNV-1a hash stored as a JS number; a cache hit is now a double==double comparison with zero allocation
Standards/practices adopted:
- W3C WebGPU spec alignment: navigator.gpu as synchronous attribute, no readiness promise (matches Chromium, Servo, Deno)
- Initialization contract comment on Initialize() documenting the FIFO guarantee for embedders
- string_view for read-only string parameters at API boundaries
Remaining gaps:
- TextureDescriptorData stores Format/Dimension as std::string; these are from a finite WebGPU enum set but need string form for Rust FFI
- GetString() still returns std::string by value; callers that only need comparison could use a string_view-returning variant
- ReadBooleanFlag() allocates string for "true"/"false" comparison (cold path, low priority)
- No typed wrapper for WebGPU texture formats; raw strings flow through JS ↔ C++ ↔ Rust boundaries
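A rough sketch of the texture view cache key idea follows. It assumes a 32-bit FNV-1a, since every 32-bit value is exactly representable as a double and can therefore round-trip through a JS number. `ViewDescriptor`, `Fnv1a32Mix`, and `ViewCacheKey` are hypothetical names, and the field set is illustrative only.

```cpp
#include <cstdint>
#include <string_view>

// 32-bit FNV-1a constants (offset basis and prime).
constexpr std::uint32_t kFnvOffsetBasis = 2166136261u;
constexpr std::uint32_t kFnvPrime = 16777619u;

// Fold a run of bytes into a running FNV-1a hash.
constexpr std::uint32_t Fnv1a32(std::string_view bytes,
                                std::uint32_t hash = kFnvOffsetBasis) {
    for (char c : bytes) {
        hash ^= static_cast<unsigned char>(c);
        hash *= kFnvPrime;
    }
    return hash;
}

// Fold a 32-bit integer into the hash, least significant byte first.
constexpr std::uint32_t Fnv1a32Mix(std::uint32_t value, std::uint32_t hash) {
    for (int shift = 0; shift < 32; shift += 8) {
        hash ^= (value >> shift) & 0xffu;
        hash *= kFnvPrime;
    }
    return hash;
}

// Hypothetical subset of a texture view descriptor.
struct ViewDescriptor {
    std::string_view format;
    std::string_view dimension;
    std::uint32_t baseMipLevel;
    std::uint32_t mipLevelCount;
};

// Produce a key that can be stored as a JS number and compared as a double.
inline double ViewCacheKey(const ViewDescriptor& d) {
    std::uint32_t h = Fnv1a32(d.format);
    h = Fnv1a32("|", h);  // separator so adjacent fields cannot run together
    h = Fnv1a32(d.dimension, h);
    h = Fnv1a32Mix(d.baseMipLevel, h);
    h = Fnv1a32Mix(d.mipLevelCount, h);
    return static_cast<double>(h);
}
```

A hash-only key can collide, so a cache built this way either accepts that risk or confirms the descriptor on a hit; the commit message does not say which choice was made.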
Rewrite playground_runner.js to remove ~180 lines of dead polling code (6 polling constants, 7 polling/waiting functions) that were unnecessary given the AppRuntime FIFO WorkQueue initialization guarantee. Replace with fail-fast assertions and direct signal consumption.
Expand webgpu_smoke.js into a comprehensive WebGPU validation scene: rotating cube with per-face Canvas 2D text, compute shader dispatch, queue.copyExternalImageToTexture, runtime telemetry counters, and font loading via bundled RobotoSlab.ttf.
Platform bridge changes:
- Android: stale-globals clearing in onViewReady() for surface recreation safety; WebGPU bootstrap in BabylonNativeJNI
- iOS: WebGPU initialization in LibNativeBridge dispatch callback
- macOS: refresh button wired to re-run scripts with proper cleanup
Standards/practices adopted:
- Synchronous navigator.gpu access (no polling) matching W3C spec and the FIFO WorkQueue contract documented in AppContext.cpp
- Epoch-based reload safety for Android surface recreation and macOS refresh (prevents stale runtime cross-talk)
- Scene factory signal pattern for decoupled script loading order
Remaining gaps:
- webgpu_smoke.js is both a smoke test and the default scene; a separate minimal "hello triangle" would be better for CI validation
- No automated screenshot comparison; validation is visual-only
- Android Gradle files reference specific SDK/NDK versions that may need updating for newer toolchains
- validation_native.js is a stub; no actual validation framework yet
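The commit does not show how epoch-based reload safety is implemented. One common shape for the idea, offered purely as an assumption, is a monotonically increasing counter that callbacks capture at creation time and re-check before running. `ReloadEpoch` and its methods are hypothetical names.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical epoch counter: bumped on every surface recreation or refresh.
class ReloadEpoch {
public:
    std::uint64_t Current() const {
        return m_epoch.load(std::memory_order_acquire);
    }

    // Called when the runtime is torn down and rebuilt.
    std::uint64_t Advance() {
        return m_epoch.fetch_add(1, std::memory_order_acq_rel) + 1;
    }

    // Wrap a callback so it becomes a no-op once the epoch has moved on.
    // The ReloadEpoch object must outlive every callback it hands out.
    std::function<void()> Guard(std::function<void()> fn) const {
        const std::uint64_t captured = Current();
        return [this, captured, fn = std::move(fn)] {
            if (Current() == captured) {
                fn();
            }
        };
    }

private:
    std::atomic<std::uint64_t> m_epoch{0};
};
```

Work queued by an old runtime then drops silently after a reload instead of touching state owned by the new one.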
Add cmake.yml CI job definition for Rust+CMake cross-platform builds. Update linux.yml with Rust toolchain setup. Remove stale macOS Playground-only job (superseded by cmake.yml matrix).
Add NOTICE.md with third-party license attributions for wgpu, wgpu-native, naga, RobotoSlab font, nanovg, and other bundled dependencies.
Update WgpuMigrationPlan.md to reflect completed work: upstream shim removal, FIFO initialization guarantee, __nativeWebGpuReady/__nativeCanvasReady removal, dead polling code cleanup, and hot-path copy optimizations.
Standards/practices adopted:
- NOTICE.md follows Apache-2.0 convention for bundled third-party code
- Migration plan uses changelog format with standards-gap tracking
Remaining gaps:
- CI jobs are defined but not yet wired into a GitHub Actions workflow matrix (.github/workflows/ entry needed)
- No automated Playground launch/screenshot test in CI; GPU-dependent tests need real device or software rasterizer strategy
- NOTICE.md may need updating as wgpu upstream dependencies change
Prep for Windows/Visual C++ 2022 build without introducing platform-specific code paths.
Fixes:
- LineCaps.h, Colors.h: static → inline on header-defined functions to avoid MSVC C4505 "unreferenced local function" warnings (each TU was getting its own copy; inline gives proper ODR-safe sharing)
- Colors.h: anonymous namespace for TRANSPARENT_BLACK → inline const at namespace scope (avoids internal linkage + static function mixing)
- Colors.h: added missing #include <cstdio> for sscanf
- LineCaps.h: removed unused #include <regex> (heavy header, not needed)
- Canvas.cpp: C-style cast → static_cast for buffer.Data()
- CMakeLists.txt: wrapped -Wno-* flags in generator expression to skip on MSVC; added _CRT_SECURE_NO_WARNINGS for sscanf usage in Colors.h
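The header fixes listed above can be sketched in a single self-contained fragment, assuming C++17 for the inline variable. `Rgba`, `kTransparentBlack`, and `ParseHexColor` are hypothetical names; the real Colors.h works with NVGcolor and is not reproduced here.

```cpp
#include <cstdio>  // std::sscanf, the include the commit says was missing

// Hypothetical color type standing in for NVGcolor.
struct Rgba {
    unsigned char r, g, b, a;
};

// inline const at namespace scope: one shared definition across all
// translation units, instead of one internal-linkage copy per TU.
inline const Rgba kTransparentBlack{0, 0, 0, 0};

// inline (not static) so a TU that never calls this does not trigger an
// "unreferenced local function" warning for its own private copy.
inline bool ParseHexColor(const char* text, Rgba& out) {
    unsigned int r = 0, g = 0, b = 0;
    if (std::sscanf(text, "#%02x%02x%02x", &r, &g, &b) != 3) {
        return false;
    }
    // Explicit narrowing casts keep int → unsigned char conversions visible.
    out = Rgba{static_cast<unsigned char>(r),
               static_cast<unsigned char>(g),
               static_cast<unsigned char>(b),
               static_cast<unsigned char>(255)};
    return true;
}
```

On MSVC, the sscanf call is what makes the _CRT_SECURE_NO_WARNINGS definition mentioned in the commit relevant.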
Pull request overview
This PR migrates BabylonNative from the BGFX rendering backend to WebGPU (wgpu), addressing stability issues encountered with OpenXR on Android while enabling modern rendering capabilities. The migration replaces BGFX's multi-backend approach with wgpu-native (a Rust library) and introduces a femtovg-backed Canvas polyfill.
Changes:
- Replaces BGFX/bimg/bx dependencies with wgpu-native and femtovg for rendering
- Introduces NativeWebGPU plugin exposing navigator.gpu API surface
- Migrates Canvas polyfill from NanoVG/BGFX to femtovg/wgpu (CanvasWgpu)
- Updates build system to require Rust toolchain and removes BGFX-specific configurations
- Constrains Windows builds to D3D12 (removes D3D11 support) and Linux builds to Vulkan (removes OpenGL)
- Removes multiple plugins (NativeEngine, NativeCamera, NativeCapture, NativeOptimizations, NativeTracing, TestUtils, ExternalTexture, NativeXr)
- Updates Playground apps with WebGPU initialization and removes legacy input handling
Reviewed changes
Copilot reviewed 88 out of 98 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| Polyfills/CanvasWgpu/Source/*.{h,cpp} | New femtovg-backed Canvas implementation with Path2D, gradient, image, and context support |
| Core/GraphicsWgpu/Source/*.{h,cpp} | New wgpu-based graphics device implementation replacing BGFX |
| Core/GraphicsWgpu/Rust/* | Rust backend bridging wgpu-native C API to C++ |
| Plugins/NativeWebGPU/* | New plugin exposing WebGPU navigator.gpu API |
| Apps/Playground/* | Updated to use WebGPU rendering with new initialization sequence |
| CMakeLists.txt, Cargo.toml | Build system changes for Rust integration and dependency updates |
| NOTICE.md | License updates removing BGFX licenses, adding wgpu/femtovg licenses |
| .github/jobs/*.yml | CI pipeline updates for Rust toolchain and D3D12/Vulkan defaults |
| { | ||
| NSVGparser* parser = nsvg__createParser(); | ||
| const char* path[] = {"d", d.c_str(), NULL}; | ||
| const char** attr = {path}; |
The initialization const char** attr = {path} creates a pointer to the local array path, but this should be const char** attr = path to correctly pass the array. The current syntax creates a single-element array containing the pointer to path, which is incorrect for the intended usage with nsvg__parsePath.
| const char** attr = {path}; | |
| const char** attr = path; |
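For reference, here is a small sketch of how a null-terminated attribute array decays to `const char**`. It also checks that, in standard C++, brace-initializing a scalar pointer with a single element yields the same pointer as the plain form, so the suggestion reads mainly as a clarity change. `CountAttrEntries` and `BraceAndPlainInitAgree` are hypothetical helpers, and the array shape is an assumption based on the snippet above.

```cpp
#include <cstddef>

// Count entries in a null-terminated attribute array (name, value, ..., nullptr).
inline std::size_t CountAttrEntries(const char** attr) {
    std::size_t n = 0;
    while (attr[n] != nullptr) {
        ++n;
    }
    return n;
}

// Compare the two initialization forms discussed in the review comment.
inline bool BraceAndPlainInitAgree(const char* d) {
    const char* svgAttr[] = {"d", d, nullptr};
    const char** braced = {svgAttr};  // single-element brace init of a scalar
    const char** plain = svgAttr;     // array-to-pointer decay, intent is explicit
    return braced == plain && CountAttrEntries(plain) == 2;
}
```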
| nativeConfig.reserved0 = 0; | ||
| nativeConfig.reserved1 = 0; |
Explicitly assigning reserved fields to 0 is unnecessary if the struct is zero-initialized. Consider using aggregate initialization BabylonWgpuConfig nativeConfig{}; which already zero-initializes all fields, eliminating the need for explicit reserved field assignments.
| nativeConfig.reserved0 = 0; | |
| nativeConfig.reserved1 = 0; |
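A minimal sketch of the zero-initialization point follows. `ConfigSketch` is a hypothetical stand-in, since the real BabylonWgpuConfig fields are not shown in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for BabylonWgpuConfig; field names are assumptions.
struct ConfigSketch {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t reserved0;
    std::uint32_t reserved1;
};

inline ConfigSketch MakeConfig(std::uint32_t width, std::uint32_t height) {
    ConfigSketch config{};  // empty braces zero every member, reserved ones included
    config.width = width;
    config.height = height;
    return config;
}
```

Only the fields that carry real values are assigned; the reserved fields stay zero without being named.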
| "referenceImage": "scissor-test.png", | ||
| "excludedGraphicsApis": [ "D3D12", "OpenGL" ], | ||
| "comment": "TODO: reenable D3D12 when automatic mip-maps issue is fixed in bgfx. Incorrect rendering with OpenGL." | ||
| "comment": "TODO: reenable D3D12 when automatic mip-map generation issue is fixed. Incorrect rendering with OpenGL." |
Corrected 'mip-maps' to 'mip-map generation' for consistency with standard terminology.
| appContext->ScriptLoader().LoadScript(sourcePath); | ||
| env->ReleaseStringUTFChars(path, sourcePath); |
The JNI string should be released in a finally block or using RAII to ensure cleanup even if LoadScript throws an exception. Consider wrapping the UTF chars in a smart pointer or using a scope guard to guarantee the release call.
| appContext->ScriptLoader().LoadScript(sourcePath); | |
| env->ReleaseStringUTFChars(path, sourcePath); | |
| struct UtfCharsReleaser | |
| { | |
| JNIEnv* env; | |
| jstring jstr; | |
| const char* chars; | |
| ~UtfCharsReleaser() | |
| { | |
| if (env != nullptr && jstr != nullptr && chars != nullptr) | |
| { | |
| env->ReleaseStringUTFChars(jstr, chars); | |
| } | |
| } | |
| } utfCharsReleaser{env, path, sourcePath}; | |
| appContext->ScriptLoader().LoadScript(sourcePath); |
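The same guarantee can be written as a generic scope guard. The sketch below avoids JNI so that it stays self-contained: `ScopeExit`, `LoadScriptThatMayThrow`, and `RunWithGuard` are hypothetical names, and the counter stands in for the ReleaseStringUTFChars call. It assumes C++17 for class template argument deduction.

```cpp
#include <stdexcept>
#include <utility>

// Runs the stored callable when the enclosing scope exits, on both the
// normal path and the exception path.
template <typename F>
class ScopeExit {
public:
    explicit ScopeExit(F fn) : m_fn(std::move(fn)) {}
    ~ScopeExit() { m_fn(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

private:
    F m_fn;
};

// Hypothetical stand-in for a script load that can fail.
inline void LoadScriptThatMayThrow(bool fail) {
    if (fail) {
        throw std::runtime_error("load failed");
    }
}

// Returns true on success; 'releases' counts how often cleanup ran.
inline bool RunWithGuard(bool fail, int& releases) {
    try {
        ScopeExit guard{[&releases] { ++releases; }};  // stands in for the release call
        LoadScriptThatMayThrow(fail);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Compared with the one-off struct in the suggestion, a reusable guard keeps the cleanup next to the acquisition and can serve other acquire/release pairs.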
…dent targets
Resolve 6 merge conflicts, preserving both sides' additions:
- Declare all master plugin options (ShaderCache, ShaderCompiler, ShaderTool, NativeEngine, ExternalTexture, NativeCapture, NativeEncoding, etc.) for structural completeness, but FORCE OFF all bgfx-dependent plugins on the wgpu branch since no bgfx consumer exists.
- Combine Plugins/CMakeLists.txt: all master subdirectories (guarded by option flags) plus NativeWebGPU from the wgpu branch.
- AppContext: remove ShaderCache Enable/Disable calls and includes; the wgpu path uses WGSL→naga and has no BgfxShaderInfo consumer.
- NOTICE.md: keep both wgpu-era and master-era (xxHash) license notices.
ShaderCache integration notes:
Master's ShaderCache caches compiled bgfx shader binaries (BgfxShaderInfo with vertex/fragment byte blobs) keyed by xxHash of GLSL source. The wgpu path has no equivalent consumer today. The migration path is wgpu::PipelineCache, an experimental API that serializes compiled pipeline state. A TODO documents this: implement a wgpu-native pipeline cache behind the same Enable/Disable/Save/Load public API surface once the experimental API stabilises.
Also fixes Font::Familiy() → Font::Family() typo (PR review comment) in both Canvas and CanvasWgpu implementations plus all callers.
I'm not sure why you opened this PR here instead of using GitHub Actions on your own fork. Please close this PR.
MSVC /W4 /WX errors:
- NativeWebGPU.cpp: remove redeclaration of descriptorCacheHash that shadowed the outer variable (C4456 → C2220)
- Colors.h: add static_cast<unsigned char> for nvgRGBA/nvgRGB args to silence int→unsigned char truncation warnings (C4244 → C2220)
- Canvas.cpp: qualify ObjectWrap::Value() as this->Value() for MSVC two-phase name lookup (C3861)
- Path2D.cpp: rename outer 'path' to 'svgAttr' to avoid shadowing the NSVGpath loop variable (C4456); remove redundant brace initializer
- Path2D.cpp: suppress C4456 in third-party nanosvg.h via pragma
iOS CI:
- cmake.yml: add aarch64-apple-ios, aarch64-apple-ios-sim, and x86_64-apple-ios Rust cross-compilation targets
PR review:
- WgpuNative.cpp: remove redundant reserved0/reserved1 assignments (struct is already zero-initialized)
Prefix create_local_surface's instance parameter with underscore — it is only used inside platform-specific #[cfg] blocks and is unused on Linux where no surface backend is wired yet. Install libvulkan-dev on the Ubuntu CI runner so vulkan/vulkan.h is available for the Vulkan backend headers.
Canvas.cpp: remove ObjectWrap::Value() call in Dispose() — the method is not reliably available on MSVC. The _context JS property is already unreachable after m_contextObject.Reset(). Path2D.cpp: rename NSVGpath loop variable from 'path' to 'svgPath' to avoid C4456 shadow warning against the outer NativeCanvasPath2D* path. cmake.yml: add aarch64-linux-android Rust target for Android NDK cross-compilation.
Colors.h: cast std::tolower return (int) back to char in the std::transform lambda to silence C4244. CMakeLists.txt: pass BINDGEN_EXTRA_CLANG_ARGS with the NDK sysroot and target triple so bindgen uses Android headers instead of the host glibc headers when cross-compiling.
Context.cpp: disable C4100 (unreferenced formal parameter) for the file — napi callback signatures require info/value parameters that many getters and stubs do not reference. Colors.h: replace std::transform + std::tolower with a plain loop to avoid C4244 from MSVC STL template internals leaking the int return type of std::tolower.
Same std::transform + std::tolower pattern that was fixed in Colors.h. Replace with plain loop to avoid MSVC STL template instantiation leaking the int return type.
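The plain-loop replacement can be sketched as follows. `ToLowerInPlace` is a hypothetical helper name; the casts are the part that matters for the C4244 warning.

```cpp
#include <cctype>
#include <string>

// Lower-case a string in place without std::transform.
inline void ToLowerInPlace(std::string& s) {
    for (char& c : s) {
        // std::tolower takes and returns int. Going through unsigned char
        // avoids passing a negative value, and the explicit cast back to char
        // makes the narrowing visible so the compiler does not warn about it.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
}
```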
@matthargett We have no intention to remove bgfx and replace it with wgpu. Please run actions on your BabylonNative fork and close this PR.
Problem:
We worked on trying to make an OpenXR backend that worked with Android XR simulator and Android 10(ish) commercial XR headsets for several months, and could never get it stable. We then tried using ARCore and Jetpack as a fallback, building off of the existing Android ARCore support, hoping that would give us an incremental way forward. One blocker is that Android simulators before API 31 will hang when using Vulkan bindings. We traced this into swapchain atomicity details within bgfx itself.
Solution:
Based on our experience with bringing visionOS support to BGFX in 2023, and our experience across 9 different XR hardware devices (glasses and HMDs) and 4 different simulators, I propose taking a leap forward into WebGPU rendering by leveraging the Rust wgpu library. In this sketch, NativeEngine is mapped onto wgpu-native primitives, and a WebGPU API surface is also exposed via Node API. When running through BabylonNative demos, I learned that Canvas API support necessitated migration as well. This repo represents an end-to-end working example, profiled, sanitized, and tested on macOS, iOS 16.4, Android 12 (simulator), and Android 10 (XR device as 2D app).
The choice of wgpu-native was based on its export of a simple C API, and our tumultuous experience with integrating Dawn into React Native during our work for Infinite Reality in 2024. I have some concerns about a WebGPU monoculture emerging, and feel it is within the spirit of BabylonJS' commitment to diversification of open standards implementations (e.g. Khronos reference implementations) to put weight behind this diversification.
I carefully measured package size, RAM usage, CPU usage, and GPU usage for the playground example. There is probably more work to do there, so I'm eager to get holistic feedback from Azure CI pipelines and internal test suites.
TODOs:
Future:
Notes: