@@ -3,66 +3,73 @@ title: "Uniform Buffers and Bind Groups"
33document_id: "ubo-spec-2025-10-11"
44status: "living"
55created: "2025-10-11T00:00:00Z"
6- last_updated: "2025-10-13T00:00:00Z"
7- version: "0.2.0"
6+ last_updated: "2025-10-17T00:00:00Z"
7+ version: "0.4.0"
88engine_workspace_version: "2023.1.30"
99wgpu_version: "26.0.1"
1010shader_backend_default: "naga"
1111winit_version: "0.29.10"
12- repo_commit: "3e63f82b0a364bc52a40ae297a5300f998800518"
12+ repo_commit: "00aababeb76370ebdeb67fc12ab4393aac5e4193"
1313owners: ["lambda-sh"]
1414reviewers: ["engine", "rendering"]
1515tags: ["spec", "rendering", "uniforms", "bind-groups", "wgpu"]
1616---
1717
1818# Uniform Buffers and Bind Groups
1919
20- This spec defines uniform buffer objects (UBO) and bind groups for Lambda’s
21- wgpu-backed renderer. It follows the existing builder/command patterns and
22- splits responsibilities between the platform layer (`lambda-rs-platform`) and
23- the high-level API (`lambda-rs`).
20+ Summary
21+ - Specifies uniform buffer objects (UBOs) and bind groups for the
22+ wgpu‑backed renderer, preserving builder/command patterns and the separation
23+ between platform and high‑level layers.
24+ - Rationale: Enables structured constants (for example, cameras, materials,
25+ per‑frame data) beyond push constants and supports dynamic offsets for
26+ batching many small records efficiently.
2427
25- The design enables larger, structured GPU constants (cameras, materials,
26- per-frame data) beyond push constants, with an ergonomic path to dynamic
27- offsets for batching many small uniforms in a single buffer.
28+ ## Scope
2829
29- ## Goals
30+ ### Goals
3031
3132- Add first-class uniform buffers and bind groups.
3233- Maintain builder ergonomics consistent with buffers, pipelines, and passes.
3334- Integrate with the existing render command stream (inside a pass).
3435- Provide a portable, WGSL/GLSL-friendly layout model and validation.
3536- Expose dynamic uniform offsets (opt-in) with correct alignment handling.
3637
37- ## Non-Goals
38+ ### Non-Goals
3839
3940- Storage buffers, textures/samplers, and compute are referenced but not
4041 implemented here; separate specs cover them.
4142- Descriptor set caching beyond wgpu’s internal caches.
4243
43- ## Background
44+ ## Terminology
4445
45- Roadmap docs propose UBOs and bind groups to complement push constants and
46- unlock cameras/materials. This spec refines those sketches into concrete API
47- types, builders, commands, validation, and an implementation plan for both
48- layers of the workspace.
46+ - Uniform buffer object (UBO): Read‑only constant buffer accessed by shaders as
47+ `var<uniform>`.
48+ - Bind group: A collection of bound resources used together by a pipeline.
49+ - Bind group layout: The declared interface (bindings, types, visibility) for a
50+ bind group.
51+ - Dynamic offset: A per‑draw offset applied to a uniform binding to select a
52+ different slice within a larger buffer.
53+ - Visibility: Shader stage visibility for a binding (vertex, fragment, compute).
4954
5055## Architecture Overview
5156
5257- Platform (`lambda-rs-platform`)
53- - Thin wrappers around `wgpu::BindGroupLayout` and `wgpu::BindGroup` with
54- builder structs that produce concrete `wgpu` descriptors and perform
55- validation against device limits.
56- - Expose the raw `wgpu` handles for use by higher layers.
58+ - Wrappers around `wgpu::BindGroupLayout` and `wgpu::BindGroup` with builder
59+ types that produce `wgpu` descriptors and perform validation against device
60+ limits.
61+ - The platform layer owns the raw `wgpu` handles and exposes them to the
62+ high-level layer as needed.
5763
5864- High level (`lambda-rs`)
59- - Public builders/types for bind group layouts and bind groups aligned with
60- `RenderPipelineBuilder` and `BufferBuilder` patterns.
61- - Extend `RenderPipelineBuilder` to accept bind group layouts, building a
62- `wgpu::PipelineLayout` under the hood.
63- - Extend `RenderCommand` with `SetBindGroup` to bind resources during a pass.
64- - Avoid exposing `wgpu` types in the public API; surface numeric limits and
65- high-level wrappers only, delegating raw handles to the platform layer.
65+ - Public builders and types for bind group layouts and bind groups, aligned
66+ with existing `RenderPipelineBuilder` and `BufferBuilder` patterns.
67+ - `RenderPipelineBuilder` accepts bind group layouts and constructs a
68+ `wgpu::PipelineLayout` during build.
69+ - `RenderCommand` includes `SetBindGroup` to bind resources during a pass.
70+ - The public API avoids exposing `wgpu` types.
71+ Numeric limits and high-level wrappers are surfaced; raw handles live in the
72+ platform layer.
6673
6774Data flow (one-time setup → per-frame):
6875```
@@ -73,86 +80,64 @@ BufferBuilder (Usage::UNIFORM) --------------+--> BindGroupBuilder (uniform bind
7380Per-frame commands: BeginRenderPass -> SetPipeline -> SetBindGroup -> Draw -> End
7481```
7582
76- ## Platform API Design (lambda-rs-platform)
77-
78- - Module: `lambda_platform::wgpu::bind`
79- - `struct BindGroupLayout { raw: wgpu::BindGroupLayout, label: Option<String> }`
80- - `struct BindGroup { raw: wgpu::BindGroup, label: Option<String> }`
81- - `enum Visibility { Vertex, Fragment, Compute, VertexAndFragment, All }`
82- - Maps to `wgpu::ShaderStages`.
83- - `struct BindGroupLayoutBuilder { entries: Vec<wgpu::BindGroupLayoutEntry>, label: Option<String> }`
84- - `fn new() -> Self`
85- - `fn with_uniform(mut self, binding: u32, visibility: Visibility) -> Self`
86- - `fn with_uniform_dynamic(mut self, binding: u32, visibility: Visibility) -> Self`
87- - `fn with_label(mut self, label: &str) -> Self`
88- - `fn build(self, device: &wgpu::Device) -> BindGroupLayout`
89- - `struct BindGroupBuilder { layout: wgpu::BindGroupLayout, entries: Vec<wgpu::BindGroupEntry>, label: Option<String> }`
90- - `fn new() -> Self`
91- - `fn with_layout(mut self, layout: &BindGroupLayout) -> Self`
92- - `fn with_uniform(mut self, binding: u32, buffer: &wgpu::Buffer, offset: u64, size: Option<NonZeroU64>) -> Self`
93- - `fn with_label(mut self, label: &str) -> Self`
94- - `fn build(self, device: &wgpu::Device) -> BindGroup`
95-
96- Validation and limits
97- - High-level validation now checks common cases early:
98- - Bind group uniform binding sizes are asserted to be ≤ `max_uniform_buffer_binding_size`.
99- - Dynamic offset count and alignment are validated before encoding `SetBindGroup`.
100- - Pipeline builder asserts the number of bind group layouts ≤ `max_bind_groups`.
101- - Helpers are provided to compute aligned strides and to validate dynamic offsets.
102-
103- Helpers
104- - High-level exposes small helpers:
105- - `align_up(value, align)` to compute aligned uniform strides (for offsets).
106- - `validate_dynamic_offsets(required, offsets, alignment, set)` used internally and testable.
107-
108- ## High-Level API Design (lambda-rs)
109-
110- New module: `lambda::render::bind`
111- - `pub struct BindGroupLayout { /* holds Rc<wgpu::BindGroupLayout> */ }`
112- - `pub struct BindGroup { /* holds Rc<wgpu::BindGroup> */ }`
113- - `pub enum BindingVisibility { Vertex, Fragment, Compute, VertexAndFragment, All }`
114- - `pub struct BindGroupLayoutBuilder { /* mirrors platform builder */ }`
115- - `pub fn new() -> Self`
116- - `pub fn with_uniform(self, binding: u32, visibility: BindingVisibility) -> Self`
117- - `pub fn with_uniform_dynamic(self, binding: u32, visibility: BindingVisibility) -> Self`
118- - `pub fn with_label(self, label: &str) -> Self`
119- - `pub fn build(self, rc: &RenderContext) -> BindGroupLayout`
120- - `pub struct BindGroupBuilder { /* mirrors platform builder */ }`
121- - `pub fn new() -> Self`
122- - `pub fn with_layout(self, layout: &BindGroupLayout) -> Self`
123- - `pub fn with_uniform(self, binding: u32, buffer: &buffer::Buffer, offset: u64, size: Option<NonZeroU64>) -> Self`
124- - `pub fn with_label(self, label: &str) -> Self`
125- - `pub fn build(self, rc: &RenderContext) -> BindGroup`
126-
127- Pipeline integration
128- - `RenderPipelineBuilder::with_layouts(&[&BindGroupLayout])` stores layouts and
129- constructs a `wgpu::PipelineLayout` during `build(...)`.
130-
131- Render commands
132- - Extend `RenderCommand` with:
133- - `SetBindGroup { set: u32, group: super::ResourceId, dynamic_offsets: Vec<u32> }`
134- - `RenderContext::encode_pass` maps to `wgpu::RenderPass::set_bind_group`.
135-
136- Buffers
137- - Continue using `buffer::BufferBuilder` with `Usage::UNIFORM` and CPU-visible
138- properties for frequently updated UBOs.
139- - A typed `UniformBuffer<T>` wrapper is available with `new(&mut rc, &T, label)`
140- and `write(&rc, &T)`, and exposes `raw()` to bind.
141-
142- ## Layout and Alignment Rules
143-
144- - WGSL/std140-like layout for uniform buffers (via naga/wgpu):
83+ ## Design
84+
85+ ### API Surface
86+
87+ - Platform layer (`lambda-rs-platform`, module `lambda_platform::wgpu::bind`)
88+ - Types: `BindGroupLayout`, `BindGroup`, and `Visibility` (maps to
89+ `wgpu::ShaderStages`).
90+ - Builders: `BindGroupLayoutBuilder` and `BindGroupBuilder` for declaring
91+ uniform bindings (static and dynamic), setting labels, and creating
92+ resources.
93+ - High-level layer (`lambda-rs`, module `lambda::render::bind`)
94+ - Types: high-level `BindGroupLayout` and `BindGroup` wrappers, and
95+ `BindingVisibility` enumeration.
96+ - Builders: mirror the platform builders; integrate with `RenderContext`.
97+ - Pipeline integration: `RenderPipelineBuilder::with_layouts(&[&BindGroupLayout])`
98+ stores layouts and constructs a `wgpu::PipelineLayout` during `build`.
99+ - Render commands: `RenderCommand::SetBindGroup { set, group, dynamic_offsets }`
100+ encodes `wgpu::RenderPass::set_bind_group` via `RenderContext`.
101+ - Buffers: Uniform buffers MUST be created with `Usage::UNIFORM`. For frequently
102+ updated data, pair with CPU-visible properties. A typed `UniformBuffer<T>`
103+ provides `new(&mut rc, &T, label)`, `write(&rc, &T)`, and exposes `raw()`.
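
The builder surface listed above can be sketched without any GPU dependency. The following is only an illustration of the chaining pattern, not the crate's implementation: `UniformEntry`, `entries()`, `label()`, and `dynamic_binding_count()` are hypothetical names introduced here, and nothing in it touches `wgpu`. The method names `with_uniform`, `with_uniform_dynamic`, and `with_label` and the `BindingVisibility` variants follow the signatures removed by this diff.

```rust
/// Shader stage visibility for a binding (variants as listed in the spec).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BindingVisibility {
  Vertex,
  Fragment,
  Compute,
  VertexAndFragment,
  All,
}

/// Hypothetical plain-data record for one uniform binding in a layout.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct UniformEntry {
  pub binding: u32,
  pub visibility: BindingVisibility,
  pub dynamic: bool,
}

/// Illustrative stand-in for the layout builder. It only records entries and
/// creates no GPU resources.
#[derive(Default)]
pub struct BindGroupLayoutBuilder {
  entries: Vec<UniformEntry>,
  label: Option<String>,
}

impl BindGroupLayoutBuilder {
  pub fn new() -> Self {
    Self::default()
  }

  /// Declares a uniform binding with a fixed offset.
  pub fn with_uniform(mut self, binding: u32, visibility: BindingVisibility) -> Self {
    self.entries.push(UniformEntry { binding, visibility, dynamic: false });
    self
  }

  /// Declares a uniform binding whose offset is supplied at draw time.
  pub fn with_uniform_dynamic(mut self, binding: u32, visibility: BindingVisibility) -> Self {
    self.entries.push(UniformEntry { binding, visibility, dynamic: true });
    self
  }

  pub fn with_label(mut self, label: &str) -> Self {
    self.label = Some(label.to_string());
    self
  }

  /// Hypothetical accessor: the recorded entries, in declaration order.
  pub fn entries(&self) -> &[UniformEntry] {
    &self.entries
  }

  /// Hypothetical accessor: the optional debug label.
  pub fn label(&self) -> Option<&str> {
    self.label.as_deref()
  }

  /// Hypothetical helper: how many dynamic offsets a bind group built from
  /// this layout would need when it is set.
  pub fn dynamic_binding_count(&self) -> usize {
    self.entries.iter().filter(|e| e.dynamic).count()
  }
}
```

For instance, chaining `with_uniform(0, BindingVisibility::VertexAndFragment)` and `with_uniform_dynamic(1, BindingVisibility::Vertex)` records two entries, one of which is dynamic, so one dynamic offset would be expected at bind time.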
104+
105+ ### Behavior
106+
107+ - Bind group layouts declare uniform bindings and their stage visibility. Layout
108+ indices correspond to set numbers; binding indices map one-to-one to shader
109+ `@binding(N)` declarations.
110+ - Bind groups bind a buffer (with optional size slice) to a binding declared in
111+ the layout. When a binding is dynamic, the actual offset is supplied at draw
112+ time using `dynamic_offsets`.
113+ - Pipelines reference one or more bind group layouts; all render passes that use
114+ that pipeline MUST supply compatible bind groups at the expected sets.
115+
116+ ### Validation and Errors
117+
118+ - Uniform binding ranges MUST NOT exceed
119+ `limits.max_uniform_buffer_binding_size`.
120+ - Dynamic uniform offsets MUST be aligned to
121+ `limits.min_uniform_buffer_offset_alignment` and the count MUST match the
122+ number of dynamic bindings in the bind group being set.
123+ - The number of bind group layouts in a pipeline MUST be ≤ `limits.max_bind_groups`.
124+ - Violations surface as wgpu validation errors during resource creation or when
125+ encoding `set_bind_group`. Helper functions validate alignment and counts.
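
A minimal sketch of the two helpers named in this spec, `align_up(value, align)` and `validate_dynamic_offsets(required, offsets, alignment, set)`, may make the rules above concrete. The parameter types, the `Result<(), String>` return type, and the error messages are assumptions chosen for illustration; the actual helpers in `crates/lambda-rs/src/render/validation.rs` may differ.

```rust
/// Rounds `value` up to the next multiple of `align`.
/// Assumption: a zero alignment is treated as "no alignment" and returns
/// `value` unchanged.
pub fn align_up(value: u64, align: u64) -> u64 {
  if align == 0 {
    return value;
  }
  let rem = value % align;
  if rem == 0 {
    value
  } else {
    value + (align - rem)
  }
}

/// Checks the count and alignment rules for dynamic offsets.
/// `required` is the number of dynamic bindings in the bind group's layout,
/// `alignment` is the device's minimum uniform offset alignment, and `set` is
/// used only to make the error message easier to trace.
pub fn validate_dynamic_offsets(
  required: u32,
  offsets: &[u32],
  alignment: u32,
  set: u32,
) -> Result<(), String> {
  if offsets.len() != required as usize {
    return Err(format!(
      "set {}: expected {} dynamic offsets, got {}",
      set,
      required,
      offsets.len()
    ));
  }
  // Guard against a zero alignment so the modulo below cannot panic.
  let align = alignment.max(1);
  for (index, offset) in offsets.iter().enumerate() {
    if *offset % align != 0 {
      return Err(format!(
        "set {}: dynamic offset {} at index {} is not a multiple of {}",
        set, offset, index, align
      ));
    }
  }
  Ok(())
}
```

With an alignment of 256 (used here only as an example value), `align_up(64, 256)` yields 256, and offsets `[0, 256]` pass validation for a layout with two dynamic bindings, while `[100]` is rejected as misaligned.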
126+
127+ ## Constraints and Rules
128+
129+ - WGSL/std140-like layout for uniform buffers (as enforced by wgpu):
145130 - Scalars 4 B; `vec2` 8 B; `vec3/vec4` 16 B; matrices 16 B column alignment.
146131 - Struct members rounded up to their alignment; struct size rounded up to the
147132 max alignment of its fields.
148- - Rust-side structs used as UBOs must be `#[repr(C)]` and plain-old-data.
149- Recommend `bytemuck::{Pod, Zeroable}` in examples for safety.
133+ - Rust-side structs used as UBOs MUST be `#[repr(C)]` and plain old data. Using
134+ `bytemuck::{Pod, Zeroable}` in examples is recommended for safety.
150135- Dynamic offsets must be multiples of
151136 `limits.min_uniform_buffer_offset_alignment`.
152137- Respect `limits.max_uniform_buffer_binding_size` when slicing UBOs.
153- - Matrices are column‑major in GLSL/WGSL. If your CPU math builds row‑major
154- matrices, either transpose before uploading to the GPU or mark GLSL uniform
155- blocks with `layout(row_major)` to avoid unexpected transforms.
138+ - Matrices are column‑major in GLSL/WGSL. If CPU math constructs row‑major
139+ matrices, transpose before uploading or mark GLSL uniform
140+ blocks with `layout(row_major)` to avoid unexpected transforms.
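
As an illustration of the layout rules above, the sketch below pads a Rust struct by hand so that each field lands where a shader-side uniform struct with a `vec4`, a `vec3`, and a `vec2` member would expect it. `MaterialUniform`, its fields, and `transpose` are hypothetical names invented for this example; the `bytemuck` derives recommended above are left out to keep the block dependency-free.

```rust
use std::mem::size_of;

/// Hypothetical material block. Explicit padding keeps each field at the
/// offset implied by the alignment rules:
///   base_color (vec4) at 0, emissive (vec3) at 16, uv_scale (vec2) at 32.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct MaterialUniform {
  pub base_color: [f32; 4], // bytes 0..16
  pub emissive: [f32; 3],   // bytes 16..28
  pub _pad0: f32,           // bytes 28..32, pads the vec3 out to 16 B
  pub uv_scale: [f32; 2],   // bytes 32..40
  pub _pad1: [f32; 2],      // bytes 40..48, rounds the size up to 16 B
}

// Compile-time check that the CPU-side size is the expected 48 bytes.
const _: () = assert!(size_of::<MaterialUniform>() == 48);

/// Transposes a 4x4 matrix, for example to turn row-major CPU data into the
/// column-major order expected by GLSL/WGSL.
pub fn transpose(m: [[f32; 4]; 4]) -> [[f32; 4]; 4] {
  let mut out = [[0.0f32; 4]; 4];
  for r in 0..4 {
    for c in 0..4 {
      out[c][r] = m[r][c];
    }
  }
  out
}
```

Spelling out the padding fields, rather than relying on the compiler, keeps the struct free of implicit padding bytes, which is also what a `Pod` derive would require.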
156141
157142## Example Usage
158143
@@ -234,63 +219,86 @@ let stride = lambda::render::validation::align_up(size, align);
234219let offsets = vec![0u32, stride as u32, (2 * stride) as u32];
235220RC::SetBindGroup { set: 0, group: dyn_group_id, dynamic_offsets: offsets };
236221```
222+ ## Performance Considerations
223+
224+ - Prefer `Properties::DEVICE_LOCAL` for long‑lived uniform buffers that are
225+ updated infrequently; otherwise use CPU‑visible memory with
226+ `Queue::write_buffer` for per‑frame updates.
227+ - Rationale: Device‑local memory provides higher bandwidth and lower latency
228+ for repeated reads. When updates are rare, the staging copy cost is
229+ amortized and the GPU benefits every frame. For small
230+ per‑frame updates, writing directly to CPU‑visible memory avoids additional
231+ copies and reduces driver synchronization. On integrated graphics, where the
232+ CPU and GPU share memory, the hint still helps the driver avoid stalls.
233+ - Use dynamic offsets to reduce bind group churn; align and pack many objects in
234+ a single uniform buffer.
235+ - Rationale: Reusing one bind group and changing only a 32‑bit offset turns
236+ descriptor updates into a cheap command. This lowers CPU
237+ overhead, reduces driver validation and allocation, improves cache locality
238+ by keeping per‑object blocks contiguous, and reduces the number of bind
239+ groups created and cached. Align slices to
240+ `min_uniform_buffer_offset_alignment` to satisfy hardware requirements and
241+ avoid implicit padding or copies.
242+ - Separate stable data (for example, camera) from frequently changing data (for
243+ example, per‑object).
244+ - Rationale: Bind stable data once per pass and vary only the hot set per
245+ draw. This reduces state changes, keeps descriptor caches warm, avoids
246+ rebinding large constant blocks when only small data changes, and lowers
247+ bandwidth while improving cache effectiveness.
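
The dynamic-offset recommendation above amounts to packing many small records into one byte buffer at an aligned stride and keeping one offset per record. The sketch below assumes a hypothetical 16-byte per-object record (`[f32; 4]`) and a function name, `pack_records`, invented for this example; it builds only the CPU-side bytes and offsets and performs no GPU upload.

```rust
/// Packs 16-byte records into one buffer, starting each record at a multiple
/// of `alignment`. Returns the packed bytes and the offset of each record,
/// which can then serve as the per-draw dynamic offsets.
pub fn pack_records(records: &[[f32; 4]], alignment: u32) -> (Vec<u8>, Vec<u32>) {
  const RECORD_SIZE: u32 = 16;
  // Guard against a zero alignment, then round the record size up to it.
  let align = alignment.max(1);
  let stride = ((RECORD_SIZE + align - 1) / align) * align;

  let mut bytes = vec![0u8; stride as usize * records.len()];
  let mut offsets = Vec::with_capacity(records.len());

  for (i, record) in records.iter().enumerate() {
    let start = i * stride as usize;
    for (j, value) in record.iter().enumerate() {
      let at = start + j * 4;
      // Native byte order, since the bytes are consumed on the same machine.
      bytes[at..at + 4].copy_from_slice(&value.to_ne_bytes());
    }
    offsets.push(start as u32);
  }
  (bytes, offsets)
}
```

With three records and an alignment of 256 (an example value, not a queried device limit), the buffer is 768 bytes long and the offsets are 0, 256, and 512. The gaps between records are the cost of meeting the alignment requirement with a single bind group.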
248+
249+ ## Requirements Checklist
250+
251+ - Functionality
252+ - [x] Core behavior implemented — crates/lambda-rs/src/render/bind.rs
253+ - [x] Dynamic offsets supported — crates/lambda-rs/src/render/command.rs
254+ - [x] Edge cases validated (alignment/size) — crates/lambda-rs/src/render/validation.rs
255+ - API Surface
256+ - [x] Platform types and builders — crates/lambda-rs-platform/src/wgpu/bind.rs
257+ - [x] High-level wrappers and builders — crates/lambda-rs/src/render/bind.rs
258+ - [x] Pipeline layout integration — crates/lambda-rs/src/render/pipeline.rs
259+ - Validation and Errors
260+ - [x] Uniform binding size checks — crates/lambda-rs/src/render/mod.rs
261+ - [x] Dynamic offset alignment/count checks — crates/lambda-rs/src/render/validation.rs
262+ - Performance
263+ - [x] Recommendations documented (see Performance Considerations)
264+ - [x] Dynamic offsets example provided — docs/specs/uniform-buffers-and-bind-groups.md
265+ - Documentation and Examples
266+ - [x] Spec updated (this document)
267+ - [x] Example added — crates/lambda-rs/examples/uniform_buffer_triangle.rs
268+
269+ ## Verification and Testing
237270
238- ## Error Handling
239-
240- - `BufferBuilder` already errors on zero length; keep behavior.
241- - Bind group and layout builders currently do not pre‑validate against device limits.
242- Invalid sizes/offsets typically surface as `wgpu` validation errors during creation
243- or when calling `set_bind_group`. Ensure dynamic offsets are aligned to device limits
244- and uniform ranges respect `max_uniform_buffer_binding_size`.
245-
246- ## Performance Notes
247-
248- - Prefer `Properties::DEVICE_LOCAL` for long-lived UBOs updated infrequently;
249- otherwise CPU-visible + `Queue::write_buffer` for per-frame updates.
250- - Dynamic offsets reduce bind group churn; align and pack many objects per UBO.
251- - Group stable data (camera) separate from frequently changing data (object).
252-
253- ## Implementation Plan
254-
255- Phase 0 (minimal, static UBO)
256- - Platform: add bind module, layout/bind builders, validation helpers.
257- - High level: expose `bind` module; add pipeline `.with_layouts`; extend
258- `RenderCommand` and encoder with `SetBindGroup`.
259- - Update examples to use one UBO for a transform/camera.
260-
261- Phase 1 (dynamic offsets)
262- - Done: `.with_uniform_dynamic` in layout builder, support for dynamic offsets, and
263- validation of count/alignment before binding. Alignment helper implemented.
264-
265- Phase 2 (ergonomics/testing)
266- - Done: `UniformBuffer<T>` wrapper with `.write(&T)` convenience.
267- - Added unit tests for alignment and dynamic offset validation; example animates a
268- triangle with a UBO (integration test remains minimal).
269-
270- File layout
271- - Platform: `crates/lambda-rs-platform/src/wgpu/bind.rs` (+ `mod.rs` re-export).
272- - High level: `crates/lambda-rs/src/render/bind.rs`, plus edits to
273- `render/pipeline.rs`, `render/command.rs`, and `render/mod.rs` to wire in
274- pipeline layouts and `SetBindGroup` encoding.
271+ - Unit tests
272+ - Alignment helper and dynamic offset validation — crates/lambda-rs/src/render/validation.rs
273+ - Visibility mapping — crates/lambda-rs-platform/src/wgpu/bind.rs
274+ - Command encoding satisfies device limits — crates/lambda-rs/src/render/command.rs
275+ - Command: `cargo test --workspace`
276+ - Integration tests and examples
277+ - `uniform_buffer_triangle` exercises the full path — crates/lambda-rs/examples/uniform_buffer_triangle.rs
278+ - Command: `cargo run -p lambda-rs --example uniform_buffer_triangle`
279+ - Manual checks (optional)
280+ - Verify that draws using dynamic offsets render correctly across multiple
281+ objects (no misaligned reads) by varying object counts and strides.
275282
276- ## Testing Plan
283+ ## Compatibility and Migration
277284
278- - Unit tests
279- - Alignment helper (`align_up`) and dynamic offset validation logic.
280- - Visibility enum mapping test (in-place).
281- - Integration
282- - Example `uniform_buffer_triangle` exercises the full path; a fuller
283- runnable test remains a future improvement.
285+ - No breaking changes. The feature is additive. Existing pipelines without bind
286+ groups continue to function. New pipelines MAY specify layouts via
287+ `with_layouts` without impacting prior behavior.
284288
285- ## Open Questions
286289
287- - Should we introduce a typed `Uniform<T>` handle now, or wait until there is a
288- second typed resource (e.g., storage) to avoid a one-off abstraction?
289- - Do we want a tiny cache for bind groups keyed by buffer+offset for frequent
290- reuse, or rely entirely on wgpu’s internal caches?
291290
292291## Changelog
293292
293+ - 2025-10-17 (v0.4.0) — Restructure to match spec template: add Summary, Scope,
294+ Terminology, Design (API/Behavior/Validation), Constraints and Rules,
295+ Requirements Checklist, Verification and Testing, and Compatibility. Remove
296+ Implementation Plan and Open Questions. No functional changes.
297+ - 2025-10-17 (v0.3.0) — Edit for professional tone; adopt clearer normative
298+ phrasing; convert Performance Notes to concise rationale; no functional
299+ changes to the specification; update metadata.
300+ - 2025-10-17 (v0.2.1) — Expand Performance Notes with rationale; update
301+ metadata (`last_updated`, `version`, `repo_commit`).
294302- 2025-10-13 (v0.1.1) — Synced spec to implementation: renamed visibility enum variant to `VertexAndFragment`; clarified that builders defer validation to `wgpu`; updated `with_uniform` size type to `Option<NonZeroU64>`; added note on GPU column‑major matrices and CPU transpose guidance; adjusted dynamic offset example.
295303- 2025-10-11 (v0.1.0) — Initial draft aligned to roadmap; specifies platform and high-level APIs, commands, validation, examples, and phased delivery.
296304- 2025-10-13 (v0.2.0) — Add validation for dynamic offsets (count/alignment),