Commit 93c85cc

committed
[update] the specification.
1 parent 00aabab commit 93c85cc

1 file changed

+159
-151
lines changed

docs/specs/uniform-buffers-and-bind-groups.md

Lines changed: 159 additions & 151 deletions
Original file line number | Diff line number | Diff line change
@@ -3,66 +3,73 @@ title: "Uniform Buffers and Bind Groups"
33
document_id: "ubo-spec-2025-10-11"
44
status: "living"
55
created: "2025-10-11T00:00:00Z"
6-
last_updated: "2025-10-13T00:00:00Z"
7-
version: "0.2.0"
6+
last_updated: "2025-10-17T00:00:00Z"
7+
version: "0.4.0"
88
engine_workspace_version: "2023.1.30"
99
wgpu_version: "26.0.1"
1010
shader_backend_default: "naga"
1111
winit_version: "0.29.10"
12-
repo_commit: "3e63f82b0a364bc52a40ae297a5300f998800518"
12+
repo_commit: "00aababeb76370ebdeb67fc12ab4393aac5e4193"
1313
owners: ["lambda-sh"]
1414
reviewers: ["engine", "rendering"]
1515
tags: ["spec", "rendering", "uniforms", "bind-groups", "wgpu"]
1616
---
1717

1818
# Uniform Buffers and Bind Groups
1919

20-
This spec defines uniform buffer objects (UBO) and bind groups for Lambda’s
21-
wgpu-backed renderer. It follows the existing builder/command patterns and
22-
splits responsibilities between the platform layer (`lambda-rs-platform`) and
23-
the high-level API (`lambda-rs`).
20+
Summary
21+
- Specifies uniform buffer objects (UBOs) and bind groups for the
22+
wgpu‑backed renderer, preserving builder/command patterns and the separation
23+
between platform and high‑level layers.
24+
- Rationale: Enables structured constants (for example, cameras, materials,
25+
per‑frame data) beyond push constants and supports dynamic offsets for
26+
batching many small records efficiently.
2427

25-
The design enables larger, structured GPU constants (cameras, materials,
26-
per-frame data) beyond push constants, with an ergonomic path to dynamic
27-
offsets for batching many small uniforms in a single buffer.
28+
## Scope
2829

29-
## Goals
30+
### Goals
3031

3132
- Add first-class uniform buffers and bind groups.
3233
- Maintain builder ergonomics consistent with buffers, pipelines, and passes.
3334
- Integrate with the existing render command stream (inside a pass).
3435
- Provide a portable, WGSL/GLSL-friendly layout model and validation.
3536
- Expose dynamic uniform offsets (opt-in) with correct alignment handling.
3637

37-
## Non-Goals
38+
### Non-Goals
3839

3940
- Storage buffers, textures/samplers, and compute are referenced but not
4041
implemented here; separate specs cover them.
4142
- Descriptor set caching beyond wgpu’s internal caches.
4243

43-
## Background
44+
## Terminology
4445

45-
Roadmap docs propose UBOs and bind groups to complement push constants and
46-
unlock cameras/materials. This spec refines those sketches into concrete API
47-
types, builders, commands, validation, and an implementation plan for both
48-
layers of the workspace.
46+
- Uniform buffer object (UBO): Read‑only constant buffer accessed by shaders as
47+
`var<uniform>`.
48+
- Bind group: A collection of bound resources used together by a pipeline.
49+
- Bind group layout: The declared interface (bindings, types, visibility) for a
50+
bind group.
51+
- Dynamic offset: A per‑draw offset applied to a uniform binding to select a
52+
different slice within a larger buffer.
53+
- Visibility: Shader stage visibility for a binding (vertex, fragment, compute).
4954

5055
## Architecture Overview
5156

5257
- Platform (`lambda-rs-platform`)
53-
- Thin wrappers around `wgpu::BindGroupLayout` and `wgpu::BindGroup` with
54-
builder structs that produce concrete `wgpu` descriptors and perform
55-
validation against device limits.
56-
- Expose the raw `wgpu` handles for use by higher layers.
58+
- Wrappers around `wgpu::BindGroupLayout` and `wgpu::BindGroup` with builder
59+
types that produce `wgpu` descriptors and perform validation against device
60+
limits.
61+
- The platform layer owns the raw `wgpu` handles and exposes them to the
62+
high-level layer as needed.
5763

5864
- High level (`lambda-rs`)
59-
- Public builders/types for bind group layouts and bind groups aligned with
60-
`RenderPipelineBuilder` and `BufferBuilder` patterns.
61-
- Extend `RenderPipelineBuilder` to accept bind group layouts, building a
62-
`wgpu::PipelineLayout` under the hood.
63-
- Extend `RenderCommand` with `SetBindGroup` to bind resources during a pass.
64-
- Avoid exposing `wgpu` types in the public API; surface numeric limits and
65-
high-level wrappers only, delegating raw handles to the platform layer.
65+
- Public builders and types for bind group layouts and bind groups, aligned
66+
with existing `RenderPipelineBuilder` and `BufferBuilder` patterns.
67+
- `RenderPipelineBuilder` accepts bind group layouts and constructs a
68+
`wgpu::PipelineLayout` during build.
69+
- `RenderCommand` includes `SetBindGroup` to bind resources during a pass.
70+
- The public application programming interface avoids exposing `wgpu` types.
71+
Numeric limits and high-level wrappers are surfaced; raw handles live in the
72+
platform layer.
6673

6774
Data flow (one-time setup → per-frame):
6875
```
@@ -73,86 +80,64 @@ BufferBuilder (Usage::UNIFORM) --------------+--> BindGroupBuilder (uniform bind
7380
Per-frame commands: BeginRenderPass -> SetPipeline -> SetBindGroup -> Draw -> End
7481
```
7582

76-
## Platform API Design (lambda-rs-platform)
77-
78-
- Module: `lambda_platform::wgpu::bind`
79-
- `struct BindGroupLayout { raw: wgpu::BindGroupLayout, label: Option<String> }`
80-
- `struct BindGroup { raw: wgpu::BindGroup, label: Option<String> }`
81-
- `enum Visibility { Vertex, Fragment, Compute, VertexAndFragment, All }`
82-
- Maps to `wgpu::ShaderStages`.
83-
- `struct BindGroupLayoutBuilder { entries: Vec<wgpu::BindGroupLayoutEntry>, label: Option<String> }`
84-
- `fn new() -> Self`
85-
- `fn with_uniform(mut self, binding: u32, visibility: Visibility) -> Self`
86-
- `fn with_uniform_dynamic(mut self, binding: u32, visibility: Visibility) -> Self`
87-
- `fn with_label(mut self, label: &str) -> Self`
88-
- `fn build(self, device: &wgpu::Device) -> BindGroupLayout`
89-
- `struct BindGroupBuilder { layout: wgpu::BindGroupLayout, entries: Vec<wgpu::BindGroupEntry>, label: Option<String> }`
90-
- `fn new() -> Self`
91-
- `fn with_layout(mut self, layout: &BindGroupLayout) -> Self`
92-
- `fn with_uniform(mut self, binding: u32, buffer: &wgpu::Buffer, offset: u64, size: Option<NonZeroU64>) -> Self`
93-
- `fn with_label(mut self, label: &str) -> Self`
94-
- `fn build(self, device: &wgpu::Device) -> BindGroup`
95-
96-
Validation and limits
97-
- High-level validation now checks common cases early:
98-
- Bind group uniform binding sizes are asserted to be ≤ `max_uniform_buffer_binding_size`.
99-
- Dynamic offset count and alignment are validated before encoding `SetBindGroup`.
100-
- Pipeline builder asserts the number of bind group layouts ≤ `max_bind_groups`.
101-
- Helpers are provided to compute aligned strides and to validate dynamic offsets.
102-
103-
Helpers
104-
- High-level exposes small helpers:
105-
- `align_up(value, align)` to compute aligned uniform strides (for offsets).
106-
- `validate_dynamic_offsets(required, offsets, alignment, set)` used internally and testable.
107-
108-
## High-Level API Design (lambda-rs)
109-
110-
New module: `lambda::render::bind`
111-
- `pub struct BindGroupLayout { /* holds Rc<wgpu::BindGroupLayout> */ }`
112-
- `pub struct BindGroup { /* holds Rc<wgpu::BindGroup> */ }`
113-
- `pub enum BindingVisibility { Vertex, Fragment, Compute, VertexAndFragment, All }`
114-
- `pub struct BindGroupLayoutBuilder { /* mirrors platform builder */ }`
115-
- `pub fn new() -> Self`
116-
- `pub fn with_uniform(self, binding: u32, visibility: BindingVisibility) -> Self`
117-
- `pub fn with_uniform_dynamic(self, binding: u32, visibility: BindingVisibility) -> Self`
118-
- `pub fn with_label(self, label: &str) -> Self`
119-
- `pub fn build(self, rc: &RenderContext) -> BindGroupLayout`
120-
- `pub struct BindGroupBuilder { /* mirrors platform builder */ }`
121-
- `pub fn new() -> Self`
122-
- `pub fn with_layout(self, layout: &BindGroupLayout) -> Self`
123-
- `pub fn with_uniform(self, binding: u32, buffer: &buffer::Buffer, offset: u64, size: Option<NonZeroU64>) -> Self`
124-
- `pub fn with_label(self, label: &str) -> Self`
125-
- `pub fn build(self, rc: &RenderContext) -> BindGroup`
126-
127-
Pipeline integration
128-
- `RenderPipelineBuilder::with_layouts(&[&BindGroupLayout])` stores layouts and
129-
constructs a `wgpu::PipelineLayout` during `build(...)`.
130-
131-
Render commands
132-
- Extend `RenderCommand` with:
133-
- `SetBindGroup { set: u32, group: super::ResourceId, dynamic_offsets: Vec<u32> }`
134-
- `RenderContext::encode_pass` maps to `wgpu::RenderPass::set_bind_group`.
135-
136-
Buffers
137-
- Continue using `buffer::BufferBuilder` with `Usage::UNIFORM` and CPU-visible
138-
properties for frequently updated UBOs.
139-
- A typed `UniformBuffer<T>` wrapper is available with `new(&mut rc, &T, label)`
140-
and `write(&rc, &T)`, and exposes `raw()` to bind.
141-
142-
## Layout and Alignment Rules
143-
144-
- WGSL/std140-like layout for uniform buffers (via naga/wgpu):
83+
## Design
84+
85+
### API Surface
86+
87+
- Platform layer (`lambda-rs-platform`, module `lambda_platform::wgpu::bind`)
88+
- Types: `BindGroupLayout`, `BindGroup`, and `Visibility` (maps to
89+
`wgpu::ShaderStages`).
90+
- Builders: `BindGroupLayoutBuilder` and `BindGroupBuilder` for declaring
91+
uniform bindings (static and dynamic), setting labels, and creating
92+
resources.
93+
- High-level layer (`lambda-rs`, module `lambda::render::bind`)
94+
- Types: high-level `BindGroupLayout` and `BindGroup` wrappers, and
95+
`BindingVisibility` enumeration.
96+
- Builders: mirror the platform builders; integrate with `RenderContext`.
97+
- Pipeline integration: `RenderPipelineBuilder::with_layouts(&[&BindGroupLayout])`
98+
stores layouts and constructs a `wgpu::PipelineLayout` during `build`.
99+
- Render commands: `RenderCommand::SetBindGroup { set, group, dynamic_offsets }`
100+
encodes `wgpu::RenderPass::set_bind_group` via `RenderContext`.
101+
- Buffers: Uniform buffers MUST be created with `Usage::UNIFORM`. For frequently
102+
updated data, pair with CPU-visible properties. A typed `UniformBuffer<T>`
103+
provides `new(&mut rc, &T, label)`, `write(&rc, &T)`, and exposes `raw()`.
104+
105+
### Behavior
106+
107+
- Bind group layouts declare uniform bindings and their stage visibility. Layout
108+
indices correspond to set numbers; binding indices map one-to-one to shader
109+
`@binding(N)` declarations.
110+
- Bind groups bind a buffer (or an offset/size sub-range of it) to a binding declared in
111+
the layout. When a binding is dynamic, the actual offset is supplied at draw
112+
time using `dynamic_offsets`.
113+
- Pipelines reference one or more bind group layouts; all render passes that use
114+
that pipeline MUST supply compatible bind groups at the expected sets.
115+
116+
### Validation and Errors
117+
118+
- Uniform binding ranges MUST NOT exceed
119+
`limits.max_uniform_buffer_binding_size`.
120+
- Dynamic uniform offsets MUST be aligned to
121+
`limits.min_uniform_buffer_offset_alignment` and the count MUST match the
122+
number of dynamic bindings declared in the bind group layout.
123+
- The number of bind group layouts in a pipeline MUST be ≤ `limits.max_bind_groups`.
124+
- Violations surface as wgpu validation errors during resource creation or when
125+
encoding `set_bind_group`. Helper functions validate alignment and counts.
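
The alignment and count checks described above can be sketched as follows. The names `align_up` and `validate_dynamic_offsets` and their parameter lists come from the earlier revision of this spec; the function bodies, the `Result<(), String>` return type, and the error messages are assumptions made for illustration, not the crate's actual implementation.

```rust
/// Rounds `value` up to the next multiple of `align`.
/// Illustrative body; `align` must be non-zero.
fn align_up(value: u64, align: u64) -> u64 {
    assert!(align != 0, "alignment must be non-zero");
    ((value + align - 1) / align) * align
}

/// Checks the count and alignment of dynamic offsets before binding.
/// Returns a description of the first violation found (assumed error type).
fn validate_dynamic_offsets(
    required: usize,
    offsets: &[u32],
    alignment: u32,
    set: u32,
) -> Result<(), String> {
    if alignment == 0 {
        return Err(format!("set {set}: alignment must be non-zero"));
    }
    // One offset is needed for every dynamic binding in the layout.
    if offsets.len() != required {
        return Err(format!(
            "set {set}: expected {required} dynamic offsets, got {}",
            offsets.len()
        ));
    }
    // Every offset must be a multiple of the device alignment limit.
    for (index, offset) in offsets.iter().copied().enumerate() {
        if offset % alignment != 0 {
            return Err(format!(
                "set {set}: offset {offset} at index {index} is not a multiple of {alignment}"
            ));
        }
    }
    Ok(())
}
```

With an alignment of 256, for example, `align_up(64, 256)` yields a stride of 256, and the offsets `[0, 256]` pass validation for a layout with two dynamic bindings, while `[100]` is rejected as misaligned.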
126+
127+
## Constraints and Rules
128+
129+
- WGSL/std140-like layout for uniform buffers (as enforced by wgpu):
145130
- Scalars 4 B; `vec2` 8 B; `vec3/vec4` 16 B; matrices 16 B column alignment.
146131
- Struct members rounded up to their alignment; struct size rounded up to the
147132
max alignment of its fields.
148-
- Rust-side structs used as UBOs must be `#[repr(C)]` and plain-old-data.
149-
Recommend `bytemuck::{Pod, Zeroable}` in examples for safety.
133+
- Rust-side structs used as UBOs MUST be `#[repr(C)]` and plain old data. Using
134+
`bytemuck::{Pod, Zeroable}` in examples is recommended for safety.
150135
- Dynamic offsets must be multiples of
151136
`limits.min_uniform_buffer_offset_alignment`.
152137
- Respect `limits.max_uniform_buffer_binding_size` when slicing UBOs.
153-
- Matrices are column‑major in GLSL/WGSL. If your CPU math builds row‑major
154-
matrices, either transpose before uploading to the GPU or mark GLSL uniform
155-
blocks with `layout(row_major)` to avoid unexpected transforms.
138+
- Matrices are column‑major in GLSL/WGSL. If CPU math constructs row‑major
139+
matrices, transpose before uploading or mark GLSL uniform
140+
blocks with `layout(row_major)` to avoid unexpected transforms.
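
As a minimal sketch of these layout rules, the struct below mirrors a hypothetical WGSL uniform `struct Camera { view_proj: mat4x4<f32>, eye: vec3<f32> }`. The type name `CameraUniform`, its fields, and the helper `to_column_major` are illustrative assumptions and do not appear in the spec. Explicit padding stands in for what `bytemuck` derives would otherwise require.

```rust
use std::mem::size_of;

/// CPU-side mirror of the hypothetical WGSL `Camera` uniform struct.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct CameraUniform {
    view_proj: [[f32; 4]; 4], // 64 bytes; each inner array is one column
    eye: [f32; 3],            // 12 bytes at offset 64 (vec3 is 16-byte aligned)
    _pad: f32,                // explicit padding up to a multiple of 16 bytes
}

// Compile-time check that the Rust size matches the 80 bytes the shader expects.
const _: () = assert!(size_of::<CameraUniform>() == 80);

/// Transposes a row-major 4x4 matrix into column-major order before upload.
fn to_column_major(rows: [[f32; 4]; 4]) -> [[f32; 4]; 4] {
    let mut cols = [[0.0f32; 4]; 4];
    for r in 0..4 {
        for c in 0..4 {
            cols[c][r] = rows[r][c];
        }
    }
    cols
}
```

The explicit `_pad` field keeps the struct free of implicit padding, which is what allows it to be treated as plain old data when copied into a uniform buffer.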
156141

157142
## Example Usage
158143

@@ -234,63 +219,86 @@ let stride = lambda::render::validation::align_up(size, align);
234219
let offsets = vec![0u32, stride as u32, (2*stride) as u32];
235220
RC::SetBindGroup { set: 0, group: dyn_group_id, dynamic_offsets: offsets };
236221
```
222+
## Performance Considerations
223+
224+
- Prefer `Properties::DEVICE_LOCAL` for long‑lived uniform buffers that are
225+
updated infrequently; otherwise use CPU‑visible memory with
226+
`Queue::write_buffer` for per‑frame updates.
227+
- Rationale: Device‑local memory provides higher bandwidth and lower latency
228+
for repeated reads. When updates are rare, the staging copy cost is
229+
amortized and the GPU benefits every frame. For small
230+
per‑frame updates, writing directly to CPU‑visible memory avoids additional
231+
copies and reduces driver synchronization. On integrated graphics, the hint
232+
still guides an efficient path and helps avoid stalls.
233+
- Use dynamic offsets to reduce bind group churn; align and pack many objects in
234+
a single uniform buffer.
235+
- Rationale: Reusing one bind group and changing only a 32‑bit offset turns
236+
descriptor updates into a cheap command. This lowers CPU
237+
overhead, reduces driver validation and allocation, improves cache locality
238+
by keeping per‑object blocks contiguous, and reduces the number of bind
239+
groups created and cached. Align slices to
240+
`min_uniform_buffer_offset_alignment` to satisfy hardware requirements and
241+
avoid implicit padding or copies.
242+
- Separate stable data (for example, camera) from frequently changing data (for
243+
example, per‑object).
244+
- Rationale: Bind stable data once per pass and vary only the hot set per
245+
draw. This reduces state changes, keeps descriptor caches warm, avoids
246+
rebinding large constant blocks when only small data changes, and lowers
247+
bandwidth while improving cache effectiveness.
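
The packing recommendation above can be illustrated with a small CPU-side sketch. The function `pack_records` and the choice of a 16-byte `[f32; 4]` record are assumptions made for illustration; in practice the alignment would come from `limits.min_uniform_buffer_offset_alignment`, and the resulting bytes would be uploaded to a single uniform buffer.

```rust
/// Packs equally sized records into one byte buffer, placing each record at a
/// multiple of `alignment`. Returns the buffer and the offset of each record,
/// which can then be used as per-draw dynamic offsets.
fn pack_records(records: &[[f32; 4]], alignment: u32) -> (Vec<u8>, Vec<u32>) {
    assert!(alignment != 0, "alignment must be non-zero");
    let record_size = std::mem::size_of::<[f32; 4]>() as u32; // 16 bytes
    // Stride between records: the record size rounded up to the alignment.
    let stride = ((record_size + alignment - 1) / alignment) * alignment;
    let mut bytes = vec![0u8; stride as usize * records.len()];
    let mut offsets = Vec::with_capacity(records.len());
    for (index, record) in records.iter().enumerate() {
        let offset = index as u32 * stride;
        let mut cursor = offset as usize;
        for value in record {
            // Native endianness matches what the GPU reads on the same machine.
            bytes[cursor..cursor + 4].copy_from_slice(&value.to_ne_bytes());
            cursor += 4;
        }
        offsets.push(offset);
    }
    (bytes, offsets)
}
```

With three records and an alignment of 256, the offsets are 0, 256, and 512, and the buffer is 768 bytes long. One bind group over that buffer can then serve all three draws, with only the offset changing between them.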
248+
249+
## Requirements Checklist
250+
251+
- Functionality
252+
- [x] Core behavior implemented — crates/lambda-rs/src/render/bind.rs
253+
- [x] Dynamic offsets supported — crates/lambda-rs/src/render/command.rs
254+
- [x] Edge cases validated (alignment/size) — crates/lambda-rs/src/render/validation.rs
255+
- API Surface
256+
- [x] Platform types and builders — crates/lambda-rs-platform/src/wgpu/bind.rs
257+
- [x] High-level wrappers and builders — crates/lambda-rs/src/render/bind.rs
258+
- [x] Pipeline layout integration — crates/lambda-rs/src/render/pipeline.rs
259+
- Validation and Errors
260+
- [x] Uniform binding size checks — crates/lambda-rs/src/render/mod.rs
261+
- [x] Dynamic offset alignment/count checks — crates/lambda-rs/src/render/validation.rs
262+
- Performance
263+
- [x] Recommendations documented (see Performance Considerations)
264+
- [x] Dynamic offsets example provided — docs/specs/uniform-buffers-and-bind-groups.md
265+
- Documentation and Examples
266+
- [x] Spec updated (this document)
267+
- [x] Example added — crates/lambda-rs/examples/uniform_buffer_triangle.rs
268+
269+
## Verification and Testing
237270

238-
## Error Handling
239-
240-
- `BufferBuilder` already errors on zero length; keep behavior.
241-
- Bind group and layout builders currently do not pre‑validate against device limits.
242-
Invalid sizes/offsets typically surface as `wgpu` validation errors during creation
243-
or when calling `set_bind_group`. Ensure dynamic offsets are aligned to device limits
244-
and uniform ranges respect `max_uniform_buffer_binding_size`.
245-
246-
## Performance Notes
247-
248-
- Prefer `Properties::DEVICE_LOCAL` for long-lived UBOs updated infrequently;
249-
otherwise CPU-visible + `Queue::write_buffer` for per-frame updates.
250-
- Dynamic offsets reduce bind group churn; align and pack many objects per UBO.
251-
- Group stable data (camera) separate from frequently changing data (object).
252-
253-
## Implementation Plan
254-
255-
Phase 0 (minimal, static UBO)
256-
- Platform: add bind module, layout/bind builders, validation helpers.
257-
- High level: expose `bind` module; add pipeline `.with_layouts`; extend
258-
`RenderCommand` and encoder with `SetBindGroup`.
259-
- Update examples to use one UBO for a transform/camera.
260-
261-
Phase 1 (dynamic offsets)
262-
- Done: `.with_uniform_dynamic` in layout builder, support for dynamic offsets, and
263-
validation of count/alignment before binding. Alignment helper implemented.
264-
265-
Phase 2 (ergonomics/testing)
266-
- Done: `UniformBuffer<T>` wrapper with `.write(&T)` convenience.
267-
- Added unit tests for alignment and dynamic offset validation; example animates a
268-
triangle with a UBO (integration test remains minimal).
269-
270-
File layout
271-
- Platform: `crates/lambda-rs-platform/src/wgpu/bind.rs` (+ `mod.rs` re-export).
272-
- High level: `crates/lambda-rs/src/render/bind.rs`, plus edits to
273-
`render/pipeline.rs`, `render/command.rs`, and `render/mod.rs` to wire in
274-
pipeline layouts and `SetBindGroup` encoding.
271+
- Unit tests
272+
- Alignment helper and dynamic offset validation — crates/lambda-rs/src/render/validation.rs
273+
- Visibility mapping — crates/lambda-rs-platform/src/wgpu/bind.rs
274+
- Command encoding satisfies device limits — crates/lambda-rs/src/render/command.rs
275+
- Command: `cargo test --workspace`
276+
- Integration tests and examples
277+
- `uniform_buffer_triangle` exercises the full path — crates/lambda-rs/examples/uniform_buffer_triangle.rs
278+
- Command: `cargo run -p lambda-rs --example uniform_buffer_triangle`
279+
- Manual checks (optional)
280+
- Validate dynamic offsets across multiple objects render correctly (no
281+
misaligned reads) by varying object counts and strides.
275282

276-
## Testing Plan
283+
## Compatibility and Migration
277284

278-
- Unit tests
279-
- Alignment helper (`align_up`) and dynamic offset validation logic.
280-
- Visibility enum mapping test (in-place).
281-
- Integration
282-
- Example `uniform_buffer_triangle` exercises the full path; a fuller
283-
runnable test remains a future improvement.
285+
- No breaking changes. The feature is additive. Existing pipelines without bind
286+
groups continue to function. New pipelines MAY specify layouts via
287+
`with_layouts` without impacting prior behavior.
284288

285-
## Open Questions
286289

287-
- Should we introduce a typed `Uniform<T>` handle now, or wait until there is a
288-
second typed resource (e.g., storage) to avoid a one-off abstraction?
289-
- Do we want a tiny cache for bind groups keyed by buffer+offset for frequent
290-
reuse, or rely entirely on wgpu’s internal caches?
291290

292291
## Changelog
293292

293+
- 2025-10-17 (v0.4.0) — Restructure to match spec template: add Summary, Scope,
294+
Terminology, Design (API/Behavior/Validation), Constraints and Rules,
295+
Requirements Checklist, Verification and Testing, and Compatibility. Remove
296+
Implementation Plan and Open Questions. No functional changes.
297+
- 2025-10-17 (v0.3.0) — Edit for professional tone; adopt clearer normative
298+
phrasing; convert Performance Notes to concise rationale; no functional
299+
changes to the specification; update metadata.
300+
- 2025-10-17 (v0.2.1) — Expand Performance Notes with rationale; update
301+
metadata (`last_updated`, `version`, `repo_commit`).
294302
- 2025-10-13 (v0.1.1) — Synced spec to implementation: renamed visibility enum variant to `VertexAndFragment`; clarified that builders defer validation to `wgpu`; updated `with_uniform` size type to `Option<NonZeroU64>`; added note on GPU column‑major matrices and CPU transpose guidance; adjusted dynamic offset example.
295303
- 2025-10-11 (v0.1.0) — Initial draft aligned to roadmap; specifies platform and high-level APIs, commands, validation, examples, and phased delivery.
296304
- 2025-10-13 (v0.2.0) — Add validation for dynamic offsets (count/alignment),
