@Acly Acly commented Aug 1, 2025

No description provided.

Acly added 18 commits August 1, 2025 10:33
…o gguf

* introduce `model_file` to read key-value data from gguf files
* conditionally set `cwhn` flag based on gguf tensor data layout
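
The flag selection described above could look roughly like the sketch below. This is not the PR's code: the `metadata` map is a hypothetical stand-in for the key-value section of a gguf file, and the key name `tensor_data_layout` and the helper `uses_cwhn` are assumptions made for illustration. Falling back to whcn when the key is absent is also an assumption.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the key-value data read from a gguf file.
using metadata = std::map<std::string, std::string>;

// Hypothetical key name; the key actually used by the PR is not shown here.
constexpr char const* layout_key = "tensor_data_layout";

// Returns true only if the file declares its tensor data as cwhn.
// A missing key or any other value falls back to whcn.
bool uses_cwhn(metadata const& kv) {
    auto it = kv.find(layout_key);
    return it != kv.end() && it->second == "cwhn";
}
```

With this shape, files that carry no layout entry at all are treated as whcn.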
migan: can now run in cwhn and whcn mode (but cwhn remains faster in all cases)
* convert weights on CPU after model load
* whcn is slower on both CPU and Vulkan
* whcn is more correct on Vulkan; there is likely a bug in the cwhn version of conv2d/deform
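
To illustrate what converting weights between the two layouts involves, here is a minimal sketch of a whcn to cwhn copy on the CPU. It assumes the layout names list dimensions from fastest-varying to slowest (so whcn keeps width contiguous in memory and cwhn keeps channels contiguous). The function name and signature are hypothetical and not taken from the PR.

```cpp
#include <cstddef>
#include <vector>

// Sketch only: copies a 4D float tensor from whcn order (width innermost)
// to cwhn order (channels innermost). W, H, C, N are the extents of the
// width, height, channel and batch dimensions.
std::vector<float> whcn_to_cwhn(
    std::vector<float> const& src, size_t W, size_t H, size_t C, size_t N) {

    std::vector<float> dst(src.size());
    for (size_t n = 0; n < N; ++n) {
        for (size_t c = 0; c < C; ++c) {
            for (size_t h = 0; h < H; ++h) {
                for (size_t w = 0; w < W; ++w) {
                    size_t i_whcn = w + W * (h + H * (c + C * n));
                    size_t i_cwhn = c + C * (w + W * (h + H * n));
                    dst[i_cwhn] = src[i_whcn];
                }
            }
        }
    }
    return dst;
}
```

For example, a 2x1 image with two channels stored as `{0, 1, 10, 11}` in whcn order becomes `{0, 10, 1, 11}` in cwhn order: the two channel values of each pixel end up next to each other.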
…meter

* even though almost all tests use cwhn right now, whcn is the default in both ggml and PyTorch.
* it's only available for WHCN layout for now, but faster than anything else
* WHCN for model weights
* WHCN for Vulkan
* CWHN for CPU (converted at model load)
* the CWHN version of birefnet is probably still somewhat broken, but since WHCN doesn't have the issue and is much faster, there's not much incentive to fix it at the moment
* timings (old -> new / new with coopmat2):
* birefnet: 268ms -> 315ms / 243ms
* birefnet-lite: 109ms -> 119ms / 87ms
* deform conv2d is a bit slower without coopmat2 support, but it also requires much less VRAM, so it is still worth it
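
The timings above can be put in relative terms with a small helper. The function below is a hypothetical illustration, not part of the PR, and it assumes every figure in the list is in milliseconds.

```cpp
// Relative change of a new timing against the old one, in percent.
// Negative values mean the new version is faster.
double relative_change_percent(double old_ms, double new_ms) {
    return (new_ms - old_ms) / old_ms * 100.0;
}
```

Applied to the numbers listed, birefnet is about 17.5% slower without coopmat2 and about 9.3% faster with it, while birefnet-lite is about 9.2% slower without coopmat2 and about 20.2% faster with it.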
@Acly Acly merged commit ddeea58 into main Aug 13, 2025
3 checks passed
2 participants