Commit 58e50e0
## Purpose ##
* Enable users to offload activations to another GPU
* Because GPU-to-GPU transfer is much faster than GPU-to-CPU transfer, this
option should, in theory, improve runtime
## Changes ##
* Rename `offload_sequential_activations` -> `sequential_offload_device`
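The rename replaces an on/off flag with an option that names a target device. The sketch below shows one way a caller-facing arguments class could accept the old name during a transition. Everything in it is an assumption made for illustration: the class name `SequentialArgs`, the `"cpu"` default, the use of `None` to mean "do not offload", and the deprecation shim itself. The commit message does not say whether such a compatibility path exists.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequentialArgs:
    """Hypothetical container for the renamed option (not the project's class)."""

    # New option: the device that activations are offloaded to, e.g. "cpu" or
    # "cuda:1". The "cpu" default and None meaning "no offload" are assumptions.
    sequential_offload_device: Optional[str] = "cpu"

    # Old option, kept here only to illustrate a deprecation path.
    offload_sequential_activations: Optional[bool] = None

    def __post_init__(self) -> None:
        if self.offload_sequential_activations is not None:
            warnings.warn(
                "`offload_sequential_activations` is deprecated; "
                "use `sequential_offload_device` instead",
                DeprecationWarning,
                stacklevel=2,
            )
            # Assumed mapping: True meant "offload to CPU", False meant "keep in place".
            self.sequential_offload_device = (
                "cpu" if self.offload_sequential_activations else None
            )
            self.offload_sequential_activations = None


# Usage: name the second GPU as the offload target.
args = SequentialArgs(sequential_offload_device="cuda:1")
```

A device string is strictly more expressive than the boolean: it covers the old CPU behavior and adds any other device that the runtime can address.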
## TODO ##
* Demonstrate in test that using `cuda:1` leads to runtime improvements
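A minimal sketch of how the comparison in this TODO could be measured, assuming PyTorch and a machine with at least two CUDA devices. The helper names `mean_seconds` and `compare_offload_targets` and the tensor shape are hypothetical and are not part of this commit. It times a raw tensor copy only, so it does not show end-to-end pipeline speedup.

```python
import time
from typing import Callable, Dict, Optional, Sequence, Tuple


def mean_seconds(
    fn: Callable[[], object],
    repeats: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Average wall-clock seconds per call of `fn`.

    `sync` is called before and after the timed loop so that asynchronous
    device work (e.g. CUDA copies) is included in the measurement.
    """
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    fn()  # warm-up call, excluded from the timing
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / repeats


def compare_offload_targets(
    source: str = "cuda:0",
    targets: Sequence[str] = ("cpu", "cuda:1"),
    shape: Tuple[int, int] = (4096, 4096),
) -> Dict[str, float]:
    """Time copying one activation-sized tensor from `source` to each target."""
    import torch  # imported lazily so the timing helper has no hard dependency

    def sync_all() -> None:
        # Wait for pending work on every visible CUDA device.
        for index in range(torch.cuda.device_count()):
            torch.cuda.synchronize(index)

    activation = torch.randn(*shape, device=source)
    results = {}
    for target in targets:
        results[target] = mean_seconds(lambda: activation.to(target), sync=sync_all)
    return results
```

On a two-GPU machine, `compare_offload_targets()` returns the mean copy time per target. A test could then assert that the `cuda:1` entry is smaller than the `cpu` entry, though the margin will depend on the interconnect and the tensor size.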
---------
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
1 parent ef45976 commit 58e50e0
2 files changed (+6, -7) under src/llmcompressor:
- args
- pipelines/sequential

Diff bodies are not preserved; only the changed line ranges remain, in the order shown:
- First file: original lines 233-234 and 236-238 replaced by new lines 233-234 and 236-238
- Second file: original lines 92-93 replaced by new line 92