Replies: 1 comment
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
We have a network structured as a convnet pre-encoder -> DiT blocks -> a final block for the last sampling step. It works well in both the torch and onnx formats, but when we convert it to a TensorRT FP16 engine, inference overflows. Comparing the onnx and TRT FP16 outputs with polygraphy, we saw the difference grow larger and larger through the DiT blocks.

My question is: how can we make the overall model design friendlier to mixed-precision inference, so that the DiT blocks are less sensitive to numerical precision? Should the convnet pre-encoder and final block be made more complex, or simpler? Thanks
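For context, the onnx-vs-TRT comparison was roughly along these lines (a minimal sketch of polygraphy's Python API; `model.onnx` is a placeholder path):

```python
# Sketch: run the same inputs through ONNX-Runtime (FP32) and a
# TensorRT FP16 engine, then compare the outputs.
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import (
    CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner,
)
from polygraphy.comparator import Comparator

# Build the TensorRT engine with FP16 enabled.
build_engine = EngineFromNetwork(
    NetworkFromOnnxPath("model.onnx"),  # placeholder path
    config=CreateConfig(fp16=True),
)

# Run both backends on the same (auto-generated) inputs and compare.
run_results = Comparator.run([
    OnnxrtRunner(SessionFromOnnx("model.onnx")),
    TrtRunner(build_engine),
])
Comparator.compare_accuracy(run_results)
```

On the CLI, `polygraphy run model.onnx --onnxrt --trt --fp16 --onnx-outputs mark all --trt-outputs mark all` gives the per-layer view, which is how the divergence can be traced block by block through the DiT stack.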
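For reference, pinning precision-sensitive layers back to FP32 at build time looks roughly like this (a sketch with the standard TensorRT Python API, assuming TRT >= 8.2 for `OBEY_PRECISION_CONSTRAINTS`; the Softmax/LayerNorm name filter is a hypothetical example, not our actual layer names):

```python
# Sketch: build an FP16 engine but force numerically sensitive
# layers back to FP32 via per-layer precision constraints.
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
# Make TRT honor the per-layer precisions set below.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

for i in range(network.num_layers):
    layer = network.get_layer(i)
    # Hypothetical filter: keep softmax / normalization layers in FP32.
    if "Softmax" in layer.name or "LayerNorm" in layer.name:
        layer.precision = trt.float32
        layer.set_output_type(0, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
```

This works as a per-layer band-aid, but the open question above stands: whether the architecture itself can be shaped so the DiT blocks stay in a numerically safe range without pinning.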