Replies: 1 comment
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
We have a network structured as a convnet pre-encoder -> DiT blocks -> a final block for the last sampling step. It works well in both the torch and onnx formats, but when we convert it to a TensorRT FP16 engine, inference overflows. Comparing the onnx and TRT FP16 outputs with polygraphy, we saw the difference grow larger and larger through the DiT blocks.

My question is: how can we make the overall model design friendlier to mixed-precision inference, so that the DiT blocks are less sensitive to numerical precision? Should the convnet pre-encoder and final block be made more complex, or simpler? Thanks
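For context, the onnx-vs-TRT comparison was roughly along these lines (a minimal sketch of polygraphy's Python API; `model.onnx` is a placeholder path):

```python
# Sketch: run the same inputs through ONNX-Runtime (FP32) and a
# TensorRT FP16 engine, then compare the outputs.
from polygraphy.backend.onnxrt import OnnxrtRunner, SessionFromOnnx
from polygraphy.backend.trt import (
    CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, TrtRunner,
)
from polygraphy.comparator import Comparator

# Build the TensorRT engine with FP16 enabled.
build_engine = EngineFromNetwork(
    NetworkFromOnnxPath("model.onnx"),  # placeholder path
    config=CreateConfig(fp16=True),
)

# Run both backends on the same (auto-generated) inputs and compare.
run_results = Comparator.run([
    OnnxrtRunner(SessionFromOnnx("model.onnx")),
    TrtRunner(build_engine),
])
Comparator.compare_accuracy(run_results)
```

On the CLI, `polygraphy run model.onnx --onnxrt --trt --fp16 --onnx-outputs mark all --trt-outputs mark all` gives the per-layer view, which is how the divergence can be traced block by block through the DiT stack.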
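For reference, pinning precision-sensitive layers back to FP32 at build time looks roughly like this (a sketch with the standard TensorRT Python API, assuming TRT >= 8.2 for `OBEY_PRECISION_CONSTRAINTS`; the Softmax/LayerNorm name filter is a hypothetical example, not our actual layer names):

```python
# Sketch: build an FP16 engine but force numerically sensitive
# layers back to FP32 via per-layer precision constraints.
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder path
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
# Make TRT honor the per-layer precisions set below.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

for i in range(network.num_layers):
    layer = network.get_layer(i)
    # Hypothetical filter: keep softmax / normalization layers in FP32.
    if "Softmax" in layer.name or "LayerNorm" in layer.name:
        layer.precision = trt.float32
        layer.set_output_type(0, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
```

This works as a per-layer band-aid, but the open question above stands: whether the architecture itself can be shaped so the DiT blocks stay in a numerically safe range without pinning.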