Video preprocessing #18048
Unanswered
jiangkunkun1993 asked this question in Q&A
Replies: 0 comments
Would it be possible to add a video preprocessing module to mtmd that mimics the logic of HF's Qwen2VLVideoProcessor? The module would split multiple frames into spatiotemporal patches with temporal_patch_size=2 and merge_size=2, generate video_grid_thw=[3,30,60], and produce 5400 visual tokens.
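To make the numbers concrete, here is a minimal sketch of the grid arithmetic. It is not the HF implementation or anything that exists in mtmd. The helper names are hypothetical, and patch_size=14 plus an input of 6 frames at 420x840 (already resized) are assumptions chosen so that the result matches video_grid_thw=[3,30,60]. Note that 3*30*60 = 5400 is the patch count before spatial merging. With merge_size=2, each 2x2 group of patches would, as far as I understand, collapse into one token for the language model, giving 1350.

```python
# Hypothetical helpers illustrating the grid arithmetic only.
# Assumptions (not from the original post): patch_size=14, frames already
# resized so height and width are multiples of patch_size * merge_size.

def video_grid_thw(num_frames, height, width,
                   patch_size=14, temporal_patch_size=2):
    """Return [grid_t, grid_h, grid_w] for a clip of resized frames."""
    if num_frames % temporal_patch_size != 0:
        raise ValueError("num_frames must be a multiple of temporal_patch_size")
    if height % patch_size != 0 or width % patch_size != 0:
        raise ValueError("height and width must be multiples of patch_size")
    return [num_frames // temporal_patch_size,
            height // patch_size,
            width // patch_size]


def patch_count(grid_thw):
    """Number of spatiotemporal patches before spatial merging."""
    t, h, w = grid_thw
    return t * h * w


def merged_token_count(grid_thw, merge_size=2):
    """Number of tokens after merging merge_size x merge_size patch groups."""
    t, h, w = grid_thw
    if h % merge_size != 0 or w % merge_size != 0:
        raise ValueError("grid_h and grid_w must be multiples of merge_size")
    return t * (h // merge_size) * (w // merge_size)


grid = video_grid_thw(num_frames=6, height=420, width=840)
print(grid, patch_count(grid), merged_token_count(grid))
```

Under these assumptions the grid is [3, 30, 60], so the 5400 figure in the question corresponds to the pre-merge patch count.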