While reproducing results from the official Qwen-VL code, loading Qwen/Qwen2-VL-7B-Instruct now produces missing-weight warnings on transformers==4.52.1. After loading, all reward outputs are 0, which suggests the model is not actually initialized with the pretrained weights.
## Environment

- GPU: NVIDIA RTX A6000
- CUDA: 12.4
- PyTorch: 2.6.0 (+cu124)
- Transformers: 4.52.1
- Model: Qwen/Qwen2-VL-7B-Instruct
- Other libs: accelerate==1.6.0, deepspeed==0.16.5
- Python: 3.10
## Observed Warning

```
Some weights of RewardModel were not initialized from the model checkpoint at Qwen/Qwen2-VL-7B-Instruct and are newly initialized: ['model.language_model.embed_tokens.weight', 'model.language_model.layers.0.input_layernorm.weight', ...]
```
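For context on what this warning means mechanically: any key the model class expects in its state dict but the checkpoint does not supply is left randomly initialized by `from_pretrained`, and that set of keys is exactly what gets listed. The sketch below is a minimal, self-contained illustration (plain Python, no transformers dependency). The specific key layouts shown, a bare `model.` prefix in the checkpoint versus the `model.language_model.` prefix the warning reports, are an assumption used for illustration, not a confirmed diagnosis of this issue.

```python
def missing_keys(expected_keys, checkpoint_keys):
    """Keys the model expects but the checkpoint does not provide.

    `from_pretrained` leaves parameters for these keys randomly
    initialized and lists them in the "newly initialized" warning.
    """
    return sorted(set(expected_keys) - set(checkpoint_keys))


# Hypothetical checkpoint layout: text weights stored under a bare
# "model." prefix (as in older transformers releases).
checkpoint = [
    "model.embed_tokens.weight",
    "model.layers.0.input_layernorm.weight",
]

# Keys the RewardModel built on transformers 4.52 expects, mirroring the
# "model.language_model." prefix that appears in the warning above.
expected = [
    "model.language_model.embed_tokens.weight",
    "model.language_model.layers.0.input_layernorm.weight",
]

# No expected key matches a checkpoint key, so every text weight would be
# reported as newly initialized -- the same shape as the observed warning.
print(missing_keys(expected, checkpoint))
```

If the two key sets line up, `missing_keys` returns an empty list and no warning is emitted; a wholesale prefix mismatch like the one sketched here re-initializes the entire language model, which would explain degenerate reward outputs.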