
[Inquiry] Questions regarding code benchmarks (Qwen baselines) and fine-tuning scripts #3

@Qianhao-Ren


Hello! Thank you for the impressive work on LLaDA 2.0. I really enjoyed reading the paper and I am excited to try out the model.

I have a couple of questions regarding the benchmarks and future usage:

  1. Regarding Code Benchmarks:
    I noticed that the paper compares the model against "Qwen3 8b (no think)". However, this baseline's performance on the code benchmarks (specifically BIRD and Spider) appears significantly lower than that of current SOTA coding models, such as Qwen-2.5-Coder-7B-Instruct or Qwen-2.5-7B-Instruct.
    Have you conducted any comparisons against these stronger coding baselines? It would be very helpful to see how LLaDA 2.0 stacks up against models specifically optimized for coding.

  2. Fine-tuning Guidance:
    I am interested in further fine-tuning this model (either LoRA or full-parameter) on my own dataset. Could you please provide some guidance or example training scripts? Knowing the recommended hyperparameters and the training pipeline you used would be greatly appreciated; a rough sketch of what I have in mind follows this list.
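
For context, here is a minimal sketch of the kind of LoRA setup I have in mind, using transformers and PEFT. The model id, target module names, and all hyperparameters below are my own placeholders rather than anything from your paper or repo, and I understand the diffusion training objective may well require a custom trainer instead of this standard wrapper:

```python
# Hypothetical sketch only -- model id, target modules, and hyperparameters
# are placeholders I would replace with whatever your pipeline recommends.
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "inclusionAI/LLaDA-2.0"  # placeholder; actual HF repo id may differ

# The custom modeling code presumably requires trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

# Standard LoRA adapters on the attention projections; the module names
# here are a guess and would need to match the actual architecture.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Any pointers on whether this direction is even compatible with your training setup would already be very helpful.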

Thank you again for your contribution to the open-source community!

Best regards
