PaddlePaddle
diff --git a/docs/zh/datasets_format_zh.md b/docs/zh/datasets_format_zh.md
Lines changed: 36 additions & 36 deletions
diff --git a/examples/config/sft-vl/full.yaml b/examples/config/sft-vl/full.yaml
Lines changed: 5 additions & 2 deletions
diff --git a/examples/config/sft-vl/full_tp_pp.yaml b/examples/config/sft-vl/full_tp.yaml
examples/config/sft-vl/full_tp_pp.yaml renamed to examples/config/sft-vl/full_tp.yaml
Lines changed: 6 additions & 5 deletions
diff --git a/examples/config/sft-vl/lora.yaml b/examples/config/sft-vl/lora.yaml
Lines changed: 56 additions & 0 deletions
diff --git a/examples/config/sft-vl/lora_tp.yaml b/examples/config/sft-vl/lora_tp.yaml
Lines changed: 57 additions & 0 deletions
@@ -8,11 +8,11 @@
 
 ### 1.1. 在线数据流
 
-#### 1.1.1. erniekit格式
+#### 1.1.1. erniekit 格式
 
 使用 `erniekit` 格式需要在 `train(/eval)_dataset_type` 处指定为 `erniekit`
 
-erniekit格式：每条数据都是一个字典，包含以下字段：
+erniekit 格式：每条数据都是一个字典，包含以下字段：
 
 - `text` : `str, List(str)`
 
@@ -30,11 +30,11 @@ wget https://paddleformers.bj.bcebos.com/datasets/pt_data.tar.gz
 mkdir -p data/pt && tar -xf pt_data.tar.gz -C data/pt/
 ```
 
-#### 1.1.2. messages格式
+#### 1.1.2. messages 格式
 
 使用 `messages` 格式需要在 `train(/eval)_dataset_type` 处指定为 `messages`
 
-messages格式：每条数据都是一个字典，包含以下字段：
+messages 格式：每条数据都是一个字典，包含以下字段：
 
 - `messages` : `List(Dict)`
 
@@ -49,7 +49,7 @@ messages格式：每条数据都是一个字典，包含以下字段：
 
 我们也可以选择使用离线的比特预训练数据流，更节省内存。
 
-为了方便测试，我们也提供了[离线预训练demo数据集](https://paddleformers.bj.bcebos.com/datasets/pretrain_offline_data.tar.gz)可以直接使用：
+为了方便测试，我们也提供了[离线预训练 demo 数据集](https://paddleformers.bj.bcebos.com/datasets/pretrain_offline_data.tar.gz)可以直接使用：
 
 ```shell
 wget https://paddleformers.bj.bcebos.com/datasets/pretrain_offline_data.tar.gz
@@ -60,7 +60,7 @@ tar -xf pretrain_offline_data.tar.gz -C data/pre-training/
 
 下载一个文本数据集，例如 https://modelscope.cn/datasets/BazingaLyn/mini_pretrain_dataset
 
-格式需为jsonl，每行格式例如BazingaLyn/mini_pretrain_dataset/pretrain_hq_v7.jsonl：
+格式需为 jsonl，每行格式例如 BazingaLyn/mini_pretrain_dataset/pretrain_hq_v7.jsonl：
 ```text
 {"text": "番茄炒蛋\n材料：\n鸡蛋3个、番茄1个、油、盐、糖、水淀粉\n做法：..."}
 {"text": "请描述一下如何正确规划个人理财。正确规划个人理财需要以下几个步骤..."}
@@ -82,35 +82,35 @@ python -u examples/tools/create_pretraining_data.py \
 ```
 
 - 参数说明
- 
+
 | 参数名              | 类型        | 说明                 |
 |--------------------|----------- |-----------------|
 | `--model_name_or_path`     | string     | 模型路径  |
 | `--data_format`    | string     | 支持的文件格式，当前只支持 JSON |
-| `--input_path`     | string     | 输入的json文件的路径  |
-| `--append_eos`     | store_true | 是否在document的结尾添加eos token  |
+| `--input_path`     | string     | 输入的 json 文件的路径  |
+| `--append_eos`     | store_true | 是否在 document 的结尾添加 eos token  |
 | `--output_prefix`  | str        | 输出文件的前缀    |
 | `--workers`        | int        | 运行的进程数     |
 | `--log_interval`   | int        | 打印日志间隔   |
-| `--data_impl`      | str        | 制作的数据集类型，默认为mmap，也可以选择lazy |
+| `--data_impl`      | str        | 制作的数据集类型，默认为 mmap，也可以选择 lazy |
 
-## 2. SFT数据流
+## 2. SFT 数据流
 
-### erniekit格式
+### erniekit 格式
 
 使用 `erniekit` 格式需要在 `train(/eval)_dataset_type` 处指定为 `erniekit`
 
-SFT数据流中，每条数据都是一个字典，包含以下字段：
+SFT 数据流中，每条数据都是一个字典，包含以下字段：
 
 - `src` : `str, List(str)`, 模型的输入指令（instruction）、提示（prompt），模型应该执行的任务。
 - `tgt` : `str, List(str)`, 模型的输出。
 - `system(optional)` : 系统配置
 - `label(optional)`: Training flag (1=参与训练, 0=不参与训练)
-- `is_system(optional)` : 标志src的第一条数据是否是system
+- `is_system(optional)` : 标志 src 的第一条数据是否是 system
 
 Notes:
 * `src` 和 `tgt` 为支持多轮对话的列表（List）对象
-* 每个训练样本均为JSON格式，多个样本以换行符分隔
+* 每个训练样本均为 JSON 格式，多个样本以换行符分隔
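
The field rules above can be checked mechanically before training. The following is a minimal sketch, not part of PaddleFormers: `check_erniekit_sft` is a hypothetical helper, and it assumes that `src` and `tgt` pair up one-to-one per turn and that `label` is either a single flag or a list of flags. The field list implies both points but does not state them.

```python
import json


def check_erniekit_sft(line: str) -> list[str]:
    """Return the problems found in one JSONL line (empty list if none)."""
    record = json.loads(line)
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    turns = {}
    for key in ("src", "tgt"):
        value = record.get(key)
        if isinstance(value, str):
            value = [value]  # a bare string counts as a single turn
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"`{key}` must be a str or a list of str")
        else:
            turns[key] = value

    # Assumption: SFT data carries one `tgt` reply per `src` turn.
    if len(turns) == 2 and len(turns["src"]) != len(turns["tgt"]):
        problems.append("`src` and `tgt` have different numbers of turns")

    # Assumption: `label` is one flag or a list of flags, each 0 or 1.
    label = record.get("label")
    if label is not None:
        flags = label if isinstance(label, list) else [label]
        if any(flag not in (0, 1) for flag in flags):
            problems.append("`label` flags must be 0 or 1")
    return problems
```

Running the helper over every line of a `.jsonl` file and printing the non-empty results gives a quick pre-flight report without loading any model code.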
 
 样例数据：
 ```json
@@ -136,21 +136,21 @@ mkdir -p data/sft && tar -xf alpaca_demo.gz -C data/sft/ --strip-components=1
 ```
 
 
-### messages格式
+### messages 格式
 
 使用 `messages` 格式需要在 `train(/eval)_dataset_type` 处指定为 `messages`
 
-SFT数据流中，每条数据都是一个字典，包含以下字段：
+SFT 数据流中，每条数据都是一个字典，包含以下字段：
 
-- `messages` : `List(Dict)`, 每个字典包含 `role`、`content`、`tool_calls(optional)` 三种key。
+- `messages` : `List(Dict)`, 每个字典包含 `role`、`content`、`tool_calls(optional)` 三种 key。
     - `role` 的值可以选择 `system`, `user`, `assistant` 或 `tool(optional)`。
     - `content`为具体的对话内容。
     - `tool_calls(optional)` 为申请工具调用。
 - `tools(optional)` : `List(Dict)`, 表示工具信息。
 - `label(optional)`: Training flag (1=参与训练, 0=不参与训练)
 
 Notes:
-* 每个训练样本均为JSON格式，多个样本以换行符分隔
+* 每个训练样本均为 JSON 格式，多个样本以换行符分隔
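
A similar check can be sketched for the `messages` format. `check_messages_sft` is a hypothetical helper, not a PaddleFormers API. It takes the role set from the list above and treats `content` as required because the list names it without an `(optional)` marker.

```python
import json

# Roles named in the field list above.
ALLOWED_ROLES = ("system", "user", "assistant", "tool")


def check_messages_sft(line: str) -> list[str]:
    """Return the problems found in one JSONL line (empty list if none)."""
    record = json.loads(line)
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["`messages` must be a non-empty list of dicts"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] is not a dict")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"messages[{i}] has unknown role {msg.get('role')!r}")
        if "content" not in msg:
            problems.append(f"messages[{i}] has no `content`")

    # `tools` is optional, but when present it should be a list.
    tools = record.get("tools")
    if tools is not None and not isinstance(tools, list):
        problems.append("`tools` must be a list of dicts")
    return problems
```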
 
 样例数据：
 
@@ -166,9 +166,9 @@ Notes:
 ]
 ```
 
-- 注意：在 `examples/data/sft_think-train.jsonl` 和 `examples/data/sft_think-eval.jsonl` 中提供的demo数据集来自由nvidia发布的 [OpenCodeReasoning数据集](https://huggingface.co/datasets/nvidia/OpenCodeReasoning)。该数据集需要遵循 Creative Commons Attribution 4.0 International License (CC BY 4.0) 协议。
+- 注意：在 `examples/data/sft_think-train.jsonl` 和 `examples/data/sft_think-eval.jsonl` 中提供的 demo 数据集来自由 nvidia 发布的 [OpenCodeReasoning 数据集](https://huggingface.co/datasets/nvidia/OpenCodeReasoning)。该数据集需要遵循 Creative Commons Attribution 4.0 International License (CC BY 4.0) 协议。
 
-用于function call训练的demo数据：
+用于 function call 训练的 demo 数据：
 
 ```json
 [
@@ -194,23 +194,23 @@ wget https://paddleformers.bj.bcebos.com/datasets/sft_function_call_demo.tar.gz
 mkdir -p data/sft && tar -zxf sft_function_call_demo.tar.gz -C data/sft/
 ```
 
-## 3. DPO数据流
+## 3. DPO 数据流
 
-### erniekit格式
+### erniekit 格式
 
 使用 `erniekit` 格式需要在 `train(/eval)_dataset_type` 处指定为 `erniekit`
 
-DPO数据流中，每条数据都是一个字典，包含以下字段：
+DPO 数据流中，每条数据都是一个字典，包含以下字段：
 
 - `system(optional)`: 系统配置
 - `src` : `str, List(str)`, 用户对话内容
-- `tgt` : `str, List(str)`, 系统回复内容（比src少一个）
+- `tgt` : `str, List(str)`, 系统回复内容（比 src 少一个）
 - `response` : `str, List(str)`, 包含 chosen 和 rejected 回复。
 - `sort` : `List(int)`, sort 值用于区分 response 中 chosen 和 rejected（sort 值小的是 rejected，sort 值大的是 chosen）。
-- `is_system(optional)` : 标志src的第一条数据是否是system
+- `is_system(optional)` : 标志 src 的第一条数据是否是 system
 
 Notes:
-* 每个训练样本均为JSON格式，多个样本以换行符分隔
+* 每个训练样本均为 JSON 格式，多个样本以换行符分隔
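
The `sort` rule (larger value is chosen, smaller is rejected) and the turn-count rule for `tgt` can be sketched as follows. `split_preference` is a hypothetical helper, and it assumes each record carries exactly two candidate responses, which the field list suggests but does not guarantee.

```python
def as_list(value):
    """Treat a bare string as a one-element list."""
    return [value] if isinstance(value, str) else list(value)


def split_preference(record: dict) -> tuple:
    """Return (chosen, rejected) from one erniekit-format DPO record."""
    src, tgt = as_list(record["src"]), as_list(record["tgt"])
    # The final user turn is answered by `response`, so `tgt` is one shorter.
    if len(tgt) != len(src) - 1:
        raise ValueError("`tgt` must hold exactly one fewer turn than `src`")

    responses, sort = record["response"], record["sort"]
    # Assumption: exactly two candidates with distinct sort values.
    if len(responses) != 2 or len(sort) != 2 or sort[0] == sort[1]:
        raise ValueError("expected two responses with distinct sort values")

    chosen_idx = 0 if sort[0] > sort[1] else 1
    return responses[chosen_idx], responses[1 - chosen_idx]
```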
 
 样例数据：
 
@@ -249,7 +249,7 @@ mkdir -p data/dpo && tar -zxf ultrafeedback_binarized.tar.gz -C data/dpo/ --stri
 
 使用 `messages` 格式需要在 `train(/eval)_dataset_type` 处指定为 `messages`
 
-DPO数据流中，每条数据都是一个字典，包含以下字段：
+DPO 数据流中，每条数据都是一个字典，包含以下字段：
 - `messages` : `List(dict)`, 对话历史列表。
   - 普通轮次：包含 `role` (`"user"` 或 `"assistant"`) 和 `content` (`str`) 字段。
   - 偏好/非偏好轮次（用于偏好学习）：包含以下两个关键字段，用于表示对同一用户查询的不同系统回复的偏好排序。
@@ -258,7 +258,7 @@ DPO数据流中，每条数据都是一个字典，包含以下字段：
 - `tools` : `List(dict)`, 对话中可能用到的工具（函数）的定义列表。
 - `label` : `List(int)`, 用于区分 `preferred_output` 和 `non_preferred_output` 的排序标签。其中 0 对应 `non_preferred_output` (rejected)， 1 对应 `preferred_output` (chosen)。
 
-详细的数据格式可见[function call说明](https://github.com/PaddlePaddle/PaddleFormers/blob/develop/examples/best_practices/function_call.md)
+详细的数据格式可见[function call 说明](https://github.com/PaddlePaddle/PaddleFormers/blob/develop/examples/best_practices/function_call.md)
 
 样例数据
 ```json
@@ -329,13 +329,13 @@ wget https://paddleformers.bj.bcebos.com/datasets/dpo_function_call_1k.tar.gz
 mkdir -p data/dpo_fc && tar -zxf dpo_function_call_1k.tar.gz -C data/dpo_fc/
 ```
 
-## 4. 多模 SFT数据流
+## 4. 多模 SFT 数据流
 
-### erniekit格式
+### erniekit 格式
 
 使用 `erniekit` 格式需要在 `train(/eval)_dataset_type` 处指定为 `erniekit`
 
-SFT数据流中，每条数据都是一个字典，包含以下字段：
+SFT 数据流中，每条数据都是一个字典，包含以下字段：
 
 * `text_info`: 纯文本的列表，每个元素包含一个 `text` 和一个 `tag`
   * `text`: 来自使用者的问题或系统回复的文字内容
@@ -397,18 +397,18 @@ SFT数据流中，每条数据都是一个字典，包含以下字段：
 }
 ```
 
-为了方便测试，我们也提供了用于快速训练的demo数据，请根据您的需要下载[数据](https://paddleformers.bj.bcebos.com/datasets/DoclingMatix.tar.gz)，并将其解压缩到`tests/fixtures/dummy/sft-vl/`：
+为了方便测试，我们也提供了用于快速训练的 demo 数据，请根据您的需要下载[数据](https://paddleformers.bj.bcebos.com/datasets/DoclingMatix.tar.gz)，并将其解压缩到`tests/fixtures/dummy/sft-vl/`：
 
 ```shell
 wget https://paddleformers.bj.bcebos.com/datasets/DoclingMatix.tar.gz
-tar -xf DoclingMatix.tar.gz -C tests/fixtures/dummy/sft-vl/ --strip-components=1
+tar -xf DoclingMatix.tar.gz -C tests/fixtures/dummy/sft-vl/
 ```
 
-### messages格式
+### messages 格式
 
 使用 `messages` 格式需要在 `train(/eval)_dataset_type` 处指定为 `messages`
 
-多模messages格式需要在纯文messages格式的基础上加上`images`、`videos`、`audios`几个key，用于传入多模态资源的`url`或者`path`，同时在`messages`中插入`<image>`、`<video>`、`<audio>`标签来表述插入多模态数据的位置：
+多模 messages 格式需要在纯文 messages 格式的基础上加上`images`、`videos`、`audios`几个 key，用于传入多模态资源的`url`或者`path`，同时在`messages`中插入`<image>`、`<video>`、`<audio>`标签来表述插入多模态数据的位置：
 
 纯文：
 ```json
 
@@ -33,7 +33,7 @@ save_strategy: steps
 logging_steps: 1
 gradient_accumulation_steps: 4
 logging_dir: ./vdl_log
-output_dir: ./checkpoints/Qwen2.5-VL-sft-full
+output_dir: ./checkpoints/qwen2.5-vl-sft-full
 disable_tqdm: true
 eval_accumulation_steps: 16
 
@@ -48,4 +48,7 @@ sharding: stage2
 recompute: true
 bf16: true
 fp16_opt_level: O2
-unified_checkpoint: true
+unified_checkpoint: false
+save_checkpoint_format: "flex_checkpoint"
+load_checkpoint_format: "flex_checkpoint"
+freeze_config: freeze_vision freeze_aligner
@@ -5,14 +5,14 @@ train_dataset_path: ./tests/fixtures/dummy/sft-vl/train.jsonl
 train_dataset_prob: "1.0"
 eval_dataset_path: ./tests/fixtures/dummy/sft-vl/train.jsonl
 eval_dataset_prob: "1.0"
-max_seq_len: 8192
+max_seq_len: 32768
 packing: true
 mix_strategy: concat
 template_backend: custom
 template: qwen2_vl
 
 ### model
-model_name_or_path: Qwen2.5-VL-3B-Instruct
+model_name_or_path: Qwen/Qwen2.5-VL-3B-Instruct
 attn_impl: flashmask
 
 ### finetuning
@@ -33,7 +33,7 @@ save_strategy: steps
 logging_steps: 1
 gradient_accumulation_steps: 4
 logging_dir: ./vdl_log
-output_dir: ./checkpoints/Qwen2.5-VL-sft-full-tp-pp
+output_dir: ./checkpoints/qwen2.5-vl-sft-full-tp
 disable_tqdm: true
 eval_accumulation_steps: 16
 
@@ -43,12 +43,13 @@ learning_rate: 1.0e-5
 
 # performance
 tensor_parallel_degree: 2
-pipeline_parallel_degree: 2
+pipeline_parallel_degree: 1
 sequence_parallel: true
 sharding: stage1
 recompute: true
 bf16: true
 fp16_opt_level: O2
 unified_checkpoint: false
 save_checkpoint_format: "flex_checkpoint"
-load_checkpoint_format: "flex_checkpoint"
+load_checkpoint_format: "flex_checkpoint"
+freeze_config: freeze_vision freeze_aligner
@@ -0,0 +1,56 @@
+### data
+train_dataset_type: erniekit
+eval_dataset_type: erniekit
+train_dataset_path: ./tests/fixtures/dummy/sft-vl/train.jsonl
+train_dataset_prob: "1.0"
+eval_dataset_path: ./tests/fixtures/dummy/sft-vl/train.jsonl
+eval_dataset_prob: "1.0"
+max_seq_len: 8192
+packing: false
+mix_strategy: concat
+template_backend: custom
+template: qwen2_vl
+
+### model
+model_name_or_path: Qwen2.5-VL-3B-Instruct
+attn_impl: flashmask
+lora: true
+lora_rank: 8
+
+### finetuning
+# base
+stage: VL-SFT
+fine_tuning: lora
+seed: 23
+do_train: true
+do_eval: true
+per_device_eval_batch_size: 1
+per_device_train_batch_size: 1
+num_train_epochs: 1
+max_steps: -1
+eval_steps: 100
+evaluation_strategy: steps
+save_steps: 100
+save_strategy: steps
+logging_steps: 1
+gradient_accumulation_steps: 4
+logging_dir: ./vdl_log
+output_dir: ./checkpoints/qwen2.5-vl-sft-lora
+disable_tqdm: true
+eval_accumulation_steps: 16
+
+# train
+warmup_steps: 20
+learning_rate: 1.0e-4
+
+# performance
+tensor_parallel_degree: 1
+pipeline_parallel_degree: 1
+sharding: stage2
+recompute: true
+bf16: true
+fp16_opt_level: O2
+unified_checkpoint: false
+save_checkpoint_format: "flex_checkpoint"
+load_checkpoint_format: "flex_checkpoint"
+freeze_config: freeze_vision freeze_aligner
@@ -0,0 +1,57 @@
+### data
+train_dataset_type: erniekit
+eval_dataset_type: erniekit
+train_dataset_path: ./tests/fixtures/dummy/sft-vl/train.jsonl
+train_dataset_prob: "1.0"
+eval_dataset_path: ./tests/fixtures/dummy/sft-vl/train.jsonl
+eval_dataset_prob: "1.0"
+max_seq_len: 32768
+packing: true
+mix_strategy: concat
+template_backend: custom
+template: qwen2_vl
+
+### model
+model_name_or_path: Qwen2.5-VL-3B-Instruct
+attn_impl: flashmask
+lora: true
+lora_rank: 8
+
+### finetuning
+# base
+stage: VL-SFT
+fine_tuning: lora
+seed: 23
+do_train: true
+do_eval: true
+per_device_eval_batch_size: 1
+per_device_train_batch_size: 1
+num_train_epochs: 1
+max_steps: -1
+eval_steps: 100
+evaluation_strategy: steps
+save_steps: 100
+save_strategy: steps
+logging_steps: 1
+gradient_accumulation_steps: 4
+logging_dir: ./vdl_log
+output_dir: ./checkpoints/qwen2.5-vl-sft-lora-tp
+disable_tqdm: true
+eval_accumulation_steps: 16
+
+# train
+warmup_steps: 20
+learning_rate: 1.0e-4
+
+# performance
+tensor_parallel_degree: 2
+pipeline_parallel_degree: 1
+sequence_parallel: true
+sharding: stage2
+recompute: true
+bf16: true
+fp16_opt_level: O2
+unified_checkpoint: false
+save_checkpoint_format: "flex_checkpoint"
+load_checkpoint_format: "flex_checkpoint"
+freeze_config: freeze_vision freeze_aligner
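
The parallelism settings in these configs fix how many devices one model replica occupies: the product of `tensor_parallel_degree` and `pipeline_parallel_degree`. The launch device count should be a multiple of that product. A rough sketch for reading the value out of a config, where `parse_flat_yaml` and `model_parallel_size` are hypothetical helpers and the parser handles only the flat `key: value` subset of YAML that these files use:

```python
def parse_flat_yaml(text: str) -> dict:
    """Parse flat `key: value` lines; not a general YAML parser."""
    config = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip().strip('"')
    return config


def model_parallel_size(config: dict) -> int:
    """Devices occupied by one model replica (tensor x pipeline degree)."""
    tp = int(config.get("tensor_parallel_degree", 1))
    pp = int(config.get("pipeline_parallel_degree", 1))
    return tp * pp


# Excerpt of the performance section from lora_tp.yaml in this change.
LORA_TP = """
# performance
tensor_parallel_degree: 2
pipeline_parallel_degree: 1
sharding: stage2
"""

config = parse_flat_yaml(LORA_TP)
replica_devices = model_parallel_size(config)
```

For `lora_tp.yaml` and the renamed `full_tp.yaml` this gives 2, so they need at least two devices, while `lora.yaml` fits on one.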