Feature/no default lora (#137)

Glaceon-Hyy · web-flow · commit 4e09363284dc · 2025-08-06T14:56:58.000+08:00
* update qwen image doc

* fix tutorial_zh resolutions

* support no default lora key

* ruff format

* fix typo
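
The "support no default lora key" change can be read as a small key-matching rule: accept a LoRA down-projection key whether or not it carries the `.default` adapter segment, then derive the matching up-projection and alpha keys from whichever suffix was found. The sketch below isolates that rule in plain Python so it runs without `torch`. The helper names `find_lora_a_suffix` and `pair_lora_keys`, and the example key strings in the usage note, are hypothetical illustrations and not part of the DiffSynth-Engine API.

```python
from typing import Dict, Iterable, Optional, Tuple

# Suffixes accepted for the LoRA "down" (A) matrix: with and without
# the ".default" adapter segment.
LORA_A_SUFFIXES = ("lora_A.default.weight", "lora_A.weight")


def find_lora_a_suffix(key: str) -> Optional[str]:
    """Return the lora_A suffix contained in `key`, or None if it has none."""
    for suffix in LORA_A_SUFFIXES:
        if suffix in key:
            return suffix
    return None


def pair_lora_keys(keys: Iterable[str]) -> Dict[str, Tuple[str, str, str]]:
    """Map each module name to its (down key, up key, alpha key) triple.

    Hypothetical helper: it mirrors only the key handling, not the
    tensor handling, of the pipeline's conversion loop.
    """
    pairs: Dict[str, Tuple[str, str, str]] = {}
    for key in keys:
        a_suffix = find_lora_a_suffix(key)
        if a_suffix is None:
            # Not a lora_A key (for example a lora_B or alpha entry): skip it.
            continue
        b_suffix = a_suffix.replace("lora_A", "lora_B")
        up_key = key.replace(a_suffix, b_suffix)
        alpha_key = key.replace(a_suffix, "alpha")
        module = key.replace(f".{a_suffix}", "")
        pairs[module] = (key, up_key, alpha_key)
    return pairs
```

For instance, `pair_lora_keys(["blocks.0.attn.to_q.lora_A.weight"])` yields an entry for `blocks.0.attn.to_q` whose up key ends in `lora_B.weight`, while a key ending in `lora_A.default.weight` keeps the `.default` segment in its paired up key. Whether the alpha key is actually present in a given checkpoint is a separate lookup; the diff falls back to the rank when it is missing.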
diff --git a/diffsynth_engine/pipelines/qwen_image.py b/diffsynth_engine/pipelines/qwen_image.py
@@ -41,19 +41,32 @@ def _from_diffsynth(self, lora_state_dict: Dict[str, torch.Tensor]) -> Dict[str,
         dit_dict = {}
         for key, param in lora_state_dict.items():
             origin_key = key
-            if "lora_A.default.weight" not in key:
+            lora_a_suffix = None
+            if "lora_A.default.weight" in key:
+                lora_a_suffix = "lora_A.default.weight"
+            elif "lora_A.weight" in key:
+                lora_a_suffix = "lora_A.weight"
+
+            if lora_a_suffix is None:
                 continue
+
             lora_args = {}
             lora_args["down"] = param
-            lora_args["up"] = lora_state_dict[origin_key.replace("lora_A.default.weight", "lora_B.default.weight")]
+
+            lora_b_suffix = lora_a_suffix.replace("lora_A", "lora_B")
+            lora_args["up"] = lora_state_dict[origin_key.replace(lora_a_suffix, lora_b_suffix)]
+
             lora_args["rank"] = lora_args["up"].shape[1]
-            alpha_key = origin_key.replace("lora_A.default.weight", "alpha").replace("lora_up.default.weight", "alpha")
+            alpha_key = origin_key.replace("lora_up", "lora_A").replace(lora_a_suffix, "alpha")
+
             if alpha_key in lora_state_dict:
                 alpha = lora_state_dict[alpha_key]
             else:
                 alpha = lora_args["rank"]
             lora_args["alpha"] = alpha
-            key = key.replace(".lora_A.default.weight", "")
+
+            key = key.replace(f".{lora_a_suffix}", "")
+
             if key.startswith("transformer") and "attn.to_out.0" in key:
                 key = key.replace("attn.to_out.0", "attn.to_out")
             dit_dict[key] = lora_args
diff --git a/docs/tutorial.md b/docs/tutorial.md
@@ -88,7 +88,6 @@ We will continuously update DiffSynth-Engine to support more models. (Wan2.2 LoR
 
 After the model is downloaded, load the model with the corresponding pipeline and perform inference.
 
-
 ### Image Generation(Qwen-Image)
 
 The following code calls `QwenImagePipeline` to load the [Qwen-Image](https://www.modelscope.cn/models/Qwen/Qwen-Image) model and generate an image. Recommended resolutions are 928×1664, 1104×1472, 1328×1328, 1472×1104, and 1664×928, with a suggested cfg_scale of 4. If no negative_prompt is provided, it defaults to a single space character (not an empty string). For multi-GPU parallelism, currently only cfg parallelism is supported (parallelism=2), with other optimization efforts underway.
@@ -122,7 +121,7 @@ image.save("image.png")
 
 Please note that if some necessary modules, like text encoders, are missing from a model repository, the pipeline will automatically download the required files.
 
-#### Detailed Parameters(Qwen-Image)
+### Detailed Parameters(Qwen-Image)
 
 In the image generation pipeline `pipe`, we can use the following parameters for fine-grained control:
 
diff --git a/docs/tutorial_zh.md b/docs/tutorial_zh.md
@@ -2,13 +2,13 @@
 
 ## 安装
 
-在使用 DiffSynth-Engine 前，请先确保您的硬件设备满足以下要求：
+在使用 DiffSynth-Engine 前，请先确保您的硬件设备满足以下要求:
 
 * NVIDIA GPU CUDA 计算能力 8.6+（例如 RTX 50 Series、RTX 40 Series、RTX 30 Series 等，详见 [NVidia 文档](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities)）或 Apple Silicon M 系列芯片
 
-以及 Python 环境需求：Python 3.10+。
+以及 Python 环境需求: Python 3.10+。
 
-使用 `pip3` 工具从 PyPI 安装 DiffSynth-Engine：
+使用 `pip3` 工具从 PyPI 安装 DiffSynth-Engine:
 
 ```shell
 pip3 install diffsynth-engine
@@ -64,7 +64,7 @@ model_path = fetch_model("Wan-AI/Wan2.1-T2V-14B", path="diffusion_pytorch_model*
 
 ## 模型类型
 
-Diffusion 模型包含多种多样的模型结构，每种模型由对应的流水线进行加载和推理，目前我们支持的模型类型包括：
+Diffusion 模型包含多种多样的模型结构，每种模型由对应的流水线进行加载和推理，目前我们支持的模型类型包括:
 
 | 模型结构         | 样例                                                         | 流水线              |
 | --------------- | ------------------------------------------------------------ | ------------------- |
@@ -123,16 +123,17 @@ image.save("image.png")
 
 #### 详细参数(Qwen-Image)
 
-在图像生成流水线 `pipe` 中，我们可以通过以下参数进行精细的控制：
+在图像生成流水线 `pipe` 中，我们可以通过以下参数进行精细的控制:
 
 * `prompt`: 提示词，用于描述生成图像的内容，支持多种语言(中文/英文/日文等)，例如“一只猫”/"a cat"/"庭を走る猫"。
 * `negative_prompt`: 负面提示词，用于描述不希望图像中出现的内容，例如“ugly”，默认为一个空格而不是空字符串， " "。
-* `cfg_scale`: [Classifier-free guidance](https://arxiv.org/abs/2207.12598) 的引导系数，通常更大的引导系数可以达到更强的文图相关性，但会降低生成内容的多样性，推荐值为4。
+* `cfg_scale`:[Classifier-free guidance](https://arxiv.org/abs/2207.12598) 的引导系数，通常更大的引导系数可以达到更强的文图相关性，但会降低生成内容的多样性，推荐值为4。
 * `height`: 图像高度。
 * `width`: 图像宽度。
 * `num_inference_steps`: 推理步数，通常推理步数越多，计算时间越长，图像质量越高。
 * `seed`: 随机种子，固定的随机种子可以使生成的内容固定。
 
+
 ### 图像生成
 
 以下代码可以调用 `FluxImagePipeline` 加载[麦橘超然](https://www.modelscope.cn/models/MAILAND/majicflus_v1/summary?version=v1.0)模型生成一张图。如果要加载其他结构的模型，请将代码中的 `FluxImagePipeline` 和 `FluxPipelineConfig` 替换成对应的流水线模块及配置。
@@ -152,7 +153,7 @@ image.save("image.png")
 
 #### 详细参数
 
-在图像生成流水线 `pipe` 中，我们可以通过以下参数进行精细的控制：
+在图像生成流水线 `pipe` 中，我们可以通过以下参数进行精细的控制:
 
 * `prompt`: 提示词，用于描述生成图像的内容，例如“a cat”。
 * `negative_prompt`: 负面提示词，用于描述不希望图像中出现的内容，例如“ugly”。
@@ -217,7 +218,7 @@ save_video(video, "video.mp4")
 
 #### 详细参数
 
-在视频生成流水线 `pipe` 中，我们可以通过以下参数进行精细的控制：
+在视频生成流水线 `pipe` 中，我们可以通过以下参数进行精细的控制:
 
 * `prompt`: 提示词，用于描述生成图像的内容，例如“a cat”。
 * `negative_prompt`: 负面提示词，用于描述不希望图像中出现的内容，例如“ugly”。