Not getting the desired result as Stable Diffusion Web UI #12938
Replies: 6 comments
-
Hi @aniket-professional2025, it's not clear to me which result is which, and why are you using inpainting with a ControlNet that also does inpainting? Which parts are you inpainting? Are you doing the same with auto1111, meaning using an inpainting ControlNet on top of an inpainting pipeline? If I had to guess, I would say the big vertical images were done with auto1111 and the square ones with diffusers, but if you're comparing them, why not use the exact same image with the same mask on both? Also, are you using a long prompt? Diffusers by default doesn't handle long prompts or the prompt weights commonly used with auto1111, meaning the parts that look like (word:0.5)
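As a side note on that last point: a sketch of one way to get prompt weighting in diffusers is the compel library, which turns a weighted prompt into embeddings that are passed instead of a plain string. The checkpoint and prompt below are only placeholders for illustration, not the poster's actual setup.
import torch
from compel import Compel
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

# Compel uses its own syntax: "++" boosts a term, "--" de-emphasises it.
compel_proc = Compel(tokenizer=pipe.tokenizer, text_encoder=pipe.text_encoder)
prompt_embeds = compel_proc("a woman wearing a red silk saree++, detailed fabric texture")

# Pass the embeddings instead of a string prompt.
image = pipe(prompt_embeds=prompt_embeds, num_inference_steps=30).images[0]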
-
I have used the same images, the same text prompts, and the same ControlNet inpainting model. I have also tried plain Stable Diffusion without ControlNet, and the results are not satisfactory. Since I have to generate the actual saree (required for the client), I use ControlNet for that.
The problem is that the values in the SD Web UI and in the HF Diffusers pipeline are not scaled the same. Also, the saree details are completely lost in the HF Diffusers result.
At this point, I need some solutions or suggestions. Since I am working on Google Colab (with GroundedDINO + SAM also running), I can't use SD3 with its ControlNet. I also tried a RunPod instance with 24 GB VRAM, but it shows the same problem: CUDA out of memory.
-
Maybe I expressed myself wrong, so I'll try to be more specific: I don't know which image (if any) corresponds to automatic1111 and which one to diffusers, since you presented 4 images, two vertical and two square. Are any of them good? Since you're using SD 1.5 I can't really tell; all of them look really bad to me, but the last one is the worst. Maybe don't jump from SD 1.5 straight to SD3; you can try SDXL. Even with SD3, SDXL and other models you can use enable_model_cpu_offload, which will let them run in Google Colab, and you can also try quantization.
It's really hard to help you with this if you don't provide the values and images you're using with auto1111.
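A minimal sketch of the offloading suggestion, assuming an SDXL inpainting checkpoint (the model ID is only an example; any diffusers pipeline exposes the same method):
import torch
from diffusers import AutoPipelineForInpainting

pipe = AutoPipelineForInpainting.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16)

# Instead of pipe.to("cuda"): keep submodules on the CPU and move each one to the
# GPU only while it is needed, which keeps peak VRAM within Colab limits.
pipe.enable_model_cpu_offload()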
-
All the pictures have their respective names; just take them as a reference. They tell which picture comes from which source.
I am using the same source image and the same control image in both Auto1111 and HF Diffusers. However, the values are different. These are the values I am using.
In HF Diffusers: DDIM scheduler, num_inference_steps = 50, strength = 0.1, guidance_scale = 9, controlnet_conditioning_scale = 0.05, eta = 1.0, control_guidance_start = 0.4, control_guidance_end = 0.5
In Auto1111 SD: strength = 0.75, guidance scale = 7.5, "ControlNet is more important", DDIM sampler
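For reference, a sketch of those diffusers values plugged into the pipeline call (pipe, the prompts, the images and the generator are assumed to be set up as in the reproduction code at the bottom of this thread):
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
output = pipe(prompt=pos_prompt, negative_prompt=neg_prompt,
              image=init_image, mask_image=mask_image, control_image=ref_image,
              num_inference_steps=50, strength=0.1, guidance_scale=9.0,
              controlnet_conditioning_scale=0.05, eta=1.0,
              control_guidance_start=0.4, control_guidance_end=0.5,
              generator=generator)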
-
We don't see the filenames on GitHub; relying on them doesn't work and it makes it harder for us to follow. You should just add a description of each image.
Which one is the source and which one is the control image?
Why are you also using a […]? The same goes for the img2img strength: you're using 0.1 while in auto1111 you're using 0.75. With auto1111 you're making the image really noisy, but with diffusers you're practically using the original image. I told you this before: the reason it's so hard to help you is that you're not providing all the information we need. I recommend updating your initial post with all the information and a correct description of the images, so I can see the source images, the input parameters, and what you're expecting from diffusers.
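A hedged sketch of what bringing the diffusers call closer to the auto1111 settings could look like; auto1111's "ControlNet is more important" mode has no exact one-to-one parameter in diffusers, so the conditioning scale below is only a rough stand-in, and the guidance window values are simply the diffusers defaults:
output = pipe(prompt=pos_prompt, negative_prompt=neg_prompt,
              image=init_image, mask_image=mask_image, control_image=ref_image,
              num_inference_steps=50,
              strength=0.75,                      # match auto1111's denoising strength
              guidance_scale=7.5,                 # match auto1111's CFG scale
              controlnet_conditioning_scale=1.0,  # rough stand-in for "ControlNet is more important"
              control_guidance_start=0.0,         # diffusers defaults: apply the ControlNet
              control_guidance_end=1.0,           # over the whole denoising schedule
              generator=generator)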
-
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
-
Describe the bug
While using the Stable Diffusion Web UI and Hugging Face Diffusers' Stable Diffusion for an inpainting task with ControlNet, I am not getting the desired result in HF Diffusers. I want to get the same result as the Stable Diffusion Web UI.
Reproduction
import torch
from diffusers import ControlNetModel, DDIMScheduler, StableDiffusionControlNetInpaintPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16)
pipe.to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

# generate image (pos_prompt, neg_prompt, generator, init_image, mask_image and
# ref_image are assumed to be defined earlier in the notebook)
output = pipe(prompt=pos_prompt, negative_prompt=neg_prompt, num_inference_steps=45,
              generator=generator, image=init_image, mask_image=mask_image,
              control_image=ref_image, strength=0.5, guidance_scale=8.0,
              controlnet_conditioning_scale=1.0, padding_mask_crop=None)
Logs
System Info
Who can help?
No response