Shooting Aliens - 100% Qwen Image Edit 2509 + NextScene LoRA + Wan 2.2 I2V

https://redd.it/1o6m23n
@rStableDiffusion
Please, unknown developer, I know you're there
https://redd.it/1o6t3so
@rStableDiffusion
ByteDance FaceCLIP Model Taken Down

HuggingFace Repo (Now Removed): https://huggingface.co/ByteDance/FaceCLIP

Did anyone make a copy of the files? Not sure why it was removed; it was a brilliant model.

From the release:

"ByteDance just released FaceCLIP on Hugging Face!

A new vision-language model specializing in understanding and generating diverse human faces.
Dive into the future of facial AI."

They released both SDXL and Flux fine-tunes that worked with the FaceCLIP weights.

https://redd.it/1o6xiry
@rStableDiffusion
Anyone else use their AI rig as a heater?

So, I recently moved my AI machine (RTX 3090) into my bedroom and discovered the thing is literally a space heater. I woke up this morning sweating. My electric bill has been ridiculous, but I'd just chalked it up to inflation and running the air conditioner a lot over the summer.
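
The back-of-envelope math checks out: essentially every watt the GPU draws ends up as heat in the room. A minimal sketch; the wattage, duty cycle, and electricity rate below are assumptions, not measurements:

watts = 350               # assumed steady RTX 3090 draw under generation load
hours_per_day = 8         # assumed duty cycle
usd_per_kwh = 0.15        # assumed electricity rate
kwh_per_day = watts / 1000 * hours_per_day
print(f"{kwh_per_day:.1f} kWh/day, about ${kwh_per_day * usd_per_kwh:.2f}/day")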

https://redd.it/1o6nhly
@rStableDiffusion
Where to post music and other kinds of LoRAs?

Hey

Just wondering, has anyone been training music models or other kinds of models, and where do you post them?

I'm sitting on a lot of trained LoRAs for ACE-Step and MusicGen and have no idea where to post them.

Are people even training music LoRAs or other kinds of LoRAs? If so, where are you posting them?

https://redd.it/1o72or8
@rStableDiffusion
I built a Nunchaku wheel for CUDA 13.0 that reduces the size by 57%.

whl (Windows only): https://huggingface.co/X5R/nunchaku-cu130-wheel

It works with torch 2.9.0+cu130. To install that build:

pip install -U torch torchaudio torchvision --index-url https://download.pytorch.org/whl/test/cu130

Besides, the cu130 build of torch is itself more than 50% smaller than the cu12x builds. I don't know why.
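
A quick sanity check that the cu130 build is actually active before installing the Nunchaku wheel (a minimal sketch; the expected version strings assume the test build from the index URL above):

import torch
print(torch.__version__)          # expect something like 2.9.0+cu130
print(torch.version.cuda)         # expect "13.0"
print(torch.cuda.is_available())  # should be True on a working setup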

https://redd.it/1o741un
@rStableDiffusion
Where do people train Qwen Image Edit 2509 LoRAs?

Hi, I trained a few small LoRAs locally with AI-Toolkit, and some bigger ones for Qwen Image Edit by running AI-Toolkit on RunPod following Ostris's guide. Is it possible to train 2509 LoRAs there already? I don't want to rent a GPU just to check whether it's available, and I can't find the info by searching. Thanks!
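
One way to check without renting a GPU is to search the repo's example configs for the model name. A sketch, assuming git is installed and that 2509 support would appear in AI-Toolkit's bundled configs (the glob pattern is a guess):

import pathlib, subprocess
subprocess.run(["git", "clone", "--depth", "1",
                "https://github.com/ostris/ai-toolkit"], check=True)
hits = [p for p in pathlib.Path("ai-toolkit").rglob("*.y*ml")
        if "2509" in p.read_text(errors="ignore")]
print(hits or "no 2509 references found")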

https://redd.it/1o74lnd
@rStableDiffusion
Compile fp8 on RTX 30xx in triton-windows 3.5

I've merged the patch that lets torch.compile work with fp8 on Ampere GPUs; let's see how it rolls out: https://github.com/woct0rdho/triton-windows/pull/140
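
A minimal smoke test for the feature (a sketch, not from the PR: it assumes a CUDA build of PyTorch with fp8 dtypes and the patched Triton installed; shapes and names are made up for illustration):

import torch

@torch.compile
def fp8_matmul(x, w_fp8, scale):
    # Dequantize the fp8 weight on the fly, then matmul in fp16.
    return x @ (w_fp8.to(torch.float16) * scale)

x = torch.randn(8, 64, device="cuda", dtype=torch.float16)
w = (torch.randn(64, 64, device="cuda") * 0.1).to(torch.float8_e4m3fn)
scale = torch.tensor(1.0, device="cuda", dtype=torch.float16)
print(fp8_matmul(x, w, scale).shape)  # torch.Size([8, 64])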

I hoped this could be superseded by GGUF + better torch.compile or Nunchaku, but as of PyTorch 2.9 I realized that fp8 + the block swap in ComfyUI-WanVideoWrapper (or ComfyUI-wanBlockswap for native workflows) runs faster and causes fewer recompilations than GGUF + the block swap in ComfyUI-GGUF on my machine.

This is the first feature in the 'core' part (rather than the Windows support code) that's deliberately different from official Triton. It should also work on Linux, but I'm not sure of the best way to publish Linux wheels.

I'm not an expert on PTX, so help optimizing that PTX code is welcome.

triton-windows 3.2.0.post21 is also released, which supports fp8 on RTX 20xx.

https://redd.it/1o75zgt
@rStableDiffusion
16 GB of VRAM: Is it worth leaving SDXL for Chroma, Flux, or WAN text-to-image?

Hello, I currently mainly use SDXL or its Pony variant. At 20 steps and a resolution of 896x1152, I can generate an image without LoRAs in 10 seconds using Forge or its variants.

Like most people, I use the unscientific method of trial and error: I create an image, and 10 seconds is a comfortable waiting time to change parameters and try again.

However, I would like to be able to use the real text generation capabilities and the strong prompt adherence that other models like Chroma, Flux, or WAN have.

The problem is the waiting time for image generation with those models. In my case it easily goes over 60 seconds, which makes a trial-and-error creation method unworkable.

Basically, my question is: is there any way to reduce the times to something close to SDXL's while maintaining image quality? I tried SageAttention in ComfyUI with WAN 2.2, and the times for generating one image were still excessive.
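
For anyone timing this themselves, a simple harness helps keep the comparison honest (a sketch using the diffusers SDXL pipeline as the baseline; the model ID and prompt are just examples, and other models have their own pipeline classes):

import time
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

start = time.perf_counter()
image = pipe("a lighthouse at dusk", num_inference_steps=20,
             width=896, height=1152).images[0]
print(f"{time.perf_counter() - start:.1f} s per image")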

https://redd.it/1o76sa4
@rStableDiffusion