Telegram Web
Media is too big
VIEW IN TELEGRAM
Experimenting with Cinematic Style & Continuity | WAN 2.2 + Qwen Image + InfiniteTalk

https://redd.it/1o4lfru
@rStableDiffusion
Local Dream 2.0 with embedding and prompt weights

Prompt weights and embedding can now be used in the new Local Dreams. This requires re-encoding the CPU and NPU models, but the old ones will still work without the new features.

For more information, see the Releases page:

https://github.com/xororz/local-dream/releases/tag/v2.0.0

https://redd.it/1o4oe69
@rStableDiffusion
What’s everyone using these days for local image gen? Flux still king or something new?

Hey everyone,
I’ve been out of the loop for a bit and wanted to ask what local models people are currently using for image generation — especially for image-to-video or workflows that build on top of that.

Are people still running Flux models (like flux.1-dev, flux-krea, etc.), or has HiDream or something newer taken over lately?

I can comfortably run models in the 12–16 GB range, including Q8 versions, so I’m open to anything that fits within that. Just trying to figure out what’s giving the best balance between realism, speed, and compatibility right now.

Would appreciate any recommendations or insight into what’s trending locally — thanks!

https://redd.it/1o4tzd7
@rStableDiffusion
hunyuan image 3.0 localy on rtx pro 6000 96GB - first try.
https://redd.it/1o4xpxz
@rStableDiffusion
This media is not supported in your browser
VIEW IN TELEGRAM
You’re seriously missing out if you haven’t tried Wan 2.2 FLF2V yet! (-Ellary- method)

https://redd.it/1o55qfy
@rStableDiffusion
This media is not supported in your browser
VIEW IN TELEGRAM
OVI ComfyUI testing with 12gb vram. Non optimal settings, merely trying it out.

https://redd.it/1o59jxt
@rStableDiffusion
Local Dream 2.1.0 with upscalers for NPU models!

The newly released Local Dream version includes 4x upscaling for NPU models! It uses realesrgan_x4plus_anime_6b for anime images and 4x_UltraSharpV2_Lite for realistic photos. Resizing takes just a few moments, and you can save the image in 2048 resolution!

More info here:

https://github.com/xororz/local-dream/releases/tag/v2.1.0

https://redd.it/1o5aaxj
@rStableDiffusion
Kandinsky 5 - video output examples from a 24gb GPU

About two weeks ago , the news of the Kandinsky 5 lite models came up on here [https://www.reddit.com/r/StableDiffusion/comments/1nuipsj/opensourced\_kandinsky\_50\_t2v\_lite\_a\_lite\_2b/](https://www.reddit.com/r/StableDiffusion/comments/1nuipsj/opensourced_kandinsky_50_t2v_lite_a_lite_2b/) with a nice video from the repos page and with ComfyUI nodes included . However, what wasn't mentioned on their repo page (originally) was that it needed 48gb VRAM for the VAE Decoding....ahem.

**In the last few days, that has been taken care of and it now tootles along using \~19GB on the run and spiking up to \~24GB on the VAE decode**

https://preview.redd.it/5y3bin2aduuf1.png?width=817&format=png&auto=webp&s=7145052efce232663aad7e61166caa694db27636

* Speed : unable to implement Magcache in my workflow yet [https://github.com/Zehong-Ma/ComfyUI-MagCache](https://github.com/Zehong-Ma/ComfyUI-MagCache)
* Who Can Use It: 24gb+ VRAM gpu owners
* Models Unique Selling Point : making 10s videos out of the box
* Github Page : [https://github.com/ai-forever/Kandinsky-5](https://github.com/ai-forever/Kandinsky-5)
* **Very Important Caveat** : the requirements messed up my Comfy install (the Pytorch to be specific), so I'd suggest a fresh trial install to keep it initially separate from your working install - ie know what you're doing with a pytorch.
* Is it any good ? : eye of the beholder time and each model has particular strengths in particular scenarios - also 10s out of the box . It takes about 12min total for each gen and I want to go play the new BF6 (these are my first 2 gens).
* workflow ?: in the repo
* Particular model used for video below : Kandinsky5lite\_t2v\_sft\_10s.safetensors

[I'm making no comment on their #1 claims. ](https://preview.redd.it/1yxlsop8guuf1.png?width=816&format=png&auto=webp&s=8b1b307b273f9b63e85558f117919908253f781d)

Test videos below using a prompt I made with an LLM feeding their text encoders :

Not cherry picked either way,

* 768x512
* length: 10s
* 48fps (interpolated from 24fps)
* 50 steps
* 11.94s/it
* render time: 9min 09s for a 10s video (it took longer in total as I added post processing to the flow) . I also have not yet got MagCache working
* 4090 24gb vram with 64gb ram

https://reddit.com/link/1o5epv7/video/xk32u4wikuuf1/player

https://preview.redd.it/8t1gkm3kbuuf1.png?width=1949&format=png&auto=webp&s=ce36344737441a8514eac525c1ef7cc02372bac7

https://redd.it/1o5epv7
@rStableDiffusion
Some random examples from our new SwarmUI Wan 2.2 Image Generation preset - Random picks from Grid not cherry pick - People undermining SwarmUI power :D Remember it is also powered by ComfyUI at the backend

https://redd.it/1o5ivh6
@rStableDiffusion
2025/10/25 06:51:22
Back to Top
HTML Embed Code: