Experimenting with Cinematic Style & Continuity | WAN 2.2 + Qwen Image + InfiniteTalk
https://redd.it/1o4lfru
@rStableDiffusion
Local Dream 2.0 with embedding and prompt weights
Prompt weights and embeddings can now be used in the new Local Dream release. This requires re-encoding the CPU and NPU models, but the old models will still work, just without the new features.
For more information, see the Releases page:
https://github.com/xororz/local-dream/releases/tag/v2.0.0
https://redd.it/1o4oe69
@rStableDiffusion
GitHub
Release v2.0.0 · xororz/local-dream
v2.0.0 is now available.
Support for using parentheses to change prompt weights, with the same format as WebUI's standard, for example (best quality:1.5).
Support for importing embedding files...
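The parenthesis syntax follows the AUTOMATIC1111 WebUI convention: (best quality:1.5) scales the attention weight of the enclosed tokens by 1.5. As a rough illustration of the format only (this is not Local Dream's implementation, which runs in its own CPU/NPU pipeline), a minimal Python sketch that splits a prompt into (text, weight) chunks could look like this:

```python
import re

# Minimal sketch of WebUI-style "(text:weight)" prompt parsing.
# Illustrates the format only; Local Dream's own parser may differ.
WEIGHT_RE = re.compile(r"\(([^()]+):([0-9]*\.?[0-9]+)\)")

def parse_prompt_weights(prompt: str) -> list[tuple[str, float]]:
    """Return (chunk, weight) pairs; unweighted text gets weight 1.0."""
    chunks, pos = [], 0
    for m in WEIGHT_RE.finditer(prompt):
        if m.start() > pos:                      # plain text before the weighted span
            chunks.append((prompt[pos:m.start()], 1.0))
        chunks.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):                        # trailing plain text
        chunks.append((prompt[pos:], 1.0))
    return chunks

print(parse_prompt_weights("a castle, (best quality:1.5), sunset"))
# [('a castle, ', 1.0), ('best quality', 1.5), (', sunset', 1.0)]
```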
Recommendations for Models, Workflows and LoRAs for Architecture
https://redd.it/1o4pgl5
@rStableDiffusion
QwenEdit2509-ObjectRemovalAlpha
https://preview.redd.it/wui233jhqouf1.png?width=2898&format=png&auto=webp&s=f89d9292bdb722433d8e8e77e9a69da04bdf7833
https://preview.redd.it/oeq90ijhqouf1.png?width=1966&format=png&auto=webp&s=2bfae4f0df85aa154eb776ea04e9da5dbb900a5d
QwenEdit2509-ObjectRemovalAlpha
Fixes Qwen Edit's pixel shift and color shift on the object removal task.
The current version was built on a small dataset, which limits the model's sample diversity.
Contributions of more diverse datasets to improve the LoRA are welcome.
Civitai:
https://civitai.com/models/2037657?modelVersionId=2306222
HF:
https://huggingface.co/lrzjason/QwenEdit2509-ObjectRemovalAlpha
RH:
https://www.runninghub.cn/post/1977359768337698818/?inviteCode=rh-v1279
https://redd.it/1o4pi3k
@rStableDiffusion
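For anyone who wants to try the LoRA outside ComfyUI, below is a minimal, untested sketch using the diffusers QwenImageEditPipeline. It assumes a recent diffusers release with Qwen-Image-Edit support, that the 2509 edition works with this pipeline class (a newer edit pipeline class may be needed for 2509), and that the LoRA file from the links above loads through the standard load_lora_weights call; the file name, prompt, and step count are placeholders.

```python
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image

# Hypothetical usage sketch: load Qwen-Image-Edit, attach the object-removal LoRA,
# then run an edit prompt. Paths and prompt wording are placeholders.
pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")

# LoRA downloaded from the Civitai / Hugging Face links above (file name assumed).
pipe.load_lora_weights("QwenEdit2509-ObjectRemovalAlpha.safetensors")

image = load_image("input.png")
result = pipe(
    image=image,
    prompt="remove the person on the left, keep the background unchanged",
    num_inference_steps=40,
).images[0]
result.save("removed.png")
```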
What’s everyone using these days for local image gen? Flux still king or something new?
Hey everyone,
I’ve been out of the loop for a bit and wanted to ask what local models people are currently using for image generation — especially for image-to-video or workflows that build on top of that.
Are people still running Flux models (like flux.1-dev, flux-krea, etc.), or has HiDream or something newer taken over lately?
I can comfortably run models in the 12–16 GB range, including Q8 versions, so I’m open to anything that fits within that. Just trying to figure out what’s giving the best balance between realism, speed, and compatibility right now.
Would appreciate any recommendations or insight into what’s trending locally — thanks!
https://redd.it/1o4tzd7
@rStableDiffusion
You’re seriously missing out if you haven’t tried Wan 2.2 FLF2V yet! (-Ellary- method)
https://redd.it/1o55qfy
@rStableDiffusion
OVI ComfyUI testing with 12GB VRAM. Non-optimal settings, merely trying it out.
https://redd.it/1o59jxt
@rStableDiffusion
Local Dream 2.1.0 with upscalers for NPU models!
The newly released Local Dream version adds 4x upscaling for NPU models! It uses realesrgan_x4plus_anime_6b for anime images and 4x_UltraSharpV2_Lite for realistic photos. Upscaling takes just a few moments, and you can save the image at 2048 resolution!
More info here:
https://github.com/xororz/local-dream/releases/tag/v2.1.0
https://redd.it/1o5aaxj
@rStableDiffusion
GitHub
Release v2.1.0 · xororz/local-dream
Add built-in upscaler, supporting NPU models only. It uses realesrgan_x4plus_anime_6b for anime images and 4x_UltraSharpV2_Lite for realistic photos.
Images with a width or height greater than 1024...
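Local Dream runs the upscale on-device on the NPU, so the snippet below is not the app's code; it is a rough desktop-Python equivalent of the same 4x anime upscale using the reference realesrgan package, with the model path, tile size, and file names as assumptions. A 512x512 NPU generation upscaled 4x lands at the 2048 resolution mentioned in the release notes.

```python
import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# RealESRGAN_x4plus_anime_6B uses a 6-block RRDBNet at 4x scale.
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                num_block=6, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(
    scale=4,
    model_path="weights/RealESRGAN_x4plus_anime_6B.pth",  # path is an assumption
    model=model,
    tile=256,       # tile the image to keep memory use bounded
    half=False,
)

img = cv2.imread("npu_gen_512.png", cv2.IMREAD_COLOR)
output, _ = upsampler.enhance(img, outscale=4)  # 512x512 -> 2048x2048
cv2.imwrite("npu_gen_2048.png", output)
```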
Kandinsky 5 - video output examples from a 24gb GPU
About two weeks ago, the news of the Kandinsky 5 lite models came up on here [https://www.reddit.com/r/StableDiffusion/comments/1nuipsj/opensourced\_kandinsky\_50\_t2v\_lite\_a\_lite\_2b/](https://www.reddit.com/r/StableDiffusion/comments/1nuipsj/opensourced_kandinsky_50_t2v_lite_a_lite_2b/), with a nice video from the repo's page and ComfyUI nodes included. However, what wasn't mentioned on their repo page (originally) was that it needed 48GB of VRAM for the VAE decoding... ahem.
**In the last few days, that has been taken care of, and it now tootles along using ~19GB on the run and spiking up to ~24GB on the VAE decode.**
https://preview.redd.it/5y3bin2aduuf1.png?width=817&format=png&auto=webp&s=7145052efce232663aad7e61166caa694db27636
* Speed: unable to implement MagCache in my workflow yet [https://github.com/Zehong-Ma/ComfyUI-MagCache](https://github.com/Zehong-Ma/ComfyUI-MagCache)
* Who can use it: 24GB+ VRAM GPU owners
* Model's unique selling point: making 10s videos out of the box
* GitHub page: [https://github.com/ai-forever/Kandinsky-5](https://github.com/ai-forever/Kandinsky-5)
* **Very important caveat**: the requirements messed up my Comfy install (the PyTorch, to be specific), so I'd suggest a fresh trial install to keep it initially separate from your working install - i.e. know what you're doing with PyTorch.
* Is it any good?: eye-of-the-beholder time, and each model has particular strengths in particular scenarios - also, 10s out of the box. It takes about 12 min total for each gen, and I want to go play the new BF6 (these are my first 2 gens).
* Workflow?: in the repo
* Particular model used for the video below: Kandinsky5lite_t2v_sft_10s.safetensors
[I'm making no comment on their #1 claims.](https://preview.redd.it/1yxlsop8guuf1.png?width=816&format=png&auto=webp&s=8b1b307b273f9b63e85558f117919908253f781d)
Test videos below use a prompt I made with an LLM, feeding their text encoders:
Not cherry-picked either way.
* 768x512
* length: 10s
* 48fps (interpolated from 24fps)
* 50 steps
* 11.94s/it
* render time: 9 min 09 s for a 10s video (it took longer in total as I added post-processing to the flow). I also have not yet got MagCache working - see the quick arithmetic check after this list.
* 4090 24GB VRAM with 64GB RAM
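A quick back-of-the-envelope check of those numbers (my arithmetic, not the poster's): 50 steps at 11.94 s/it is about 597 s of pure sampling, roughly ten minutes and in the same ballpark as the quoted render time, and a 10 s clip at 24 fps is 240 generated frames, doubled to 480 by the interpolation to 48 fps.

```python
# Back-of-the-envelope check of the reported Kandinsky 5 settings
# (not from the post itself; just sanity-checking the numbers).
steps, sec_per_it = 50, 11.94
duration_s, base_fps, out_fps = 10, 24, 48

sampling_s = steps * sec_per_it        # ~597 s, i.e. roughly 10 min of sampling
base_frames = duration_s * base_fps    # 240 frames generated by the model
out_frames = duration_s * out_fps      # 480 frames after 2x interpolation to 48fps

print(f"sampling ~{sampling_s / 60:.1f} min, {base_frames} -> {out_frames} frames")
```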
https://reddit.com/link/1o5epv7/video/xk32u4wikuuf1/player
https://preview.redd.it/8t1gkm3kbuuf1.png?width=1949&format=png&auto=webp&s=ce36344737441a8514eac525c1ef7cc02372bac7
https://redd.it/1o5epv7
@rStableDiffusion
Some random examples from our new SwarmUI Wan 2.2 Image Generation preset - random picks from the grid, not cherry-picked - people underestimate SwarmUI's power :D Remember it is also powered by ComfyUI at the backend
https://redd.it/1o5ivh6
@rStableDiffusion