Kandinsky 5 - video output examples from a 24GB GPU

About two weeks ago, the news of the Kandinsky 5 lite models came up on here ([https://www.reddit.com/r/StableDiffusion/comments/1nuipsj/opensourced_kandinsky_50_t2v_lite_a_lite_2b/](https://www.reddit.com/r/StableDiffusion/comments/1nuipsj/opensourced_kandinsky_50_t2v_lite_a_lite_2b/)) with a nice video from the repo's page and ComfyUI nodes included. However, what wasn't mentioned on their repo page (originally) was that it needed 48GB of VRAM for the VAE decoding... ahem.

**In the last few days, that has been taken care of, and it now tootles along using ~19GB on the run and spiking up to ~24GB on the VAE decode.**

https://preview.redd.it/5y3bin2aduuf1.png?width=817&format=png&auto=webp&s=7145052efce232663aad7e61166caa694db27636
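
For reference, here is a minimal sketch (mine, not from the post or the repo) of how you could check peak VRAM yourself with PyTorch's allocator stats if you're driving the run from Python; the ~19GB / ~24GB figures above were simply observed during a normal ComfyUI run.

```python
# Hedged sketch: measure peak VRAM around the sampling / VAE decode step.
# Assumes a CUDA build of PyTorch.
import torch

torch.cuda.reset_peak_memory_stats()

# ... run the sampler and VAE decode here ...

peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"peak VRAM allocated: {peak_gb:.1f} GB")
```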

* Speed: unable to implement MagCache in my workflow yet ([https://github.com/Zehong-Ma/ComfyUI-MagCache](https://github.com/Zehong-Ma/ComfyUI-MagCache))
* Who can use it: owners of GPUs with 24GB+ VRAM
* Model's unique selling point: making 10s videos out of the box
* GitHub page: [https://github.com/ai-forever/Kandinsky-5](https://github.com/ai-forever/Kandinsky-5)
* **Very important caveat**: the requirements messed up my ComfyUI install (the PyTorch, to be specific), so I'd suggest a fresh trial install to keep it initially separate from your working install - i.e. know what you're doing with PyTorch (see the version-check sketch after this list).
* Is it any good?: eye-of-the-beholder time, and each model has particular strengths in particular scenarios - also, it's 10s out of the box. It takes about 12min in total for each gen, and I want to go play the new BF6 (these are my first 2 gens).
* Workflow?: in the repo
* Particular model used for the video below: Kandinsky5lite_t2v_sft_10s.safetensors
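
On that caveat: a quick, hedged sketch (not from the repo) for snapshotting your existing PyTorch setup before running their requirements, so you can tell afterwards whether the install swapped it out.

```python
# Sketch: record the PyTorch / CUDA versions of a working ComfyUI environment
# before installing the Kandinsky-5 requirements, then rerun and compare.
# Nothing here is Kandinsky-specific; it only reads torch's own version info.
import torch

print("torch version :", torch.__version__)         # e.g. 2.4.1+cu121
print("CUDA runtime  :", torch.version.cuda)        # None on a CPU-only build
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU           :", torch.cuda.get_device_name(0))
```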

[I'm making no comment on their #1 claims.](https://preview.redd.it/1yxlsop8guuf1.png?width=816&format=png&auto=webp&s=8b1b307b273f9b63e85558f117919908253f781d)

Test videos are below, using a prompt I made with an LLM and fed to their text encoders.

Not cherry-picked either way:

* 768x512
* length: 10s
* 48fps (interpolated from 24fps)
* 50 steps
* 11.94s/it
* render time: 9min 09s for a 10s video (it took longer in total as I added post-processing to the flow). I also have not yet got MagCache working - see the arithmetic sketch after this list
* RTX 4090 (24GB VRAM) with 64GB of system RAM
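
Rough arithmetic on those settings (my sketch, not the repo's numbers - the measured render time comes out a little shorter than steps × s/it suggests, since the per-iteration speed isn't constant over the run):

```python
# Sketch: how the settings above relate to frame count and sampling time.
duration_s = 10       # clip length
native_fps = 24       # model output before interpolation
interp_factor = 2     # 24fps -> 48fps
steps = 50
sec_per_it = 11.94    # reported sampler speed

native_frames = duration_s * native_fps         # 240 frames generated
output_frames = native_frames * interp_factor   # 480 frames at 48fps
est_sampling_min = steps * sec_per_it / 60      # ~10 min, vs ~9 min measured

print(native_frames, output_frames, round(est_sampling_min, 1))
```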

https://reddit.com/link/1o5epv7/video/xk32u4wikuuf1/player

https://preview.redd.it/8t1gkm3kbuuf1.png?width=1949&format=png&auto=webp&s=ce36344737441a8514eac525c1ef7cc02372bac7
