[Update] AI Image Tagger: added Visual Node Editor, R-4B support, smart templates, and more
Hey everyone,
a while back I shared my [AI Image Tagger project](https://www.reddit.com/r/StableDiffusion/comments/1nwvhp1/made_a_free_tool_to_autotag_images_alpha_looking/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button), a simple batch captioning tool built around BLIP.
I’ve been working on it since then, and there’s now a pretty big update with a bunch of new stuff and general improvements.
**Main changes:**
* Added a visual node editor, so you can build your own processing pipelines (like Input → Model → Output).
* Added support for the R-4B model, which gives more detailed and reasoning-based captions. BLIP is still there if you want something faster.
* Introduced Smart Templates (called Conjunction nodes) to combine AI outputs and custom prompts into structured captions.
* Added real-time stats – shows processing speed and ETA while it’s running.
* Improved batch processing – handles larger sets of images more efficiently and uses less memory.
* Added flexible export – outputs as a ZIP with embedded metadata.
* Supports multiple precision modes: float32, float16, 8-bit, and 4-bit.
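Not the tool's actual loading code, just a minimal sketch of how those precision modes typically map to Hugging Face transformers / bitsandbytes options (the BLIP checkpoint and exact flags here are my assumptions):

```python
import torch
from transformers import BlipForConditionalGeneration, BitsAndBytesConfig

model_id = "Salesforce/blip-image-captioning-base"  # assumption: any BLIP checkpoint loads the same way

# float16 instead of the default float32
model_fp16 = BlipForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.float16)

# 8-bit and 4-bit quantization via bitsandbytes
model_8bit = BlipForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model_4bit = BlipForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16),
    device_map="auto",
)
```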
I designed the pipeline to use an LLM to produce detailed, multi-perspective image descriptions, refining the results over several iterations.
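To make the idea concrete, here's a hypothetical sketch of such a refinement loop; `describe` is a made-up stand-in for whatever captioning call the model node performs, not an actual function from the repo:

```python
def refine_caption(describe, image, rounds=3):
    # First pass: plain detailed caption.
    caption = describe(image, prompt="Describe this image in detail.")
    # Later passes: feed the previous description back in and ask for a revision.
    for _ in range(rounds):
        caption = describe(
            image,
            prompt=(
                f"Previous description:\n{caption}\n\n"
                "Revise it from another perspective: add missing details "
                "and correct anything inaccurate."
            ),
        )
    return caption
```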
Everything’s open-source (MIT) here:
[https://github.com/maxiarat1/ai-image-captioner](https://github.com/maxiarat1/ai-image-captioner)
If you tried the earlier version, this one should feel a lot smoother and more flexible. I’d appreciate any feedback or ideas for other node types to add next.
https://preview.redd.it/4cqaztbdj4wf1.png?width=3870&format=png&auto=webp&s=96dcc926d8a6746c9a2cc8504a93502868850adc
Feedback and suggestions are welcome, especially regarding model performance and node editor usability.
https://redd.it/1oazq7n
@rStableDiffusion
PSA: Ditch the high noise lightx2v
This isn't some secret knowledge, but I only really tested it today, and if you're like me, maybe I'm the one to get this idea into your head: ditch the lightx2v LoRA on the high-noise model. At least for I2V; that's what I'm testing now.
I had gotten frustrated by the slow movement and bad prompt adherence, so today I decided to run the high-noise model bare. I always assumed it would need too many steps and take way too long, but that's not really the case. I've settled on a 6/4 split: 6 steps on the high-noise model without lightx2v, then 4 steps on the low-noise model with lightx2v. It just feels so much better. It does take a little longer (about 6 minutes for the whole generation), but the quality boost is worth it. Do it. It feels like a whole new model to me.
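For anyone wiring this up, here's a minimal sketch of that 6/4 split as two chained samplers (ComfyUI KSamplerAdvanced style). Only the 6 + 4 step counts come from the post; the CFG values, the leftover-noise handoff, and the exact field names are my assumptions about a typical Wan 2.2 two-stage setup:

```python
total_steps = 10

high_noise_stage = {   # Wan 2.2 high-noise model, no lightx2v LoRA
    "add_noise": "enable",
    "steps": total_steps,
    "start_at_step": 0,
    "end_at_step": 6,
    "return_with_leftover_noise": "enable",
    "cfg": 3.5,        # assumption: normal CFG, since the distill LoRA isn't loaded here
}

low_noise_stage = {    # Wan 2.2 low-noise model + lightx2v LoRA
    "add_noise": "disable",
    "steps": total_steps,
    "start_at_step": 6,
    "end_at_step": total_steps,
    "return_with_leftover_noise": "disable",
    "cfg": 1.0,        # assumption: CFG 1, as usual with lightx2v
}
```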
https://redd.it/1ob3uaa
@rStableDiffusion
LucidFlux image restoration — broken workflows or am I dumb? 😅
https://redd.it/1ob1iuo
@rStableDiffusion