DeepSeek Enhanced Prompting For Flux in ComfyUI
- Published 10 Feb 2025
- DeepSeek. It's a Large language model. Very large. Thankfully, there are smaller, distilled versions available to run at home for FREE, and all you need to enhance any of your workflows is 2 nodes. Nice.
But how well does DeepSeek do when it comes to prompt generation for Flux.1 in ComfyUI? I guess it's time to find out!
Want to support the channel? It’s you who keeps things going!
/ deepseek-r1-flux-12121...
Links:
github.com/com...
ollama.com/
ollama.com/lib...
huggingface.co...
== Beginners Guides! ==
1. A Beginner's Guide to Installing Python - • Anaconda - Python Inst...
2. Properly Installing ComfyUI for Beginners - • Install Stable Diffusi...
3. ComfyUI Workflow Basics - • ComfyUI Workflow Creat...
== More Flux.1 ==
Flux.1 in ComfyUI - • Flux.1 vs AuraFlow 0.2...
AI Enhanced Prompting in Flux - • Flux.1 IMG2IMG + Using...
Train your own Flux LoRA - • How to Train a Flux.1 ...
Speed Boost Flux, SDXL, LTVX or HunyuanVideo with TeaCache & WaveSpeed! - • Speed Boost Flux, SDXL...
Using Detail Daemon in ComfyUI for Extra Details! - • Using Detail Daemon in...
That's a good idea, using DeepSeek to help with the prompt.
Thank you 🙏🏽 I downloaded the WF and it ran very well. Glad to add new tricks to my ComfyUI toolbox.
And thank you 😎
Hahaha ;)) love your style of investigations ! Bravo et merci 🤩 !
Deeeeeep!
The pattern is a Python pattern, right?
I learned a lot from your videos. Thank you!
Thank you for the tip :) I was looking for an integration of DS into Comfy. The best part is that, despite Nvidia's efforts to control VRAM on consumer cards, with Comfy you can offload 45GB or more of model data into system RAM, and if you've got DDR5 the speed isn't actually terrible. That's how I run HunyuanVideo at 720p on my poor 3080 10GB hehe, and now I want to try loading a DS model for the prompt.
4:50 Plush-for-ComfyUI has a 'Remove Text Block' Node that will remove the tags as well.
Awesome, that works well. Thank you!
I recommend using the Q8 GGUF model and loader instead. There are GGUF versions of T5 as well. It's a tiny bit slower at runtime, but you can run more models in parallel with the same quality.
Here is the prompt so people can copy and paste it:
You are playing the part of an expert art critic. From the user input given, you must generate a text description of the required fictional image. Provide at least 6 sentences, but no more than 20. Use only the present tense to describe in professional detail all subjects, objects, colours, textures, designs, styles, lighting, artistic techniques used, styles, positions, etc. along with any other additional details typically used to most accurately describe both the scene, and the emotion to be conveyed, perfectly. Provide the image prompt text only. Use British English. Do not interject, do not include this context in your response, do not ask questions, and do not add any type of metacommentary, do not add any pre- or post-amble, do not include any headers or footers.
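For anyone who wants to test that system prompt outside ComfyUI before wiring up the nodes, here's a minimal sketch that posts it to a locally running Ollama server via the /api/generate endpoint. The model tag (deepseek-r1:8b) and the example input are placeholders, not the exact setup from the video.

```python
# Minimal sketch: send the system prompt above to a local Ollama instance
# and print the enhanced Flux prompt it returns. Assumes Ollama is running
# on its default port and a DeepSeek R1 distill (e.g. deepseek-r1:8b) is pulled.
import json
import urllib.request

SYSTEM_PROMPT = "You are playing the part of an expert art critic. ..."  # paste the full prompt from above

def enhance_prompt(user_input: str, model: str = "deepseek-r1:8b") -> str:
    payload = {
        "model": model,
        "system": SYSTEM_PROMPT,
        "prompt": user_input,
        "stream": False,   # one complete JSON response instead of a token stream
        "keep_alive": 0,   # unload the LLM straight after, freeing VRAM before Flux samples
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(enhance_prompt("a rodent wearing a tiny wizard hat, reading in a candlelit library"))
```

Note that R1 models return their reasoning inside <think>…</think> tags, which is exactly what the regex/node discussion below is about removing.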
Could you do a video on all the different samplers and schedulers and what they are good for / at? I see I get different results with them but I don't really know when to use what and maybe miss out on better results. I also do a lot with images that contain text and am interested in most consistent text output.
Is there an alternative to comfyui_if for the regex node? I currently use Griptape for AI usage (both local and API) but can't find a regex node.
Cool 👍
I don’t really understand using text-generating AI to generate inputs for text-to-image AI. Sure, it can write a very good description. But 1) past a certain point the image generator doesn't really care about your paragraph-long description, and 2) you lose creative control, which is already fairly low with these AIs.
It's because most people do not yet understand how to write prompts for AI images; they think that using all those adjectives and metaphors makes the image better, which obviously it doesn't.
I agree with you that simply relying on these tools to automate everything is the antithesis of the creative process, but at the same time, it's ironic that you don't see the creative potential in it.
Sometimes you just need something to bounce ideas off of, and having an LLM that will give you its own descriptions can help reveal a path you might not have thought of before, like subjects and terminology that you might have missed.
I think of it as a roving scout that I can send out and explore a wider area of subjects and therefore discover more things that are useful to me. That's why it's called creative exploration.
Where can I get a copy of the workflow to try this way please? :)
Not free... it's a Patreon.
@ronbere I can't see any indication that the workflows are on the Patreon; I don't mind paying to save myself the hour of faffing to make it manually, if so. There are some nice bits in there around iterative changes that would finish my weaning off Gemini's god-awful mess lol
@noiceartisan If you’re interested in more than the video, then the teaching materials I create are available via patreon
I see potential for automation: DeepSeek could create the prompt, then Janus could look at the result, and DeepSeek could adjust and improve things ^^ At the very least, Janus could be used as a better Florence in workflows where you want automatic captioning.
Vajanus
The Chibi nodes' text split, returning the second half, is even simpler than regex.
Nothing is simpler than regex! 😉
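In plain Python, both approaches (regex and "split, keep the second half") come down to a one-liner; here is a rough sketch of stripping the R1 <think> block either way, outside any particular node pack.

```python
import re

raw = "<think>Reasoning about the scene...</think>\nA lone rodent stands beneath towering oak shelves..."

# Regex: remove everything between <think> and </think>, including newlines.
clean_regex = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()

# Split: keep only what comes after the closing tag (what a "return second half" node effectively does).
clean_split = raw.split("</think>", 1)[-1].strip()

assert clean_regex == clean_split
print(clean_regex)
```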
I can only hear your videos in Latin American Spanish. What can I do to access your channel in English like before?
What about R1 + Deepseek Janus Pro 7B (feed anything - receive anything, except audio)
Trade-off: you have to use a smaller image generation model, because the LLM takes up memory for formatting prompts.
Or use 0 extra VRAM like in the video 😉
togethervision dude...
Deepseek MoE = VLM
😊👍
does it know what the Taiwanese flag looks like?
Could you make a video about Kokoro.
Cerebras chat with LLama 3 beats deepseek for prompt enhancement/revision.
GPT released its reasoning model a few hours ago. xd
Why don't I see deepseek r1 1.5B in SillyTavern like in the last video? You skipped that part. It runs on Ollama just fine.
Thanks for watching!
@@NerdyRodent Oh, you click connect. I'm dumb.
One other thing to understand: the Chinese language has 3,000 or more characters in its writing system. To think that they are designing their AI models around Western ideologies may not necessarily be correct. I would lean towards them making their AI models more Chinese-centric, taking into consideration how they use the Chinese language. Janus will visually figure out how language is used in their context. Prompt understanding might be wholly unique with DeepSeek.
@juukaa648 Chill, there was nothing racist about it, it's just how AI learning works lol. Stop claiming you're a programmer if you can't comprehend this simple notion. If your dataset is in Chinese, its learning process is going to be different than with an English one.
Ex: if you ask the model to generate a bowl of rice, it might add chopsticks next to it instead of a fork. It's not racist, it's simply the culture, assuming it was trained mostly on Chinese data. Got it?
Will it be able to run the larger models in low vram mode someday?