DeepSeek Drops Janus Pro - Vision AND Image Gen In ONE Model
- Published 7 Feb 2025
- Multimodal and Image Gen in one model? Let's check it out!
Vultr is empowering the next generation of generative AI startups with access to the latest AMD GPUs! Try it yourself when you visit getvultr.com/fo... and use promo code "BERMAN300" for $300 off your first 30 days!
Join My Newsletter for Regular AI Updates 👇🏼
forwardfuture.ai
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.ne...
👉🏻 LinkedIn: / forward-future-ai
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
DeepSeek just snatched OpenAi's soul.....
OpenAI never had a soul; DeepSeek is giving back to the world what OpenAI secretly stole from the world while it falsely promised to help the world.
🤣
I wouldn't go that far. It turns out DeepSeek V3 was worse than OpenAI's o1, and yesterday they launched o3... leaving everyone else behind.
@@BlindedByLogic wouldn’t go that far. deepseek is what i’d classify here in new jersey as a mooch
@murc11 DeepSeek V3 is not a “thinking” model. It’s better to compare DeepSeek V3 with GPT-4o, and DeepSeek R1 with o1. And yeah, o3-mini was released, but currently it’s worse than DeepSeek R1 and o1. When the full o3 comes out we will see how well it performs, and also how much it costs.
The name Janus is very appropriate! The god of duality has two faces and looks to the past and the future (hence the transition month is named January), and this fits an autoregressive model that can understand and generate images.
yeah they are really good at naming their models unlike ClosedAI/Grok
I thought it was the word "anus" with a "j" on it. Like the Miley Cyrus song, "J's on my anus," I mean "feet."
@@shazzadhasan4885 Grok is a catchy name. ClosedAI is also apropos.
😂, yeah they know the naming.
Also, Magus at Chrono Trigger.
deepseek is really having a good week
worst week it’s ever had
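It gets attacked by hackers every day, and during the Chinese holiday the staff can't take time off because they have to fend off attacks from American hackers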
DeepSeek should take the name OpenAI from ClosedAI
🤣
I think you're the first person to ever say that, for the millionth time.
And then when they go to closed models, because they are a business with $1 billion worth of GPUs and need to pay for this, maybe a true not-for-profit company can come along and take the name back off them too. Would love to be proved wrong; maybe they have the money to just do this philanthropically
Thanks to them we still got chatGPT early, thanks to them you even have R1 to celebrate today. I'm glad they are openClosedAi, o3 mini is a boss we can't wait to see o3 itself! #callmefanboybelow
OpenAl = PotAI
A DS researcher on Twitter was saying the most exciting part of their New Year festivities (just happened, not the same as our New Year) was watching the R1-Zero curves continuously increase. They are still cooking R1-Zero and it's constantly getting better.
As an IT professional with zero coding experience, I’m amazed by what DeepSeek R1, running locally, is accomplishing; it's truly impressive. The ability to generate code for creating applications that simplify my work is incredible. I believe that soon, certain industries and professionals who assume their jobs are secure may no longer be needed.
if you run it locally the model will be worse bc only the 400b or so model is the real deepseek r1
@@pro12235 This problem might be solved by NVDA's new little supercomputers for $3K each and need 2 or 3 of them if you got the money* (*or you make a youtube channel that NVDA likes, right Matt?)
It's the 4-minute mile phenom. Stand by for dozens more... ;)
More like the 3.5 minute mile at this point.....
Yup, Huawei is saying theirs is better
Are you referring to hyphen-Americans?
Gotta love ❤️ those Chinese bright minds
Bright minds? They distilled other, more established models. It’s like taking parts from different cars and claiming you made an innovative new car. 🙄
@attribute-4677 Where is the evidence? How do you distill from OpenAI or other models that are not open source?
@attribute-4677 Stay mad bozo 🤣
@attribute-4677 Actually, that's what innovation is. Unless you think innovation is about creating stuff from thin air.
@attribute-4677 Get your pill kiddo!
Wow, looks like Deepseek will give OPEN ai deep trouble.
AI has been commoditized.
Ai has been communised
"Always has been👨🚀 🔫👨🚀🌌"
@@tanker7757 you love overpaying for overpriced products that OpenAI made with stolen data, like a good peasant.
A gift for the people of the world!
About time
You can also load this model locally with LM Studio - 7B isn’t heavy.
LM Studio can't output images?
Now that's an innovative model! Chat and vision and image gen in a single model?? That's so weird and so awesome
DeepSeek’s rollin’ up like the Dark Knight, takin no prisoners and throwin shade at OpenAI - big moves, bold plays, straight fire.
It's pretty awesome that everyone now has real access to AI now.
Now AI can talk to everyone, not just The Few.
CCP bot
Yeah because there wasn't already free access to ChatGPT and Gemini.
But then again, this is just a post from a CCP bot.
Janus Pro is an advanced multimodal AI model developed by DeepSeek.
It is designed to handle both multimodal understanding and text-to-image generation tasks effectively.
The model can generate high-quality images from text descriptions and also understand visuals, which is becoming increasingly important in modern AI applications.
Janus Pro-7B outperforms models like OpenAI's DALL-E 3 and Stable Diffusion on benchmarks such as GenEval and DPG-Bench, thanks to its improved training processes, data quality, and model size.
Janus Pro represents a significant advancement in the field of multimodal AI, offering a versatile and powerful tool for developers and researchers working with text and image data.
The Chinese models even have far better names: _"DeepSeek, Qwen, Janus Pro..."_
Meanwhile, the American models are: _"o1, o3, o3 mini, ChatGPT, Llama..."_ 😂
Fantastic! Competition drives progress. Dismiss the unfounded complaints about hoarding GPUs. They've developed open-source models and maintained transparency. It's time to mature and embrace innovation for the greater good instead of acting petulant.
Yeah, just sweep that under the rug 🙄 Who needs facts anyway?
Yes, it's very democratic to decide which countries NVidia can sell its cards to, and how many. At least this AI doesn't need NVidia; Huawei chips are enough.
CCP bot.
@attribute-4677 bullshit ain't facts. sourgrapes won't do you good.
@@illuminated2438 westard bot.
They just keep cooking harder and harder
can't wait to test it out - the deep think showing the process is innovative
crazy for 7b.
last year it was Ai will come takeover your job,
this year is Ai will takeover another Ai job
🤣🤣🤣
Nice presentation and the thumbnail was a good hook.
@@tengdayz2 thanks. Thumbnail was chosen by YouTube lol
By comparison, this makes Sam Altman look like the employee of the month at McDonald's next to a PhD in any field. 😂
my mans is on a roll with all this deepseek hullabaloo
deepseek is killin it right now omg!! 🔥
CCP bot
Did you know AI literally cannot generate images of a wine glass filled to the rim?
try it with any model you like.
One of the best things about Janus Pro is that I don't have to replace my 8GB laptop with a more expensive 16GB laptop. And I don't need to buy video editing software that uses Closed AI, because that will also slow down my 8GB laptop
Janus Pro is something special. I think this marks a paradigm shift. The *native ability* for text/image I/O is cracked - you can “feel” the “telephone game” when e.g. 4o passes a prompt to Dall-E or Groq passes one to Flux.
The fact that it also *knows English* means its prompt adherence is often best-in-class (though, still with that 7b smell). I’m fairly confident that this is just the new direction; and this is the worst it’s ever going to be.
Curious to see how it compares to other VLMs for agentic capabilities (e.g. computer use).
Waiting patiently for LMStudio to add support for this model (I know, I could just do it myself - I don’t want to); and yet-more-patiently to see this paradigm get wider adoption.
DeepSeek is a worthy adversary for OpenAI and partners, but the mistake DeepSeek made was telling the world, including the competition, that they have made it cheaper with better performance. Now think: giant opponents such as OpenAI and Nvidia, which could "afford" an expensive production method for many years, now know of a cheaper one. DeepSeek is heading towards checkmate as OpenAI claims the victory.
it doesn't matter. The Chinese don't work the way American capitalism does, gatekeeping everything from everyone. They build something so that others can build new things upon it.
Someone can try to train their own AI using their own curated dataset. Then another can try to optimize the model, and another can try to implement it in robotics.
Instead of one giant monopoly, DeepSeek will be a giant that many companies can grow on, and surprise surprise, THE BIGGEST, CHEAPEST MANUFACTURERS AND DRONE AND ROBOTICS PRODUCERS, the industries that can benefit the most from AI, ARE LOCATED IN CHINA!
It literally won't matter, DeepSeek seeks to innovate in AI, not compete for market share, all their stuff is open source, including the research and how they got there.
Well except the costs I suppose or the data they used but still.
Deepseek is part of the government like everything else in communist china, so they are more powerful actually.
Usually, when I get a wrong answer, I run the prompt again in another session, to check if the first result was a hallucination.
Would like to see this working with a webcam on intervals, acting as computer vision
All The Chinese did was remind us how wasteful and full of shit we are
CCP bot.
All the Chinese did is remind us they can't invent anything and just steal technology and then pump out low cost or free stuff to engage in economic warfare.
Thanks Matt that was great and thanks for the $300.00 credit...
It would be awesome if you did a video step-by-step on how to get LLMs set up and running on Vultr. I searched your archive and you've got one on Runpod, but Vultr doesn't seem as one click
Janice Pro on personal PCs? Wild. Hope it really outdoes Stability AI and OpenAI. Fingers crossed!
I'm not sure about VULTR or Janus Pro, but the AMD MI300X looks really impressive. A similar price to rent as an H100, only way more RAM!
openAI image intelligence has been around for ages - I gave o1 a few medical images and it answered with possible diagnosis perfectly.
you forgot the words "local" & "7b".
Installing something like this on robots will be really cool, I'm looking forward to it
Hey, loving watching you and learning, I suggest you upgrade your “battery” of tests so they are relevant for today’s beasts.
It does not add to the perception of your professionalism and as a fan, I would not want that.
Enough with the non representative statistically insignificant tests, childish questions the LLM barely feels.
Just as time replaced Snake with Tetris, ask it to model and effectively 3D-animate the solar system and let you move the POV interactively, to teach kids how each celestial object moves in relation to the others and itself.
Please don’t let mediocre testing make you less relevant. You are one of five I watch to stay ahead. Thanks in advance.
Palmberrie
when will it come to ollama?
It's so funny deepseek bringing karma to open-AI I feel they should change names at this point.
it didnt understand the meme like we do, it explained it
you think you did, but you don't understand the meme like i do
it would be Ironic if deep seek renamed themselves Closed AI and stayed open source to show the Irony of Open AI being closed source
Can you make a video showing us how to load these models on Vultr?
In China, which has one of the highest work ethics in the world (maybe too much, but still under Korea), the minds would interpret that Big Company picture differently. The bigger companies do work more systematically on more complex problems. In that picture, it can be complex systematic problem where everyone is in a state of dependency on the one person's operation. Thus, we have to realize that interpretations vary from culture to culture.
5:05 it is a proper definition. It doesn't explain the meme right, but it did clearly define what the picture represents in a positive tone. The executives are actually working as hard as the lower division. Try being a leader of an organization for a week and you'll get what I mean.
"run locally"? Personally I think that running locally means you use your own resources without paying for anything.
just my opinion though. I know there are different interpretations of it.
U R right.
Can it intelligently change the image? All that was shown was its vision, not whether the image gen + vision can work together.
You are not explaining what Janus is, how to get it and how to use it...What is the point of your video??
How do you run vision in ollama? I'd really like to test the model.
exciting times!!!
So much lunch getting eaten right now 😂😂😂...and open source!!
It should generate images continuously to explain its text output, not as a separate prompt.
Let's go deepseek
I agreed with your other video that the demand for chips will continue regardless of Deepseek. Nonetheless I can understand why Sam isn't too happy having just been at the limelight of Stargate.
Yea, there'll be demand for chips, but not just nvidia chips or pricy ones too. Mercedes made the first car.
I think Nvidia's fall is multifactor. Wall of worry: 1 - DeepSeek made significant advances in training, signalling that massive GPU clusters (1M+ GPUs) may not be needed for training. 2 - DeepSeek is doing inference on Ascend, not Nvidia (China does not need Nvidia for inference). 3 - Investors know that DeepSeek's success will force the US gov to implement more Nvidia restrictions, further losing the Chinese market. 4 - When there are products like Cerebras dominating GPUs for inference, you start to see Nvidia is beginning to push uphill.
Some stats that should make American investors question things. Doubao AI (bytedance) is processing over 4T tokens per day. Open AI as per its recent report from the company is processing only 2T tokens per day. Where is all that inference coming from 😉
Are there any step-by-step instructions on how to run those models on Vultr?
ComfyUI instead? 7B isn't too large.
What is the UI application name you are using in the video for DeepSeek Janus Pro ?
I totally get the need to monetize, but it would be awesome if this worked locally. Maybe Forge or ComfyUI could help us out.
OpenAI will come out tomorrow and say DeepSeek distilled its output illegally. Whatever.
Do they have a text to speech? Even with WSL, I can't get torchvision and xcode2 to work with their different torch requirement versions. Sucks
Can this take in an image as part of an image generation prompt? Can you instruct small changes on an existing image? How close can the output mimic the input?
I'm waiting for Anthropic and for Groq to release their new ones. I think they will be 'next level'
Janus doesn’t seem better than anything that was released in the last year. minicpm-v and Flux 1 Schnell is a much better option
Coupon code doesn't work?!
no luck here either, wanted to check this out @matthew_berman is the code expired or just broken for now?
can this be used for image enhancement ie upscaling resolution
5:40 is this really “locally” ? You’re running it on some hosted high end cloud provider
A model you can run locally to me is one you run on your own machine.
You can run it locally. It isn’t any larger than SDXL
You can run it locally if you have some good resources at your home lol.
@@GearForTheYear Really? I can run XL no issue and this swallows my 16GB VRAM
@@marc1190 hm. Probably because it’s fp16. Probably need to wait a bit for a q8 to be released on HF
@ I know you can. Just saying he’s not.
An offsite hosted machine isn’t “locally”. That’s all I’m saying.
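On the fp16 versus q8 point in this thread, a rough back-of-the-envelope sketch may help. It assumes a nominal 7 billion parameters and counts weights only (no activations, cache, or framework overhead), so real usage will likely be higher. The function name and the bit widths listed are illustrative, not taken from DeepSeek's documentation.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / 2**30


# Assumption: "7B" is treated as exactly 7e9 parameters for illustration.
N_PARAMS = 7e9

if __name__ == "__main__":
    for label, bits in [("fp16", 16), ("q8", 8), ("q4", 4)]:
        print(f"{label}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB for weights")
```

Under these assumptions the fp16 weights alone come to roughly 13 GiB, which is consistent with a 16GB card filling up once overhead is added, while an 8-bit quantization would halve that.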
The model's response for the workers image was accurate. If you think about it from the model's perspective, order and hierarchy are a must to spend the least amount of energy for the result, which is one worker and everybody else standing and thinking.
In my opinion, this is bad. Because if AI takes over, it will understand that it doesn't need so many of us because it will do all the thinking. We need to start working on AI teamwork so the AI will want to keep us all😂
I tried to have it generate images of various rooms with furniture etc. Not that great for this particular types of images 😅
How much does it cost per hour to run DeepSeek on Vultr just to chat and ask questions? (I don't know anything about AI or cost) ✨️
Oh wow 🤯😳
How do we localise and train deepseek?
Will you be evaluating Tulu 3 AI? Seems to be performing well against DeepSeek.
Can you use the vision model with an API?
what would u exactly need to run a 7b parameter model?
What is Vultr and how was this used in your demo?
Where is the link for janus pro, or do you need to download it and run on your computer to be able to use it?
Every image gen model can do vision. It's literally how they're trained. Not only that, they can interpret depth maps, outlines, heatmaps, segmentation, etc...
Promo code not working: It says "Cannot add code: Gift code is no longer valid."
This video is great! It would be great to see another LLM, one that is totally equivalent to Janus, generate a better interpretation of the meme.
DeepSeek might be biased against talking bad about large company structures.
I love all this back and forth between all the AI players 😅
Cursor setup video with MCP?
Can you run this locally through something like LM Studio?
What could you use it for?
I see why the model is wrong on the construction photo comparison but i could see somebody saying the big company photos only needs 1 guy digging because they have the expertise to know where the problem is below the ground. Or they know what product their bread is buttered with. The new company could have 10 different holes or directions going and they won't all work out?
What PC software runs something like this?
On the Startups vs Big Companies prompt, I think it thought it got it right based on Chinese culture. There is an emphasis on hierarchical control.
Exactly. Matt missed how the Chinese culture works (How they put the system above the individual, which will, despite being oppressive, be more efficient on a macro-scale). We, those of us who value personal freedom, should not praise Deepseek at all, as it basically is an indoctrinating model subverting cultures not aligning with the Chinese oppressive regime.
I’m seeing major concerns about DeepSeek’s TOS. Have you looked at them?
What's TOS?
DeepSeek the model and DeepSeek the app are not the same. The model is open source and he runs it locally, so the terms of service don't apply to him
Bruh stop with the biased fearmongering bs
This is sheer nonsense. DeepSeek is excellent, and the United States lags behind and lacks the motivation to improve. Arrogantly and quickly sanctioning other countries' chips. In the name of global fair competition, shouldn't this behavior be condemned by the whole world? If you don't have the strength yourself, don't maliciously criticize other countries. Focus on improving yourself.
Nice video. I tried to open an account with your code and link and I didnt get anything. 😟
I received this feedback from them: Thank you for contacting Vultr!
Although the BERMAN300 promotion is no longer available, we have manually added another $300.00 promotion to your account. This promotion expires after 30 days, and any unused credit will be removed from the account at that time. Please note that any charges incurred over $300.00 or after 30 days beginning on March 4, 2025 will be your full responsibility.
youve been able to add images to chatgpt for eons now .. or what am i missing?
Your promo code is not working.
we can't run this inside LM Studio?
How long do the images take to run
Disappointed you have no option to attach files to o3-mini yet
The promo code did not work.
it's christmas!
Why aren't you showing us how fast it actually works? You can't edit out the processing time and then say "okay, that was very fast" (and unconvincingly, I might add).
This is crazy
How to use this and where to use or how to download?