AUDIO: With the automatic audio dubbing from YouTube/Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
Deepseek > o1
I can't wait until we can successfully identify the censorship and underlying biases baked into it and undo them so we can have a DeepSeek which doesn't work for China. DeepThink-R1 is absolutely amazing, but nobody trusts it.
Love it. Maybe ask the models in future tests to simulate answering the question in a way that *completely and confidently misses* the point and explains the opposite!
As far as I've seen, DeepSeek is a very good model for science tasks. I wouldn't use it for ethical or political topics. But for math, logic, or coding I'll test it more.
no shjit lmaooo
Ask Deepseek what happened in 1989
Well, I agree. I asked about China's occupation of Tibet and it was very reluctant to recognize that this was an occupation.
04:12 Isn't that reversal of the gradient at around -1 on SiLU an issue? Would something that behaves like tanh in the negative and continues as a straight line when crossing into the positive work better? Or even something that crossfades into some sort of root in the negative, so it doesn't go close to horizontal too quickly but also takes longer to reach negative infinity? Or just some other asymptotic flattening that can be gracefully crossfaded to the straight diagonal when going positive?
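To make the question above concrete, here is a minimal Python sketch that compares SiLU with the first alternative the comment describes (tanh on the negative side, identity on the positive side). The function names `tanh_linear`, `tanh_linear_grad`, and `find_silu_minimum` are hypothetical and chosen only for this illustration; nothing here comes from the video or from a paper. It only shows where the SiLU gradient changes sign and that the proposed shape keeps a positive gradient everywhere. It says nothing about which one trains better.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def silu(x: float) -> float:
    # SiLU (also called swish): x * sigmoid(x)
    return x * sigmoid(x)


def silu_grad(x: float) -> float:
    # d/dx [x * s(x)] = s(x) + x * s(x) * (1 - s(x))
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


def tanh_linear(x: float) -> float:
    # Hypothetical alternative from the comment:
    # tanh for negative inputs, straight line for non-negative inputs.
    return math.tanh(x) if x < 0 else x


def tanh_linear_grad(x: float) -> float:
    # tanh'(x) = 1 - tanh(x)^2, which equals 1 at x = 0,
    # so the slope is continuous where the two pieces meet.
    return 1.0 - math.tanh(x) ** 2 if x < 0 else 1.0


def find_silu_minimum(lo: float = -3.0, hi: float = 0.0, iters: int = 60) -> float:
    # Bisection on the sign of the gradient: negative at lo, positive at hi.
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if silu_grad(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    x_min = find_silu_minimum()
    print("SiLU gradient changes sign near x =", round(x_min, 4))
    for x in (-4.0, -2.0, -1.0, 0.0, 1.0):
        print(x, round(silu_grad(x), 4), round(tanh_linear_grad(x), 4))
```

Left of the sign change (roughly x ≈ -1.28) the SiLU gradient is slightly negative, which is the reversal the comment asks about. The tanh-based shape never reverses, but its gradient decays toward zero for large negative inputs, so it trades the reversal for saturation.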
DeepSeek R1 can access the Internet for searches whereas OpenAI o1 cannot. This really makes a huge difference IMHO.
My DeepSeek chat app cannot use both DeepThink and search at the same time.
Strange, I just tested the app and I was able to do both at the same time. Maybe it's a staggered rollout. @concernedindian144
My ChatGPT app can't upload a PDF to the o1 model, so I don't understand how he did it. @concernedindian144
I can't get enough of this topic! @1:50
What a great video! And in Spanish, keep it up!
Really good. Thanks for responding to my comment. I am a bit confused that neither model was previously trained on that paper. Is it still possible that they were trained on the preprint of the paper? In any case, I believe the most useful example would be a combination of trained knowledge with something new and not yet seen, like a clinical case of a patient with a known disease (the model being trained on the disease basics), to understand the disease mechanisms and predict the treatment for this specific clinical case, which the model hasn't been trained on previously.
Short note: the models actually don't know the paper. I asked them :). Also, I was able to replicate your results with a Distill version of R1 when I added the paper to the context.
How do you put a PDF into the o1 model? It only accepts pictures.
Can you link the ReLU vs SiLU paper?
I wish AoE would work for small models. I just don't think it will though.
R1 appears to have a very strong political bias when you use it on any policy or business domain.
apparently o1 does not have a bias 😂
Agi live
Locally it tells you
I'm starting to see how the name "autonomy of experts" is actually misleading, and probably more marketing than anything else - because a more accurate name would be "balance of experts", which doesn't sound nearly as enticing and lacks buzzwords.
Chinese is too complicated for computers.
So MoE is just like Minority Report.
Ask Deepseek what happened in 1989
Why not ask what happened to the 36 Indian chiefs under Lincoln, how to make a pair of boots with Indian legs under Washington, how many Indian orphans had been killed by Christian churches? How did the European Americans genocide the welcoming Indians?
"I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses." Even crazier are its thoughts on why it will not answer that question, and how it identifies my intentions when I ask for further clarification.
Should anyone pay 50 times more to get an answer everybody already knows? What a waste of AI.