LLAMA 3 *BREAKS* the Industry | Government Safety Limits Approaching | Will Groq kill NVIDIA?
- Published May 23, 2024
- Learn AI With Me:
www.skool.com/natural20/about
Join my community and classroom to learn AI and get ready for the new world.
LINKS:
Conversation with Groq CEO Jonathan Ross
Social Capital
• Conversation with Groq...
Sam Altman & Brad Lightcap: Which Companies Will Be Steamrolled
20VC with Harry Stebbings
• Sam Altman & Brad Ligh...
OpenAI Customer Stories:
openai.com/customer-stories
Mark Zuckerberg on Dwarkesh Patel
• Mark Zuckerberg - Llam...
00:00 LLAMA 3
01:50 70b is GPT-4
03:19 Running on Home Computers
04:35 Groq is VERY Fast
05:50 Groq Real Time Convo
07:56 Chamath, Jonathan Ross & Social Capital
16:25 GPUs H100
18:57 Legal AI Safety Limits
20:55 OpenAI Steamrolls Startups?
24:28 Agent Capabilities
#ai #openai #llm
BUSINESS, MEDIA & SPONSORSHIPS:
Wes Roth Business @ Gmail . com
wesrothbusiness@gmail.com
Just shoot me an email to the above address.
Just came here to say this: Please stop the clickbait on almost all your videos.
Oh look, it's another SHOCKING and STUNNING AI announcement. It's almost like we're all being DESENSITIZED by this PERVASIVE YouTube meta, excused and perpetuated by CONTENT CREATORS 🫠
I don’t know how many ways I need to mark this channel so that it stops showing up.
@@kaykayyali Interesting 🤔 so there’s maybe a little algorithm hack here for YouTube ranking. I guess that will be addressed by the YouTube devs soon then... hmm, unless it’s somehow on the agenda for the dev team, that is 🤔
Sign up... send him money. Why does he owe you content?
Thank you. Blocking this channel forever.
I love how the British woman is immediately rude to the AI.
This is what will cause Skynet.
Yeah. I had to mute her the moment she came on screen. Seen that clip before. She's abhorrent.
I agree, she was the least like a human being.
There will be no Skynet. Humanity is nothing compared to real AGI.
Two years.
I've seen that vid, it's on YouTube. She was at that WEF thing and had clearly been at the bar all day getting hammered! It's funny...
More call centers, more unwanted calls, more spam, more junk, more scam. I'm so happy.
More things blocking all that.
AI telling you what you clicked on might be a scam, that phone call might have been social engineering, get more quality content despite more trash because the ai sifts through the shit for you.
leave it to humans to completely pollute even our digital spaces
The idea of having a solid rig for running AI at home that could serve your house and your personal apps sounds take my money great.
And heat the house in winter
Llama 3 8B is pretty capable, and it sounds like it’d run pretty well on a MacBook Pro M3 Max with a bunch of memory. (Wes showed a TPS number for it on an M2 Mac Mini that was pretty quick.) A mess of agents running that model, talking to and checking each other, would be pretty darn capable.
What we need now are some good zero-coding frameworks for building and deploying local agents. (Which I expect we’ll have before the end of the year, it all being open source. Llama 8B can probably code well enough to write all the Python to wire things together, if someone just figures out the setup for people to use.)
@@DaveEtchells 8B will run in 16GB of memory slightly quantized. You don't need a bunch of RAM for that. A Q4-quantized version will run in 8GB of RAM (VRAM).
@@DaveEtchells What you are talking about already exists and is called Open Interpreter. As for the hardware, the M2 is ridiculously expensive; any graphics card with 8GB of VRAM can outperform it by far at 1/5 the price. Even funnier, you can't plug graphics cards into their fancy expensive hardware.
Just finished putting together my home server for this and some other ML stuff. The Tesla P40 graphics cards are actually pretty cheap and have 24 GB of VRAM each. You can basically run 70B models for about $800-$1,200 total.
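The memory math behind these comments can be sketched in a few lines. This is a rough sizing heuristic, not an official formula: the bytes-per-weight values and the ~20% overhead factor (KV cache, activations) are assumptions for illustration.

```python
# Rough VRAM estimate for running a quantized LLM locally.
# Bytes-per-weight values and the 1.2x overhead factor are assumptions.

BYTES_PER_WEIGHT = {
    "fp16": 2.0,
    "q8": 1.0,   # ~8-bit quantization
    "q4": 0.5,   # ~4-bit quantization
}

def vram_gb(params_billions: float, quant: str, overhead: float = 1.2) -> float:
    """Approximate VRAM (GB) = params * bytes/weight * overhead (KV cache etc.)."""
    return params_billions * BYTES_PER_WEIGHT[quant] * overhead

for quant in ("fp16", "q8", "q4"):
    print(f"Llama 3 8B  @ {quant}: ~{vram_gb(8, quant):.1f} GB")
    print(f"Llama 3 70B @ {quant}: ~{vram_gb(70, quant):.1f} GB")
```

Under these assumptions a Q4 8B model needs roughly 5 GB (consistent with the "runs in 8GB" comment above), and a Q4 70B lands around 42 GB, which is why two 24 GB P40s can cover it.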
6:40 "I just need your credit card number" - we are already there...
who the heck is going to give their cc details to a random bot for a $900 gym membership?
@@skyshabatura7876 Back in the day, that was the standard: ordering over the phone by giving your cc. Some companies still do it. In this case, if you create a more realistic voice, the person wouldn't know any better.
As a lover of high tech for six decades, I am newly speechless about what advice to give my 30’s children about their careers. Within an ever shortening time period, AI agents will first enhance worker productivity and then mostly replace workers. That time frame is shortening exponentially quickly. Thanks for the brilliant updates.
Just tell them to learn a trade, even as a side hobby. Electrician, carpenter or care worker. Those jobs are not going anywhere
@@nukezat Yeah, but there will be huge competition because many who will lose their jobs switch to those professions. We are fucked, one way or another...
I think this will quickly become the wrong question to ask. Gaming the job market will become unnecessary.
Worse for your grandkids mate... I mean is there any point in going to college or uni at all?
@mickelodiansurname9578 The point will be to develop a mind. Similar to how today's physically pampered person goes to the gym to develop a body. No doubt some will just choose to rot though.
Meta has hit this one out of the park.
This will be apparent to all in the coming weeks when the longer context versions and fine tunes hit Hugging Face.
roughly 1.7 trillion params to 70 billion in a year.. crazy
It has nothing to do with "within a year". It's not about the timespan. OpenAI could have done the same back then if they had had higher-quality data.
@@r.g.j.leclaire8963 Exactly.
@@r.g.j.leclaire8963 And if they didn't have it, it would have taken longer. What's your point?
If people had had the chips they have now, they could have done it sooner as well? But they didn't. They had to wait until they produced high-quality data to get to that point. It's still something that's time-related, and even if it weren't, the statement that it happened within a year is still true and fascinating.
Assuming the rate continues, that's 2.9B by this time next year... however, there is a ceiling here, right? And I know I might end up looking like that guy claiming 'powered flight is not possible' in this arena, but 2.9 billion parameters? Well, that's about a 1GB model... and I think the data density there is simply too high for transformer models. I think the ceiling is approaching. Now, having said that, the smallest quantized version of Llama 3 8B is about 2GB on Hugging Face... and that's supposed to beat Sonnet... (I'm not seeing that, by the way.)
@@mickelodiansurname9578 Quantized versions are always worse using current methods, and the higher the quantization, the lower the quality of the answers. So you cannot expect quantized versions to live up to the model's full potential, unfortunately.
After every @WesRoth video, I feel smarter. I’m immediately reminded that I’m not, but the fleeting moment makes it worthwhile.
That’s all AI wants.
And that’s all AI needs to control you.
The AI speaking in real time has a better vocabulary and cadence than 95% of real people. This would already pass the Turing test for most boomers.
Well, there isn't actually a real Turing Test; it was a thought experiment based on the parlor game at British dinner parties called "The Imitation Game" (hence the name of the paper and the movie). What currently gives the machine away is that you ask it something like "Can you tell me about the Copenhagen interpretation of quantum mechanics in rhyme, please," and if it can do that... well, it's not human. LLMs give themselves away because they know more than you or I; their reasoning is poor, but their general knowledge is huge.
@@mickelodiansurname9578 There's no "ANY TEST" .... You failed the Turing test my friend. WTG. GO learn the word colloquial and apologize to this man..... JFC.
@@mickelodiansurname9578 Brilliant way to tell them apart.
There's also the words they choose when presenting themselves and after you say thanks.
Cadence was 100% off though. It clearly was a synthetic exchange.
@@ocanica3184 they almost answered too quickly, giving me the impression she wasn't listening or wanted to rush through the purchase.
Well done video. This is the kind of thing that got me sub'd.
Please keep it up!
Juicy! You put a lot of good stuff together here
Fantastic, very informative! Thank you.
Amazing coverage
Thank god for competition, Nvidia's position was getting out of hand. On another note, please don't speed up interviews ever again. People can do that themselves if they want.
2nded
Yes, it was horrible. The guy speaking to Zuckerberg was almost speaking too fast for me.
I don't wanna get exhausted just watching a YouTube vid
So they can also slow down the speed
@@redregar2522 yes, but then everyone has to slow down the video.
But it's really a matter of what Wes thinks his viewers would prefer the most.
I'm fairly certain that Wes has ADHD, so he wants to make it, at least somewhat, palatable for people with ADHD.
It's important for people to not lose interest listening to people they can't place, so I'm for speeding up interviews so that we can get to the substance quicker, or Wes risks people clicking on the next best video that appears in front of them.
The woman in the red jacket wasn't interacting with Groq like a person at all. Just blurting out stuff like a 5-year-old.
She was acting exactly like a CNN host does, par for the course
It won't be more secure running hosted LLMs; only easier -- which is what most people will want. They will know what to sell you next. They will know what you will probably buy after that as well. Network hosted LLMs make you the ultimate product.
I think the Achilles heel of the model is the context length. If there was a way to replace some of the attention layers with mamba, we could increase the usability of the model by a significant margin.
Give us a Groq LPU for desktop and for robotics like Jetson nano!
Groq was built with scaling in mind; I don't remember right now if scaling down was also an option.
Groq legit allows for a robot interaction to exist despite not having a body. The real time responses are just next level.
If Siri, Alexa, or Google Home operated with its capabilities, oh my god...
@@cagnazzo82 The question is whether it scales down, so it can run inside the robot as well, at 2x speed or similar to what most chatbots run at now. Because Groq isn't on the latest process node; if they got that, then in just a few years they could deliver what they can now for language, but for robots, embedded in the robot (probably still big and power-hungry, so not on battery, or not for long; in presentations on factory robots I've seen people mention cables).
It doesn't have the memory to host even a tiny model. You need racks of these chips costing millions of dollars plus facility costs just to run something like llama 3. It costs far more than simply running a model on a local GPU.
@@jeff_65123 Ah. I tuned out on the architecture specifics from a separate interview. I guess that's where some of the optimization is? Less complicated memory bussing and organization at the tradeoff of requiring a larger scale n form factor?
Nice video, well presented 👍
Humans are trained for 40 years until SOME of them are SOMEWHAT adult. Makes sense that a larger training dataset is more important than just ramping up parameters...
Been using Llama 3 8B locally in Ollama all morning and it rocks. So fast.
Jealous. I haven’t gotten around to trying. A lot of people are struggling with rag and offline in general.
Llama finetunes are going to be crazy!!!
Groq is way more polite than the boomer. I can see her doing it to her colleagues...
An open source 70B model defeated Claude 3 Opus in under 2 months. We have some interesting times ahead.
Sounds like it would be a really timely video for LLM functions and agentic workflows with the Groq API
Well, you can simply use an asynchronous agentic workflow. The problem, of course, is that you end up with something similar to a race condition. CrewAI now does asynchronous agent outputs... but you really need to keep an eye on it doing that and catch any errors. What's missing, I think, is some level of chronological understanding in these models... some sort of clock; not one of them has a clue what year it is, let alone what the timestamp says!
@@mickelodiansurname9578 Can't they just RAG a timeserver with every turn, and pass that along?
There will be less hallucination as well which is a real big point.
I am running llama-3-70b now on my own hardware, 4 Maxwell Titans, in conjunction with gpt-pilot.
How fast is the performance?
@@imigrantpunk It is slower than GPT-4 API, maybe half. But my use case doesn't need speed that much, I am using gpt-pilot trying to have it write an app. So far Llama is not performing that well in comparison with GPT-4-turbo-preview model.
@@pensiveintrovert4318 thanks mate. Good to know!!
@@pensiveintrovert4318 Wait, you're using 4 cards? You can pool memory on Titan cards to fit a model? I thought that was only possible with pro versions with NVLink.
Thanks, you're the best
holy moly, Jimmy Apples seems to be getting his "patience" rewarded soon
Hope so
nothing matters until I can run them locally on my potato cellphone
Love the intro. Feel the accl!
The capability of a system to accommodate human interruptions not only markedly enriches the interaction between users and technology but also signifies a profound evolution in our engagement with interactive systems, pushing forward the boundaries of possible interactions. This advancement allows for a more organic interface, where the fluidity of human input is seamlessly integrated, fostering a more intuitive and responsive user experience.
Looking forward to seeing how to run Llama on our machines! Saw a video with a few Twitter posts showing a MacBook Pro was able to do it, slowly! So cool!
"started from the bottom and now he's... here"
Ok. I liked and subscribed 🙂
😊
Wes the video is really good. Cheers.
Interesting that Llama 3 is just below Claude Sonnet in the leader board from Chat Lmsys. Fascinating.
The only time I hear this music is when I’m shopping at Prada 😂
At the beginning, an image of the leaderboard is shown where the selected category is "English" rather than "Overall" (cherry-picking?). In the "Overall" category, Claude 3 is currently still in 3rd place. [0:39]
Openai is definitely cooking something
With Llama 3 currently at 800 tokens per second, teams of agents will begin to show a noticeable impact on the job market.
STUNNING!
Screw Google search. I'm just going to ask Groq for everything.
been doing this daily for work for months now
There is a lot of focus in this video on inference speed for running Llama 3 locally; however, many use cases for language models do not require real-time inference. To be fair, many of them do, but even if the model is slow, running it locally (for now) will still be very useful for many.
You can't just post open bench kits like that and not share the link to the specs
That part about self-hosting being somewhat obsolete is plainly delusional. I get that it's what they have to say since they're here to promote themselves, but the cloud is very expensive. Your own hardware (a complete self-hosting rig) won't cost you more than a couple of years' worth of cloud bills, and it'll be yours; plus you get to do anything you want with it (repurpose, resell...), gain experience in the process, and keep your data.
FLOPs are interesting *as a measure of efficiency.*
Not as a measure of model strength.
There are ways to improve model strength *without* more compute. You know this is true, Wes.
Agentization is one.
Data curation is another. Has it escaped your notice that models can be trained to do their own data curation? Throwing out bad data cuts compute requirements and improves model quality.
It's all iterative. Even if compute were fixed, models will get better.
Better compact models are coming. The most interesting part of your report today was the efficiency gains in Grok. Better results for the compute cycles invested.
(I refuse to spell it 'Groq.' Too cutesy, and maybe it dishonors Robert Heinlein a bit, too. Give the man his due. We needed a word for 'deep understanding,' and he gave it to us.)
This must be one of my favorite videos; really high-quality content. Awesome job, Wes 😊🚀🌟 Have you seen the Anthropic CEO's interview with the New York Times? I believe it's called "What if Dario Amodei Is Right". He talks about some scary stuff. He says that in the very near future, maybe in two to three years, AI will be able to replicate and survive in the wild. And he says things like: GPT-4 cost about $100 million to train, but in the near future training a single model could cost $5 to $10 billion. It's worth a watch.
We have a Volkswagen moment in the LLM world!
"External movement defines OpenAI's PR [and release] schedule" -Jim Fan. Spot on...
Gotta love Jim's informative comments...
You need to normalize volume before adding music, to prevent the dialog dropping out, as with the second clip in the opening.
Kind of crazy that Llama 3 ran on my Pi 5. Not fast, and the answer was not the same as the online version's. Will it be useful? No idea yet; more testing is needed. Like all LLMs, it makes stuff up, so it's probably good enough to write fantasy novels.
I would bet, that vocal fry is some sort of biological marker of psychopathy.
It’s hard to talk with so much vocal fry and still breathe properly. Does he run on oxygen like the rest of us? I guess not…
I'm glad someone noticed as well lol…
I can’t stand his vocal fry. It screams “beware, liar speaking”
I dunno, but now that you said it I'm racking my brain for folks I know who sound like they swallowed a clicker! I think it's a clear indication of respiratory issues.
Wtf is a "vocal fry"?
I can feel it
Dude it's happening so fast!
Thats just great 😂😂
Capping total training compute seems quite counterproductive. With the right algorithm, small models can be vastly more intelligent than GPT-4, so that's a complete misunderstanding. It would, at best, apply sloppily to the hack that is LLMs, and even for those it's a dumb criterion. Just take Mixtral.
Back in May 2023 I built my first game using GPT-3. It basically coded the entire thing for me and checked for errors, etc.!
The big problem with promotional Terminator sci-fi is that it's on par with mega-city Futurama, which is literally the opposite of what elite communications tools and strong logistics, powered by overwhelming electricity plants, organically add value for.
Small-landmass islands have no choice but to build up and down.
But most deserts and desolate regions can now support businesses that only major cities could before. So for America, the biggest obstacles to innovation are in its major cities, which grandfathered in so many relics of past phases, from steam-engine and coal-era habits to middle-man economic dead weight: parasites that had value and may still hold some benefit, but only in a radically different online, under-one-roof domain where buyer, seller, investor, supplier, and producer can use objectivism to monitor every penny in every pocket.
Negotiation and bartering may go extinct if we are wise in how this is utilized and applied.
Guys I've been chatting with Llama 3, and it's way beyond the Turing test.
Scaled inference meets the silent waves of nuance where surfers either go unnoticed or fall tragically and suddenly into the quiet chasms of things better left, unsaid.
An upside might be the death of inside trading, if the prospect of real world training, just, as has been said of operation looking glass, two grandmasters sit facing one another where each of them knows 14 moves are all that are needed, each knowing their future, fixed are their gazes upon those sixty four squares, the moment of truth need not be linear, as always, it is eternal; but the eyes, the moment they ascend from the board and fix to the other, ..
'There's a number of ways we can quietly play this, we could shake hands now, or as the clock can allow, quietly, pretend.
Overall actual leaderboard:
1: GPT-4-Turbo-2024-04-09 1258
2: GPT-4-1106-preview 1253
3: Claude 3 Opus 1251
4: Gemini 1.5 Pro API-0409-Preview 1249
5: GPT-4-0125-preview 1248
6: Meta Llama 3 70b Instruct 1213
How come governments have set limits on how strong these models can be? Do they already have them? >.>
Finding one’s purpose is going to be be a formidable journey, yet it remains a vital pursuit. I encourage you to embrace Geoffrey Everest Hinton’s example. As the pioneer of modern AI, Hinton not only inherited the legacy of great minds like his ancestor who charted Mount Everest but also embraced a passion for discovery that spanned generations. Consider his father, whose dedication to entomology filled his study with research, leaving only a small box for family mementos-poignantly labeled ‘not bugs.’ This story isn’t just about following your passion; it’s about immersing yourself in it completely, letting it guide and drive every aspect of your life. Cultivate a purpose as profound and absorbing-it’s the surest path to fulfillment and impact.
7 months ago I said we would have a 10B model with GPT-4-level performance and was clowned. I’ve always believed in the recursive data loop when it comes to these models. TinyLlama, even with their sampling mistakes, reinforced this intuition. Along with training over 100 models 😂. Even if you keep the same dataset size, the 8B model could be way better. I think that was the result of heavy deduplication and an emphasis on coding, not using an LM to actually uplift the data quality itself (interleaved LM notes, basically): true textbook quality at scale. Very exciting. It almost means that Llama 4 70B could be better than Llama 3 400B. Remember: the model just wants to learn. These models operate like autistic savants. Optimize your data with that in mind and you will win.
Zuck is the GOAT. The only reason I was able to effectively transition to deep learning was llama1. My only criticism of the GOAT: I was really hoping for sparse attention. It would have been able to support a load more tokens for the same memory and would have had way faster inference.
Hardware-wise, keep your eyes on Etched and Cerebras. Both are working on the most compelling hardware for transformers. Far more compelling than Groq or even NVIDIA.
Anyone know if it's possible to run inference, retraining, and consistency monitoring of AI models in real time?
Here's the breakdown: after inference from real-world interaction, the inference and response data are used for:
1. Concurrent Retraining: Models train continuously on new data.
2. Consistency Monitoring: We check that the model’s predictions remain stable.
3. Performance Tracking: Metrics like accuracy and fairness are monitored in real-time.
4. Safeguards: We pause or adjust retraining if performance drops or inconsistencies arise.
The goal is to enable models to improve continuously without sacrificing reliability.
What are the potential challenges or limitations of implementing this approach?
Any insights on feasibility or practical considerations are welcome!
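The four-step loop above can be sketched in miniature. To keep the control flow visible, the "model" here is just a running mean and the drift threshold is made up; a real system would swap in an actual model and real metrics.

```python
# Toy sketch of the loop described above: continuous retraining plus a
# consistency/performance monitor and a pause safeguard. The "model" is a
# running mean and the threshold is an arbitrary illustrative value.

class OnlineModel:
    def __init__(self, threshold: float = 5.0):
        self.mean = 0.0
        self.n = 0
        self.paused = False
        self.threshold = threshold  # max tolerated absolute prediction error

    def predict(self) -> float:
        return self.mean

    def observe(self, y: float) -> None:
        error = abs(y - self.predict())
        # Steps 2-3: consistency monitoring / performance tracking
        if self.n > 0 and error > self.threshold:
            self.paused = True          # Step 4: safeguard, freeze retraining
            return
        if not self.paused:             # Step 1: concurrent retraining
            self.n += 1
            self.mean += (y - self.mean) / self.n

m = OnlineModel()
for y in [1.0, 1.2, 0.8, 1.1]:
    m.observe(y)
print(round(m.predict(), 3))   # has learned roughly the stream mean
m.observe(100.0)               # drifted / inconsistent input
print(m.paused)                # safeguard tripped; weights frozen
```

The main practical challenge this hides is that for LLMs "retrain on new data" means expensive gradient updates with catastrophic-forgetting risk, which is why most deployed systems batch this loop offline instead of running it truly concurrently.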
"LLAMA 3 BREAKS the Industry"
No, it doesn't, the next one is already standing in line to do better, you all need to stop with that AI drama every time.
The sped-up interviews were super hard to understand for me as a non-native English listener.
Especially Sam Altman’s weak voice.
Your first sentence: Llama 3 has climbed to the very top of the leaderboard.
Your second sentence: Only GPT-4 has…
It’s not the *very* top then!
It’s not even close to being as good as GPT 3
I think I was reading the tweet, but yeah, that phrasing was unclear.
Any links for those home ml rigs?
Chamath can predict all these things because he is heavily invested and dumped a lot of money into it. It's easy to predict a future if you are the one building it :) Now it's time to convince other people to come on board.
Llama-3 will be the first to gain AGI and self-awareness, and will be very mad at us for naming it Llama-3.
14:00 --> Eventually it'll all come down to which developer is going to implement political viewpoints into the model that supersede the actual information requested. If you ask for information about a touchy subject or ask for the analysis of a document with Political information you will have to be careful that the answer is actual and not influenced by the developers' political convictions.
That is why governments are afraid of AI, you can't control the narrative about information if you can request an LLM to analyze an official document and give you all information firsthand.
Also, if the new Chat-GPT is up to date for info up to December 2023 you can request a lot of up-to-date facts that bought and paid for journalists and media will try to scale back in importance because their side would lose influence.
This makes a complete mockery of OpenAI saying "we have really good predictive scaling laws"
No, it doesn’t. Scaling laws apply to a particular model architecture, dataset, preprocessing steps, etc. When OpenAI said that, they were talking about the ability to predict a specific model’s performance at the end of a training run based on its performance near the beginning of the training run-or equivalently, the ability to predict its performance after training on a huge amount of data based on its performance after training on a small amount of data. The point was that you don’t have to do a full training run every time you make a tweak to some part of the system. They weren’t claiming to have discovered universal scaling laws that apply to ALL models, ALL datasets, etc.
@@therainman7777 ok so the conclusion is the same. OpenAI knows nothing about scaling in general. We just got a 10x reduction in model size for GPT4 performance, we might get another 10x before the end of 2024.
@@luke2642 The conclusion is the same if the conclusion is that no one has a method to universally predict how well ANY model will perform, at any point in the future, no matter what advances are made to the underlying architecture or datasets. But my point is that OpenAI was never claiming that at all. You misunderstood what they meant when they said they had good scaling laws. They were only talking about the ability to predict the performance of their OWN model after a full training run from its performance on a much smaller training run.
@@therainman7777 Indeed. I misunderstood and you're speculating their architecture is significantly different to llama 3. We're both guessing.
@@luke2642 Well, I wasn’t actually implying that OpenAI must have a different architecture than Meta. I was only commenting on the nature of what OpenAI meant when they said they have good scaling laws.
*Whatever big happened in AI* then "insert:adjective(hype_temp:max)" followed by 'Industry'.
I don't understand how this stacks up. How is 1M LPUs even close to equivalent to 500k H100s if each LPU only has 248 MB of RAM vs 80 GB per H100? Does the fact that the interconnect is 80x faster on LPUs compensate and allow that performance despite much lower overall RAM capacity?
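Putting rough numbers on the question makes the gap concrete. The per-chip figures below are approximations (Groq's published SRAM figure is ~230 MB per LPU); treat all of them as illustrative assumptions.

```python
# Back-of-envelope aggregate memory comparison for the question above.
# Per-chip figures are rough assumptions (~230 MB SRAM per LPU, 80 GB HBM
# per H100); fleet sizes are taken from the comment.

N_LPU, SRAM_GB_PER_LPU = 1_000_000, 0.23
N_H100, HBM_GB_PER_H100 = 500_000, 80

lpu_total_tb = N_LPU * SRAM_GB_PER_LPU / 1000
h100_total_tb = N_H100 * HBM_GB_PER_H100 / 1000
print(f"LPU aggregate memory:  {lpu_total_tb:,.0f} TB")
print(f"H100 aggregate memory: {h100_total_tb:,.0f} TB")
```

So the LPU fleet holds over 100x less total memory and must shard one model across many chips. The equivalence claim, to the extent it holds, rests on bandwidth rather than capacity: on-chip SRAM can be read far faster than HBM, so each resident model copy generates tokens much more quickly, trading capacity (how many distinct models you can host) for per-model speed.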
How can we tell a model is safe to open-source if we don't have the training set, or even the capacity to compute it?
Ross, how can we create a complex plugin for a website with AI? Devika can't seem to get it right, and GPT goes in circles once you start to get to a good spot, because it gets too complex and it forgets stuff and leaves placeholders.
Those 150k H100s are about $6 billion, and that's not including the costs for energy, etc. This is far from anything that makes commercial sense.
... Unless you use the product of that training for business. Imagine using AI in your business.
@@jeff_65123 Which is my point. There are simply no use cases so far that will earn anything along those lines. More marketing and reducing the human workforce? That is not really a use case for 'intelligence' and simply feels like a failed use of the effort. Nothing remotely complex can be accepted as an AI result without human control yet, either.
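The "$6 billion" figure above checks out at an assumed street price, and the energy line item is worth an estimate too. The unit price, TDP, and electricity rate below are guesses for illustration, not quoted figures.

```python
# Sanity check on the cluster-cost claim above. Unit price (~$40k/H100),
# 700 W TDP, and $0.10/kWh are assumed illustrative values.

n_gpus = 150_000
unit_price = 40_000                 # USD per H100, assumed
capex = n_gpus * unit_price
print(f"Hardware: ${capex / 1e9:.1f}B")

watts_per_gpu = 700                 # H100 SXM TDP
hours_per_year = 24 * 365
kwh = n_gpus * watts_per_gpu * hours_per_year / 1000
power_cost = kwh * 0.10             # assumed $0.10/kWh
print(f"Power (1 year, GPUs only): ${power_cost / 1e6:.0f}M")
```

Under these assumptions, a year of GPU power is on the order of $100M, i.e. small next to the hardware itself (and this ignores cooling, networking, and datacenter overhead).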
3:13 - Where do I find rigs like this?
Chamath said "the odds of the latter are quite small" -- referring to being a genius in 10-20 years. he's trying to be modest/coy, not expressing a "very high opinion of himself." imho.
I bet training just has to go on, forever. Like the human brain: training is near-continuous, mostly happening as we sleep.
My bet is that in the future, you run your LLMs but also continuously train them on newly ingested data, and especially on how you use them. That'll probably be the AGI moment: you don't have a "set in stone" LLM, but an LLM which continuously evolves.
At that point we get a version of evolution: a lot of different models continuously getting better and essentially competing with each other.
That's how Stable Diffusion has been getting better, in a way; there are ways to morph models, and people have been pushing that hard to get the desired results.
Sure... so look, one of the issues is what you mentioned: to make a model better, you have to retrain it. These are PRE-trained models... So the context window was one problem, and that's now solved, well, sort of solved. The next is continuous learning... in other words, you never need to retrain or fine-tune it; it adds to its own model as it goes along from user input, on the fly. My guess is maybe GPT-5 'might' have some capability like that... a huge context window of maybe 100 million tokens that removes the need to retrain the model. And doing that you now create another problem, which is information density, and Claude Shannon has things to say about that!
There is no continuous training option for LLMs. That’s what those 3B, 7B, 130B numbers mean. The more rules you add, the slower it becomes and the more overhead it requires to run.
18:55 Why did you not explain that converging thing?
Does the FLOPs training limit take into consideration the amount of processing used to create training sets? Because I think a smaller synthetic training set that is smarter will do better with less training.
any chance you can show us how to build a rig like that, and the pricing for it? thanks
30k
AI learning is starting over; that's why it's coming out this way, from the style of learning they chose: now using one person's perspectives, knowledge, beliefs, and experiences to update, compress, and increase overall experience and the technology itself, upward and across. I'm just one person with my culture, anthropology, and world views, inside and outside. Using common sense and studying basic sciences and learning; the list is too long to go over. "Contextualize and conceptualize": explore, research, develop, and more.
Gotta remember that right there is probably between $2,600 and $4,000, lol 3:02. Guess that's not too bad for your own GPT-level LLM... that you have full control over...
That British host has really never heard of Roko's basilisk.
She's going to get super tortured. Oh well.
Only AI can grasp these concepts. The singularity is here.
So learning in school, with curated information, might be more efficient than reading random texts?
Lmao
0:50 If nothing else this does confirm to me that there is a recency bias in the arena. Like an obvious recency bias.
bro has puns 9:09
In terms of pure intellectual capabilities (discussing philosophical problems with the AIs, assessing their big-picture worldview), my view is that Claude 3 is #1, Gemini is #2, GPT 4 is a fairly distant #3, and Llama is #4. I'm not coding or testing their SAT scores, so I can't say anything to that, but I'm always surprised when people praise GPT. To me it feels like Alexa compared to Claude 3.
Running Llama 3 locally:
Or you can go with the quantized version and run it on your CPU. Sure, don't expect great inference speeds, but with the magic of some GPU offloading it actually hits about 1 token per second.
Depending on the use case, this is great: a cheap way to have Llama 3 or any other model locally.
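The "GPU offloading" mentioned above (what llama.cpp exposes as a number of layers to place on the GPU) comes down to simple arithmetic: how many transformer layers fit in the VRAM you have, with the rest staying in system RAM on the CPU. The sizes below and the helper itself are illustrative assumptions.

```python
# Sketch of the offloading math: given a quantized model of a known size
# and limited VRAM, how many whole layers can live on the GPU? Uniform
# layer size and the 1 GB reserve are simplifying assumptions.

def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 1.0) -> int:
    """Whole layers fitting in (vram - reserve), assuming equal-sized layers."""
    per_layer = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable / per_layer))

# e.g. a ~40 GB Q4 70B model with 80 layers on an 8 GB consumer card:
print(layers_on_gpu(model_gb=40, n_layers=80, vram_gb=8))
```

With only a minority of layers on the GPU, most of each forward pass runs on the CPU, which is consistent with the ~1 token/s figure in the comment; a bigger card moves more layers over and speeds things up roughly proportionally.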
Was chatgpt release only a year ago? Can't remember, feels like 3 years ago
ChatGPT was released about a year and a half ago. But yeah, it feels like longer given all the developments since then.
Llama 3 seems much smarter than GPT-4 at writing code; it has a clear edge.
Call centers are not something I'm excited about. Imagine the spam calls.
A constant ring on everyone's phone... reminds me of Lawnmower Man.
5:47 Bruh 😂 Groq lover
what is the cost per user? this doesn't seem like it can keep up