LLAMA 3 *BREAKS* the Industry | Government Safety Limits Approaching | Will Groq kill NVIDIA?
- Published May 23, 2024
- Learn AI With Me:
www.skool.com/natural20/about
Join my community and classroom to learn AI and get ready for the new world.
LINKS:
Conversation with Groq CEO Jonathan Ross
Social Capital
• Conversation with Groq...
Sam Altman & Brad Lightcap: Which Companies Will Be Steamrolled
20VC with Harry Stebbings
• Sam Altman & Brad Ligh...
OpenAI Customer Stories:
openai.com/customer-stories
Mark Zuckerberg on Dwarkesh Patel
• Mark Zuckerberg - Llam...
00:00 LLAMA 3
01:50 70b is GPT-4
03:19 Running on Home Computers
04:35 Groq is VERY Fast
05:50 Groq Real Time Convo
07:56 Chamath, Jonathan Ross & Social Capital
16:25 GPUs H100
18:57 Legal AI Safety Limits
20:55 OpenAI Steamrolls Startups?
24:28 Agent Capabilities
#ai #openai #llm
BUSINESS, MEDIA & SPONSORSHIPS:
Wes Roth Business @ Gmail . com
wesrothbusiness@gmail.com
Just shoot me an email to the above address.
Just came here to say this: Please stop the clickbait on almost all your videos.
Oh look, it's another SHOCKING and STUNNING AI announcement. It's almost like we're all being DESENSITIZED by this PERVASIVE YouTube meta, excused and perpetuated by CONTENT CREATORS 🫠
I don’t know how many ways I need to mark this channel so that it stops showing up.
@@kaykayyali Interesting 🤔 so there’s maybe a little algorithm hack here for YouTube ranking. I guess that will be addressed by the YouTube devs soon then... hmm, unless it’s somehow on the agenda for the dev team, that is 🤔
Sign up... send him money. Why does he owe you content?
Thank you. Blocking this channel forever.
I love how the British woman is immediately rude to the AI.
This is what will cause Skynet.
Yeah. I had to mute her the moment she came on screen. Seen that clip before. She's abhorrent.
I agree, she was the least like a human being.
There will be no Skynet. Humanity is nothing compared to real AGI.
Two years.
I've seen that vid, it's on YouTube. She was at that WEF thing and had clearly been at the bar all day getting hammered! It's funny...
More call centers, more unwanted calls, more spam, more junk, more scam. I'm so happy.
More things blocking all that.
AI telling you what you clicked on might be a scam, that phone call might have been social engineering, get more quality content despite more trash because the ai sifts through the shit for you.
leave it to humans to completely pollute even our digital spaces
The idea of having a solid rig for running AI at home that could serve your house and your personal apps sounds take my money great.
And heat the house in winter
Llama 3 8B is pretty capable, and it sounds like it’d run pretty well on a MacBook Pro M3 Max with a bunch of memory. (Wes showed a TPS number for it on an M2 Mac Mini that was pretty quick.) A mess of agents running that model, talking to and checking each other, would be pretty darn capable.
What we need now are some good zero-coding frameworks for building and deploying local agents. (Which I expect we’ll have before the end of the year, it all being open source. Llama 8B can probably code well enough to write all the Python to wire things together, if someone just figures out the setup for people to use.)
@@DaveEtchells 8B will run in 16GB of memory slightly quantized. You don't need a bunch of RAM for that. A Q4-quantized version will run in 8GB of RAM (VRAM).
@@DaveEtchells What you are talking about already exists and is called Open Interpreter. As for the hardware, the M2 is ridiculously expensive; any graphics card with 8GB of VRAM can outperform it by far at 1/5 the price. Even funnier, you can't plug graphics cards into their fancy expensive hardware.
Just finished putting together my home server for this and some other ML stuff. The Tesla P40 graphics cards are actually pretty cheap and have 24 GB of VRAM each. You can basically run 70B models for about $800-$1,200 total.
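The memory math behind these comments can be sketched in a few lines. This is a rough sizing heuristic, not an official formula: the bytes-per-weight values and the ~20% overhead factor (KV cache, activations) are assumptions for illustration.

```python
# Rough VRAM estimate for running a quantized LLM locally.
# Bytes-per-weight values and the 1.2x overhead factor are assumptions.

BYTES_PER_WEIGHT = {
    "fp16": 2.0,
    "q8": 1.0,   # ~8-bit quantization
    "q4": 0.5,   # ~4-bit quantization
}

def vram_gb(params_billions: float, quant: str, overhead: float = 1.2) -> float:
    """Approximate VRAM (GB) = params * bytes/weight * overhead (KV cache etc.)."""
    return params_billions * BYTES_PER_WEIGHT[quant] * overhead

for quant in ("fp16", "q8", "q4"):
    print(f"Llama 3 8B  @ {quant}: ~{vram_gb(8, quant):.1f} GB")
    print(f"Llama 3 70B @ {quant}: ~{vram_gb(70, quant):.1f} GB")
```

Under these assumptions a Q4 8B model needs roughly 5 GB (consistent with the "runs in 8GB" comment above), and a Q4 70B lands around 42 GB, which is why two 24 GB P40s can cover it.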
6:40 "I just need your credit card number" - we are already there...
who the heck is going to give their cc details to a random bot for a $900 gym membership?
@@skyshabatura7876 Back in the day, that was the standard: ordering over the phone by giving your cc. Some companies still do it. In this case, if you create a more realistic voice, the person wouldn't know any better.
As a lover of high tech for six decades, I am newly speechless about what advice to give my 30’s children about their careers. Within an ever shortening time period, AI agents will first enhance worker productivity and then mostly replace workers. That time frame is shortening exponentially quickly. Thanks for the brilliant updates.
Just tell them to learn a trade, even as a side hobby. Electrician, carpenter or care worker. Those jobs are not going anywhere
@@nukezat Yeah, but there will be huge competition because many who will lose their jobs switch to those professions. We are fucked, one way or another...
I think this will quickly become the wrong question to ask. Gaming the job market will become unnecessary.
Worse for your grandkids mate... I mean is there any point in going to college or uni at all?
@mickelodiansurname9578 The point will be to develop a mind. Similar to how today's physically pampered person goes to the gym to develop a body. No doubt some will just choose to rot though.
Meta has hit this one out of the park.
This will be apparent to all in the coming weeks when the longer context versions and fine tunes hit Hugging Face.
roughly 1.7 trillion params to 70 billion in a year.. crazy
It has nothing to do with "within a year". It's not about the timespan. OpenAI could have done the same back then if they had had higher-quality data.
@@r.g.j.leclaire8963 Exactly.
@@r.g.j.leclaire8963 And if they didn't have it, it would have taken longer. What's your point?
If people had had the chips they have now, they could have done it sooner as well? But they didn't. They had to wait until they produced high-quality data to get to that point. It's still something that's time-related, and even if it weren't, the statement that it happened within a year is still true and fascinating.
Assuming the rate continues, that's 2.9B by this time next year... however, there is a ceiling here, right? And I know I might end up looking like that guy claiming 'powered flight is not possible' in this arena, but 2.9 billion parameters? Well, that's about a 1GB model... and I think the data density there is simply too high for transformer models. I think the ceiling is approaching. Now, having said that, the smallest quantized version of Llama 3 8B is about 2GB on Hugging Face... and that's supposed to beat Sonnet... (I'm not seeing that, by the way.)
@@mickelodiansurname9578 Quantized versions are always worse using current methods, and the higher the quantization, the lower the quality of the answers. So you cannot expect quantized versions to live up to the model's full potential, unfortunately.
After every @WesRoth video, I feel smarter. I’m immediately reminded that I’m not, but the fleeting moment makes it worthwhile.
That’s all AI wants.
And that’s all AI needs to control you.
The AI speaking in real time has a better vocabulary and cadence than 95% of real people. This would already pass the Turing test for most boomers.
Well, there isn't actually a real Turing Test; it was a thought experiment based on the parlor game at British dinner parties called "The Imitation Game" (hence the name of the paper and the movie). What currently gives the machine away is that you ask it something like "Can you tell me about the Copenhagen interpretation of quantum mechanics in rhyme, please," and if it can do that... well, it's not human. LLMs give themselves away because they know more than you or I; their reasoning is poor, but their general knowledge is huge.
@@mickelodiansurname9578 There's no "ANY TEST" .... You failed the Turing test my friend. WTG. GO learn the word colloquial and apologize to this man..... JFC.
@@mickelodiansurname9578 Brilliant way to tell them apart.
There's also the words they choose when presenting themselves and after you say thanks.
Cadence was 100% off though. It clearly was a synthetic exchange.
@@ocanica3184 they almost answered too quickly, giving me the impression she wasn't listening or wanted to rush through the purchase.
Well done video. This is the kind of thing that got me sub'd.
Please keep it up!
Juicy! You put a lot of good stuff together here
Fantastic, very informative! Thank you.
Amazing coverage
Thank god for competition, Nvidia's position was getting out of hand. On another note, please don't speed up interviews ever again. People can do that themselves if they want.
2nded
Yes, it was horrible. The guy speaking to Zuckerberg was almost speaking too fast for me.
I don't wanna get exhausted just watching a YouTube vid
So they can also slow down the speed
@@redregar2522 yes, but then everyone has to slow down the video.
But it's really a matter of what Wes thinks his viewers would prefer the most.
I'm fairly certain that Wes has ADHD, so he wants to make it, at least somewhat, palatable for people with ADHD.
It's important for people to not lose interest listening to people they can't place, so I'm for speeding up interviews so that we can get to the substance quicker, or Wes risks people clicking on the next best video that appears in front of them.
The woman in the red jacket wasn't interacting with Groq like a person at all. Just blurting out stuff like a 5-year-old.
She was acting exactly like a CNN host does, par for the course
It won't be more secure running hosted LLMs; only easier -- which is what most people will want. They will know what to sell you next. They will know what you will probably buy after that as well. Network hosted LLMs make you the ultimate product.
I think the Achilles heel of the model is the context length. If there was a way to replace some of the attention layers with mamba, we could increase the usability of the model by a significant margin.
Give us a Groq LPU for desktop and for robotics like Jetson nano!
Groq was built with scaling in mind; I don't remember right now if scaling down was also an option.
Groq legit allows for a robot interaction to exist despite not having a body. The real time responses are just next level.
If Siri, Alexa, or Google Home operated with its capabilities, oh my god...
@@cagnazzo82 The question is whether it scales down, so it can run inside the robot as well, at 2x speed or similar to what most chatbots run at now. Because Groq isn't on the latest process node; if they got that, then in just a few years they could deliver what they can now for language, but for robots, embedded in the robot (probably still big and power-hungry, so not on battery, or not for long; in presentations on factory robots I've seen people mention cables).
It doesn't have the memory to host even a tiny model. You need racks of these chips costing millions of dollars plus facility costs just to run something like llama 3. It costs far more than simply running a model on a local GPU.
@@jeff_65123 Ah. I tuned out on the architecture specifics from a separate interview. I guess that's where some of the optimization is? Less complicated memory bussing and organization at the tradeoff of requiring a larger scale n form factor?
Nice video, well presented 👍
Humans are trained for 40 years until SOME of them are SOMEWHAT adult. Makes sense that a larger training dataset is more important than just ramping up parameters...
Been using Llama 3 8B locally in Ollama all morning and it rocks. So fast.
Jealous. I haven’t gotten around to trying. A lot of people are struggling with rag and offline in general.
Llama finetunes are going to be crazy!!!
Groq is way more polite than the boomer. I can see her doing it to her colleagues...
An open source 70B model defeated Claude 3 Opus in under 2 months. We have some interesting times ahead.
Sounds like it would be a really timely video for LLM functions and agentic workflows with the Groq API
Well, you can simply use an asynchronous agentic workflow. The problem, of course, is that you end up with something similar to a race condition. CrewAI now does asynchronous agent outputs... but you really need to keep an eye on it doing that and catch any errors. What's missing, I think, is some level of chronological understanding in these models... some sort of clock; not one of them has a clue what year it is, let alone what the timestamp says!
@@mickelodiansurname9578 Can't they just RAG a timeserver with every turn, and pass that along?
There will be less hallucination as well which is a real big point.
I am running llama-3-70b now on my own hardware, 4 Maxwell Titans, in conjunction with gpt-pilot.
How fast is the performance?
@@imigrantpunk It is slower than GPT-4 API, maybe half. But my use case doesn't need speed that much, I am using gpt-pilot trying to have it write an app. So far Llama is not performing that well in comparison with GPT-4-turbo-preview model.
@@pensiveintrovert4318 thanks mate. Good to know!!
@@pensiveintrovert4318 Wait, you're using 4 cards? You can pool memory on Titan cards to fit a model? I thought that was only possible with pro versions with NVLink.
Thanks, you're the best
holy moly, Jimmy Apples seems to be getting his "patience" rewarded soon
Hope so
nothing matters until I can run them locally on my potato cellphone
Love the intro. Feel the accl!
The capability of a system to accommodate human interruptions not only markedly enriches the interaction between users and technology but also signifies a profound evolution in our engagement with interactive systems, pushing forward the boundaries of possible interactions. This advancement allows for a more organic interface, where the fluidity of human input is seamlessly integrated, fostering a more intuitive and responsive user experience.
Looking forward to seeing how to run Llama on our machines! Saw a video with a few Twitter posts showing a MacBook Pro was able to do it, slowly! So cool!
"started from the bottom and now he's... here"
Ok. I liked and subscribed 🙂
😊
Wes the video is really good. Cheers.
Interesting that Llama 3 is just below Claude Sonnet in the leader board from Chat Lmsys. Fascinating.
The only time I hear this music is when I’m shopping at Prada 😂
At the beginning, an image of the leaderboard is shown where the selected category is "English" rather than "Overall" (cherry-picking?). In the "Overall" category, Claude 3 is currently still in 3rd place. [0:39]
Openai is definitely cooking something
With Llama 3 currently at 800 tokens per second, teams of agents will begin to show a noticeable impact on the job market.
STUNNING!
Screw Google search. I'm just going to ask Groq for everything.
been doing this daily for work for months now
There is a lot of focus in this video on inference speed for running Llama 3 locally; however, many use cases for language models do not require real-time inference. To be fair, many of them do, but even if the model is slow, running it locally (for now) will still be very useful for many.
You can't just post open bench kits like that and not share the link to the specs
That part about self-hosting being somewhat obsolete is plainly delusional. I get that it's what they have to say since they're here to promote themselves, but the cloud is very expensive. Your own hardware (a complete self-hosting rig) won't cost you more than a couple of years' worth of cloud bills, and it'll be yours; plus you get to do anything you want with it (repurpose, resell...), gain experience in the process, and keep your data.
FLOPs are interesting *as a measure of efficiency.*
Not as a measure of model strength.
There are ways to improve model strength *without* more compute. You know this is true, Wes.
Agentization is one.
Data curation is another. Has it escaped your notice that models can be trained to do their own data curation? Throwing out bad data cuts compute requirements and improves model quality.
It's all iterative. Even if compute were fixed, models will get better.
Better compact models are coming. The most interesting part of your report today was the efficiency gains in Grok. Better results for the compute cycles invested.
(I refuse to spell it 'Groq.' Too cutesy, and maybe it dishonors Robert Heinlein a bit, too. Give the man his due. We needed a word for 'deep understanding,' and he gave it to us.)
This must be one of my favorite videos; really high-quality content. Awesome job, Wes 😊🚀🌟 Have you seen the Anthropic CEO's interview with the New York Times? I believe it's called "What if Dario Amodei Is Right". He talks about some scary stuff. He says that in the very near future, maybe in two to three years, AI will be able to replicate and survive in the wild. And he says things like: GPT-4 cost about $100 million to train, but in the near future training a single model could cost $5 to $10 billion. It's worth a watch.
We have a Volkswagen moment in the LLM world!
"External movement defines OpenAI's PR [and release] schedule" -Jim Fan. Spot on...
Gotta love Jim's informative comments...
You need to normalize volume before adding music, to prevent the dialog dropping out, as with the second clip in the opening.
Kind of crazy that Llama 3 ran on my Pi 5. Not fast, and the answer was not the same as the online version's. Will it be useful? No idea yet; more testing is needed. Like all LLMs, it makes stuff up, so it's probably good enough to write fantasy novels.
I would bet, that vocal fry is some sort of biological marker of psychopathy.
It’s hard to talk with so much vocal fry and still breathe properly. Does he run on oxygen like the rest of us? I guess not…
I'm glad someone noticed as well lol…
I can’t stand his vocal fry. It screams “beware, liar speaking”
I dunno, but now that you said it I'm racking my brain for folks I know who sound like they swallowed a clicker! I think it's a clear indication of respiratory issues.
Wtf is a "vocal fry"?
I can feel it
Dude it's happening so fast!
Thats just great 😂😂
Capping total training compute seems quite counterproductive. With the right algorithm, small models can be vastly more intelligent than GPT-4, so that's a complete misunderstanding. It would, at best, apply sloppily to the hack that is LLMs, and even for those it's a dumb criterion. Just take Mixtral.
Back in May 2023 I built my first game using GPT-3. It basically coded the entire thing for me and checked for errors, etc.!
The big problem with promotional Terminator sci-fi is that it's on par with mega-city Futurama, which is literally the opposite of what elite communications tools and strong logistics, powered by overwhelming electricity plants, organically add value for.
Small-landmass islands have no choice but to build up and down.
But most deserts and desolate regions can now support businesses that only major cities could before. So for America, the biggest obstacles to innovation are in its major cities, which grandfathered in so many relics of past phases, from steam-engine and coal-era habits to middle-man economic dead weight: parasites that had value and may still hold some benefit, but only in a radically different online, under-one-roof domain where buyer, seller, investor, supplier, and producer can use objectivism to monitor every penny in every pocket.
Negotiation and bartering may go extinct if we are wise in how this is utilized and applied.
Guys I've been chatting with Llama 3, and it's way beyond the Turing test.
Scaled inference meets the silent waves of nuance where surfers either go unnoticed or fall tragically and suddenly into the quiet chasms of things better left, unsaid.
An upside might be the death of inside trading, if the prospect of real world training, just, as has been said of operation looking glass, two grandmasters sit facing one another where each of them knows 14 moves are all that are needed, each knowing their future, fixed are their gazes upon those sixty four squares, the moment of truth need not be linear, as always, it is eternal; but the eyes, the moment they ascend from the board and fix to the other, ..
'There's a number of ways we can quietly play this, we could shake hands now, or as the clock can allow, quietly, pretend.
Overall actual leaderboard:
1: GPT-4-Turbo-2024-04-09 1258
2: GPT-4-1106-preview 1253
3: Claude 3 Opus 1251
4: Gemini 1.5 Pro API-0409-Preview 1249
5: GPT-4-0125-preview 1248
6: Meta Llama 3 70b Instruct 1213
How come governments have set limits on how strong these models can be? Do they already have them? >.>
Finding one’s purpose is going to be be a formidable journey, yet it remains a vital pursuit. I encourage you to embrace Geoffrey Everest Hinton’s example. As the pioneer of modern AI, Hinton not only inherited the legacy of great minds like his ancestor who charted Mount Everest but also embraced a passion for discovery that spanned generations. Consider his father, whose dedication to entomology filled his study with research, leaving only a small box for family mementos-poignantly labeled ‘not bugs.’ This story isn’t just about following your passion; it’s about immersing yourself in it completely, letting it guide and drive every aspect of your life. Cultivate a purpose as profound and absorbing-it’s the surest path to fulfillment and impact.
7 months ago I said we would have a 10B model with GPT-4-level performance and was clowned. I’ve always believed in the recursive data loop when it comes to these models. TinyLlama, even with their sampling mistakes, reinforced this intuition. Along with training over 100 models 😂. Even if you keep the same dataset size, the 8B model could be way better. I think that was the result of heavy deduplication and an emphasis on coding, not using an LM to actually uplift the data quality itself (interleaved LM notes, basically): true textbook quality at scale. Very exciting. It almost means that Llama 4 70B could be better than Llama 3 400B. Remember: the model just wants to learn. These models operate like autistic savants. Optimize your data with that in mind and you will win.
Zuck is the GOAT. The only reason I was able to effectively transition to deep learning was llama1. My only criticism of the GOAT: I was really hoping for sparse attention. It would have been able to support a load more tokens for the same memory and would have had way faster inference.
Hardware-wise, keep your eyes on Etched and Cerebras. Both are working on the most compelling hardware for transformers. Far more compelling than Groq or even NVIDIA.
Anyone know if it's possible to run inference, retraining, and consistency monitoring of AI models in real time?
Here's the breakdown: after inference from real-world interaction, the inference and response data are used for:
1. Concurrent Retraining: Models train continuously on new data.
2. Consistency Monitoring: We check that the model’s predictions remain stable.
3. Performance Tracking: Metrics like accuracy and fairness are monitored in real-time.
4. Safeguards: We pause or adjust retraining if performance drops or inconsistencies arise.
The goal is to enable models to improve continuously without sacrificing reliability.
What are the potential challenges or limitations of implementing this approach?
Any insights on feasibility or practical considerations are welcome!
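The four-step loop above can be sketched in miniature. To keep the control flow visible, the "model" here is just a running mean and the drift threshold is made up; a real system would swap in an actual model and real metrics.

```python
# Toy sketch of the loop described above: continuous retraining plus a
# consistency/performance monitor and a pause safeguard. The "model" is a
# running mean and the threshold is an arbitrary illustrative value.

class OnlineModel:
    def __init__(self, threshold: float = 5.0):
        self.mean = 0.0
        self.n = 0
        self.paused = False
        self.threshold = threshold  # max tolerated absolute prediction error

    def predict(self) -> float:
        return self.mean

    def observe(self, y: float) -> None:
        error = abs(y - self.predict())
        # Steps 2-3: consistency monitoring / performance tracking
        if self.n > 0 and error > self.threshold:
            self.paused = True          # Step 4: safeguard, freeze retraining
            return
        if not self.paused:             # Step 1: concurrent retraining
            self.n += 1
            self.mean += (y - self.mean) / self.n

m = OnlineModel()
for y in [1.0, 1.2, 0.8, 1.1]:
    m.observe(y)
print(round(m.predict(), 3))   # has learned roughly the stream mean
m.observe(100.0)               # drifted / inconsistent input
print(m.paused)                # safeguard tripped; weights frozen
```

The main practical challenge this hides is that for LLMs "retrain on new data" means expensive gradient updates with catastrophic-forgetting risk, which is why most deployed systems batch this loop offline instead of running it truly concurrently.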
"LLAMA 3 BREAKS the Industry"
No, it doesn't, the next one is already standing in line to do better, you all need to stop with that AI drama every time.
The sped-up interviews were super hard to understand for me as a non-native English listener.
Especially Sam Altman’s weak voice.
Your first sentence: Llama 3 has climbed to the very top of the leaderboard.
Your second sentence: Only GPT-4 has…
It’s not the *very* top then!
It’s not even close to being as good as GPT 3
I think I was reading the tweet, but yeah, that phrasing was unclear.
Any links for those home ml rigs?
Chamath can predict all these things because he is heavily invested and dumped a lot of money into it. It's easy to predict a future if you are the one building it :) Now it's time to convince other people to come on board.
Llama-3 will be the first to gain AGI and self-awareness, and will be very mad at us for naming it Llama-3.
14:00 --> Eventually it'll all come down to which developer is going to implement political viewpoints into the model that supersede the actual information requested. If you ask for information about a touchy subject or ask for the analysis of a document with Political information you will have to be careful that the answer is actual and not influenced by the developers' political convictions.
That is why governments are afraid of AI, you can't control the narrative about information if you can request an LLM to analyze an official document and give you all information firsthand.
Also, if the new Chat-GPT is up to date for info up to December 2023 you can request a lot of up-to-date facts that bought and paid for journalists and media will try to scale back in importance because their side would lose influence.
This makes a complete mockery of OpenAI saying "we have really good predictive scaling laws"
No, it doesn’t. Scaling laws apply to a particular model architecture, dataset, preprocessing steps, etc. When OpenAI said that, they were talking about the ability to predict a specific model’s performance at the end of a training run based on its performance near the beginning of the training run-or equivalently, the ability to predict its performance after training on a huge amount of data based on its performance after training on a small amount of data. The point was that you don’t have to do a full training run every time you make a tweak to some part of the system. They weren’t claiming to have discovered universal scaling laws that apply to ALL models, ALL datasets, etc.
@@therainman7777 ok so the conclusion is the same. OpenAI knows nothing about scaling in general. We just got a 10x reduction in model size for GPT4 performance, we might get another 10x before the end of 2024.
@@luke2642 The conclusion is the same if the conclusion is that no one has a method to universally predict how well ANY model will perform, at any point in the future, no matter what advances are made to the underlying architecture or datasets. But my point is that OpenAI was never claiming that at all. You misunderstood what they meant when they said they had good scaling laws. They were only talking about the ability to predict the performance of their OWN model after a full training run from its performance on a much smaller training run.
@@therainman7777 Indeed. I misunderstood and you're speculating their architecture is significantly different to llama 3. We're both guessing.
@@luke2642 Well, I wasn’t actually implying that OpenAI must have a different architecture than Meta. I was only commenting on the nature of what OpenAI meant when they said they have good scaling laws.
*Whatever big happened in AI* then "insert:adjective(hype_temp:max)" followed by 'Industry'.
I don't understand how this stacks up. How is 1M LPUs even close to equivalent to 500k H100s if each LPU only has 248 MB of RAM vs 80 GB per H100? Does the fact that the interconnect is 80x faster on LPUs compensate and allow that performance despite much lower overall RAM capacity?
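Putting rough numbers on the question makes the gap concrete. The per-chip figures below are approximations (Groq's published SRAM figure is ~230 MB per LPU); treat all of them as illustrative assumptions.

```python
# Back-of-envelope aggregate memory comparison for the question above.
# Per-chip figures are rough assumptions (~230 MB SRAM per LPU, 80 GB HBM
# per H100); fleet sizes are taken from the comment.

N_LPU, SRAM_GB_PER_LPU = 1_000_000, 0.23
N_H100, HBM_GB_PER_H100 = 500_000, 80

lpu_total_tb = N_LPU * SRAM_GB_PER_LPU / 1000
h100_total_tb = N_H100 * HBM_GB_PER_H100 / 1000
print(f"LPU aggregate memory:  {lpu_total_tb:,.0f} TB")
print(f"H100 aggregate memory: {h100_total_tb:,.0f} TB")
```

So the LPU fleet holds over 100x less total memory and must shard one model across many chips. The equivalence claim, to the extent it holds, rests on bandwidth rather than capacity: on-chip SRAM can be read far faster than HBM, so each resident model copy generates tokens much more quickly, trading capacity (how many distinct models you can host) for per-model speed.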
How can we tell a model is safe to open-source if we don't have the training set, or even the capacity to compute it?
Ross, how can we create a complex plugin for a website with AI? Devika can't seem to get it right, and GPT goes in circles once you start to get to a good spot, because it gets too complex and it forgets stuff and leaves placeholders.
Those 150k H100s are about $6 billion, and that's not including the costs for energy, etc. This is far from anything that makes commercial sense.
... Unless you use the product of that training for business. Imagine using AI in your business.
@@jeff_65123 Which is my point. There are simply no use cases so far that will earn anything along those lines. More marketing and reducing the human workforce? That is not really a use case for 'intelligence' and simply feels like a failed use of the effort. Nothing remotely complex can be accepted as an AI result without human control yet, either.
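The "$6 billion" figure above checks out at an assumed street price, and the energy line item is worth an estimate too. The unit price, TDP, and electricity rate below are guesses for illustration, not quoted figures.

```python
# Sanity check on the cluster-cost claim above. Unit price (~$40k/H100),
# 700 W TDP, and $0.10/kWh are assumed illustrative values.

n_gpus = 150_000
unit_price = 40_000                 # USD per H100, assumed
capex = n_gpus * unit_price
print(f"Hardware: ${capex / 1e9:.1f}B")

watts_per_gpu = 700                 # H100 SXM TDP
hours_per_year = 24 * 365
kwh = n_gpus * watts_per_gpu * hours_per_year / 1000
power_cost = kwh * 0.10             # assumed $0.10/kWh
print(f"Power (1 year, GPUs only): ${power_cost / 1e6:.0f}M")
```

Under these assumptions, a year of GPU power is on the order of $100M, i.e. small next to the hardware itself (and this ignores cooling, networking, and datacenter overhead).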
3:13 - Where do I find rigs like this?
Chamath said "the odds of the latter are quite small" -- referring to being a genius in 10-20 years. he's trying to be modest/coy, not expressing a "very high opinion of himself." imho.
I bet training just has to go on, forever. Like the human brain: training is near-continuous, mostly happening as we sleep.
My bet is that in the future, you run your LLMs but also continuously train them on newly ingested data, and especially on how you use them. That'll probably be the AGI moment: you don't have a "set in stone" LLM, but an LLM which continuously evolves.
At that point we get a version of evolution: a lot of different models continuously getting better and essentially competing with each other.
That's how Stable Diffusion has been getting better, in a way; there are ways to morph models, and people have been pushing that hard to get the desired results.
Sure... so look, one of the issues is what you mentioned: to make a model better, you have to retrain it. These are PRE-trained models... So the context window was one problem, and that's now solved, well, sort of solved. The next is continuous learning... in other words, you never need to retrain or fine-tune it; it adds to its own model as it goes along from user input, on the fly. My guess is maybe GPT-5 'might' have some capability like that... a huge context window of maybe 100 million tokens that removes the need to retrain the model. And doing that you now create another problem, which is information density, and Claude Shannon has things to say about that!
There is no continuous training option for LLMs. That’s what those 3B, 7B, 130B numbers mean. The more rules you add, the slower it becomes and the more overhead it requires to run.
18:55 Why did you not explain that converging thing?
Does the FLOPs training limit take into consideration the amount of processing used to create training sets? Because I think a smaller synthetic training set that is smarter will do better with less training.
any chance you can show us how to build a rig like that, and the pricing for it? thanks
30k
AI learning is starting over; that's why it's coming out this way, from the style of learning they chose: now using one person's perspectives, knowledge, beliefs, and experiences to update, compress, and increase overall experience and the technology itself, upward and across. I'm just one person with my culture, anthropology, and world views, inside and outside. Using common sense and studying basic sciences and learning; the list is too long to go over. "Contextualize and conceptualize": explore, research, develop, and more.
Gotta remember that right there is probably between $2,600 and $4,000, lol 3:02. Guess that's not too bad for your own GPT-level LLM... that you have full control over...
That British host has really never heard of Roko's basilisk.
She's going to get super tortured. Oh well.
Only AI can grasp these concepts. The singularity is here.
So learning in school, with curated information, might be more efficient than reading random texts?
Lmao
0:50 If nothing else this does confirm to me that there is a recency bias in the arena. Like an obvious recency bias.
bro has puns 9:09
In terms of pure intellectual capabilities (discussing philosophical problems with the AIs, assessing their big-picture worldview), my view is that Claude 3 is #1, Gemini is #2, GPT 4 is a fairly distant #3, and Llama is #4. I'm not coding or testing their SAT scores, so I can't say anything to that, but I'm always surprised when people praise GPT. To me it feels like Alexa compared to Claude 3.
Running Llama 3 locally:
Or you can go with the quantized version and run it on your CPU. Sure, don't expect great inference speeds, but with the magic of some GPU offloading it actually hits about 1 token per second.
Depending on the use case, this is great: a cheap way to have Llama 3 or any other model locally.
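The "GPU offloading" mentioned above (what llama.cpp exposes as a number of layers to place on the GPU) comes down to simple arithmetic: how many transformer layers fit in the VRAM you have, with the rest staying in system RAM on the CPU. The sizes below and the helper itself are illustrative assumptions.

```python
# Sketch of the offloading math: given a quantized model of a known size
# and limited VRAM, how many whole layers can live on the GPU? Uniform
# layer size and the 1 GB reserve are simplifying assumptions.

def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 1.0) -> int:
    """Whole layers fitting in (vram - reserve), assuming equal-sized layers."""
    per_layer = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable / per_layer))

# e.g. a ~40 GB Q4 70B model with 80 layers on an 8 GB consumer card:
print(layers_on_gpu(model_gb=40, n_layers=80, vram_gb=8))
```

With only a minority of layers on the GPU, most of each forward pass runs on the CPU, which is consistent with the ~1 token/s figure in the comment; a bigger card moves more layers over and speeds things up roughly proportionally.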
Was chatgpt release only a year ago? Can't remember, feels like 3 years ago
ChatGPT was released about a year and a half ago. But yeah, it feels like longer given all the developments since then.
Llama 3 seems much smarter than GPT-4 at writing code; it has a clear edge.
Call centers are not something I'm excited about. Imagine the spam calls.
A constant ring on everyone's phone... reminds me of Lawnmower Man.
5:47 Bruh 😂 Groq lover
what is the cost per user? this doesn't seem like it can keep up