I am a beginner in the LLM world, but I have 5 years of experience in ML and data science. Could I get a chance to learn from this amazing guy at his startup? I can work for free for him.
This channel is incredible❤
keep em coming 🔥
[00:00:00] Intro
[00:01:57] Yi Tay Intro
[00:03:02] Path into LLMs
[00:09:41] Google Brain: PaLM, UL2, DSI, Emergent Abilities
[00:11:54] PaLM 2
[00:15:27] Emergent Abilities
[00:18:26] Quoc Le
[00:24:16] Marketing Research: How to Start from Zero with No Reach
[00:27:34] What's needed to be a successful AI Researcher?
[00:30:31] Reka Origin
[00:33:24] Starting Reka Infra
[00:35:04] Why not to use TPUs outside Google
[00:36:29] Chaotic vs Stable Infra
[00:38:04] Risk Sharing of Bad Nodes
[00:41:05] Checkpointing and Orchestration
[00:43:39] Reka Flash/Core/Edge
[00:46:59] Recruiting the team
[00:47:22] Noam Architecture - SwiGLU, GQA, RMSNorm, RoPE
[00:52:26] Encoder-decoder vs Decoder-only
[00:55:52] LLM Trends - Llama 3 and Phi 3 Glowup
[00:57:46] LLM Trends - Benchmarks and Evals
[01:03:25] LLM Trends - Early vs Late Fusion Multimodality
[01:07:22] LLM Trends - Scaling Laws
[01:09:41] LLM Trends - Long Context vs RAG
[01:12:31] Long Context vs Finetuning
[01:14:14] If emergence is real, when does Efficiency work?
[01:17:41] MoEs and Upcycling
[01:20:47] The Efficiency Misnomer - Efficiency != Speed
[01:25:05] Open Source vs Closed Models
[01:28:08] Personal Productivity
[01:33:19] Singapore vs US Academic Scene
[01:37:42] Building Silicon Valley outside Silicon Valley
[01:40:29] TechInAsia Meetup
Those are the audio timestamps; the video is edited differently, so I didn't put them there.
Audio pod and show notes: www.latent.space/p/yitay
Don't forget to "like" it.