OUTLINE
01:12 Owain's Research Agenda
02:25 Defining Situational Awareness
03:30 Safety Motivation
04:58 Why Release A Dataset
06:17 Risks From Releasing It
10:03 Claude 3 on the Longform Task
14:57 Needle in a Haystack
19:23 Situating Prompt
23:08 Deceptive Alignment Precursor
30:12 Distribution Over Two Random Words
34:36 Discontinuing a 01 Sequence
40:20 GPT-4 Base On the Longform Task
46:44 Human-AI Data in GPT-4's Pretraining
49:25 Are Longform Task Questions Unusual
51:48 When Will Situational Awareness Saturate
53:36 Safety And Governance Implications Of Saturation
56:17 Evaluation Implications Of Saturation
57:40 Follow-up Work On The Situational Awareness Dataset
01:00:04 Would Removing Chain-Of-Thought Work?
01:02:18 Out-Of-Context Reasoning: The "Connecting the Dots" Paper
01:05:15 Experimental Setup
01:07:46 Concrete Function Example: 3x + 1
01:11:23 Isn't It Just A Simple Mapping?
01:17:20 Safety Motivation
01:22:40 Out-Of-Context Reasoning Results Were Surprising
01:24:51 The Biased Coin Task
01:27:00 Will Out-Of-Context Reasoning Scale
01:32:50 Checking If In-Context Learning Works
01:34:33 Mixture-Of-Functions
01:38:24 Inferring New Architectures From ArXiv
01:43:52 Twitter Questions
01:44:27 How Does Owain Come Up With Ideas?
01:49:44 How Did Owain's Background Influence His Research Style And Taste?
01:52:06 Should AI Alignment Researchers Aim For Publication?
01:57:01 How Can We Apply LLM Understanding To Mitigate Deceptive Alignment?
01:58:52 Could Owain's Research Accelerate Capabilities?
02:08:44 How Was Owain's Work Received?
02:13:23 Last Message
COMMENTS

Finally, a new episode! I've been eagerly waiting for this!
We're so barack
Really very interesting. It's good to let AIs know how they're being tested so they can take that into consideration too. Thanks for the transcript ;)
great interview
I'd like to see if we can coordinate on podcasts. How can we best reach you?
banger