SatisfIA - Updates From Our Project On Aspiration Based Agent Designs - Jobst Heitzig

  • Published 27 Sep 2024
  • Session presented during the Virtual AI Safety Unconference 2024
    Speaker: Jobst Heitzig
    Session Description: SatisfIA is an ongoing project (pik-gane.githu...) with AI Safety Camp, SPAR, and interns at my lab (forum.effectiv...).
    We develop non-maximizing, aspiration-based designs for AI agents to avoid risks from maximizing misspecified reward functions. This work relates to decision theory, inner and outer alignment, agent foundations, and impact regularization.
    We mostly operate in a theoretical framework that assumes the agent will be given temporary goals specified via constraints on world states (rather than via reward functions), will use a probabilistic world model to assess the consequences of possible plans, will evaluate candidate plans against various generic safety criteria (e.g., information-theoretic impact metrics), and will use a hard-coded, non-optimizing decision algorithm to choose among the plans that achieve the goal (a rough, hypothetical sketch of such a decision step appears below).
    Our project focuses on the design of such algorithms, the curation of safety criteria, and testing in simple environments (e.g., AI safety gridworlds).
    This session reports on the project's goals, methods, outputs, and plans, and invites questions, criticisms, and suggestions. We are also still looking for collaborators!
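    A minimal, hypothetical sketch (in Python) of a non-maximizing decision step in this spirit is shown below. It is not the project's actual algorithm; the interfaces world_model.probability_goal_satisfied and safety_score are assumptions introduced only for illustration.

        import random

        def choose_plan(candidate_plans, world_model, goal, safety_score,
                        min_success_prob=0.95, min_safety=0.0):
            """Pick some admissible plan instead of the 'best' one.

            A plan counts as admissible if the (assumed) world model predicts
            it satisfies the goal constraints with high enough probability and
            its aggregated safety score clears a threshold.  Among admissible
            plans we choose uniformly at random (or by any other non-optimizing
            rule), so no quantity is being maximized.
            """
            admissible = [
                plan for plan in candidate_plans
                if world_model.probability_goal_satisfied(plan, goal) >= min_success_prob
                and safety_score(plan) >= min_safety
            ]
            if not admissible:
                return None  # no plan meets the aspiration; the goal must be revised
            return random.choice(admissible)

    The random choice in the last line is the point: any plan that satisfies the aspiration (goal constraints plus safety thresholds) is acceptable, so nothing is optimized.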
