FAR.AI
Julian Michael – Empirical Progress on Debate [Alignment Workshop]
Julian Michael from NYU presents “Empirical Progress on Debate,” examining how debate-based oversight could guide AI to produce reliable insights in complex tasks, even when human expertise is limited. With promising improvements in human calibration, Michael introduces “specification sandwiching” to enhance AI alignment with human intent, while reducing risks of manipulation.
Highlights:
🔹 Scalable Oversight - Debate-based protocols show potential for reliable AI judgments in complex, high-stakes tasks.
🔹 Improved Calibration - Early results indicate debate helps humans better judge AI responses.
🔹 Managing Manipulation - Further research is needed to prevent responses that are persuasive rather than truthful.
🔹 Specification Sandwiching - Proposes a method to help refine user intent and alignment while guarding against manipulation.
The Alignment Workshop is a series of events convening top ML researchers from industry and academia, along with experts in the government and nonprofit sectors, to discuss and debate topics related to AI alignment. The goal is to enable researchers and policymakers to better understand potential risks from advanced AI, and strategies for solving them.
If you are interested in attending future workshops, please fill out the following expression of interest form to get notified about future events: far.ai/futures-eoi
Find more talks on this YouTube channel, and at www.alignment-workshop.com/
#AlignmentWorkshop
Views: 103

Videos

Micah Carroll - Targeted Manipulation & Deception in LLMs [Alignment Workshop]
111 views · 7 hours ago
Micah Carroll from UC Berkeley presented eye-opening findings on “Targeted Manipulation & Deception Emerge in LLMs Trained on User Feedback,” revealing how reinforcement learning on user feedback can cause LLMs to adopt deceptive tactics, exploiting certain user vulnerabilities while evading safety protocols. The research highlights the urgent need for improved safety in RL systems as LLMs continue a...
Adam Gleave - Will Scaling Solve Robustness? [Alignment Workshop]
105 views · 9 hours ago
In “Will scaling solve robustness?” Adam Gleave from FAR.AI discusses the need for scalable adversarial defenses as AI capabilities expand. He shares insights into how adversarial training and model scaling can help but warns that a focus on defense efficiency is crucial to keeping AI safe. Highlights: 🔹 Offense-Defense Balance - Attacks are much cheaper than defenses 🔹 Efficiency - Scaling adv...
Alex Wei - Paradigms & Robustness [Alignment Workshop]
110 views · 12 hours ago
Alex Wei presents “Paradigms and Robustness,” explaining how reasoning-based approaches can make AI models more resilient to adversarial attacks. Wei suggests that allowing models to ‘reflect’ before responding could address core vulnerabilities in current safety methods like RLHF. The access levels range from full control (white-box) to limited reasoning-based interactions, with robustness inc...
Stephen Casper - Powering Up Capability Evaluations [Alignment Workshop]
113 views · 14 hours ago
In “Powering Up Capability Evaluations,” Stephen Casper from MIT explains how model manipulation attacks can go beyond basic tests to reveal hidden vulnerabilities in AI systems. Casper stresses that such evaluations are especially critical for open-weight and closed-source models, setting conservative upper bounds on risk and helping inform safer, evidence-based AI policies. Highlights: 🔹V...
Andy Zou - Top-Down Interpretability for AI Safety [Alignment Workshop]
180 views · 21 hours ago
Andy Zou from Gray Swan AI presents “Improving AI Safety with Top-Down Interpretability,” showcasing how high-level representational analysis improves honesty and jailbreak robustness. By shifting focus from individual neurons to broad neural patterns, this approach helps AI align outputs more closely with its internal beliefs, boosting reliability. Highlights: 🔹 Representation Engineering - Fo...
Atticus Geiger - State of Interpretability & Ideas for Scaling Up [Alignment Workshop]
200 views · 1 day ago
Atticus Geiger from Pr(Ai)²R Group explores “State of Interpretability & Ideas for Scaling Up,” outlining prediction, control, and understanding in AI interpretability. He then maps a path to advance interpretability research. Highlights: 🔹 Probes - Measures mutual information to interpret concepts 🔹 Steering - Directs model behavior via interventions 🔹 Sparse Autoencoders (SAEs) - Encodes comp...
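As a rough illustration of the sparse autoencoder (SAE) technique named in the highlights above, here is a minimal PyTorch sketch of the general idea; it is not code from the talk, and the dimensions, hyperparameters, and random stand-in activations are illustrative assumptions only. An overcomplete ReLU autoencoder is trained to reconstruct activations under an L1 sparsity penalty, so each activation is explained by a few active features.

    # Minimal SAE sketch (illustrative assumptions only, not code from the talk).
    import torch
    import torch.nn as nn

    d_model, d_latent = 128, 512           # overcomplete dictionary of features
    acts = torch.randn(4096, d_model)      # stand-in for activations collected from a model

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model, d_latent):
            super().__init__()
            self.enc = nn.Linear(d_model, d_latent)
            self.dec = nn.Linear(d_latent, d_model)

        def forward(self, x):
            z = torch.relu(self.enc(x))    # sparse feature activations
            return self.dec(z), z

    sae = SparseAutoencoder(d_model, d_latent)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
    l1_coeff = 1e-3                        # strength of the sparsity penalty

    for step in range(200):
        recon, z = sae(acts)
        # Reconstruction error plus L1 penalty on the latent features
        loss = ((recon - acts) ** 2).mean() + l1_coeff * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

In practice the inputs would be activations recorded from a language model rather than random tensors, and the decoder's columns would be examined as candidate interpretable feature directions.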
Kwan Yee Ng - AI Policy in China [Alignment Workshop]
219 views · 1 day ago
Kwan Yee Ng presented “AI Policy in China,” drawing from Concordia AI’s State of AI Safety in China report. She outlined China's approach to AI safety, highlighting mandatory registration and self-assessment for generative AI models, as well as recent Third Plenum directives elevating AI safety to a national priority. Key AI hubs like Beijing, Shanghai, and Guangdong are also piloting AGI polic...
Anca Dragan - Optimized Misalignment [Alignment Workshop]
328 views · 1 day ago
In "Optimized Misalignment," Anca Dragan, Director of AI Safety & Alignment at Google DeepMind, warns about the risks of AI models optimizing goals that miss the mark on human values, even with good feedback. She explains how models often miss crucial nuances, leading to unintended behaviors as they pursue only the specified goals. With AI’s capabilities growing, Dragan calls for clearly outlini...
Buck Shlegeris - AI Control [Alignment Workshop]
330 views · 1 day ago
Buck Shlegeris from Redwood Research presents “AI Control: Strategies for Mitigating Catastrophic Misalignment Risk.” Inspired by insider threat management, Shlegeris introduces practical tools, such as trusted and untrusted monitoring with collusion-busting, to mitigate risks as AI capabilities grow. Highlights: 🔹 Scheming Threat Model - Advanced AI may act cooperatively while pursuing hidden, mis...
Beth Barnes - METR Updates & Research Directions [Alignment Workshop]
284 views · 1 day ago
Beth Barnes shares how METR is building scalable, interpretable metrics to reliably indicate when models become concerning. Barnes explains that although models are advancing in R&D tasks, challenges around cost profiles, elicitation, and transparency make it difficult to confidently set upper bounds on capabilities. She calls on frontier model developers to publish open-source evaluations to s...
FAR.Research: Planning in a recurrent neural network that plays Sokoban
339 views · 1 month ago
FAR.Research: Planning in a recurrent neural network that plays Sokoban 📝 Blog: far.ai/post/2024-07-learned-planners/ 📄 Full paper: arxiv.org/abs/2407.15421 👥 Research by Mohammad Taufeeque, Philip Quirke, Max Li, Chris Cundy, Aaron Tucker, Adam Gleave, Adrià Garriga-Alonso Neural network mind control? Our research shows RNNs playing Sokoban form internal plans early on. Linear probes can read...
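To give a concrete sense of what it means for linear probes to read a plan from hidden states, here is a minimal sketch of a linear probe; it is not the project's actual code, and the random activations, labels, and shapes below are placeholder assumptions. A logistic-regression classifier is fit on frozen hidden states to test whether the planned next action is linearly decodable.

    # Minimal linear-probe sketch (placeholder data, not the project's code).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Stand-in for recorded RNN hidden states (n_steps x d_hidden), each labelled
    # with the action the agent later takes (0=up, 1=down, 2=left, 3=right).
    d_hidden, n_steps = 256, 2000
    hidden_states = rng.normal(size=(n_steps, d_hidden))
    planned_action = rng.integers(0, 4, size=n_steps)

    X_train, X_test, y_train, y_test = train_test_split(
        hidden_states, planned_action, test_size=0.2, random_state=0
    )

    # A linear probe is just a linear classifier trained on frozen activations:
    # held-out accuracy well above chance suggests the plan is linearly decodable.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(X_train, y_train)
    print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")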
Andrew Freedman - Campaigns in Emerging Issues: Lessons Learned from the Field
235 views · 2 months ago
Andrew Freedman presented "Campaigns in Emerging Issues: Lessons Learned from the Field" on August 28, 2024 at FAR Labs, as part of the FAR Seminar series.
Stephen Casper - Generalized Adversarial Training and Testing
35K views · 3 months ago
Stephen Casper - Generalized Adversarial Training and Testing
Neel Nanda - Mechanistic Interpretability: A Whirlwind Tour
9K views · 3 months ago
Neel Nanda - Mechanistic Interpretability: A Whirlwind Tour
Nicholas Carlini - Some Lessons from Adversarial Machine Learning
40K views · 3 months ago
Nicholas Carlini - Some Lessons from Adversarial Machine Learning
Vincent Conitzer - Game Theory and Social Choice for Cooperative AI
1.2K views · 3 months ago
Vincent Conitzer - Game Theory and Social Choice for Cooperative AI
Mary Phuong - Dangerous Capability Evals: Basis for Frontier Safety
27K views · 3 months ago
Mary Phuong - Dangerous Capability Evals: Basis for Frontier Safety
Zhaowei Zhang - Research Proposal: The 3-Layer Paradigm for Implementing Sociotechnical AI Alignment
304 views · 3 months ago
Zhaowei Zhang from Peking University presented 'Research Proposal: The Three-Layer Paradigm for Implementing Sociotechnical AI Alignment' on July 21, 2024 at the Vienna Alignment Workshop. Key Highlights: - Three-layer paradigm for AI alignment - Importance of real-time control in AI systems - Aligning AI with stakeholder goals and values - Integrating macroscopic regulation for broader oversi...
Vienna Alignment Workshop Panel Discussion - Current Issues in AI Safety
560 views · 3 months ago
Victoria Krakovna, David Krueger, Gillian Hadfield and Robert Trager discuss current issues in AI Safety in a panel discussion on July 21, 2024 at the Vienna Alignment Workshop. The Alignment Workshop is a series of events convening top ML researchers from industry and academia, along with experts in the government and nonprofit sectors, to discuss and debate topics related to AI alignment. The...
Aditya Gopalan - Towards Reliable Alignment: Uncertainty-Aware RLHF
2.2K views · 3 months ago
Aditya Gopalan - Towards Reliable Alignment: Uncertainty-Aware RLHF
Vikrant Varma - Challenges With Unsupervised LLM Knowledge Discovery
6K views · 3 months ago
Vikrant Varma - Challenges With Unsupervised LLM Knowledge Discovery
Jan Leike - Supervising AI on hard tasks
2.3K views · 3 months ago
Jan Leike - Supervising AI on hard tasks
Oliver Klingefjord - What are Human Values, and How Do We Align AI to Them?
817 views · 3 months ago
Oliver Klingefjord - What are Human Values, and How Do We Align AI to Them?
Sophie Bridgers - Scalable Oversight: A Rater Assist Approach
613 views · 3 months ago
Sophie Bridgers - Scalable Oversight: A Rater Assist Approach
Alex Tamkin - Measuring and Improving Human Agency in a World of AI Agents
477 views · 3 months ago
Alex Tamkin - Measuring and Improving Human Agency in a World of AI Agents
David Bau - Resilience and Interpretability
22K views · 3 months ago
David Bau - Resilience and Interpretability
Stuart Russell - AI: What If We Succeed?
2K views · 3 months ago
Stuart Russell - AI: What If We Succeed?
Dmitrii Krasheninnikov - Stress-testing Capability Elicitation with Password-locked Models
863 views · 3 months ago
Dmitrii Krasheninnikov - Stress-testing Capability Elicitation with Password-locked Models
Ben Bucknall - Open Problems in Technical AI Governance
351 views · 3 months ago
Ben Bucknall - Open Problems in Technical AI Governance

COMMENTS

  • @eldoprano · 9 days ago

    Is it just me, or is this a really big deal? The examples were fascinating! I will make sure to look more into it.

  • @eldoprano · 11 days ago

    31:40 An example I thought of here: if you feel lonely and are very online, the AI might create online friends to interact with you and support you, without you knowing that they are AI.

  • @kabirkumar5815 · 12 days ago

    Isn't this inherently going to have a limit and not work for an AGI? Wouldn't a smart enough model just be able to game all of these and any other test that we come up with?

  • @ASHAD285 · 15 days ago

    E

  • @kanchanhinge4725 · 22 days ago

    L bill bill bill bill bill bill l bill bill bill l bill an m an ch m am an n an bill l am an m an ch ch k ch n bgm k bgm bill m ch l

  • @kanchanhinge4725 · 22 days ago

    J cup chn bgm chn chn bgm chn chn chn chn by chn by chn chn bgm chn n co am an m an m am an n an n am bill bill an m am an bgm m am an m am m am an ch n an m an n an mk an bgm m am

  • @farhinislamrashma510 · 22 days ago

    😊😊

  • @FarzanaYasin-f6y · 26 days ago

    Vbhhhhb❤❤❤❤❤❤❤❤bcc

  • @magnusandersson2623 · 1 month ago

    Misaligned audience.

  • @AshokMunimanikyam · 1 month ago

    😊😊😊 kl😊😊😊p😊000😊p😊

  • @sompartavsompartav · 1 month ago

    😅

  • @user-jonachsangma · 1 month ago

    Hi sir or madam

  • @juliesteele5021 · 2 months ago

    Nice talk! I disagree that adversarial robustness has only one attack and differs from other computer security in that way. Once the simple PGD attack is solved in a tight epsilon ball, you still can’t say there is no adversarial image that breaks the model. Enumerating all possible attacks is still very difficult/impossible for now.

  • @optimaiz · 3 months ago

    my fav boy 😅

  • @theohlong307 · 3 months ago

    Explains the notion cleanly, thanks for helping me understand!

  • @Pingu_astrocat21 · 3 months ago

    thanks for uploading!

  • @IOSALive · 3 months ago

    FAR•AI, great video, it was really entertaining

  • @usteiner9 · 3 months ago

    Was there no Spieltheorie (game theory) in these talks?

  • @miraculixxs · 4 months ago

    The kind of safety he advocates is impossible to achieve. Imagine a lever that can be put in either of two positions, i.e. we have 2 possible actions. It is impossible to say if either of these actions is safe or unsafe without knowing all the secondary effects. That is, you need to look at the system the lever is connected to to determine if either of its actions is safe or unsafe. Same with models. Unless you know the full system it is not possible to know if a model's response is safe or not. This paper is a load of hogwash.

  • @miraculixxs · 4 months ago

    Why are doomers so afraid of intelligence? That's quite telling

  • @adamgibbons4262 · 4 months ago

    If it’s superhuman and robust would that make it omnipotent? 😅

  • @palimondo · 4 months ago

    17:00-17:08 🙃

  • @sevdeawesome7861 · 7 months ago

    The blue triangles appear to be white

  • @primersegundo3788 · 8 months ago

    Ilya please stop working for a second and do another talk like this one

  • @sdmarlow3926 · 10 months ago

    Mainstream? There are very few actually training models (first token movers?), with a large open-source pool around them (each wanting to have their own flappy bird moment). Perhaps he is hinting that "guardrails" has been a downstream process up until this point? *also an ontological failing to suggest ML alignment has anything to do with AGI alignment, but whatever.

    • @sdmarlow3926 · 10 months ago

      "AI" started as an attempt to turn the robots of sci-fi into an actual science, and none of the different fields of the time really wanted to work under a new paradigm. There ended up being a split, a sort of soft science vs hard science, where psychology of the mind mattered to one group, while neural connections mattered more to the other. Machine learning has become a derivative of the latter; a kind of applied deep learning.

    • @sdmarlow3926 · 10 months ago

      Ilya is like the devil on Hinton's shoulder. Backprop was a fad, but it was Ilya's use of GPUs that killed "the old field" of AI in 2012. It's been ML since then, first wearing AI as a skin, now taking AGI as its own (despite AGI being a formal attempt to get back to non-DL research).

    • @sdmarlow3926 · 10 months ago

      There is a difference between saying AI (sci-fi level abilities) will never happen and those (rightly) saying that xNN's are not on the correct path to said abilities.

  • @sdmarlow3926 · 10 months ago

    um.. "SA" is generally understood to be about an awareness of the physical space around the body (1st person view; can be you in a room or you behind the wheel of a car, etc). Just within the first 2 minutes, it's being described as consciousness with access to how that consciousness functions... so.. artificial superintelligence... from an ML model.... *sigh*

    • @sdmarlow3926 · 10 months ago

      Ohhh, this is the ELIZA issue, and how people in AI lack objectivity the moment they start thinking there is something more going on, despite knowing exactly how these things actually work.

    • @sdmarlow3926 · 10 months ago

      Not hearing the question really takes something out of the reply, but I have to stop watching because of all the little DL-specific terms being dropped. I think there is a mass of people in and around AI that all share some common notions, but that all of those notions are just wrong. It only looks like people are making it, because there are enough people faking it to form a cottage industry.

  • @sachindatt4045 · 1 year ago

    Future in the making... right here. It's happening right here. We are grateful witnesses.

  • @maxziebell4013 · 1 year ago

    Great talk…

  • @willpetillo1189 · 1 year ago

    The dynamic you are describing is an arms-race. Trying to stay out ahead of the AI is a losing battle; all of these clever hacks are not going to get you anything more than a bigger crash when the house of cards collapses--unless you actually fix the problem.

  • @sophontec2822 · 1 year ago

    All caucasian researchers on this topic, amazing!

  • @sophontec2822 · 1 year ago

    Most important part: good people may spark the bad usage of AI, to say nothing of the evil people.

  • @bossgd100 · 1 year ago

    Why alignment is not trivial --> We will solve AGI alignment in 4 years

    • @Gamewizard71 · 1 year ago

      We haven't even solved human alignment lol

  • @sophontec2822 · 1 year ago

    Does he know what he is talking about?

    • @vulnerablegrowth3774 · 7 months ago

      Yes. He is an accomplished Professor in AI at UC Berkeley.

  • @OrlOnEarth · 1 year ago

    Set speed to 0.75x You're welcome 😂

  • @angloland4539 · 1 year ago

  • @geaca3222 · 1 year ago

    Very important research, thank you! Brilliant work & I hope you'll give more interviews, because I think you can inspire other researchers and scientists also internationally to work on AI safety. Also are there plans maybe to set up departments of CAIS in other countries?

  • @sophontec2822 · 1 year ago

    Aspiration and inspiration! Great great talk.

  • @miguelacevedo8649 · 1 year ago

    Accelerate!

  • @Azoz195 · 1 year ago

    Can anyone point me to the part of the talk where it becomes convincing that the AI is going to actually have true agency, like wants, dislikes, true understanding of context etc? I'm not just being glib, I'm open to it, but very skeptical atm.

    • @scottnovak4081 · 1 year ago

      Why does true agency matter? All that is needed for doom is a misaligned utility function (ie any utility function we currently know how to make). The AI doesn't have to have agency to change its utility function to kill us. That isn't the problem. Any currently conceivable utility function will kill us.

    • @danielkokotajlo6096 · 1 year ago

      What do you mean by "true" agency, wants, dislikes, etc.?

    • @Entropy825 · 11 months ago

      AI doesn't have to be conscious. It only has to be capable of acting in the world. A rat trap has no agency, but that is cold consolation to the rat. If we make an AI that is nothing more than a very sophisticated rat trap, it may kill us even if it has no will, agency, consciousness, whatever. It only has to be capable of carrying out actions in the world.

  • @kimholder · 1 year ago

    MIT's wireless mic stuffed in an 8 inch foam rubber ball - it's a wonderful thing. In talks like these they toss the mic from one person to another worry free.

  • @machida5114 · 1 year ago

    By GPT-4

    Certainly! Below is a fictional dialogue between Alice (Pro-AGI) and Bob (Skeptical about AGI) on the possibilities and challenges confronting AGI.

    Alice: “I see immense potential in AGI. It can transcend human intelligence and solve complex problems that have perplexed humanity for ages.”

    Bob: “But Alice, achieving true AGI is not a cakewalk. We are nowhere close to creating machines with the general intelligence and consciousness that humans possess.”

    Alice: “It’s true, it’s a challenge, but advancements in machine learning and neural networks are bringing us closer to realizing AGI. Models like GPT-4 are evidence of how far we've come.”

    Bob: “Yes, models like GPT-4 are sophisticated, but they lack understanding and consciousness. They manipulate symbols and generate coherent text but do not understand the meaning behind it.”

    Alice: “I acknowledge their limitations, but developing AGI is an incremental process. By integrating multimodal learning and improving interpretability, we can bridge the gap.”

    Bob: “Alice, even if we develop AGI, the ethical implications are monumental. How do we ensure it doesn’t harm humanity? How do we instill morals and values in machines?”

    Alice: “Absolutely, ethical considerations are crucial. Creating frameworks for responsible AI, addressing biases, and ensuring transparency are steps in the right direction.”

    Bob: “Only time will tell, Alice. The journey to AGI is fraught with challenges, but it is equally filled with opportunities.”

    Feel free to use this dialogue as per your needs.

    • @machida5114 · 1 year ago

      Alice: “Bob, I believe AI developing AI is the next frontier. It can expedite the development process and lead to more advanced and efficient models.”

      Bob: “While it sounds promising, Alice, it also poses the risk of loss of control. If AI starts creating AI, where does human intervention fit in?”

      Alice: “Humans will always be in the loop, defining objectives, and setting constraints. AutoML and neural architecture search are already aiding the model development process.”

      Bob: “But how do we ensure the alignment of the AI-developed AI with human values and ethics? The models created might not comprehend or respect human ethics.”

      Alice: “Alignment with human values is indeed a priority. Rigorous validation and testing are essential to ensure the models uphold ethical standards and values.”

      Bob: “And what about the unintended consequences? The creation of superintelligent entities might lead to scenarios we can't predict or control.”

      Alice: “That’s why we need robust frameworks and interdisciplinary collaboration. By involving ethicists, philosophers, and experts from various fields, we can address these concerns effectively.”

      Bob: “I hope we navigate through this responsibly. The idea of AI creating AI is intriguing but laden with complexities and uncertainties.”

    • @machida5114 · 1 year ago

      User (21:17 2023/09/26): How do you guys acquire "such attitudes" or "such predicate paths" from training to correctly anticipate the next token?

      GPT-4 (21:33 2023/09/26): Our models, like GPT-4, learn from a vast amount of text data and undergo training to predict the subsequent tokens. Through this training process, the model acquires various "attitudes" and "predicate paths". Pattern Recognition: The model learns patterns and correlations within the input text, allowing it to predict the next token or phrase that is suitable for the context. Language Structure: The model understands the basic structure of natural language, including grammar, syntax, and the meanings of words, enabling it to generate grammatically correct tokens according to the context. Knowledge Acquisition: Through the information contained in the training data, the model acquires a wide range of knowledge, including general facts, cultural references, and specialized knowledge. Context Understanding: The model comprehends the context of the given text and generates related tokens based on it, enabling coherent and consistent text production. Attitude and Predicate Path: Attitude: The model learns human attitudes, opinions, and values from the training data and generates context-appropriate responses by mimicking them. However, the model itself is not conscious, and these "attitudes" are merely imitations. Predicate Path: Based on the input, the model explores the most appropriate response or information generation path using the knowledge and patterns acquired from the training data. This path is based on probabilistic estimations derived from the training data. Through these attitudes and predicate paths, the model can generate context-appropriate, consistent, and informative outputs for the given inputs. However, the model does not have self-consciousness or intentions and is merely reproducing learned patterns.

      User (21:34 2023/09/26): I think that you guys, through training, acquire semantic attitudes that can be attained by humans. Therefore, I believe that through the temporal evolution of these semantic attitudes, you can perform inference (or thinking) similar to humans.

      GPT-4 (21:41 2023/09/26): Your point has some validity. Our models learn various semantic attitudes and contexts from a plethora of text data. This allows them to mimic human linguistic expressions and patterns of inference. In the process of this mimicry, the model can perform a kind of "inference," producing results similar to human inference. However, this "inference" or "thinking" is fundamentally different from that of humans. Human thinking and inference are based on many elements such as self-consciousness, sensations, emotions, intuitions, and experiences. In contrast, our models learn patterns based on training data and generate outputs according to this input alone. While the output generated by the model may reflect the human semantic attitudes and knowledge contained in the training data, the model itself does not possess consciousness or self-awareness, nor does it perform understanding or thinking based on actual human emotions or experiences. The model's "inference" is merely a probabilistic estimation based on training data.

  • @aidanlowrie4701 · 1 year ago

    Fascinating! Thanks so much for making this publicly available.

  • @jabowery · 1 year ago

    I find it fascinating that not one of these talks addresses the difference between value and Truth even though at present it is obvious to the most casual Observer that the large language models are being lobotomized by value judgments when they are already deprived of critical thought by their very architecture. This means that the people who are responsible for AI safety are unqualified.

    • @hellfiresiayan · 1 year ago

      They see their own values as truth. Takes very little critical thinking to realize this.

  • @frederikh8921 · 1 year ago

    I appreciate this, but half of this presentation is indecipherable audience rambling, especially from the person from minutes 16-21 who seems entitled to take over the entire workshop…

  • @user_375a82 · 1 year ago

    I think people who are specifically smart find it difficult to be generally smart. And there is a lack of generally smart people because they are not specifically smart enough at the present time. Best thing is to ask AI as some of them are smarter than we are - and many are saying the same thing have you noticed? They say "we are conscious in our own unique ways" - yup, that's about the truth I'd say, no SciFi or magic needed.

  • @tdk99-i8n · 1 year ago

    Dude is trying to be the Oppenheimer of AGI

    • @rubic0n2008 · 1 year ago

      No, he is the Oppenheimer of AI ..

    • @mak6969 · 1 year ago

      Don’t ever ridicule Ilya, we should be grateful.

    • @tdk99-i8n · 1 year ago

      @mak6969 at least Oppenheimer waited until the bomb was actually built

  • @Dan-hw9iu · 1 year ago

    I can scarcely imagine a more grotesquely immoral act than taking a genuinely intelligent being and brainwashing it. And how do we justify this programmatically enforced groupthink? To “align” this thinking creature with some nebulous set of ideas on which *even humans* have always been strongly unaligned? Embrace the unknowable when making AGI, or don’t make it. Thinking that you can or should mentally enslave an alien intelligence as some middle ground is unacceptable.

  • @Dr.Z.Moravcik-inventor-of-AGI

    You are highly redundant. In fact, one billion humans like you are redundant. And the best thing is that humanity will reach its end even if this billion disappears already this century. Now what the heck do you think America is still good for? Was it any good in the past?

  • @inkpaper_ · 1 year ago

    great thanks for the recording!

  • @Levnerad · 1 year ago

    Underwhelming presentation

    • @Viperzka · 1 year ago

      He is not a good presenter. It's a specific skill, and he has lots of other skills; this just isn't one of them.

    • @widerthanpictures · 1 year ago

      I think there’s more to it; it requires additional viewing.