Mandi Zhao: New Perspectives on Harnessing Foundation Models for Robot Learning

  • Published 12 Oct 2024
  • Abstract: Foundation models are both appealing and perplexing to roboticists: they possess clearly superior capabilities in reasoning about and understanding our world that should be useful for robots; however, these high-level pre-trained capabilities often fall short on low-level embodied tasks, so it is often unclear how to effectively incorporate such models into robotic systems. In this talk, I will present two recent projects that explore different approaches to this problem, in the hope that the lessons and discussions here will spark new ideas and directions in the future. First is RoCo: Dialectic Multi-Robot Collaboration with Large Language Models (arxiv.org/abs/...), which uses zero-shot Large Language Models (LLMs) as a communication tool to facilitate multi-robot collaboration (a minimal sketch of this dialogue pattern appears after the bio). Second is Real2Code: Reconstruct Articulated Objects via Code Generation (arxiv.org/abs/...), where we adapt both LLMs and large pre-trained vision models to propose a new approach to the Real2Sim2Real problem for articulated objects.
    Bio: Mandi Zhao is a third-year PhD student at Stanford University advised by Prof. Shuran Song. She earned her B.S. and M.S. from UC Berkeley, advised by Prof. Pieter Abbeel at Berkeley AI Research (BAIR). She has interned at Meta AI, working on generative models for augmenting multi-task robot learning, and at the NVIDIA Seattle Robotics Lab, working on dexterous robot manipulation. She is broadly interested in AI for robotics: data-driven approaches that enable embodied systems to perceive, reason, and make sequential decisions in the real world.
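
For readers curious what "LLMs as a communication tool" for multi-robot collaboration might look like mechanically, below is a minimal, hypothetical Python sketch of a dialectic dialogue loop between LLM-backed robot agents. This is not RoCo's actual implementation; every name here (RobotAgent, llm_fn, negotiate_plan, MAX_ROUNDS, the "AGREE:" convention) is an illustrative assumption.

```python
# Hedged sketch of a dialectic multi-robot dialogue loop: each robot is
# represented by a zero-shot LLM agent, and the agents exchange natural-
# language messages until they converge on a joint plan. All interfaces
# below are assumptions made for illustration, not the paper's API.

from typing import Callable

MAX_ROUNDS = 5  # assumed cap on dialogue rounds before giving up

class RobotAgent:
    """One robot, voiced by a zero-shot LLM."""

    def __init__(self, name: str, llm_fn: Callable[[str], str]):
        self.name = name
        self.llm_fn = llm_fn  # any text-in/text-out model wrapper

    def respond(self, task: str, transcript: list[str]) -> str:
        # The agent sees the shared task and the dialogue so far, then
        # proposes a plan, critiques the current one, or agrees.
        prompt = (
            f"You are robot {self.name} collaborating on: {task}\n"
            "Dialogue so far:\n" + "\n".join(transcript) +
            "\nReply with a proposal, a critique, or 'AGREE: <plan>'."
        )
        return f"{self.name}: {self.llm_fn(prompt)}"

def negotiate_plan(task: str, agents: list[RobotAgent]) -> str | None:
    """Round-robin dialogue until every agent agrees on one plan."""
    transcript: list[str] = []
    for _ in range(MAX_ROUNDS):
        for agent in agents:
            transcript.append(agent.respond(task, transcript))
        # Assumed consensus convention: all latest replies say AGREE.
        latest = transcript[-len(agents):]
        if all("AGREE:" in m for m in latest):
            return latest[-1].split("AGREE:", 1)[1].strip()
    return None  # no consensus; a caller could replan or escalate

# Toy stand-in LLMs, just to make the sketch executable end to end:
plan = negotiate_plan(
    "move the cube into the bin",
    [RobotAgent("Alice", lambda p: "AGREE: Alice grasps cube, Bob opens bin"),
     RobotAgent("Bob",   lambda p: "AGREE: Alice grasps cube, Bob opens bin")],
)
print(plan)
```

The point of the sketch is only the structure: planning happens in natural language, so any text-in/text-out LLM can be dropped in for llm_fn without robot-specific fine-tuning.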
