How to build a real-time AI assistant (with voice and vision)

Underfitted

Published 1 Dec 2024

COMMENTS • 159

@toddroloff93 5 months ago ⁺¹⁵
Incredible video. You're taking your content to the next level. Keep up the good work, and thank you for all you do.
@Sachin-ww1ej 1 month ago ⁺⁶
Hello! I just wanted to take a moment to express how much I love your videos. They’ve inspired us to work on a project, but we’ve hit a wall with an error that we can't seem to overcome. After struggling for nearly 7 hours, we’re starting to lose hope. If you could lend us a hand, we would be incredibly thankful. It’s just a small request from two 14-year-olds who are eager to learn from you. Thank you for considering it! ❤
@ZahedAshkara-q6u 4 months ago ⁺⁵
Sir, your AI voice assistant demos are some of the most valuable and appreciated YouTube videos I have come across. Please keep them coming. It would also be great to do a demo with Groq to solve the latency issue. You are doing great work, man, and your students really appreciate it! Thanks a lot, brother!
@bimalnair 3 months ago
Absolutely fabulous! Thanks for making this one! I loved it!!
@moacirosa 5 months ago ⁺²
Amazing content with solid explanation. Thanks very much 👏
@underfitted 5 months ago
Glad you liked it!
@Sachin-ww1ej 1 month ago
Hi! I just wanted to say how much I appreciate your videos. They’re so inspiring! We’re currently working on a project and tried to follow your amazing example, but we’ve encountered an error that we just can’t figure out. After almost 7 hours of struggling, we’re feeling really disheartened. If you could help us out, it would truly mean everything to us. We’re just two 14-year-olds trying to learn. Thank you for even considering this! ❤
@ronsinolast 4 months ago ⁺¹¹
Hi, this is great. I tried it and it's working. I introduced myself, then asked, "Do you know my name?" The response was "I'm not able to remember past conversations." So, can we make it remember the conversations, and also "remember" my face?
@scott701230 4 months ago ⁺⁵
That's actually a good question. I saw a paper published about persistent memory: short-term memory, medium- and long-term memory, and building an automated RAG system that automatically moves information into long-term memory, so we can create an assistant that's goal-oriented and proactively manages you toward meeting your goals.
@zaves1 2 months ago
That's where profiles come in. Similar to how desktops have user accounts, I'm sure you could build a user profile with context about you personally that the AI could reference when asked personal questions or when it needs that type of context.
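The profile idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the video: the file name `profile.json`, the helper names, and the way facts are appended to the system prompt are all hypothetical, and it covers text facts only (remembering a face would need a separate face-recognition step).

```python
import json
from pathlib import Path


def load_profile(path):
    """Return the stored user profile, or an empty dict if none exists yet."""
    profile_file = Path(path)
    if not profile_file.exists():
        return {}
    return json.loads(profile_file.read_text(encoding="utf-8"))


def save_profile(path, profile):
    """Persist the profile so it survives between conversations."""
    Path(path).write_text(json.dumps(profile, indent=2), encoding="utf-8")


def build_system_prompt(base_prompt, profile):
    """Append known facts about the user to the assistant's system prompt."""
    if not profile:
        return base_prompt
    facts = "\n".join(f"- {key}: {value}" for key, value in sorted(profile.items()))
    return f"{base_prompt}\n\nKnown facts about the user:\n{facts}"


if __name__ == "__main__":
    # Hypothetical usage: remember a name, then reuse it in a later session.
    profile = load_profile("profile.json")
    profile["name"] = "Ron"
    save_profile("profile.json", profile)
    print(build_system_prompt("You are a helpful voice assistant.", profile))
```

The assistant would then be started with the prompt returned by `build_system_prompt`, so a question like "Do you know my name?" can be answered from the stored facts instead of from conversation history.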
@davieslacker 5 months ago ⁺³
Really cool stuff... I def plan to recreate some of these things along with you when I have a bit more time at my computer. Just a thought, adding screen capture in with this would be pretty cool too to get help with whatever applications you're in... I would imagine you could include both camera and screenshot images in the same context and it should be able to distinguish which you're asking about.. or build a different tool that it can function call for that. Can't wait until we get some slightly more expressive voices as an option like OpenAI teased us with.
@Sachin-ww1ej 1 month ago
Hey! I really love your content! We have an upcoming project and were blown away by your amazing work. We’ve done quite a bit ourselves, but we’re stuck on an error we can't seem to fix, even after 7 hours of trying. We're feeling pretty hopeless and would be so grateful for any guidance you could offer. It’s just a heartfelt request from a couple of inspired 14-year-olds. Thank you for your time! ❤
@codemonkey2k5 3 months ago ⁺⁷
Is there any way you could do a version of this that can use a locally run Ollama server? Even if it means that I lose the image feature.
@johanransbygranberg2219 2 months ago
My first thought too!!
@Sachin-ww1ej 1 month ago
Hello! I wanted to say how much I enjoy your videos. We’re currently working on a project and your work has been a huge inspiration for us. However, we’ve run into an error we can’t fix, and despite working on it for almost 7 hours, we’re ready to throw in the towel. If you could help us out, we’d really appreciate it! Just a simple request from two 14-year-olds eager to learn from you. Thanks so much! ❤
@Sachin-ww1ej 1 month ago
Hi there! I absolutely love your videos. We’re working on a project and found your work truly inspiring. While we've managed to get quite a bit done, we've hit a wall with an error that we just can't seem to resolve. After nearly 7 hours of trying, we’re feeling pretty defeated. If you could lend us a hand, we would be so grateful. It's just a small request from a couple of 14-year-olds who admire your amazing talent. Thank you for considering it! ❤
@riemannderakhshan1037 5 months ago ⁺¹
You took your videos to the next level, which is pretty amazing. I would like to ask you, if possible, to show us how to use open-source models in those apps. Thank you in advance.
@sumitdevraye9725 5 months ago ⁺¹
Great video. Keep these coming.
@minusface1827 1 month ago ⁺¹
I am stuck at the part where we create environment variables for the keys and stuff. How do we do that, and with which file, or where do we have to put the keys???
@SaddamBinSyed 3 months ago ⁺⁴
Hi, thanks for the nice video. Can we use a local LLM (like Ollama) instead of a paid one?
@iamaresellerinkerala 1 month ago
yes
@iitjeephysics2789 3 months ago ⁺²
from livekit import agents, rtc
ImportError: cannot import name 'agents' from 'livekit' (unknown location). I am getting this error.
@dheerajmadaan866 4 months ago ⁺¹
This was really cool stuff. Thanks for sharing such quality stuff. I ran it in VS Code and it worked. The main problem is the latency. It took about 10 s for the conversation. Not sure if it is because of the free account or whether their WebSocket API has an issue.
@Sachin-ww1ej 1 month ago
Hello! I just wanted to say how much I enjoy your videos. We’re currently working on a project that’s inspired by your amazing work, but we’ve hit a snag with an error we can’t figure out. After trying for almost 7 hours, we’re feeling pretty stuck. If you could help us out, we’d be incredibly grateful! It’s a simple request from two 14-year-olds who look up to you. Thank you so much! ❤
@Sachin-ww1ej 1 month ago
Hi! I just wanted to express how much I enjoy your videos. They’ve been a huge inspiration for us. We’re currently working on a project, but we’ve hit a wall with an error we can’t resolve. After nearly 7 hours of trying, we’re feeling really lost. Any help you could provide would mean the world to us. Just a simple request from two 14-year-olds trying to learn. Thank you so much! ❤
@kalash9114 25 days ago
Hi there! Looking good! However, there is an issue with the agent receiving the camera feed. I get "Trigger Vision Capability" but nothing happens afterwards. Any idea? Many thanks!
@pienik.delrieu 26 days ago
Hi, is there a repo for the project, the same as the one you showed in the video?
@Sachin-ww1ej 1 month ago
Hey! I just wanted to say how much I appreciate your content. Your videos inspire us to take on new projects. We’ve been trying to recreate some of your amazing work, but we’ve hit a frustrating error that we can’t figure out. After nearly 7 hours of effort, we’re feeling pretty lost. Any help you could offer would be incredibly appreciated. It’s a small request from two 14-year-olds who admire you. Thank you! ❤
@gundamfreedom006 16 days ago
Wow, a million likes for this video! Can you break down the steps to create it?
@ashishtandi4440 3 months ago ⁺¹
Incredible! I tried the same thing and there is a noticeable delay. I am not sure if it is the TTS, the STT, or the LLM API itself. Meanwhile, yours and the default demo at LiveKit are damn fast.
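One way to find out which stage causes the delay is to time each one separately. The sketch below is a generic Python illustration, not LiveKit code: `run_turn`, `timed`, and `slowest_stage` are hypothetical helpers, and the `stt`, `llm`, and `tts` arguments are stand-ins for whatever functions call the real services in your setup.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Record how long the wrapped block takes under timings[stage], in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


def slowest_stage(timings):
    """Return the name of the stage with the largest recorded duration."""
    return max(timings, key=timings.get)


def run_turn(stt, llm, tts, audio):
    """Run one speech -> text -> reply -> speech turn and time each stage."""
    timings = {}
    with timed("stt", timings):
        text = stt(audio)
    with timed("llm", timings):
        reply = llm(text)
    with timed("tts", timings):
        tts(reply)
    return timings


if __name__ == "__main__":
    # Stand-in stages; replace these lambdas with calls to the real services.
    timings = run_turn(
        stt=lambda audio: "hello",
        llm=lambda text: (time.sleep(0.05), "hi there")[1],
        tts=lambda reply: b"",
        audio=b"",
    )
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds * 1000:.1f} ms")
    print("slowest:", slowest_stage(timings))
```

If one stage dominates, that is the place to try a faster model or provider first. Network distance to the services also counts toward each measurement.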
@GameCasters 1 month ago
I'm waiting for this to come to mobile devices because I always have my phone with me. It'll be like always having my little expert with me to answer my questions.
@Migueldicostanzo 24 days ago
Great video. So what do I need to do to run this as an app or on a mobile device? Can you guide me? Do you charge to teach? Thank you.
@Sachin-ww1ej 1 month ago
Hey! I hope you don’t mind me reaching out. Your videos are a huge inspiration to us. We’ve been working on a project and were excited to try our hand at something similar, but we’ve run into an error that we can’t solve. After nearly 7 hours of effort, we’re feeling pretty defeated. If you could offer any guidance, we would be forever grateful. Just a humble request from two 14-year-olds who look up to you. Thank you for your kindness! ❤
@7BlackJack8 5 months ago ⁺³
Can it be used with Google Flash? Thanks for the super content!❤
@Max-n1p5h 1 month ago
Unfortunately, after changes on LiveKit, your code no longer works; they no longer allow this visualization. Can you please update it?
@Sachin-ww1ej 1 month ago
Hi! I hope you’re doing well. I can’t express how much I admire your videos. They’ve inspired us to start a project, but we’ve hit a major roadblock with an error we can’t solve. After nearly 7 hours of trying, we’re feeling quite stuck. If you could offer any assistance, we would be so grateful. Just a small request from two 14-year-olds who look up to you. Thank you! ❤
@rakeshkumarrout2629 18 days ago
Hey, how can we integrate our own custom frontend into it?
@Sachin-ww1ej 1 month ago
Hi there! I hope you’re doing well. I just wanted to reach out to say how much I admire your videos. We’re working on a project inspired by your amazing work, but we’ve run into a problem we can't fix. We’ve spent almost 7 hours trying to resolve this error, and we’re feeling really overwhelmed. If you could spare a moment to help us, it would mean the world to us. We’re just a couple of 14-year-olds trying to learn and grow. Thank you for considering our request! ❤
@edgarl.mardal8256 5 months ago
Hi, I am working on creating a closed LAN network, using peer-to-peer, and will add a live AI agent, stored locally, that gets its knowledge from an LLM. I wonder if it is possible to run this kind of system without using the internet?
@vocapal2024 3 months ago ⁺¹
Sir, what is the latency? The same as GPT-4o's demo, or much longer?
@delapeakierven8491 2 months ago
Sir, can you also include capturing the device screen? Thank you.
@Ryguy12543 1 month ago
This is huge. Love these two videos. Thanks for introducing me to this. I have made so many function calls with LiveKit assistants and am wondering: how do you think we can make the function calls more consistent? Can we use keywords or key phrases? Thanks again.
@Ryguy12543 1 month ago
I've tried adding different descriptions to the AssistantFunctions() and tried including references to the function calls in the system message, but it seems a bit inconsistent with processing the latest_image :)
@AndreyMakavelli 5 months ago
Great info, thx! Is there a way to use a local LLM (like Ollama, local AI, etc.) on this platform instead of OpenAI?
@mystealthlife6991 1 month ago
How can I use this with locally hosted Ollama?
@InAMinute-ws3yv 2 months ago
Hi, can you please add vision that sees the content of the laptop screen, with the same voice conversation? Then it would actually be more useful.
@huangphoenix 4 months ago
Great video, keep going. Just wondering if you can add a barge-in function?
@Sachin-ww1ej 1 month ago
Hello! Your videos are absolutely amazing, and they’ve motivated us to work on a project of our own. However, we’ve run into a challenging error that we can’t seem to fix. After almost 7 hours of struggling, we’re feeling quite overwhelmed. If you could help us out, we would be incredibly thankful. It’s just a humble request from two inspired 14-year-olds. Thank you for considering it! ❤
@Sachin-ww1ej 1 month ago
Hi! Your videos are fantastic! We’re tackling a project and were inspired by what you’ve created. However, we’ve encountered a frustrating error that we can't solve, despite nearly 7 hours of effort. We’re on the verge of giving up and would greatly appreciate any help you could provide. This is just a humble request from two 14-year-olds who admire your work. Thank you for considering it! ❤
@sharplcdtv198 5 months ago ⁺¹
Your code generally doesn't run in VS Code on Windows... some things seem platform dependent, unfortunately.
@underfitted 5 months ago
I don’t think it’s a problem with my code… it’s a problem with Windows. Try WSL.
@AmitMarx-ei8tt 4 months ago ⁺³
Got stuck with the API keys. I'm not sure how to set them.
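API keys are usually passed to a script like this as environment variables, set in the same terminal before the agent starts (for example `export NAME=value` on macOS, Linux, or WSL, and `$env:NAME = "value"` in PowerShell). The Python sketch below only checks which ones are missing. The variable names are assumptions based on the names the LiveKit, OpenAI, and Deepgram SDKs conventionally read, and `missing_keys` is a hypothetical helper, so check both against the project's README.

```python
import os

# Assumed names; confirm them against the project's README before relying on them.
REQUIRED_KEYS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
)


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Set these environment variables before starting the agent:")
        for name in missing:
            print(f"  {name}")
    else:
        print("All required environment variables are set.")
```

Running this in the same terminal session that launches the agent matters, because variables set in one terminal are not visible in another.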
@dmitrypehovski 5 months ago
Hi, I started testing with all your steps and got stuck on the fact that text and audio from the OpenAI API are not transferred to LiveKit. All requests pass in the terminal. I tried many solutions... nothing works.
@densonsmith2 5 months ago
I think I may have a similar issue on Windows; there is some problem with the ffmpeg library.
@gabeclements 2 months ago
At minute 5:16 you talk about asyncio.create_task(_answer(msg.message, use_image=False)). I want the assistant to always reply with a variation of one specific response. How do I make it say something specific?
@Sachin-ww1ej 1 month ago
Hello! I just wanted to let you know how much I admire your videos. They’ve really motivated us to start a project of our own. However, we’ve come across an error that we simply can’t fix, and after almost 7 hours of effort, we’re feeling quite disheartened. If you could offer any guidance, we would be eternally grateful. This is just a small request from two 14-year-olds trying to learn and grow. Thank you! ❤
@solanobordim 3 months ago
This could be very useful for blind people. Thank you
@sengosy 5 months ago
Sir, can you help me figure out why my assistant isn't talking back? Nothing happens, but it recognizes what I'm saying in the chat.
@jimmywang6177 5 months ago ⁺¹
very interesting! thank you!
@WTF-Zone 1 month ago
I cloned your repo and followed along, but I can't access the video; it still shows the message 'waiting for video track'.
@home-s1s 24 days ago
I am having the same problem no matter what I do. Everything else works; it's just waiting on the video track even though my video feed is showing up on the right. Did you ever figure it out?
@chrismcnabb797 17 days ago
Same problem.
@aaronwenniger7966 5 months ago ⁺²
Now I keep running into trouble when using this code.
I would love to be able to discuss this so I can get it fixed. I want to implement some features to see if it can work for something else too.
@AI_by_AI_007 5 months ago ⁺¹
Yes the API keys do not pass -- what are you experiencing?
@aaronwenniger7966 5 months ago
@AI_by_AI_007 Hi, yes.
So I had to rework the code a little bit to get everything working again.
And now it's working great, except that the voice of the AI is not working and I cannot give voice commands anymore.
@Noahperaudon 5 months ago
@aaronwenniger7966 What did you do for the LiveKit API key?
@Noahperaudon 5 months ago
How did you handle the LiveKit API key?
@aaronwenniger7966 5 months ago ⁺¹
@Noahperaudon ?
@jameszhang2832 5 months ago
Fantastic, thank you very much. How would you adapt your code if you have multiple participants?
@Sachin-ww1ej 1 month ago
Hello! I hope you’re having a great day. Your videos have been such a source of inspiration for us. We’re working on a project based on your work, but we’ve encountered an error that has us stumped. We’ve spent nearly 7 hours trying to fix it, and we’re feeling quite discouraged. If there’s any chance you could assist us, we would truly appreciate it. This is just a humble request from two 14-year-olds who look up to you. Thank you! ❤
@Sachin-ww1ej 1 month ago
Hi! I hope you don’t mind me reaching out. Your videos have had such a positive impact on us. We’re working on a project inspired by your work, but we’ve hit a snag with an error we can’t resolve. We’ve been at it for nearly 7 hours and are feeling pretty defeated. If you could help us in any way, it would mean the world to us. Just a humble request from two 14-year-olds. Thank you so much! ❤
@amazingvideos4824 3 months ago
Man, this is amazing.
Can we deploy it to the cloud so it works from anywhere?
I deployed it to Heroku, but it's not accessing the webcam.
@rithikkumar7683 5 months ago
I hope we can use Gemini 1.5 Pro? I will try to make these changes in the old code.
@ridhwanbakare3406 4 months ago
This is really cool. As someone with Python knowledge, how would you suggest I get started?
Any roadmaps or videos you've published?
@SUWARNASHUKLA 1 month ago
The video is very helpful.
@mehmetbakideniz 4 months ago
Great video as always. Does this system keep chat history?
@Sachin-ww1ej 1 month ago
Hi! I just wanted to reach out and say how much your videos inspire me. We’re currently working on a project and have been trying to replicate your amazing work. Unfortunately, we’ve run into a tricky error that we can’t seem to fix, and after almost 7 hours of trying, we’re feeling a bit lost. If you could help us in any way, we would be incredibly grateful. It’s just a simple request from two 14-year-olds hoping to learn from you. Thank you so much! ❤
@andriusem 5 months ago
Hi, great video! How can I change the source code so that it captures my screen (desktop)? Thanks.
@裕也藤原-r7o 4 months ago
I wish the code would process the timeline.
@Sachin-ww1ej 1 month ago
Hi there! I’m a big fan of your work and wanted to say how inspiring your videos are. We’ve been working on a project, but we’ve hit a wall with an error we can’t solve. After about 7 hours of trying, we’re on the brink of giving up. If you could spare a moment to help us out, it would mean so much to us. Just a humble request from two eager 14-year-olds. Thank you for considering it! ❤
@abdiasj3692 4 months ago
Would love to see how to implement Deepgram TTS instead of OpenAI!
@underfitted 4 months ago ⁺¹
It’s actually very simple: simpler than what I had to do to get OpenAI working
@abdiasj3692 4 months ago
@underfitted Hey, thanks for replying! This would be awesome! Maybe also using OpenRouter as well! Wild ideas come to mind!
@iamaresellerinkerala 1 month ago
I got an error on the connector.
@mehershahzad-n5s 2 months ago
Amazing 🙂
@ind1ff3rent15 2 months ago
Can you build it with Ollama?
@Sachin-ww1ej 1 month ago
Hello! I’m a huge fan of your videos and the incredible work you do. We’ve been trying to create something similar for a project, but we’ve hit a major roadblock. Despite working for nearly 7 hours, we can’t seem to fix this error. We’re feeling pretty lost and would be so grateful for any help you could provide. This is just a humble request from two inspired 14-year-olds. Thank you so much for your time! ❤
@billmakatowicz8603 2 months ago
What does this give you that ChatGPT on your phone does not already give you?
@ainewsera 3 months ago
I need this but with a face to talk to me in real time. Can you do this?
@RemakeStationGames 1 month ago
Can somebody help me? I always get this :(
Uncaught (in promise) NotReadableError: Could not start video source
@juanmanuelzwiener4447 4 months ago
Santiago, are the assistant's voices only in English, or also in Spanish? Hugs, champ!
@underfitted 4 months ago
They speak Spanish too
@insitegd7483 5 months ago
Thank you, it is very interesting.
@jeff_holmes 5 months ago
Curious about the latency. I noticed that you cut the video after each question (after 19:55), so I am assuming it was a few seconds?
@underfitted 5 months ago
It wasn’t bad, but GPT-4o is not as fast as it could be, so you definitely have to wait a second or so for an answer
@vesalaasanen2158 5 months ago ⁺¹
@underfitted, it would be nice to add at least one answer in real time so we would get a more realistic picture of it.
@twetemomedical9500 3 months ago
Is anyone else not getting a response on the interface? It's registering my commands, but there is no response.
@Sachin-ww1ej 1 month ago
Hello, love your videos. We have a project coming up, thought your work was amazing, and wanted to learn how to make it. Unfortunately, we are inexperienced, and although we finished a lot of it, we faced an error that we just couldn't fix no matter how long we tried. We have been trying for almost 7 hours and are on the verge of giving up. If you could please help us, it would be very much appreciated. It's just a humble request from a few 14-year-olds who are inspired by your amazing work. Please help us on our journey to achieve such a feat ❤
@boooosh2007 5 months ago
Is this functionally any different than your previous video?
@underfitted 5 months ago
While they work the same for the demo, my previous code is very brittle. This one is much better because I’m using an entire existing infrastructure to support it.
@Brou15O 4 months ago
Could I get this on my smartphone?
@underfitted 4 months ago
As is, no. You’ll need to rewrite it in a phone-friendly language
@sr.modanez 5 months ago
Thank you, fantastic video 👏👏👏👏👏👏👏👏👏
@Sachin-ww1ej 1 month ago
Hello! I just wanted to reach out and share how much your videos mean to us. We’ve been inspired to work on a project, but we’ve encountered an error that’s been really tough to crack. We’ve spent almost 7 hours trying to sort it out and are feeling quite hopeless. If you could lend us a hand, it would truly make a difference. Just a humble request from two eager 14-year-olds. Thank you for your kindness! ❤
@reynoldoramas3138 5 months ago
Hello Santiago, greetings from Cuba. I just saw on your GitHub profile that you are a fellow countryman. Your content is very valuable. I'm an AI engineer over here trying to get ahead in this world. I would love to get in touch with you and help you with a project.
@apdurden 4 months ago
Yeah, this is cool but not helping. I can open the LiveKit interface but can't find a way to get the agent to connect. The API keys are all correct.
@apdurden 4 months ago
I think the track management piece has changed since you made this. I'm running into a missing local_participant attribute on the Room object.
@Noahperaudon 5 months ago
Hey, I have an issue with the LiveKit API key: it's giving me an error saying it's invalid.
@AI_by_AI_007 5 months ago ⁺¹
Me as well -- are you on Windows or Mac as you try this?
@Noahperaudon 5 months ago
@AI_by_AI_007 Windows
@Noahperaudon 5 months ago
@AI_by_AI_007 Windows
@rahahoseini1523 4 months ago
@AI_by_AI_007 How can I access the API keys? Could you please tell me step by step?
@nmstoker 4 months ago ⁺¹
Shame that the hook with these videos is to start open source and then get people drawn into handling supporting functions via a commercial platform (for $$$).
@danieladama8105 5 months ago
This is great!
@rxWar 5 months ago
Nice, man, thanks
@avi7278 2 months ago
Why did you cut the latency? A little dishonest, don't you think? When you say working with them, are you being paid to represent their product? And if so, don't you think it's important to accurately represent it?
@densonsmith2 5 months ago
Has anyone gotten this to work on Windows?
@LesBrickodeurs 4 months ago
No, I've got an error at line 12.
@LesBrickodeurs 4 months ago
from livekit.agents.voice_assistant import AssistantContext, VoiceAssistant
ImportError: cannot import name 'AssistantContext' from 'livekit.agents.voice_assistant' (D:\Github\livekit-assistant\.venv\Lib\site-packages\livekit\agents\voice_assistant\__init__.py)
@davidkeane1820 3 months ago ⁺¹
@LesBrickodeurs Yes, the SDKs have changed and a lot of this no longer works right off the bat... I'm still playing around, but it probably needs redoing.
@rylandboswell2288 1 month ago
Is there anyone who could potentially build this for me?
@jsanti1000 4 months ago
Dangggggggg!!!
@aidanthompson5053 5 months ago
2:38
@process-ai 5 months ago
So, I cannot code. Can you make a tutorial using Phi-3, which is free and has vision, and also use Whisper AI to convert text to speech, plus other free tools, to bring the cost down to zero? I am a student trying out this stuff, and I don't want to pay, or don't have the money to pay, for the API or other things. So please make a tutorial using only free and open-source tools.
@MuditGupta07 1 month ago
damn
@Sachin-ww1ej 1 month ago
Please reply
@sfsadfsadfasdf 2 months ago
I don't like this man; he just cuts the delay of the AI agent lmao... fking joke.
@ccouto2869 2 months ago
Oh my god, amazing, thanks
@bhaskerbobby 3 months ago
Hi, I'm getting the following error --> {"message": "draining worker", "level": "INFO", "id": "unregistered", "timeout": 60, "timestamp": "2024-08-16T02:26:14.669243+00:00"}
{"message": "shutting down worker", "level": "INFO", "id": "unregistered", "timestamp": "2024-08-16T02:26:14.670548+00:00"}
@YounessArjoune 2 months ago
How did you get past this error, if you ever did?
