Creating J.A.R.V.I.S.
- Published 15 May 2024
- A sneak peek of voice-to-voice chat assistant.
🦾 Discord: / discord
☕ Buy me a Coffee: ko-fi.com/promptengineering
🔴 Patreon: / promptengineering
💼Consulting: calendly.com/engineerprompt/c...
📧 Business Contact: engineerprompt@gmail.com
Become Member: tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Signup for Advanced RAG:
tally.so/r/3y9bb0
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu... - Science and Technology
Wooohooo!! Yeah, can't wait for it! ⭐️
Impressive, thanks!
Very interesting project! Do you use any VAD to detect the end of the request?
At the moment no.
Wahooo..really looking forward to your new project!
thank you!
It's fast. Which TTS and STT did you use?
All openai
yes please is it going open source?
Great looking forward
thanks
should edit title to add "using openai"
I LIKE IT GREAT JOB
thank you :)
Nice!
EXCITED!
:)
Please make a beginner-friendly tutorial, a step-by-step guide on how to integrate this with localGPT 🙏🙏
That doesn't sound like Jarvis, I want the real Jarvis voice!!!
Good point, I think ElevenLabs has that. Will try to integrate that :)
@engineerprompt How about adding a little UI as well? And maybe a button to take screenshots continuously at a regular interval. That way, you will be releasing OpenAI's demo app before OpenAI.
Right on Bro, RIGHT ON. ......... but we need the voice of Cortana for this, for when we are sitting around in our Mark V Armor and coding...:)
:)
What TTS are you using, and is it running locally?
Whisper, but via the API. Nothing is running locally in this video. A local version will be coming soon.
@engineerprompt loved it 👍
@engineerprompt but Whisper is ASR, not TTS??
Gross.
Someone already made a fully local version that works with little latency and with voice training. There are already projects on GitHub for continuous speech that use a keyword to trigger recording, and a version with a PTT (push-to-talk) implementation instead of a keyword.
I don't get it, how's that different from GPT-4o?
You are right, very similar in functionality. In fact, this version is using GPT-4o for text generation. But the voice functionality is not available in GPT-4o yet.
Also, I request a video about this vs GPT-4o.
Idk why, but there has been a folder named Jarvis-v6 on my desktop for 5 months, and surprisingly it also does the same job 😮
Would love to see what's in the folder :D I am at v0 now
@engineerprompt it's gonna become interesting. I thought I was the only one who was able to crack speech while streaming to reduce the latency.
Is there a way to speed it up?
Yes, Groq has Whisper support now. Going with that, but the issue is the rate limit!
Use rhasspy3 as a base. It streams audio directly to the ASR model.
What APIs are being used?
Currently everything is OpenAI. I just got access to Whisper from Groq; I will update it and hope it will be much faster!
@engineerprompt great! Looking forward to the tutorial or git repo. Literally yesterday I was searching for Jarvis haha
How is it different from GPT-4o voice?
That is not available yet :)
Nice, but it would be great without that annoying 2-3 sec delay.
I agree, I just got access to Groq Whisper. Will be interesting to see how that works.
@engineerprompt George Hotz on stream called Groq a scam...
Not local. Not the Jarvis voice. Misleading title. Disappointed.
Why do you think it is not local? The only downside is that he does not use voice streaming to make it faster (I did that).