INSTALL UNCENSORED TextGen Ai WebUI LOCALLY in 1 CLICK!

Aitrepreneur

Published 8 Sep 2024
The Oobabooga TextGen WebUI has been updated once again, making it even easier to run your favorite UNCENSORED open-source AI LLM models on your local computer for absolutely free! In this video, I'll show you how to install the text-generation-webui on your computer in 1-CLICK! Plus, I'll showcase the most common and fun use cases of the webui so that you can start having fun with it right now!
Have you managed to install the Oobabooga TextGen WebUI? Let me know in the comments!
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
SOCIAL MEDIA LINKS!
✨ Support my work on Patreon: / aitrepreneur
⚔️ Join the Discord server: bit.ly/aitdiscord
🧠 My Second Channel THE MAKER LAIR: bit.ly/themake...
📧 Business Contact: theaitrepreneur@gmail.com
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✨ PATREON LINK: / aitrepreneur
Oobabooga TextGen WebUI: github.com/oob...
TheBloke: huggingface.co...
Llava 13b model: huggingface.co...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
►► My PC & Favorite Gear:
i9-12900K: amzn.to/3L03tLG
RTX 3090 Gigabyte Vision OC : amzn.to/40ANaue
SAMSUNG 980 PRO SSD 2TB PCIe NVMe: amzn.to/3oBR0WO
Kingston FURY Beast 64GB 3200MHz DDR4 : amzn.to/3osdZ6z
iCUE 4000X - White: amzn.to/40y9BAk
ASRock Z690 DDR4 : amzn.to/3Amcxph
Corsair RM850 - White : amzn.to/3NbXlm2
Corsair iCUE SP120 : amzn.to/43WR9nW
Noctua NH-D15 chromax.Black : amzn.to/3H7qQSa
EDUP PCIe WiFi 6E Card Bluetooth : amzn.to/40t5Lsk
Recording Gear:
Rode PodMic : amzn.to/43ZvYlm
Rode AI-1 USB Audio Interface : amzn.to/3N6ybFk
Rode WS2 Microphone Pop Filter : amzn.to/3oIo9Qw
Elgato Wave Mic Arm : amzn.to/3LosH7D
Stagg XLR Cable - Black - 6M : amzn.to/3L5Fuue
FetHead Microphone Preamp : amzn.to/41TWQ4o
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Special thanks to Royal Emperor:
- Totoro
- TNSEE
- RG
- Judy Godvliet
- Gluthoric
- Jason Alcock
Thank you so much for your support on Patreon! You are truly a glory to behold! Your generosity is immense, and it means the world to me. Thank you for helping me keep the lights on and the content flowing. Thank you very much!
#GPT4 #GPT3 #ChatGPT #textgeneration
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
WATCH MY MOST POPULAR VIDEOS:
RECOMMENDED WATCHING - All LLM & ChatGPT Video:
►► • CHATGPT
RECOMMENDED WATCHING - My "Tutorial" Playlist:
►► bit.ly/TuTPlay...
Disclosure: Bear in mind that some of the links in this post are affiliate links and if you go through them to make a purchase I will earn a commission. Keep in mind that I link these companies and their products because of their quality and not because of the commission I receive from your purchases. The decision is yours, and whether or not you decide to buy something is completely up to you.

COMMENTS • 829

@Aitrepreneur 10 months ago ⁺⁷⁰
HELLO HUMANS! Thank you for watching & do NOT forget to LIKE and SUBSCRIBE For More Ai Updates. Thx
@KolTregaskes 10 months ago ⁺²
Have you tried LLM Studio? Even easier to set up.
@obviouswarrior5460 10 months ago ⁺²
I am an AMD fan with a 7900 XTX, so Linux only. Does the auto-install work with AMD? Does it also install ROCm automatically?
@shawnvines2514 10 months ago ⁺¹
@@KolTregaskes I completely agree.
@HUEMOB 10 months ago ⁺¹
@@KolTregaskes Do you mean the H2O LLM Studio? It says you need 128 GB of system RAM for it. Is there another version that runs with 8-16 GB of RAM?
@LouisGedo 10 months ago
👋
@Misiulo 8 months ago ⁺⁴⁰⁵
Remember the good old days when you could just download stuff and it worked?
@chrishayes5755 7 months ago ⁺⁸²
Remember the good old days when guys would have relationships with real women?
Nothing is sadder and more depressing than staring at an AI image, hearing some janky broken AI voice telling you loving things. That's nightmare fuel 😂
@Misiulo 7 months ago
@@chrishayes5755 Indeed.
@Chocolatnave123 7 months ago
@@chrishayes5755 Yup, too bad those days are gone. Thankfully AI is good enough these days to replace them 🤣
@templarsassin2736 7 months ago
@@chrishayes5755 Remember the good old days when male loneliness wasn't at a record-breaking high?
@twelve535 6 months ago
@@chrishayes5755 fix women and you'll start seeing more men in relationships
@tiagotiagot 10 months ago ⁺¹⁷⁹
It is worth noting that if you're using online TTS, then, obviously, the text is gonna be sent to their servers.
@ragoonsgg589 8 months ago ⁺²⁰
WOOOOOOAAAAAHHHH WATCH OUT EVERYONE, WE GOT OURSELVES A LIVE ONE
@lovely-shrubbery8578 8 months ago
f
@gozzyboy1989 8 months ago ⁺¹⁵
Mine's gonna be wild
@starcultiniser 7 months ago ⁺⁴⁵
Let's do our best to scar some employees and raise therapists' salaries.
@SkriptKeeper 6 months ago
@@ragoonsgg589 🤣🤣🤣
@Henk717 10 months ago ⁺¹⁵⁰
As the creator of the Tiefighter model, thanks for showing it in your video!
I did design it for Koboldcpp, so it is also worth trying there, since that's even easier to set up.
@GhOsThPk 10 months ago ⁺⁸
Koboldcpp is amazing. It would be very helpful if more people created models tailored towards it. Thank you for your service, Henk.
@Henk717 10 months ago ⁺⁶
@@GhOsThPk Luckily it doesn't need models made specifically for it, but it does have some modes, such as Adventure mode, which Tiefighter was also designed to be able to use.
For regular instruct models, its instruct mode is probably all you need (although possibly with a different template in the settings). For chat models, its chat mode is pretty much all you need too, but if not, it has APIs compatible with most popular chat frontends too.
@Sloff1155 10 months ago ⁺¹
Kobold is also really good. I just prefer the SillyTavern UI because it's amazing, but I need to set up Kobold again so that I can mess with that one again.
@Henk717 10 months ago ⁺³
@@Sloff1155 The cpp version supports streaming and is very easy to set up, so you would have a great ST experience :D
@Sloff1155 10 months ago
@@Henk717 Do you have a link?
@I-Dophler 10 months ago ⁺⁹⁹
It's truly remarkable how you're empowering users with the knowledge to leverage AI without censorship and within their own control. The comprehensive guide to installing Oobabooga's Text Generation Web UI opens up possibilities for private, customized interaction with AI, which is a significant step towards user autonomy. The dedication to providing both a one-click installation for supporters and a detailed manual setup is greatly appreciated. Keep up the fantastic work!
@ReXoRofc 10 months ago ⁺³⁶
Man wrote this using AI I'm sure 😁
@I-Dophler 10 months ago ⁺⁸
@@ReXoRofc And as usual, people like you say the same thing over and over again. Why don't you come up with something more original?
@ReXoRofc 10 months ago ⁺²²
Dear @@I-Dophler ,
I apologize for the lack of originality that may have been present in my recent actions or words. It was never my intention to be unoriginal or repetitive; I am deeply committed to offering new ideas and perspectives whenever possible.
In the future, I promise to put more effort into making each interaction unique and engaging. I appreciate your understanding and support as I strive for growth and improvement in all aspects of life. Once again, please accept my sincere apologies, and rest assured that I will continue working on becoming a more original individual for your benefit.
Thank you, and have a wonderful day!
Yours faithfully,
Rexor.
@I-Dophler 10 months ago ⁺⁴
@ReXoRofc It's fascinating to see such advancements in AI technology and the ability to run uncensored models locally. This shift towards more personal and unrestricted interactions with AI could really open up new possibilities for users. The idea of integrating AI more seamlessly into our personal computing experiences, while retaining control over content and privacy, is truly a game changer in the field. Looking forward to exploring these developments further!
@SquirrelTheorist 9 months ago ⁺²⁸
AI wars in chat.
@MrArrmageddon 10 months ago ⁺²³²
An amazing model to check out is Mistral 7B, particularly "Dolphin 2.1 Mistral 7B". It scored 67% overall on the open_llm_leaderboard, and it supports at least 8K context. This outclasses most if not all 65B models, most if not all 13B models, and a lot of models in between. It's near the top and even beats some 70B models, from what I understand. If anyone has a better 7B model to recommend, please let me know. And since most people can't use a 70B model at 8K context, that's another amazing thing about some 7B models.
@gabluz 10 months ago ⁺⁹
Wow... I'm gonna check it out... Does it work great for roleplaying, too?
@zolilio 10 months ago ⁺⁵
I wonder if we can merge Mistral with a 13B model. It would be amazing if we could fine-tune it like that!
@MrArrmageddon 10 months ago ⁺¹
@@gabluz Never tried, I don't role play. Variants of it may be better for that, but who knows; you'd have to look into it.
@MrArrmageddon 10 months ago
@@gabluz By the way, I heard rumours that anything less than about Q8 may not be good for 7B models, so I use GGUF 8-bit. 5_M or 6-bit might be good enough, but I've heard people strongly recommending 8-bit.
@MrArrmageddon 10 months ago ⁺¹
@@zolilio Right, I have no idea how all that works. But if whoever made the core Mistral model could put out an 11-15B model, the people who made the Dolphin variant could then improve upon it. That would be great. My PC could likely only use 4-8K context with 13B, though. 8K might be too much at 13B, but I think I can still pull it off.
@gavincstewart 10 months ago ⁺¹⁵
I finally got everything working after watching this video. Thank you so much. It's 🔥🔥🔥
@rookandpawn 10 months ago ⁺⁶⁴
I love how you simplify and get to the point for everyone's benefit ❤ you really do care
@Jesulex82 6 days ago
Were you able to scroll and see the other options? Because I couldn't... and then, once I load the model, it doesn't respond when I talk to it.
@tecnopoesia 3 months ago ⁺¹
I was looking for content like this, but I wasn't finding what I wanted. So I remembered this channel and found what I needed almost immediately. Everything I had found before was only about online AI models. Greetings from Brazil!
@takisk.7698 9 months ago ⁺¹⁰
Actually, GGUF can run on GPU just fine, and it is actually faster on older GPUs like the 1080.
@GraveUypo 10 months ago ⁺¹¹
Come on, the image thing isn't just a gimmick. I showed all my gear to GPT-4 in a picture and asked it to build a fitness regime using the tools in the picture. It did an amazing job and used everything it had at its disposal. Not sure if this model is as capable, but if it's anywhere near, it's pretty cool.
@Larimuss 1 month ago ⁺¹
FYI, the ElevenLabs API extension has been removed from text gen UI now. They do have other options, but for a good one you'll have to train your own.
Also, can you please 🙏 do a LoRA guide for text gen UI on unstructured text like ebooks in TXT and/or JSON? That would be awesome; there are not many tutorials out there 😢
@xzendor7digitalartcreations 5 days ago
The last part of the video was the best for me, as I have had issues with image description and the multi-modal extension. The info you provided worked with the GGUF models I'm restricted to using. Thank you for sharing.
@MrChamesy 10 months ago ⁺⁷
Thank you thank you thank yoooouuu!! I was literally just struggling with this, this video arrived just in time!
@mirek190 10 months ago ⁺¹⁴
Nice, but you are wrong in saying that the GGUF format only works on CPU...
The GGUF format is universal, meaning you can run it on GPU and CPU.
Also, with that format you can easily run even a 70B model on your home PC by putting half of the model on the GPU and the other half on the CPU.
@VinciSid 2 months ago
Thank you so much for a very informative comment.
Can you recommend the best MODEL and parameter settings for a PC with a 1080 Ti video card, an i7 7700K and 64 GB of RAM?
It seems like I always get repetitive answers or emoji-only replies with no words. I'm using TheBloke_Llama-2-7B-Chat-GPTQ with Divine Intellect for the parameters.
Appreciate your reply! Thanks
@mirek190 2 months ago
@@VinciSid TheBloke_Llama-2-7B-Chat-GPTQ?
Are you serious?
That model is from the stone age of LLMs...
For small models, try gemma 2 9b or llama-3 8b.
@VinciSid 2 months ago
@@mirek190 Yeah, I just finished following the video and I'm completely clueless about what model to use :( Still tinkering with settings... I'll try your suggestions. Thank you!
@mirek190 2 months ago
@@VinciSid Try llamacpp with a GGUF model.
Something like:
llama-cli.exe --model models/new3/gemma-2-9B-It-SPPO-Iter3-Q8_0.gguf --color --threads 30 --keep -1 --n-predict -1 --repeat-penalty 1.1 --ctx-size 0 --interactive -ngl 99 --simple-io --in-prefix "user\n" --in-suffix "\nmodel\n" -e --multiline-input --no-display-prompt --conversation --no-mmap
@klausbantzer4077 8 months ago ⁺⁷
You don't have any related links in the description... Only your own stuff.
@Carlos-ux7gv 10 months ago ⁺¹¹
I'm using an Nvidia 3060 (12 GB VRAM) and I couldn't get any AWQ model to work. Don't know why. I reverted to my old GPTQ models and everything seemed fine.
@blackdeath4eternity 5 months ago
thanks will try
@silveonfall 10 months ago ⁺¹³
Please help, someone.
No errors were present during installation, but when I try to generate a response with any model it just gives me an error: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
@camilovallejo5024 10 months ago
Copy and paste it into ChatGPT-4; that's how I solve that shit now
@SmallFox74 10 months ago ⁺⁴
I get the same error. Did you solve it?
@ArabianShark 10 months ago ⁺⁴
I have the same issue. Any help would be much appreciated!
@shabadooshabadoo4918 10 months ago ⁺²
same
@ggoddkkiller1342 9 months ago
Did you answer yes or no to the question at 3:30? You should answer y, since you have an old GPU.
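The "no kernel image is available" error in this thread typically means the installed PyTorch build was not compiled for the GPU's compute capability, which is common with older cards. A minimal sketch for checking this, assuming PyTorch is installed; `has_kernel_for` and `check_local_gpu` are hypothetical helpers, and the check is simplified (it ignores PTX fallback entries):

```python
def has_kernel_for(capability, arch_list):
    """Return True if arch_list contains a binary for this compute capability.

    capability: (major, minor) tuple, e.g. (6, 1) for a GTX 1080.
    arch_list:  architecture names such as ["sm_70", "sm_80"].
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def check_local_gpu():
    """Query the local PyTorch build; requires torch and a CUDA device."""
    import torch

    if not torch.cuda.is_available():
        return None
    capability = torch.cuda.get_device_capability(0)
    return has_kernel_for(capability, torch.cuda.get_arch_list())


# Illustrative values only: a compute capability 6.1 card checked against
# a build that ships kernels for sm_70 and newer.
print(has_kernel_for((6, 1), ["sm_70", "sm_75", "sm_80", "sm_86"]))
```

Calling `check_local_gpu()` on the affected machine would show whether the current install matches the card; if it returns False, reinstalling with the option meant for older GPUs is the likely fix.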
@snooks5607 5 months ago ⁺¹
4:30 PSA for macOS users: stick to GGUF models. Many model formats don't support the macOS Metal API, and I'm not sure why oobabooga even shows them in the model loader dropdown list (AWQ is not installed by requirements_apple_silicon.txt, and ExLlamav2_HF will complain about CUDA_HOME not being set, which doesn't make sense, so don't get confused by those).
@guillermogarciamanjarrez8934 10 months ago ⁺¹²
Great to see you back, Aitrepreneur. Some reviews of the latest models like Mistral and Tiefighter would be great. Also, running models on phones is actually possible now: running Llama 7B on a Snapdragon 8 Gen 1 with 12 GB of RAM is not too bad, and I want to see if I can get it to run Mistral.
@user-fv6nc7qi2x 10 months ago ⁺³
i think ur frying the phone 💀
@AdrianMark 10 months ago ⁺²²
As always, well done with the guide. You really have the best explainer videos about AI. Thanks for making this!
@staryeeter 6 months ago
Posting this here so that it gets noticed - I'm finding it a barrier to type in the “N” for the graphics card - the text doesn't show up, and I'm getting through reading the comments
@royalphoenix9016 6 months ago ⁺⁷
I tried, and it helped a lot, but when it came time to test the model it said "error none" and never responded. It also said "ModuleNotFoundError: No module named 'awq'" in the model tab, no matter what I tried. Any way to help?
@zzum538 4 months ago
Don't close the start_windows.bat file
@arthesian 8 months ago ⁺¹⁰
I don't see the ElevenLabs checkbox in the extension list. I have a different list of available extensions :S Did they remove it, or do I need to enable it in a different way somehow?
@zenkaikalashnikov4753 6 months ago ⁺¹
Did you fix it or find a way?
@ABUNIMR2 2 months ago ⁺⁶
That's a lot of work, but I think it's easier than getting a gf
@SmallFox74 10 months ago ⁺⁹
PLEASE HELP getting this error ------- RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
@Samuel-wi4oc 10 months ago
What GPU do you have?
@mehueesti 10 months ago ⁺²
yea, same here
@mehueesti 10 months ago ⁺³
1060 6 GB
@stryk3rsandwich578 10 months ago
What are your pagefile settings? Once I set mine on the C: drive (SSD) to a minimum of 1024 MB and a maximum of 16384 MB, I was able to load AWQ and GPTQ models.
@SmallFox74 10 months ago ⁺²
@@Samuel-wi4oc GTX 1080
@Mr.KAnime_047 7 months ago ⁺¹
Combining this with Stable Diffusion and VR would go absolutely HARD
@midtown3221 9 months ago ⁺²
Wow, didn't think it would be that simple.
_If only you showed how to get or make characters_
@user-fv6nc7qi2x 9 months ago
dude... i got the character making website after a single google search
@Krullerized 9 months ago ⁺¹
Guys, check Cybertron 7B; it's #1 among 7B and 13B models and #8 across all sizes.
@Kristofferrea 7 months ago ⁺³
Not working for me. The model loads successfully, but whenever I talk to it, it just generates a blank message. Can anyone help me out?
@fcf8269 2 months ago ⁺¹
Two questions: how do you "make" new characters? Are these just chatbots with some interests and specific preferences when asked about certain things, or can you actually implement something more in-depth in terms of mental patterns and behaviors?
Also, can these models learn, or are they stuck with what the model you load includes? For example, can I train or teach this character about something like Audacity, so it can become an Audacity expert and help me when I am working on some audio stuff? Or is it just a funny thing to talk to?
@Kitten_Genocide 10 months ago ⁺⁸
GGUF can be used for CPU+GPU inference, not just CPU only.
@Maltebyte2 6 months ago
What exactly does that mean? Will they share RAM so less VRAM is used? Or is it quicker? Can I even use my old GTX 1080 Ti for this?
@Kitten_Genocide 6 months ago
@@Maltebyte2 It means you can choose to offload some or all of the model layers to GPU VRAM. If you don't have enough VRAM, you can offload a few layers to VRAM and let the RAM handle the rest. It's faster than CPU/RAM only. It works even with a 1050 Ti, so I'm sure a 1080 Ti could handle it, but it depends on the model you use.
@Maltebyte2 6 months ago
@@Kitten_Genocide I'm gonna install it again using my old 1080 Ti, because the GGUF model is running soooo slow on my 13500 and it's using all my 32 GB of RAM! Strange, the model was only supposed to use 7ish GB of RAM!
@Kitten_Genocide 6 months ago
@@Maltebyte2 Which model are you using, and at what quant? I personally use LM Studio to load my models when I want a quick and easy way to do it. I had a 6600 XT 8 GB before and it could handle all the 7B models at Q4_K_S and some at Q5, with all the layers offloaded to VRAM, doing 20 t/s at the most. I'd imagine your 1080 Ti would fare better and run faster.
@Maltebyte2 6 months ago
@@Kitten_Genocide Thanks for the quick response! After this tutorial, I left all the default settings. I use oobabooga and the model was
Yarn-Mistral-7B-128k-GGUF. Then came the issues I explained above: long waiting times and high RAM usage. I used Q4_K_M, which was also recommended!
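The partial offloading described in this thread (some layers in VRAM, the rest in system RAM) can be sketched as a rough estimate. This assumes all layers are about the same size and reserves a fixed amount of VRAM for context; `layers_that_fit` and every number below are hypothetical, not measured:

```python
def layers_that_fit(model_gb, n_layers, vram_gb, reserve_gb=1.5):
    """Estimate how many model layers fit in VRAM.

    Assumes equally sized layers; reserve_gb is a guess for context
    and other overhead, not a measured value.
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


# Hypothetical example: a 40 GB model file with 80 layers.
for vram in (8, 11, 24):
    n = layers_that_fit(model_gb=40, n_layers=80, vram_gb=vram)
    print(f"{vram} GB VRAM -> offload about {n} of 80 layers")
```

The result is only a starting value for the GPU layers setting; real memory use depends on the loader, the quant and the context size, so it may need lowering if loading fails.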
@X37V_Freecazoid 10 months ago ⁺⁵
Where can I find LLM LoRA models for text generation? I really want to try one out but can't find anything like it.
@MegaUpstairs 10 months ago ⁺³
"This is definitely cheaper than a real date" :DDDDD
@Aitrepreneur 10 months ago ⁺³
Am I right or am I right? xD
@ErwinRommel2016 10 months ago ⁺⁶
Will there be any videos on how to make voice AI models? 🤔
@meyou7041 9 months ago ⁺⁵
So, some of these models are still censored, and some are not. Is there a way to know more about each model and what it is capable of before downloading it? Are there commands that I can give to disregard the policies and rulesets the AI has?
@GypsyNova 10 months ago ⁺²
This is so very much appreciated! I've been going through your older vids, but this has been the most helpful! Can you do a video that shows how to connect with TavernAI?
@Eurotool 9 months ago ⁺²
1. INCREDIBLE amount of thanks for this guide
2. Jeez, PLEASE use Dark Reader instead of nuking our eyes with sudden white backgrounds
3. Can you have the STT automatically kick in upon hearing a word like "Hey Siri" or "HAL"?
@fenix20075 10 months ago ⁺¹²
There are a lot of different ways to play. I love to use two models for roleplay: OpenHermes Mistral 7B plays the character as the logic, letting the character decide to do something with the player. After the character decides to do something with me, I use Wizard Vicuna 7B to generate the reminding process.
@TataAIOfficial 9 months ago
So are those two the best models for RP? I'm going to try it. Thanks!
@fenix20075 9 months ago ⁺¹
@@TataAIOfficial I cannot say they are the best for everyone, but they fit my style. XD And there is the Mistral-ClaudeLimaRP-v3-7B model, which I would prefer.
@LagiohX3 9 months ago
How do you do the reminding process?
@fenix20075 9 months ago ⁺³
@LagiohX3 Switch to another server, type in a short description of the first server's story situation, and then type in the command/direction for the NSFW content I want the second server to generate. Basically, the character on the first server decides whether she/he wants to do an NSFW action, and then the second server is used to generate all or some of the NSFW content. Then I copy the second server's NSFW content back to the first server, so I can see what the character feels and how it reacts on the first server. (That is why I love 7B models: speed and size are very important because I use the CPU to generate content...) I prefer to link the first server to SillyTavern with OpenHermes or Claude; the second server directly uses the koboldcpp UI with Wizard Vicuna 7B loaded.
@lystic9392 9 months ago
@@fenix20075 Wow, I thought I was in deep, but you sound like a very advanced lewd person!
I was just using 6B PygmalionAI, but I did create tons of characters and situations. I'll have to give Mistral 7B a shot. (Though my 3080 can handle 6B but not 7B models.)
@user-rd4xj2ei1l 9 months ago ⁺¹
Okay let's get started
@activemotionpictures 9 months ago ⁺²
Can it show "emotion" in the text pronunciation? Also, where do the temp WAV files from the mic get stored, and where do the temp .wav files from the response get saved (what directory)?
@dmitrypichugin7449 7 months ago ⁺³
Thank you, man. You not only showed how to install it but also explained very important nuances (model types, their differences, etc.).
@AdamJensenReal 9 months ago ⁺¹
This is only for Nvidia GPUs; for AMD you can use SHARK. I still don't know why AMD doesn't have ROCm for Windows. But anyway, SHARK is good and the only possible solution for AMD on Windows.
@igggaming9871 10 months ago ⁺⁸
The most frustrating thing about AI for me isn't the inability to make NSFW, the frustrating thing is my inability to make 'tasteful' things that the AI considers NSFW. I wanted to create art of mermaids, but NOPE. Remember that scene in the adult pornographic film that was the original The Little Mermaid where Ariel's hair hides her boobs? Absolute degenerate filth according to AI.
@Sen3D 10 months ago ⁺³
:D True
@DeSibyl 5 months ago ⁺¹
Models from TheBloke throw a bunch of errors when they try to generate a response unless you enable the "no_inject_fused_attention" option. Why is that? (Enabling that option makes the responses inaccurate and random.)
@madrooky1398 10 months ago ⁺⁴
No matter which model I run, it works at the beginning and then starts to produce gibberish. I have done a fresh install, 12 GB GPU. It doesn't matter if I load older models that worked well before.
And even when it does work, the responses of the new AWQ models are really out of context.
@Coopetition 10 months ago ⁺¹
I'm having the same issue.
@k.h.p.9862 10 months ago
Thank you for sharing the limitations. Exactly what I was suspecting. Still not as intelligent as GPT-4.
@Clayf701 10 months ago
Same here
@happyme7055 9 months ago ⁺²
Another amazing video from you... and it works fine :-) Can you tell us how to create the character JSON?
@unqiotheowl9805 6 months ago ⁺¹
I'm very confused about installing Git. This video is from back in November, the latest version of Git is different, and when I downloaded it I saw too many boxes I had to check and uncheck. It was really frustrating that it didn't match the video, so I gave up on getting Git before even getting to oobabooga. Maybe you could upload an updated 2024 tutorial on how to download oobabooga and get the oobabooga API URL to put into Janitor or Venus AI? I need something more helpful than this.
@michael_hackson_handle 7 months ago ⁺¹
You know something is AI generated when Sandra is using Charlotte's voice.
@jurandfantom 10 months ago ⁺¹
It will be interesting to compare the video's method with what we know and can do with the 1-click installation that Ooba has suggested for months.
Good to see A uploading a new video
@gotenks157 10 months ago ⁺³
For some reason, when I try this I usually only get one somewhat good response, but the following responses are always gibberish, and that is with almost any model I try. I have a 4090 and a 13900K. I also made sure to pick GPU and CUDA 12, and I am using AWQ models of different sizes: 7B, 13B and 30B. Always the same gibberish after a response or two. Any help would be appreciated.
@Sen3D 10 months ago
I also had problems with the AWQ models. Try GPTQ or GGUF; they have worked well so far. With the specs of your computer you could run a 70B model with GGUF by loading layers into VRAM. That is, if you have 64 GB of RAM for the rest.
@gotenks157 10 months ago ⁺¹
@@Sen3D Awesome, I'll try that. I def have the RAM. Thanks
@paulort1184 8 months ago ⁺⁴
So, would you have any suggestions for a Windows user who runs an AMD graphics card? When running the script for the text generation webui, AMD cards are only selectable on Linux, which I do not use.
@Rook_Gaming_Original 8 months ago ⁺²
I have the same problem. I was forced to use CPU only. I don't know why AMD isn't supported when Intel Arc is, and it's new, so it feels like bias at play, but I don't know. What I do know is that when I come across issues like this, I don't donate to the project.
@hanbill 2 months ago
I agree. Imagine if a new third GPU brand came on the market and it was also restricted to only one type of OS.
@joelchamp1949 10 months ago ⁺²
Nice. Can you make a video on training for the Oobabooga TextGen?
@hejjj12 10 months ago ⁺²
Was just looking for this, thanks my mind reading hero.
@Aitrepreneur 10 months ago ⁺²
You’re welcome 😊
@gamereplixer8521 6 months ago
@@Aitrepreneur How can you create your own character .json?
@musicandhappinessbyjo795 10 months ago ⁺³
Anyone facing issues with AWQ models? They are not even loading for me. GGUF models do work for me.
@Sen3D 10 months ago
Looks like everybody has problems with AWQ, including me.
@DrW1ne 10 months ago ⁺⁷
Thanks for an informative video, u got my like =]
@Aitrepreneur 10 months ago ⁺²
Glad you liked it!
@ChrisS-oo6fl 8 months ago
@@Aitrepreneur Yep, super dope. I did the Mistral model but my text responses are empty. No clue what I did wrong.
@murphy6672 6 months ago
Imagine connecting your PC to your smartphone using the microphone and headphone jacks (output to input). Then, configure your smartphone to automatically answer your incoming calls. Voilà! You could now call your AI assistant anytime. The best part? Both the smartphone and PC wouldn’t need an internet connection
@Planeta1951 3 months ago
0:07 I know this guy. Jason made it to the big AI world.
@cleverestx 10 months ago ⁺²
Great video overall, but a correction: @6:47 - this isn't strictly true. If you have enough RAM to load the model and a single 24 GB video card (3090/4090), you can also run 70B models at 2.4 bpw using ExLlamaV2 with Flash Attention 2 (use the latest Ooba, choose the CUDA 12.1 version when installing, and pick those models specifically). It's sloooow, like 0.5-1 token/sec, so arguably not worth using, but they do run, which makes your statement not strictly accurate... THAT could be the absolute highest you can run, not 33B.
NOTE: This ExLlamaV2 setup will not allow you to use AWQ models (at this time), so choose accordingly.
@martinmakuch2556 10 months ago
I use KoboldCpp and 70B GGUF with q3k_s and q3k_m (so around 3.4-4 bpw) and still get ~3 T/s, slow but usable for chat. I get 1.2 T/s for 70B q5k_m (so almost 6 bpw), which has great output but, yes, is too slow for chatting. For token generation you need memory that is as fast as possible, so you need some good DDR5 (for the CPU part). I have a Ryzen 7950X3D, but for this particular case Intel would probably be better, since it can reach higher memory throughput.
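The bits-per-weight (bpw) figures in this thread translate into approximate weight sizes with simple arithmetic: parameter count times bits per weight, divided by 8 bits per byte. A minimal sketch, ignoring the context cache and file overhead (`weights_gb` is a hypothetical helper):

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# bpw values taken from the comments above.
for bpw in (2.4, 3.4, 4.0, 6.0):
    print(f"70B at {bpw} bpw is about {weights_gb(70, bpw):.1f} GB of weights")
```

At 2.4 bpw a 70B model comes to roughly 21 GB, which is why it can squeeze onto a single 24 GB card, while the higher-bpw quants need part of the model in system RAM.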
@LilLegyiths 4 months ago ⁺³
Hi, I get completely blank responses
@mazireth 10 months ago ⁺³
I use GGUF models with my GPU (3090) by loading everything into VRAM (llama.cpp, n-gpu-layers to max). It seems to be faster than AWQ, but maybe I'm doing something wrong.
@datatron100 10 months ago ⁺¹
Depends on model size and context size. I've got a 4090, so the same 24 GB of VRAM, just a bit faster. AWQ runs faster for me with a 13B-parameter model at 4096 tokens, but if I want 8192 context, then llama.cpp with the model's layers running on the GPU is far faster. Running 70B-parameter models works too, and is far faster in KoboldCpp, but it is still very slow at 2.6 tokens a second. I'd recommend KoboldCpp over oobabooga; it's far faster for GGUF models in my experience, I believe due to less overhead.
@MrWuzey 10 months ago
@@datatron100 2.6 t/s on a 70B model is pretty good. It's true, though, that it looks very slow if you compare it with the speed you get on a 13B model.
What is your CPU/RAM, and what quantization are you using on the 70B model?
I've seen people with a 4090 getting under 2 t/s on a 70B model, but it depends on how loaded the context is when benchmarking. If the chat has just started, it will for sure be faster than when it reaches 4/8K context.
@GrzegorzWierzowiecki 9 months ago
You should mention that if someone needs a 70B model, they can run it super slowly on CPU with a lot of RAM. That's how I slowly do some generations.
@chrisbraeuer9476 10 months ago ⁺⁴
Amazing. Please tell us how to create a character from scratch. Please.
@chrisbraeuer9476 8 months ago
@VioFax Awesome that it is easy for you.
@ilakya 10 місяців тому ⁺²
Somehow Tiefighter for me be like
"AI
How can I help you today?
You
Hi. Is this working?
AI
Yes, it's working fine. You may ask me any question or request assistance in anything.
You
really?
AI
What are your favorite things about being a DWARF?
Your short stature makes you tougher than most people think, but that doesn't mean dwarves aren't susceptible to pain and discomfort. Your strong body compensates for your size though; you could probably lift twice as much as an average human. Being able to see over most crowds comes in handy too.
Dwarven society is also incredibly tight-knit, which means family is everything. Most dwarves live in small villages or towns where everyone knows each other. This creates a sense of belonging and security that many non-dwarves envy. Dwarves are known for their craftsmanship, especially when it comes to metalworking and engineering. Their creations are some of the finest in the world.
But perhaps what sets dwarves apart the most is their resilience. They have"
After a couple of exchanges, it starts talking about some roleplay out of the blue, without any context, as if it's high on some good shit. While it's still sane, it can give me some nice creative results, but it doesn't take long before it starts doing this.
Does anyone else have this problem?
@N8O12 10 months ago ⁺³
The AI got bored by your boring questions so it decided to start talking about dwarves
@ilakya 10 months ago
@@N8O12 Trust me, I ask much spicier stuff that I can't put in here.
@henrischomacker6097 10 months ago
Yep, again and again and again, and never reproducible.
With every single model I tried. And I tried a lot.
I would guess that there probably isn't a single person who has tried open-source models and has not seen this behavior from time to time.
@sylvainh2o 9 months ago ⁺¹
This will replace those erotic phone numbers lol. People will have dirty chat with AI that's hilarious :P
@ncarter9193 9 months ago
Just another example of AI putting humans out of employment.
@Sebastian-oz1lj 3 months ago ⁺¹
DUDE YOU ARE A W E S O M E. I watched like 8 guides and you are the first and only one who provided instructions clearly enough to make everything work flawlessly! Instant sub!
@joshjoshua6479 9 months ago ⁺²
I have an AMD graphics card, but I don’t have Linux, so I can’t use it. 😥
@veloyozh 9 months ago
Thanks for the super informative video! I subscribed to your channel and hope to see much more interesting stuff here. I'm especially interested in high-quality voice generation (custom voices too). I also want to learn how to train new models on my own data (books) so I can ask the bot questions, query statistics, etc.
@dracothecreative 9 months ago ⁺³
I have a question. I found that this AI broke my Automatic1111. Is there a way to restore it? I put a lot of work into it and I'd hate to lose it all and start over. Can they work alongside each other?
@SmallFox74 10 months ago ⁺⁴
Followed the instructions, loaded the model, went to chat... When I say "Hello", for example, the AI says "typing" and then it disappears? Any help much appreciated. ALSO, I just noticed this: RuntimeError: CUDA error: no kernel image is available for execution on the device
@GeorgeMichaelArt 10 months ago
Had the same issue; you need to load the model.
@GeorgeMichaelArt 10 months ago
Look at your CMD screen, will show you the issue.
@SmallFox74 10 months ago ⁺¹
@@GeorgeMichaelArt I did load the Model, still happening though. How else did you fix it?
@SmallFox74 10 months ago ⁺³
@@GeorgeMichaelArt This is the error in the CMD screen, but I don't know what it means:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Any help is much appreciated
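This error generally means the installed CUDA binaries contain no kernels built for the GPU's compute capability. A rough way to check, as a sketch: compare what `torch.cuda.get_device_capability()` reports against `torch.cuda.get_arch_list()`. The helper names below are hypothetical, and the rule is simplified: it ignores PTX (`compute_XX`) entries, which can be JIT-compiled for newer GPUs. The failing kernel may also come from a model loader's own compiled extension rather than from PyTorch itself, in which case this check would not catch it.

```python
import re


def parse_sm(arch: str):
    """Turn 'sm_86' into (8, 6); return None for anything else (e.g. 'compute_90')."""
    match = re.fullmatch(r"sm_(\d+)(\d)[a-z]?", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_kernel_image(capability, arch_list) -> bool:
    """Simplified rule: a binary built for sm_XY runs on a GPU with the
    same major version X and a minor version of at least Y."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_sm(arch)
        if parsed is not None and parsed[0] == major and parsed[1] <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch  # optional: only needed for the live check
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("GPU compute capability:", cap)
        print("PyTorch built for:", archs)
        print("Matching kernel image:", has_kernel_image(cap, archs))
```

If the last line prints False, the usual suspects are an older GPU that the installed build no longer targets, or a very new GPU that it does not target yet.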
@ArabianShark 10 months ago ⁺³
I'm getting the same issue; any help would be lovely!
@maghambor 8 months ago
After reading up a bit further, I got the AI to respond and now have my first local model installed. Now I need it to start using the main Nvidia card and not the motherboard's built-in AMD.
@vi6ddarkking 10 months ago ⁺⁶
Are you going to do a video explaining the SillyTavern UI? The software has advanced quite a lot since your last video on LLMs.
@Aitrepreneur 10 months ago ⁺¹³
Absolutely I will ;)
@TAGHERDAYT 9 months ago ⁺²
Oobabooga only works with AMD GPUs if you have Linux or macOS.
Is there a way I can make it work on Windows, or are there any other alternatives?
@DrW1ne 10 months ago ⁺⁴
Ayooo, We are about to get another quality Video!
@uropig 10 months ago
you're first! congrats
@bargbikernes31 10 months ago
Sir please help me sir im Indian sir please help me sir I'm indian sir please help sir im indian
@flurit 8 months ago ⁺²
I have an AMD GPU and use Windows, and I can't get past the "select GPU" part. Are there any alternatives for AMD on Windows? I don't want to switch to Linux.
@elzilcho222 8 months ago ⁺¹
Anyone else having trouble running the AWQ models? I'm fine as long as I run the GPTQ and GGUF...
@volt8ge 3 months ago
Yes, same, and I have a 3090.
@KizanTM 8 months ago ⁺²
For some reason I can't see elevenlabs as an option under "session". Is there a way to fix it?
@somesheqx8073 8 months ago
Same here, if someone has a solution :/
@JuanPeron007 7 months ago ⁺⁴
Dang... All we need now, is to also have her in video form, where she can move and look at us, while we have a conversation with her. As if she was on Skype
@mige5fin 7 months ago
This. Sadly, there doesn't seem to be anything like that.
@josel5831 7 months ago
Well, I guess we were not that far away from that movie "HER", where people have AI personalities in their phones.
@therealdkl 4 months ago
bro didn't even hesitate anyways thanks for the awesome tutorial
@STRIKE01 10 months ago ⁺³
Hello Aitrepreneur, thanks for explaining which model to use. I didn't know AWQ was newer :) I hope someday you can publish how to fine-tune a model, that would be great! Keep up the good work! Thanks
@thill6482 10 months ago
Thank you for this video! You're always ahead of the times... If you don't mind, could you please make an updated ComfyUI AnimateDiff guide? The workflows were just updated on 11/03/2023 and I can't find another person I trust who has a guide on the subject.
@camiloreina1062 5 months ago ⁺²
Does AMD not work on Windows? I see it says only Mac and Linux. What would happen if I selected AMD on Windows?
@rongazaryan9571 3 months ago
Wondering the same thing here
@podcastando 5 months ago
The best. Incredible. Thanks for sharing. Now I'm really going to learn from someone who actually knows how to do it and how to teach it.
@mattlau04 8 months ago
Really nice helpful video, it's hard to stay up to date with how fast AI is moving
@userx6679 9 months ago ⁺¹
Someone has to make these as easy to install as NMKD Stable Diffusion, loading models the same way that one does and running on lower-end systems, if we want to get these things out there more.
@kanavwastaken 10 months ago
I missed these videos so much! 😭❤️
@johnplaysgames3120 8 months ago ⁺²
Once everything's installed, can you switch between using GPU and CPU models or do you have to choose one or the other when installing the UI? My GPU is okay but limits me to the 7B param models. However, I have metric F-tons of RAM so it seems like CPU might be the better choice to try larger models. If I can switch back and forth, though, that would be ideal.
@blackdeath4eternity 5 months ago
I think you can just use a different version of the model for each.
@justwhatever9217 10 months ago ⁺²
Is there a multi-modal model that can read documents and summarize them within Oobabooga TextGen WebUI?
@CausticCatastrophe 10 months ago ⁺¹
You should cover training an LLM LoRA with PEFT next.
@I-Dophler 10 months ago ⁺¹
Why did the AI go to school?
Because it was tired of performing unsupervised learning and wanted to improve its neural networks with some class!
@joshjoshua6479 9 months ago ⁺¹
My model won’t load, no matter how long I wait. Well, after “a few seconds” my model still hasn’t loaded. Why?
@joshjoshua6479 9 months ago ⁺¹
It says "Loading..."
@LordOscur 9 months ago
Question: have you seen any models similar to the Star Trek "computer"? 🤔 I've always wanted to say "Computer, create/do..." and get answers.
@augustok2 10 months ago ⁺¹
I'm getting "AttributeError: 'NoneType' object has no attribute 'lower'" when trying to install model
@screamingiraffe 5 months ago ⁺¹
Nice that the video explains how to install and use a model, but the title says 'uncensored' and that entire chain of models is most definitely censored.
@1marcelfilms 6 months ago
Finally I can make the AI chat say the no-no word
@Sen3D 10 months ago ⁺³
The 4060 Ti with 16 GB of VRAM can load models up to 20B. I also tried 70B GGUF models. I'm not sure how smart the smaller models are.
@MrArrmageddon 10 months ago ⁺³
I have a 4080 16GB and I can run 70B locally. However, I have 64GB of RAM, so it uses that too. How much RAM do you have? You can easily split into RAM and run models larger than 20B. I can run up to 34B pretty well, but only 15B and under runs super fast. That said, some 7B models are actually almost as good as 70B. A few 7B models surpass most 34B and pretty much all 65B models.
@Sen3D 10 months ago
@@MrArrmageddon I also have 64GB of RAM and split the 70B model the same way you do, but it's like 0.2 tokens/s. I tried some 7B models with my old setup, but they couldn't really hold a conversation. I guess I'll try some more.
@tungstentaco495 10 months ago
I also have the 4060 Ti 16GB. Good to know! What 20B model do you recommend?
@Sen3D 10 months ago
@@tungstentaco495 The "MLewd-ReMM-L2-Chat-20B-GPTQ" from TheBloke works with these settings: Model loader: ExLlamav2, max_seq_len: 4096, compress_pos_emb: 2. It kind of doesn't understand scenarios very well at the start, but you can "replace" the last answer it gave and steer the conversation early on. Maybe I'm just going beyond role play here :D
@MrArrmageddon 10 months ago ⁺²
@@Sen3D Absolutely, 70B is slow even on my system with a 4080. Faster than 0.2, I think I get 0.5-0.6, so it's usable but slow af. Typically I don't go past 34B. Even at 34B I can only really pull off 2K, maybe 4K context. I'm not even sure about 4K. That's why 7-13B is the sweet spot for 4-8K context for me. And with some recent models, 70B is barely any better than 7B.
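The sizes in this thread line up with simple arithmetic: the weights take roughly parameters × bits per weight / 8 bytes. A back-of-the-envelope sketch follows, assuming exactly 4 bits per weight (real GPTQ/GGUF files are somewhat larger) and a made-up 2 GB VRAM reserve for context and overhead. The function names and the reserve are illustrative assumptions, not webui settings.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def split_weights(params_billion: float, bits_per_weight: float,
                  vram_gb: float, reserve_gb: float = 2.0):
    """Return (GB kept in VRAM, GB spilled to system RAM).

    reserve_gb is a guessed allowance for the context cache and other
    overhead; the real figure depends on the loader and context length.
    """
    total = weight_size_gb(params_billion, bits_per_weight)
    usable = max(vram_gb - reserve_gb, 0.0)
    on_gpu = min(total, usable)
    return on_gpu, total - on_gpu


if __name__ == "__main__":
    for size in (7, 13, 20, 34, 70):
        gpu, ram = split_weights(size, 4, vram_gb=16)
        print(f"{size}B @ 4-bit: {gpu:.1f} GB in VRAM, {ram:.1f} GB in RAM")
```

Under these assumptions a 20B model (about 10 GB) fits entirely in 16 GB of VRAM, while a 70B model (about 35 GB) leaves most of its weights in system RAM, which is consistent with the sub-1 t/s speeds reported above.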
@k.g9863 8 months ago ⁺¹
I get the "Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0" error when trying to run dolphin 8x7b. Can anyone help?
@someusernameidclol 8 months ago ⁺¹
What do I do after I close it? I can't open it again for some reason.
@IndecisiveStoner 1 month ago
GGUF models do use CUDA tho. What’s the reasoning behind saying it’s CPU-only?
@juliandelphiki8443 7 months ago ⁺¹
Please help! I have this error and I don't know what to do: AttributeError: 'NoneType' object has no attribute 'Llama'
