ChatGPT - but Open Sourced | Running HuggingChat locally (VM) | Chat-UI + Inference Server + LLM

  • Published Oct 14, 2024

COMMENTS • 44

  • @BlueAntoinette
    @BlueAntoinette  1 year ago +4

    In the meantime we have created a HuggingChat Plugin for aitom8, our professional AI automation software. It lets you install HuggingChat with just one command. Everything explained in this video is still valid and fully functional, but you can further improve your efficiency with this video: ua-cam.com/video/HO1V7kLQu6s/v-deo.html

  • @kaitglynn2472
    @kaitglynn2472 1 year ago +1

    Thank you so much for this wealth of knowledge!! Spectacular job!

  • @ShakeelaK-b5b
    @ShakeelaK-b5b 1 year ago +4

    Thanks for this detailed tutorial. Would you mind sharing the scripts that you created?

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +6

      Hi, I have now added a link to my instructions and scripts in the video description. You can also access them directly on our site at this link: www.blueantoinette.com/2023/05/09/chatgpt-but-open-sourced-running-huggingchat-locally-vm/

  • @itsmith32
    @itsmith32 1 year ago +1

    Thank you so much! Great job

  • @BlueAntoinette
    @BlueAntoinette  1 year ago +2

    Update: A new video about running Code Llama locally is available:
    ua-cam.com/video/mhq6BQX0_P0/v-deo.html

  • @thevadimb
    @thevadimb 1 year ago +1

    First, thank you for your video and for sharing your experience! A question - why did you allocate two GPUs? Why do you need more than one for simple inference purposes?

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +1

      Well, this was a little bit of trial and error. I first increased the number of GPUs and then, when it still did not work, the CPUs and RAM, which eventually turned out to be the deciding factor. So you can potentially get away with just one GPU, but I did not test that.
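
      For reference, the video runs everything on a Google Cloud VM, and provisioning a machine with two GPUs looks roughly like this (a sketch only; the instance name, zone, machine type, disk size and image below are illustrative placeholders, not the exact values from the video):

        # Sketch: create a GCP VM with 2 GPUs for the inference server.
        # All values are illustrative; pick a zone that has GPU capacity.
        gcloud compute instances create huggingchat-vm \
          --zone=us-central1-a \
          --machine-type=n1-standard-16 \
          --accelerator=type=nvidia-tesla-t4,count=2 \
          --maintenance-policy=TERMINATE \
          --boot-disk-size=200GB \
          --image-family=debian-11 \
          --image-project=debian-cloud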

    • @thevadimb
      @thevadimb 1 year ago +1

      @@BlueAntoinette Thank you!

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +2

      @@thevadimb FYI, I have now tried it with just one GPU, but that results in the error "AssertionError: Each process is one gpu". Then I tried reducing the number of shards to 1, but that just waits endlessly with the message "Waiting for shard 0 to be ready...". So the only reliable configuration so far is the one I show in the video (with 2 GPUs).
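
      For context, the inference server in question is Hugging Face's text-generation-inference Docker image, and the two-GPU / two-shard launch that works looks roughly like this (a sketch; the port mapping and data path are illustrative, and the model id assumes the Open Assistant model mentioned elsewhere in this thread):

        # Sketch: run text-generation-inference sharded across 2 GPUs.
        docker run --gpus all --shm-size 1g -p 8080:80 \
          -v $PWD/data:/data \
          ghcr.io/huggingface/text-generation-inference:latest \
          --model-id OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 \
          --num-shard 2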

    • @thevadimb
      @thevadimb 1 year ago +1

      @@BlueAntoinette Thank you for devoting your time to checking this point. It is a bit weird that it requires at least two GPUs. HF did tremendous work building this server, so it is surprising that after all that careful design they ended up with such a restriction. I would bet that there is some hidden configuration setting... Probably 🙂

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +2

      @@thevadimb Well, apparently they optimized it for their needs. Maybe there are settings for this, or it requires changes to the code and a rebuild of their Docker image. However, that's beyond the time I can spend on it for free.

  • @MultiTheflyer
    @MultiTheflyer 1 year ago +1

    Thank you!!! This has been super useful. I'm trying to use this front end, but I'd like to use the OpenAI APIs as a backend, because they currently support function calling (I don't know of any other model that does). I'm quite new to programming in general and have no experience with Docker. My understanding, though, is that the huggingface chat-ui front end cannot be "edited" and can only be deployed as-is because it's already in a container; is that correct?
    I'd like to change it slightly so that it shows when a function is being called, etc., but it seems that's not possible, right?
    Thanks again for the useful tutorial, it really did open up a new world of possibilities for me.

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +2

      Not quite right. I do not run the Chat-UI in a container; instead I run its source code directly with npm run, so please check that part of the video again. If you want to make changes to the source code, simply clone or fork the repo and adapt it to your needs. The Chat-UI is written in TypeScript and utilizes Svelte and Tailwind, so you will want to make yourself familiar with these technologies.
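
      For anyone following along, running the Chat-UI from source instead of a container boils down to something like this (a sketch; chat-ui needs a MongoDB instance for chat history, and the .env.local keys can differ between chat-ui versions):

        # Sketch: run Hugging Face chat-ui directly from source.
        git clone https://github.com/huggingface/chat-ui
        cd chat-ui
        npm install
        # Minimal local config; chat-ui stores conversations in MongoDB.
        echo 'MONGODB_URL=mongodb://localhost:27017' > .env.local
        # Start the dev server; customize the Svelte/TypeScript code in src/.
        npm run dev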

    • @MultiTheflyer
      @MultiTheflyer 1 year ago

      @@BlueAntoinette thank you!

  • @chuizitech
    @chuizitech 1 year ago

    Thanks, brother! This comes just as I'm preparing a private deployment.

  • @ShravaniSreeRavinuthala
    @ShravaniSreeRavinuthala 4 months ago

    Thank you for this video. I am trying to use the UI with my custom backend server, which has a RAG setup in it, but all it needs as parameters are the queries. From what I have explored, it looks like I have to make changes to the source code; is there any easier way to achieve this?

    • @BlueAntoinette
      @BlueAntoinette  4 months ago

      I did the same once with my RAG backend, and I had to make changes to the source code as well. Learn more about my solution here: aitomChat - Talk with documents | Retrieval Augmented Generation (RAG) | Huggingchat extension
      ua-cam.com/video/n63SDeQzwHc/v-deo.html
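
      Before patching the source, it can be worth checking whether a custom backend fits through the MODELS setting in chat-ui's .env.local, which tells the UI where to send queries (a sketch; the exact JSON schema varies between chat-ui versions, and the model name and URL below are placeholders):

        # Sketch: point chat-ui at a custom inference endpoint.
        # Field names follow chat-ui's MODELS format but may vary by version.
        cat >> .env.local <<'EOF'
        MODELS=`[
          {
            "name": "my-rag-backend",
            "endpoints": [{ "type": "tgi", "url": "http://127.0.0.1:8080" }]
          }
        ]`
        EOF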

  • @eduardmart1237
    @eduardmart1237 1 year ago +2

    Is it possible to train it on custom data?
    What are the ways to do it?
    Does it support any languages other than English?

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +2

      Theoretically you can run it with any model; however, so far I have only tested it with the Open Assistant model.
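
      Swapping the model is mostly a matter of changing the --model-id that the inference server is started with, for example (an illustrative sketch; the chosen model has to fit on the available GPUs):

        # Sketch: same launch command, different model.
        docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data \
          ghcr.io/huggingface/text-generation-inference:latest \
          --model-id tiiuae/falcon-7b-instruct --num-shard 2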

  • @oryxchannel
    @oryxchannel 1 year ago

    I wonder if you could build a privacy filter out of diverse prompt clusters. Tell it that it's all PII or something, so that the VM isn't able to read your data. Tunneling and all that. This may be an added privacy solution, or maybe it won't work at all. But the fact that it's on a Google virtual machine does not mean that it is "local" or "private". Also, if you have the capacity for it, an MMLU AI benchmark video would be helpful.

  • @frankwilder6860
    @frankwilder6860 1 year ago

    Is there an easy way to run the HuggingChat UI on port 80 with SSL encryption?

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +1

      Yes, you can set up an NGINX reverse proxy with SSL encryption. I fully automated this process in this video: ua-cam.com/video/v0D2rNHmSD4/v-deo.htmlsi=xjU2QGt_vQHXaBgj
      With this approach it takes just one command!
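
      The core of such a setup, for reference, is an NGINX server block that terminates SSL and proxies to the Chat-UI (a sketch; the domain and certificate paths are placeholders, the port assumes chat-ui's default dev port 5173, and the certificates could come from e.g. Let's Encrypt / certbot):

        # Sketch: NGINX reverse proxy with SSL in front of chat-ui.
        sudo tee /etc/nginx/sites-available/chat-ui <<'EOF'
        server {
            listen 443 ssl;
            server_name chat.example.com;  # placeholder domain
            ssl_certificate     /etc/letsencrypt/live/chat.example.com/fullchain.pem;
            ssl_certificate_key /etc/letsencrypt/live/chat.example.com/privkey.pem;
            location / {
                proxy_pass http://127.0.0.1:5173;  # chat-ui dev server
                proxy_set_header Host $host;
            }
        }
        EOF
        sudo ln -s /etc/nginx/sites-available/chat-ui /etc/nginx/sites-enabled/
        sudo nginx -t && sudo systemctl reload nginx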

  • @jackfalcun
    @jackfalcun 10 months ago

    Damn, almost 600 USD monthly for the inference server alone.

    • @BlueAntoinette
      @BlueAntoinette  10 months ago +1

      Well, only if you choose variant 2. The high costs of variant 2 are caused by the required hardware; the inference server (software) itself comes at no cost (apart from integration, automation, maintenance, etc.).
      LLMs are very resource intensive, and the cloud providers charge a lot for the required GPUs.
      Alternatively, you can stick with the remote endpoints (variant 1).

  • @deathybrs
    @deathybrs 1 year ago +1

    I am a little curious - why a VM?

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +2

      You mean in contrast to running it on your local machine? Well, there are several reasons. For example, if you do not have sufficient hardware resources on your local machine, which is especially likely when you choose variant 2. Or if you want to make it publicly available with SSL encryption and a reverse proxy.

    • @fredguth1315
      @fredguth1315 1 year ago

      Also, if you are developing as a team, a VM is handy for keeping environments in sync.

    • @deathybrs
      @deathybrs 1 year ago

      At the end of the day, I think maybe I should have been more clear in my question.
      Why a VM *before* explaining how to set it up *without* a VM? I understand the value of a VM, but there aren't many videos explaining how to do this, so why *start* with the VM explanation rather than explaining how to get it set up in our native environment first?

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +1

      @@deathybrs When it comes down to the Chat-UI, there is no difference between installing on a VM and on a local Linux-compatible machine. If you don't want to use a VM, then you don't have to for the UI part. If you run Windows locally, you could utilize WSL as well. If you want to discuss your situation in more detail, feel free to share your native environment.

    • @deathybrs
      @deathybrs 1 year ago +1

      @@BlueAntoinette I really appreciate that, thanks!
      At this point I'm actually not ready to set it up, as my use case for AI is 90% diffusion rather than LLMs, and I suspect that unless I need it soon, the tech will have changed shape so much by the time I do get there that a video made now would not be applicable enough to be worth your time.
      But as I said, your kindness is certainly appreciated!

  • @GAMINGDEADCRACKER
    @GAMINGDEADCRACKER 1 year ago +1

    May I get your mail address? I want to know more about it.

    • @BlueAntoinette
      @BlueAntoinette  1 year ago +1

      Yes, please find my contact details here: www.blueantoinette.com/contact-us/
      Thx!