Use AutoGen with ANY Open-Source Model! (RunPod + TextGen WebUI)
- Published 16 Sep 2024
- I might be obsessed with AutoGen...
In this video, I show you how to use AutoGen powered by TextGen WebUI and RunPod, which means you can use literally any open-source large language model with it, even Falcon 180b or Code LLaMA.
Enjoy :)
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewber...
Need AI Consulting? ✅
forwardfuture.ai/
Rent a GPU (MassedCompute) 🚀
bit.ly/matthew...
USE CODE "MatthewBerman" for 50% discount
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
Media/Sponsorship Inquiries 📈
bit.ly/44TC45V
Links:
Use RunPod - bit.ly/3OtbnQx
AutoGen Beginner Tutorial - • AutoGen Tutorial 🚀 Cre...
AutoGen Intermediate Tutorial - • AutoGen FULL Tutorial ...
AutoGen Fully Local - • How To Use AutoGen Wit...
AutoGen - microsoft.gith...
LMStudio - lmstudio.ai/
RunPod TextGen WebUI Template - bit.ly/3EqiQdl
Install TextGen Locally - • How To Install TextGen...
RunPod Full Tutorial - • Run ANY LLM Using Clou...
Should I make a video testing different open-source models to see which one powers AutoGen best?
I think that would be of great utility !
Yes!
That would be a tremendous help Matt - I was actually thinking of the ability to have specific agents using different models due to type and complexity of their roles - (one role could just have access to a business data model to keep things tight in certain areas) would really be a massive bespoke powerhouse
Definitely! 😉
I'd like a really useful use case of AutoGen.
I work at a law firm, and I have set up an AutoGen group chat to simulate a legal team to solve tasks. The team gathers legal information and argues legal matters between agents to come up with multiple scenarios, and a virtual judge finally rates each of the suggested solutions. I tried it on previous exams from law school and compared AutoGen's output to the exam evaluation, and it is staggering how well it performs.
is this a public repo? have you posted it to the examples chat in autogen discord? would be awesome to see.
I'd also like to see this. I want more examples of people using AutoGen.
This is a great idea, would love to see it!
you should productize this and sell it to other law firms. Don't give out the code for free to freeloaders.
@@neoblackcyptron I'm sure someone will regardless of what this guy does, but regarding the open-source community as freeloaders is weird. It's the backbone of the entire Internet age, imo. He would need to use something like 1500 different FOSS projects (if you've ever seen a dependency tree) to productize his work.
Your ability to parse these install instructions and organize them into a video that we can actually follow is amazing.
Thank you for making these videos!
Thank you so much for the shout-out Matthew 😊! Amazing video and well-explained tutorial as usual! As I told you in private, even as a software engineer, you were the first one I watched, and you helped me learn so much during my first steps into the AI & LLM world. Hopefully we'll have more amazing discoveries to share 😉.
Thank you, Ivan!
Things that I think are a must for AutoGen to take off:
1) how well, if at all, it can push to GitHub
2) iterating on the GitHub repo
3) embeddings and a vector DB like Supabase to store all prompts so it doesn't deviate too much from the development of the coding project :/ (but maybe I missed that part)
This will be INSANE! Can't wait to see what all people make from this.
Is anyone else continuously getting 502 gateway errors when they finish configuring the pod in the web UI? I've tried it on two different machines while using both Mistral 7b and Dolphin Mistral 7b
I never quite figured out how to get multiple agents set up in VS Code, running Mistral 7B locally with AutoGen. I configured an assistant named "Coder" and a second assistant named "Checker" and tried to get Coder to pass all his work to Checker for verification, but instead it all came back to me as UserProxy. Would be great to see a 5-agent example, like a little dev team with a CEO, concept designer, user interface guy, coder, and code checker or something similar 👍
I believe the name of the assistant object and the assigned name have to be the same
2:23 It is not completely uncensored; however, an effort was made, and with proper instructions you can mostly avoid the censorship it still tries to apply to its output. This was a censored model on which fine-tuning efforts were made to reverse the censorship; it was not 100% successful, but it was a good effort and the model is substantially more useful.
Thx Matthew for this incredible work. I tried this before many times with many models, and Mistral was the best and lightest option. I faced one issue with the context length limit, and I hope they have a good technique to solve it.
Worked sweet on my older Mac M1; was able to create a POC for a healthcare project… immediate industry value
Keep up the good work! Loving the AutoGen series! A wizard, an assistant, and a completer walk into a bar...
Really confused about the actual pricing for running on RunPod. The posted prices ($/hr) don't mean anything to me because I'm clueless about how much GPU time would be used in the real world. Is it likely to be multiples of ChatGPT-4's $20/mo? If you spend a day coding with Mistral, what does that set you back?
Following
Hi! The 20 bucks per month for ChatGPT and the OpenAI API are two different things. If you want to use the OpenAI API for your AutoGen setup, you have to pay for every token regardless of whether you are paying for ChatGPT or not. Peace.
Waiting for the advanced code generation tutorial by autogen
yeah, me too!!
I've discovered autogen + langchain can work with Excel sheets. Autogen can read the columns and calculate financial ratios (I use it for finance). Really looking forward to the advanced autogen video.
Hi there! Is there any chance you can share this? I'm trying to do the same thing!
Please share that (bis).
Thx in advance!
@@ludoviclebleu search "Using Langchain with Autogen", video by DLExplorers is the video I followed. Change the Excel file + autogen prompts for your use case.
@@candogruyol search "Using Langchain with Autogen", video by DLExplorers is the video I followed. Change the Excel file + autogen prompts for your use case.
Yes please!
A few questions:
1 - My scripts always fail because they generate more than the 8K token limit. Is there a way to avoid this from happening? Can ctags or another method be implemented?
2 - Will AutoGen work with existing (large) codebases (I have a Django project I'm working on), and if so, how?
Just got it to work locally on my Windows box. Thank you for the video. Um, a suggestion for folks: make sure you tell the bot in your system message which OS you are using. It likes to default to Linux. :) TextGen WebUI is a beast. LM Studio is too new.
Your videos are amazing!
Thanks!
Runpod also offer like a "LLM as a service", where you pay as you go. You think you could cover that in a video sometime?
Yes, that's a must!
Can't wait for the Autogen advanced tutorial!
Is there a reason to use text gen webui instead of LM Studio for a local execution scenario?
Hi Matthew great stuff can you maybe make a video on GDPR and data governance if you are using auto gen? Is it safe to use if you have sensitive data
We would love to see your projects, they must be interesting!
Maybe I should train Dolphin on Falcon-180B too.
Damn, why did I only find your channel now?
Anyway, you do a good job with this channel. Very detailed, practical, and easy-to-follow explanations. 👍👍
A helpful usecase I’ve found was with finding ongoing clinical trials that a particular patient could be a good candidate for.
This implementation was technically with AutoGPT (I haven't done it with AutoGen yet).
Do a real-world example deploying LLMs in k8s to a simulated production enterprise, where developers can connect to the LLM in-cluster.
In your previous video you mentioned LM Studio, where the GPU could be used for GGUFs. How can one use GPUs for GGUFs? Thanks!
I'm reading now that you should be using GPTQs for running on VRAM (GPU).
i used autogen with openai's api key and it ran my usage to $8 in less than 10 minutes
Can you show us how to run agents that are a mix of OpenAI as well as open source?
Hi Matthew! Thanks for sharing! May I know where I can check the AutoGen advanced tutorial? Is it in the Substack? Please let me know :) Have a nice day!!!🤗
Build an app. Your personal project would be great. Need to see how configure the agents. Code llama please!
I just installed Code Llama in TextGen UI, but whenever I try running it with the Transformers model loader I get lots of traceback errors. Showstopper for me and Code Llama :(
you're on fire! I've learned so much about autogen from you and really appreciate your clear and focussed tutorials. Thanks Matthew!
Thank you for the great work and video again! I was wondering about the possibility of combining Aider with AutoGen. For example, could a developer agent use Aider when the prompt is given by the proxy agent?
Looks great, thanks.
But wow, if I have to do it from scratch for local use, it's kind of complicated, because you'll bump into many issues...
I followed the tutorial (several times, actually) and still cannot get port 5001 ready. This is needed to emulate OpenAI API. I added 5001 to the "Expose HTTP Ports (Max 10)" field in the RunPod configuration (also tried editing the pod later too), followed the instructions in the video carefully and always get "HTTP Service [Port 5001] Not Ready" in the Connection Options tab of the Connect dialog. HELP!
same here! Was seeing if anybody had this issue recently
its no longer port 5001. 5000 is the new api port
is it possible to have agents from different LLMs talking to each other through Autogen. For example Mistral with Openai?
looking forward to the real usecase video thanks again Matthew!
TextGen WebUI is janky as hell in its presentation, but it is still my favorite interface for trying new models because it is so bleeding edge. Great video.
Anyone know the price differential on requests of this approach vs OpenAi?
I can only imagine it’s much cheaper.
Kind of sad that local LLM with autogen are not really ready for primetime. I hope they get better. At the moment, we can barely even create toy projects as demonstrated with GPT-4.
Thanks for your tutorials, are amazing. Please, I would really like to see more Autogen tutorials, the best use cases!!!
Thanks, will do!
Super keen for that Advanced AG Tutorial!
Great content, concise and gets to the point
Dude yes.. thanks Matt 🙏🏽💎
Would you consider doing a video about dark-web-trained LLMs? I know of DarkBERT so far, but none others... I'd love to see that level of uncensored available.
What are the differences between RunPod and LMStudio? Why did you go away from LMStudio?
It has missing functions though... FastChat does better in terms of the API, but has issues with CPU offloading.
We would love to see your personal project on AutoGen! Mine is about leveraging AutoGen to craft an AI-driven intelligent solution for optimized pharmaceutical inventory management. I think it's certainly too ambitious; I would love your advice.
How do I get my GPU to work instead of CPU. It is taking very long to run my code as I notice from task manager only my CPU is working while the GPU is idle.
I do not know if there has been a change in the template or something, but I followed this video and another video exactly and could not get port 5001 working. Then I asked on the RunPod Discord, and they told me to add an environment variable called UI_ARGS to the pod with a value of --extensions openai --api-port 5001.
Then it worked. Hopefully this will help those who face the same issue.
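Once the API port is exposed, AutoGen talks to it through an ordinary OpenAI-style config. A minimal sketch of what that config might look like; the pod ID and model name below are placeholders, and whether the port is 5000 or 5001 depends on the template version.

```python
# Sketch of an OpenAI-style config list pointing at a TextGen WebUI
# endpoint on RunPod. Pod ID and model name are placeholders.
POD_ID = "your-pod-id"   # hypothetical: find yours in the RunPod console
API_PORT = 5001          # or 5000 on newer templates

config_list = [{
    "model": "mistral-7b",  # whatever model TextGen WebUI has loaded
    # RunPod proxies exposed HTTP ports at <pod-id>-<port>.proxy.runpod.net
    "base_url": f"https://{POD_ID}-{API_PORT}.proxy.runpod.net/v1",
    "api_key": "sk-not-needed",  # self-hosted servers usually ignore this
}]

llm_config = {"config_list": config_list, "temperature": 0}
```

This `llm_config` dict is what you would pass to AutoGen agents in place of an OpenAI key-based config.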
where do you add that environment variable?
not all heroes wear capes, thank you!
under expose http ports, there is env var setting.
Only thing missing is adding LangChain or LlamaIndex to talk to a database (CSVs, PDFs, etc.)
You should make a video about visual Copilot
What’s that?
Why use RunPod, when you demonstrated that LM Studio can do this locally free of charge? Also, Mistral 7B can easily run locally, which means you can run AutoGen endlessly and for free. Am I missing something here? What's different in this video, besides RunPod, compared to your previous one? I was hoping to see LM Studio with Mistral 7B and AutoGen working locally on code, but I miss the point of this video, where LM Studio is replaced by RunPod (which is paid by the hour). What's the message, why should we pay for RunPod?
LM Studio has requirements that not all of us can meet locally :)
There are still a lot of local computers that are not well equipped.
I picked up a second-hand MBP Mid 2010 Core i7.
Will this run AutoGen locally?
Or a real-world use case: I personally need to map my product categories to the Google product category structure. Would love to be able to set that up on my PC.
Been following your instruction for so long. Do we know what is wrong if the port 5001 is never ready?
Hey Matt, unfortunately Dolphin model quickly hits max context window of 2048 tokens. I've tried few different ones (Mistral-7b) but with various success. Let us know which model works the best and thanks for great content!
MemGPT might be a good solution to the context window issues. I'm making a video about it today.
Looks like Autogen is already adding support for memgpt-enabled agents. I guess the limitation now is finding open source models that support function calling correctly. There are a couple up on Huggingface that claim to do so.
These are great! Thanks for showing us the way mate
Very cool, thanks for sharing! I am going to be trying this asap 🎉
Thanks for the info,subscribed too.
that opening shot transition was sick, where is that from?
You are absolutely outstanding, Mat!
Can Autogen be used with LLMs from Huggingface? Like in Langchain...
Man you seem younger every day it pass, keep your great job. LOVE
Three cheers for AutoGen!!
this series is great keep it up! thank you!
I also see the same problem of the non-exposed port 5001 on RunPod pods. Maybe they block it. Tried the instructions a lot of times and always get "port not ready".
can you start mentioning associated costs with the videos you do please? i.e. runpod cost me $xx.xx to run through this demo
Getting this error. While testing mistral model.
Error occurred while processing message: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
Hey, great video. I am actually starting to work with AutoGen and local LLMs. But there is a big issue with using tools while doing so; do you have any solution for this? My first thought right now is to pass the tools as arguments and, instead of running a simple generate on the models, run an agent chain with LangChain, but it's not a clean solution.
So we don't need an API anymore? I mean, can we run it without an API? Or just use the API generated by this platform?
You don’t need to pay for ChatGPT anymore, either way you need an API.
It seems like the template is failing to run openai on port 5001
Could you do more videos on autogen studio?
Autogen advanced video please!
Hi, I just started to venture into the AI world almost 2 weeks ago. And I am very fascinated by LLMs, ChatGPT, etc. I still struggle to learn Python and to understand what Docker is. Do you have any suggestions on where I should start?
Ask chatgpt.
I got to know RunPod because of AutoGen and used it.
How come you did not have to put the `llm_config` in the `UserProxyAgent`? Which llm is it using by default?
Can you show how to do this with RunPod Serverless endpoints?
Also, is there a way to secure the endpoint and set your own API token?
Great video again. I am having a small issue with the current setup. I tried everything mentioned in the video, and TextGen WebUI is working fine, but whenever I try to connect to it via the API I get no response; AutoGen is working fine between agents, but the response is 'None'.
I got a context window size limit error (~2K). Is there any setting in the TextGen UI that can overcome this?
Hello, would anyone understand why I can't start the service? I am new to this and I can't find a way to solve it, I need to use port 5001
"Is your service running? Check your logs or read the README"
Yehahaha you heard me lol but using runpod isn't free lol 😔
rip
but it does cost some money to use the biggest gpus
@@ryzikx how about using petals with autogen
So, could we do this with petals too?
Ayy, just in time! AutoGen Hyype!
Is it possible to use Autogen with open source models that are hosted on AWS EC2 instances?
Do we need cuda drivers?
@matthew_berman - Can we use AWS SageMaker foundation Models wit AutoGen?
AutoGen is a really interesting project, especially when you don't have to pay the OpenAI fees. But the relatively small context windows of LLMs (all of them, really) are frankly a showstopper for using AutoGen. I work on a project which consists of more than 2000 Java source files, and I don't see any way to use AutoGen to develop or iterate on projects of this size.
MemGPT solves that issue.
Can the TextGen WebUi run with the multiple autogen agents?
This is awesome 😍
This is really good
Thankyou so much.
The A100 and 4090 are twice as fast as the A6000 for inference.
So do I understand correctly, with this method I don't have to pay for the chatGPT API like at all?
more please
Where's the "Advanced Tutorial" for AutoGen? You've mentioned that it's coming for a while now, but I'm not sure it exists? One of the main reasons I follow your channel. Thanks!
that's what I been doing for almost 3 weeks now :D
I can't install AutoGen or import it on Linux or Windows, I'm so sad...
Didn't you just expose the runpod publicly with no API key protection?
How do you get internet access?