I am not sure how many people will see this video, but trust me: everything I learned over the last month, this video explained. Thank you so much. Subscribed, and best of luck.
🎯 Key points for quick navigation:
00:26 *📊 A new tool for comparing large language model providers is discussed, focusing on dimensions crucial for choosing an LLM for applications.*
01:06 *🌐 Open WebUI provides an accessible interface for team use with just one OpenAI API key, integrating well with Ollama.*
01:58 *💸 GPT-4o Mini is highlighted for its cost efficiency, offering lower prices and suitable for applications requiring low latency.*
03:54 *📈 GPT-4o Mini currently supports text and vision with plans for future multimodal support.*
06:49 *🛠️ Groq's collaboration introduces models specialized in function calling, crucial for building agent-based applications.*
10:05 *🏆 Artificial Analysis provides a comprehensive tool for independently comparing LLMs across quality, speed, and price.*
25:55 *📚 A new leaderboard evaluates LLMs on their ability to correctly call functions, crucial for programming applications.*
28:56 *🧩 Evaluation of AI models in function calling involves assessing Pythonic and non-Pythonic capabilities, from simple to complex function interactions.*
30:23 *📊 Non-Python evaluation includes testing AI models on tasks like correct usage of REST API and SQL query syntax.*
32:34 *🥇 Claude 3.5 Sonnet leads in function calling performance on the Berkeley Function Calling Leaderboard, emphasizing accuracy and efficiency metrics.*
34:17 *📉 Open-source models like Meta Llama 3 lag in function calling reliability compared to proprietary models like GPT-4.*
41:23 *🛠️ Open WebUI allows local deployment for managing AI models, facilitating tasks like retrieval-augmented generation and function calling.*
01:00:13 *🛡️ Users can define a whitelist of accessible models in Open WebUI, restricting access to specific models like GPT-3.5 and GPT-4.*
01:01:33 *🌐 Open WebUI integrates with both OpenAI and Ollama APIs, allowing management of Ollama models directly within the platform.*
01:02:43 *📄 Document management in Open WebUI supports RAG applications, enabling users to configure document embeddings for retrieval and processing.*
01:03:50 *🎨 Open WebUI facilitates image generation using models like OpenAI's DALL-E 3, offering options to adjust resolution and other parameters.*
01:05:36 *🔍 Web search capabilities in Open WebUI allow users to perform internet searches directly within the interface, using services like DuckDuckGo without additional API costs.*
01:29:11 *🌐 Setting up Open WebUI allows for web searches directly integrated into AI models, enhancing information retrieval capabilities.*
01:31:45 *🖼️ Creating prompt snippets in Open WebUI simplifies repetitive tasks like image generation, using predefined templates with placeholders.*
01:42:09 *⚙️ Defining tools and functions in Open WebUI enables custom API integrations and complex calculations within AI models.*
Made with HARPA AI
Thank you for your effort. It's a great video.
You are welcome!
I have been playing with RAG for 3-4 months, using several different models on my local gaming desktop. This really helped remind me of several features and brought things back into a clear perspective. Great help!
I also share the sentiment that this video consolidates a ton of knowledge. After three years of experimenting with AI, this unifies quite a bit of utility. This gentleman certainly has a sub from me!
If he can really dive into tuning the functions and tools, that would help; he's right that there's not much clarity in the documentation.
I want to process exports of conversations from language models so they can be grouped into categories.
Thanks for sharing. It would be interesting to see usage of the Pipelines module of Open WebUI 🙌
Great suggestion!
I love your dedication to explaining things. It is really hard to do this kind of tutorial, especially for fast-advancing tech, because we don't know whether next month, or even the next day, an update will abandon the features you invested time in explaining. This is real dedication. Kudos to you, mate!
By the way, be careful with Google products; most of them get discontinued someday (Google Notes, Google Plus, Google Sites, etc.).
Open source, however, will be there forever even if it isn't being used any more; even if it gets a completely new UI, it might get a new branch instead. So open source usually involves less "betrayal", especially when the community is aware of content creators' efforts too.
Thank you very much, Ricky.
I was looking for this. Thank you so much!
Thank you very much for your beneficial content.
Glad we could help :D
This is an amazing and informative video that I've been searching for. Thank you for your time. In the future, could you go into some of the pipeline features? I was able to get it set up but was unable to get the test scripts working properly. I wanted to create a mixture of agents using the pipeline.
Thanks for this, it was great. I really want to know more about the audio side, if you can do a video on it? I'm trying to get a less robotic voice response and don't have an API key to use OpenAI.
I'm working on that!
@@CaseDonebyAI Great thanks, looking forward to it
thank you so much!... this is super useful
Glad it was helpful!
Is there a way to limit the maximum number of tokens (both input and output) for an entire chat?
I think you can use a pipe's inlet function to limit/truncate the input tokens. For the output tokens, you might need a pipe with custom code that calls OpenAI's chat completion and specifies `max_tokens`.
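As a rough sketch of the inlet idea above: the class shape loosely follows Open WebUI's filter/pipe convention, but the 4-characters-per-token estimate and the hard-truncation policy are my own assumptions, not the project's API.

```python
class TokenLimitFilter:
    """Sketch of an Open WebUI-style filter that truncates long user input.

    Assumption: Open WebUI calls `inlet` with the request body before it
    reaches the model. The 4-chars-per-token ratio is a crude heuristic,
    not a real tokenizer; swap in tiktoken or similar for exact counts.
    """

    def __init__(self, max_input_tokens: int = 2048):
        self.max_input_chars = max_input_tokens * 4  # rough token estimate

    def inlet(self, body: dict) -> dict:
        # Truncate each message's text so the total input stays bounded.
        for message in body.get("messages", []):
            content = message.get("content", "")
            if isinstance(content, str) and len(content) > self.max_input_chars:
                message["content"] = content[: self.max_input_chars]
        return body
```

For the output side, the same pipe could forward the modified body to the chat-completion call with an explicit `max_tokens` value.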
@@CaseDonebyAI Will check that out. Thank you for the clarification. Much appreciated!
Great overview of Open WebUI, thank you for sharing this walkthrough. Can you give a few more sentences on how the data storage works? Can I use a vector database in conjunction with this so that the system learns and has memory, or is it just basic RAG? I'm new to all this, so apologies if you talked about this and I missed it. Thanks.
Hello David. I think it's the following: (1) the conversation itself has one memory (what the user says and what the AI responds); (2) there is local RAG memory if you upload documents from local storage. For (2), if you wish to use an external vector embedding database, you could try configuring the relevant environment variables (see docs.openwebui.com/getting-started/env-configuration/#rag). I haven't tried it myself, though.
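For reference, a configuration along these lines might do it. `VECTOR_DB` and the `RAG_EMBEDDING_*` variables follow the Open WebUI env-configuration docs linked above, but the Qdrant URI and the embedding model name are placeholders; I haven't verified this setup end to end.

```shell
# Sketch: pointing Open WebUI's RAG store at an external vector DB.
# Values below are illustrative placeholders; check the env-configuration docs.
export VECTOR_DB=qdrant
export QDRANT_URI=http://localhost:6333
export RAG_EMBEDDING_ENGINE=openai
export RAG_EMBEDDING_MODEL=text-embedding-3-small
```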
@@CaseDonebyAI Thank you for the response and thanks again for posting these videos!
Which version do you have (Open WebUI)?
The video was made in May 2024, using an earlier version of Open WebUI. However, most features are the same, except Knowledge (previously called Documents) and some functions like the event emitter. See my video on these updates:
Open WebUI-Features Updates-Oct 24-Knowledge, RAG, Tools, Functions, Actions, Event Emitter, Filters
ua-cam.com/users/live8FChfBUilno
Is there a way to connect the Gemini API to Open WebUI? It used to be supported, but it seems to have been removed now.
Good question. I haven't tried that myself, but I would like to think the following steps might work: (1) set up an endpoint in GCP that uses Gemini but emulates the OpenAI API (cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library), and (2) use that endpoint in Open WebUI.
Hey, I don't get how to use these tools. I add the tool, enable it for the model, and include it in the conversation itself, for example web search or web scraping, but the model still doesn't go to websites and search online; basically, it doesn't work with tools at all. The Open WebUI documentation says that not all models support tools. I'm using Gemma 2 27B.
Have you tried using a larger model, like GPT-4o? The success of tool calling depends on the knowledge of the LLM as well.
I don't know if this is a stupid question, but is this possible with Llama 3.1, which is open source, I think? That would be $0.00 per million input tokens, correct?
I'm certainly positive it's doable. You could just download Llama 3.1 into Ollama (using this command: `ollama run llama3.1:8b`). Then you should see the model in Open WebUI's dropdown list.
Can Open WebUI be deployed for an enterprise, with multi-user authentication for 10-15 employees? What would the hardware requirements be if we have to host it in the cloud?
Definitely possible for enterprise, especially for a small group like 10-15 employees. The hurdle would be more on the cybersecurity side, so you might need to consult with your cybersecurity team if you have one; this is a crucial step for enterprises. Regarding hosting, the server does not need GPUs if you will use OpenAI API calls. For our in-house training, we use a cloud virtual machine with 4 cores and 16 GB of memory to handle incoming requests from 50 users.
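For a small team, a container deployment along these lines might work. The image name follows the Open WebUI docs, but the port mapping, volume name, and secret handling here are illustrative, and your security team should still review network exposure and TLS termination.

```yaml
# Illustrative docker-compose sketch; adjust ports, volumes, and secrets.
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}   # supplied via the host environment
    volumes:
      - open-webui-data:/app/backend/data  # persists users, chats, documents
    restart: unless-stopped

volumes:
  open-webui-data:
```

With OpenAI handling inference, the host only runs the web app and its database, which is why a modest 4-core / 16 GB machine can serve dozens of users.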
Does Open WebUI support creating an API endpoint for AI models, or is it just a chat UI?
Does it expose the models as a RESTful API?
Open WebUI has API endpoints that you can access too, allowing you to get chat completions, get files from chat sessions, etc. You can learn more from its documentation. However, if you wish to just access AI models as endpoints, I wouldn't use Open WebUI; it's overkill.
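As a sketch of what calling that API could look like: the `/api/chat/completions` path and Bearer-token auth follow the Open WebUI docs, but the host, port, model name, and key below are placeholders, and the actual network call is left out so the snippet stays offline.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat request against an Open WebUI instance."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/api/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # key from Settings > Account
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("http://localhost:3000", "sk-placeholder",
                         "gpt-4o-mini", "Hello")
# urllib.request.urlopen(req) would send it and return a JSON completion.
```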
I installed Open WebUI today, and I have both Claude and Groq API keys (they are both OpenAI-API-format endpoints); however, I don't understand where in the settings to enter the endpoints that Open WebUI requires. The docs say the latest version allows local, OpenAI-compatible, and now even non-compatible API endpoints.
Let us check on that.
When I add a tool, it errors (a Win32 error). How do I fix this?
So many things can go wrong :( syntax errors, missing libraries, etc. It's hard for us to really pinpoint the issue and help you out here.
Thank you for another good overview, it is very useful. _/|\_
Where is the Chinese model Qwen?