Open WebUI is the way it is because of how Ollama is designed and the intention to make a copy of the ChatGPT UI, which was at least so successful that I once confused it with the real ChatGPT. If you've worked with Ollama more on the command line, I think it's pretty obvious what the modelfile section is about.

Ollama annoyingly mixes up models and weights. So it took me a while to realize that an "Ollama" model is just a parameter file for the files that contain the weights, the thing we usually call a model. Except that the very first modelfile is kind of tied to the weights, but you can create as many more modelfiles as you want. I don't know if there is a set of settings already included with a GGUF file or if Ollama just assumes certain defaults, as many modelfiles you can download from the Ollama site are mostly empty. In some cases the chat template is not even known, let alone additional parameters. Some models (that is, modelfile + weights) you can download from Ollama are even broken, so you can't get them working. Sometimes it is because the model was incorrectly trained, like Westlake. Sometimes parameters are missing, for example the context size for the llama3 models. I don't know if they have fixed this yet, but a llama3 model has a context size of 8k; if you don't set this in the modelfile, you will get trash responses after reaching this limit.

The Ollama CLI also seems to store whole chat sessions linked to specific modelfiles. So you can quickly end up with tons of modelfiles which are listed as models but actually reference the same weights. If you need a different set of parameters, then in the CLI you'd export the current modelfile, change it as needed and then "create" a new model from it. If you "create" it under the same name, the current one will be overridden. That's most likely the process that is included in Open WebUI; I say "most likely" because I actually do all of this on the command line and have only had a quick look at this section of Open WebUI. ;-)

Now, since you can also change all parameters on the fly for the current session, just like you'd do with "/set parameter temperature 1" in the CLI, you can set these in the global settings of Open WebUI, so the defaults from the modelfile or whatever defined them before will be overridden. Of course that's complete nonsense and very annoying, since you obviously want different settings for different models. So the option they give you is to create new modelfiles instead. All of this is quite clunky and unintuitive, but that's more or less because Ollama is the way it is. Of course they could work around that, for example by keeping profiles in the UI instead of creating additional modelfiles; we'll see if they do this.
I had a longer reply first but removed it. It seems most of this comment stems from a misunderstanding of how ollama works. Perhaps join the discord and ask there.
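Setting the disagreement aside, the export, edit, and create cycle the comment describes does exist in the Ollama CLI. A minimal sketch, with placeholder model names:

ollama show --modelfile llama3 > Modelfile
# edit a PARAMETER line or the SYSTEM prompt in Modelfile as needed, then:
ollama create llama3-tweaked -f Modelfile
ollama list   # both names are listed, but they reference the same weights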
Good day, Matt. Hopefully this is my last question about the PrivateGPT installation. My laptop has arrived. I have installed an M.2 2T primary drive and a secondary 2T SSD. Q: After installing Ollama, Docker, and WebUI, can the models be stored on (directed to) the secondary SSD to preserve space on the primary M.2 system SSD? If so, when do I pick where to store the models during their installation?
And the goal isn’t to look at web uis but rather all clients in general. It doesn’t offer that much over the cli but I am sure there is one that blows everything away.
I got addicted to Ollama last year and got to play around with Open WebUI when it was still called Ollama WebUI. The name change messed up my Docker installs, not gonna lie. But then we decided to try it as a corporate AI companion, and as it was a testing phase we didn't scale our cloud very high, so it was pretty slow. On my machine, though, I wanted to try and use every bit of the feature set, which led me to install and learn ComfyUI. And while the image generation options in Open WebUI are limited whatever backend you use, it's still usable.
I appreciate that you introduced me to Ollama and are sharing your experiences and frustrations with the deployment of its various features. Your videos capture the combination of joy and frustration that is part of every software development cycle, and it feels great that I am not alone in feeling this. Thank you 😊
You leave my precious dark mode alone, you.. you meanie! 🙃 I use Open WebUI and I agree: why do we have to sign in, and what is that Modelfiles area for? I have not tried other addons yet, though, but I am about to, which is part of why I watched this video. So, keep going through that addons list! Excellent video!
I think it would be better to have the first Open WebUI account be admin, with better user management. Generation of local API keys would also be awesome, so security is there from step one in case it's ever in production in the future!
Yes, the user management is a bit lackluster, not really providing much security and really only offering a little speed bump. So make it optional, and then for folks who want the security, offer it in a real way.
Great overview, thanks! I hope you will consider reviewing several of these. I use textgenwebui right now, but the options and processes can be overwhelming. They do have the most options for hardware tuned models, GPTQ, etc. A review and explainer for this tool would be appreciated.
Thx for the video. A question about the combo ollama/openwebui/docker. I have this configuration and all is OK. I have a goal to reach: I want to specialize a pre-trained LLM, training it with a large base of data about coding in a proprietary language that is not popular; only about two hundred programmers use it. My questions are: which generic and light LLM can I use? I use some Python scripts to train an LLM (in my case, testing with Phi3:Mini), and I found a problem to solve: when I try to load the model, something goes wrong. In fact, Python says it cannot find the path of the model, usually ~/.ollama/models/… I noticed the LLM files are stored as SHA256-named blobs! Perhaps that is the problem! Can you help me do this training? Can you point me to documentation or tutorials? Thanks in advance. Have a good day. Sorry for my bad English. I'm an Italian developer.
Wow... this is the kind of detailed, helpful and to the point app review we should see more of from people. Thanks!
Have my subscription Matt. I like your highly clear and structured way of speaking.
Awesome! Thank you!
Good review, I have been using open-webui for a while and learned a bunch of new stuff, thanks. It seems to get better all the time, which should continue, especially now that you've uncovered areas for improvement. BTW, I like the new chat archive feature.
I’ve enjoyed watching your videos about Ollama. I found them informative & easy to follow. I actually now have a working ollama open-web-ui setup on my old laptop. Yes, it runs slowly, but it's perfect for my needs. After a bereavement in my family I decided to take the plunge & work for myself. You have provided me with an additional tool set that I can call upon. Keep your great videos coming. Thank you!
The new required login doesn't go to any remote site, it stays on the local computer. This way multiple users can store chat history and settings. I agree that it should be optional, but at least it's local.
Correct. It’s for access to openwebui. But it’s intended as a feature for hosting it on another system online.
Great video, I enjoy your content, very helpful. I have a question about agents. How can I contact you privately?
I am on the Ollama Discord. Or you can find me on Twitter, same name as this channel.
It is optional, I believe. Set WEBUI_AUTH to False.
Note: you can't change it back after doing this.
I really appreciated this video. I've only been using this tool for about a week and was really excited to get answers to all of the confounding and non-working features I kept running into...only to find out that they're actually confounding or non-working. 😂
When you set additional hosts in the "Connections" settings, they will act as redundancy, assuming you have the same models installed on each host. So if I serve multiple users, all using the same model at the same time, it will queue up requests to the currently unoccupied host, in sequence. I've tested it locally with 3 separate hosts, and it works quite well. BTW, thank you for the great video!
I was wondering whether a model like Llama3:70b or Llava could run on one PC with a lot of hardware resources, while on a separate PC you could run a light model like Phi 3. Then... I could turn off the powerful PC at night/weekends to save power, and the chat's model could default to Phi 3? Maybe this is what it could be used for.
After all, when the powerful PC is off, Open WebUI wouldn't know things like Llava even exist... Maybe Open WebUI would need a restart to notice things had changed? What are your thoughts?
@@liamburgo23 It sounds like you're proposing an interesting setup where you use a powerful PC to run larger models like Llama3:70b or Llava, and then switch to a lighter model like Phi 3 on a separate PC when the powerful one is off (for energy-saving purposes). You're right that when the powerful PC is off, the Open Web UI might not be aware of models like Llava unless it is restarted to recognize the change. This setup could work well if you manage the switching process effectively. As long as the Open Web UI is configured to handle the model change, it should be able to adapt to whichever model is active. I think it’s a practical way to balance performance with energy efficiency!
New viewer and Ollama user. I had been saying I wanted to go back to school now that I am retired from a university. Then I discovered this whole world of AI. WOW !! Been busy learning ever since. And I find your videos the easiest to follow. So thanks, you make learning this stuff fun.
Anyway, the logon prompt thing is fixed; there is an option available when setting up Docker. I browse to the web UI and get right to business. I just wish Docker wasn't so memory hungry.
Thanks for teaching me how to get started. The only downside of Ollama is that it can't pull directly from HuggingFace, but it is able to import the raw GGUF files by manually filling out a Modelfile. It's amazing.
I basically fill out FROM, TEMPLATE, PARAMETER for context size, and PARAMETER for stop words, then import it. The result is perfect.
I even did the import inside a Docker environment: just place the model folder inside the mounted volume path, then use "bash" inside the container, and you can do the import.
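For anyone who wants to try the same import, here is a minimal sketch of that flow; the file name, template, and parameter values are placeholders and need to match the actual model you are importing:

cat > Modelfile <<'EOF'
FROM ./my-model.Q4_K_M.gguf
TEMPLATE """{{ .System }}
USER: {{ .Prompt }}
ASSISTANT: """
PARAMETER num_ctx 8192
PARAMETER stop "USER:"
EOF
ollama create my-model -f Modelfile
ollama run my-model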
Thanks for this Matt, very easy to work with this tool!
Excellent review.
Your voice and mannerisms were made for this.
Wow, very easy to understand it when explained by you! Thank you so much
Great video. Here's my use case: I drafted a great many quiz questions and answers, but I'm having no luck using Ollama Web UI either to query the questions or to generate new ones, so I guess I must learn about RAG, Flask, etc.
What would make a great addition to this would be a RAG backend to load bulk documents. The way to do this would be to simply mount an external volume to the Docker image, then have a file watcher load up any new documents added to the external directory. All documents would be available to all users of WebUI for RAG use.
Or having RAG from a webcrawl, where the user just puts in the starting URL and domain.
You can upload all of the files that you want to load into the DOCS_DIR directory and hit scan; it will load any new files it finds. Not 100% automated, but more reliable than using the (+) button.
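To make that concrete, a sketch of wiring the docs directory to a host folder when running in Docker; the container path is an assumption based on the image's default data directory, so check the docs for your version:

docker run -d -p 3000:8080 \
  -v /home/me/documents:/app/backend/data/docs \
  -v open-webui:/app/backend/data \
  --name open-webui ghcr.io/open-webui/open-webui:main
# drop new files into /home/me/documents on the host, then hit Scan in the documents settings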
There’s a tiny button after the response that gives you data on tokens per second etc. I love that about this particular UI; easy to compare speeds.
Yes that is nice. It’s pretty interesting to see how much they have been able to replicate from the cli
You are amazing, love your teaching style.
I use open webui some but also use the command line. I'm not familiar enough with advanced usage of either, though. I appreciated this video and am looking forward to learning more. At this point, I'm just a sponge. Thanks!
hi Matt, amazing content. Thank you for sharing your thoughts with us and chatting with me during your stream.
The user login default worked well for me - at a company that can’t use cloud based LLMs for security reasons, the default workflow allows you to immediately install this tool and share it with regular users (who don’t know what a command line is). But I agree maybe there ought to be a “dev” switch that turns it off.
Really great video, looking forward to more.
Would love to see a detailed follow-up update review. Love the level of detail in this one.
I can appreciate your sense of humor. I just wish I was smart enough to get your goat. This is a case where it doesn't matter how early I get up in the morning. Your dedication and willingness to teach is above board. The fact that you have an intuitive ability to do it as well as you do is beyond irritating. You've made me require a different excuse for why I'm so damn slow at grasping this stuff. It's not like I wasn't there when my buddy upgraded to the Apple II, then the Mac. There were no apps or software; if you wanted your machine to do something, you had to tell it. Wow, I'm not quitting. Ah, I got it: even though I'm Gen X, I'm on the cusp of being a boomer.
Love your videos, mate. Even if we are on opposite sides of the fence re. Dark mode! Cheers.
I can't even wrap my head around why dark mode would hurt your eyes. I have to stare at a screen all day for my job and programmed a dark mode into our project that is solely for the devs, because I hate getting blinded while working on it 😅 Good content nonetheless.
Wait till you get older. It’s a pretty well known thing.
@@technovangelist I'm 59, so it's the dark mode that's making me go blind? Huh. I just figured what they told me was going to make me go blind when I was a kid was true.
I am new to Open WebUI, but the system prompt was a useful feature for me. There I could enter things like "I prefer answers in metric and 24-hour time; my location is Stockholm, Sweden, but I want you to answer in English". This makes it easier for me.
Your content is very high quality... thanks Matt
Very clear presentation; thank you!
excellent overview
It would be great if you could make a video about deploying the model into the cloud and using its endpoints, to see how API-friendly it is.
I’ve been looking at this and other tools and the one thing I find elusive is the ability to fine tune a model with desired prompt/inference examples to help fast track the usefulness of a newly downloaded model. Including this in your reviews would be amazing if possible.
I deployed open webui on my kubernetes cluster and I am pretty happy with it. It makes it easy to test some LLMs and compare their output. I wish one could add langchain code and select that as a model in the dropdown. Then it would be easy to integrate your own RAG/agent pipeline.
Thank you for your videos! Your content is awesome!
Awesome to hear that you've successfully deployed Open WebUI on your Kubernetes cluster and are enjoying using it to test and compare LLM outputs! We appreciate your feedback and enthusiasm for the project.
We love to hear from our users about their ideas and suggestions. In fact, we've had similar requests to yours for a while and we absolutely plan to address them. While we haven't had the bandwidth to implement these features yet, we're excited to know that there's continued interest in this direction.
If you or anyone else in the community is interested in contributing to Open WebUI, we'd be happy to see pull requests for these features! Even if it's not directly related to Langchain integration, any PRs or answers to questions in our community can help free up time for our developers to focus on bigger features.
Thanks again for your kind words and for being part of the Open WebUI community!
Hey! Kubernetes for this is a great idea, especially because of the time it takes for Ollama to switch models before giving back the response. Would you share your kubectl command?
Thanks @W1ldTangent for that reply. I look forward to seeing Open WebUI progress over each of the releases. It's amazing to see how far it has come.
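For anyone else after the Kubernetes route, a minimal sketch with plain kubectl; the image tag, port, and the OLLAMA_BASE_URL value are assumptions, and the project also publishes Kubernetes manifests that are worth checking first:

kubectl create deployment open-webui --image=ghcr.io/open-webui/open-webui:main
kubectl set env deployment/open-webui OLLAMA_BASE_URL=http://ollama:11434
kubectl expose deployment open-webui --port=8080 --type=NodePort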
Thanks for the detailed description. It opens up a lot of possibilities; what I am missing on the command line is a history, which this provides... I also discovered the Enchanted desktop client for Mac, which does this as well and is easier to install.
I usually use AnythingLLM, but after you explained Open WebUI I will try it.
Hi : ) Thanks for the nice review. Like your voice : ) BTW, there is no need to create an account. During container creation it is possible to add the environment variable -e WEBUI_AUTH=False (it is in the docs).
I think they made a number of changes after this video
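To make that flag concrete, a minimal sketch of the container creation with auth disabled; per the note above, decide before first run, since it can't be switched back:

docker run -d -p 3000:8080 \
  -e WEBUI_AUTH=False \
  -v open-webui:/app/backend/data \
  --name open-webui ghcr.io/open-webui/open-webui:main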
Loving your videos and will surely give Open WebUI a try. Keep up the amazing work with this content.
I use the system prompt to force output to be paragraphs instead of lists. It basically works.
Good stuff, and one of the more useful things I have watched in a while on the UI debate.
Love your review, sir. I came up with exactly the same notes and frustrations as you did 😂
It would be nice if this were an assistant with a wake word. If there were a page to add actions, that would be terrific.
I installed it and it seems useful. There's a call feature now, but I haven't gotten it to work yet. Might be a killer feature when I'm using my own LLM from a mobile device. I use WireGuard to connect to my home instance when I'm away.
Interesting. I have been meaning to create an update, and just started updating my table of integrations... still a lot to do: www.technovangelist.com/notes/annotated%20list%20of%20ollama%20web%20and%20desktop%20integrations/
The CLI is less handy because it doesn't record the chat, and editing a prompt in the CLI is tough, but in general I agree with you. The custom models especially would benefit from additional functionality, like using the Unsloth framework for fine-tuning, then saving, benchmarking, and loading custom models; it is very compatible with Ollama's local philosophy.
User login is a pain if you’re on your own but if you have a family using it then it’s good to have your conversations stored per user. Many users will run ollama and a UI like this on a powerful machine in a cupboard for example then access it via phones or laptops from the couch. Then if you have 3 or 4 of you then user management becomes almost a necessity. I agree though it should be optional.
Optional is the key word
Plus, even with a password, your kids can see all your conversations by looking at the unsecured information that's stored as plaintext.
User management is actually a good thing if you want to share your LLM among other ppl without giving them an ability to mess with your stuff.
BTW, awesome tutorial - I was using the old version and I didn't know the prompts, web scraper, documents, and voice were available. Thank you for sharing.
I hope they will fix the Whisper TTS soon, as the generic Windows TTS is so annoying... sounds like we are in the year 2000 :)
Thanks for the info. Love the videos
Great content. I wish updating Open WebUI were easier, for example updating directly in the web interface.
For local hosting ollama UI, AnythingLLM is better for RAG use case, but Open WebUI offers a closer UX to ChatGPT interface.
Does AnythingLLM have an API endpoint for prompting with RAG functionality?
I use it all the time. I use it as opposed to faster services for privacy reasons. I serve it from home and have it sitting behind a reverse proxy server, so I'm able to reach it from a FQDN. It suits me well :D
Hey, this is great, Matt! :D Having you try and review all the Ollama frontends will be super useful! I'm really looking forward to the rest of the series! :D
I currently use Open WebUI as well as the Ollama CLI, and I completely agree with the pros and cons you outlined.
By the way, could you tell me where to find the comparative chart you mentioned in the video? I couldn’t find it on your website, but I'm really interested in having a look at it :)
Thanks Matt, what about using the spew hint capabilities. Could you go through that?
Wow, that's a lot of models. Which ones are your favorites and what do you use them for?
Great ideas (as usual 😂), cannot wait for the whole series…
I wouldn't go as far as to say that this is even close to a 1-1 command line tool.
I use both regularly. It's nice to be able to easily set up a DB for file storage and to have a smooth and extremely easy way to integrate your files into your chats. It's also pretty nice to have an easy interface for model building, rather than needing to build out your models in text files and then create them from that.
It's also pretty nice to be able to provide feedback to your models in a concise way using the thumbs up and down feature. It's especially noticeable if you test with local files and repeatedly give the model poor ratings when it answers correctly and vice versa; the model reflects the mixed judgment and starts to act foolish.
Additionally, it makes it very easy to serve models to friends, family, and in a work environment.
-d is actually detached:
‘docker run --help’
-d, --detach    Run container in background and print container ID
Very captivating explanation. There are others I would like you to review, like LM Studio and AnythingLLM, as a suggestion. Thanks 🙏
I was using LM Studio with AnythingLLM... After seeing this video I think it's time to change...
I've tried setting up several different RAGs. In most cases, the rosy docs don't capture the snagging issues I have run into. I can't help but feel we're in an early-days state, and that in a few months RAGs will evolve. Right now, I'm kind of backing away from investing further time, as they only work partially, and in document, image, and sound handling there is... work to do :/
Hi Matt, as always your demos just make things so clear: would you plan a demo about "custom tools" integrations?... and add a custom one?
I love Open WebUI. I can download a GGUF model from Hugging Face and convert it directly into Ollama format in minutes using the GUI. And TTS is fantastic: hands free, I can talk and listen. I even installed new voices. And I can web search, RAG, many features indeed! ❤❤❤
"convert directly into Ollama format in minutes using the GUI" Why is Ollama so dumb it needs to convert anything? So many other apps just work with normal GGUF files, without any messing around.
If the model weights are gguf there is no conversion. Ollama just uses straight gguf files. I assume he means converting from the source safetensors files. Oh wait. He did say convert gguf. He doesn’t know what he is talking about. In order to use a model anywhere you typically want a system prompt and template. With most other tools you have to figure that out yourself and with ollama you get it all. But it just uses gguf as is.
@@technovangelist Nope, I edited the environment variable to point Ollama at my 95GB folder full of GGUF that Backyard, ChatGPT, LMStudio and others can use, Ollama just stares blankly and declares "No models found"
You need to add them to ollama. There is no conversion. But you do need to tell Ollama about them.
I use Open WebUI every day and I love it! I love how it formats results nicely and stores the conversations for easy reference. The login page works with my password manager so it's not that inconvenient and I feel better that my conversations are kept private this way because privacy is such a huge motivation for running a private AI after all.
But the login provides no privacy on your local machine. Maybe if it were hosted on an external server.
The login for me has use. I host this on a server. I have the admin account with all the trial models and so on and the user account which only has access to the one or two models that just work. As a result, when my wife wants to use it or when I want to just get stuff done, the user accounts are great. When I am fiddling and don't care there are loads of available models and duplicates with slightly different names, then admin it is.
I do host it online but I do not treat the login page as any degree of security, just as a way of segregating functionality.
I envision a chat app with a tree-like structure, including a main trunk and collapsible branches for topics created by users. You could invite and select bots, similar to users, that are defined with Flowise, with various capabilities defined by their flows. A response from a chatbot would be triggered by selecting a bot or activating a checkbox when sending a message. Additionally, I’d like to have a Telegram client interface in a specific topic/branch and include STT/TTS functionalities. 🙂 (All I need is a seed investor and an engineer to do all the work. 😋)
Fully agree. It's an all-new technology for many of us and some terms used aren't that obvious. So good help text and tool-tips are as important as the feature itself. Having a good UI is great. But as far as I understood this video, it is not *just* a UI. Built-in RAG, vector database, etc. mean there's more "stuff" than just the UI itself. It is needed of course, but it is rather a full-blown frontend application than just a UI. Things grow over time. ;-)
I appreciate your review of this. I learned a lot. However, it has gone through a ton of changes, so a lot of your instruction is outdated. I'd love to see an updated review (if it doesn't exist already, of course), specifically on the way it handles documents now. But I appreciate you highlighting the / and # commands; I missed those while playing with it. Very handy. I'm also really curious to get your take on the ChromaDB RAG implementation. How does it stack up against, say, pgvector? I ask because I'm having to explain why ChromaDB is a stronger choice, since it's included in an out-of-the-box solution, and we're trying to get this going on several systems. Thanks again.
Your videos are always so informative, thank you 🥰
My main language is Arabic and I am weak in English, but I understand you without using any subtitles.
You can actually use @ to interact with a different model.
And I also find the modelfile an interesting way to override a model's default configuration.
If it’s not documented it doesn’t exist
I see the docs have been updated to include this. That's great. It's not everything I was mentioning, but it's a good part of it. The modelfile is the key part of Ollama that makes it amazing, but I didn't see any improvement on the basics in Open WebUI.
Thanks for the video. I use this tool regularly, but I had not covered all the features you mentioned - time to explore them 😃
The document chat feels a little less configurable than I would like, e.g. specific text splitters or working with external vector stores, but maybe some of those features will be added sometime. I am very impressed by these guys' release frequency!
I like this tool. For those of us who maybe aren't as comfortable with code, it makes things easier. If the whole point of these open source models is to open LLMs to as many people as possible, then these tools are needed. If the developers see this, I'll throw one idea out there: start people off by having them pick a model, even if it's a small model, and have a help system that can run off of that. I actually don't get why more people aren't doing that already. You just need structured documentation so that even the really small models can work with it. If someone doesn't understand something, it would be really simple to just have a question mark button they can click, which the person can chat with. They could even go as far as having feature requests or bug reports use a similar system. On the developer side, they can take in that data and use a larger model to do more processing on it to find common themes, which would make it easy to prioritize everything.
I Liked and Subscribed.
Hello Matt, I was one of the first group of STS Space Shuttle programmers 40 years ago, while still in my early days of college. It's great to see how programmers' brains from far ago and today use the same synapse pathways.
I have been with my HidrateSpark Pro 32oz for three weeks now - love it. I plan to buy a small one (16oz?) to fit into my vehicle.
2. Which do you recommend, anaconda or Docker?
3. And what are we to do with the Modelfiles section?
4. What controls compare to OpenAI’s Custom Instructions in Open WebUI?
5. The ‘/’ features appear pretty helpful - I have to rewatch your explanation.
6. Where can I find user manual instructions for all of the Open WebUI features and how-tos?
Thank you for the video.
Happy Hidration.
Anaconda or docker??? Those have two very different roles and purposes. But I tend to avoid anaconda or conda or any of those package environments for Python. Just bloated. I don’t understand the question for 3 and 4
I don’t think there are any docs
@@technovangelist Matt, #3, I am asking what the purpose/function of the Modelfiles area is.
#4, OpenAI’s ChatGPT has a [Custom Instructions] feature in Settings (I think that's where it’s located); it allows the user to predefine things they want ChatGPT to use in its responses without having to put them in every prompt.
@@technovangelist 🥲🥲🥲
Thank you for replying.
OK, number 3: wish I knew. It's useless; they should remove it since it doesn't add anything. For 4, I am not sure.
The only thing that is missing for me when it comes to the web UI is:
A) doing a sequential websearch (i.e. google stuff, if a condition is unsatisfied, google more, integrate into ChromaDB)
B) digesting my PDF data folder (e.g. a list of PDF publications) and storing it in the DB. This could also be done in the CLI.
Those things can’t be done in the CLI as is, but this web UI doesn’t really do those things all that well either.
I can't use light mode anymore. I got a terrible illness in my eyes, on the retinas, so I can barely read anything on a white background. It simply hurts my eyes. I can still read white on dark, but I need lots of contrast. 😮
Thank you very much, this is super helpful
Hi Matt, thanks for your detailed video.
Do you recommend another WebUI tool?
@technovangelist thank you for your video. You mentioned a chart, do you mind sharing it?
I did? What chart? I can review later but easier if you can give any info. Thanks
@@technovangelist 😅yes you did... 00:26 on... thank you
Thanks so much for pointing it out. www.technovangelist.com/notes/annotated%20list%20of%20ollama%20web%20and%20desktop%20integrations/
thank u...cool presentation style my man..#thumbs👍🏾
I like that color scheme though
Great video thanks!
My use case involves querying email archives, so it is crucial that the documents are not sent to external servers. I used the sentence-transformers/all-MiniLM-L6-v2 as the embedding model, and I believe that the documents I added are not sent to outside servers. I found GPT-4o much better than the models in Chat. My question is: will my email archives be exposed to external servers, or is only the question in my chat sent to OpenAI?
You embed to add them to a vector database so that you can find the most appropriate email to ask a question against. Then the email gets sent in plaintext with the question to whichever model you are using. If that’s going to be OpenAI, you are sending the emails there. There is no way around this.
@@technovangelist Thanks, Matt, for the clarification.
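A minimal sketch of that boundary using Ollama's local API; the model names and email text are placeholders:

# embedding happens locally - this text only reaches the local server
curl http://localhost:11434/api/embeddings \
  -d '{"model": "all-minilm", "prompt": "Subject: Q3 invoices ..."}'
# at question time the retrieved email rides along in plaintext, so it goes to
# whichever model answers the chat - a local model stays local, OpenAI does not
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3", "prompt": "Given this email: <email text>, answer: <question>"}'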
I am wondering if there is a way to apply white-label styling to the UI. Can you recommend customization for those who want to demo LLM-centered ideas using Ollama UI? Perhaps an alternative front end with similar features?
I would have liked more info about the system prompt at 7:35.
Any idea how well this would work with the "Jetson Orin Nano Super Developer Kit" [running as a sole node in Proxmox]?
PrivateGPT as UI?
It's thanks to PrivateGPT that I learned about Ollama. It works pretty well on my Jetson Xavier AGX 32GB; not a simple task due to ARM64+CUDA.
Thing is, any GUI would be better than the command line, because you can use arrows to go back and forth, edit, select, etc. You cannot do that using Ollama in the terminal.
Those are things you can do in the ollama cli.
@@technovangelist Well, I can't. My own CLI app has readline, which allows me to do that, but the official Ollama CLI does not. When I hit the left arrow, for example, I get ^[[D
Great content. Thank you
Glad you liked it!
I love Open WebUI. It is for sure the best. It even has some support for images if you don’t want to go to, say, the native Automatic1111 web UI.
Thanks for the detailed video. I am trying to create a chat with voice, something like Amazon Alexa. Can you please create a video around it?
Matt, I just installed Ollama and started using llama3.1 via the Windows cmd prompt. Now I need to install the WebUI, and I followed you until you mentioned Docker. You lost me after that. I need the procedure for installing the WebUI, and I assume I need Docker. You go through lots of details that seem important, but you don’t follow through for me. What I’m saying is I need more direct instructions for getting the WebUI to work without using the cmd prompt window. Thanks for your informative series. Howard from Detroit.
Yup. A lot of AI tools assume knowledge of Docker; it's a pretty basic requirement for so much of tech these days. I will be doing an update of this in the next few weeks and will try to include more of the steps required. And I know Detroit pretty well: my wife is from around there, and we got married in Pinckney, closer to Ann Arbor.
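In the meantime, the short version is: install Docker Desktop, then run the container. A sketch of the commonly documented invocation (check the Open WebUI README for the current flags before copying this):
```
# Pull and run Open WebUI, serving it at http://localhost:3000 and letting the
# container reach the Ollama server running on the host machine.
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```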
@@technovangelist Thanks for replying. I bet you are very busy with this AI stuff. I am having a great time learning, and I am very excited about it. I graduated from Wayne State University in Detroit, then worked for Chrysler Defense. I ended up publishing some books on BASIC programming with Howard W. Sams, then went out on my own. It didn't work out as well as I'd thought, but I became a stay-at-home dad for our 2 kids. My older daughter is a programmer at GM and the younger one works for Ohio State. My wife went to U of Michigan; we go to Ann Arbor a few times a year for their theater. I think AI is the next tech leap in computers and there's nothing to fear. I am retired, doing some silly videos on YouTube, and working on a UFO blog with a friend from Colorado. My first computer was the Motorola M6800D1 kit using a teletype.
Last time I tried RAG on a big PDF in Open WebUI, it did a terrible job, while Cheshire Cat did a good job with the same PDF. I also tried BionicGPT, bigAGI, and chatd, and Cheshire Cat was the clear winner at RAG. Besides that, its plugin system offers many capabilities, and its ability to delete specific memories is something I haven't seen anywhere else. I think it deserves a try; the name sounds like a joke, but it's not.
Just had a thought - does the Ollama server already include a login management system? It’d be great if it could handle user credentials similar to Git, allowing specific access based on rules. For example, certain users could access specific models, or there might be usage restrictions. This would make it so much easier to deploy Ollama as an offline LLM service for small businesses. Not sure if this feature exists already, but if not, it could be a cool addition. By the way, awesome project! Really helpful for deploying LLMs locally. 🚀👊
No, it doesn't. Ollama is designed to be the best way to run models locally on your own hardware. Some folks are hosting solutions using Ollama, but they need to come up with an authentication and authorization system on their own. There are lots of tools for that, depending on the specific needs of the project; in fact, there are many large companies focused only on that part, and none of them can provide all the options some folks want.
Open WebUI is the way it is because of how Ollama is designed, plus the intention to make a copy of the ChatGPT UI, which succeeded at least to the point that I once confused it with the real ChatGPT. If you have worked with Ollama on the command line, I think it's pretty obvious what the Modelfiles section is about.

Ollama annoyingly mixes up models and weights. It took me a while to realize that an "Ollama model" is just a parameter file pointing at the files that contain the weights, the thing we usually call a model. The very first modelfile is somewhat tied to the weights, but you can create as many more modelfiles as you want. I don't know whether a GGUF file ships with a set of settings or whether Ollama just assumes certain defaults, since many modelfiles you can download from the Ollama site are mostly empty. In some cases the chat template is not even known, let alone additional parameters. Some models (that is, modelfile + weights) you can download from Ollama are even broken, so you can't get them working. Sometimes it's because the model was trained incorrectly, like Westlake. Sometimes parameters are missing, such as the context size for the llama3 models. I don't know if they've fixed this yet, but llama3 has a context size of 8k; if you don't set it in the modelfile, you get trash responses after reaching that limit.

The Ollama CLI also seems to store whole chat sessions linked to specific modelfiles, so you can quickly end up with tons of modelfiles that are listed as models but actually reference the same weights. If you need a different set of parameters, in the CLI you'd export the current modelfile, change it as needed, and then "create" a new model from it (see the sketch below). If you "create" it under the same name, the current one is overwritten. That's most likely the process wrapped by Open WebUI; I say "most likely" because I actually do all of this on the command line and only had a quick look at this section of Open WebUI. ;-)

Now, since you can also change all parameters on the fly for the current session, just as you'd do with "/set parameter temperature 1" in the CLI, you can set them in the global settings of Open WebUI, so the defaults from the modelfile (or whatever defined them before) get overridden. Of course that's complete nonsense and very annoying, since you obviously want different settings for different models. So the option they give you is to create new modelfiles instead. All of this is quite clunky and unintuitive, but that's more or less because Ollama is the way it is. They could work around it by, for example, keeping parameter profiles in the UI instead of creating additional modelfiles; we'll see if they do.
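For reference, that export, edit, and create loop looks like this on the command line (the model name and added parameter are just examples):
```
# Dump the modelfile behind an existing model, tweak it, and register the
# result as a new model that shares the same underlying weights.
ollama show llama3 --modelfile > Modelfile
echo "PARAMETER num_ctx 8192" >> Modelfile   # e.g. raise the context window
ollama create llama3-8k -f Modelfile
```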
I had a longer reply first but removed it. It seems most of this comment stems from a misunderstanding of how ollama works. Perhaps join the discord and ask there.
Good day,
Matt, hopefully, this is my last question about the Private GPT installation. My laptop has arrived.
I have installed an M.2 2T primary drive and a secondary 2T SSD.
Q: After installing Ollama, Docker, and WebUI, can the models be stored on (directed to) the secondary SSD to preserve space on the primary M.2 system SSD?
If so, when do I pick where to store the models during their installation?
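For what it's worth, Ollama reads the OLLAMA_MODELS environment variable for its model store; it has to be set before the server starts, and there is no per-model prompt during installation. A sketch (the drive letters and paths are examples, not requirements):
```
# Point Ollama's model store at the secondary SSD, then restart Ollama.
# Windows (cmd, persists for the current user):
setx OLLAMA_MODELS "D:\ollama\models"
# Linux/macOS (add to your shell profile):
export OLLAMA_MODELS="$HOME/ssd2/ollama/models"
```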
Open WebUI is fantastic, but I agree some features need refinement
You don't have to review any other webui - this is the best one :) Only the login is annoying.
I'm hoping there is one that does a good job with everything. This one is nice but far from perfect.
And the goal isn't to look at web UIs specifically, but rather all clients in general. This one doesn't offer that much over the CLI, but I'm sure there's one out there that blows everything away.
I got addicted to ollama last year and got to play around with openwebui when it was still called ollama webui
The name change messed up my docker installs, not gonna lie
Then we decided to try it as a corporate AI companion, but since it was a testing phase we didn't scale our cloud very high, so it was pretty slow.
On my own machine, though, I wanted to try every last feature, which led me to install and learn ComfyUI. And while the image generation options in Open WebUI are limited whichever backend you use, it's still usable.
Interesting. I haven't really played with ComfyUI.
I appreciate that you introduced me to Ollama and are sharing your experiences and frustrations with deploying its various features. Your videos capture the mix of joy and frustration that is part of every software development cycle, and it feels great to know I am not alone in feeling this. Thank you 😊
FWIW, I think the ModelFiles section is the most powerful part of Open WebUI.
Uh oh, not a fan then?
You leave my precious dark mode alone, you.. you meanie! 🙃 I use Open WebUI and I agree: why do we have to sign in, and what is that Modelfiles area for? I have not tried other addons yet, tho, but I am about to, which is part of why I watched this video. So, keep going through that addons list! Excellent video!
I'll move on to the other user interfaces. My goal is to see if there is one that improves on the built-in CLI.
I think it would be better to have the first Open WebUI account be an admin, with better user management. Generation of local API keys would also be awesome, so security is there from step 1 in case it's ever in production in the future!
Yes, the user management is a bit lackluster: it doesn't really provide much security, only a little speed bump. So make it optional, and then, for the folks who want security, offer it in a real way.
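For anyone who wants to drop the login on a single-user box, Open WebUI documents an environment variable for exactly this. A sketch, assuming a fresh data volume (as far as I know it won't switch modes once accounts already exist):
```
# Run Open WebUI with the login screen disabled (single-user setups only).
docker run -d -p 3000:8080 \
  -e WEBUI_AUTH=False \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```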
Great overview, thanks! I hope you will consider reviewing several of these. I use textgenwebui right now, but the options and processes can be overwhelming. It does have the most options for hardware-tuned models, GPTQ, etc. A review and explainer for this tool would be appreciated.
Oobabooga is an alternative to Ollama rather than to Open WebUI.
A very good explanation 🙂 But how would you integrate this tool with your own code, connecting the Ollama API to the web API? And if so, how?
I don’t understand the question. This is a front end tool for ollama
@@technovangelist I want to connect my backend code, which is using llama3, to Open WebUI. How should I do that?
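One route, assuming your backend can speak the OpenAI wire format: expose a /v1/chat/completions endpoint and register its base URL under Settings > Connections in Open WebUI. The request your backend would need to answer looks roughly like this (the port and model name here are made up for illustration):
```
# Open WebUI sends OpenAI-style chat requests to whatever base URL you configure.
curl http://localhost:9099/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "my-llama3-backend",
        "messages": [{"role": "user", "content": "hello"}]
      }'
```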
What are the other best alternatives with agent and tool options?
So it’s not just me who struggles with some of these options. The OpenAI API key is not properly saved between restarts, something that drove me nuts.
I watched your video about Msty. I installed it and never looked back.
The webui video predates the msty one I think.
Thanks for the video. A question about the Ollama/Open WebUI/Docker combo: I have this configuration and everything works. Now I have a goal to reach: I want to specialize a pre-trained LLM, training it on a large body of data about coding in a proprietary language that isn't popular; only about two hundred programmers use it.
My questions are:
- Which generic, lightweight LLM can I use?
- I use some Python scripts to train an LLM (in my case, testing with Phi3:Mini), and I found a problem to solve: when I try to load the model, something goes wrong. Python says it can't find the path of the model, usually ~/.ollama/models/…
- I noticed the LLM files look encrypted, with SHA-256 names! Perhaps that's the problem!
Can you help me with this training?
Can you point me to documentation or tutorials?
Thanks in advance. Have a good day.
Sorry for my bad English; I'm an Italian developer.
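On the "encrypted" point: the files under ~/.ollama/models/blobs aren't encrypted; they are raw GGUF weights named by their SHA-256 digest. You can see which blob a given model resolves to (the model name here is just an example):
```
# Print the modelfile; its FROM line points at the content-addressed blob
# that holds the actual GGUF weights.
ollama show phi3:mini --modelfile | grep FROM
```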