I wanted to say that I loved this video and subscribed after watching. I believe my search was "Ollama Model File template" or something similar. I scrolled past some thumbnails and channels I didn't like, but liked the preview when I hovered over yours. I had watched some others first, from big-name channels, which weren't as helpful as they first appeared. However, I immediately found that your content was right on target for both your title and my search. It was everything I needed to provide some insight to someone starting with Ollama.
I had no idea who you are, didn't look at other things on your channel, and didn't read the details or comments beforehand. Just your video and your presentation at face value. Not to harp on that but, as this platform can be fickle, I wanted to pass along details that may be helpful. Again, it was excellent and I truly appreciate the time you took to script and produce content to educate the audience in a natural and respectful manner.
Thank you.
I really appreciate your videos. You have a simple, understandable, friendly approach to teaching which keeps me coming back for more.
Thanks.
Maybe a dummy guide would also be helpful. The content is a bit advanced, though very useful.
This would be a dummy guide.
You're right.
Ignore the comment from @karthickdurai2157, just trying to give an illusion of competence.
@@karthickdurai2157 oh-......uhm...EVEN MORE DUMMY VIDEO THEN!! 😭😭😭
You saved my day. Thank you Matt.
I love your endings
Hi! Thanks for the video. I have a question about using the Ollama model with LangChain: When I run the .invoke method with a simple prompt, does the Ollama library automatically insert the prompt into the pre-configured template in the model file, or do I need to manually include it in the LangChain prompt template?
Very good!!! Well understood....
(Quick advice)...
Temperature is related to the training too: things that were not trained deeply will need a higher temperature, and things that are deeply embedded will be fine with the lowest temperature. How do people train their models, and what are their acceptable loss levels? Some aim for 0.5 and under, whilst others don't care: they let the model complete an epoch on a large dataset and assume the data took, as long as their final output was preferable. In fact, all the data that did not go in at a loss below 0.5 did not take and is not retrievable. Perhaps it's there ephemerally... as it is like pretraining... it's just used for next-word prediction... but we are doing tasks, which is whole-sequence prediction/recall! So when we train for a task we expect the whole of the dataset to be fit in range... so a low temperature of 1 should be fine at acceptable losses...
Some say that this affects the softmax over the possibilities chosen by the top-k sample as well as the top-p percentage cutoff... but this is when there are many samples chosen... but this also depicts the values that were trained at that rate of loss... so it will be collecting samples from the level under the temperature rate of 1 (a lot), so this will need constraining with top-p (selecting the highest of the probabilities... but the softmax will also spread them, allowing for more randomness, also when the model has been overtrained)...
So an overtrained model can be loosened by raising the temperature, and a wild model tamed!
lol...
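For anyone following the temperature / top-k / top-p part of the comment above, here is a minimal Python sketch of how those three settings are commonly combined at sampling time. It is an illustration only, not Ollama's actual implementation, and it says nothing about training loss; the function name `sampling_distribution` and the example logits are made up for this sketch.

```python
import math


def sampling_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return renormalized token probabilities after temperature, top-k and top-p.

    Hypothetical helper for illustration; real inference engines differ in detail.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature divides the logits: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top-k keeps only the k most likely tokens.
    if top_k is not None:
        order = order[:top_k]

    # top-p keeps the smallest prefix whose cumulative probability reaches p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept

    # Renormalize over the surviving tokens; everything else gets probability 0.
    mass = sum(probs[i] for i in order)
    result = [0.0] * len(probs)
    for i in order:
        result[i] = probs[i] / mass
    return result


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0]  # made-up scores for three tokens
    print("T=0.5:", sampling_distribution(example_logits, temperature=0.5))
    print("T=2.0:", sampling_distribution(example_logits, temperature=2.0))
    print("top_p=0.9:", sampling_distribution(example_logits, top_p=0.9))
```

Running it shows the top token taking a larger share at the lower temperature and the tail being cut off once top-p is applied, which is the "loosen" versus "tame" effect the comment describes.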
I would really like to see a video on publishing a model!
Can I train a model by chatting with it? How do I do that?
Please keep doing what you're doing. At this point I would guess job offers from all over the world are pouring in. Thanks for your continuous videos! I went through this with the meditron model, which I suspect still doesn't have a fully correct prompt format, but I couldn't fix it. Maybe with this video I will be more successful.
PS: Let us know in case you sell merch :)
No merch, but there is a Patreon at patreon.com/technovangelist and a newsletter at technovangelist.com/newsletter
Great video, very useful! I have a request for you: please make a video about making an Ollama model of the DBRX and/or Grok 1.5 Vision models.
I see that you can use your own Docker registry with Ollama as a way of hosting model files. Would love to see a video on this for users running Ollama on closed networks.
It’s not actually the same as the Docker registry. It was written by the same person that created the Docker registry though.
It had to be modified because layers in a Docker image are tiny whereas models are huge.
Any tutorial on creating a model from custom data, like PDFs? Like for companies?
Yes please, I've been looking for this. If you find anything please share, any help is appreciated.
Thanks for all the helpful videos on Ollama! I've since located the answers, but these are a few questions I was always left with whenever I saw mention of making a model file. Asking so it may help other new Ollama users: What kind of file is it? What program should be used to create it? Is it saved in a specific file format or location?
I created it in VS Code. It’s just a text file like everything else in a code editor. And you can put it anywhere you like. Once you run ollama create, blobs and manifests are generated in a specific place.
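To make the reply above concrete, here is a minimal Python sketch that writes a plain-text Modelfile and builds the matching `ollama create` command without running it. The base model `llama3`, the model name `my-assistant`, the system prompt, and the helper names are all placeholders chosen for this example, not anything shown in the video.

```python
from pathlib import Path
import tempfile


def write_modelfile(directory, base_model, system_prompt, temperature):
    """Write a small Modelfile (plain text) and return its path.

    Hypothetical helper; the instructions used are FROM, PARAMETER and SYSTEM.
    """
    text = (
        f"FROM {base_model}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """{system_prompt}"""\n'
    )
    path = Path(directory) / "Modelfile"
    path.write_text(text, encoding="utf-8")
    return path


def create_command(model_name, modelfile_path):
    """Build the CLI call that turns the Modelfile into a local model."""
    return ["ollama", "create", model_name, "-f", str(modelfile_path)]


if __name__ == "__main__":
    # Any folder works; a temporary directory keeps the example self-contained.
    with tempfile.TemporaryDirectory() as folder:
        modelfile = write_modelfile(
            folder,
            base_model="llama3",  # placeholder base model
            system_prompt="You are a concise assistant.",
            temperature=0.2,
        )
        print(modelfile.read_text(encoding="utf-8"))
        # Printed rather than executed, so the sketch runs without Ollama installed.
        print(" ".join(create_command("my-assistant", modelfile)))
```

The file has no required extension or location; the `-f` flag simply points `ollama create` at wherever it was saved.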
Thanks ❤
Very helpful 😊
So helpful, so interesting, thanks 👍 After generating my model, I notice that I have to specify the number of layers to use (--n-gpu-layers) in Ollama, even though my GPU has enough memory. If I use fewer layers, what does this mean in practice?
Excellent thank you!
Extremely useful but what if there is no template in the readme?
Then look in that file I showed. And if it's not there, then look at how the model was trained or fine-tuned.
I fine-tuned a HuggingFace embedding model locally and got a set of files with safetensors on disk. Tried to convert to GGUF with llama.cpp but it fails. Ollama requires the model as GGUF. Any suggestions for how to integrate the fine-tuned model in Ollama?😢
I’m not sure. Your best bet is to ask on the Ollama Discord. Discord.gg/ollama
Thank you for the short and sweet video. How do you get such good audio quality on your videos? Step 0: have a great voice. What is step 1 (gear, setup in OBS/plugins)?
I think I need a video on it. I don’t use OBS though.
@@technovangelist yes, video please. You could add affiliate links to the gear as well. Really loved the bass in the audio.
ua-cam.com/video/LQe3DFjMYrE/v-deo.htmlsi=R4u3h6yPtbUaHeDh
There is a folder based on the date of this video. Do you have a gist containing the content of the template for each model?
No. It was just a few lines that you can grab from the same sources I did, so I didn’t bother with it.
thanks!!!
My coworker and I set up a Windows machine to run Ollama. It works great but occasionally seems to crash. Could it be the keep_alive setting? If I want others to be able to hit it via the API, should I set the keep_alive to “forever”? (I don’t remember the flag for that off the top of my head). Thanks for your work on Ollama!
In most cases you shouldn’t need to worry about keep alive.
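For anyone who still wants to experiment with keep_alive over the API, here is a minimal Python sketch that builds (but does not send) a request with the setting included. It assumes the default local endpoint `http://localhost:11434/api/generate` and that a negative `keep_alive` value keeps the model loaded indefinitely; the model name and helper name are placeholders, and none of this is a diagnosis of the crash described above.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, keep_alive=-1, url=OLLAMA_URL):
    """Build a POST request for the generate endpoint with keep_alive set.

    Hypothetical helper. keep_alive can be a duration string such as "10m",
    a number of seconds, 0 to unload right away, or a negative number to keep
    the model in memory (assumption stated in the lead-in).
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_generate_request("llama3", "Say hello.")  # placeholder model
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually send it to a running server:
    # with urllib.request.urlopen(request) as response:
    #     print(json.loads(response.read())["response"])
```

The same behaviour can also be set server-wide through the `OLLAMA_KEEP_ALIVE` environment variable instead of per request.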