Let's go! This is the video I've been waiting for. Thank you again for this wonderful course
Thank you so much for this video and content. I've been looking for exactly this information.
I think a template just takes in parameters and generates the input to be fed to the model. I want to know if Ollama can do inference-time reasoning like o1, and whether the template can be used for that, maybe by providing a template where, given the query, the model generates its reasoning with something like chain of thought or tree of thought and then outputs the result. This is easily achieved with LangChain or Python code on top; I just wanted to know if running it this way is possible, or if it would be faster.
That’s not a function of the Template but rather the model.
@@technovangelist Not a function of the template, I agree. But when employing techniques like 'chain of thought' or 'tree of thought', the model generates intermediate 'thinking tokens' that are necessary for the computation but might not be needed by the end user. Since the template can define the structure, I was just wondering if a template can kick off the generation of thinking steps but output only the final answer.
It would be more appropriate in the system prompt
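One way to experiment with this today is to put the reasoning instruction in the system prompt via a modelfile rather than the template. A minimal sketch; the model name is real but the instruction wording and the <thinking> tag convention are just examples:

```
FROM llama3.2

# The reasoning behavior is requested here, not in the template.
SYSTEM """Think through the problem step by step inside <thinking> tags,
then give only the final answer on a line starting with 'Answer:'."""
```

The template still only controls formatting, so hiding the thinking tokens from the end user would have to happen in your own client code, e.g. by stripping the <thinking> block out of the response.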
Thank you for these great videos!
I would like to make a request: N8N now has an AI Agent that supports tool calls. I've been working with it, and I can set it up with Ollama and a tool that it calls, using the returned information to formulate the answer. The problem is that no one seems to know how to get it to pass information to the tool. I've asked on the N8N message board and even had others say they are having the same issue. With your knowledge of Ollama, and having used N8N, do you think you could make a working example and explain how to pass information from the model to the tool? For example, the tool looks up a stock price but needs to know which stock symbol to look up. The model is asked what the price of Google is and needs to pass that to the tool.
Thank you
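Not the requested N8N example, but the underlying mechanics might help: the model passes information to the tool through the arguments of its tool call. A minimal sketch with the Ollama Python client; `get_stock_price` and its schema are made up for illustration:

```python
import ollama

# Hypothetical lookup used for illustration -- swap in a real price API.
def get_stock_price(symbol: str) -> str:
    return f"{symbol} is trading at 175.23"  # stubbed value

tools = [{
    "type": "function",
    "function": {
        "name": "get_stock_price",
        "description": "Look up the current price of a stock",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {"type": "string", "description": "Ticker symbol, e.g. GOOG"},
            },
            "required": ["symbol"],
        },
    },
}]

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "What is the price of Google?"}],
    tools=tools,
)

# The model "passes" the symbol by putting it in the tool call's arguments.
for call in response.message.tool_calls or []:
    if call.function.name == "get_stock_price":
        print(get_stock_price(**call.function.arguments))
```

Whatever N8N does internally, it is wiring up this same arguments field; if the tool never receives the symbol, the tool's parameter schema is usually the first thing to check.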
Do we need to use these templates if we're using the OpenAI-compatible REST API? I'm trying to understand how they relate to each other.
All models use a template. But if using a model from ollama it’s already there
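To make the relationship concrete: with the OpenAI-compatible endpoint you never touch the template yourself. You send structured messages, and Ollama applies the model's stored template server-side before inference. A minimal sketch, assuming Ollama is running locally on its default port:

```python
from openai import OpenAI

# Point the standard OpenAI client at Ollama's compatibility endpoint.
# The api_key is required by the client but ignored by Ollama.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# These role/content messages are what you control; the stored template
# turns them into the exact prompt string the model expects.
resp = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(resp.choices[0].message.content)
```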
@@technovangelist How do these templates differ from the template I feed my LLM using something like the LangChain ChatOllama API? Does that template get put inside the Ollama template? In other words, when I'm telling llama3.2 to perform sentiment analysis, I show it a few example prompts and then leave a space for the tweet; that is my template. How does it interact with the Ollama template?
I don’t know. For a long time langchain broke this. They used both even though there should be one. Thankfully there are very few reasons to ever use langchain. In most cases you can simplify by not using it.
@@technovangelist I've found that too, went down a rabbit hole of trying to find the 'right' framework to work with. Silly me.
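For anyone else confused by the two kinds of "template": the prompt you assemble in LangChain (or by hand) just becomes the content of a message, and Ollama's template then wraps that content in the model's special tokens, so the two nest rather than conflict. A rough sketch with a made-up few-shot prompt, using the Ollama Python client directly:

```python
import ollama

# Your "template" is ordinary prompt text that you assemble yourself.
few_shot = """Classify the sentiment of the tweet as positive or negative.

Tweet: I love this phone. -> positive
Tweet: Worst service ever. -> negative
Tweet: {tweet} ->"""

prompt = few_shot.format(tweet="The battery died after an hour.")

# Ollama receives this as one user message; the model's stored template
# then wraps it in the chat markup the model was trained on.
response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": prompt}],
)
print(response.message.content)
```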
All models can have a modelfile. For example, I made a template maker script for CrewAI that makes any local model work with it.
Was it called Modelfile before?
The modelfile is still the modelfile. A template is one of the things that goes into a modelfile to build a model. You only need to define the template if you're importing a new model weights file that doesn't have a template defined, which would be most of them.
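For anyone importing raw weights, a sketch of what defining the template looks like; the file name and the chat markup here are illustrative, and you would use whatever format your model was trained with:

```
# Modelfile
FROM ./my-model.gguf

# Needed only because the imported weights file carries no template.
TEMPLATE """{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}<|user|>
{{ .Prompt }}<|end|>
<|assistant|>
"""
```

Then `ollama create my-model -f Modelfile` builds a model you can run as usual.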
Thanks for the very good content. I was waiting for this video for sooooooo long. One thing I noticed (I don't know if it's true): if you download a model like llama3.2 and create a new model from it using a simple template, then you can NOT use tools as described in the Ollama API, i.e. you cannot pass tools to the client even though the model originally supports tool calling. This means Ollama checks for something in the template to decide whether the model supports tools or not. If you download llama3.2 from the Ollama hub, it uses the default template the uploader chose, and if you read that default llama3.2 template on the hub you will discover that it forces the model to always call a tool unless it has received the tool response, i.e. if you call llama3.2 (with tools passed to the client) with the message "Hello"... it will use one of the tools, returning something not useful at all. I believe it is a very bad idea to tie the ability to pass tools to the client to something in the template. I also believe this is what makes you and me prefer the old way of building a tooled agent and consider it more reliable. Thanks again for the good content 🌹
The models from ollama in the official library already have the template defined correctly as per the model developers.
If you send a request with tools then it will respond with the tool to use. If you don’t want it to use a tool don’t send it tools to use.
@@technovangelist So if I pass a tool, the model CANNOT decide when to use the tool and when not to; it will always use the tool, even if I invoke it with a message like "Hello".
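For what it's worth, the client-side mechanics are the same either way: you check whether the reply contains tool calls and fall back to the plain content when it doesn't. A sketch, reusing an illustrative stock-price schema:

```python
import ollama

tools = [{  # illustrative schema, same shape as the stock example above
    "type": "function",
    "function": {
        "name": "get_stock_price",
        "description": "Look up the current price of a stock",
        "parameters": {
            "type": "object",
            "properties": {"symbol": {"type": "string"}},
            "required": ["symbol"],
        },
    },
}]

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello"}],
    tools=tools,
)

if response.message.tool_calls:
    # The model decided a tool call was warranted, even for "Hello".
    print("tool requested:", response.message.tool_calls[0].function.name)
else:
    # A plain reply -- whether this branch is ever reached depends on the
    # model and its template, which is exactly the behavior debated above.
    print(response.message.content)
```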
Thanks ❤
Maybe what wasn't spelled out in many of these videos is that a template is the formatting used: the way one decides how the data sent to the model is structured, i.e. the format of the data used for inference.
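Exactly. To make it concrete, here is roughly what a template does; this is a simplified llama3-style example, not the full template from the hub:

```
# A simplified, illustrative template using Ollama's Go template syntax:
TEMPLATE """<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""

# With .Prompt = "Why is the sky blue?", the model actually receives:
#
# <|start_header_id|>user<|end_header_id|>
#
# Why is the sky blue?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```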
It looks like you're wearing Malaysian Batik or something like that... nice, love it! ❤ Love from Malaysia 🫡
I used to spend a lot of time in KL. But this one is from Amazon.
🎉
🤩🤩🤩
Your explanations are always like drinking a glass of ice water in hot weather.
What is a template in Ollama?
perhaps you should watch the video
@ No offense, but I watched the first 5 minutes and it went straight into process rather than a high-level explanation of what a template is, so I was lost at the outset and didn't expect that to change.
If you're having to read them and use them, you'll know.
This is one of the advanced topics and assumes you have a basic knowledge of how ollama works.