Full text tutorial (requires MLExpert Pro): www.mlexpert.io/prompt-engineering/fine-tuning-llama-2-on-custom-dataset
Can you send me your email, please? I have a question I can't ask in public.
I keep having problems with model.merge_and_unload()...
It seems to be a bit different from the documentation on Hugging Face... is there something I am missing here?
The error says that 'LlamaForCausalLM' object has no attribute 'merge_and_unload'...
Any ideas?
@@williamfussell1956 Did you fix that?
@@williamfussell1956 merged_model = trained_model.merge_and_unload()
@@pablomeza5932 Did you fix it, please?
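For anyone hitting that AttributeError: `merge_and_unload()` lives on the PEFT wrapper, not on the base `LlamaForCausalLM`. A minimal sketch of the fix, assuming the LoRA adapter checkpoint was saved to a folder (paths here are hypothetical):

```python
from peft import AutoPeftModelForCausalLM

# Loading through PEFT returns a PeftModel, which has merge_and_unload();
# calling it on a plain LlamaForCausalLM raises the AttributeError above.
trained_model = AutoPeftModelForCausalLM.from_pretrained("trained-model")
merged_model = trained_model.merge_and_unload()
merged_model.save_pretrained("merged-model")
```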
Can you provide the Google Colab notebook?
This is great. A version for question answering would be helpful too.
Excellent work! You are the hero!
Fantastic video! It would be nice to see a full tutorial on how to do it with PDFs locally...
Incredible video!! Thank you very much. I have a question: isn't it mandatory to put tokens like EOS at the end of the summary, so the LLM knows to finish the instruction?
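For what it's worth, a common way to handle this is to append the tokenizer's EOS token to each training example so the model learns where to stop. A minimal sketch, assuming a standard Hugging Face tokenizer (the formatting function is hypothetical):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

def format_example(prompt: str, summary: str) -> str:
    # The EOS token marks the end of the target text, so generation
    # can terminate cleanly after the summary.
    return f"{prompt}\n{summary}{tokenizer.eos_token}"
```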
Good stuff coming, thank you in advance ❤
Thank you for this! Is finetuning a good approach for a private/proprietary documentation Q&A?
Excellent video! What changes do we need to make to the input to use 8-bit quantization instead of 4-bit? Thanks.
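Not the author, but with `BitsAndBytesConfig` the switch is usually a single flag. A minimal sketch, assuming bitsandbytes is installed (the model ID is the standard Llama 2 one):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# load_in_8bit=True replaces the 4-bit options (load_in_4bit, bnb_4bit_*).
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
```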
Awesome tutorial!
Super excited
Very helpful. Thanks for the videos.
Do you have or plan to make a tutorial for something like the below?
A tutorial for plain-text fine-tuning, and then tuning that model to make it an instruction-tuned one?
Awesome work! Thanks a ton!
Great video!
Is there any way to build my instruction dataset for instruct fine-tuning from classical textbooks?
@@user-xt6tu3xt3t But then how do I convert it into question & answer format?
@@ikurious The best way is manually, by a human.
Thanks for sharing, really helpful. Waiting for my Llama model access to follow it step by step. Can I use any other model in place of this?
Did you get the access? And how long did it take?
Great!! Do some videos regarding RLHF.
Will you be able to add a tutorial for the llama2-chat model?
Very good video!
default_factory=lambda: ["q_proj", "v_proj"] — Why did you not add this? Is it because HF does it under the hood?
I totally forgot about the `target_modules`. I retrained and updated the notebook/tutorial with those. The results are better!
Here's the list:
lora_target_modules = [
"q_proj",
"up_proj",
"o_proj",
"k_proj",
"down_proj",
"gate_proj",
"v_proj",
]
I composed it from here: github.com/huggingface/transformers/blob/f6301b9a13b8467d1f88a6f419d76aefa15bd9b8/src/transformers/models/llama/convert_llama_weights_to_hf.py#L144
Thank you!
Is there a good resource for understanding 'target modules' for different models? @@venelin_valkov
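For reference, a minimal sketch of plugging that module list into a PEFT `LoraConfig` (the rank, alpha, and dropout values are hypothetical starting points):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                       # hypothetical rank
    lora_alpha=32,              # hypothetical scaling factor
    lora_dropout=0.05,          # hypothetical dropout
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```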
I still don't get it. I have my data locally, how should I start fine-tuning on it? Please tell me.
Do you have an idea how GPT-4 is so good with its responses from its base model when I upload documents to it?
Could it be the parameter size only, or do you think other technologies determine the quality difference?
Parameter size and training data, I guess? Also, I don't think we know their exact network architecture, since they didn't release it publicly; we can only access it via the product.
Hi there, I am just reading through the repo and I'm pretty sure this is the answer... I just wanted to make sure...
The actual input to the model is only from the [text] field, is that correct? As the [text] field contains the prompt, the conversation, and the summary...
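That matches how `SFTTrainer` is typically wired up in the trl version used here: it trains on a single text column. A minimal sketch, assuming `model` and `dataset` come from earlier in the notebook (the sequence length is a hypothetical value):

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="text",  # the only input the model sees
    max_seq_length=1024,        # hypothetical
)
```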
Thanks for the insight, is it possible to perform training locally, with 8 GB VRAM?
No
Should it be merged_model = trained_model.merge_and_unload()?
It cannot run, the process gets killed.
I have this problem as well😢
Were you able to solve this?
merged_model = trained_model.merge_and_unload()
@@rone3243 Did you resolve it, please?
@@fl028 Did you resolve it, please?
Hello, for me the validation log shows "No log" with the Mistral instruct model. Please help, anyone.
Can I download the fine-tuned model after fine-tuning?
Is it in .bin or .safetensors format, or something else?
Because I'm currently trying to do fine-tuning on textgen, but I'm having trouble, with the dataset (format) I guess.
Do you already know how to download the fine-tuned model?
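If it helps: after training you can write the weights to disk yourself and pick the format. A minimal sketch, assuming `trained_model` and `tokenizer` exist from the fine-tuning run (the output folder name is hypothetical):

```python
# safe_serialization=True writes .safetensors files;
# safe_serialization=False writes the older PyTorch .bin format.
trained_model.save_pretrained("finetuned-model", safe_serialization=True)
tokenizer.save_pretrained("finetuned-model")
# The "finetuned-model" folder can then be zipped and downloaded.
```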
Any idea how we can deploy llama-2 on the Hugging Face API, just like the Falcon one? It has some issue with the handler.
I need help, please. I just want to be pointed in the right direction, since I'm new to this and couldn't really find any proper guide summarizing the steps for what I want to accomplish.
I want to integrate a LLama 2 70B chatbot into my website, and I have no idea where to start. I looked into setting up the environment on one of my cloud servers (it has to be private). Now I'm looking into training/fine-tuning the chat model using our data from our DBs (it's not clear to me here, but I assume it involves two steps: first I have to get the data into CSV format, since that's easier for me; second, I will need to format it in the Alpaca or OpenAssistant formats). After that, the result should be a deployment-ready model?
Just bullet points, I'd highly appreciate that.
@nty3929 Oh :/ I’m still lost about this but thank you for your effort nevertheless!
@nty3929 Yeah, bots are ruthless here and YouTube is having none of it, even at that cost. Guess they expect to see more technical conversations elsewhere.
Hello Sir!!
I have fine-tuned a Llama model and now want to deploy it using Flask. How do I do that? When I try to run the Flask app, the problem is that it starts downloading the base model, which it requires to load the fine-tuned model. Is there a possible way to store the model like we do in classic ML and then just use that, without taking much time?
Please tell me
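One way around the repeated download is to merge the adapter into the base model once, save the merged weights locally, and have Flask load from that local folder. A minimal sketch under those assumptions (paths and route are hypothetical):

```python
from flask import Flask, jsonify, request
from transformers import AutoModelForCausalLM, AutoTokenizer

app = Flask(__name__)

# Load once at startup, from the local merged checkpoint, not from the Hub.
tokenizer = AutoTokenizer.from_pretrained("./merged-model")
model = AutoModelForCausalLM.from_pretrained("./merged-model", device_map="auto")

@app.route("/generate", methods=["POST"])
def generate():
    prompt = request.json["prompt"]
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    return jsonify({"text": tokenizer.decode(outputs[0], skip_special_tokens=True)})
```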
Can you train the model on German data?
When you say you are tracking loss, what loss is that and how is that loss calculated for the task (summarization) at hand?
I have the same question. @karimbaig8573 were you able to figure out the answer?
Nope.
This looks like a great notebook, however, I always get a "CUDA out of memory" error when it executes the SFTTrainer function. It's fine up until then according to nvidia-smi but then memory just instantly maxes out. Does anyone know a way around this?
Try reducing the sequence length.
I set per_device_train_batch_size=1.
Hello, does it work for you? I have the same error!!
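Collecting the suggestions above: the usual levers are a smaller batch, gradient accumulation, and gradient checkpointing. A minimal sketch (the `TrainingArguments` names are real; the values are hypothetical starting points):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,   # smallest possible batch
    gradient_accumulation_steps=8,   # keeps the effective batch size up
    gradient_checkpointing=True,     # trades compute for memory
    fp16=True,
)
# Lowering max_seq_length in SFTTrainer (as suggested above) also helps.
```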
Why do you use that kind of prompt for training, like `### Instruction`, when in fact Llama 2 prompts are like `[INST] `...?
I think `[INST]` is the LLaMA2-CHAT prompt format. The base model was not instruction-tuned, so it doesn't expect any fixed prompt.
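For comparison, a minimal sketch of the LLaMA 2 *chat* prompt format mentioned above (the system and user strings are placeholders); the base model has no such fixed template:

```python
def llama2_chat_prompt(system: str, user: str) -> str:
    # The [INST] / <<SYS>> markers are only expected by the -chat models.
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"
```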
Super🎉
🔥
Anyone else having the issue with loading the dataset????
omg @ 15:06 😂😂😂
ALL of these tutorials require more dependencies. Can't somebody post how to do this in PyCharm with your own GPU? I can't make any of the tutorials I've found work, and it's just an endless troubleshooting process as to why everything is different in all of them.