311 - Fine tuning GPT2 using custom documents
- Published May 9, 2023
Code generated in the video can be downloaded from here:
github.com/bnsreenu/python_fo...
All other code:
github.com/bnsreenu/python_fo...
This tutorial explains the simple process of fine-tuning GPT2 using your own documents. It also demonstrates the advantages of structuring your training data as Q&A rather than long text. - Science & Technology
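The Q&A structuring mentioned above can be sketched roughly like this. The `[Q]`/`[A]` tags and the example pairs are illustrative assumptions, not necessarily the exact format used in the video:

```python
# Sketch: turn (question, answer) pairs into training text for causal-LM
# fine-tuning. Each pair becomes one example; examples are separated by a
# blank line so the model sees clear Q&A boundaries.

def format_qa_pairs(pairs):
    """Join each (question, answer) pair into one tagged training example."""
    examples = []
    for question, answer in pairs:
        examples.append(f"[Q] {question}\n[A] {answer}")
    return "\n\n".join(examples)

pairs = [
    ("What is the Babel fish?",
     "A small fish that translates any language when placed in the ear."),
    ("Who wrote The Hitchhiker's Guide to the Galaxy?",
     "Douglas Adams."),
]

training_text = format_qa_pairs(pairs)
print(training_text)
```

The resulting string can then be written to a text file and fed to a causal-LM trainer the same way long free-form text would be.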
Fantastic, just what I was looking for!
You are a great teacher, and I like your English. Thank you.
Very helpful, thanks to you sir
Hello, one doubt: is the question "What is the Babel fish?" explicitly defined in the custom_q_and_a file? Can users only ask questions that were defined in the custom_q_and_a file? Thanks, your content is very interesting and clear.
First of all, your tutorial is very useful; there is very little material on the net related to this.
I had some questions:
Can we use this same approach for different models like FLAN-T5 or BLOOM, just by changing the model name?
Is there any other way to train the model on plain text data rather than question/answer pairs?
In this method, is the model adding new parameters to the pre-trained model, or is it doing something else?
Thanks in advance
Hi Sreeni! I loved your video, which gave me a lot of insight into learning an LLM and fine-tuning it on my own dataset. When I fine-tuned on my own Q&A dataset, I didn't get good results. Can I have access to the article where you trained your model?
Could you also share the dataset please?
Hey, the tutorial is quite interesting, but I'm having trouble fine-tuning the model for 50 epochs. The dataset is around 22 MB; how long will it take to train with the GPT2 model?
Thank you for the very useful tutorial. I'd like to do something similar, but it looks a lot more like additional unsupervised pretraining than fine-tuning. Essentially, what I am trying to do is give a pre-trained transformer a lot of documentation about marketing (e.g. 50/60 papers on marketing and business) to ingest. Unfortunately, the documentation is in a completely unstructured format, since it consists essentially of PDF files... but I'd like it to be able to reply in a chat-like manner... Do you have any idea how that could be achieved?
I have already tried the "document" retrieval option by generating embeddings etc and it really doesn't work all that well unfortunately...
sorry for the question but I've been scouring the internet for this and haven't really been able to find an answer...
I am in the same situation; the PDFs are unstructured. What do I do?
Thanks!
Thank you
I have one question: how do I add memory to the model? In other words, how do I allow the model to remember older conversations?
Hey brother,
Did you get the answer?
I want to know as well: how will the bot remember and relate to the historical context provided to it earlier?
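GPT2 itself is stateless: it only "remembers" what is inside the current prompt. A common workaround (an assumption here, not something shown in the video) is to prepend the last few conversation turns to each new question at inference time:

```python
# Assemble a prompt that carries recent conversation history, so a
# stateless model can answer follow-up questions in context. The
# "User:"/"Bot:" labels are illustrative, not a fixed convention.

def build_prompt(history, new_question, max_turns=4):
    """Concatenate the most recent turns plus the new question."""
    recent = history[-max_turns:]
    lines = [f"{role}: {text}" for role, text in recent]
    lines.append(f"User: {new_question}")
    lines.append("Bot:")  # the model continues generation from here
    return "\n".join(lines)

history = [("User", "What is the Babel fish?"),
           ("Bot", "A small translating fish.")]
print(build_prompt(history, "Where does it live?"))
```

Capping the history at `max_turns` keeps the prompt inside the model's context window (1024 tokens for GPT2).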
Good video, please keep it up. Please share the GitHub files as well.
Very good and useful tutorial. I tried this as in the tutorial and trained the model to create an inquiry-service chatbot for my campus, but the answers given are still not relevant. Is there a problem with the dataset? Maybe you can tell me what the minimum dataset size is, or which optimizations affect the ability of this model?
Your tutorials have been a great help to me during my college!
I have a small question:
If I want to train it for text summarization, do I follow the same steps?
My dataset is text and summary pairs... should I join them in the same file as [text] then [summary], like we did with the questions?
This approach makes sense if you have limited training data. Normally, you'd train a large language model on a large amount of data to make it versatile for chat or text summarization. You may find this resource helpful for your task: huggingface.co/docs/transformers/tasks/summarization
@@DigitalSreeni will check it, thanks alot! 🫡
@@DigitalSreeni Another question, please: can GPT2 be fine-tuned for summarization with 75k article/summary pairs, or would models like mT5 be better?
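For a decoder-only model like GPT2, one common trick (an assumption, not something demonstrated in the video) is to join the article and its summary with a separator string the model can learn, such as "TL;DR:", analogous to the [Q]/[A] tagging for question answering:

```python
# Sketch: format (text, summary) pairs for causal-LM fine-tuning by
# joining them with a learnable separator. " TL;DR: " is a convention
# popularized for GPT2-style summarization, used here as an example.

def format_summarization_pair(text, summary, sep=" TL;DR: "):
    """One training example: document, separator, then its summary."""
    return text.strip() + sep + summary.strip()

example = format_summarization_pair(
    "The quick brown fox jumps over the lazy dog many times.",
    "A fox jumps over a dog.")
print(example)
```

At inference time you would feed the document followed by the bare separator and let the model generate the continuation as the summary.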
Thank you sir! Can I fine-tune GPT for NER detection?
the question is, was that same question in the training set with a similar response?
Yeah, if the question already exists, that's kind of cheating. Users won't always ask questions that are in the Q&A document. Or is it just better to train it on the original text for more epochs, so it is more robust and can handle any question?
It would be very helpful if we had an answer to the above question
Also shouldn't the samples have ?
How did you make the Q&A dataset?
Very helpful. Need the dataset please!
Is there any example dataset, Sir? Your video is very helpful, but I am struggling to make or find the dataset.
Thanks for the video!
I have a question:
How can I use the GPU for this training?
Not sure if you still need it, but in Google Colab you can go to the "Runtime" tab and click "Change runtime type" to select something like a T4 GPU.
Thank you for this!! I have a question: if I wanted to do this but with more than 10 thousand documents, would it be better to use a vector database and do similarity queries?
When dealing with a large number of documents, using a vector database for similarity queries can be a more efficient approach. Vector indexes such as FAISS or Annoy can scale better than simple linear search.
Thanks!! I think the same @@DigitalSreeni
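The similarity-query idea can be illustrated without any library at all. This toy linear scan over made-up 3-dimensional embeddings just shows the mechanics; at the scale discussed above you would use a real vector index such as FAISS or Annoy instead:

```python
import math

# Toy nearest-neighbour search over document embeddings by cosine
# similarity. The vectors here are invented purely for demonstration.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(query, docs):
    """Return the document id whose embedding is closest to the query."""
    return max(docs, key=lambda doc_id: cosine(query, docs[doc_id]))

docs = {"doc_a": [1.0, 0.0, 0.0],
        "doc_b": [0.0, 1.0, 0.0],
        "doc_c": [0.9, 0.1, 0.0]}
print(most_similar([1.0, 0.0, 0.0], docs))  # prints doc_a
```

A linear scan like this is O(n) per query; approximate-nearest-neighbour indexes trade a little accuracy for sub-linear query time, which is what makes them attractive past a few thousand documents.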
Can I use this method for other languages?
How do I make this code work for GPT-3 or GPT-4? The part I am confused about with the regular OpenAI fine-tuning API is that after fine-tuning it completes my prompts perfectly, but when I try to have a conversation it doesn't remember the context of previous messages. So I got the idea of feeding the entire conversation up to that point back in as input, but it goes haywire after 3-4 messages because the training dataset does not contain any prompts with conversation history in them. So how is this done in order to have conversations with it like ChatGPT? Please, someone help me.
You have to fine-tune it using chat history as well, e.g. your input would be the last 7/8 messages (which should be tagged user/assistant) and your output would be the message you want to generate given that chat history. I'm pretty sure the OpenAI documentation has a pretty thorough example of this.
Amazing how easily GPT2 can be fine-tuned. My question is: can the final result handle different questions in the same context?
Who is the babel fish?
Tell me about babel fish?
Do you know babel fish?
Etc..
Great
I am getting this error while running the code: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
same, did you solve it?
Hi Sreeni can you please make one video on ViT Swin transformers implementation?
I followed your tutorial to fine-tune a GPT2 model, but when the train function is called I first got:
ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`
After `pip install accelerate` I now get: NameError: name 'AcceleratorState' is not defined. Can somebody help me with this?
Did anyone encounter the same error? Please, this is urgent.
Bit of a late reply, but you just do !pip install accelerate -U, then restart the runtime and run the script again.
What if I have a large raw text of documentation? Would I need to manually prepare questions and answers? That would be too tedious and require lots of human work.
Is there no other option? Like maybe pre-train the model on a huge raw text of documents, and then find some public question-answering dataset to prime the model?
Same for summarization: how can I train a model on my custom text if I don't have examples of summarization? Can I somehow leverage pre-trained models that do summarization?
All I have is unfiltered raw text of documents.
same, did you find anything?
also curious, any updates?@@sathvikreddy4807
I used GPT to generate questions and answers 😂
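For the raw-text situation in this thread, one option is continued causal-LM pretraining: slice the raw text into fixed-size blocks and train on those instead of Q&A pairs. The word-based chunker below is a deliberately tiny sketch; in practice you would chunk by tokenizer tokens (e.g. 512 or 1024 tokens per block):

```python
# Slice raw text into fixed-size blocks for causal-LM training.
# block_size=8 words is only for demonstration; real pipelines chunk by
# tokens to match the model's context window.

def chunk_words(text, block_size=8):
    """Split text into consecutive blocks of at most block_size words."""
    words = text.split()
    return [" ".join(words[i:i + block_size])
            for i in range(0, len(words), block_size)]

raw = ("The Babel fish is small yellow and leech-like "
       "and probably the oddest thing in the Universe")
for block in chunk_words(raw):
    print(block)
```

Each block then becomes one training example, so no manual Q&A preparation is needed; a Q&A dataset can still be applied afterwards as a second fine-tuning stage.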
can we get the datasets?
did you find the dataset?
@@srujalreddy__3132 Same here. Did anybody get the dataset?
nice guy
Dear Sreeni, are you able to contact by email? Can you indicate me it?
Pretty arrogant!
What part, if I may ask?
@@DigitalSreeni Hey, I'm so sorry, I didn't mean arrogant, I meant 'approachable', in the sense that it's not as difficult as it sounds. Please accept my apologies. I really like your videos, they have helped me a lot.