311 - Fine tuning GPT2 using custom documents
- Published May 9, 2023
Code generated in the video can be downloaded from here:
github.com/bnsreenu/python_fo...
All other code:
github.com/bnsreenu/python_fo...
This tutorial explains the simple process of fine-tuning GPT2 using your own documents. It also demonstrates the advantages of structuring your training data as Q&A rather than long text. - Science & Technology
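The Q&A structuring mentioned above can be sketched roughly like this. The `[Q]`/`[A]` tags and the example pairs are illustrative assumptions, not necessarily the exact format used in the video:

```python
# Sketch: turn (question, answer) pairs into training text for causal-LM
# fine-tuning. Each pair becomes one example; examples are separated by a
# blank line so the model sees clear Q&A boundaries.

def format_qa_pairs(pairs):
    """Join each (question, answer) pair into one tagged training example."""
    examples = []
    for question, answer in pairs:
        examples.append(f"[Q] {question}\n[A] {answer}")
    return "\n\n".join(examples)

pairs = [
    ("What is the Babel fish?",
     "A small fish that translates any language when placed in the ear."),
    ("Who wrote The Hitchhiker's Guide to the Galaxy?",
     "Douglas Adams."),
]

training_text = format_qa_pairs(pairs)
print(training_text)
```

The resulting string can then be written to a text file and fed to a causal-LM trainer the same way long free-form text would be.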
Fantastic, just what I was looking for!
You are a great teacher, and I like your English. Thank you.
Very helpful, thanks to you sir
Hello, one doubt: is the question "What is the Babel fish?" explicitly defined in the custom_q_and_a file? Can users only ask questions that were defined in the custom_q_and_a file? Thanks, your content is very interesting and clear.
First of all, your tutorial is very useful; there is very little material on the net related to this.
I had some questions:
Can we use this same approach for different models like FLAN-T5 or BLOOM, just by changing the model name?
Is there any other way to train the model on plain text data rather than question/answer pairs?
In this method, is the model adding new parameters to the pre-trained model, or is it doing something else?
Thanks in advance
Hi Sreeni! I loved your video, which gave me a lot of insight into learning an LLM and fine-tuning it on my own dataset. When I fine-tuned on my own Q&A dataset, I didn't get good results. Can I have access to the article where you trained your model?
Could you also share the dataset please?
Hey, the tutorial is quite interesting, but I'm having trouble fine-tuning the model for 50 epochs. The dataset is around 22 MB; how long will it take to train with the GPT2 model?
Thank you for the very useful tutorial. I'd like to do something similar, but it looks a lot more like additional unsupervised pretraining than fine-tuning. Essentially, what I am trying to do is give a pre-trained transformer a lot of documentation about marketing (e.g. 50/60 papers on marketing and business) to ingest. Unfortunately, the documentation is in a completely unstructured format, since it consists essentially of PDF files... but I'd like it to be able to reply in a chat-like manner... Do you have any idea how that could be achieved?
I have already tried the "document" retrieval option by generating embeddings etc and it really doesn't work all that well unfortunately...
sorry for the question but I've been scouring the internet for this and haven't really been able to find an answer...
I am in the same situation; the PDFs are unstructured. What do I do?
Thanks!
Thank you
I have one question: how do I add memory to the model? In other words, how do I allow the model to remember older conversations?
Hey brother,
Did you get the answer?
I want to know as well: how will the bot remember and relate to the historical context provided to it earlier?
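GPT2 itself is stateless: it only "remembers" what is inside the current prompt. A common workaround (an assumption here, not something shown in the video) is to prepend the last few conversation turns to each new question at inference time:

```python
# Assemble a prompt that carries recent conversation history, so a
# stateless model can answer follow-up questions in context. The
# "User:"/"Bot:" labels are illustrative, not a fixed convention.

def build_prompt(history, new_question, max_turns=4):
    """Concatenate the most recent turns plus the new question."""
    recent = history[-max_turns:]
    lines = [f"{role}: {text}" for role, text in recent]
    lines.append(f"User: {new_question}")
    lines.append("Bot:")  # the model continues generation from here
    return "\n".join(lines)

history = [("User", "What is the Babel fish?"),
           ("Bot", "A small translating fish.")]
print(build_prompt(history, "Where does it live?"))
```

Capping the history at `max_turns` keeps the prompt inside the model's context window (1024 tokens for GPT2).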
Good video, please keep it up. Please share the GitHub files as well.
Very good and useful tutorial. I tried this as in the tutorial and trained the model to create an inquiry-service chatbot for my campus, but the answers given are still not relevant. Is there a problem with the dataset? Maybe you can tell me what the minimum dataset size is, or which optimizations affect the ability of this model?
Your tutorials have been a great help to me during my college!
I have a small question:
If I want to train it for text summarization, do I follow the same steps?
My dataset is text and summary pairs... should I join them in the same file as [text] then [summary], like we did with the questions?
This approach makes sense if you have limited training data. Normally, you'd train a large language model on a large amount of data to make it versatile for chat or text summarization. You may find this resource helpful for your task: huggingface.co/docs/transformers/tasks/summarization
@@DigitalSreeni will check it, thanks alot! 🫡
@@DigitalSreeni Another question, please: can GPT2 be fine-tuned for summarization with 75k article/summary pairs, or would models like mT5 be better?
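For a decoder-only model like GPT2, one common trick (an assumption, not something demonstrated in the video) is to join the article and its summary with a separator string the model can learn, such as "TL;DR:", analogous to the [Q]/[A] tagging for question answering:

```python
# Sketch: format (text, summary) pairs for causal-LM fine-tuning by
# joining them with a learnable separator. " TL;DR: " is a convention
# popularized for GPT2-style summarization, used here as an example.

def format_summarization_pair(text, summary, sep=" TL;DR: "):
    """One training example: document, separator, then its summary."""
    return text.strip() + sep + summary.strip()

example = format_summarization_pair(
    "The quick brown fox jumps over the lazy dog many times.",
    "A fox jumps over a dog.")
print(example)
```

At inference time you would feed the document followed by the bare separator and let the model generate the continuation as the summary.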
Thank you sir! Can I fine-tune GPT for NER detection?
the question is, was that same question in the training set with a similar response?
Yeah, if the question already exists, that's kind of cheating. Users won't always ask questions that are in the Q&A document. Or is it just better to train it on the original text for more epochs, so it is more robust and can handle any question?
It would be very helpful if we had an answer to the above question
Also shouldn't the samples have ?
How did you make the Q&A dataset?
Very helpful. Need the dataset please!
Is there any example dataset, Sir? Your video is very helpful, but I am struggling to make or find the dataset.
Thanks for the video!
I have a question:
How can I use the GPU for this training?
Not sure if you still need it, but in Google Colab you can go to the "Runtime" tab and click "Change runtime type" to select something like a T4 GPU.
Thank you for this!! I have a question: if I wanted to do this but with more than 10 thousand documents, would it be better to use a vector database and do similarity queries?
When dealing with a large number of documents, using a vector database for similarity queries can be a more efficient approach. Vector indexes such as FAISS or Annoy can scale better than simple linear search.
Thanks!! I think the same @@DigitalSreeni
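The similarity-query idea can be illustrated without any library at all. This toy linear scan over made-up 3-dimensional embeddings just shows the mechanics; at the scale discussed above you would use a real vector index such as FAISS or Annoy instead:

```python
import math

# Toy nearest-neighbour search over document embeddings by cosine
# similarity. The vectors here are invented purely for demonstration.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(query, docs):
    """Return the document id whose embedding is closest to the query."""
    return max(docs, key=lambda doc_id: cosine(query, docs[doc_id]))

docs = {"doc_a": [1.0, 0.0, 0.0],
        "doc_b": [0.0, 1.0, 0.0],
        "doc_c": [0.9, 0.1, 0.0]}
print(most_similar([1.0, 0.0, 0.0], docs))  # prints doc_a
```

A linear scan like this is O(n) per query; approximate-nearest-neighbour indexes trade a little accuracy for sub-linear query time, which is what makes them attractive past a few thousand documents.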
Can I use this method for other languages?
How do I make this code work for GPT-3 or GPT-4? The part I am confused about with the regular OpenAI fine-tuning API is that after fine-tuning it completes my prompts perfectly, but when I try to have a conversation it doesn't remember the context of previous messages. So I got the idea of feeding the entire conversation up to that point back in as input, but it goes haywire after 3-4 messages because the training dataset does not contain any prompts with conversation history in them. So how is this done in order to have conversations with it like ChatGPT? Please, someone help me.
You have to fine-tune it using chat history as well, e.g. your input would be the last 7/8 messages (which should be tagged user/assistant) and your output would be the message you want to generate given that chat history. I'm pretty sure the OpenAI documentation has a pretty thorough example of this.
Amazing how easily GPT2 can be fine-tuned. My question is: can the final result handle different questions in the same context?
Who is the babel fish?
Tell me about babel fish?
Do you know babel fish?
Etc..
Great
I am getting this error while running the code: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
same, did you solve it?
Hi Sreeni can you please make one video on ViT Swin transformers implementation?
I followed your tutorial to fine-tune a GPT2 model, but when the train function is called I first got:
ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`
After `pip install accelerate` I now get: NameError: name 'AcceleratorState' is not defined. Can somebody help me with this?
Did anyone encounter the same error? Please, this is urgent.
Bit of a late reply, but you just do !pip install accelerate -U, then restart the runtime and run the script again.
What if I have a large raw text of documentation? Would I need to manually prepare questions and answers? That would be too tedious and require lots of human work.
Is there no other option? Like maybe pre-train the model on a huge raw text of documents, and then find some public question-answering dataset to prime the model?
Same for summarization: how can I train a model on my custom text if I don't have examples of summarization? Can I somehow leverage pre-trained models that do summarization?
All I have is unfiltered raw text of documents.
same, did you find anything?
also curious, any updates?@@sathvikreddy4807
I used GPT to generate questions and answers 😂
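For the raw-text situation in this thread, one option is continued causal-LM pretraining: slice the raw text into fixed-size blocks and train on those instead of Q&A pairs. The word-based chunker below is a deliberately tiny sketch; in practice you would chunk by tokenizer tokens (e.g. 512 or 1024 tokens per block):

```python
# Slice raw text into fixed-size blocks for causal-LM training.
# block_size=8 words is only for demonstration; real pipelines chunk by
# tokens to match the model's context window.

def chunk_words(text, block_size=8):
    """Split text into consecutive blocks of at most block_size words."""
    words = text.split()
    return [" ".join(words[i:i + block_size])
            for i in range(0, len(words), block_size)]

raw = ("The Babel fish is small yellow and leech-like "
       "and probably the oddest thing in the Universe")
for block in chunk_words(raw):
    print(block)
```

Each block then becomes one training example, so no manual Q&A preparation is needed; a Q&A dataset can still be applied afterwards as a second fine-tuning stage.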
can we get the datasets?
did you find the dataset?
@@srujalreddy__3132 Same here. Did anybody get the dataset?
nice guy
Dear Sreeni, are you able to contact by email? Can you indicate me it?
Pretty arrogant!
What part, if I may ask?
@@DigitalSreeni Hey, I'm so sorry, I didn't mean arrogant, I meant 'approachable', in the sense that it's not as difficult as it sounds. Please accept my apologies. I really like your videos, they have helped me a lot.