Building a Translator with Transformers
- Published May 1, 2023
- SPONSOR
Get 20% off and be a part of a Software Community: jointaro.com/r/ajayh486
ABOUT ME
⭕ Subscribe: ua-cam.com/users/CodeEmporiu...
📚 Medium Blog: / dataemporium
💻 Github: github.com/ajhalthor
👔 LinkedIn: / ajay-halthor-477974bb
RESOURCES
[1] Code for Video: github.com/ajhalthor/Transfor...
PLAYLISTS FROM MY CHANNEL
⭕ Transformers from scratch playlist: • Self Attention in Tran...
⭕ ChatGPT Playlist of all other videos: • ChatGPT
⭕ Transformer Neural Networks: • Natural Language Proce...
⭕ Convolutional Neural Networks: • Convolution Neural Net...
⭕ The Math You Should Know: • The Math You Should Know
⭕ Probability Theory for Machine Learning: • Probability Theory for...
⭕ Coding Machine Learning: • Code Machine Learning
Great work Ajay - and that's another 'Yes please!' to the all-encompassing transformer video!
Loved all your 13 videos. Good to see the problem is resolved. Many thanks for your hard work
Glad you liked the series! And my pleasure!
Thank you so much for taking the time to code and explain the transformer model in such detail; I followed your series from zero to hero. You are amazing, and if possible, please do a series on how transformers can be used for time series anomaly detection and forecasting. It is extremely necessary for someone to cover this on YouTube!
I also loved all your videos in this playlist. Could you please also teach how to use this transformer model that you coded in earlier videos to answer questions based on the data it was trained on?
Yes I would like you to make that all encompassing video, because I like your way of explaining things!
Aye aye I shall
I especially would like to know how to change the code in your notebook so that it can translate from Dutch to English let’s say. That would be easier for me to follow the process because I know both languages very well!
Loved the series and your energy throughout. Thank you for your hard work!
Hi Ajay, don't you think that in character-by-character translation, the context between words is somewhat (or a lot) lost? If context has to be established from the order of characters (rather than the order of words), then the attention mechanism will surely have to work harder and/or require much more training data for a character-by-character translation model. Very keen to know what you think. I'd also say the number of attention heads should be much higher in character-level translation, as there are more aspects of language to be learned.
Glad that you found the answer to your issues! And it would be great to make that big video of an end-to-end creation of the transformer architecture!
Yep! I’ll probably get back to it after I make some more NLP videos
@@CodeEmporium Are you planning on making a vid about word2vec or doc2vec ?
Yea! Actually one of my next videos is on word embeddings. And there should be some more on this front in the future
Great video, very helpful. Could you add a notebook with all the pieces combined?
What if I need to perform word-to-word translation from English to French rather than doing character-to-character translation? Do I need to feed the word embeddings at the encoder and decoder ends using any of BERT, GLOVE, or Word2Vec?
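A minimal sketch of what word-level tokenization could look like (the sentences, token names, and dimensions below are made up for illustration): instead of splitting into characters, split on whitespace and give each word its own id. The Transformer's nn.Embedding then learns word vectors during training, so pretrained embeddings like BERT, GloVe, or Word2Vec are an optional initialization rather than a requirement.

```python
import torch
import torch.nn as nn

# Hypothetical toy parallel corpus; the real dataset is not shown here.
english_sentences = ["i am happy", "you are happy"]

# Build a word-level vocabulary instead of a character-level one.
# sorted() keeps the id assignment deterministic across runs.
PAD, START, END = "<pad>", "<start>", "<end>"
words = sorted({w for s in english_sentences for w in s.split()})
vocab = {tok: i for i, tok in enumerate([PAD, START, END] + words)}

def encode(sentence, max_len=6):
    """Map a sentence to a fixed-length tensor of word ids."""
    ids = [vocab[START]] + [vocab[w] for w in sentence.split()] + [vocab[END]]
    ids += [vocab[PAD]] * (max_len - len(ids))  # pad to fixed length
    return torch.tensor(ids)

# The embedding layer is learned during training; pretrained word
# vectors could seed its weights but are not required.
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)
batch = torch.stack([encode(s) for s in english_sentences])
vectors = embedding(batch)  # shape: (2 sentences, 6 positions, 8 dims)
```

The rest of the encoder/decoder stack is unchanged; only the tokenizer and the vocabulary size fed to the embedding layers differ from the character-level setup.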
Nice work! How much data (parallel corpora) is sufficient, or at least required, for machine translation?
Hi, thank you for sharing. I have a question: when I change the data (English -> Vietnamese) and also the vocabulary, the code does not work. Can you explain this problem? Thank you.
This is really Awesome!!!
Thank you so much for the entire playlist. I have learnt so many things from your videos!
How much time did it take you altogether to understand transformers and get the videos done?
Thanks so much for the kind words and donation. I initially got into transformers back in 2019 so I had some baseline knowledge on the matter. But for this playlist, I would say it took a few weeks to get everything set up from scratch + create the diagrams and videos. I was stuck on the last piece for some time until a viewer helped me diagnose the issue.
I must say that you have a very good understanding of the topic.
I had another doubt: here we are generating characters, but the transformers used in ChatGPT generate words/tokens? How does that work? Can we have a video about how to fine-tune the LLaMA model for a particular use case?
Hello Sir, I'm looking through the code and the "forward" function in the transformer architecture. Why exactly are you not using softmax?
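Without seeing the exact training loop, one common reason (an assumption here, not a confirmed reading of the repo): PyTorch's nn.CrossEntropyLoss applies log-softmax internally, so a model's forward is expected to return raw logits, and softmax is only applied explicitly at inference time. A small sketch of the distinction, with made-up logits:

```python
import torch
import torch.nn as nn

# Toy logits for 2 positions over a 5-token vocabulary.
logits = torch.tensor([[2.0, 0.5, 0.1, 0.1, 0.1],
                       [0.2, 0.2, 3.0, 0.2, 0.2]])
targets = torch.tensor([0, 2])  # correct token id per position

# CrossEntropyLoss = log-softmax + negative log-likelihood,
# so it expects RAW logits from the model's forward pass.
loss_fn = nn.CrossEntropyLoss()
loss_from_logits = loss_fn(logits, targets)

# Feeding it probabilities (softmax applied twice overall) gives a
# different, wrong loss value:
loss_from_probs = loss_fn(torch.softmax(logits, dim=-1), targets)

# Softmax is only needed at inference, to read off probabilities:
probs = torch.softmax(logits, dim=-1)
```

So omitting softmax inside forward is standard practice whenever the loss is cross-entropy on logits; adding it there would silently distort training.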
Hi Ajay, thanks for creating these videos; I really learnt a lot. I converted your code to TensorFlow. The transformer model runs without error; however, the loss is reducing by only 0.005 each epoch and accuracy is only 1%-2%. When I try to infer, it only predicts the end token as output. Can you please help me fix it?
Thank you so much for the video and the code. Can you please tell me how to add another language?
Great explanation of both the theory and the code implementation. Can you share the hardware specification on which the model training was done?
This was on Google Colab, so you're using Google's free GPU to train the model. I just wrote and executed everything from my MacBook Air :)
Thanks for the videos.
Are the transformer model videos complete for now, or are you planning to make more on that topic?
The series is done with this video. But I am making more videos on language models in general. So a lot more to come on that front! Also, I am considering making a single long video that constructs transformers from scratch for a smooth flow. I’ll consider this in the next few weeks
Good day sir, can I ask something: can I use a different language for this?
Hi, will it work offline as well, in case it is integrated into a system?
Need more videos about BERT and GPT. 🙂
Yep. Gonna be making more videos on language models in general. So hope you’ll stick around
Sir, how do I fine-tune this code for sentence-to-sentence (word-level) translation?
Can someone explain how I run this model in a Jupyter notebook? The GitHub repo has many .ipynb files; which one is the complete file? And how do I put in my Hebrew-to-English dataset and train the model on it?
How did you get the Dataset?
Can you create video content explaining how the Transformer works for a time series forecasting project? I want to try something new but I'm still stuck in my progress!
Sir, when I try this code for English-to-Bengali translation, it gives errors during model training.
Most probably in the decoder section.
Index out of range in self.
Please help, sir.
Hello, when I run the attention notebook as a Python module I get an error:
Traceback (most recent call last):
  File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\attention_notebook.py", line 198, in <module>
    kn_predictions = transformer(eng_batch,
  File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\transformer.py", line 301, in forward
    x = self.encoder(x, encoder_self_attention_mask, start_token=enc_start_token, end_token=enc_end_token)
  File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\transformer.py", line 178, in forward
    x = self.sentence_embedding(x, start_token, end_token)
  File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\transformer.py", line 71, in forward
    x = self.embedding(x)
  File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\sparse.py", line 162, in forward
    return F.embedding(
  File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
Any thoughts as to my mistake?
Same error in my case, when I try English to Bengali translation.
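For the "index out of range in self" errors in this thread, a common cause when swapping in a new language pair (an assumption, since the exact data isn't shown) is that the new text contains characters that were never added to the vocabulary, so nn.Embedding receives a token id at or beyond its num_embeddings. A minimal reproduction plus a pre-training sanity check (the corpus and vocab names below are hypothetical):

```python
import torch
import torch.nn as nn

# An embedding table with a 5-entry vocabulary: valid ids are 0..4.
embedding = nn.Embedding(num_embeddings=5, embedding_dim=8)

in_range = embedding(torch.tensor([0, 4]))   # works fine

raised = False
try:
    embedding(torch.tensor([0, 7]))          # id 7 >= num_embeddings
except IndexError:                           # "index out of range in self"
    raised = True

# Cheap sanity check before training: verify every character in the
# data actually has an id in the vocabulary you built.
corpus = ["hello", "ওহে"]
vocab = {ch: i for i, ch in enumerate(sorted({c for s in corpus for c in s}))}
missing = {c for s in corpus for c in s if c not in vocab}
assert not missing, f"characters missing from the vocabulary: {missing}"
```

If the check fails, extend the vocabulary (and the embedding size) before training; filtering sentences against the vocabulary, as the original notebooks do for valid characters, serves the same purpose.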
Waiting for Mega video.
Hello, guys! I have a problem. I trained this transformer model on other languages and tested it in a Jupyter notebook; the model worked well. I tried to save it using torch.save(model.state_dict(), "path/to/file.pth"). Then I loaded it from this file and it generated very strange things. What could the reason be?
Hmm. Did you set the model to evaluation mode by executing model.eval() before making predictions?
@@CodeEmporium Yes. I found the mistake. I had changed your code a little and used list(set("string")) to build the vocabulary, and in different notebooks it works differently. I was surprised when I found it!
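For anyone hitting the same thing, a minimal sketch of the issue the commenter describes: Python randomizes string hashing per interpreter session, so list(set(text)) can assign different ids to the same characters in different notebooks, while the saved weights were trained against one particular mapping. Sorting makes the ordering deterministic (the torch.save usage in the comment below is illustrative):

```python
# The bug in miniature: list(set(...)) depends on set iteration order,
# which can differ between interpreter sessions because Python
# randomizes string hashing. Character ids built this way may not
# match the ids the saved model weights were trained with.
text = "hello world"

unstable_vocab = list(set(text))   # order may change across sessions
stable_vocab = sorted(set(text))   # deterministic: same order every run

char_to_id = {ch: i for i, ch in enumerate(stable_vocab)}

# Safer still: persist the exact mapping next to the weights, e.g.
# torch.save({"state_dict": ..., "char_to_id": char_to_id}, "ckpt.pth"),
# and reload both together so ids always line up.
```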
Thank you for your videos; you helped me make a project for my CS course at university!
Sir, I want to remove noise from EEG signals using a Transformer. Please give me the code.
I WANNA SEE THE MEGA VIDEO
Great explanation, but during the epochs I get an error: IndexError: index out of range in self. What could it be?
I am also getting the same error. Did you solve it??
@@mahirrafid6626 sorry no :(
Did we fix the error?
@@DigitalShaolin Sorry still not fixed. :(
Thanks for the great series Ajay. I am a retired man trying to understand the basics of transformer networks, and it helped me a lot. I speak Kannada too, by the way. Whenever you get a chance, could you make a video series on PyTorch? Thanks again.
Super happy you enjoyed the series! A Torch series might be outside my wheelhouse at the moment. But there are some really good deep learning coders out there on UA-cam. I like Nicholas Renotte; I think he might have the PyTorch videos you are looking for!
Yolov8 plz
Did you solve the problem ?
Yep. Just released the new video
Many thanks :)