Building a Translator with Transformers

  • Published May 1, 2023
  • SPONSOR
    Get 20% off and be a part of a Software Community: jointaro.com/r/ajayh486
    ABOUT ME
    ⭕ Subscribe: ua-cam.com/users/CodeEmporiu...
    📚 Medium Blog: / dataemporium
    💻 Github: github.com/ajhalthor
    👔 LinkedIn: / ajay-halthor-477974bb
    RESOURCES
    [1] Code for Video: github.com/ajhalthor/Transfor...
    PLAYLISTS FROM MY CHANNEL
    ⭕ Transformers from scratch playlist: • Self Attention in Tran...
    ⭕ ChatGPT Playlist of all other videos: • ChatGPT
    ⭕ Transformer Neural Networks: • Natural Language Proce...
    ⭕ Convolutional Neural Networks: • Convolution Neural Net...
    ⭕ The Math You Should Know: • The Math You Should Know
    ⭕ Probability Theory for Machine Learning: • Probability Theory for...
    ⭕ Coding Machine Learning: • Code Machine Learning

COMMENTS • 62

  • @marktahu2932 · a year ago · +2

    Great work Ajay, and that's another 'Yes please!' to the all-encompassing transformer video.

  • @AlokKumar-fi8qh · a year ago · +7

    Loved all 13 of your videos. Good to see the problem is resolved. Many thanks for your hard work!

    • @CodeEmporium · a year ago · +1

      Glad you liked the series! And my pleasure!

  • @amiralioghli8622 · 8 months ago · +2

    Thank you so much for taking the time to code and explain the transformer model in such detail. I followed your series from zero to hero. You are amazing; if possible, please do a series on how transformers can be used for time-series anomaly detection and forecasting. Someone really needs to cover that on YouTube!

  • @RadRebel4 · 11 months ago · +1

    I also loved all the videos in this playlist. Could you please also teach how to use the transformer model that you coded in earlier videos to answer questions based on the data it was trained on?

  • @scitechtalktv9742 · a year ago · +5

    Yes, I would like you to make that all-encompassing video, because I like your way of explaining things!

    • @CodeEmporium · a year ago · +1

      Aye aye I shall

    • @scitechtalktv9742 · a year ago

      I would especially like to know how to change the code in your notebook so that it can translate from Dutch to English, let's say. That would make the process easier for me to follow, because I know both languages very well!

  • @DevHisham · 10 months ago · +1

    Love the series and your energy throughout. Thank you for your hard work!

  • @ajaytaneja111 · 7 months ago · +4

    Hi Ajay, don't you think that in character-by-character translation, some (or a lot) of the context between words is lost? If context has to be established from the order of characters (rather than the order of words), then the attention mechanism will surely have to work harder, and/or the model will need much more training data for character-by-character translation. Very keen to know what you think. I'd also say the number of attention heads should be much higher in a character-level model, as there are more aspects of language to be learned.
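
    For context on the cost side of this question: character-level tokenization makes sequences several times longer than word-level tokenization, and self-attention compares every position with every other, so compute grows with the square of sequence length. A toy illustration (not from the video's code):

        # Character-level tokenization yields far longer sequences than
        # word-level, and self-attention cost grows quadratically with
        # sequence length.
        sentence = "the cat sat on the mat"
        word_tokens = sentence.split()   # 6 tokens
        char_tokens = list(sentence)     # 22 tokens, spaces included

        for name, tokens in [("word", word_tokens), ("char", char_tokens)]:
            n = len(tokens)
            print(f"{name}-level: {n} tokens -> {n * n} attention scores")
        # word-level: 6 tokens -> 36 attention scores
        # char-level: 22 tokens -> 484 attention scores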

  • @TransalpDave · a year ago

    Glad that you found the answer to your issues! And it would be great to make that big video of an end-to-end creation of the transformer architecture!

    • @CodeEmporium · a year ago · +1

      Yep! I’ll probably get back to it after I make some more NLP videos

    • @TransalpDave · a year ago

      @CodeEmporium Are you planning on making a video about word2vec or doc2vec?

    • @CodeEmporium · a year ago

      Yea! Actually one of my next videos is on word embeddings. And there should be some more on this front in the future

  • @VarunReddy23 · a year ago · +1

    Great video, very helpful. Could you add a notebook with all the pieces combined?

  • @dwarakanathchandra7611 · 7 months ago

    What if I need to perform word-to-word translation from English to French rather than character-to-character translation? Do I need to feed word embeddings into the encoder and decoder using any of BERT, GloVe, or Word2Vec?
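
    One common pattern, for reference (not what the video's code does, which learns character embeddings from scratch): initialize the embedding layer from pretrained word vectors and fine-tune it. A minimal sketch, with random stand-in values in place of real GloVe/word2vec weights:

        import torch
        import torch.nn as nn

        # Stand-in for pretrained word vectors: 10,000 words, 300 dims.
        # In practice these would be loaded from a GloVe or word2vec file.
        pretrained = torch.randn(10_000, 300)

        # freeze=False lets the embeddings be fine-tuned during training.
        embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)

        token_ids = torch.tensor([[12, 453, 7]])  # a batch of word indices
        vectors = embedding(token_ids)            # shape: (1, 3, 300)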

  • @vijaysen9739 · 26 days ago

    Nice work! How much data (parallel corpora) is sufficient, or at least required, for machine translation?

  • @nhuttiennguyenbach5689 · 4 months ago

    Hi, thank you for sharing. I have a question: when I change the data (English -> Vietnamese) and the vocabulary, the code does not work. Can you explain this problem? Thank you.

  • @lakshman587 · 6 months ago · +2

    This is really awesome!!!
    Thank you so much for the entire playlist. I have learnt so many things from your videos!
    How much time did it take you altogether to understand the material and get the transformer videos done?

    • @CodeEmporium · 6 months ago · +1

      Thanks so much for the kind words and donation. I initially got into transformers back in 2019 so I had some baseline knowledge on the matter. But for this playlist, I would say it took a few weeks to get everything set up from scratch + create the diagrams and videos. I was stuck on the last piece for some time until a viewer helped me diagnose the issue.

    • @lakshman587 · 6 months ago

      I must say that you have a very good understanding of the topic.
      I had another doubt: here we are generating characters, but the transformers used in ChatGPT generate words/tokens. How does that work? Also, can we have a video about how to fine-tune the LLaMA model for a particular use case?

  • @aaronzheng2341 · 9 days ago

    Hello sir, I'm looking through the code and the "forward" function in the transformer architecture. Why exactly are you not using softmax?
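
    A likely answer, for reference: PyTorch's nn.CrossEntropyLoss applies log-softmax internally, so a model trained with it should return raw logits from forward; an explicit softmax is only needed when you want probabilities at inference. A toy sketch:

        import torch
        import torch.nn as nn

        logits = torch.randn(4, 10)           # batch of 4, vocab of 10
        targets = torch.tensor([1, 0, 3, 9])  # gold class indices

        # CrossEntropyLoss combines log-softmax with negative log-likelihood,
        # so no softmax is applied inside the model's forward pass.
        loss = nn.CrossEntropyLoss()(logits, targets)

        # At inference, convert logits to probabilities explicitly if needed.
        probs = torch.softmax(logits, dim=-1)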

  • @niranjandeshpande2973 · 10 months ago

    Hi Ajay, thanks for creating these videos; I really learnt a lot. I converted your code to TensorFlow. The transformer model runs without error; however, the loss decreases by only 0.005 each epoch and accuracy is only 1-2%. When I try to infer, it only predicts the end token as output. Can you please help me fix it?

  • @mariej397 · 6 months ago

    Thank you so much for the video and the code. Can you please tell me how to add another language?

  • @sameer_datascience · a year ago · +1

    Great explanation of both the theory and the code implementation. Can you share the hardware specification on which the model training was done?

    • @CodeEmporium · a year ago · +1

      This was on Google Colab, so you are using Google's free GPU to train the model. I just wrote and executed everything on my MacBook Air :)

  • @shivampradhan6101 · a year ago

    Thanks for the videos.
    Is the transformer video series complete for now, or are you planning to make more on that topic?

    • @CodeEmporium · a year ago

      The series is done with this video. But I am making more videos on language models in general, so a lot more to come on that front! Also, I am considering making a single long video that constructs transformers from scratch for a smooth flow. I’ll consider this in the next few weeks.

  • @batistisrico8035 · 4 months ago

    Good day, sir. May I ask something: can I use a different language for this?

  • @user-up2sd1wq1n · 5 months ago

    Hi, will it work offline as well, in case it is integrated into a system?

  • @manoj_dil · a year ago · +1

    Need more videos about BERT and GPT. 🙂

    • @CodeEmporium · a year ago

      Yep. Gonna be making more videos on language models in general. So hope you’ll stick around

  • @user-zb4hk3sp7h · 8 months ago

    Sir, how do I fine-tune this code for sentence-to-sentence (word-level) translation?

  • @hsb601 · 8 months ago

    Can someone explain how to run this model in a Jupyter notebook? The GitHub repo has many .ipynb files; which one is the complete file? And how do I plug in my Hebrew-to-English dataset and train the model on it?

  • @lakshman200 · 6 months ago · +1

    How did you get the dataset?

  • @adyudhapr · 7 months ago

    "Can you create a video content explaining how the Transformer works for a time series forecasting project? I want to try something new but I'm still stuck in my progress!"

  • @arijitgoswami7388 · 20 days ago

    Sir, when I try this code for English-to-Bengali translation, it gives errors during model training, most probably in the decoder section: "index out of range in self". Please help, sir.

  • @lisaming5423 · 2 months ago · +2

    Hello, when I run the attention notebook as a Python module I get an error:
    Traceback (most recent call last):
      File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\attention_notebook.py", line 198, in <module>
        kn_predictions = transformer(eng_batch,
      File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\transformer.py", line 301, in forward
        x = self.encoder(x, encoder_self_attention_mask, start_token=enc_start_token, end_token=enc_end_token)
      File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\transformer.py", line 178, in forward
        x = self.sentence_embedding(x, start_token, end_token)
      File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "D:\Jerry\DSLR_PICTURE_DATA\DSLR_PIC\AI\self_attention\self_attention_code_empire\transformer.py", line 71, in forward
        x = self.embedding(x)
      File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\sparse.py", line 162, in forward
        return F.embedding(
      File "C:\Users\mingl\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\functional.py", line 2210, in embedding
        return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
    IndexError: index out of range in self
    Any thoughts as to my mistake?

    • @arijitgoswami7388 · 20 days ago

      Same error in my case when trying English-to-Bengali translation.
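
      For reference on this recurring error: nn.Embedding raises "IndexError: index out of range in self" whenever a token index is greater than or equal to the size of the embedding table. When swapping in a new language, that usually means some characters in the data are missing from the vocabulary used to size the embedding layer, or special tokens were not counted in its size. A minimal sketch with toy sizes, not the video's code:

          import torch
          import torch.nn as nn

          vocab = sorted(set("abcdefg"))            # toy 7-character vocabulary
          embedding = nn.Embedding(len(vocab), 16)  # valid indices: 0..6

          ok = embedding(torch.tensor([0, 3, 6]))   # fine: indices < len(vocab)

          try:
              embedding(torch.tensor([7]))          # 7 >= len(vocab)
          except IndexError as err:
              print(err)                            # index out of range in self

          # Fix: build the vocabulary from the actual training text, and count
          # START/END/PADDING tokens in the embedding table's size.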

  • @amitsingha1637 · 8 months ago

    Waiting for the mega video.

  • @dimaglobin9562 · a year ago

    Hello, guys! I have a problem. I trained this transformer model on other languages and tested it in a Jupyter notebook, and it worked well. I tried to save it using torch.save(model.state_dict(), "path/to/file.pth"). Then I loaded it from this file and it generated very strange things. What could be the reason?

    • @CodeEmporium · a year ago

      Hmm. Did you set the model to evaluation mode by executing model.eval() before making predictions?

    • @dimaglobin9562 · a year ago

      @CodeEmporium Yes. I found the mistake. I had changed your code a little and used list(set("string")) to build the vocabulary, and it produces a different ordering in different notebook sessions. I was surprised when I found it.
      Thank you for your videos; you helped me make a project for my CS course at university!
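
      For reference: Python's set iteration order for strings is not stable across interpreter runs (string hashing is randomized), so list(set(text)) can assign different indices to the same characters in different sessions, scrambling a saved model's embeddings. Sorting fixes the order; a minimal sketch:

          text = "hello world"

          # Order can differ between interpreter runs because str hashing
          # is randomized, so saved index mappings may not be reproducible.
          unstable_vocab = list(set(text))

          # Deterministic: the same text always yields the same mapping,
          # so a model saved in one session decodes correctly in another.
          stable_vocab = sorted(set(text))
          char_to_index = {ch: i for i, ch in enumerate(stable_vocab)}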

  • @user-cn2en7oh6x · 10 months ago

    Sir, I want to remove noise from an EEG signal using a Transformer. Please share the code.

  • @emptiness116 · 11 months ago

    I WANNA SEE THE MEGA VIDEO

  • @Harmonikas1992 · a year ago · +2

    Great explanation, but during the epochs I get an error: IndexError: index out of range in self. What could it be?

    • @mahirrafid6626 · 10 months ago

      I am also getting the same error. Did you solve it??

    • @mahirrafid6626 · 10 months ago

      @Harmonikas1992

    • @Harmonikas1992 · 10 months ago

      @mahirrafid6626 Sorry, no :(

    • @DigitalShaolin · 9 months ago

      Did we fix the error?

    • @Harmonikas1992 · 9 months ago

      @DigitalShaolin Sorry, still not fixed. :(

  • @gopalsurya · a year ago

    Thanks for the great series, Ajay. I am a retired man trying to understand the basics of transformer networks, and it helped me a lot. I speak Kannada too, by the way. Whenever you get a chance, could you make a video series on PyTorch? Thanks again.

    • @CodeEmporium · a year ago

      Super happy you enjoyed the series! A PyTorch series might be outside my wheelhouse at the moment, but there are some really good deep learning coders out there on YouTube. I like Nicholas Renotte; I think he might have the PyTorch videos you are looking for!

  • @vigneshvicky6720 · 9 months ago · +1

    YOLOv8, please

  • @SolathPrime · a year ago · +4

    Did you solve the problem?

  • @Pf.Basset · 9 months ago