Image Caption Generator using Flickr Dataset | Deep Learning | Python

  • Published 11 Sep 2024

COMMENTS • 500

  • @HackersRealm
    @HackersRealm  1 year ago +10

    Hey Hackers,
    I have updated the code to test the model with a new image or image URL, and added the Flickr30k dataset for better results. You can find the latest code in the description.
    For users getting the following error:
    `output_signature` must contain objects that are subclass of `tf.TypeSpec`
    Please update the code snippets in data_generator and model creation as I did in my website/description link. It works with the latest version of TensorFlow without issues.
    Happy Learning!!!

    • @petenallan24
      @petenallan24 1 year ago +1

      Sir, what will the base dir and work dir be if I am working in a Jupyter notebook?

    • @HackersRealm
      @HackersRealm  1 year ago

      @petenallan24 you can set the base dir to your dataset directory and use any new folder as the working directory!!!

    • @rohith646
      @rohith646 1 year ago

      @HackersRealm sir, I have uploaded the dataset and captions file to my Google Drive and started working in Google Colab. What should I set as my base dir and working dir?

    • @HackersRealm
      @HackersRealm  1 year ago

      @rohith646 the base dir will be the dataset folder... Check whether that works, or change the code accordingly
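
For readers asking the same question, here is a minimal sketch of how the two directories could be set up outside Kaggle. The helper name `resolve_dirs` and the folder names are hypothetical, and the Colab mount call is shown only as a comment because it runs only inside Colab.

```python
from pathlib import Path
import tempfile

def resolve_dirs(base_dir, working_dir):
    """Return (base, work) paths; create the working directory if missing."""
    base = Path(base_dir)
    work = Path(working_dir)
    # The working directory only needs to be writable: it holds outputs
    # such as the pickled image features and the saved model.
    work.mkdir(parents=True, exist_ok=True)
    return base, work

# In Colab you would first mount Drive, e.g.:
#   from google.colab import drive
#   drive.mount('/content/drive')
# and then point base_dir at the dataset folder inside the mounted drive.
# The paths below are hypothetical placeholders under a temp directory.
root = Path(tempfile.mkdtemp())
BASE_DIR, WORKING_DIR = resolve_dirs(root / "flickr8k", root / "working")
```

The base directory is read-only input (the dataset folder), while the working directory can be any new, empty folder.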

    • @beatx2173
      @beatx2173 10 months ago +1

      thanks

  • @JannatulFerdous-ew5ko
    @JannatulFerdous-ew5ko 2 years ago +25

    Appreciated your project details. It took me almost 3 weeks to reproduce similar results.

    • @HackersRealm
      @HackersRealm  2 years ago +1

      Glad it helped you!!!

    • @keerthyk2284
      @keerthyk2284 7 months ago +2

      My final-year project is also on this topic

    • @amineakkati7501
      @amineakkati7501 5 months ago

      Can I please contact you? I need some ideas about the project.

  • @MsBothyna
    @MsBothyna 6 months ago +1

    By the way, I forgot to thank you for all this excellent explanation. You, sir, are a truly great person. I am very grateful to you.

    • @HackersRealm
      @HackersRealm  6 months ago +1

      Thanks for your kind words!! Happy to help!!!

    • @MsBothyna
      @MsBothyna 6 months ago

      @HackersRealm Yes, indeed your explanation and video have helped me understand machine learning and models much better than my professor's explanation. 😅😅
      However, I have a simple question. When I try the "Test with Real Image" part, I get an incorrect prediction. Could you please explain what I should do? Keep in mind that all the other results in the code are correct and all the steps match your explanation exactly.

    • @HackersRealm
      @HackersRealm  6 months ago +1

      @MsBothyna Currently we are using a smaller dataset; if you train with the Flickr30k dataset, you might see better results.

  • @plabmadeeasy
    @plabmadeeasy 1 year ago +2

    Beautiful explanation! Thanks for this!

  • @rishabhvyas4969
    @rishabhvyas4969 1 year ago +2

    Why didn't you use Keras ImageDataGenerator to extract the features from the model? It creates the whole image preprocessing pipeline, so you don't have to do it manually. Btw, great tutorial!

    • @HackersRealm
      @HackersRealm  1 year ago +2

      That would run the feature extraction step again on every rerun. Extracting the features separately and storing them avoids rerunning from scratch.
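
The extract-once-and-store idea from this reply can be sketched as a small cache helper. This is only an illustration: `load_or_extract` and `fake_extract` are hypothetical names, and the fake extractor stands in for the slow CNN pass over every image.

```python
import os
import pickle
import tempfile

def load_or_extract(cache_path, extract_fn):
    """Load cached features if present; otherwise compute and store them."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    features = extract_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(features, f)
    return features

calls = []

def fake_extract():
    # Stand-in for the slow feature extraction over the whole dataset.
    calls.append(1)
    return {"img_001": [0.1, 0.2], "img_002": [0.3, 0.4]}

cache = os.path.join(tempfile.mkdtemp(), "features.pkl")
first = load_or_extract(cache, fake_extract)   # computes and writes the cache
second = load_or_extract(cache, fake_extract)  # reads the cache, no recompute
```

On a rerun only the pickle file is read, so the expensive step runs once per dataset rather than once per session.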

  • @mohamedsahli9935
    @mohamedsahli9935 1 year ago +2

    thank u sir best explained IC video so far

  • @wildshore8580
    @wildshore8580 2 years ago +4

    Loved the implementation and the explanation. Could you please do an end to end chatbot implementation like this, using cornell movie dataset?

    • @HackersRealm
      @HackersRealm  2 years ago +2

      chatbot application is already done for generic messages, check the python projects playlist

  • @harshith24
    @harshith24 10 months ago +2

    It's a wonderful project and I could easily get the output by following your instructions, but after completing everything, if I try predicting the output for a new image, the output is not relevant. How can I correct this? It would be very helpful if you could help us with this. Thank you

    • @HackersRealm
      @HackersRealm  10 months ago

      You could use the Flickr30k dataset, which has much more variety, so that new images work better

  • @mayur7452
    @mayur7452 8 months ago +2

    Hello sir. I am doing this project but using EfficientNetV2B0 and GRU. But my BLEU-1 score does not go above 0.22. What needs to be changed? Is it possible to get a BLEU-1 score above 0.5? Also, how can we load this model so that retraining is not required, and how can we use it in a GUI?

  • @SaiKumar-mf3pw
    @SaiKumar-mf3pw 1 year ago +4

    Can we use Jupyter Notebook for this project?

  • @sudeshnakundu3909
    @sudeshnakundu3909 2 months ago +1

    Thanks for this video, explained well! Can the model predict on monuments and historical structures? I mean can the model predict on totally unseen data and can you please make a video of how to put entity awareness on top of it

    • @HackersRealm
      @HackersRealm  2 months ago

      yes, but you have to train with more data for better results, I have used smaller dataset for the demo

  • @sreelakshminarayanan.m6609
    @sreelakshminarayanan.m6609 6 months ago +1

    Thanks for the wonderful video , code and explanation

  • @Carbon69
    @Carbon69 9 days ago +2

    How can I make it so that I provide any random Google image and it provides a caption for it?

    • @HackersRealm
      @HackersRealm  8 days ago

      If you train with the Flickr30k dataset, it would provide better results

  • @MarehAboGhanem
    @MarehAboGhanem 4 months ago +1

    Hello! I tried 70 epochs and the result doesn’t improve beyond a BLEU score of 52. I want to try hyperparameter tuning with grid search, but it doesn’t work without “y-train”. Could you tell me how to get y and how to apply this technique?

  • @nivedansharma4293
    @nivedansharma4293 2 years ago +3

    I'm getting the error 'int' object is not iterable in model.fit(generator, epochs=1, steps_per_epoch=steps, verbose=1)

    • @HackersRealm
      @HackersRealm  2 years ago

      Did you run the same code?

    • @nivedansharma4293
      @nivedansharma4293 2 years ago

      @HackersRealm yes, I ran exactly the same code

    • @nivedansharma4293
      @nivedansharma4293 2 years ago

      epochs = 15
      batch_size = 64
      steps = len(train) // batch_size
      for i in range(epochs):
          generator = data_generator(train, mapping, features, tokenizer, max_length, vocab_size, batch_size)
          model.fit(generator, epochs=1, steps_per_epoch=steps, verbose=1)
      TypeError: 'int' object is not iterable

    • @nivedansharma4293
      @nivedansharma4293 2 years ago

      and I ran it in a Kaggle notebook

    • @youssefrizk5905
      @youssefrizk5905 2 years ago

      @nivedansharma4293 Hello, I was getting the same error, in addition to another error. The mistake I made was while creating the model:
      fe1 = Dropout(0.4)(inputs1) and se2 = Dropout(0.4)(se1). I was writing 0,4 instead of 0.4. The error went away when I corrected it.
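
A likely explanation for why the comma typo produces this particular message: Python reads `Dropout(0,4)` as two positional arguments, and the second positional parameter of the Keras `Dropout` layer is `noise_shape`, which is expected to be a sequence. The sketch below uses a plain stand-in function (`dropout_args`, hypothetical) that only mirrors that argument order; it does not call Keras.

```python
def dropout_args(rate, noise_shape=None, seed=None):
    """Stand-in that mirrors the argument order of keras.layers.Dropout."""
    return {"rate": rate, "noise_shape": noise_shape, "seed": seed}

typo = dropout_args(0, 4)   # what "0,4" actually passes: rate=0, noise_shape=4
fixed = dropout_args(0.4)   # the intended call: rate=0.4

def try_iterate(value):
    """Iterating an int raises the TypeError reported in the thread."""
    try:
        list(value)
        return "ok"
    except TypeError as exc:
        return str(exc)

message = try_iterate(typo["noise_shape"])
```

So a dropout rate of 0 is silently accepted, and the failure only shows up later when something tries to iterate over the integer `noise_shape`.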

  • @tanviladdha4120
    @tanviladdha4120 2 years ago +1

    This is the best video and so perfectly explained. Sir, can you please make a video on video captioning using the MSVD dataset? Thank you 👍🏼

    • @HackersRealm
      @HackersRealm  2 years ago +1

      Planning to do it as upcoming project, will do. Glad you liked this video!!!

    • @tanviladdha4120
      @tanviladdha4120 2 years ago +1

      @HackersRealm that's great! 😊
      Will be waiting for it and hoping to see it soon

  • @manishakumari4501
    @manishakumari4501 6 months ago +1

    I really liked this video, great!!!

  • @prodevmahi4901
    @prodevmahi4901 1 year ago +1

    Kaggle in "Accelerator" tab now provides even TPU, out of 4 options shown in the drop down, which to choose?

  • @riyanagar2619
    @riyanagar2619 11 months ago

    Thank you for great explanation. I have a question, what's the accuracy of this project?

  • @sreelakshmi7932
    @sreelakshmi7932 6 months ago +3

    Hello sir, when I try to extract image features it shows gaierror, URL errors, an exception at model = VGG16, etc.
    How can I fix it? Please help me.

  • @trangle1506
    @trangle1506 6 months ago +1

    Well explained. Thank you so much bro

  • @user-jo7pq2ti7r
    @user-jo7pq2ti7r 2 years ago +2

    The video was great, so much love.
    Can you tell me how I can apply the same code for Bengali caption generation?
    Where would the changes be?

    • @HackersRealm
      @HackersRealm  2 years ago

      If you have the dataset similar to this, you can proceed with the same workflow

    • @Mehedi.25
      @Mehedi.25 2 years ago

      Brother, have you done any more work on this project?

  • @LK-cp5ow
    @LK-cp5ow 11 months ago +1

    It will be great if you add the pickled model in the git repo, as it's going to take my pc about 4hrs to train the model... :(. Other than that, fantastic video!

    • @HackersRealm
      @HackersRealm  11 months ago +1

      I will try to upload that if possible

  • @percyjackson583
    @percyjackson583 6 months ago

    The predicted caption is empty; only startseq and endseq are there. I am also trying to resolve it. Any suggestion on which part I should check?

  • @tazarhussain22
    @tazarhussain22 2 years ago +3

    Thank you for the nice implementation. I have a question, can i use the same approach to generate text from numbers (like tabular data) instead of image features?

    • @HackersRealm
      @HackersRealm  2 years ago

      Yes, it may be possible, but you have to adjust the layers and features accordingly

    • @hariom6910
      @hariom6910 2 years ago

      Bro, have you got the output of the code?

    • @HackersRealm
      @HackersRealm  2 years ago

      @hariom6910 you can see it at the end of the video

  • @saithota6568
    @saithota6568 15 days ago

    What is the use of batch size and dense layer??✨✨

  • @subhayanbhattacharya2674
    @subhayanbhattacharya2674 2 years ago +1

    Wonderful video. Very insightful. Can you please mention which versions of TensorFlow and Keras you are working with? Thanks in advance!

    • @HackersRealm
      @HackersRealm  2 years ago

      I am using the modules available in Kaggle; you can check the module versions there

    • @hamnamatloob8231
      @hamnamatloob8231 2 years ago +1

      When I load the VGG16 model it gives an error

  • @user-ho6bv7kr9b
    @user-ho6bv7kr9b 6 months ago

    In the original extract features from image step, I followed your steps and it displayed 'Error displaying widget: model not found'. How can I solve it? I've been looking for a solution for a long time without success.

  • @jodgamezoo2076
    @jodgamezoo2076 6 months ago +2

    Getting this error while training the model. Code:
    # train the model
    epochs = 20
    batch_size = 32
    steps = len(train) // batch_size
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    for i in range(epochs):
        # create data generator
        generator = data_generator(train, mapping, features, tokenizer, max_length, vocab_size, batch_size)
        # fit for one epoch
        model.fit(generator, epochs=1, steps_per_epoch=steps, verbose=1)
    Error:
    TypeError: `output_signature` must contain objects that are subclass of `tf.TypeSpec` but found which is not.

    • @jodgamezoo2076
      @jodgamezoo2076 6 months ago

      I tried the actual code also:
      # train the model
      epochs = 20
      batch_size = 32
      steps = len(train) // batch_size

      for i in range(epochs):
          # create data generator
          generator = data_generator(train, mapping, features, tokenizer, max_length, vocab_size, batch_size)
          # fit for one epoch
          model.fit(generator, epochs=1, steps_per_epoch=steps, verbose=1)
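
A commonly reported cause of the `output_signature` error above is a generator that yields its two inputs as a Python list (`[X1, X2], y`); newer TensorFlow versions expect a tuple (or a dict keyed by input names) there. The sketch below is an assumption about the fix, not the author's updated code, which lives in the linked description and may differ. It uses plain lists so it runs without TensorFlow; `batch_generator` is a hypothetical, simplified stand-in for `data_generator`.

```python
def batch_generator(samples, batch_size):
    """Yield ((image_batch, text_batch), target_batch) tuples forever."""
    X1, X2, y = [], [], []
    while True:
        for image_vec, text_seq, target in samples:
            X1.append(image_vec)
            X2.append(text_seq)
            y.append(target)
            if len(y) == batch_size:
                # A tuple for the inputs, not a list: yielding [X1, X2] is
                # the pattern assumed here to trigger the TypeSpec error.
                # In the real pipeline each piece would be an np.array.
                yield (X1, X2), y
                X1, X2, y = [], [], []

samples = [([0.1, 0.2], [0, 1], 2), ([0.3, 0.4], [0, 1, 2], 3)]
inputs, targets = next(batch_generator(samples, batch_size=2))
```

If the model's `Input` layers are given names, yielding a dict keyed by those names is another option with the same effect.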

    • @meghanaarvapally5484
      @meghanaarvapally5484 20 days ago

      Hey, I am getting a URL fetch exception in image extraction. Can you tell me how to correct it?

  • @vedicakandoi7949
    @vedicakandoi7949 1 month ago

    Getting this error while training the model -
    assertion failed: [You are passing a RNN mask that does not correspond to right-padded sequences, while using cuDNN, which is not supported. With cuDNN, RNN masks can only be used for right-padding, e.g. `[[True, True, False, False]]` would be a valid mask, but any mask that isn\'t just contiguous `True`\'s on the left and contiguous `False`\'s on the right would be invalid. You can pass `use_cudnn=False` to your RNN layer to stop using cuDNN (this may be slower).]
    [[{{node functional_1_1/lstm_1/Assert/Assert}}]] [Op:__inference_one_step_on_iterator_423791]
    I have not changed anything in the code. Running your code only. Please suggest what to do?
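
The assertion text names the constraint itself: with cuDNN, the padding mask must be right-padded. Keras `pad_sequences` pads on the left by default, so one possible fix (an assumption, not confirmed by the author) is to pass `padding='post'` in both training and prediction; the other is the message's own suggestion of `use_cudnn=False` on the LSTM layer. The sketch below illustrates the difference between the two masks in plain Python; `pad_post`, `pad_pre` and `is_right_padded` are hypothetical helpers.

```python
def pad_post(seq, max_length, pad_value=0):
    """Right-pad (or truncate) a token-id list to max_length."""
    seq = list(seq)[:max_length]
    return seq + [pad_value] * (max_length - len(seq))

def pad_pre(seq, max_length, pad_value=0):
    """Left-pad, which is the default behaviour of Keras pad_sequences."""
    seq = list(seq)[-max_length:]
    return [pad_value] * (max_length - len(seq)) + seq

def is_right_padded(mask):
    """True if the mask is contiguous True values followed by False values."""
    seen_false = False
    for keep in mask:
        if not keep:
            seen_false = True
        elif seen_false:
            return False
    return True

tokens = [5, 9, 2]
post_mask = [t != 0 for t in pad_post(tokens, 5)]  # real tokens first
pre_mask = [t != 0 for t in pad_pre(tokens, 5)]    # padding first
```

Only the post-padded mask satisfies the condition described in the error message.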

  • @udaykiran2065
    @udaykiran2065 1 year ago +1

    There is an error in the data generator function: some of the keys are not present in features.
    def data_generator(data_keys, mapping, features, tokenizer, max_length, vocab_size, batch_size):
        # loop over images
        X1, X2, y = list(), list(), list()
        n = 0
        while 1:
            for key in data_keys:
                n += 1
                captions = mapping[key]
                # process each caption
                for caption in captions:
                    # encode the sequence
                    seq = tokenizer.texts_to_sequences([caption])[0]
                    # split the sequence into X, y pairs
                    for i in range(1, len(seq)):
                        # split into input and output pairs
                        in_seq, out_seq = seq[:i], seq[i]
                        # pad input sequence
                        in_seq = pad_sequences([in_seq], maxlen=max_length)[0]
                        # encode output sequence
                        out_seq = to_categorical([out_seq], num_classes=vocab_size)[0]

                        # store the sequences only when features exist for this key,
                        # so that X1, X2 and y always stay the same length
                        if key in features:
                            X1.append(features[key][0])
                            X2.append(in_seq)
                            y.append(out_seq)
                if n == batch_size:
                    X1, X2, y = np.array(X1), np.array(X2), np.array(y)
                    yield [X1, X2], y
                    X1, X2, y = list(), list(), list()
                    n = 0
    Please fix this so that no one encounters the same error.

    • @venkateshprasad-vs9ih
      @venkateshprasad-vs9ih 1 year ago

      Did you resolve this issue? I got the same error... Can you help me out with what to do next? Thanks in advance

    • @MennaSeleem-fd9zb
      @MennaSeleem-fd9zb 1 year ago

      If it's the
      X1.append(features[key][0]) line, then me too, and I can't seem to solve it, so if anyone has a fix, please send it

    • @varshanikumbh717
      @varshanikumbh717 1 year ago

      I get the same error too

  • @mahidiwijayantha
    @mahidiwijayantha 1 year ago

    Thank you for the nice explanation. I have a few questions.
    1. Can we use this flow with larger dataset?
    2. Can we use this flow for an image caption generator of fashion product images?

    • @HackersRealm
      @HackersRealm  1 year ago +1

      yes you can use the same!!!

    • @mahidiwijayantha
      @mahidiwijayantha 1 year ago

      @HackersRealm Thank you for your response. I have two more questions.
      1. Can we use this flow for generating a caption for a new image which is not in the training dataset?
      2. I want to create an image caption generator for fashion products. I created a dataset with images and captions for training. Can I use this flow to generate captions by extracting features (attributes and categories) of the fashion products?

    • @HackersRealm
      @HackersRealm  1 year ago +1

      @mahidiwijayantha yes, it's possible for both scenarios

  • @solutiontolifetarotreading9103
    @solutiontolifetarotreading9103 7 months ago +1

    Loading the VGG16 model file gives an error 😢😢 How can I resolve it?

    • @HackersRealm
      @HackersRealm  7 months ago

      Try enabling the internet connection in the Kaggle notebook settings

  • @mytv2362
    @mytv2362 1 year ago +1

    I made a GUI for this. Thanks for the code, btw

  • @rappaivo5779
    @rappaivo5779 2 years ago +2

    May I know why 'return_sequences' and 'return_state' of the LSTM are set to False (the default) in a text prediction network?

    • @akshayhasabe8766
      @akshayhasabe8766 6 months ago

      Because there is only one LSTM layer... We don't need to pass the output of every time step to a next layer here. If you are stacking multiple LSTM or GRU layers, you will need the output from every time step.

  • @koushikguptabonthala2429
    @koushikguptabonthala2429 2 years ago +2

    If the BLEU score is above 0.5, what is the accuracy as a percentage? Can you please tell me?

    • @HackersRealm
      @HackersRealm  2 years ago +2

      accuracy is not a meaningful metric for this problem
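
To make the metric concrete: BLEU-1 is the clipped unigram precision of the predicted caption against the reference captions, multiplied by a brevity penalty, so it measures word overlap rather than a percentage of "correct" predictions. The function below is a simplified sketch for a single caption (the name `bleu1` is an illustration; the video's own evaluation may use a library routine over the whole test set).

```python
import math
from collections import Counter

def bleu1(references, candidate):
    """Clipped unigram precision times the brevity penalty (BLEU-1)."""
    if not candidate:
        return 0.0  # avoid dividing by zero on an empty prediction
    cand_counts = Counter(candidate)
    # For each word, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for word, count in Counter(ref).items():
            max_ref[word] = max(max_ref[word], count)
    clipped = sum(min(count, max_ref[word]) for word, count in cand_counts.items())
    precision = clipped / len(candidate)
    # Reference length closest to the candidate length.
    ref_len = min((abs(len(r) - len(candidate)), len(r)) for r in references)[1]
    if len(candidate) > ref_len:
        bp = 1.0
    else:
        bp = math.exp(1 - ref_len / len(candidate))
    return bp * precision

refs = [["two", "dogs", "run", "on", "grass"]]
full = bleu1(refs, ["two", "dogs", "run", "on", "sand"])  # 4 of 5 words match
short = bleu1(refs, ["two", "dogs"])                      # penalized for brevity
```

A caption can score well below 1.0 and still be a perfectly good description, which is why the score should not be read as accuracy.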

  • @percyjackson583
    @percyjackson583 6 months ago +1

    Great video, but can anyone tell me which application is being used? What is the name of the IDE? Anyone, please, quick

  • @pradnyeshdoshi348
    @pradnyeshdoshi348 2 years ago +3

    Thanks for the wonderful implementation 😊. I ran it successfully, but
    can you tell me how the context.txt file is created? I saw that our input image should be in a particular format and we get correct results only for the 8000 images.
    Is it possible for other images? I think it's not extracting text from the image; it's extracting it from the context.txt file.
    If I am wrong, then please correct me.
    Thank you 😊

    • @HackersRealm
      @HackersRealm  2 years ago +1

      I am able to predict for new images that are not in the dataset as well; for better predictions, train with the Flickr30k dataset

    • @pradnyeshdoshi348
      @pradnyeshdoshi348 2 years ago +1

      @HackersRealm Thanks for replying.
      Can you tell me how you get a new input image?
      Images need a proper name. How do you set that name?

    • @pradnyeshdoshi348
      @pradnyeshdoshi348 2 years ago +1

      @HackersRealm can you make a short video? Then everyone will get the idea. We are not looking for accuracy. We are just excited to know how image processing is done by the CNN. And I don't have enough resources to train the model with 30000 images. 8000 images is sufficient for me 😅

    • @HackersRealm
      @HackersRealm  2 years ago

      @pradnyeshdoshi348 Then you can try to predict with a new image and check the results; the process is the same for prediction

    • @mr.anonymous8410
      @mr.anonymous8410 2 years ago +2

      @HackersRealm Hi, how do I predict captions for new images (which are not present in the Flickr dataset)?
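
On the question of new images: the caption is generated from image features, not looked up in the captions file, so the prediction loop is the same for any image once its features have been extracted with the same CNN. Below is a minimal sketch of that word-by-word (greedy) loop. The model call is replaced by a hypothetical `next_word_probs` function and a toy vocabulary, so this shows only the control flow, not the author's exact code.

```python
def greedy_caption(next_word_probs, index_to_word, max_length):
    """Generate a caption one word at a time until 'endseq' or max_length."""
    words = ["startseq"]
    for _ in range(max_length):
        # Stand-in for model.predict on (image features, padded word sequence).
        probs = next_word_probs(words)
        best = max(range(len(probs)), key=probs.__getitem__)  # argmax
        word = index_to_word.get(best)
        if word is None:
            break  # predicted index is not in the vocabulary
        words.append(word)
        if word == "endseq":
            break
    return " ".join(words)

# Toy vocabulary and a fake "model" that always emits: dog runs endseq.
index_to_word = {1: "dog", 2: "runs", 3: "endseq"}
script = {1: [0, 1, 0, 0], 2: [0, 0, 1, 0], 3: [0, 0, 0, 1]}

def fake_model(words_so_far):
    return script[len(words_so_far)]

caption = greedy_caption(fake_model, index_to_word, max_length=10)
```

The file name of the new image does not matter for this loop; it only needs the feature vector for that image.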

  • @I.II..III...IIIII.....
    @I.II..III...IIIII..... 1 year ago +1

    Hello. I've followed your video and I tried to train a model on flickr30k. My problem is that the captions I generate are repetitive. Whatever is in the image, my captions are always something like: "A man in a black shirt is walking down the street". How can I make the model's output more diverse?

    • @HackersRealm
      @HackersRealm  1 year ago +1

      Is this showing for every image you try? That shouldn't happen, as there should be at least a slight difference in output when the input changes

    • @HackersRealm
      @HackersRealm  11 months ago

      @user-px8qq6on1p it's very unlikely to happen if you follow the same steps; as you can see in the video, it generates different results for each image... we need to find out where it's going wrong, as there are so many moving parts

  • @manmaaze
    @manmaaze 2 years ago +2

    What if I give an image that is not in the dataset? Will it predict the caption?

    • @HackersRealm
      @HackersRealm  2 years ago +2

      It will try to predict the caption in a general manner; you can train with more images for better predictions

    • @FatimaYousif
      @FatimaYousif 2 years ago

      @HackersRealm how do we generate captions for unseen images in the code?
      I would be grateful for your help, since I have a demo to present on new/unseen images next week.

    • @HackersRealm
      @HackersRealm  2 years ago

      @FatimaYousif You can train the model with the Flickr30k dataset; that will give good predictions on new image data

  • @soumyasingh8500
    @soumyasingh8500 11 months ago

    is the predicted output, not wrong in every case?

  • @meriemsabour2830
    @meriemsabour2830 1 year ago +3

    When I try model.fit(generator, epochs=1, steps_per_epoch=steps, verbose=1) I get this error:
    KeyError: '1000268201_693b08cb0e'. Also, len(features) is 80 while len(image_names) is 8101. Why did it not process all the images?
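
A `KeyError` on an image id, together with far fewer feature entries than images, suggests that feature extraction stopped early or that an incomplete features file was loaded; that is a guess from the numbers in the comment, not a confirmed diagnosis. A quick check like the sketch below (the helper `split_by_features` is hypothetical) shows which ids are missing before training starts.

```python
def split_by_features(image_ids, features):
    """Separate ids that have extracted features from ids that do not."""
    usable = [i for i in image_ids if i in features]
    missing = [i for i in image_ids if i not in features]
    return usable, missing

# Toy data: three captioned images, but features exist for only two.
features = {"img_a": [0.1], "img_b": [0.2]}
image_ids = ["img_a", "img_b", "1000268201_693b08cb0e"]
usable, missing = split_by_features(image_ids, features)
```

If `missing` is large, rerunning the extraction step to completion is likely a better fix than filtering the ids, since filtering would silently shrink the training set.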

  • @Waliul_The_Wall-E
    @Waliul_The_Wall-E 11 months ago

    Thanks for the implementation. But I have a question and that is, what is the LSTM layer doing (1:00:12)? What's the use of this layer? All the papers use the LSTM for the word generation but you're not using the LSTM layer for word generation, you are using a Dense layer for word generation. Then why are you using the LSTM layer? And also, how is the Embedding layer learning here? TIA.

    • @HackersRealm
      @HackersRealm  11 months ago

      All the mentioned layers are used for the lstm model to generate a new word at a time

  • @bhushanambhore8378
    @bhushanambhore8378 1 year ago +1

    Does your model train on all 8000 images in the dataset?
    When I tried a different model, it could take at most 1600 images for training because of memory issues.

    • @HackersRealm
      @HackersRealm  1 year ago

      The memory issue won't happen because of the custom data generator function.

    • @bhushanambhore8378
      @bhushanambhore8378 1 year ago +1

      @HackersRealm Okay, but approximately how many images does your model use for training? Is it using all 8000 images from the dataset?

    • @HackersRealm
      @HackersRealm  1 year ago +1

      @bhushanambhore8378 I think around 6.5k; you can check the video again, as I split the data into train and test

  • @akulasaimanasa3344
    @akulasaimanasa3344 1 year ago +1

    Does BASE_DIR consist of only images, or is it the folder containing both images and captions?
    And what does working_dir hold? Is it an empty folder?

    • @HackersRealm
      @HackersRealm  1 year ago +1

      We store the extracted features in the working directory

    • @meghanaarvapally5484
      @meghanaarvapally5484 20 days ago

      @HackersRealm I am getting a URL fetch failure in image extraction. How do I correct it? Can you please tell me, sir?

  • @free-Palestine11
    @free-Palestine11 1 year ago +1

    Thank you for this video! I had a question: how can I pickle the implemented model to use it in an app? I am having trouble getting models out in .h5 or .pkl formats in general. Can anyone help with that?

    • @HackersRealm
      @HackersRealm  1 year ago

      Usually we store the model in h5 format and it reloads without any issues!!! What error are you facing?

  • @SanapPrasad-e2r
    @SanapPrasad-e2r 6 months ago

    I am getting a ZeroDivisionError when computing the BLEU score. Can you please help me with what to do?

  • @rakeshkumarrout2629
    @rakeshkumarrout2629 2 years ago +1

    Sir, there is no match out there for this project. I am new to deep learning and really want to go deep into DL. Can you suggest a good institute?

    • @HackersRealm
      @HackersRealm  2 years ago +1

      You can learn everything on YouTube itself; check the channel playlists to learn more concepts

    • @rakeshkumarrout2629
      @rakeshkumarrout2629 2 years ago

      @HackersRealm thank you sir. Can you share any reference for this project? An in-depth explanation of this project, any article?

    • @HackersRealm
      @HackersRealm  2 years ago

      @rakeshkumarrout2629 the text-based tutorial is linked in the description

  • @rakshitashetty7461
    @rakshitashetty7461 2 years ago +1

    When I do it in Colab, how do I set the working directory? I understood the path for the base directory, but I'm unable to do it for the working directory... please help

    • @HackersRealm
      @HackersRealm  2 years ago +1

      For colab, you can mount the drive and give the dataset path directly to use it

  • @satyamrawat4079
    @satyamrawat4079 5 months ago +1

    Sir, I am getting this error --> TypeError: `output_signature` must contain objects that are subclass of `tf.TypeSpec` but found which is not.
    And version "2.6.2" of TensorFlow is not available. Is anyone else facing this same issue? How to solve this?

    • @HackersRealm
      @HackersRealm  5 months ago

      It's resolved; please check the GitHub code for the latest update

  • @user-kv8oh8lx7y
    @user-kv8oh8lx7y 1 year ago +1

    you are amazing

  • @muhammedmehdi8893
    @muhammedmehdi8893 8 months ago

    I have a question: why are we using both image features and sequences from captions? We could use just the image features to produce captions; after VGG16 we could use a Bi-LSTM and get our output.

    • @HackersRealm
      @HackersRealm  8 months ago

      Could you explain the last few lines in detail?

  • @aditya95775
    @aditya95775 7 months ago

    Could you please create a video captioning model using the MSR-VTT dataset? It would be very helpful for my major project, which is due in 2 weeks. Thank you sir

  • @daniasalameh8579
    @daniasalameh8579 1 year ago

    Really appreciate your videos! I want to ask: what if we want the system to answer the user's query about the text file ID, and then generate the picture file that the ID represents? How can we change the code?

    • @HackersRealm
      @HackersRealm  1 year ago

      I didn't get the full context here, could you type it fully?

  • @sid8777
    @sid8777 1 year ago +1

    What's your IDE? Looks pretty cool

  • @syedhussainshah3766
    @syedhussainshah3766 2 years ago +1

    Hello sir, image captioning has been done before, and my project is on video captioning. Can you please make a video or guide with the same procedure but for video captioning?

    • @HackersRealm
      @HackersRealm  2 years ago

      Sure I will add that to the list

    • @syedhussainshah3766
      @syedhussainshah3766 2 years ago

      @HackersRealm sir, please can you make it quickly? I have little time remaining and I'm really worried about my project. I would be really thankful, sir

  • @pratyushpandey6139
    @pratyushpandey6139 1 year ago +1

    nice

  • @aniketpatra4474
    @aniketpatra4474 9 months ago

    Hi, thanks a lot for this awesome tutorial. Can you please make a tutorial on how to deploy this model on a cloud platform, e.g. AWS?

    • @HackersRealm
      @HackersRealm  9 months ago

      I have already made a local deployment video for a basic ML model... I will try to make a video for cloud deployment soon

  • @Suchithrads2003-rb5sm
    @Suchithrads2003-rb5sm 4 months ago

    Can you make a video on software installation and environment setup for image caption generation?

    • @HackersRealm
      @HackersRealm  4 months ago

      You can use a Kaggle notebook, which is an online IDE; it's simple to use, as I showed in the video

  • @TheYashO
    @TheYashO 1 year ago +1

    Thank you for such a nice implementation and explanation. I have one doubt: can you please guide me on the changes needed to get captions for random internet images? Thank you

    • @HackersRealm
      @HackersRealm  1 year ago

      The code snippet is already available on my website; the link is in the description. For better results, you have to train with more images.

  • @akulasaimanasa3344
    @akulasaimanasa3344 1 year ago

    I am getting a gaierror when running the cell that creates the model. Could you please help me?

  • @prajaktadhamanskar21
    @prajaktadhamanskar21 9 months ago

    Getting an error at this line: "yhat = model.predict([image, sequence], verbose=0)"
    ValueError: Layer "model" expects 1 input(s), but it received 2 input tensors. Inputs received: [, ]

    • @HackersRealm
      @HackersRealm  9 months ago

      Have you used the same notebook to train and test?

  • @litalshytrit1490
    @litalshytrit1490 2 years ago +1

    I have a few questions please.
    How did you choose the hyperparameters of the model?
    Why is the decoder after the encoder?
    Why use a dropout layer for the images if they're already gone through VGG16?
    How can I add a validation_data in the fit function? It shows a compatibility error.
    Thanks!

    • @HackersRealm
      @HackersRealm  2 years ago

      You can change the model parameters or layers for experimentation too, but make sure the sequence of flow does not break

  • @winwiths.g6155
    @winwiths.g6155 1 year ago

    Hey, great explanation! Can you please tell me how I can reduce the model complexity to run it on a Raspberry Pi 4, and how to run this image captioning through my webcam?

    • @biancaar8032
      @biancaar8032 10 months ago

      Yes, you can run it on an RPi by converting the model.hdf5 file into a TFLite file... To do it with a webcam, you have to capture each frame and pass it as input to the model using cv2

  • @kannan1427
    @kannan1427 1 year ago

    Is it necessary to run the code every time we open it, or can we save the trained model?

  • @pankajghaywat
    @pankajghaywat 1 year ago

    Hello, what changes I need to do if I want to implement video captioning i.e. generating captions for short video clips?

    • @HackersRealm
      @HackersRealm  1 year ago +1

      The whole structure has to be changed... from features to the model. It will be a big task for sure

  • @soumyasingh8500
    @soumyasingh8500 11 months ago

    Hi, whenever the session ends and I restart or resume it, it loads all the data and trains again, so it takes 3 hours again. Even after saving the model, it does the same. What should I do?

    • @HackersRealm
      @HackersRealm  11 months ago

      If you save the model, you can skip some of the steps used for training. Otherwise saving the model is of no use.

  • @NareshBalla7
    @NareshBalla7 9 months ago

    Hi, Thanks for the tutorial.
    I used your code without modifications and it is generating the same caption for every image.
    "startseq two people are sitting on the street endseq"
    I didn't change anything in the code. Imported the dataset and using kaggle.
    What should I change for the model to predict correctly?

    • @HackersRealm
      @HackersRealm  9 months ago

      Is this occurring for all the images, including the ones I tested in the video?

  • @dotnet8925
    @dotnet8925 1 year ago

    How did you add Kaggle data to a Jupyter notebook? Which notebook version is this?

    • @HackersRealm
      @HackersRealm  1 year ago

      If you go to the dataset and click New Notebook in Kaggle, it will automatically add the dataset to that notebook

  • @enverylmaz5566
    @enverylmaz5566 8 months ago

    Hello. I am doing this project. In addition to this code, I want to write a caption and have it find the closest image (for example: "two dogs are running"). When I write this caption, it should find the closest image.

    • @HackersRealm
      @HackersRealm  8 months ago

      You could predict captions for all the images and use text similarity to find the closest one. Other than that, there are a few other ways this can be done
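
The text-similarity approach suggested in this reply could look roughly like the sketch below. It assumes captions have already been predicted for every image and stored in a dict; the names `jaccard` and `closest_image` and the sample captions are illustrative, and word-overlap (Jaccard) similarity is just one simple choice of measure.

```python
def jaccard(a, b):
    """Word-overlap similarity between two captions, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def closest_image(query, predicted_captions):
    """Return the image id whose predicted caption best matches the query."""
    return max(predicted_captions,
               key=lambda k: jaccard(query, predicted_captions[k]))

# Hypothetical predicted captions, keyed by image id.
predicted_captions = {
    "img_1": "two dogs are running on the grass",
    "img_2": "man in black shirt is walking down the street",
    "img_3": "child is playing in the water",
}
best = closest_image("two dogs are running", predicted_captions)
```

Because the search runs over predicted captions, its quality is bounded by the captioning model; embedding-based similarity would be a natural next step if word overlap proves too crude.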

    • @enverylmaz5566
      @enverylmaz5566 8 months ago

      @HackersRealm How can I do this? Can you help me?

  • @pradyumnasushanth4430
    @pradyumnasushanth4430 11 місяців тому +1

    Sir I am getting an error in training the model i.e Graph execution error . What to do sir

    • @HackersRealm
      @HackersRealm  11 місяців тому +1

      Are you using the same code in kaggle notebook?

    • @pradyumnasushanth4430
      @pradyumnasushanth4430 11 місяців тому

      No Iam running in Jupyter notebook and yes I have written same code

    • @HackersRealm
      @HackersRealm  11 months ago +1

      @@pradyumnasushanth4430 Then it might be a module error on your local machine. You will have to check and resolve the error locally, or you could run it on Kaggle.

    • @pradyumnasushanth4430
      @pradyumnasushanth4430 11 months ago

      ok
      @@HackersRealm

  • @keerthanarajendran5791
    @keerthanarajendran5791 3 months ago

    How do I fix "gaierror" when extracting image features? Please help.

  • @karimbaig8573
    @karimbaig8573 1 year ago

    How can this be done for a dense captioning task?

  • @006_TAMOGHNASADHUKHAN
    @006_TAMOGHNASADHUKHAN 9 months ago

    Can you provide the link to the research paper?

  • @charltondsouza9140
    @charltondsouza9140 2 years ago

    Hi, do you have attention mechanism code applied to this same code? I am not quite sure how to go about it. If not, can you please explain briefly how it can be done?

    • @HackersRealm
      @HackersRealm  2 years ago

      You just have to add the corresponding layers to the text model here; the flow remains the same.

  • @garvitgupta792
    @garvitgupta792 11 months ago

    Could you please tell me which application you are using to code? I am new to this and only know about Colab and Notebook.

  • @bhushanambhore8378
    @bhushanambhore8378 1 year ago

    Hi, I wanted to ask: do we need to train the model (1:04:00) every time after opening Kaggle? Isn't there a way to save it?

    • @HackersRealm
      @HackersRealm  1 year ago

      You can save the model using the model.save method.
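
A minimal sketch of the "train once, then reload" pattern this reply points to. It is not the author's code: `load_or_train` and the stub functions are hypothetical names, and pickle stands in for the real model so the sketch runs without TensorFlow. With a Keras model, the save step would be `model.save(path)` and the load step would be `tensorflow.keras.models.load_model(path)`.

```python
import os
import pickle
import tempfile


def load_or_train(path, train_fn, save_fn, load_fn):
    """Load a saved model from `path` if it exists; otherwise train, save, and return it."""
    if os.path.exists(path):
        return load_fn(path)
    model = train_fn()
    save_fn(model, path)
    return model


# Stand-ins so the sketch runs without TensorFlow (assumptions for illustration).
def train_stub():
    return {"weights": [0.1, 0.2, 0.3]}


def save_stub(model, path):
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_stub(path):
    with open(path, "rb") as f:
        return pickle.load(f)


model_path = os.path.join(tempfile.mkdtemp(), "best_model.pkl")
first = load_or_train(model_path, train_stub, save_stub, load_stub)   # trains and saves
second = load_or_train(model_path, train_stub, save_stub, load_stub)  # loads from disk
print(first == second)  # True
```

The saved file only helps if it is written somewhere that survives the session, so the path has to be chosen with that in mind.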

  • @Manojkumar-vh4tc
    @Manojkumar-vh4tc 2 years ago

    Here you have used an 8K dataset along with the captions, but when I give it a new image the model does not work. If it should work for any image, what should the approach be? Can you outline the flow?

    • @HackersRealm
      @HackersRealm  2 years ago +1

      If you want to test with a new image, you can try the same flow with the flickr32k dataset; that will improve your results.

    • @Manojkumar-vh4tc
      @Manojkumar-vh4tc 2 years ago

      @@HackersRealm Thanks.
      Should the number of epochs and the batch size be higher with a GPU?

    • @HackersRealm
      @HackersRealm  2 years ago

      @@Manojkumar-vh4tc For bigger networks, a batch size of 16 or 32 usually works well.

  • @tuppapandu7916
    @tuppapandu7916 1 day ago

    Hello sir, if we give it an image from our own gallery, will it work out? i am output

    • @HackersRealm
      @HackersRealm  1 day ago

      You could use the flickr32k dataset to get better results for that. The code in the description has been updated to predict from a web URL.

  • @noorulameenasm
    @noorulameenasm 1 year ago

    When I run the epochs, it shows a ValueError.

  • @tanviladdha4120
    @tanviladdha4120 2 years ago

    FileNotFoundError: [Errno 2] No such file or directory: '/kaggle/input/flickr8k/Images'. I am getting this error even though I have the dataset folder and the project file in the same place. I am trying this in Jupyter Notebook. Can you please tell me what I am doing wrong?

    • @HackersRealm
      @HackersRealm  2 years ago +1

      That seems correct. If you're using a local machine, try a different folder structure.

  • @shouryatyagi8947
    @shouryatyagi8947 1 year ago

    Hey, during model training I am getting an error saying it failed to convert a NumPy array to a tensor.

  • @IshaKarn-e6g
    @IshaKarn-e6g 9 months ago

    Sir, after successfully importing the dataset using a Kaggle token, all steps proceeded smoothly up to the 'summarize' stage. However, an error appeared during the 'extracting' step: "No such file or directory: '/kaggle/input/flicr8k/Images'". Can you please guide me?

    • @HackersRealm
      @HackersRealm  9 months ago

      Are you using the notebook in the Kaggle environment as shown in the video?

    • @IshaKarn-e6g
      @IshaKarn-e6g 9 months ago

      @@HackersRealm I am using Google Colab.

  • @VinoconTino
    @VinoconTino 1 year ago

    Hey, is it also possible to generate a longer description than only one sentence?

    • @HackersRealm
      @HackersRealm  1 year ago +1

      Yes, if you train the whole model with longer descriptions, the model can predict longer descriptions.

  • @aryamolvh3010
    @aryamolvh3010 4 months ago

    I ran the code in Jupyter Notebook but got this error:
    FileNotFoundError Traceback (most recent call last)
    Cell In[7], line 5
    2 features = {}
    3 directory = os.path.join(BASE_DIR, 'Images')
    ----> 5 for img_name in tqdm(os.listdir(directory)):
    6 # load the image from file
    7 img_path = directory + '/' + img_name
    8 image = load_img(img_path, target_size=(224, 224))
    FileNotFoundError: [WinError 3] The system cannot find the path specified: '/kaggle/input/flickr8k\\Images'
    Please explain how I can fix this error.

    • @HackersRealm
      @HackersRealm  4 months ago

      If you're running this notebook locally, you have to change the directory path accordingly.
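
A sketch of one way to make the dataset path portable between Kaggle and a local machine, offered as an assumption about how the fix could look rather than as the notebook's actual code. `/kaggle/input/flickr8k` is the path from the error above; the other candidate paths and the `resolve_base_dir` helper are hypothetical and should be replaced with your own locations. Building paths with `os.path.join` also avoids mixing `/` and `\` on Windows.

```python
import os


def resolve_base_dir(candidates):
    """Return the first candidate directory that contains an 'Images' subfolder."""
    for candidate in candidates:
        if os.path.isdir(os.path.join(candidate, "Images")):
            return candidate
    raise FileNotFoundError(
        "No dataset folder with an 'Images' subfolder found in: " + ", ".join(candidates)
    )


# The first path is the Kaggle location; the others are hypothetical local
# locations used only for illustration.
CANDIDATES = [
    "/kaggle/input/flickr8k",
    os.path.join(os.getcwd(), "flickr8k"),
    os.path.join(os.path.expanduser("~"), "datasets", "flickr8k"),
]

try:
    BASE_DIR = resolve_base_dir(CANDIDATES)
    print("Using dataset at", BASE_DIR)
except FileNotFoundError as err:
    # Fail early with a readable message instead of deep inside the feature loop.
    print(err)
```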

  • @taruntammana7960
    @taruntammana7960 1 year ago

    Sir, at the 17th cell the output contains only the start and end tokens; the caption between them is missing (after clean(mapping)). Please tell me how to solve this.

    • @HackersRealm
      @HackersRealm  1 year ago

      Are you using the same notebook and the dataset?

  • @patilsanket644
    @patilsanket644 1 year ago

    Hey,
    ModuleNotFoundError: No module named 'tensorflow.security'
    I am getting this error while importing, even though I have installed tensorflow. Please show me a way!

    • @HackersRealm
      @HackersRealm  1 year ago

      If you're running locally, you can uninstall and reinstall the module, or create a new environment and install the module there.

  • @dotnet8925
    @dotnet8925 1 year ago +1

    Can this be done in Visual Studio instead of JupyterLab?

    • @HackersRealm
      @HackersRealm  1 year ago

      Yes. You just need to modify a few things, such as print statements, or wrap a few things in functions. You can use any IDE you want.

    • @dotnet8925
      @dotnet8925 1 year ago

      @@HackersRealm What do we do about the base directory and working directory in Visual Studio?

    • @HackersRealm
      @HackersRealm  1 year ago

      @@dotnet8925 You just have to point to the dataset folder. Please change the code to match the folder structure you're using.

  • @harshith24
    @harshith24 6 months ago

    Can you please give the versions of the packages you installed? I am trying to build a user interface using Streamlit in PyCharm, and the versions should match.

    • @HackersRealm
      @HackersRealm  6 months ago

      Sorry, I didn't note the package versions for this.

  • @IfrahRaoofdcs
    @IfrahRaoofdcs 2 years ago +1

    Hello! Thanks for your video. I am trying your code, but while extracting features I am getting this error: "cannot identify image file". Can you please help me fix it?

    • @HackersRealm
      @HackersRealm  2 years ago

      I think an image may be corrupted. Try removing the corrupted image and running the process again.
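
Instead of hunting for the corrupted file by hand, the loading loop can skip unreadable files and report them. The sketch below is an illustration under assumptions: `load_readable_images` and `demo_loader` are hypothetical names, and the loader is passed in as a parameter so the example runs without Keras or Pillow. As far as I know, Pillow's `UnidentifiedImageError` (the source of the "cannot identify image file" message) is a subclass of `OSError`, so catching `OSError` should cover it when the real loader is Keras's `load_img`.

```python
import os
import tempfile


def load_readable_images(directory, loader):
    """Apply `loader` to every file in `directory`, skipping files it cannot read.

    Returns (loaded, skipped): a dict of file name -> loaded image, and a list
    of file names whose loading raised OSError.
    """
    loaded, skipped = {}, []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            loaded[name] = loader(path)
        except OSError:
            skipped.append(name)
    return loaded, skipped


def demo_loader(path):
    """Stand-in loader: treats files that do not start with b'IMG' as unreadable."""
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(b"IMG"):
        raise OSError("cannot identify image file " + repr(path))
    return data


# Build a tiny throwaway folder with one readable and one corrupted file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "good.jpg"), "wb") as f:
    f.write(b"IMG-data")
with open(os.path.join(demo_dir, "bad.jpg"), "wb") as f:
    f.write(b"garbage")

loaded, skipped = load_readable_images(demo_dir, demo_loader)
print("skipped:", skipped)  # skipped: ['bad.jpg']
```

The `skipped` list tells you which files to delete or re-download before running feature extraction again.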

  • @lalitagarwal9155
    @lalitagarwal9155 5 months ago

    Good evening sir. I am building a food website where I want to implement a feature that takes a food image from the user, generates a caption for that image, and then searches the database using that caption.
    So my question is: can I use the same code with the Food-101 dataset to generate a name for the food image the user uploads?

    • @HackersRealm
      @HackersRealm  5 months ago

      If you have a similar dataset, you can train the model.

  • @revathik4143
    @revathik4143 11 months ago

    While extracting the image features, I am getting an error. What should I do to fix it?

  • @YasinShafiei86
    @YasinShafiei86 2 years ago

    I downloaded the code. Why is it not working?
    When I try to train, it gives me an error.

  • @user-vt6yw2td3v
    @user-vt6yw2td3v 8 months ago

    Can you please tell me which algorithms are used in image captioning?

    • @HackersRealm
      @HackersRealm  8 months ago

      I have used VGG and LSTM models for the neural network.

  • @abhilashsbharadwaj8630
    @abhilashsbharadwaj8630 2 years ago

    Can we add audio to it? I mean it should read out the generated caption.
    If the caption is "A man is driving a car",
    the audio must read the same.

    • @HackersRealm
      @HackersRealm  2 years ago

      Yes, you can do it using text-to-speech.

  • @prodevmahi4901
    @prodevmahi4901 1 year ago

    You did not show how to upload the dataset to Kaggle in the same location as yours. Please help.

    • @HackersRealm
      @HackersRealm  1 year ago +1

      If you go to the dataset link and click new notebook, the dataset will be there automatically!!!

    • @prodevmahi4901
      @prodevmahi4901 1 year ago +1

      @@HackersRealm Oh, I had created a new notebook separately, not from the dataset, and that is why I faced this problem. By the way, your clarification helped, thank you.

  • @user-cv6md9jo4c
    @user-cv6md9jo4c 6 months ago

    Why are you not splitting the data into training and testing data?

    • @HackersRealm
      @HackersRealm  6 months ago

      To check how the model is performing, we need test data that is not present in the training set.
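
The point about held-out test data can be made concrete with a simple split over image ids. This is a generic sketch, not taken from the notebook: `split_ids`, the 90/10 ratio, the seed, and the example ids are all assumptions for illustration.

```python
import random


def split_ids(image_ids, train_fraction=0.9, seed=42):
    """Shuffle image ids reproducibly and split them into train and test lists."""
    ids = sorted(image_ids)              # fixed starting order for reproducibility
    random.Random(seed).shuffle(ids)     # local RNG, does not touch global state
    split = int(len(ids) * train_fraction)
    return ids[:split], ids[split:]


image_ids = ["img_%03d" % i for i in range(100)]  # hypothetical ids
train, test = split_ids(image_ids)

print(len(train), len(test))        # 90 10
print(set(train).isdisjoint(test))  # True
```

Splitting by image id rather than by caption matters here because each image has several captions; splitting by caption would let the same image appear on both sides.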

  • @Hodemaru1198
    @Hodemaru1198 3 months ago

    Hey brother, which code editor did you use?

  • @champav2982
    @champav2982 5 months ago

    Hi sir,
    Can we use Jupyter Notebook for this project?

  • @mdmujeeb3670
    @mdmujeeb3670 1 year ago

    I am getting a URL fetch failure while loading the VGG16 model. Please tell me what to do.

    • @HackersRealm
      @HackersRealm  1 year ago

      Please enable internet access in the Kaggle settings. It's in the right pane of the notebook.