180 - LSTM Autoencoder for anomaly detection

  • Published 29 Nov 2024

COMMENTS • 136

  • @yonadabjaredguzmanmendoza1576
    @yonadabjaredguzmanmendoza1576 1 year ago +3

    Your content is awesome. It's really helping me understand more ML concepts, because you don't stop at the theory but move on to practice (that's pure gold for me). Thanks for sharing all of this knowledge with us!

  • @maclovesgeet
    @maclovesgeet 2 years ago +1

    Thank you. I could follow your story even though I am not a data scientist. You have a unique skill for explaining something complex in simple words with just enough detail.

  • @youngzproduction7498
    @youngzproduction7498 3 years ago +1

    Your explanation is simple but clear. Thanks for your effort.

  • @kmiyasar
    @kmiyasar 3 years ago +15

    The video is interesting. I have a doubt.
    Given that an autoencoder is trained with the same data as input and output, why are trainX and trainY passed to the fit command?
    Shouldn't it be trainX, trainX?

    • @wanderfj
      @wanderfj 3 years ago +6

      Same doubt here. Thanks.

    • @olivierlourme9521
      @olivierlourme9521 2 years ago +3

      I share this doubt. With model.fit(trainX, trainY), nothing works like in the video from that point. With model.fit(trainX, trainX), we are really close to the results of the video.

    • @traveler6062
      @traveler6062 1 year ago +2

      Yes, it should be trainX, trainX. I tried it and the results improved.
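
A minimal NumPy sketch of the point raised in this thread, assuming the tutorial's 30-step windows over a single feature. The helper `to_sequences` and the toy sine series are illustrative stand-ins, not the video's exact code. It shows that the next-value target `trainY` has a different shape from the sequence a reconstruction model outputs, which is why the commenters fit with `trainX` as both input and target.

```python
import numpy as np

def to_sequences(values, seq_size=30):
    """Slice a (n, features) array into overlapping windows and next-step targets."""
    x, y = [], []
    for i in range(len(values) - seq_size):
        x.append(values[i:i + seq_size])   # window of seq_size consecutive steps
        y.append(values[i + seq_size])     # the value right after the window
    return np.array(x), np.array(y)

series = np.sin(np.linspace(0, 20, 200)).reshape(-1, 1)  # toy univariate series
trainX, trainY = to_sequences(series, seq_size=30)

print(trainX.shape)  # (170, 30, 1)
print(trainY.shape)  # (170, 1)

# Reconstruction (autoencoder): the target is the window itself.
#   model.fit(trainX, trainX, ...)
# Forecasting: the target is the next value.
#   model.fit(trainX, trainY, ...)
```

With a decoder that ends in a per-timestep output of shape (30, 1), only `trainX` matches the output shape element for element.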

  • @samarafroz9852
    @samarafroz9852 4 years ago +2

    I'm highly inspired by your thoughts and your tutorials. You're the best YouTuber for deep learning and medical image processing. Sir, one of the most promising tasks done by deep generative models (AAE) is generating novel drug molecules, trained on existing datasets like MOSES and ZINC. Research shows that it's at the forefront of deep learning applications in healthcare; in fact, this is the biggest research topic of AI in healthcare in 2020. Please make a tutorial on that as well. I'm waiting, sir.

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +2

      In summary you are recommending something like VAE for generating new molecules?

    • @samarafroz9852
      @samarafroz9852 4 years ago

      @DigitalSreeni Yes, sir.

  • @puneetsharma4370
    @puneetsharma4370 3 years ago

    Thanks for sharing, Sreeni. I wanted to point out that the LSTM "units" argument is the number of hidden units in the LSTM cell. It's not the number of LSTM cells in that particular layer (comments at 4:00).

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      Thanks for pointing that out, Puneet. The terminology for LSTM is defined in a confusing way. Here it refers to horizontal arrays of LSTM layers (units).

    • @puneetsharma4370
      @puneetsharma4370 3 years ago +1

      @DigitalSreeni On a separate note, correct me if I am wrong: the inputs and outputs for an autoencoder model should be the same, right? In model.fit(input, output, ...), input and output should be the same for autoencoders.
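
A small worked calculation may help with the `units` discussion above. It assumes the standard LSTM formulation with four gates and one bias vector per gate, and layer widths of 128 and 64 as mentioned elsewhere in the thread; the function name `lstm_param_count` is made up for this sketch. Note that the count depends on the input width and on `units`, not on the number of timesteps.

```python
def lstm_param_count(input_dim, units):
    """Trainable weights of one LSTM layer with hidden size `units`."""
    # Four gates (input, forget, cell candidate, output), each with an
    # input kernel, a recurrent kernel, and a bias vector.
    return 4 * (units * input_dim + units * units + units)

print(lstm_param_count(input_dim=1, units=128))   # 66560
print(lstm_param_count(input_dim=128, units=64))  # 49408
```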

  • @niksable
    @niksable 3 years ago

    Thank you for putting this out there. I was putting off building an LSTM based auto-encoder, but you broke it down very well and pushed me to get it done.

    • @chymoney1
      @chymoney1 2 years ago

      it is very simple with Keras

  • @BROHAMMER_OK
    @BROHAMMER_OK 4 years ago +1

    Hello, the TimeDistributed wrapper is not needed for Dense layers, but I guess making it explicit makes the tutorial more understandable. Nice video.
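
The comment above can be illustrated with plain NumPy, under the assumption that a dense layer is just a matrix product on the last axis plus a bias. The weights `W` and `b` below are random stand-ins, not values from the video's model. The sketch compares applying the same weights timestep by timestep with applying them to the whole 3D tensor at once.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 30, 128))   # (batch, timesteps, features) from the decoder
W = rng.normal(size=(128, 1))       # stand-in weights of a Dense(1) layer
b = np.zeros(1)

# TimeDistributed view: apply the same dense weights to each timestep separately.
per_step = np.stack([x[:, t, :] @ W + b for t in range(x.shape[1])], axis=1)

# Plain dense view: the matrix product acts on the last axis of the whole tensor.
all_at_once = x @ W + b

print(per_step.shape, all_at_once.shape)   # (4, 30, 1) (4, 30, 1)
print(np.allclose(per_step, all_at_once))  # True
```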

  • @chymoney1
    @chymoney1 2 years ago

    Wow this was fantastic! I didn't even know what an autoencoder was before watching

  • @polterp
    @polterp 3 years ago +1

    This was greatly educational, and surprisingly in-depth and easy to digest. Thank you a lot and good luck with your channel :)

  • @InformatikInsider
    @InformatikInsider 1 year ago +1

    Well done! Thanks for this nice video! Greetings from Germany

  • @antoniocamposrodriguez3726
    @antoniocamposrodriguez3726 9 months ago +2

    I'm not sure if it is a mistake or if I misunderstood something. After building the encoder-decoder block, you train the model as if it were predicting the labels, in the line model.fit(trainX, trainY). Afterwards, you measure the MAE between the original data and the reconstruction in the line np.mean(np.abs(trainPredict - trainX), axis=1). However, this is not the error between the reconstruction and the original data, but rather the error between the original data and the predicted label, isn't it? Shouldn't you measure the MAE between the original data and the output of the TimeDistributed layer, which has the same shape as the original input data?

  • @ajit_edu
    @ajit_edu 1 month ago

    I have been following your lessons. Many thanks. In the code, you have normalized the test data as well. Shouldn't only the training data be normalized?
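
A hedged sketch of the usual convention behind this question: the model only ever sees scaled values, so the test data has to go through the same transformation, but the scaling statistics are computed from the training split alone. The arrays below are toy numbers, and the manual mean/std arithmetic stands in for a fitted scaler.

```python
import numpy as np

train = np.array([10.0, 12.0, 11.0, 13.0, 14.0])
test = np.array([15.0, 30.0, 16.0])

# Fit the scaling statistics on the training split only.
mean, std = train.mean(), train.std()

# Apply the same transformation to both splits.
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.round(2))
print(test_scaled.round(2))
```

Because the test values are scaled with the training statistics, an unusually large test value stays unusually large after scaling instead of being normalized away.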

  • @naasvanrooyen2894
    @naasvanrooyen2894 1 year ago +3

    Thanks a lot for these videos. Just a question: should trainMAE not be calculated with trainY instead of trainX? I'm a bit confused.

    • @GootsGaming
      @GootsGaming 1 year ago

      I think not, because trainMAE is based on the difference between trainX and trainX' (the value predicted by the autoencoder).
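
To make the reply above concrete, here is a minimal sketch of a per-window reconstruction error, using the expression quoted earlier in the comments (`np.mean(np.abs(trainPredict - trainX), axis=1)`). The arrays are random stand-ins for real data and model output, and taking the maximum training error as a threshold is only one hypothetical choice.

```python
import numpy as np

rng = np.random.default_rng(1)
trainX = rng.normal(size=(100, 30, 1))  # input windows
# Stand-in for model.predict(trainX): the input plus small reconstruction noise.
trainPredict = trainX + rng.normal(scale=0.1, size=trainX.shape)

# One reconstruction error per window: average over the 30 timesteps.
trainMAE = np.mean(np.abs(trainPredict - trainX), axis=1)
print(trainMAE.shape)  # (100, 1)

# Hypothetical threshold: the largest error seen on the training data.
max_trainMAE = trainMAE.max()
is_anomalous = trainMAE > max_trainMAE  # all False on the training set itself
```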

  • @ismailalpaydemir4511
    @ismailalpaydemir4511 3 years ago

    Thanks for these videos, I really love learning something from your codes and videos.

  • @mindbodyzaid7814
    @mindbodyzaid7814 3 years ago +7

    If the LSTM is reconstructing the same input sequence, why do you create an X and Y? Shouldn't the input and output be both the "X"?

    • @mehul4mak
      @mehul4mak 2 years ago +2

      What's the answer?

    • @abdoulazizmaiga9848
      @abdoulazizmaiga9848 1 year ago +1

      They are not the same because the output Y will be slightly different from the input X due to errors in the encoding and decoding process.
      But in an ideal case you would get X = Y.

    • @yanyanp
      @yanyanp 1 year ago

      Does Y predict the future? If so, over what time frame, 1 day or 30 days?

  • @oussamacheta7106
    @oussamacheta7106 3 years ago +1

    Thank you, it looks like GE got hit hard by the 2008-2009 economic crash and maybe by Covid-19 in 2020...

  • @jiajun898
    @jiajun898 3 years ago +4

    How do I modify the above example to take in 3 inputs, i.e. multivariate instead of univariate? I am new to this and would appreciate your help.
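
One possible starting point for the multivariate case, sketched with NumPy only. It assumes the same 30-step windowing as the univariate example and a toy 3-feature array; the model itself is not shown. The only structural change on the data side is that the last axis grows from 1 to the number of features.

```python
import numpy as np

n_steps, n_features, seq_size = 200, 3, 30
data = np.random.default_rng(2).normal(size=(n_steps, n_features))  # toy 3-feature series

# Each window keeps all features, so the last axis is 3 instead of 1.
windows = np.stack([data[i:i + seq_size] for i in range(n_steps - seq_size)])
print(windows.shape)  # (170, 30, 3)

# On the model side (assumption): the input shape becomes (seq_size, n_features)
# and the final per-timestep output layer needs n_features units, so that the
# reconstruction has the same shape as `windows`.
```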

  • @priyal_001
    @priyal_001 1 year ago

    The best video I have ever seen, great!

  • @Breno9629
    @Breno9629 5 months ago

    Hey Sir, thank you for the video. If you allow me to ask some questions: why do we have to pass both X and Y when training the model? Is the model reconstructing the original sequence and also trying to predict the next value based on the 30 values provided? (I am asking because I was expecting that we would pass the same sequence, similar to what we do with a vanilla autoencoder.) It seems that we input a sequence and try to predict the next value for it while also reconstructing the initial sequence.
    When we calculate the error, it is based on the reconstruction process, am I right?
    Thank you in advance!

  • @bonadio60
    @bonadio60 3 years ago

    Very clear explanation, fantastic video, thank you very much.

  • @beagle989
    @beagle989 3 years ago

    When I see DigitalSreeni I know I'm in good hands

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      Thanks for the trust. Now I am under pressure to live up to your expectations :)

  • @sangeetaoswal70
    @sangeetaoswal70 3 years ago

    Thanks, sir. This video gave me just the starting point I needed to work on time series anomaly detection.

  • @navinbondade5365
    @navinbondade5365 4 years ago +3

    I'm also waiting for the video in which you will cover different types of GANs, for example StyleGAN, Conditional GAN or CycleGAN.

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +2

      On my list for a long time. Thanks for suggesting.

  • @navinbondade5365
    @navinbondade5365 4 years ago +1

    I'm waiting for your video on Variational Autoencoders in which you show how to put classes on Mona Lisa, and cover image super-resolution and style transfer.

  • @adeadeyoutube1653
    @adeadeyoutube1653 9 months ago

    Hi, thank you for the teachings and videos.

  • @chetanbulla9185
    @chetanbulla9185 3 years ago +4

    Nice video... Please tell me how to find anomalies in multivariate time series.

  • @PavanKumar-hp1el
    @PavanKumar-hp1el 2 years ago +1

    I have a doubt here: in autoencoders the output is also x, so why did you train the model with trainX and trainY instead of trainX and trainX?

  • @mohammadyahya78
    @mohammadyahya78 3 years ago

    This is extremely helpful. Thank you very much.

  • @utkarshsharma9708
    @utkarshsharma9708 2 years ago +2

    Thank you for a very informative video.
    I have one question (anyone can answer it):
    What advantage do autoencoders offer for anomaly detection over classical ML algorithms?

    • @biplabroy41
      @biplabroy41 2 years ago +1

      They can work with unlabeled data, and for anomaly detection the model does not need to be shown beforehand what an anomaly actually looks like.

  • @7thdayadventist562
    @7thdayadventist562 2 years ago +1

    Sir, could you please provide a video on LSTM Variational Autoencoders for multivariate time series?

  • @alessandroaquino5027
    @alessandroaquino5027 2 years ago +1

    If I wanted to use the LSTM autoencoder with an input dataset containing text rather than a temporal sequence, could it be done?
    For example, with a dataset containing fake news.

  • @varunbalaji6998
    @varunbalaji6998 3 years ago +2

    First of all, thank you so much, sir. I have a question on how to choose the scaler. In other words, if I have a dataset but don't know which scaler to use, on what basis should I choose one? What is the difference between the standard scaler and the min-max scaler? And why only these two scalers; are there alternatives that can be used for anomaly detection?
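
As a rough illustration of the difference asked about above, the two scalings can be written out by hand. This is a sketch on a toy array with one extreme value, not a recommendation for either scaler: standardization centers the data and divides by the standard deviation, while min-max scaling maps the observed range onto [0, 1].

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # toy data with one extreme value

standardized = (x - x.mean()) / x.std()        # zero mean, unit variance
min_max = (x - x.min()) / (x.max() - x.min())  # mapped onto [0, 1]

print(standardized.round(2))
print(min_max.round(2))
```

In the min-max result the extreme value defines the upper end of the range, so the remaining points are squeezed close to 0.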

  • @fernandocabrera9072
    @fernandocabrera9072 2 years ago

    Thank you. Very clear explanation!

  • @imeddrioua2500
    @imeddrioua2500 1 year ago

    Thank you for sharing!
    What I can't understand here is the part where we create anomaly_df.
    We know that for each sequence of 30 observations we have a single MAE.
    So how can I detect which of these 30 observations is the anomaly within a sequence?

    • @GootsGaming
      @GootsGaming 1 year ago

      I think for each of the 30 observations you have one MAE, since MAE is calculated from 2 values: the observed value and the predicted value. What the autoencoder predicted was a vector of 30 values, trying to rebuild the observed values.
      Hope I made myself clear.
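
One way to look inside a window, sketched under stated assumptions: the arrays are synthetic, the reconstruction is perfect except for one injected error, and window `i` is assumed to cover rows `i` to `i + 29` of the series. Keeping the per-observation absolute error before averaging makes it possible to see which timestep drives a window's MAE.

```python
import numpy as np

seq_size = 30
rng = np.random.default_rng(3)
testX = rng.normal(size=(50, seq_size, 1))
testPredict = testX.copy()      # stand-in for a perfect reconstruction
testPredict[10, 17, 0] += 5.0   # inject one badly reconstructed point

abs_err = np.abs(testPredict - testX)    # (50, 30, 1): error per observation
testMAE = abs_err.mean(axis=1).ravel()   # (50,): one score per window

worst_window = int(testMAE.argmax())
worst_step = int(abs_err[worst_window, :, 0].argmax())
row_in_series = worst_window + worst_step  # position in the original series

print(worst_window, worst_step, row_in_series)  # 10 17 27
```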

  • @olivierlourme9521
    @olivierlourme9521 2 years ago

    Thank you for this valuable video!
    Is it necessary to perform standardization (via StandardScaler methods) when there is only one feature?

  • @denys2698
    @denys2698 2 years ago

    How can the same anomaly detection idea be applied to data that is not a time series, for example hospital patients and their health test results?

  • @mcfrenzyo2645
    @mcfrenzyo2645 1 year ago

    Hi, thanks for your video. Please, is there a way I can pull out the encoder compressed data with the original number of rows for supervised learning? I have actually tried it and the size I got was just the sample size instead of the size of the original number of rows.

  • @DavidCH12345
    @DavidCH12345 3 years ago +1

    If I understand correctly, autoencoders are not able to detect recurring patterns. If this anomalous drop were something recurring, is there a way to take this into account?

  • @leonpilhatsch1933
    @leonpilhatsch1933 10 months ago

    Thank you very much for your content!!

  • @maaleem90
    @maaleem90 1 year ago

    That's a great video, sir. I have two things to say. First, it would be great to have a video only on TimeDistributed. Second, a query:
    Here we set return_sequences to False and then used a RepeatVector so that we can stack LSTM layers again.
    But can't we just set return_sequences to True in the first layer, so that we can eliminate the RepeatVector layer?
    Is using RepeatVector particular to LSTM autoencoders, or is it just something experimental tried for better accuracy? I mean, can we also try setting return_sequences to True and removing the RepeatVector layer?
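
The shapes involved in this question can be sketched with NumPy alone. The arrays below are random stand-ins for LSTM outputs, with a latent size of 64 and 30 timesteps assumed from the rest of the thread. The sketch shows what repeating a single encoded vector does, and how that differs from an encoder that returns one vector per timestep.

```python
import numpy as np

batch, timesteps, latent = 8, 30, 64
rng = np.random.default_rng(4)

# Encoder with return_sequences=False: one latent vector per window.
encoded = rng.normal(size=(batch, latent))

# RepeatVector(timesteps): copy that vector once per timestep so the
# decoder LSTM receives a 3D (batch, timesteps, features) input.
repeated = np.repeat(encoded[:, np.newaxis, :], timesteps, axis=1)
print(repeated.shape)  # (8, 30, 64)

# With return_sequences=True the encoder would emit a different vector at
# every timestep, so nothing forces the window through a single vector.
per_step_states = rng.normal(size=(batch, timesteps, latent))
print(per_step_states.shape)  # (8, 30, 64)
```

Both tensors have the same shape, so either one can feed a decoder LSTM; the difference is whether the whole window has to pass through one vector first.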

  • @gabrifroja5186
    @gabrifroja5186 1 year ago

    I have a multivariate dataset with 86 dimensions, instead of 1 like in the video.
    How do I compute the MAE in this case?
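
One option for the multivariate case, shown as a sketch with random stand-in arrays rather than a definitive answer: average the absolute error over both the time axis and the feature axis to get a single score per window, and keep a per-feature average as well to see which dimension contributes most.

```python
import numpy as np

rng = np.random.default_rng(5)
testX = rng.normal(size=(40, 30, 86))  # (windows, timesteps, features)
# Stand-in for model.predict(testX): the input plus small reconstruction noise.
testPredict = testX + rng.normal(scale=0.1, size=testX.shape)

abs_err = np.abs(testPredict - testX)
mae_per_window = abs_err.mean(axis=(1, 2))  # (40,): one score per window
mae_per_feature = abs_err.mean(axis=1)      # (40, 86): error per feature

print(mae_per_window.shape, mae_per_feature.shape)  # (40,) (40, 86)
```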

  • @yueyangu
    @yueyangu 1 year ago

    Thanks! But I don't understand why the model is trained to predict y, while the anomaly score is given based on MSE between y_pred and X. Shouldn't it be between y_pred and y?

    • @antoniocamposrodriguez3726
      @antoniocamposrodriguez3726 9 months ago

      I do have a similar question, I don't understand why he's training the model to predict trainY and then measuring the anomaly score between the original trainX and the predicted label instead of the reconstructed data. Maybe I misunderstood something

  • @Anna-ef4id
    @Anna-ef4id 7 months ago

    How is it possible that the timestep is 30 and the LSTM layer is 128? Shouldn't it be less than the timestep to actually encode it?

  • @Ntghd1996
    @Ntghd1996 3 years ago

    Thanks for your good tutorials and eloquence, can we also use this architecture to diagnose video data anomalies?

  • @swamchem
    @swamchem 2 years ago

    Hi Sreeni, thanks for the great video. I'm just curious: after you perform the StandardScaler transformation, how are train and test still pandas DataFrames? The data is converted to a NumPy array once you apply a transformation.

  • @malavvibhakar9001
    @malavvibhakar9001 2 years ago

    I got an error at the end:
    y = scaler.inverse_transform(test[timesteps:].Open),
    Expected 2D array, got 1D array instead
    I also tried to reshape but still got the same error.
    Could you help me with this?

  • @abulfahadsohail466
    @abulfahadsohail466 2 years ago

    Sir, I have a time series dataset in which time and vibration acceleration have been recorded. I have to classify tool faults from that dataset using an LSTM. How should I go about it?

  • @ArunKumar-fv6uw
    @ArunKumar-fv6uw 3 years ago

    How to use LSTM (or 1D CNN) to detect contextual anomalies in timeseries?

  • @akashgopikrishnan5019
    @akashgopikrishnan5019 2 years ago

    Can you explain how to do the same for supervised anomaly detection with a labeled multivariate dataset using LSTM?

  • @radityafijarpradana1484
    @radityafijarpradana1484 3 years ago

    Extremely helpful. Thanks very much.

  • @Arcziisk8
    @Arcziisk8 2 years ago

    How can we compare how different models performed when there are no labeled anomalies?

  • @studyhub3950
    @studyhub3950 1 year ago

    Firstly, thanks. My question: when the input is 30*1, i.e. 30 values, how can the output be 128? In an autoencoder we compress the data and then decode it, for example from 30 to 15 to 10, and then decode.

  • @ansumannayak3853
    @ansumannayak3853 4 months ago

    How can this be done for multivariate time series data from multiple companies?

  • @habibuallahmanzoor9051
    @habibuallahmanzoor9051 2 years ago

    I am having trouble plotting testPredict and testX. I want to see the predicted curve.

  • @kavinyudhitia
    @kavinyudhitia 1 month ago

    Great tutorial, thanks

  • @vamsikrishnabhadragiri402
    @vamsikrishnabhadragiri402 3 years ago

    Why did we use time distributed dense layer? why can't we use a normal dense layer, any specific reason?

  • @withknowledgeitriump
    @withknowledgeitriump 2 years ago

    I have a question. If I am working on a multivariate problem where I have 7 features in my data and I am using, e.g., 6 features to predict 1 feature, how should I modify the code to output 1 feature, since my trainX.shape[2] contains 6 features instead of 1?

  • @jamesmasai520
    @jamesmasai520 3 years ago

    Thank you for kindly sharing this.

  • @hanssss13
    @hanssss13 2 years ago

    I have a problem with plotting anomalies (last task). How do I solve ValueError: Expected 2D array, got 1D array instead?

    • @olivierlourme9521
      @olivierlourme9521 2 years ago

      Indeed, there are some errors. It should be:
      # Plot anomalies
      sns.lineplot(x=anomaly_df['Date'], y=scaler.inverse_transform(anomaly_df[['Close']]).flatten())
      sns.scatterplot(x=anomalies['Date'], y=scaler.inverse_transform(anomalies[['Close']]).flatten(), color='r')

  • @oli111222
    @oli111222 2 years ago

    When I search for the same data from the same time interval, I get values approximately 10 times higher than in the video. How is that possible?
    For example, at 9:04 in the video we see the Yahoo table. When I look up the values for Oct 29, 2020, I find values around 60.00, whereas the video shows 7.65.
    The currency is USD, as in the video. What is happening?

    • @olivierlourme9521
      @olivierlourme9521 2 years ago +1

      In 2021 (after this video was made), GE decided that every 8 shares that investors own will be turned into one share. You have to divide the 'close' feature by 8.
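
The adjustment described in the reply above amounts to a single division. A minimal sketch, with the ratio of 8 taken from that reply and the function name `to_pre_split_price` made up for illustration:

```python
SPLIT_RATIO = 8  # every 8 shares became 1, as described in the reply above

def to_pre_split_price(post_split_price, ratio=SPLIT_RATIO):
    """Convert a split-adjusted price back to the scale shown in the video."""
    return post_split_price / ratio

print(to_pre_split_price(60.0))  # 7.5
```

A value of about 60 today therefore corresponds to roughly 7.5 on the video's scale, which is close to the 7.65 quoted in the question.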

  • @mukhtarayusuf4787
    @mukhtarayusuf4787 1 year ago

    So inspiring! Well done. How do we get the codes please?

  • @ihebbibani7122
    @ihebbibani7122 3 years ago

    Thanks, sir, for the videos. Do you have a tutorial on how to use Plotly to show which event each anomaly corresponds to?
    Thanks in advance.

  • @sagarhm2237
    @sagarhm2237 4 years ago

    Sir, why are microscope objective lenses so small? Why can't they be made the size of a slide, so that the whole slide can be brought into focus?
    Please help with this. I need a 100x objective lens that is as large as a slide.

    • @DigitalSreeni
      @DigitalSreeni 4 years ago

      This is a basic optics question. Why do we have different camera lenses, and why not a single lens that covers a wide range? Because you compromise on quality due to many factors: optics and also chip electronics.

  • @9Manzar9
    @9Manzar9 1 year ago +1

    Isn’t this trained on next value regression and not reconstruction? Seems like you just mix the architectures and do next value prediction and then evaluate based on the regression error

    • @maaleem90
      @maaleem90 1 year ago +1

      Hey, hope you are doing well. Do you mind answering my query?
      Here, in the first layer, return_sequences is set to False and a RepeatVector is used so that another LSTM layer can be stacked.
      Is this method a standard procedure for LSTM autoencoders, or can we also try it without RepeatVector by setting return_sequences to True in the first layer?
      And do you know of any tutorial on the TimeDistributed layer?

    • @Raaj_ML
      @Raaj_ML 1 year ago

      @maaleem90 Yes, I agree. He has mixed forecasting and reconstruction. This looks wrong.

    • @maaleem90
      @maaleem90 1 year ago +1

      @Raaj_ML Thanks, brother. You got that too. That means we are really learning it.

    • @maaleem90
      @maaleem90 1 year ago

      @Raaj_ML maaleem08 is the user name.

    • @maaleem90
      @maaleem90 1 year ago

      @Raaj_ML Can we please connect on another platform so that we can talk? I don't have anyone in this field.

  • @adityahpatel
    @adityahpatel 2 years ago +1

    In all other autoencoder videos you've done .fit(x,x). Why are you doing .fit(x,y) here?

  • @JS-tk4ku
    @JS-tk4ku 4 years ago

    Your videos always mean a lot to me. Besides VAEs and autoencoders, could you make videos explaining SOMs and Boltzmann machines (unsupervised deep learning)?

  • @arnoldjanbitangjol8911
    @arnoldjanbitangjol8911 2 years ago

    Can I use this method for clustering?

  • @navinbondade5365
    @navinbondade5365 4 years ago

    Can you please make a video on a hybrid autoencoder that uses LSTM or GRU and CNN layers?

    • @DigitalSreeni
      @DigitalSreeni 4 years ago +1

      Need to think of an application, so far I haven't explored it for any of my applications.

  • @shankargonti8609
    @shankargonti8609 4 years ago

    How can we differentiate between an outlier and an anomaly in this problem?

    • @moatazshoukry6482
      @moatazshoukry6482 4 years ago +1

      As I understand it, anomaly detection is simply outlier detection, so outliers and anomalies are the same.

  • @reda8323-m3p
    @reda8323-m3p 3 years ago

    Hi, you said that you use an undercomplete autoencoder, which implies that your encoder compresses the input, i.e. the number of features in the output of the encoder should be smaller than the number of features in the input, which is not the case in your model. Can you explain why you use a latent space with a dimension higher than the input?
    Thank you in advance.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +3

      In my example, the first LSTM layer generates 128 features and we encode it to 64, which is smaller than 128 features. Then we decode it back to 128. Therefore, the short autoencoder we have goes from 128-64-64-128. You can make it bigger /deeper if you want. Autoencoder does not necessarily mean the encoded vector is smaller than the input, it sometimes happens to be smaller than input (especially for images). In summary, autoencoder takes features from large dimension to smaller dimension and reconstructs them back.
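
The sizes discussed in this exchange can be laid side by side with simple arithmetic. The sketch assumes the 30-step, single-feature window used in the video and the 128-64-64-128 layer widths quoted in the reply above; the variable names are illustrative.

```python
seq_size, n_features = 30, 1
layer_units = [128, 64, 64, 128]  # LSTM widths quoted in the reply above

values_per_window = seq_size * n_features  # numbers in one input window
latent_size = min(layer_units)             # width of the narrowest layer

print(values_per_window)                # 30
print(latent_size)                      # 64
print(latent_size < values_per_window)  # False
```

The narrowing from 128 to 64 happens between the LSTM layers, while the 64-value encoding is still larger than the 30 raw values of one window, which is the mismatch the question points at.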

  • @azra-sm4xu
    @azra-sm4xu 6 months ago

    excellent video

  • @mostafael-sayed4244
    @mostafael-sayed4244 3 years ago

    Can I use LSTM with video analysis to detect anomalies?

  • @traveler6062
    @traveler6062 1 year ago +1

    I believe it should be model.fit(trainX, trainX) instead of model.fit(trainX, trainY)

  • @FezanRafique
    @FezanRafique 3 years ago

    Subs Added, thanks for the wonderful video.

  • @farhanjavid6474
    @farhanjavid6474 7 months ago

    thank you for that 😍😍😍😍

  • @m1a2tank
    @m1a2tank 2 years ago

    Why doesn't your script work in my Colab environment? The training loss does not go below 0.3, which is much higher than in your video. For me, every value of "trainPredict" is near -0.5, whereas trainY is distributed over -1 to 4.

    • @olivierlourme9521
      @olivierlourme9521 2 years ago

      It is the same for me. In my Colab environment, the training loss is 0.4 and the validation loss is 2.4 (even after 30 epochs). Nothing in common with the 0.03 and 0.07 of the video. Why?

  • @PeaceAzugo
    @PeaceAzugo 8 months ago

    thank you

  • @sagarhm2237
    @sagarhm2237 4 years ago

    Hi sir

  • @zehra2334
    @zehra2334 2 years ago

    How can one feature become 128 features? I couldn't understand this part (Input to LSTM1). @DigitalSreeni

    • @zehra2334
      @zehra2334 2 years ago

      @DigitalSreeni

    • @zehra2334
      @zehra2334 2 years ago

      @DigitalSreeni
      @DigitalSreeni

  • @tunabediz930
    @tunabediz930 2 years ago

    Thank you very much for the tutorial.
    I have a problem with sns.lineplot (row 142). I always get below error. How can I fix it?
    ValueError: Expected 2D array, got 1D array instead:
    array=[0.57032452 0.37515913 0.19478522 ... 0.32379982 1.23183246 0.9894165 ].
    Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

    • @gnn816
      @gnn816 2 years ago

      Hello, did you manage to solve this problem? The same occurs on my own dataset

    • @malavvibhakar9001
      @malavvibhakar9001 2 years ago

      Have you found a solution?

    • @malavvibhakar9001
      @malavvibhakar9001 2 years ago

      @gnn816 Have you found a solution?

    • @gnn816
      @gnn816 2 years ago +1

      @malavvibhakar9001 Unfortunately, I did not. I tried out some things from Stack Overflow but did not find a way.

    • @olivierlourme9521
      @olivierlourme9521 2 years ago +2

      Indeed, there are some errors. It should be:
      # Plot anomalies
      sns.lineplot(x=anomaly_df['Date'], y=scaler.inverse_transform(anomaly_df[['Close']]).flatten())
      sns.scatterplot(x=anomalies['Date'], y=scaler.inverse_transform(anomalies[['Close']]).flatten(), color='r')