Losing your Loops: Fast Numerical Computing with NumPy

  • Published 9 Nov 2024

COMMENTS • 56

  • @NicolasJulioFlores
    @NicolasJulioFlores 6 years ago +17

    The fact that he showed that you can do nearest neighbors without a single loop really shows the power of numpy

  • @ericmacleod8605
    @ericmacleod8605 8 years ago +98

    This guy is python gold.

  • @joelcastellon9129
    @joelcastellon9129 9 years ago +25

    Jake is a great speaker. Enjoyed and learned a lot from his talks. Waiting for more

    • @juangutierrez7366
      @juangutierrez7366 10 months ago

      We are using his textbook in our intro to data science class, and his writing is also very informative and accessible at the same time.

  • @veganath
    @veganath 5 years ago +4

    Absolutely brilliant, still being appreciated, thank you Jake

  • @23489215
    @23489215 8 years ago +5

    Thanks, very informative, the tips make my program run a lot faster

  • @JustinHuangA1
    @JustinHuangA1 4 years ago

    awesome video. everyone starting out with numpy should watch this video. makes so much more sense now to me.

  • @sonersteiner
    @sonersteiner 2 years ago

    That is cool man, fortranmagic in ipython notebooks!!!!

  • @gholamrezadar
    @gholamrezadar 2 years ago

    Amazing video! thank you astronomer.

  • @danbrown6698
    @danbrown6698 2 years ago

    I was wondering how to reduce the for loops in my project, and I happened to come across this video 😮 Thanks a lot 😁

  • @subhendum
    @subhendum 8 years ago +1

    Great talk. Learned a lot.

  • @matti1337
    @matti1337 8 years ago

    Really great and enlightening talk.

  • @nikolaytodorov9785
    @nikolaytodorov9785 5 years ago

    What makes it fast is what makes it slow...Zen.
    Joke aside, 10x for the vid! Useful info.

  • @yyf234xcvfqew4
    @yyf234xcvfqew4 4 years ago

    Great talk.

  • @gmaffy
    @gmaffy 6 years ago +2

    Great talk. Since this talk, have any other methods been developed to make loops faster, other than numpy? Anyone?

    • @wowepic2256
      @wowepic2256 4 years ago +2

      Numba and PyPy. Also Cython.

  • @adamhendry945
    @adamhendry945 2 years ago +1

    The strategies for fast looping begin at 7:15

  • @jimmyshenmusic
    @jimmyshenmusic 4 years ago

    This is awesome. Thanks.

  • @VickiBrownatcfcl
    @VickiBrownatcfcl 5 years ago +1

    I like the embedded image of the speaker, but not when it obscures part of the current slide. ;-(

  • @JonathanObise
    @JonathanObise 4 years ago

    Amazing insight

  • @sapirAO
    @sapirAO 8 years ago

    Excellent talk.

  • @MarkJay
    @MarkJay 7 years ago

    awesome video!

  • @nodavood
    @nodavood 5 years ago +1

    Thanks. Very useful tips. But the nearest neighbors example shows a fatal flaw in losing loops. The diff array that you generated transforms your 1000*3 input into a 1000*1000*3 one. This leads to a MemoryError with larger input data. I am sorry, but having a fast loop is still a must.

    • @tejvirjogani418
      @tejvirjogani418 2 years ago

      Can you not work in batches and minimize the number of single operations?
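
The batching idea in this reply can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the function name `nearest_neighbors_batched` and the `batch_size` parameter are made up for this example. Each batch builds a temporary array of shape `(batch, n, 3)` instead of `(n, n, 3)`, so peak memory scales with the batch size while the inner work is still done by NumPy.

```python
import numpy as np

def nearest_neighbors_batched(X, batch_size=256):
    """Index of each point's nearest neighbor, computed in row batches.

    Hypothetical helper for illustration; not taken from the talk.
    """
    n = X.shape[0]
    nearest = np.empty(n, dtype=np.intp)
    for start in range(0, n, batch_size):
        stop = min(start + batch_size, n)
        # (b, 1, d) - (n, d) broadcasts to (b, n, d)
        diff = X[start:stop, np.newaxis, :] - X
        dist_sq = (diff ** 2).sum(axis=-1)  # shape (b, n)
        # exclude each point's distance to itself
        rows = np.arange(stop - start)
        dist_sq[rows, np.arange(start, stop)] = np.inf
        nearest[start:stop] = dist_sq.argmin(axis=1)
    return nearest

rng = np.random.default_rng(0)
X = rng.random((1000, 3))
nearest = nearest_neighbors_batched(X, batch_size=128)
```

The Python-level loop now runs once per batch rather than once per point, which keeps most of the speed benefit while bounding the size of the temporary array.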

  • @KirillBezzubkine
    @KirillBezzubkine 3 years ago +1

    23:45 - KNN with pure numpy

  • @sachinkaps
    @sachinkaps 9 years ago +1

    What if the data is dynamic? E.g., a few data points are added every second, so the process might start with no data at the beginning of the day and end up with millions of rows by the end of the day. This is typical for financial time series.
    I presume insertion of elements or copying would not be very efficient. Is pandas or any other implementation good enough for such use cases?

    • @fachofacho5436
      @fachofacho5436 6 years ago

      Correct me if I'm wrong, but couldn't you use array slicing to make operations on the array? That is because editing the sliced array edits the array as a whole.
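
One common way to handle the growing-data case raised in this thread is to preallocate a buffer, double it when it fills up, and expose the filled part as a slice. The sketch below assumes that approach; `GrowableArray` is a hypothetical helper written for this example, not a NumPy or pandas class. It also relies on the point made in the reply: basic slicing returns a view, so no data is copied when reading the filled rows.

```python
import numpy as np

class GrowableArray:
    """Hypothetical helper: append rows into a preallocated buffer."""

    def __init__(self, n_cols, capacity=1024):
        self._data = np.empty((capacity, n_cols))
        self._size = 0

    def append(self, row):
        if self._size == self._data.shape[0]:
            # double the capacity so full copies happen only occasionally
            bigger = np.empty((2 * self._data.shape[0], self._data.shape[1]))
            bigger[:self._size] = self._data
            self._data = bigger
        self._data[self._size] = row
        self._size += 1

    @property
    def values(self):
        # a view of the filled rows; slicing does not copy
        return self._data[:self._size]

prices = GrowableArray(n_cols=2, capacity=4)
for t in range(10):
    prices.append([t, 100.0 + t])  # (timestamp, price), made-up data

window = prices.values[-5:]        # last five rows, still a view
mean_price = window[:, 1].mean()
```

Vectorized operations such as the `mean` call then run on the view without any per-row Python work.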

  • @gabestrenk5471
    @gabestrenk5471 4 years ago +1

    Are the slides available anywhere?

  • @theamrpi397
    @theamrpi397 6 years ago

    Just a doubt though... even if a numpy ufunc takes an array as its argument, doesn't it still have to loop over the elements of the array internally to get the output? So is it right to say looping does not happen in numpy?
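
On the question above: a loop over the elements does still happen, but inside NumPy's compiled code rather than in the Python interpreter. The small sketch below (variable names are illustrative) shows that the two forms compute the same values; the difference is only where the per-element loop runs.

```python
import numpy as np

x = np.arange(5, dtype=float)

# Explicit Python loop: one interpreted iteration per element
looped = np.empty_like(x)
for i in range(len(x)):
    looped[i] = x[i] * 2.0 + 1.0

# Ufunc expression: the same element-by-element work, run in compiled code
vectorized = x * 2.0 + 1.0
```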

  • @johnstarfire
    @johnstarfire 3 years ago +1

    10:16 it gives me 5.19 ms in pure Python and 47.4 µs with numpy. Is Python speeding up, or are computers faster?

  • @CristiNeagu
    @CristiNeagu 7 years ago

    The problem I have is that I use functions that cannot be trivially simplified to ufuncs. Stuff like detecting a rising edge, for example. How do you speed up those kinds of loops?

    • @haakonvt
      @haakonvt 5 years ago +1

      Cristi Neagu Check out numba!
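
For the specific rising-edge example, a loop-free version is possible by comparing the signal with a shifted copy of itself. This is only a sketch under one assumed definition of "rising edge" (a sample at or above a threshold whose predecessor was below it); `rising_edges` and `threshold` are names chosen for this example, and loops with more complex state may still need something like Numba.

```python
import numpy as np

def rising_edges(signal, threshold):
    """Indices where the signal goes from below threshold to at/above it."""
    above = np.asarray(signal) >= threshold
    # a rising edge is a sample that is above while the previous one was not
    return np.flatnonzero(~above[:-1] & above[1:]) + 1

signal = np.array([0.0, 0.2, 0.9, 1.0, 0.1, 0.0, 0.8, 0.7, 0.95])
edges = rising_edges(signal, threshold=0.5)
print(edges)  # [2 6]
```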

  • @jb6395
    @jb6395 4 years ago

    what about recursion?

  • @greatbahram
    @greatbahram 7 years ago

    Thank you, I know it's outdated. But it was awesome

    • @bosk1n
      @bosk1n 6 years ago +6

      Why is that outdated? Any more efficient techniques out there to make python faster?

  • @RonJohn63
    @RonJohn63 7 years ago +1

    11:45 How many of these tasks can also be done using itertools?

    • @CristiNeagu
      @CristiNeagu 7 years ago +3

      They will be slower than ufuncs.

    • @rohitbhanot7809
      @rohitbhanot7809 5 years ago +1

      Itertools is mainly built keeping in mind memory efficiency and not really execution speed.

  • @陆得水
    @陆得水 5 years ago

    How does X.reshape(1000, 1, 3) - X end up in a result with shape (1000, 1000, 3)? I can't figure it out. Help!!!

    • @陆得水
      @陆得水 5 years ago

      Figured it out by myself. haha

    • @OlumideOni
      @OlumideOni 5 years ago

      Could you explain please?

    • @tabtang
      @tabtang 4 years ago

      @OlumideOni docs.scipy.org/doc/numpy/user/basics.broadcasting.html

    • @OlumideOni
      @OlumideOni 4 years ago

      @tabtang thank you
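
The shape question in this thread comes down to NumPy's broadcasting rules: shapes are aligned from the right, and any axis of length 1 (or a missing leading axis) is stretched to match the other operand. A small sketch with 4 points instead of the 1000 used in the talk:

```python
import numpy as np

X = np.arange(12.0).reshape(4, 3)  # 4 points in 3 dimensions
A = X.reshape(4, 1, 3)

# Shapes are aligned from the right:
#   A: (4, 1, 3)
#   X: (   4, 3)  -> treated as (1, 4, 3)
# Each length-1 axis is stretched to match the other operand.
diff = A - X
print(diff.shape)  # (4, 4, 3)

# diff[i, j] holds X[i] - X[j]
i, j = 2, 0
same = np.array_equal(diff[i, j], X[i] - X[j])
```

With 1000 points the same rule gives `(1000, 1, 3)` against `(1000, 3)`, hence `(1000, 1000, 3)`.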

  • @Psycho4Ever666
    @Psycho4Ever666 7 years ago

    26:02 one could also just use D[D==0] = np.inf

    • @ScottTsaiTech
      @ScottTsaiTech 7 years ago +5

      That'd accidentally change the distance between two distinct points that happen to occupy the same space and not just between a point and itself.
      I think it depends on whether that's acceptable in your model.

    • @Psycho4Ever666
      @Psycho4Ever666 7 years ago +1

      A bit embarrassing, but I haven't thought about that... it was too obvious xD
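
The difference discussed in this thread can be seen on a tiny made-up dataset in which two distinct samples share the same coordinates. The sketch below uses `np.fill_diagonal` as one way to mask only the self-distances; the talk's exact code is not reproduced in these comments, so treat this as an illustration rather than the speaker's method.

```python
import numpy as np

# three points; the first two are distinct samples at the same location
X = np.array([[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]])
D = ((X[:, np.newaxis, :] - X) ** 2).sum(axis=-1)  # squared distances

# Option 1: mask every zero distance
D_mask = D.copy()
D_mask[D_mask == 0] = np.inf

# Option 2: mask only the diagonal (each point's distance to itself)
D_diag = D.copy()
np.fill_diagonal(D_diag, np.inf)

nn_mask = D_mask.argmin(axis=1)  # the duplicate pair no longer sees each other
nn_diag = D_diag.argmin(axis=1)  # the duplicates are each other's neighbors
```

With option 1, points 0 and 1 both report point 2 as their nearest neighbor; with option 2 they report each other.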

  • @drticktock4011
    @drticktock4011 2 years ago

    OR....just go back to FORTRAN (or C)

  • @ibrahimtouman2279
    @ibrahimtouman2279 6 years ago

    Good lecture, but it would have been more interesting if he had compared NumPy to other competing numerical computing software such as Matlab, for example!

  • @annamalainarayanan9310
    @annamalainarayanan9310 8 years ago +4

    almost everyone who did numpy knows this - seemingly very basic and nothing hacky!

    • @zachs5231
      @zachs5231 7 years ago +21

      you should link us to one of your talks anna

    • @tam31433
      @tam31433 6 years ago +1

      But he is right anyway.

    • @raphaeldayan
      @raphaeldayan 4 years ago

      not everyone