I've given up the hope that I can ever feel smart again. Every time I learn something well enough to feel comfortable using it and talking about it, something else super cool and super fascinating comes around and I'm an overwhelmed, ignorant beginner again. 😁 The presentation was great. The speaker was even greater. :-)
Thanks for this! This was amazing.
Wow, amazing talk! I need to make the switch from matplotlib to bokeh and datashader.
The cool stuff starts at approximately 10:00
Is there a GitHub repo with the Jupyter notebooks?
On Peter's GitHub page I couldn't find the notebooks for this talk. I was, however, able to find the code used in the talk on Anaconda's website.
Google "notebooks anaconda jbednar" and you'll see some super cool notebooks.
It's a good presentation, but it can basically be summed up in two ideas: aggregation and drill-down. It is the best way to do it.
The cool thing is the data back end that lets you make aggregations really, really fast (Hadoop?). I wish he had talked about that more. If it's truly aggregating all that data at a higher level, it's REALLY cool; if it's just sampling the dataset, that has been done before.
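For anyone curious what that aggregation looks like in practice, here's a minimal sketch using the public datashader API (the random dataframe is just a stand-in for a real dataset, not anything from the talk):

```python
import numpy as np
import pandas as pd
import datashader as ds
from datashader import transfer_functions as tf

# Stand-in for a large point dataset (e.g. taxi pickups, census points)
n = 1_000_000
df = pd.DataFrame({
    "x": np.random.standard_normal(n),
    "y": np.random.standard_normal(n),
})

# Every point is binned into a fixed-size grid (one bin per pixel),
# so the aggregate reflects the full dataset rather than a sample.
canvas = ds.Canvas(plot_width=800, plot_height=600)
agg = canvas.points(df, "x", "y", agg=ds.count())

# Turn the per-pixel counts into an image; log scaling keeps dense
# and sparse regions both visible.
img = tf.shade(agg, how="log")
```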
The data backend is *not* Hadoop. It's using Dask and Numba: Dask to do parallelism, and Numba to compile down the Python numerical functions to high-performance machine code. The entire point of datashader is that it does *not* simply downsample the dataset.
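The aggregate-vs-sample distinction is easier to see in a toy version of the idea. This is not datashader's actual internals, just a sketch of what a Numba-compiled aggregation kernel looks like; Dask's role, roughly, is to run this kind of kernel over many partitions in parallel and combine the partial grids:

```python
import numpy as np
from numba import njit

@njit  # compiled to machine code the first time it's called
def bin_counts(xs, ys, nx, ny):
    """Count points into an nx-by-ny grid; coordinates assumed in [0, 1)."""
    grid = np.zeros((ny, nx), dtype=np.int64)
    for i in range(xs.shape[0]):
        cx = int(xs[i] * nx)
        cy = int(ys[i] * ny)
        if 0 <= cx < nx and 0 <= cy < ny:
            grid[cy, cx] += 1
    return grid

xs = np.random.random(5_000_000)
ys = np.random.random(5_000_000)
grid = bin_counts(xs, ys, 800, 600)  # every point counted, none dropped
```

Because every point lands in some bin, the result is a true aggregate of all the data, not a downsampled subset.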
Really cool technology!