Parallelize Python Tasks with Joblib
- Published Oct 1, 2024
- Today we learn how to parallelize Python tasks using joblib.
◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾
📚 Programming Books & Merch 📚
🐍 The Python Bible Book: www.neuralnine...
💻 The Algorithm Bible Book: www.neuralnine...
👕 Programming Merch: www.neuralnine...
🌐 Social Media & Contact 🌐
📱 Website: www.neuralnine...
📷 Instagram: / neuralnine
🐦 Twitter: / neuralnine
🤵 LinkedIn: / neuralnine
📁 GitHub: github.com/Neu...
🎙 Discord: / discord
🎵 Outro Music From: www.bensound.com/
Straight to the point and clear. Well done. No fluff.
Be careful with your benchmarking. The first run is single-threaded, and every run after that is multi-threaded. However, the first run actually has to go out to the internet and download the images. The later runs have a bit of an unfair advantage, since from then on the images are likely served from cache rather than re-downloaded.
Just trying to make the point that when actually benchmarking, you need to average many runs, as runs under initial conditions may not be indicative of 'normal' run times.
It also illustrates a bit of the difficulty in benchmarking network programs.
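The averaging advice above could be sketched roughly as follows. This is only an illustration: `fake_download` is a hypothetical stand-in for the video's image download (it just sleeps), and the warm-up call is an assumption about how to keep a cold cache out of the measured runs.

```python
import statistics
import time


def fake_download(delay=0.01):
    # Hypothetical stand-in for a network call; it only sleeps.
    time.sleep(delay)


def benchmark(func, runs=5, warmup=1):
    # Warm-up calls are not timed, so a slow first run (cold cache)
    # does not distort the measurements.
    for _ in range(warmup):
        func()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


timings = benchmark(fake_download)
print(f"mean:  {statistics.mean(timings):.4f} s")
print(f"stdev: {statistics.stdev(timings):.4f} s")
```

Reporting the spread alongside the mean makes it easier to see whether a difference between two variants is larger than the run-to-run noise.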
Many thanks as always. As usual you take 'That's way above my brain limits' and transform it into 'Huh, so it's really just that easy.'
You should always use 'time.perf_counter()' instead of 'time.time()' when you're trying to benchmark code. It's a monotonic, high-resolution clock, so it's much better suited to measuring short intervals.
I use %timeit. Is it better as well?
@user-wr4yl7tx3w I think so, but I'm not completely sure. Since it's designed for timing code I would assume so, but don't quote me on that 😅
Well-timed video. The multiprocessing module was giving me a hard time; I had to selectively scrape files (around 4K files per execution). I think this will be fantastic with HTTPX (a near drop-in replacement for the requests module, but with async support).
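For context on the `%timeit` question: it is an IPython/Jupyter magic built on the standard library's `timeit` module, which uses `time.perf_counter` as its default timer and repeats the statement many times. A rough plain-Python equivalent might look like this, with `square_sum` as a hypothetical workload chosen only for illustration:

```python
import timeit


def square_sum(n=1000):
    # Hypothetical workload used only for illustration.
    return sum(i * i for i in range(n))


# repeat=5 gives five independent measurements of 1000 calls each.
results = timeit.repeat(square_sum, repeat=5, number=1000)

# The minimum is usually the least noisy estimate of the per-call cost.
best = min(results) / 1000
print(f"best of 5: {best * 1e6:.1f} µs per call")
```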
You could use httpx instead of requests!
Is it better than the requests module?
@alliedeena1141 Depends on what you need.
Why did your runtime change from n_jobs = 8 to -1?
It's that stupid easy.
So does colors2 get returned with all the results from each run appended to a list?
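A small sketch of how joblib hands results back, assuming `colors2` in the video is assigned directly from a `Parallel(...)` call. The `square` function is a hypothetical stand-in for the per-image work:

```python
from joblib import Parallel, delayed


def square(x):
    # Hypothetical task standing in for the per-image work.
    return x * x


# Parallel collects each call's return value into a single list,
# ordered like the inputs, regardless of which task finishes first.
results = Parallel(n_jobs=2)(delayed(square)(x) for x in range(5))
print(results)
```

So nothing is appended by hand: the list is built by `Parallel` itself from the return values of the individual calls.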
Awesome, this is exactly the type of pipeline I was looking for in order to integrate joblib :)
Is multiprocessing with shared memory possible?
Any advantages over the 'unsync' library?
Thanks for sharing your knowledge. I have a question:
How can I use parallelization if I am training a machine learning model? I use PySpark, but is this kind of parallelization better?
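One partial answer, as a sketch: joblib accepts `require="sharedmem"`, which makes it pick a thread-based backend so that all workers see the same Python objects. Note that this gives shared memory through threads rather than through separate processes. The `shared` list and `record` function below are hypothetical names used only for illustration:

```python
from joblib import Parallel, delayed

shared = []  # a single list object visible to every worker thread


def record(x):
    # list.append is thread-safe in CPython, so no explicit lock is used here.
    shared.append(x * 2)


# require="sharedmem" is a hard constraint that selects a thread-based backend.
Parallel(n_jobs=2, require="sharedmem")(delayed(record)(x) for x in range(5))

# Completion order is not guaranteed, so sort before printing.
print(sorted(shared))
```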
I am enjoying this channel very much. Would anyone recommend a similar one but focused on JavaScript?
You should show only your work and your face, nothing else; the other things take away from your efforts.
I am currently multithreading my web scraping project, opening multi browsers and clicking many things with Selenium. Would this improve anything in my case?
Hey, I know it's a year later, but Joblib is pretty much just a wrapper for either multiprocessing or multithreading, depending on your specified preference.
With every video, I learn something new! Good job.
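The "specified preference" mentioned above can be expressed through joblib's `prefer` argument. A minimal sketch, where `io_task` is a hypothetical I/O-bound task in which sleeping stands in for waiting on a browser or the network:

```python
import time

from joblib import Parallel, delayed


def io_task(x):
    # Hypothetical I/O-bound task: sleeping stands in for a network wait.
    time.sleep(0.05)
    return x


# prefer="threads" is a soft hint to use the threading backend,
# which tends to suit tasks that mostly wait on I/O.
thread_results = Parallel(n_jobs=4, prefer="threads")(
    delayed(io_task)(x) for x in range(8)
)

# prefer="processes" hints at a process-based backend instead,
# which tends to suit CPU-bound work.
process_results = Parallel(n_jobs=2, prefer="processes")(
    delayed(io_task)(x) for x in range(4)
)

print(thread_results, process_results)
```

For a project that is already multithreaded and mostly waiting on browsers, switching to joblib may change the code's shape more than its speed.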
Very useful overview, thank you!
I want to apply parallel computing to the minimax algorithm I implemented on a connect 4 game. It looks like a really good improvement. Thanks for the content
What font family are you using in PyCharm?
Is it faster than JAX pmap?
👌👌🌹
It splits the input list and runs the function on the pieces in parallel.
I want to speed up a function that opens an xlsx file, which takes 22 seconds to open with the openpyxl library. Is it possible to speed that up?
Try the read- or write-only optimized modes in openpyxl.
@SageBetko Thanks, I'll try.
Excellent content and easily explained. Awesome 👍
Great video.
I've learned a great deal here. Thank you.
Awesome
Intro is amazing
Excellent video!
Great video
So cool 🥰🥰🥰
Great value as always… Thanks a lot!
why use a stupid example?
Amazing thank you!