A small correction: The GIL does not prevent all race conditions, just those inside the interpreter code. e.g. two threads updating a dictionary won't break the internal data structure, but a computation over multiple python statements can still have context switches and thus have race conditions.
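A quick sketch of that kind of cross-statement race, and the explicit lock that fixes it (the names here are just illustrative):

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment(n):
    global counter
    for _ in range(n):
        # counter += 1 alone compiles to several bytecode steps (load, add, store),
        # so without the lock two threads can interleave and lose updates,
        # GIL or no GIL; the lock makes the whole read-modify-write atomic
        with lock:
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000 every time; drop the lock and it will often be less
```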
Python 1.5 came out before 2000. Dual core machines were a 2006 thing, and multi processor machines have always been rare. Intel did add hyperthreading with the Pentium 4 in 2002, but that was more than 3 years after adding the GIL. So when the GIL was added, you could only _ever_ have one thread executing bytecode at a time. The GIL just ensured that thread reached a safe stopping spot before another thread could run. It was a cheap and easy way to get non-cooperative multitasking by leveraging the kernel threads.
Google attempted to build a GIL-free version of Python (project Unladen Swallow). Today they use Go where they need massive parallel code execution. Back then I was hoping for a GIL-free (and thus horizontally scaling) version of Python. Today I'm no longer so sure it's a good idea: GIL-free Python comes at a cost, and I'm far from sure it will put Python on par with other languages that focus more on parallel execution. Therefore I tend to think that it's better to have two (or more) languages, and use them according to their core strengths.
Sadly, Google uses Go for far more than those cases, as they've succumbed to NIH syndrome. Go is just awful as a language, so bad I can't imagine using it outside of the rare instance where I actually need massive parallel code execution and can't do it some other way (hint: this has never come up). There are some fields where this matters, but most Python devs aren't working in those fields.
I use multiprocessing where it is beneficial to do so. But where I often run into problems is when I have a large data structure that I am making many sparse updates to. It often requires a completely different algorithm to do that with multiprocessing because of the IPC overhead. I am glad I now have another available option if I am not using incompatible CPython modules.
Great video, clearly covers the changes. I see this as having a huge benefit in IoT as well, allowing much faster multi-threading on hardware with less powerful compute.
I think I prefer Python being the glue, and lower-level code doing the heavy lifting, like with Polars. Furthermore, running into performance limitations should be a decent sign to move on to a different language/solution.
Such is the nature of general-purpose languages, though. Python improving performance only means it'll be widening its own use cases. Plus you can still make Python fairly fast through interpreter optimisation, if you love the comfort of Python that much anyway.
I agree, I am using Python as a bash replacement, with xonsh, and it is perfectly suited for that task. Python is so much better than bash. Python is not better than C/C++/Go/Rust for high performance code.
Yeah, it's honestly kind of weird, I develop a lot in Python but it's really not something we should rely on so much for high performance applications. Even if you (reasonably!) hate to meddle with C++ or Rust for data analysis and computation... Julia is right there.
This could be useful to some, I feel, but as you were saying, it comes with much more complexity to manage and will break a lot of packages. It may be a hard adoption, but it could be useful in the short term while Mojo and maybe even Bend get developed out as languages. Thanks for the video!
I think that using Mojo (or Rust/C-C++, but Mojo seems to have better debugging support) for heavy parallel processing would be a better solution while keeping Python accessible as it has always been (think about the massive speed improvement using Polars over Pandas). There is always a trade-off between performance and ease of development. Otherwise, we would all use assembly.
I am heavily involved in Jython application development, largely due to its use in a particular, very popular, commercial SCADA platform. Jython is still a v2.7 language for several reasons, without being useless. Let me summarize why:

* Jython does not have a GIL. (I'm not sure it ever had one.) Its fundamental data types, like list() and dict(), are implemented with thread-safe (concurrent) Java classes, so the GIL simply isn't needed. Jython already scales across multiple threads rather well.

* Jython has full access to Java native types, including byte arrays and Java's native localization. Meaning that Jython can handle all of the internationalization (codecs, locales) that breaks CPython2 without twisting itself into a pretzel. Both Jython's str() and unicode() are built on top of Java's String, which is already Unicode compliant.

* Jython has full access to Java classes in general, and can therefore use all of Java's native functional programming and asynchronous programming techniques. Yeah, there are no async and await constructs, but that's just syntax.

* With threading and Java async, the kinds of I/O-bound applications that desperately need CPython3 perform just fine in Jython. In fact, the jitted generation of Java bytecode for hot Python methods means such hot methods also become eligible for Java's jitting to native code.

* While CPython3 also has other improvements and stdlib features that look quite handy, Jython can import any Java jar's classes as if they were native Jython. The extremely large Java ecosystem has alternatives for pretty much everything available in CPython "C" extensions. (Though not, typically, with an identical API.)

Finally, the transformation of Jython 2.7 to any Jython 3.x seems to be quite difficult. The above means there isn't much pressure to move forward. Certainly not like it was for the CPython 2 => 3 transition.
Hi Arjan, I don't think the slower single-threaded Python was a problem on your side, but a side effect of the removal of the GIL itself. I don't exactly remember the specifics, but there was a very famous guy who tried removing the GIL before. He later concluded that removing it is not really that difficult, but the performance after the removal is quite disappointing due to how Python is built. After searching a bit, I found that the reduction in performance could be due to the exact reasons that you mentioned in the beginning: the higher complexity of managing memory, garbage collection, locks and all brings overhead. Even if we are not using threads, it is still managing memory in a very complex way, which kills its performance. Can't say for sure, but the guy who tried to remove it may have been 🤔... Larry Hastings
@@aflous No, that is wrong. Gilectomy was attempted by Larry Hastings in 2018, and he experienced a severe performance penalty at that time. Sam Gross is a Meta engineer who used a different approach to remove the GIL, and he managed to reduce the performance penalty (not completely without penalty, though). Search YouTube for "Sam Gross Python" and you should find out how Sam did it in his presentation to EuroPython 2022. (Bonus scene in Sam's presentation: Larry Hastings actually attended, asked some questions, and admitted Sam's approach is "very interesting to pursue further".)
I see this mostly as something that benefits complex applications, that shouldn't have been written in Python in the first place. At my last job this would have been excellent. Of course I'd have had to take better care with the multithreading and synchronization problems but any dev should be able to figure it out. What would have been even better would be if I'd been allowed to rewrite the application in something more suitable for a large complex system. Something compiled with static typing...
For me personally, I am not excited about it at all. It simply doesn't matter for me. I usually run subprocesses and combine the results in the end. I have done this for years now and I am happy with it. The GIL was always helpful for me. I know you can speed up your program with a different interpreter and some modules, but that overcomplicates things and I don't like it. Someone might do this and dislike the limitations of the GIL; yeah, do what you want.
A lot of my stuff is processing information like: take these 1 million text files containing invoice data, read them, do something with the data, write the result. Multithreading my fetching/reading/writing parts was a massive improvement for me. But cpu bound tasks? Ehhh I don’t do a lot of them, but it would be nice to have that as an option
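That fetch/read/write pattern is a textbook fit for a thread pool; a minimal sketch with a stand-in process() instead of real file I/O:

```python
from concurrent.futures import ThreadPoolExecutor

def process(path):
    # stand-in for read -> transform -> write; the real work would be I/O-bound,
    # so threads overlap nicely even with the GIL
    return path.upper()

paths = [f"invoice_{i}.txt" for i in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, paths))  # map preserves input order
print(results[0])  # INVOICE_0.TXT
```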
Since I have coded in Julia, I don't like returning to Python. There are so many ways to make Python faster that one has to try and learn, and then you realize Julia is still faster. And there are so many additional advantages to coding in Julia.
The slower multiprocessing and single-threaded performance in 3.13 is likely because the GIL wasn't just removed: it has probably been replaced by multiple locks in the interpreter for specific things. This has been a longstanding reason not to remove the GIL, because experiments in doing so have generally slowed down a lot of use cases. It would also have been worthwhile to attempt multiple runs in each scenario with different amounts of work, to separate setup overhead from processing slowdown. I expect multiprocessing to have overhead that is paid once to get everything running, but minimal performance loss compared to threading once execution starts, assuming shared memory is set up and an alternative approach like socket communication isn't used for multiprocessing.
I would like a video from you comparing the different standard-library options we have, like multiprocessing, threading, and concurrent.futures. I have seen multiple StackOverflow answers to questions like "fastest way to send 100k HTTP requests?" where people use different stdlib modules, and I am not very clear about the core differences and advantages/disadvantages. The end goal remains the same, but the use of different libraries to achieve it confuses me.
It depends on the use-case:

* A few mostly unique tasks that require lots of processing power? multiprocessing
* A few mostly unique tasks that don't rely too much on CPU, or do most of their CPU work outside of Python? threading
* Lots of tasks that mostly depend on networking, file access, or user input? asyncio
* Lots of (usually similar) tasks that require lots of processing power? concurrent.futures (with a process pool)
* Lots of (usually similar) tasks that don't rely too much on CPU, or do most of their CPU work outside of Python? concurrent.futures (with a thread pool)
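One nice thing about concurrent.futures is that thread and process pools share the same API, so switching between the last two cases is a one-line change. A sketch with a placeholder task:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def work(task_id):
    return task_id * task_id  # placeholder task

# ThreadPoolExecutor and ProcessPoolExecutor expose the same submit/result API,
# so the choice between threads and processes is just the executor class
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(work, i) for i in range(10)]
    results = sorted(f.result() for f in as_completed(futures))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```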
Python has been thought of as a language to make things "easy". Disabling the GIL greatly increases the complexity of the programs. It brings problems that many Python developers may not even know about. It is a new tool, let's hope it is used well and consciously.
Complexity shouldn't change much from this for python developers since they're adding per-object locks in the interpreter. That's actually why this change slowed down single-threaded python - they wanted to remove the GIL without removing the safety it provided.
I think python has by now been stretched in any direction possible. I work in science and there really ARE huge simulations often being done in python with several tweaks to make it fast and scale it. I think most "traditional" SEs would say this is stupid and just use C++, but all in all python is just so handy for "us" that you still do it. Hopefully the feature gets mostly used by people that really know what they are doing.
The biggest reason I'm excited about this news is that functional programming is so well suited for automatic concurrency, and without the GIL, Python can handle that multi-processing without any extra work on the part of the programmer. I say let's dump the GIL and promote functional programming techniques! But that's just my flavor preference 🙂
As a good practice, some prefer writing imports as "from a import b" instead of "import a". Note that this doesn't skip loading the module: "a" is fully executed on import either way. What it can save is the repeated attribute lookup ("a.b") at call time, which matters a bit in hot loops.
😂 the funny thing is I didn't know about GIL, so I included locking to respective parts of the code, in the same way I was doing it back in the day in c 🤷♂️
For any use-case I've ever had, I'd be happy with a 50% reduction in single-core performance across the board if it meant no GIL. If i need any highly-performant app it's easy to write it in C and import. But using python to rapidly prototype highly parallel workflows would be quite nice!
With the GIL enabled, all of Python's data types, including mutable ones, are guaranteed to be thread safe. Without the GIL you would have to manage that yourself with mutexes, like in C++. Nobody really wants to mess with that, especially those who use Python for ML without being professional developers.
I'm curious what you mean with this exactly. I hear a lot of strong claims about the GIL like this one. I'm not a GIL expert, but I really don't see how this is true. Or perhaps I just don't see what you mean. As far as I know, python has never guaranteed to hold the GIL longer than a single bytecode instruction. If your code has "a = 10; b = a + 5" then there are already multiple bytecode instructions, and another thread could have taken the lock in between the "STORE_FAST (a)" and "LOAD_FAST (a)" instructions, even though no other instructions occur between them. So if you mean user-implemented data types, they are not guaranteed to be thread safe right now anyway. But if you mean builtin datatypes no longer being thread safe, how would a user ever manage this anyway? If builtin datatypes can become invalid without the GIL then that is definitely always a language bug. It would mean that a list with 3 items could report a length of 4, etc. The python team is not going to let that happen on purpose.
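The dis module makes this easy to check: even the two assignments above compile to several bytecode instructions, and a thread switch can land between any pair of them (exact opcodes vary by CPython version):

```python
import dis

def snippet():
    a = 10
    b = a + 5
    return b

ops = [ins.opname for ins in dis.get_instructions(snippet)]
print(ops)  # a handful of LOAD/STORE/add instructions, not one atomic step
```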
If you look at PEP-703, they address this by adding per-object locks at the interpreter level. Removing the GIL shouldn't affect the behavior of existing code.
@@upgradeplans777 The GIL is a core feature of the CPython interpreter, and it means your Python process can't have multiple kernel-level threads executing bytecode in parallel. To be precise, each threading.Thread is backed by a real OS thread, but the GIL ensures only one of them runs Python bytecode at any moment. If I am NOT using concurrency primitives in my Python source code, I CAN GUARANTEE that nothing else WITHIN MY PYTHON NAMESPACE runs between the bytecode for "a = ..." and "b = a + ...", simply because no other Python thread exists. I can write nothing in pure Python to tinker with the GIL (you have to go through the "extensions" path; libs like numpy do this to circumvent the GIL). That's why threading is only meaningful for I/O-bound tasks in Python: another Python thread gets to run only when either the thread currently holding the GIL is waiting on I/O, or the interpreter periodically forces a switch so other threads get a turn.
@@AustinWitherspoon Famous last words 😂 "shouldn't" does a lot of lifting in that sentence. The JVM took nearly a decade before it had a stable, as in predictably working, memory and threading model. The Python core devs are way in above their heads if they think it is as easy as adding a few locks here and there. There will be a ton of unexpected side effects of this change once people start using it in production code, and each one will be a nightmare to debug.

I am firm in my opinion that GIL-free Python is the worst idea since sliced bread. In fact, Python with the GIL has been so successful not least because it enforced a well-designed approach to parallel code (either using mp or extensions), so that the hard stuff was left to experts while the core language stayed easy to use. We'll now start seeing people blame Python for being too complex, when in fact it is not the language but the problem-solving approach (or the way it is programmed) that is flawed.

Modern CS has mostly agreed that the best way to build scalable systems is to have no shared state and to communicate solely by messages between processes. The whole world is moving in that direction, except Python now suddenly wants to be the cool kid on the block by going back at least a decade 😢 /rant
@@AustinWitherspoon So is this like automatically taking a lock before assigning to a variable (and releasing it immediately afterwards)? I was thinking it'd be easy enough to make a class that handles that, but it will be baked into the interpreter instead?
I am an openstack user and since it's all written in Python, I really hope it will bring some nice performance improvements and lower memory utilization
I think a huge number of Python "performance issues" are caused by inefficient programming e.g. nested for loops. Using techniques like vectorization can give you 10x to 100x speed up in a single process / thread.
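Even without numpy, pushing the loop into a C-implemented builtin illustrates the point (a small sketch; actual speedups depend on version and data):

```python
def loop_sum(values):
    # explicit Python-level loop: every iteration executes bytecode
    total = 0
    for v in values:
        total += v
    return total

values = list(range(100_000))
# sum() does the same iteration inside C, typically several times faster;
# numpy-style vectorization takes this much further for numeric arrays
assert loop_sum(values) == sum(values)
```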
From your video, it looks like the existing threading module will be used as the interface to GIL-free threading? Is that right? If so, that would be wonderful: I wouldn't have to change my multi-threaded code (which assumes true threading).
Python needs a spec freeze and a totally new runtime, maybe in Rust? And either the incorporation of Stackless features and an honest effort from IronPython and Jython to catch up to the official specification, or dumping those projects.
It didn't really make much of an improvement over multiprocessing. I'm an old school Python programmer and the rule was always that if you need it to run fast, write it in C or some other compiled language and run SWIG over it. Don't use Python in your inner loops if it's compute intensive.
To be fair, this was a trivial application. I expect the real usefulness to show when you need to compute something with a MASSIVE amount of shared data. In multiprocessing - and the Windows implementation especially, if you have to use that - the message passing between processes is a horrible bottleneck. Multithreading with no GIL would solve that, but slowing down _everything else_, possibly by as much as 25%, seems a steep price to pay for it.
I don't think the GIL is to blame. Python is glue code; most of the hard-working code is actually written in C/C++ or another high-performance language, so there is no great need for Python to remove the GIL. Keeping Python safe with the GIL helps prevent many kinds of nasty errors, from race conditions to crashes. If you need performance and scalability, write your solution in a language other than Python.
What's your take on projects like Codon which are effectively doing the same thing, but growing the support from the language and compilation side rather than from the feature side?
Removing the GIL still requires management overhead, which is around 10%. There is a whole talk on YouTube by the Facebook dev who is working on removing the GIL, since Facebook obviously needs to handle a lot of concurrent users.
POV: retro before product release
- Why is our service much slower?
- We disabled the GIL to make multithreaded code faster, but it slows down single-threaded code.
- Then make the single-threaded code act as multithreaded.
- Lol, wat? :)
But if they remove the GIL, what will I use for my I/O bound tasks? I can't use Asyncio for this stuff unless the external library I work with has an Asyncio version too.
I don't know why people are pushing the gilectomy so much. I really don't see a benefit in it from Python's perspective. I think the complexity it introduces is not worth it. You can simply use the multiprocessing package, or move to a faster language if that's not enough.
The GIL was introduced when a lot of compute was single-core. Architecture has moved on a lot since 1994, it makes sense for Python to be able take advantage of the progress made. What are the complexities that make this not worthwhile?
@@bn_ln Python has pretty fast data structures and operations for such a high-level language thanks to the GIL; there is no fine-grained locking or checking that would otherwise slow it down. Also, if it introduces any changes that require care or change assumptions developers previously had about the Python interpreter, it is not worth it. As I said, you can still take advantage of multiple cores by simply using multiprocessing or spawning multiple instances of your program. I just believe that a normal Python process should be single-threaded, that's all. The process spawning overhead is not really an issue if you use pooling. I think it just requires a shift in perspective, and the end result would be the same without the hassle.
I don't care about the GIL anymore. If I need speed I just use multiprocessing or write a Cython extension. The Python devs will probably offer two flavors of Python now: one with the usual performance with the GIL, and a new one without it.
The free-threaded version is slower because the optimizations introduced in Python 3.11 aren't thread-safe, so when you use a free-threaded build of Python, these optimizations are disabled. Hence you get performance similar to Python 3.10.
Save the GIL 😂. My sql alchemy script works well with Alembic as well as my dynamic NT module ( which pylance complained about ) . Stability is paramount 😢
This is all well and good but as long as there's no way to *easily* package a python-script as an executable program that doesn't need to include its own interpreter, I think Python is unsuitable to widely distribute an app.
Python is not aimed at wide app distribution, there are other, more capable languages and frameworks for that. Python is very suitable for writing automations, on-machine data processing and analysis, other scientific computing and getting started quickly. For these tasks, it’s useful to know how Python’s performance changes without the GIL.
@@ArjanCodes Yeah, unfortunately, that's the case. I've written a game in Python but the one thing that prevented me from uploading it anywhere in a usable form is exactly that lack of a small, easy to produce package format. That's why I'm currently investigating Nim.
I struggled a lot when having to load a 100 GB dataset into memory with no way to share memory between processes, effectively limiting me to a single core 😢
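For what it's worth, the stdlib has since grown one escape hatch for this: multiprocessing.shared_memory maps a single buffer into multiple processes without copying. A sketch (shown within one process for brevity; a worker would attach by name):

```python
from multiprocessing import shared_memory

# create the block once; workers attach by name instead of receiving a pickled copy
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[:5] = b"hello"

# in a worker process you would do: shared_memory.SharedMemory(name=shm.name)
attached = shared_memory.SharedMemory(name=shm.name)
data = bytes(attached.buf[:5])

attached.close()
shm.close()
shm.unlink()  # free the block when the last user is done
print(data)  # b'hello'
```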
You said that python can't be multi threaded without removing the GIL, but it can be. If your threads are cpu bound then you're correct, but much of what we do is IO bound in which case we can get huge performance boosts from multithreading in python even with the GIL. Anyone who doesn't understand the difference, I suggest searching for and reading the old blog post named "There is no thread".
Yes, the benefit of multithreading would show specifically for CPU-bound tasks which need to share memory. That's about it. CPU-bound with little to no shared memory, multiprocessing is good enough for. IO-bound, as you say, even current multithreading with the GIL is often good enough.
And if you really do need this, as you would in vector and ML libraries (which are currently written in C or C++), you will soon be able to use Mojo directly with your CPython code and reap the benefits of the latest cutting-edge technology in not only thread and type safety, but the use of GPUs and TPUs.
@@HaganeNoGijutsushi For I/O tasks it is much better to use asyncio. It can even be 10-15% faster than multithreaded I/O. Multiprocessing is too buggy and it's hard to share memory between processes. Thus no-GIL Python will be a very useful option.
Shouldn't this move to a non-GIL default deserve the Python 4 version number? That's a big change that would introduce backwards incompatibilities, would need migrations like we did from 2 to 3, and so on.
I can’t imagine how to deal with arcs and mutexes etc in Python while keeping Python as simple as it is today. I assume the added complexity the true multi threading adds to Python defeats the purpose of using Python. If you’re really dependent on performance, just write it in Rust.
People writing Python programs do not need to change anything. They can keep programming as is. Only C extension programmers need to be more careful. Or just indicate which parts of their extensions are not thread-safe.
There's a handful of comments here about how removing the GIL makes things more dangerous - but if you read PEP-703, they solve race condition issues by using per-object locks instead of a single global lock. As far as I can tell from reading it, there isn't much concern of "python getting harder" because of this. It looks like the experience of using python should be relatively unchanged. People who make compiled extensions that interact with python at the C level WILL need to change stuff though! But the pure-python people will be fine.
Don't you already have to deal with releasing the GIL if you write a C extension? It will just become more straightforward and a standard practice instead of an optional thing.
In Spain there was a businessman called Jesús Gil y Gil. He eventually became mayor of a town called Marbella. He ran for election with a party called, wait for it, GIL.
95% of threaded programs written in python are such that most, if not all, the threads end up being I/O bound. There is absolutely no point for such a program to be threaded. Use asyncio.
"Absolutely no point" - I disagree. Depending on the code, it can be much easier to write with Python threading than to write correct code with asyncio; and if your I/O library is not asyncio-ready, you will have to resort to calling the I/O in threadpool workers anyway. (That's often what your "async ready" I/O library does under the hood.) So yes, _most_ of the time there will be no point for GIL-bound Python to use threading instead of asyncio, but I wouldn't say there is "no point" in it.
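That threadpool-worker fallback is exactly what asyncio.to_thread (3.9+) packages up; a sketch with a stand-in blocking call:

```python
import asyncio
import time

def blocking_io(n):
    time.sleep(0.01)  # stand-in for a library call that has no asyncio version
    return n * 2

async def main():
    # each blocking call runs in a threadpool worker, so they overlap
    return await asyncio.gather(*(asyncio.to_thread(blocking_io, i) for i in range(5)))

results = asyncio.run(main())
print(results)  # [0, 2, 4, 6, 8]
```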
There are a lot of people who think JavaScript (of all languages) is better than Python because it is faster. That is such a load of cr@p. As the UNIX gurus knew, "Programmer time is expensive. Computer time is cheap." So I will consider a language that makes me more productive as a programmer far superior compared to a language that runs my code a bit faster. Note that this maxim is from the seventies. Compared to back then, computer time has become many orders of magnitude cheaper, and programmer time quite a bit more expensive. And while Python is from a later age and not quite a scripting language, scripting languages were invented by the UNIX people for exactly this reason.
You should get a bit more technical. How is the source code changed? Do we now use all thread-local variables? What about the thread enter/leave calls? What about extensions? If we don't know the implementation, real programmers can't judge whether it's good.
It depends on what you do. In ML/DS, Python is mainly the (easy and accessible) interface to several other libraries written in low-level languages that are not affected by the GIL.
I'm afraid this is not a great direction of changes. Even multithreaded, Python won't be a performant language - it never was designed to be one. Python is a "smart glue" to orchestrate other software and it works great that way already. Also, if somebody really wants or needs to write a super fast, performant code and it must be written with a Python syntax - that's what Mojo is for.
Well my ML data augmentation code runs in 1 hour with the GIL - and in 1 minute without it. At that point I simply cease to care whether Python is a "performant language".
@@pwinowski I don't think the client would like it very much if I rewrote everything in another language. Come to think of it, I don't think I would either.
@@eadweard. I called it going "a half step further", not even a "step further", because there is a chance you wouldn't need to rewrite anything at all. Mojo is Python, syntactically, and it gives a performance boost out of the box. You can further improve performance, using Mojo-specific features, but you don't have to. That's the whole purpose of Mojo.
@@pwinowski Much the same has always been true of Cython. But the question is why would I bother with the additional dependency for "a chance" that I could save a minute each week or two?
i’m disappointed you decided it was okay to publish this video when you clearly have little understanding of what the gil does, how it works and what removing it means in terms of actual changes. it throws shade on all your other work. this instead just confuses people who don’t understand what mutexes are and cements them in their ignorance, as evidenced by the comments. we expect more from you Arjan!
Okay, short version, simplifying here: at the interpreter level, the GIL used to be one giant lock; any code that changed interpreter state would just grab it. This means that even if you created two threads from Python code, both threads could not be executing interpreter code at the same time. So it was mostly useful for I/O: your thread is just waiting there blocked on I/O anyway, so the GIL usually wasn't a problem.

Without the GIL to keep code thread-safe, you now need a lot of small locks instead, making parallel execution possible. Unfortunately it also means you need to acquire these locks even if you only have a single thread from your Python application's perspective, and each lock acquisition adds overhead (a contended one can even mean a syscall). This explains why your single-threaded example was so much slower without the GIL.

There is another major downside that doesn't get talked about: without the GIL, CPython code will be littered with lock handling, making further development harder. The reality is that the vast majority of code out there is either single-threaded or I/O-bound, and great workarounds exist for most exceptions. This is why this discussion is so incredibly frustrating: it's usually people who don't understand the nitty-gritty loudly demanding Python get rid of the GIL, because they hope it will somehow give them performance.
Shouldn't make any difference, actually. Unless you are using a pure-Python GUI framework which keeps the CPU busy in a loop, which is _very_ unlikely.
👷 Join the FREE Code Diagnosis Workshop to help you review code more effectively using my 3-Factor Diagnosis Framework: www.arjancodes.com/diagnosis.
We need python 3.14 to release on pi day :)
we should start a petition for that!
Pi-thon
If CPython misses that, perhaps we could go for PyPy 3.14 on Pi Day?
On 22/7?
@@nicholasvinen 3/14
I feel like it would have been better to compare 3.13 with and without the GIL, not 3.12 vs 3.13. Who knows if there is a speed regression in 3.13 not due to the GIL.
pyenv has free-threaded (i.e. no GIL) builds available; they're the "t" versions, 3.13.0b4t for example.
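If you're unsure which build you ended up with, you can check at runtime (a sketch; sys._is_gil_enabled() only exists from 3.13, hence the getattr fallback):

```python
import sys
import sysconfig

def build_info():
    # Py_GIL_DISABLED is 1 on free-threaded ("t") builds, 0 or None otherwise
    free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in CPython 3.13; assume True on older builds
    gil_running = getattr(sys, "_is_gil_enabled", lambda: True)()
    return free_threaded, gil_running

print(build_info())  # e.g. (False, True) on a standard GIL build
```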
We should code the following versions as 3.13, 3.14, 3.141, 3.1415, 3.14159, .. until we get the GIL out, then jump to 4.0 😂
and we will do a cycle and go back to the no GIL version.
Then proceed with 4.20, 4.206, 4.2069, …
I'm sorry, that's a bit of a circular argument 😉
Please, no, I'm still traumatized from the jump from python 2...
I think that's how the versions of LaTeX are numbered already.
Python's first target was a massive Amoeba cluster - it _began_ its life as a parallel system! They just weren't shared-memory nodes.
Google attempted to build a GIL-free version of Python (project Unladen Swallow). Today they use Go where they need massive parallel code execution. Back then I was hoping for a GIL-free (and thus horizontally scaling) version of Python. Today I'm not so sure anymore that it's a good idea - GIL-free Python comes at a cost, and I'm far from sure it will put Python on par with other languages that focus more on parallel execution. Therefore I tend to think that it's better to have two (or more) languages, and use them according to their core strengths.
Sadly, Google uses Go for far more than those cases, as they've succumbed to NIH syndrome. Go is just awful as a language, so bad I can't imagine using it outside of the rare instance where I actually need massive parallel code execution and can't do it some other way (hint: this has never come up). There are some fields where this matters, but most Python devs aren't working in those fields.
@@oasntet kind of a hot take, what makes you think that Go is awful? Apart from the hideous error handling, of course.
@@fswerneckI really think that go error handling is very neat
@@lucasfcnunesI think everyone says that Go's error handling is bad. Why do *you* think it is good?
@@as0482 "Everyone" is too many people haha
ua-cam.com/video/YZhwOWvoR3I/v-deo.htmlsi=-GNAGmVuTFJ5dqq4
so what do we blame for slow code if they remove the gil
The memory model. It really abstracts away every interaction with hardware
@@Michallote The intern.
Dynamic typing 🤓
Which reminds me when is someone going to make the typescript equivalent for python
@@ipodtouch470 it was made already, it's called Mojo.
@@ipodtouch470that's called python lmao
I use multiprocessing where it is beneficial to do so. But where I often run into problems is when I have a large data structure that I am making many sparse updates to. It often requires a completely different algorithm to do that with multiprocessing because of the IPC overhead. I am glad I have another option available now if I am not using incompatible CPython modules.
Great video, clearly covers the changes. I see this as having a huge benefit in IoT as well, allowing much faster multi-threading on hardware with less powerful compute.
I think I prefer Python being the glue, and lower level code doing the heavy lifting. Like with PolaRS.
Furthermore running into performance limitations should be a decent sign to move on to a different language / solution.
Such is the nature of general purpose languages, though Python improving performance only means it'll be widening its own use cases. Plus you can still make Python fairly fast through interpreter optimisation, if you love the comfort of Python that much anyway.
I agree, I am using Python as a bash replacement, with xonsh, and it is perfectly suited for that task. Python is so much better than bash. Python is not better than C/C++/Go/Rust for high performance code.
Yeah, it's honestly kind of weird, I develop a lot in Python but it's really not something we should rely on so much for high performance applications. Even if you (reasonably!) hate to meddle with C++ or Rust for data analysis and computation... Julia is right there.
@@HaganeNoGijutsushi a faster CPU is very often cheaper than a faster program. Gotta be careful about growth strategy though. Also Mojo may work out 👍
Or - before that - making design correct.
This could be useful to some, I feel, but like you were saying, it comes with much more complexity to manage and will break a lot of packages. It may be a hard adoption, but maybe useful in the short term while Mojo and maybe even Bend get developed out as languages. Thanks for the video!
I think that using Mojo (or Rust/C-C++, but Mojo seems to have better debugging support) for heavy parallel processing would be a better solution while keeping Python accessible as it has always been (think about the massive speed improvement using Polars over Pandas). There is always a trade-off between performance and ease of development. Otherwise, we would all use assembly.
1:45
Jython is sad. Their stable release is still Python 2.7 based.
I was surprised that Jython still existed. I hadn't heard of it for at least 10 years.
I am heavily involved in Jython application development, largely due to its use in a particular, very popular, commercial SCADA platform. Jython is still a v2.7 language for several reasons, without being useless. Let me summarize why:
* Jython does not have a GIL. (I'm not sure it ever had one.) Its fundamental data types, like list() and dict(), are implemented with thread-safe (concurrent) java classes, so the GIL simply isn't needed. Jython already scales across multiple threads rather well.
* Jython has full access to java native types, including byte arrays and java's native localization. Meaning that Jython can handle all of the internationalization (codecs, locales) that breaks CPython2, without twisting itself into a pretzel. Both jython's str() and unicode() are built on top of java's String, which is already unicode compliant.
* Jython has full access to java classes in general, and can therefore use all of java's native functional programming and asynchronous programming techniques. Yeah, there's no async and await constructs, but that's just syntax.
* With threading and java async, the kinds of I/O bound applications that desperately need CPython3 perform just fine in Jython. In fact, the jitted generation of java bytecode for hot python methods means such hot methods also become eligible for java's jitting to native code.
* While CPython3 also has other improvements and stdlib features that look quite handy, jython can import any java jar's classes as if they were native jython. The extremely large java ecosystem has alternatives for pretty much everything available in CPython "C" extensions. (Though not, typically, with an identical API.)
Finally, the transformation of Jython2.7 to any Jython3.x seems to be quite difficult. The above means there isn't much pressure to move forward. Certainly not like it was for the CPython 2 => 3 transition.
Maybe Python on GraalVM will take the place
Hi Arjan, I don't think the slower single-threaded Python was a problem on your side, but a side effect of the removal of the GIL itself. I don't exactly remember the specifics, but there was a very famous guy who tried removing the GIL before. He concluded that removing it is not really that difficult, but the performance after the removal is quite disappointing due to how Python is built. After searching a bit, I found that the reduction in performance could be due to the exact reasons you mentioned in the beginning: the higher complexity of managing memory, garbage collection, locks and all brings overhead. Even if we are not using threads, it is still managing memory in a very complex way, which kills its performance.
Can't say for sure, but the guy who tried to remove it may have been 🤔...
Larry hastings
I remember that too. It was the main concern about GIL removal.
I think right now Sam Gross is working with the team trying to remove the GIL🤔...
Yes, it is that guy and his project was called python Gilectomy
@@aflous No that is wrong. Gilectomy was attempted by Larry Hastings in 2018, and he experienced severe performance penalty at that time. Sam Gross is a Meta engineer who used a different approach to remove the GIL, and he managed to reduce the performance penalty (not completely without penalty, though) --- search UA-cam for "Sam Gross Python" and you should find out how Sam did it in his presentation to EuroPython 2022.
(Bonus scene on Sam's presentation: Larry Hastings actually attended, asked some questions, and admitted Sam's approach is "very interesting to pursue further")
I would suggest you also compile 3.12 with the same parameters, and run your test again. Then you would have a better speed comparison.
Good suggestion!
3 minutes into the video and you are just repeating the same thing :/
Stretching it for the 10 min mark. Every monetized channel does that
I see this mostly as something that benefits complex applications, that shouldn't have been written in Python in the first place. At my last job this would have been excellent. Of course I'd have had to take better care with the multithreading and synchronization problems but any dev should be able to figure it out.
What would have been even better would be if I'd been allowed to rewrite the application in something more suitable for a large complex system. Something compiled with static typing...
There's always Cython.
so it decreased the performance in the default single threaded and the optimized parallel computation at the same time 👏
it seems only appropriate to run it on parallel heavy workloads as stand alone nodes, then
Excellent news. Just don't remove the sequential mode for adding async 🤘
for me personally, i am not excited about it at all. It simply doesn't matter for me. i usually do subprocesses and combine the results in the end. i've done this for years now and i am happy with it. The GIL was always helpful for me. i know you can speed up your program with a different interpreter and some modules, but this overcomplicates things and i don't like that.
if someone does this and dislikes the limitations of the GIL, yeah, do what you want
IM GAGGED! FINALLY!!!
A lot of my stuff is processing information like: take these 1 million text files containing invoice data, read them, do something with the data, write the result. Multithreading my fetching/reading/writing parts was a massive improvement for me. But cpu bound tasks? Ehhh I don’t do a lot of them, but it would be nice to have that as an option
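A sketch of that fetch/read/write pattern with a thread pool, where `time.sleep` stands in for the blocking disk or network I/O (the file names, delay, and worker count are made up for illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def process_invoice(name):
    # sleep stands in for blocking I/O; the GIL is released while waiting,
    # which is why threading already helps here even before 3.13
    time.sleep(0.1)
    return f"{name}: done"

names = [f"invoice_{i}.txt" for i in range(8)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(process_invoice, names))
elapsed = time.perf_counter() - start

# 8 overlapping 0.1 s waits finish in roughly 0.1 s, not 0.8 s
print(f"{len(results)} files in {elapsed:.2f}s")
```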
Since I started coding in Julia I don't like returning to Python. There are so many ways to make Python faster that one has to try and learn. Then you realize Julia is still faster. And there are so many additional advantages to coding in Julia.
I forgot about Julia! What happened there, there was hype for about a month then nothing. Maybe Go and Rust filled in the gaps?
Haven't watched in a while but I like the backgrounds. Very subtle.
The slower multiprocess and single threaded performance in 3.13 is likely due to the GIL not just being removed but it has probably been replaced by multiple locks in the interpreter for specific things. This has been a longstanding reason not to remove the GIL because experiments in doing so have generally slowed down a lot of use cases.
Also it would have been worthwhile to attempt multiple runs in each scenario with different amounts of work, to separate out setup overhead from processing slowdown. I expect multi-processing to have overhead that is paid one time to get everything running, but minimal performance losses compared to threading once execution starts, assuming shared memory is set up and an alternative approach like socket communication isn't used for multi-processing.
8:00
I would like a video from you comparing the different stdlib libraries we have, like multiprocessing, threading, concurrent. I have seen multiple stackoverflow answers for questions like "fastest way to send 100k HTTP requests?" where people answer using different stdlib libraries, and I am not very clear about the core differences and advantages/disadvantages. The end goal remains the same, but the use of different libraries to achieve the same goal confuses me.
It depends on the use-case:
A few mostly unique tasks that require lots of processing power? multiprocessing
A few mostly unique tasks that don't rely too much on CPU, or do most of their CPU work outside of Python? threading
Lots of tasks that mostly depend on networking, file access, or user input? asyncio
Lots of (usually similar) tasks that require lots of processing power? concurrent (for multiple processes)
Lots of (usually similar) tasks that don't rely too much on CPU, or do most of their CPU work outside of Python? concurrent (for multiple threads)
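The last two cases both go through `concurrent.futures`, which puts threads and processes behind the same interface, so switching between them is mostly a one-line change; roughly:

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def work(x):
    return x * x

if __name__ == "__main__":
    # threads: tasks that wait a lot (I/O) or do their CPU work outside Python
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(work, range(5))))

    # processes: CPU-heavy tasks; same API, each worker gets its own interpreter
    with ProcessPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(work, range(5))))
```

The `__main__` guard matters for the process pool, since child processes re-import the module on spawn-based platforms.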
Python has been thought of as a language to make things "easy". Disabling the GIL greatly increases the complexity of the programs. It brings problems that many Python developers may not even know about. It is a new tool, let's hope it is used well and consciously.
True
Complexity shouldn't change much from this for python developers since they're adding per-object locks in the interpreter. That's actually why this change slowed down single-threaded python - they wanted to remove the GIL without removing the safety it provided.
I think python has by now been stretched in any direction possible. I work in science and there really ARE huge simulations often being done in python with several tweaks to make it fast and scale it. I think most "traditional" SEs would say this is stupid and just use C++, but all in all python is just so handy for "us" that you still do it. Hopefully the feature gets mostly used by people that really know what they are doing.
@@n.w.4940 python great for prototyping and c++ for prod
This 💯
The biggest reason I'm excited about this news is that functional programming is so well suited for automatic concurrency, and without the GIL, Python can handle that multi-processing without any extra work on the part of the programmer.
I say let's dump the GIL and promote functional programming techniques! But that's just my flavor preference 🙂
As a good practice, it is better to write imports as "from a import b" instead of "import a". Assuming the package to import is quite heavy, this can save some time when running the code.
I don't get this joke.
😂 the funny thing is I didn't know about GIL, so I included locking to respective parts of the code, in the same way I was doing it back in the day in c 🤷♂️
Very informative. Thank you
You’re welcome Paul!
For any use-case I've ever had, I'd be happy with a 50% reduction in single-core performance across the board if it meant no GIL. If i need any highly-performant app it's easy to write it in C and import. But using python to rapidly prototype highly parallel workflows would be quite nice!
With the GIL enabled, all of Python's data types, including mutable ones, are guaranteed to keep their internals thread safe. Without the GIL you'd have to manage it yourself, with all those mutexes like in C++. Nobody really wants to mess with that, especially those who use Python for ML without being professional developers.
I'm curious what you mean with this exactly. I hear a lot of strong claims about the GIL like this one. I'm not a GIL expert, but I really don't see how this is true. Or perhaps I just don't see what you mean.
As far as I know, python has never guaranteed to hold the GIL longer than a single bytecode instruction. If your code has "a = 10; b = a + 5" then there are already multiple bytecode instructions, and another thread could have taken the lock in between the "STORE_FAST (a)" and "LOAD_FAST (a)" instructions, even though no other instructions occur between them.
So if you mean user-implemented data types, they are not guaranteed to be thread safe right now anyway. But if you mean builtin datatypes no longer being thread safe, how would a user ever manage this anyway? If builtin datatypes can become invalid without the GIL then that is definitely always a language bug. It would mean that a list with 3 items could report a length of 4, etc. The python team is not going to let that happen on purpose.
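You can see the multiple-instruction point directly with the `dis` module (exact opcode names vary across CPython versions):

```python
import dis

def incr(a):
    a = a + 5
    return a

# Even `a = a + 5` is several instructions: a load, an add, a store.
# The GIL can be handed to another thread between any two of them.
dis.dis(incr)
```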
If you look at PEP-703, they address this by adding per-object locks at the interpreter level. Removing the GIL shouldn't affect the behavior of existing code.
@@upgradeplans777
GIL is the core feature of the Cpython interpreter, which means that your python process (interpreter, that in turn, reads and executes the bytecode) can't take advantage of the efficient thread system present at the kernel level.
If I am NOT using concurrency/parallel primitives/paradigm in my python source code, and the current running thread at OS level (running interpreter) is at bytecode for "a = ....", I CAN GUARANTEE that no other thing WITHIN MY PYTHON NAMESPACE in RAM, would take the lock and execute in between that and "b = a +...".
I can write nothing in pure python to tinker with GIL (have to go through "extensions" path, libs like numpy do this to circumvent GIL).
That's why threading is only meaningful for I/O bound tasks in Python: each Python thread (threading.Thread) is a real OS-level thread (created via pthreads/clone on Linux, not fork), but only one of them can execute bytecode at a time, so CPU work on another Python thread only gets a turn when
- either the Python thread currently holding the GIL is waiting on I/O, or
- the interpreter forces a switch (the GIL is periodically dropped so waiting threads can run).
@@AustinWitherspoon famous last words 😂 "shouldn't" does a lot of lifting in that sentence. The JVM took nearly a decade before it had a stable, as in predictably working, memory and threading model. The Python core devs are way in over their heads if they think it is as easy as adding a few locks here and there. There will be a ton of unexpected side effects of this change once people start using it in production code, and each one will be a nightmare to debug. I am firm in my opinion that GIL-free Python is a terrible idea. In fact Python with the GIL has been so successful not least due to the fact that it enforced a well designed approach to parallel code (either using mp or extensions), so that the hard stuff was left to experts, while the core language is easy to use. We'll now start seeing people blame Python for being too complex when in fact it is not the language but the problem-solving approach which is flawed (or the way it is programmed). Modern CS has mostly agreed that the best way to build scalable systems is by having no shared state and communicating solely by messages between processes. The whole world is moving in that direction, except Python now suddenly wants to be the cool kid on the block by going back at least a decade 😢 /rant
@@AustinWitherspoon so is this like automatically doing a lock before assigning a variable (and then releasing it immediately afterwards)? I was thinking it'd be easy enough to make a class that handles that, but it will be baked into the interpreter already instead?
Arjan, you've changed your keychron keyboard to a NuPhy Halo96 ?
I am an openstack user and since it's all written in Python, I really hope it will bring some nice performance improvements and lower memory utilization
It won't. The GIL is not limiting openstack.
@@miraculixxs I hope it will save some cpu and memory resources as the inter process communication is cheaper!
Thank you, I almost made a very big mistake D:
I think a huge number of Python "performance issues" are caused by inefficient programming e.g. nested for loops. Using techniques like vectorization can give you 10x to 100x speed up in a single process / thread.
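Vectorization usually means NumPy, but the same principle shows up with plain builtins: move the per-element work out of Python-level bytecode and into C. A dependency-free sketch of the idea:

```python
def slow_total(n):
    # explicit Python-level loop: interpreter dispatch on every iteration
    total = 0
    for i in range(n):
        total += i
    return total

def fast_total(n):
    # sum() consumes the range in C, skipping per-iteration bytecode
    return sum(range(n))

# identical results; the builtin version is typically several times faster
assert slow_total(10_000) == fast_total(10_000) == 49_995_000
```

The exact speedup depends on the workload; NumPy-style vectorization on numeric arrays is where the 10x-100x numbers come from.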
From your video, it looks like the existing threading module will be the interface used to access GIL-free threading? Is that right? If so, that would be wonderful: I wouldn't have to change my multi-threaded code (which assumes true threading).
Python needs a spec freeze and a totally new runtime, maybe in Rust? And either incorporation of Stackless features, with an honest effort from IronPython and Jython to catch up to the official specification, or dumping those projects.
I think event-based concurrency avoids a lot of multi-threading issues, and fortunately Python already has a library: hello asyncio!
It didn't really make much of an improvement over multiprocessing. I'm an old school Python programmer and the rule was always that if you need it to run fast, write it in C or some other compiled language and run SWIG over it. Don't use Python in your inner loops if it's compute intensive.
Unless you need to share a lot of data between processes
Exactly. And that's why Cython exists.
To be fair, this was a trivial application. I expect the real usefulness being when you need to compute something that has a MASSIVE amount of shared data. In multiprocessing - and the Windows implementation especially, if you have to use that - the message passing between processes is a horrible bottleneck. Multithreading with no GIL would solve that, but slowing down _everything else_, possibly by as much as 25%, seems a steep price to pay for that.
Guys, I'm so sorry! I just realized the link to download the spreadsheet was broken. I just fixed it. Thanks for your patience!
Title: how much faster ? Video: we don't know. Hopefully a lot !
I don't think the GIL is to blame. Python is glue code; most of the hard-working code is actually written in C/C++ or another high-performance language, so there is no great need for Python to remove the GIL. By keeping Python safe with the GIL, it helps prevent many kinds of nasty errors, from race conditions to crashes. If you need performance and scalability, go write your solution in a language other than Python.
What's your take on projects like Codon which are effectively doing the same thing, but growing the support from the language and compilation side rather than from the feature side?
removing the gil still requires management overhead, which is around 10%. there is a whole talk on youtube by the facebook dev that is working on removing the gil, since facebook obviously needs to handle a lot of concurrent users.
That "Facebook dev" is actually highly involved in Python 3.13.
finally i understand why jupyter notebook can't run multiple cells at the same time
Pov: retro before product release
- Why our service is much slower?
- We disabled GIL to make multithreaded code faster, but it slows down singlethreaded code
- Then make singlethreaded act as multithreaded
- Lol, wat? :)
IMO just wait for Mojo. It promises proper hardware programming like C.
But if they remove the GIL, what will I use for my I/O bound tasks? I can't use Asyncio for this stuff unless the external library I work with has an Asyncio version too.
I think the language developers will keep the GIL enabled by default. That is, you'll be able to disable it yourself, whenever you need to.
i don't know why people are pushing gilectomy so much. i really don't see a benefit for it from python's perspective. i think the complexity that it introduces is not worth it. you can simply use the multiprocessing package, or move to a faster language if that's not enough.
The GIL was introduced when a lot of compute was single-core. Architecture has moved on a lot since 1994, it makes sense for Python to be able take advantage of the progress made. What are the complexities that make this not worthwhile?
@@bn_ln python has pretty fast data structures and operations for such a high level language thanks to the GIL; there is no fine-grained locking or checking that would otherwise slow it down. also, if it introduces any changes that require care or change assumptions developers previously had about the python interpreter, it is not worth it. as i said, you can still take advantage of multiple cores by simply using multiprocessing or spawning multiple instances of your program. i just believe that a normal python process should be single threaded, that's all. the process spawning overhead is not really an issue if you use pooling. i think it just requires a shift in perspective, and the end result would be the same without the hassle.
What complexity does it introduce?
I hope PiThon 3.14 is released on Pi day!
Arjan, share please your compile settings. I guess, the lameness came from the build options.
Just when I finally learned about it, they go ahead and remove it
"Nooo! You can't just compromise thread safety!"
"Hehe, python go brrrrrrr."
Rust go brrrr.
well, if that happens, welcome to the world of real programming !!
The result is at 8:25
It is not faster but actually slower
it's not slower since you can't compare performance of stable with beta
I don't care about the GIL anymore. If I need speed I just use multiprocessing or write a Cython extension.
Python Devs will probably give two flavors of Python now: one with the usual performance with the Gil and the new one without Gil.
need speed? on pyphon? 🤣😂😂😂😂
"If I need speed I just use a different lauange."
@@eadweard. To me, Python is just a modern Visual Basic... but slow
@@cheblin Why are you telling me this?
@@eadweard. Kinda, but the result is directly usable on Python, simpler to maintain and way faster to develop than a similar thing written in C.
It's much more straightforward to manually manage synchronizations to me🤣
the free threaded version is slower because the optimizations done in python3.11 aren't threadsafe, so when you use a free threaded version of python, these optimizations are disabled. Hence you get similar performance to python3.10
Save the GIL 😂. My sql alchemy script works well with Alembic as well as my dynamic NT module ( which pylance complained about ) . Stability is paramount 😢
This is all well and good but as long as there's no way to *easily* package a python-script as an executable program that doesn't need to include its own interpreter, I think Python is unsuitable to widely distribute an app.
Python is not aimed at wide app distribution, there are other, more capable languages and frameworks for that. Python is very suitable for writing automations, on-machine data processing and analysis, other scientific computing and getting started quickly. For these tasks, it’s useful to know how Python’s performance changes without the GIL.
@@ArjanCodes Yeah, unfortunately, that's the case. I've written a game in Python but the one thing that prevented me from uploading it anywhere in a usable form is exactly that lack of a small, easy to produce package format. That's why I'm currently investigating Nim.
I struggled a lot when having to load 100gb dataset in memory and having no way to share memory between processes thus effectively limiting to a single core 😢
The true holy grail would be if something like Rust's unsafe sections existed in Python
So, we are all going to leave the Gil Guild sometime in the future.
You said that python can't be multi threaded without removing the GIL, but it can be. If your threads are cpu bound then you're correct, but much of what we do is IO bound in which case we can get huge performance boosts from multithreading in python even with the GIL. Anyone who doesn't understand the difference, I suggest searching for and reading the old blog post named "There is no thread".
This. People are wayyyy overestimating the benefits of GIL free multithreading.
Yes, the benefit of multithreading would show specifically for CPU-bound tasks which need to share memory. That's about it. CPU-bound with little to no shared memory, multiprocessing is good enough for. IO-bound, as you say, even current multithreading with the GIL is often good enough.
And if you really do need this, as you would in vector and ML libraries (that are currently written in C or C++), you will soon be able to use Mojo directly with your CPython code and reap the benefits of the latest cutting-edge technology in not only thread and type safety, but the use of GPUs and TPUs.
@@HaganeNoGijutsushi for I/O tasks it is much better to use Asyncio. It even works 10-15% faster than multithreaded I/O tasks. Multiprocessing is too buggy, and it's hard to share memory between processes. Thus no-GIL Python will be a very useful option.
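For comparison, here is the asyncio flavor of overlapping I/O waits on a single thread, with `asyncio.sleep` standing in for real network calls:

```python
import asyncio
import time

async def fetch(i):
    await asyncio.sleep(0.1)  # stands in for a network request
    return i

async def main():
    # all ten waits overlap on one thread via the event loop
    return await asyncio.gather(*(fetch(i) for i in range(10)))

start = time.perf_counter()
results = asyncio.run(main())
print(len(results), f"{time.perf_counter() - start:.2f}s")  # ~0.1 s, not 1 s
```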
The problem is that applying threading or multiprocessing in Python is quite annoying!
Shouldn't this move to a no-GIL default deserve the Python 4 version? That's a big change that would introduce backwards incompatibilities and would need migrations like we did from 2 to 3, and so on.
This is truly one of the most important steps forward for Python.
i must admit how approacheable you made this GIL thing look, i couldn't hope for a better explanation
One minor release and we get
PITHON 3.14
I can’t imagine how to deal with arcs and mutexes etc in Python while keeping Python as simple as it is today. I assume the added complexity the true multi threading adds to Python defeats the purpose of using Python. If you’re really dependent on performance, just write it in Rust.
People writing Python programs do not need to change anything. They can keep programming as is.
Only C extension programmers need to be more careful. Or just indicate which parts of their extensions are not thread-safe.
There's a handful of comments here about how removing the GIL makes things more dangerous - but if you read PEP-703, they solve race condition issues by using per-object locks instead of a single global lock. As far as I can tell from reading it, there isn't much concern of "python getting harder" because of this. It looks like the experience of using python should be relatively unchanged.
People who make compiled extensions that interact with python at the C level WILL need to change stuff though! But the pure-python people will be fine.
Don't you already have to deal with releasing the gil if you write a C extension? It will just become more straightforward and a standard practice instead of an optional thing.
❤
python videogames finally real?
Love this 🤭🤭🤭
GIL, that is my last name 😅
Get rid of it.
In Spain there was a businessman called Jesús Gil y Gil. He eventually became mayor of a town called Marbella. He ran for election with a party called, wait for it, GIL.
next ,python jit
It’s called PyPy
Overall: for easy, low-stakes stuff Python is fine. For real serious stuff you should not use Python.
Just use pypy3
Audio is out of sync
the best multithreaded Python is the one that imports a package written in Rust (Arjan has a video on that: ua-cam.com/video/lyG6AKzu4ew/v-deo.html)
10 mins video only has a 1 min of info
95% of threaded programs written in python are such that most, if not all, the threads end up being I/O bound. There is absolutely no point for such a program to be threaded. Use asyncio.
"Absolutely no point" - I disagree. Depending on the code, it can be much easier to write with Python threading than to write correct asyncio code, and if your I/O library is not asyncio-ready, you will have to resort to calling the I/O in threadpool workers anyway. (That's when your "async ready" I/O library doesn't already do that under the hood.) So, yes, _most_ of the time there will be no point in GIL-bound Python using threading instead of asyncio, but I wouldn't say there is "no point" in it.
There are a lot of people who think JavaScript (of all languages) is better than Python because it is faster.
That is such a load of cr@p.
As the UNIX gurus knew, "Programmer time is expensive. Computer time is cheap." So I will consider a language that makes me more productive as a programmer far superior compared to a language that runs my code a bit faster.
Note that this maxim is from the seventies. Compared to back then, computer time has become many orders of magnitude cheaper, and programmer time quite a bit more expensive. And while Python is from a later age and not quite a scripting language, scripting languages were invented by the UNIX people for exactly this reason.
Very good points!
You should talk a bit more technically. How is the source code changed? Do we now use all thread-local variables, and what about the thread enter/leave calls? What about extensions? If we don't know the implementation, real programmers can't judge whether it's any good.
It's not as simple as that. The GIL is useful; if you want something without the GIL, use C. Implement it in C, use it from Python.
aren't you the guy that was predicting AI will ruin developers the other month
It's doing it haha
going from slow to a bit less slow. yey... I guess
Looks like you've lost some weight, nice!
Python 3.13 is juiced to the GILs
Sadly it's not magical; we need magical upgrades, not just an "it just works" kind of upgrade.
Incoherent.
So much talk about the gil, and the showcase shows that without it, it's the same if not worse.
:)))))))))))))))))) out of 10
π-thon 3.14 😂🤣
Had been waiting for this so long that I completely switched to another language in the meantime, so now this feature is completely irrelevant.
It depends on what you do. In ML/DS, Python is mainly the (easy and accessible) interface to several other libraries written in low-level languages that are not affected by the GIL.
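For example (a minimal sketch): CPython's own `hashlib` releases the GIL while hashing large buffers - the same trick NumPy and friends use in their C kernels - so even on a GIL-ful build, threads doing this kind of CPU-heavy work can genuinely run in parallel:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# hashlib drops the GIL while hashing buffers bigger than a couple of
# KiB, so these threads are not serialized the way pure-Python
# number crunching would be.
def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()

chunks = [bytes([i]) * 1_000_000 for i in range(4)]  # four ~1 MB buffers

with ThreadPoolExecutor(max_workers=4) as pool:
    digests = list(pool.map(digest, chunks))

print(len(digests))  # → 4
```

This is why GIL removal matters much less for ML/DS workloads: the heavy lifting already happens outside the interpreter, with the GIL released.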
I'm afraid this is not a great direction of changes. Even multithreaded, Python won't be a performant language - it never was designed to be one.
Python is a "smart glue" to orchestrate other software and it works great that way already.
Also, if somebody really wants or needs to write super fast, performant code and it must be written with Python syntax - that's what Mojo is for.
Well my ML data augmentation code runs in 1 hour with the GIL - and in 1 minute without it. At that point I simply cease to care whether Python is a "performant language".
@@eadweard. Why not go half a step further and run the same code as Mojo? You may find the execution time going down to a fraction of a second :)
@@pwinowski I don't think the client would like it very much if I rewrote everything in another language. Come to think of it, I don't think I would either.
@@eadweard. I called it going "a half step further", not even a "step further", because there is a chance you wouldn't need to rewrite anything at all. Mojo is Python, syntactically, and it gives a performance boost out of the box. You can further improve performance, using Mojo-specific features, but you don't have to. That's the whole purpose of Mojo.
@@pwinowski Much the same has always been true of Cython. But the question is why would I bother with the additional dependency for "a chance" that I could save a minute each week or two?
i’m disappointed you decided it was okay to publish this video when you clearly have little understanding of what the gil does, how it works and what removing it means in terms of actual changes. it throws shade on all your other work. this instead just confuses people who don’t understand what mutexes are and cements them in their ignorance, as evidenced by the comments. we expect more from you Arjan!
okay short version, simplifying here: at the interpreter level, gil used to be one giant lock, any code that changed interpreter state would just grab it. this means that even if you created two threads from python code, both threads could not be executing interpreter code at the same time; so it was mostly useful for IO, your thread is just waiting there blocked on io anyway so gil usually wasn’t a problem.
without gil, to keep code thread-safe you now need a lot of small locks instead, making parallel execution possible. unfortunately it also means you need to acquire these locks even if you only have a single thread from your python application's perspective, and each acquisition adds overhead (a contended lock even costs a syscall). this explains why your single-threaded example was so much slower without the gil. there is no other major downside that doesn't get talked about: without gil, cpython code will be littered with lock handling, making further development harder.
reality is that the vast majority of code out there is either single threaded or io bound, and great workarounds exist for most exceptions. this is why this discussion is so incredibly frustrating - it’s usually people who don’t understand the nitty-gritty loudly demanding python gets rid of gil, because they hope it will somehow give them performance..
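to illustrate the locking point with a toy sketch: the gil only makes individual interpreter operations safe, so a multi-step update like `counter += 1` still needs its own lock - exactly the kind of fine-grained locking free-threaded cpython now has to do internally:

```python
import threading

counter = 0
lock = threading.Lock()

def bump_unlocked(n: int) -> None:
    global counter
    for _ in range(n):
        # read-modify-write spans several bytecode operations, so the
        # gil does NOT make it atomic; a thread switch can land mid-update
        counter += 1

def bump_locked(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:  # an explicit small lock makes the whole update atomic
            counter += 1

def run(worker, n=100_000, num_threads=4) -> int:
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run(bump_locked))    # → 400000, always
print(run(bump_unlocked))  # may be less, depending on where the
                           # interpreter allows thread switches
```

whether the unlocked version actually loses updates depends on the interpreter version and where it checks for thread switches, which is precisely the point: the gil never guaranteed your multi-statement code was race-free.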
@@EugeneYunak "there is no other major downside" - did you mean one?
Nice explanation
@@TheStuartstardust of course, thank you
Ah the mandatory snobby pythonista
It's going to make GUIs much smoother
Shouldn't make any difference, actually. Unless you are using a pure-Python GUI framework which keeps the CPU busy in a loop - which is _very_ unlikely.
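Right - with or without the GIL, the usual way to keep a GUI smooth is the same: push blocking work onto a worker thread and hand results back through a queue. A framework-agnostic sketch (names are illustrative; in tkinter you'd poll the queue from `after()` instead of sleeping):

```python
import queue
import threading
import time

results: queue.Queue = queue.Queue()

def slow_job() -> None:
    """Blocking work runs off the main ('UI') thread."""
    time.sleep(0.05)  # stand-in for a slow computation or I/O call
    results.put("done")

# In a real GUI you'd start this from, say, a button handler...
threading.Thread(target=slow_job, daemon=True).start()

# ...and poll the queue from the event loop so the UI thread never blocks.
while True:
    try:
        msg = results.get_nowait()
        break
    except queue.Empty:
        time.sleep(0.01)  # the event loop would process UI events here

print(msg)  # → done
```

Since the UI thread only ever does a non-blocking `get_nowait`, it stays responsive regardless of whether the worker holds the GIL while it runs.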