1. 00:25: Manual string formatting
2. 00:36: Manually closing a file
3. 00:53: Using "try:" and "finally:" instead of a context manager
4. 01:07: Using a bare "except:" clause
5. 01:32: Thinking that "^" in python means exponentiation (it's a bitwise XOR, exponentiation is "a ** b")
6. 01:42: Use of default mutable arguments
7. 02:07: Never using comprehensions, or only using list comprehensions
8. 02:24: ALWAYS using comprehensions
9. 02:50: Checking for a type using "=="
10. 03:23: Using "==" to check for None, True, or False
11. 03:37: Using an "if bool(x):" or "if len(x)" check
12. 03:51: Using the "range(len(a))" idiom
13. 04:30: Looping over the keys of a dictionary
14. 04:48: Not knowing about the dictionary items methods
15. 05:01: Not using tuple unpacking
16. 05:11: Creating your own index counter variable
17. 05:21: Using "time.time()" to time things
18. 05:43: Littering your code with print statements instead of using the logging module
19. 05:59: Using "shell = True" on any function in the subprocess library
20. 06:13: Doing maths or any kind of data analysis in Python
21. 06:23: Using "import *" outside of an interactive session
22. 06:33: Depending on a specific directory structure for your project
23. 07:07: The common misconception that Python is not compiled
24. 07:34: Not following PEP 8
25. 07:56: Doing anything to do with Python 2
As an ex-lecturer (software engineering) I was reasonably happy with most of these. The "if len(x) != 0:" example is an exception though, as it explicitly mentions the important property (length). I find this more readable and I've had student code using "if x:" cause problems when they have passed the wrong type of data in and it's not clear what condition they are trying to test. In general, a well-read programmer specialising in one language will be fine with compact styles of expression. A non-specialist reader (maintainer, manager, QC checker) can have real problems with some of the denser language-specific constructs. When I was a programmer I had to educate some members of my coding team on complex regular expressions and the trickier bits of Oracle's flavour of SQL. They just couldn't follow the code - they were not front-line programmers and it was above their more generalist knowledge. Sometimes reducing eloquence for readability is a good call.
Yes. Even code you've written yourself can be hard to decipher a couple of years later and you often don't have the luxury of sitting down and untangling it all because you have other crap to be doing. These days, I write code with the intent that it be at least not difficult to understand and somewhat easy to fix or modify. Being concise and economical is for when you have 1k or less of memory and slow CPUs (though it is definitely possible to go too far in the other direction with that).
As Bjarne Stroustrup often says, the compiler doesn't read comments -- and neither do I :) Using a less general construct that more explicitly delineates your intent and saves you from writing a comment is a definite positive in my view.
I personally agree that “if len(x) == 0” is more informative and readable, but Python’s official style guide PEP8 recommends using the fact that empty sequences evaluate to False and writing “if not x”
@rrr-mi9kv IMO the purpose of any style guide is to make code more readable by reducing the amount of time you need to get used to different conventions. There's little value in the style consistency beyond that, which means if violating a style guide (which, by its nature, is "dumb" in that it can't account for every conceivable situation) improves readability, you should do it.
One of the things that I don’t see very often is using underscores when writing big numbers. Example: 1000000 vs 1_000_000. The former is much harder to get at first sight than the latter
I would add that for larger powers of 10, it's just simpler to use exponentiation as in 10**9 (or even 1e9, though this gives you a float) for a billion.
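A minimal sketch of the literals discussed in the two comments above, assuming Python 3.6 or newer (where underscores in numeric literals are allowed). The variable names are made up for illustration.

```python
# Underscores in numeric literals are purely visual (Python 3.6+).
million = 1_000_000
assert million == 1000000

# Powers of ten: ** keeps an int, while the e-notation literal is a float.
billion_int = 10**9
billion_float = 1e9

print(type(billion_int).__name__, type(billion_float).__name__)  # int float
print(billion_int == billion_float)  # True: equal in value, different in type

# Format specs can print the separators back out.
print(f"{million:_}")  # 1_000_000
print(f"{million:,}")  # 1,000,000
```

The int/float difference matters mostly when the value ends up as an index, a range() bound, or anything else that insists on an integer.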
watching this video really shows how far you can come in only 4 years of work. When I started learning to code I was way too worried about memorizing these things, when in the end all I needed was more coding experience. I'm glad I found some fun projects to work on while basically ddosing stack overflow with questions until the best way to go about things just got ingrained in my brain. We got this bros. My advice is to just find a project that looks fun and make it to the best of your ability, constantly pushing your ability little by little. The rest comes naturally. :)
@bth1279 I recommend building a web scraper. There are a lot of tutorials on here, don't overthink it, just pick the first one you see and finish it. That's the important part
I just started a couple of weeks ago and every now and then I get a bit down because I feel like I have to copy everything from other sources of information like tutorials and stack overflow. I’m self-learning so of course I do.. how would I know these things? But it still feels like I’m an impostor and might just be too stupid for it. Like I had to watch a video on implementing jumping in my first Mario-like pygame project and when I looked at how it should be done I felt so down because I was way off from solving a problem that other programmers make seem effortless and straightforward. I guess I should not expect much after a couple of weeks but it still feels bad 😂 Will keep trying tho as it’s fun!
2 things on this list that I need to do from now on: 1. replace time.time() with time.perf_counter() for measuring program execution time 2. learn how to install my modules as a package
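For the first item, a rough sketch of timing with time.perf_counter(). The busy_work function is a hypothetical stand-in for whatever you actually want to measure.

```python
import time


def busy_work(n):
    """Hypothetical workload, only here so there is something to time."""
    return sum(i * i for i in range(n))


# perf_counter() is a monotonic, high-resolution clock meant for intervals.
# Only the difference between two readings is meaningful.
start = time.perf_counter()
result = busy_work(100_000)
elapsed = time.perf_counter() - start

print(f"busy_work took {elapsed:.6f} s")
```

For micro-benchmarks the standard library's timeit module is usually the better tool, since it repeats the call many times for you.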
1. 00:26 - Don't concatenate strings, use f"{string}"
2. 00:36 - Don't use f.close(), use with open
3. 00:54 - Don't use finally as context manager, use with
4. 01:08 - Don't use bare except, use except Exception or except ValueError
5. 01:33 - ^ is bitwise XOR, not exponentiation
6. 01:42 - Argument defaults are defined when the function is defined, not when it's run! Don't use l=[] as empty list, use l=None
7. 02:07 - Use comprehensions
8. 02:24 - Don't overuse comprehensions!
9. 02:49 - Using == to check a variable's type of a Class is tricky!
10. 03:22 - Instead of ==, use is
11. 03:39 - Using booleans with if, Don't act like a noob!
12. 03:51 - Using the range length inside the loop, instead, learn to use enumerate and zip
13. 04:31 - Looping over the keys() of a dictionary
14. 04:49 - Use items() to get key, value pairs of a dictionary
15. 05:01 - Use tuple unpacking to get all its values
16. 05:11 - Creating your own counter variable (enumerate)
17. 05:21 - Don't use time.time() to time your code, use time.perf_counter()
18. 05:43 - Instead of using print, use logging
19. 06:00 - Don't use shell = True
20. 06:14 - Learn to use numpy for math operations
21. 06:24 - Don't use import *
22. 06:34 - Importing files from other directories are tricky! Learn to Package your code & install it into your current environment
23. 07:08 - Python codes are compiled! (pyc, pycache)
24. 07:35 - Use pep8 to avoid nagging
25. 07:57 - Python 2-3, Notes about range() & keys()
After many years of python, I still didn't know that socket had a context manager, and also that range had a custom __contains__. Great vid! Also great pacing
That's because you'll never need to check if a number is in a range (I can't imagine even one case to use it in). I have 7 years of python programming experience and all this time I knew "x in range" feature. How many times I used it? ZERO
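For anyone wondering what the custom __contains__ buys you, here is a small sketch, assuming CPython 3 and plain int values. The port-number check is just one hypothetical use case.

```python
# A range is a lazy sequence: this one is never materialised as a list.
big = range(0, 10**12, 3)

# Membership for ints is computed arithmetically, not by iterating.
hit = 999_999_999_999 in big   # multiple of 3 and below the stop value
miss = 10 in big               # not a multiple of 3

print(hit, miss)  # True False


def is_unprivileged_port(port):
    """Hypothetical validation helper: reads like the spec it encodes."""
    return port in range(1024, 65536)


print(is_unprivileged_port(8080), is_unprivileged_port(80))  # True False
```

Whether this beats a chained comparison like `1024 <= port < 65536` is mostly a matter of taste. The range version also respects the step, which the comparison does not.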
Some of these "mistakes" I make because I write in 5 programming languages due to legacy. Sometimes it is difficult to remember the "proper" shortcuts or code style of each of those languages. Still, I don't consider myself a noob programmer. And when switching from one version to the next, it is sometimes hard to shake some of the habits. That's why I like to watch and listen to these kinds of videos in my free time.
Tell me about it. I will work with JavaScript for 2 or 3 weeks then onto maintaining some code written in Python 2, then I will do some other job function for a few months, a bit of bash scripting, then to Python 3. I have the general syntax of 5 or 6 languages bouncing around in my head. Every project I go back to, I spend a few hours picking up the nuances of a particular language again.
This is probably the most immediately useful (to me) video I've seen on YouTube. I watched this two days ago and have since gone through two scripts I've been working on and edited them to incorporate 7 of these suggestions (yes I counted). There's at least like 3 other nooby mistakes I think I've made in the past but I'm not going back to check lol, so probably will just reference the video in the future. This also made me watch the video about looping in python, which motivated me to rework a loop in one of the scripts, and the 21 other nooby mistakes video, which I think is much less applicable to my current habits but may also be relevant later. Thanks for making this video and the whole channel!
All excellent tips, except one: "if x is True" is proscribed by PEP 8. Doing this violates LSP, which you had just mentioned. This is because a Boolean subclass should work just as well. Just write "if x"; never write "if x is True".
In dynamic languages, people keep making silly "overloaded" APIs where
True => {a_key: True}
False => {a_key: False}
but you can also submit a whole object
{other: 8} => {a_key: False, other: 8}
{a_key: True, other: 8} => {a_key: True, other: 8}
to supply more advanced options for the same argument
Number 26: Usage of single character variable names. Single character variable names lead to confusion and bugs in your code as it's hard to discern beyond the definition point of a variable what it is being used for. Opt for a more descriptive name as, after all, variable names are free. (As an SDET by trade, this is my biggest pet peeve)
i honestly think people who only use 1-2 letter variable names are just flexing and don't want anyone to be able to read or manage their code down the road. I've straight up just deleted an old script and rewritten it just to avoid dealing with understanding the previous coder's impatience with adding 3-4 extra letters.
@itellyouforfree7238 First of all: Not a descriptive name, refactor. Second of all: Sorry not sorry. Quality goes beyond just finding bugs. It also is about preventing bugs and single character variables are a gateway for bugs.
@willpeterson3943 And for when people do want it to be a reference to a mutable global, the default expression can, just, well, reference that named mutable global. I can't imagine how this happened. This global they are introducing is literally anonymous. What could be worse?
The parameters passed to a function are outside the scope of the function, even if they're the default parameters. If you want the default to be within the scope of the definition then you have to put it inside the body of the definition. It's a consistent and reasonable choice. Moreover, if you're using a mutable data type, it implies you want it to persist. If not, use an immutable type.
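To make the behaviour being argued about concrete, here is a minimal sketch. Both function names are hypothetical. The point is only that the default object is created once, at definition time.

```python
def append_shared(item, bucket=[]):
    """The default list is built once, when `def` runs, and then reused."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """None sentinel: a new list is created inside the body on each call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second, second)      # True [1, 2]
print(append_shared.__defaults__)   # ([1, 2],)

fresh_a = append_fresh(1)
fresh_b = append_fresh(2)
print(fresh_a, fresh_b)             # [1] [2]
```

The persistent default lives on the function object itself, which is why `__defaults__` shows the accumulated list.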
3:45 This advice is *questionable*. Sometimes it's good to use "if len(x)" just to clarify that x is a list/set/tuple, or "if x > 0" just to clarify that x is a number. Not always, obviously, but when you're so deep that you can't verify what your variable is supposed to be, it's good to have something that explicitly shows what the variable means.
one could argue that if "you're too deep that you can't verify what your variable is supposed to be", your code is already unreadable. However, I also think that "if len(x)" is not that bad. Keep in mind that "if x > 0" actually has a different (and in some cases more useful) meaning than "if x", at least for negative x.
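A small sketch of the difference pointed out above between "if x" and "if x > 0" for numbers. The describe helper is hypothetical and exists only to put the two checks side by side.

```python
def describe(x):
    """Return what each style of check would decide for a number."""
    return {"truthy": bool(x), "positive": x > 0}


negative = describe(-1)
zero = describe(0)
positive = describe(5)

print(negative)  # {'truthy': True, 'positive': False}
print(zero)      # {'truthy': False, 'positive': False}
print(positive)  # {'truthy': True, 'positive': True}
```

So the two checks only disagree for negative values, which is exactly the case where you need to decide which one you meant.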
I think it’s called type annotations: the something that explicitly shows what the variable means. All your assumptions with these checks may fail if x implements the appropriate magic methods for these operations.
@Jazzy Jones it's because an empty array is a thing that exists and is defined, so it can't be false. If anything, it's the [] is False that is weird in python. And none of them are treated as None, obviously. Nothing is None except for None.
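A sketch of what "implements the appropriate magic methods" can look like in practice. Both classes are hypothetical. The annotation on pending() documents intent but is not enforced at runtime.

```python
class Queue:
    """Hypothetical container: with no __bool__, truthiness uses __len__."""

    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


class AlwaysOn(Queue):
    """Overrides __bool__, so `if q:` and `if len(q):` now disagree."""

    def __bool__(self):
        return True


def pending(q: Queue) -> bool:
    """The annotation states the expected type; the check states the intent."""
    return len(q) > 0


empty = Queue()
odd = AlwaysOn()

print(bool(empty), len(empty))  # False 0
print(bool(odd), len(odd))      # True 0
print(pending(odd))             # False
```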
@Jazzy Jones Here's the result of evaluating an empty array: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty. #pycharm, by the way :D
3:37 Honestly, writing "if x" to check if "x" is a non-empty list hurts readability quite a lot. I always like to write "if len(x) > 0", which reads like plain English (#pycharm by the way)
Of course, but the _real_ problem is that you're using x as a name for a container. _That_ is what you should change, and then "if errors:" or "if tasks_remaining:" should become _a lot_ more natural.
@Jeyekomon Yeah sure, but at this point wouldn't it be better to have a boolean variable called "errors_occurred" (defined as len(errors) > 0) and directly check that? In general, using a non-boolean expression in an if statement is fundamentally wrong in my opinion
@francescocariaggi1145 Opinions are nice, but have you ever coded like that? My opinions are based on decades of experience, working on systems of various complexity -- and they align with the opinions of people who have written PEP 8. I've seen, and written, "if errors" or analogous constructs, many times. I've never seen your "errors_occurred" approach. I can even guess why: because it's very easy for such a separate piece of knowledge to get out of sync with the knowledge already present in "errors". Having a single point of truth is a powerful concept, especially in complicated systems. About your "fundamentally wrong", not many such ivory-tower opinions survive the collision with the real world. We do think of various operations polymorphically, and denying that in the name of some simplistic type alignment is not useful. It is the same idea as "using a non-iterator in for-loops is fundamentally wrong", or "using a float in floor division is fundamentally wrong", yet we really want to write (and think) about "for error in errors", not "for error in iter(errors)", and "n_frames // rate" instead of "n_frames // int(rate)". Practicality beats purity on many occasions. It's exactly the same with "if errors" instead of "if bool(errors)".
Can’t believe you are naming variables the lowercase “L”. That is one of the things that was stamped into my brain when learning programming. Now I feel weird if I use it lol. Anyways, super helpful video. Never knew about enumerate(). This is gonna be a total lifesaver!
@gustanoid shortened var names make you memorize stuff. It's fine if it's something important that will be seen a lot, but for most purposes variables should have some sort of description of themselves. You don't want to have to remember a new encyclopedia for each member of your team when doing code reviews.
@kidneybean5688 I'd like to give an example to extend your argument. Say you have a list named illustrations and want to use the map function to map it into another list. With lambdas/anonymous functions/closures/whatever you call them, using the name illustration as the parameter makes the code too long and hard to comprehend. It is clear that you are iterating over illustrations, so a simple single letter like x or a comes in handy:
names = map(lambda illustration: illustration.name, illustrations)
names = map(lambda x: x.name, illustrations)
This also applies to other languages: don't duplicate the obvious naming. Adding a player_money field to a Player class is usually pointless; don't name things over-explicitly unless you need to.
@kulkalkul I've been writing a lot of code in Elm lately (great language), which definitely loves its newlines, so I can get a little more spacey with my names. My solution to this in a language that takes up more horizontal space would either be to shorten "illustration" to a mnemonic like "ill", "iln", "ilstn", or to make some newlines w/in the function call.
@kidneybean5688 I definitely agree. I also started to use Rust for almost a year and libraries in Rust tend to use long names for sure, as being able to vertically space out code helps greatly. Though, I still prefer shorter names. It wasn't fun to use InteractionApplicationCommandCallbackDataFlags tbh :D
Had this pop up in my recommended, figured I'd watch it for fun. It is now several hours later and I've incorporated #1(f-strings), #16(index counters), and #17(time.perf_counter()) into all of the scripts I use for work.
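For reference, a compact sketch combining two of the tips mentioned above (f-strings and letting enumerate handle the counter), with zip and tuple unpacking thrown in. The sample data is made up.

```python
names = ["ada", "grace", "linus"]
scores = [91, 88, 79]

# Instead of range(len(names)) or a hand-rolled counter variable:
lines = []
for rank, (name, score) in enumerate(zip(names, scores), start=1):
    # f-string instead of manual concatenation or str() calls
    lines.append(f"{rank}. {name.title()}: {score}")

print("\n".join(lines))
# 1. Ada: 91
# 2. Grace: 88
# 3. Linus: 79
```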
Fun story: I was hired as a PHP developer, but soon was tasked to take over a Python project from a freelancer that was leaving in just a couple of months. Lambdas, list comprehensions, confusing yet powerful multiple inheritance, wrapper functions and attributes, constructors and initializers, bytecode compilation, tuples, dictionaries. I quickly fell in love with the language. Now, I'm back dealing with legacy PHP code at another company and miss Python dearly. Still need to figure out a pet project, so I could finally scratch my nostalgia itch. #pycharm
I'd love to hear more detailed explanation for 3:23. Since it's for noobs, people won't get why they shouldn't use == instead of is (or vice versa). AFAIK there are at least two reasons (1. speed--since "is" checks reference while "==" checks value, 2. custom equality definition), but I'm not sure if there are more.
The reason for using "is" with None instead of "==" is ostensibly speed, but the speed increase would be totally unnoticeable under normal circumstances. Another reason is that None is a singleton, so using "is" acknowledges it as such. This is one of those things that in actuality makes basically no difference, but people will nag at you anyway, so it tends to be noobs that still do it.
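On the "custom equality definition" reason raised in the question, a minimal sketch. The class is hypothetical and deliberately badly behaved, to show that == can be overridden while identity cannot.

```python
class AgreesWithEverything:
    """Hypothetical class with a deliberately sloppy __eq__."""

    def __eq__(self, other):
        return True


obj = AgreesWithEverything()

equal_to_none = (obj == None)   # noqa: E711 - dispatches to __eq__
identical_to_none = obj is None  # identity check, cannot be customised

print(equal_to_none)      # True
print(identical_to_none)  # False
```

Real-world versions of this are less silly (array libraries, ORMs and mock objects all overload ==), but the effect on a None check is the same.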
@UCAl4YW8UbOfjz9ZxaQcXlzw I think you are talking about the example prior to what the commenter was mentioning, and I can assure you LSP is relevant there. Furthermore, LSP absolutely does NOT say anything about the relationship of two children of the same parent. Substituting one child class for another from the same parent is only safe to do if the function/method/etc was expecting the parent class, not if it was expecting a specific child type. In general you cannot substitute one child class for another, regardless of their common parent. E.g. in a print_id function that expects an object, it's fine to pass a list or an int because they are both objects, but in a print_all that expects a list, you certainly cannot expect it to work if you pass an int just because int shares the same parent type (object). I think you are also mixing up LSP, aka "behavioral subtyping", with Python's notion of structural or "duck typing". LSP is a general principle not specific to Python. It is a stricter requirement than duck typing. I am definitely not saying you should always try to adhere to LSP, or even to duck typing for that matter. However, checking a type for equality is a violation of both LSP and duck typing (well, assuming you actually do something different in the two branches of the if statement). The "Pythonic" answer is to only enforce duck typing, not the stronger LSP, but in practice this becomes very difficult to maintain on anything but small projects. Many developers therefore choose to abide by the stricter LSP most of the time. Moreover, if someone (especially a noob) tried to check a type for equality, say to overload behavior if they got a str vs bytes, an isinstance check is morally probably closer to what they wanted than "just see what happens if you call quack()".
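A short sketch of why a type equality check breaks substitutability while isinstance does not. The two helper functions are hypothetical. OrderedDict is used only because it is a well-known dict subclass from the standard library.

```python
from collections import OrderedDict


def is_dict_strict(x):
    """Exact type match: rejects every subclass."""
    return type(x) == dict  # noqa: E721


def is_dict_lsp(x):
    """Accepts dict and anything that inherits from it."""
    return isinstance(x, dict)


od = OrderedDict(a=1)

print(is_dict_strict(od))  # False
print(is_dict_lsp(od))     # True
print(is_dict_strict({}), is_dict_lsp({}))  # True True
```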
If you start to worry about speed then Python is not your language. I always loathed that there is both the “is” operator and ==; it’s poor design. After all, functionally you do the same thing. A trademark of a bad language IMO.
@CallousCoder There are subtle differences, but I do agree; even the reference comparison would be easier to do the way one would in C (pointer dereferencing).
Love all the tips except for a couple. I like writing things to be crystal-clear, and I find that `if len(x) == 0` takes just a little less mental processing power to understand. Same for `for k in my_dict.keys()`.
yes, I thought the same thing with both examples And the fact that you "code highlight"ed your code passages even though they are not rendered on YT is oddly satisfying :D
It's also worth noting that "if len(x)" is necessary for some data types. Pandas famously throws an error if you check the object itself: "The truth value of a (Series/DataFrame) is ambiguous." (Thanks for the headaches, Jeff)
Agree with the .keys() one, save for specific scenarios, e.g., codebase used mainly by python senior devs. In general, I'd vouch for explicit intent, as I could see reasonable arguments and counter arguments behind multiple conflicting answers to "what should be returned by default when iterating over a dict"
You can tell m has never passed 0 instead of [0] into a function they wrote and spent half an hour trying to figure out why their iterator wasn't working
This actually made me feel a lot more confident about where I am in my Python learning journey. I expected to fail all of these but I actually consistently do most of the things you recommend!
I think doing a full "Python 2 + 3 in the same module" video would be pretty valuable, you'd be surprised how many programs still use Python 2 to some capacity for scripting
Bit of a necro, but I actually came down to the comments to make almost exactly this point. In my day job, I deal with ArcGIS, but being a municipal government job with the usual attendant budget and manpower, we are still realistically a painfully-long way away from the main IT department completing their migration to ArcGIS Pro, which is when the developers finally moved to using Python 3. So I write almost exclusively for Python 2.7, because that's what's supported in the older ArcGIS Desktop. Constant references to "Why are you still using Python 2? Upgrade already!" when I'm chasing a problem are more than a little tiresome, as that decision is many, many levels away from being close to being my call.
@altasilvapuer Came here for exactly this comment. At my job I just use ArcGIS Pro on my machine, so most of my work is done in Python 3. However we use ArcGIS Server instead of Enterprise and all the environments use ArcGIS Desktop. So if I want to write any tools to sit on the server side I write in Python 2! Enterprise is in the works though...
Haven't even expected how "noob"ish I was. The channel is underrated, because it's rare to see someone who digs deep into the topic with lots of details. Usually creators just run through basics of Python and that's it. #pycharm
That's just a tip of the iceberg. Next you find out it's not possible to use Python in a non-noobish way. It just makes it impossible to write a program properly.
this is one of those videos where I learn about a bunch of things I'll immediately forget without usage. I reckon I ought to save it so I can reference it the next time I'm pythoning
Now even if you save it, you're just going to let it sit there and some things will change by the next version of python. Then, after you review it, you'll still forget it later. And the likely scenarios where you do need it won't come up. Info dumps, man. It's all the same.
As for PEP8, please do read the intro: "Many projects have their own coding style guidelines. In the event of any conflicts, such project-specific guides take precedence for that project."
I thought I was ok, but you got me with perf_counter. Though I'm usually measuring time in minutes not microseconds, I had no idea I was being so inaccurate this whole time. Great video!
18: No, that does NOT look a lot better. In logging format, just use {} formatting instead of %. It's more consistent with the earlier referenced f-strings and therefore much more readable.
Agreed. If data such as the time capture is required, or the ability to pipe to a file is necessary then logging is the way to go. Otherwise, just use modern formatting. In either case, debugging through standard out messages alone is the real problem, and switching to logging isn't an improvement there.
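On the {} versus % point, a hedged sketch of what the standard logging module supports. The logger name "demo" and the StringIO stream are arbitrary choices for illustration. Note that style="{" applies to the format string of the Formatter, while arguments passed to log.debug() are still merged %-style by default.

```python
import io
import logging

stream = io.StringIO()  # stand-in for a file or stderr
handler = logging.StreamHandler(stream)
# style="{" lets the *format string* use str.format-style fields.
handler.setFormatter(logging.Formatter("{levelname}:{name}:{message}", style="{"))

log = logging.getLogger("demo")  # arbitrary logger name
log.setLevel(logging.DEBUG)
log.addHandler(handler)
log.propagate = False  # keep the root logger out of this example

user = "alice"
log.debug("loaded profile for %s", user)  # lazy: formatted only if emitted
log.info(f"greeted {user}")               # eager: formatted before the call

output = stream.getvalue()
print(output, end="")
# DEBUG:demo:loaded profile for alice
# INFO:demo:greeted alice
```

The f-string version reads nicer but always pays the formatting cost, even when the level is disabled. Whether that matters depends on how hot the code path is.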
I am no beginner - I started with python back in college circa 1991. But I do a lot of mathematics in pure python because I need the bignum integers and mpfr support for high precision floats.
About tip 19: there's in fact a function I use to avoid manually typing the command as a list. For big commands, I just import "split" from the "shlex" module, which splits text into a list (i.e. subprocess.run(shlex.split("find / -type f 2>/dev/null"))), and this will handle the problem
That solves many of the problems with 'shell=True', but not all of them - in particular you still have to worry about things like spaces in filenames if you're generating the command from strings. It's potentially more exploitable with malicious user input as well. It really is best to just bite the bullet and form your command as an array. Also FYI your example doesn't work - it will execute the command "find" with four arguments: "/", "-type", "f", "2>/dev/null", which isn't what you meant. If you're not using 'shell=True', you can't use shell constructs like "2>/dev/null" or "|" or ";" etc.
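A sketch of the two points in the reply above, assuming Python 3.7+ (for text=True). To stay portable it runs the current Python interpreter as the child process instead of find. The file name is a hypothetical user-supplied value.

```python
import shlex
import subprocess
import sys

# shlex.split does no shell interpretation: the redirect is just another argument.
argv = shlex.split("find / -type f 2>/dev/null")
print(argv)  # ['find', '/', '-type', 'f', '2>/dev/null']

# Build the command as a list and let subprocess do the redirection instead.
tricky_name = "file with spaces.txt"  # hypothetical user-supplied value
child_code = "import sys; print(sys.argv[1]); print('noise', file=sys.stderr)"
cmd = [sys.executable, "-c", child_code, tricky_name]

done = subprocess.run(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,  # replaces the shell's 2>/dev/null
    text=True,
    check=True,
)
echoed = done.stdout.strip()
print(echoed)  # file with spaces.txt
```

Because each list element is passed to the child as exactly one argument, the spaces in the file name need no quoting or escaping.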
One more point that I think is a must in the list: defining class properties and expecting them to behave as instance properties. That is, defining properties on a class outside of init. Seen this one to be genuinely confusing people that come from other languages.
Is it possible that you're confusing properties (i.e. @property) with attributes (e.g. self.x or cls.x)? they aren't the same thing, though I do believe you make a good point.
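Taking the comment to mean class attributes versus instance attributes, here is a minimal sketch of the confusion. The Inventory class and its attribute names are hypothetical.

```python
class Inventory:
    items = []  # class attribute: one list shared by every instance

    def __init__(self):
        self.owner_items = []  # instance attribute: one list per instance


a, b = Inventory(), Inventory()
a.items.append("sword")         # mutates the shared class-level list
a.owner_items.append("shield")  # mutates only a's own list

print(b.items)        # ['sword']
print(b.owner_items)  # []
print(a.items is Inventory.items)  # True
```

People coming from languages where fields are declared in the class body tend to expect `items` to behave like `owner_items`.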
6. 01:42: Use of default mutable arguments
9. 02:50: Checking for a type using "=="
17. 05:21: Using "time.time()" to time things
18. 05:43: Littering your code with print statements instead of using the logging module
19. 05:59: Using "shell = True" on any function in the subprocess library
22. 06:33: Depending on a specific directory structure for your project
23. 07:07: The common misconception that Python is not compiled
24. 07:34: Not following PEP 8
3:43 I find "if len(x) > 0: ..." more readable than "if x: ..." 4:27 Nested tuple unpacking is new to me though, thank you :) 5:51 I also didn't know about logging. I always rolled my own logger class.
@Grivian Of course it depends on how you use x. But semantically speaking, the if statement expects some kind of condition to be evaluated, and x probably being a list in this context is not a very prototypical condition, unlike len(x)>0 which to me actually looks like a condition.
Somehow this channel is extremely underrated. I’m just a beginner in python but a few of the videos have helped me to such an extent that they have saved me months of work. Just to give an example the video on for loop Vs while loop is a gem. Thanks a lot sir for sharing this valuable information. #pycharm
@yjc5931 You want to make sure that your variable is of type bool, not anything else, like a list of three strings, which is also True. And this exactly is not a good coding practice in most cases. Generally your code should not need to test type of your variables.
@Jeyekomon agree and disagree. I agree that in most cases we won’t need to check it, as we can just use type hints on function arguments, but I don’t think we can claim that, in the absence of type hints, we’ll never need this. Just pointing out the video isn’t wrong either
Great video. I so much agree with 8, few times I stumbled across the projects that overused list comprehensions, sometimes it was painful to read such code. List comprehension is a fantastic Python feature, but people should learn when to use it and when to avoid it. Readability should always come first imo #pycharm
@kilianmio6243 Python is regarded as self-documenting code. As soon as you need to explain what your code is doing, it's not good code. Comments should only be used for why your code is doing something.
With great power comes great responsibility. Python has a ton of features that can both make and ruin your code. Knowing what tool to use in each situation ensures the code is readable
Woow, as a total newbie - started with python this semester at college - this was an information overload, great vid though. P.S. Loved the "smash the like button an odd number of times", but I'm at even and it is currently liked now, so I have to apologize for even momentarily disliking to achieve this greatness
3:39 even if len will cause a little bit more execution time, it is much more readable compared to if something itself. Also, if you want a separate *raise* call and/or logic for each case, you have to do a *separate if ... is None* check and *if len(...) == 0* check anyways.
As an experienced Python user (5+ years, 10k+ lines experience), I am happy to know that I followed most rules mentioned in this video correctly. I did manually open and close files in the REPL because you have to type 4 more spaces on every line afterward if you open a file using a context manager (with statement). I also do this when I am writing a script in a hurry and need it to work right away. I didn’t know that perf_counter is the correct way of timing code (though I rarely measure the running time of Python code at all). For creating my own index variable, I have to say that it is sometimes necessary if you are using a while loop and the update condition is not as straightforward (for example, implementing some algorithms). Also, while I agree that numpy is powerful, there are a few cases where the built-in Python math is better, such as big integers by default, so you never need to worry about overflow.
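On the last point, a small sketch of the built-in arbitrary-precision integers, using only the standard library. The comparison value is the largest signed 64-bit integer, which is where fixed-width integer types would wrap or overflow.

```python
import math

int64_max = 2**63 - 1           # largest signed 64-bit value
big = math.factorial(25)        # far beyond 64 bits, still exact

print(big > int64_max)          # True
print(int64_max + 1)            # 9223372036854775808, no overflow
print(big // math.factorial(24))  # 25, integer arithmetic stays exact
```

Fixed-width array dtypes trade this safety for speed, so which one is "better" depends on whether the numbers can actually get that large.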
5:43 "number 18: littering your code with print statements instead of using the logging module" #pycharm I felt that... I've been doing personal python stuff for at least 3-4 years ... yet I actually never spent the time to learn how to use logging and how to format stuff with it, etc.
same here! ive been writing python for almost 8 years now and i still use print statements rather than logging lmao, logging looks so much cleaner though
Meh, I've been using logging for a decade. Most of the time I find myself just relying on a mix of assert/break/print anyways. It's just quicker and easier and 90% of the time I'm just doing a temporary inspection. Logging is super great for end-user information, but for debugging break/assert is the right tool IMO. What's the point in all that pretty printing when you're just querying about internal state, etc. Obviously this changes a bit in collaborative projects, but even at that, I'd like to argue asserts are descriptive enough.
I am learning Python and all your videos are amazing! Thank you so much for helping me in the python learning process. I really like your teaching style. Some of your python topics are above my python comprehension because they are intermediate or advanced and that is okay. I will come back to those videos in the near future when I gain knowledge in my learning process. In many of your videos there are hidden gems. Those gems make me realize that noobies need to ditch another bad habit: not reading the python "What's New in Python" documentation when a new python version comes out. I learn so much from reading the PEP documentation with example code that is referenced under the new features/implementations. Plus, I feel this will keep me in sync better with the python language updates.
Great tips and I learned some stuff myself. For 6:14 / tip 20 (always use a math library like NumPy) though, I really need to caution against doing it blindly. For example, libraries like NumPy are mostly designed for handling large arrays. I don't want to get into the specific context where I encountered this, but I have seen situations where lots of people just blindly assumed NumPy is the best and used it for doing math on 3D vectors, resulting in really slow performance. From what I have found, NumPy is not always faster than just doing the math in Python directly if you need to do lots and lots of 2D / 3D vector calculations, because in Python you could just represent them as a simple struct, and you don't have to wrap / unwrap them as NumPy arrays (which, again, are optimally designed for handling large arrays, not when length == 3). If you are doing data analysis type work with a long array of 100+ items, that's another story. So bottom line is: try to understand the performance characteristics of your library and understand if it actually fits your needs. Just because a library like NumPy is popular and fast for its intended use cases does not mean you should just use it for everything math related without thinking.
The thing is, if you have to do "lots and lots of 2D / 3D vector calculations" you would probably stack them in a two dimensional `numpy.ndarray` and apply the same transformation to all of them simultaneously.
I was very happy when I first found out that you can use comprehensions in python because I am absolutely obsessed with the set builder method. I have been using comprehensions since I was a newbie.
Excellent video, yet laconic, thanks a lot! I'll add to your #22 that usually the most convenient way is to install the package you're working on as "editable", with pip install -e . (dot). It took me a while to wrap my head around the package/module importing rules.
#6 can be confusing for people coming from languages like C#. In C#, we call these "default arguments" (which have to be specified at the end of the parameter list) and they are always that value if nothing was passed in. Unlike Python, they aren't stored in memory to be reused every single time that method is called if nothing else is passed. I think that's a horrible way for python to handle it.
That's an extremely stupid way of Python doing it. I actually had to rewind and listen to #6 again, because I thought I misunderstood something because no way it can be THAT stupid. Apparently it can.
@joje86 Independently of mutables, I would gladly trade this trick for the ability to use dynamic expressions of previous parameters in the defaults instead of having to set default=None and have default initializations take half of the function body.
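For anyone who wants to see the behaviour from #6 for themselves, here is a minimal sketch. The function names (append_bad, append_good, head) are made up for illustration and are not from the video. The last function shows the None workaround for a default that depends on another parameter, which is the pattern the comment above complains about.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # None acts as a sentinel; a fresh list is built on each call that needs one.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def head(items, count=None):
    # A default that depends on another parameter needs the same sentinel trick,
    # because defaults cannot refer to earlier parameters.
    if count is None:
        count = len(items)
    return items[:count]


first_bad = append_bad(1)
second_bad = append_bad(2)    # same list object as first_bad
first_good = append_good(1)
second_good = append_good(2)  # independent list

print(second_bad)   # [1, 2]
print(second_good)  # [2]
print(head([10, 20, 30]), head([10, 20, 30], 2))  # [10, 20, 30] [10, 20]
```

The shared list only shows up when the caller omits the argument, which is probably why it stays hidden for so long in real code.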
3:26 Using "is" to check for booleans is also a noob mistake or even a bug, as when using other libraries "is" will return False no matter what. For example, with a = np.array([True]), a[0] is True returns False. Just use if a[0]: or, if you want the False case, if not a[0]:
Oh yes. I just checked it. It's because of type(a[0]): it is a NumPy boolean (np.bool_), not the built-in bool. The 'is' keyword checks if two things are the very same. Two things which are not even the same type cannot be the very same.
>>> np.bool_(True)
True
>>> np.bool_(True) == True
True
>>> np.bool_(True) is True
False
What about sorting a 10-gigabyte data set just to get a max or a min element of it? Or assigning values of entirely different types to the same variable in different branches of code? Formatting strings, comparing booleans, looping over dictionaries or using comprehensions are nowhere near real problems with Python shitcode (most of which is actually promoted by the language). Imports should be handled by the IDE; if it doesn't then that's a shitty IDE, perhaps exactly matching the language quality.
#23 mostly happens because people expect bytecode-interpreted implementations to work almost as fast as compiled implementations. But due to Python's dynamic typing and slow function calls, it lags behind a lot more than other languages
Most of those come with a bunch of asterisks, and a couple of them are not necessarily what you want. For example, while you can often get away with "if x", you run into the issue that the boolean value is implicitly inferred, while one of the basic lines of the Zen of Python states that explicit is better than implicit. In this case you can often increase the readability of your code by explicitly stating what kind of property you expect, and it will also tend to give you better errors closer to the problem: you get a "you cannot use len on this thing" error, rather than the code just running the path meant for 0 length collections. That is an intermediate error, and not a noob one though. Here are some examples of the asterisks (as in cases where it is not quite right):
1) When nesting string constructions into each other, things quickly become complicated and you will want to start making use of more of your string building tools than just format strings.
3) When you are building context managers yourself or you want custom local behaviour.
4) The bare except can come up when dealing with dangerous multiprocessing, and you want to ensure that you do not leave zombies behind, though you would generally raise the exception again after doing graceful shutdowns. The except Exception: case is especially useful for multiprocessing calls on the worker side, where you then have the option to gracefully send back information about the error.
6) There are some very rare cases, very similar to singleton cases, where you would want this. That said, you always have to think really carefully about singleton cases, so this would be a super rare case where it would make sense.
7&8) They kind of already go there, and sometimes comprehensions are just not really good for the thing you are making, and sometimes you really should be using numpy instead.
9) Even isinstance is far from broad enough for a lot of cases; often you want to use hasattr instead to work better in duck typing situations. Naturally there are also the rare cases where you need to make sure it has not been inherited.
10) "==" has a different behaviour than "is" when the arguments are of different types, and especially == True or == False has uses. Also be aware that the latter 2 can be vectorized with numpy, while the "is" form cannot (at least nicely).
11) See above about clarity/location of errors.
12) The idiom is used when you need to do alterations on a mutable collection and those alterations are not limited to operations on mutable internals. It also happens much more often as part of algorithms, though those are more commonly done with just range(n). A typical example (where it is doubly nested) is when you apply a local filter to an image: you loop over the elements of the filter, and then do numpy operations to construct the image component from each part of the filter.
13) Okay, this one is a bit far fetched, but the dictionary.keys() object can be a lot smaller (iirc), so if you want to build something based on those keys outside of the shared memory area, you could package it down and send that instead of the larger full dictionary. This would mainly be for multiprocessing.
14) Most of the time you actually want dic.values() instead, if all you want are the items.
15) Tuple packing and unpacking can take time, especially if you have something long, and it will create the parts you ask for. If all you want to check is the first 3 elements of a 100+ length thing, then do not spend the time writing a, b, c, *_ = items, just write first_three = items[:3].
16) The case here is when you do not necessarily increment it every time, as it refers to something slightly different, but I am unsure whether that really is needed, and that kind of algorithm code is much more often written in C style languages.
17) Most of the time, you should run cProfile on your code instead of checking manually this way; it will give you much more information.
18) Logging can be problematic in multiprocessing situations, whereas print is usually pretty safe there (logging can get into nasty deadlocks).
20) Numpy is nice, but a lot of good algorithms cannot make good use of it for at least some parts, and sometimes going with O(n log(n)) is better than vectorizable O(n^2). Neither numpy nor pandas uses GPU acceleration, so once you start wanting that you also need to move away from them. Pandas is also kind of weird, in the sense that if you take a look at benchmarks, it tends to be quite poor compared to just base numpy, so you have to really want its other features for it to make sense to use, because a lot of your basics are just going to be slower because of it.
21) The general python style advice is to import modules and not components out of modules in most cases.
22) Making everything into packages makes it very hard to run parts of the code as scripts, so you should not just convert everything into packages.
23) This is muddying the waters a bit, as it is interpreter dependent, and you will not find compiled files for all python code. I, for instance, have not seen such files for interactive sessions and other equivalent things. It is correct that python code is built into python byte-code at least in memory though.
25) In python 2, print was a statement and not a function. Iirc generator comprehension also first came into python in one of the 3.x versions.
Of all those habits, the only one I would say I fall into is 17, as I was not aware of time.perf_counter.
8:30 not necessarily true. *Only* works if the value you are checking is exactly of type int (or bool). Does not work for other int subclasses, any other object, and , most importantly, float. For those it will still fall back to one for one checking all values by iterating over them. That is very very slow for large ranges.
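Assuming the comment above is about the "x in range(...)" check, here is a small sketch of the difference. The variable names are made up for illustration. The float checks use a tiny range on purpose, since on CPython they fall back to comparing element by element.

```python
big = range(0, 10**12, 3)   # roughly 3.3e11 elements, never materialized
small = range(0, 10, 3)     # 0, 3, 6, 9

# For a plain int, membership is computed from start/stop/step,
# so this returns immediately despite the size of the range.
int_hit = 999_999_999_999 in big

# For a float, the range is walked and each element compared with ==.
# Fine on a small range, very slow on something like `big`.
float_hit = 3.0 in small
float_miss = 3.5 in small

print(int_hit, float_hit, float_miss)  # True True False
```

So the check gives the right answer for floats too, it just loses the constant-time shortcut.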
#pycharm 1. In my defense, print is a perfectly fine way to debug code. It's simple and easy, assuming it's only temporary. Also, I'm surprised you didn't include using top-level programming as noobish. It's fine for simple code, but some beginners still make their code look like it's not an OOP language.
After years of writing code professionally, I still can't be bothered to use the logging module for debugging. print statements all the way, they're easily added and easily removed.
@@jurgenhaan7652 One advantage of logging instead of printing is that you don't have to add or remove print statements manually - you can easily config logging to only print debug information in debug mode, but disable everything except critical errors in production mode.
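A rough sketch of what that switch can look like with the standard logging module. The logger name "demo_app" and the configure() helper are assumptions made up for this example, not anything from the video.

```python
import logging

logger = logging.getLogger("demo_app")  # hypothetical logger name


def configure(debug_mode: bool) -> None:
    # One place decides how chatty the whole program is.
    logging.basicConfig(format="%(levelname)s:%(name)s:%(message)s")
    logger.setLevel(logging.DEBUG if debug_mode else logging.CRITICAL)


configure(debug_mode=False)
logger.debug("loop index is 3")          # suppressed in production mode
logger.critical("database unreachable")  # still emitted
```

Calling configure(debug_mode=True) instead brings the debug messages back without touching any of the call sites.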
I place at the beginning of my code: DEBUG = print When complete, I do a line replace and get rid of all of them. And DEBUG gets its own highlight, and so stands out from needed "print" statements.
@@k.chriscaldwell4141 "when complete" is not always a hard statement. That's why I also do as you, but don't remove the line, instead replace DEBUG for #DEBUG....just in case.
I had no idea about the issue with mutable defaults. That's wildly counterintuitive. Of course, they probably can't change that behavior because some major library somewhere is probably using it as a core feature.
Thanks a lot for this. I'm an engineer with a lot of experience in other languages but Python is pretty new to me. This helped me easily understand most of the more niche concepts and the dos/dont's. Thanks!
Most of these things are really specific to Python. I’ve been learning this language by myself doing some personal projects and I see that I don’t know most of these pythonic stuff yet, but I can code many things following what I already know in other languages like javascript, C, C# and Java for example. When looking for beginner courses I usually don’t get many of those features that were shown in this video, instead usually those basic courses teach about introductory things in a more procedural way and in some cases in object oriented fashion.
Yeah I learned python before java and for a long time most of my python code looked like java code. It's not the end of the world if you do this, but it is still very important to learn the "pythonic" way of doing things
5:43 this isn't a nooby thing to do. The logging module requires some thought and setting up (especially if your code is spread across multiple files?), while the print function is always just there. Even the documentation states that in some cases *print* is the best tool for the job. Also the logging module can litter your output with unsolicited logs at times. I'm not sure where they come from (they're a bit cryptic), but I've been getting that while using imgui with pygame.
@@skaruts Then you need to rethink your strategy. But debug, info, warning, critical are industry standard. At least with logging you can turn off all debug messaging in one place. With print statements all over your code, you're going one by one. The logging module is very flexible.
@@michaelpuskar6975 I think you're missing the point here. No one is saying logging is inferior or that it can't be the best option in certain circumstances. The point is simply that it's not always the best option, and claiming that using prints is a nooby thing is just masturbatory elitism. I do simple utilitarian scripting in python, for example. I don't often delve into complex projects with it. For the most part, print statements are more than enough for me and it would be a complete waste of my time to be configuring the logging system. *_"Then you need to rethink your strategy."_* Indeed. I use print statements. 👍
Awesome, thankyou! These were really easy to correct with refactoring tools, or at worst, search/replace. The surprising thing was, how often some of these nooby habits occurred in my very well supported dependencies!
Love your format. Thanks for taking the time to put these videos together for us! I've decided to move over to a new career in python and data analysis after working for years with php. Every day is a school day👍 #PyCharm
3:26 I use “==” all the time for None. Why make it a special case? As for “True” and “False”, I do sometimes compare explicitly with these values, where the variable might have a third state (“undetermined”, represented by, wait for it, “None”).
I also use explicit comparison with True and False, although in most cases an enum will be more appropriate (or explicitly comparing to None). As for why to use * is *, I think that in those cases it is mostly a convention. There might be a minor speed advantage for it as well, but it is mostly irrelevant. Still, the * is * format is the generally accepted way, so if you write code for other people as well, you should probably use it.
it's mostly special cased because of the fact that they're all global constants and could be compared by identity but imo it's not a valid reason to special case it as it's rather implementation aspect which doesn't really affect user code semantically also as i can see only None is guaranteed to be constant and singular, True and False aren't defined this way, but only happen to be in CPython and PEP8 actually states that checking boolean values with 'is' is incorrect and correct way is using implicit behaviour of 'if expr' or 'if not expr', so, this vid actually contradicts itself :) tho i would say using 'is' for None checking is preferable as it's the way standard library is written, but it's not necessary at all
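One practical reason for "is None" that goes beyond convention: "==" can be overridden by the object on the left, identity cannot. A small sketch, with a made-up class AlwaysEqual and a made-up describe() helper for the three-state case mentioned earlier in this thread:

```python
class AlwaysEqual:
    # Hypothetical class whose __eq__ claims equality with everything.
    def __eq__(self, other):
        return True


obj = AlwaysEqual()
equal_to_none = (obj == None)   # True, because __eq__ says so
identical_to_none = (obj is None)  # False, obj is clearly not None


def describe(flag):
    # Three states: None means "undetermined", otherwise use truthiness.
    if flag is None:
        return "undetermined"
    return "yes" if flag else "no"


print(equal_to_none, identical_to_none)  # True False
print(describe(None), describe(True), describe(False))  # undetermined yes no
```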
I'd like to talk about (list) comprehensions. The reason you should be using them instead of basic for-loops is because the latter have issues with respect to concurrency and readability (independent of language). But this doesn't mean that there are no better alternatives. To me an issue with comprehensions is that they extend the language syntax while only providing few benefits (Just like "lambda" in my opinion, just define a function). This is mostly because functions already exist for this kind of functionality, most importantly "map" and "filter". Also, comprehensions are inherently limited such that conventional syntax often needs to be used on top of them anyways, like sorting. This leads to a huge mess of syntax rules. There is unfortunately the problem that Python does not offer a lot of inbuilt support for chaining functions neatly (like java streams), which is why I see people preferring comprehensions. However, i tend to think general, simple solutions are almost always better.
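To make the comparison concrete, here is the same filter-then-transform step written both ways, as a sketch with made-up data and helper names. Named functions are used instead of lambda, in the spirit of the comment above. The sorting step has to be bolted on with ordinary syntax in either style.

```python
numbers = [3, -1, 4, -1, 5, -9, 2, 6]


def is_positive(n):
    return n > 0


def square(n):
    return n * n


# Comprehension style: filter and transform in one expression.
squares_comp = [n * n for n in numbers if n > 0]

# Functional style: the same pipeline with the built-in map and filter.
squares_func = list(map(square, filter(is_positive, numbers)))

# Neither style covers sorting; that is a separate call on top.
top_three = sorted(squares_comp, reverse=True)[:3]

print(squares_comp)  # [9, 16, 25, 4, 36]
print(top_three)     # [36, 25, 16]
```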
@Tochka What exactly "in dictionary" does was not an easy decision to Guido van Rossum (the creator of Python) as far as I am informed. This should imply that "in dictionary" has ambiguous intuitive meaning. Hence, it is a problem when intuitively reading code, which is why sometimes using ".keys()" explicitly can help. On a side note, if you are using "if x in dictionary:" or the like, there is a chance you are also retrieving the associated value. In that case, using "dictionary.get()" and comparing against the default return value is safer and most probably faster.
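A small sketch of the dictionary.get() idea, with one caveat: if None can be a legitimate stored value, comparing against the default None is ambiguous, so this example uses a private sentinel object instead. The names ages, _MISSING and find_age are made up for illustration, and no claim is made here about which version is faster.

```python
ages = {"ada": 36, "alan": None}  # None is a real stored value here

_MISSING = object()  # unique sentinel that can never be a stored value


def find_age(name):
    # One lookup instead of `if name in ages:` followed by `ages[name]`.
    value = ages.get(name, _MISSING)
    if value is _MISSING:
        return "unknown"
    return value


print(find_age("ada"))    # 36
print(find_age("alan"))   # None (key exists, value is None)
print(find_age("grace"))  # unknown
```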
#pycharm best channel I've watched for python! Graduated recently and know python at a standard level and your channel gives such easy access to more in-depth topics
Formatting with string concatenation is the worst thing you can do, whenever I learn a new language, the first thing I look at is how to format string.
One of the biggest ones is not realizing that mutable objects are always treated as references when passed as arguments to functions. I cannot count how many times my coworkers passed a mutable object to a function, modified that object within the function, and then _returned_ that object, not realizing that the object passed was already modified. And often they would write the script with the assumption that the original passed object didn't change, and things would go wrong without them realizing.
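A minimal sketch of that trap, with hypothetical function names. The first function changes the caller's list in place and also returns it, so the "result" and the "original" are the same object. The second builds a new list and leaves the argument alone.

```python
def normalize(scores):
    # Mutates the list that was passed in, then returns that same list.
    for index, score in enumerate(scores):
        scores[index] = score / 100
    return scores


def normalized_copy(scores):
    # Builds and returns a new list; the caller's list is untouched.
    return [score / 100 for score in scores]


raw = [50, 100]
result = normalize(raw)
print(result is raw, raw)  # True [0.5, 1.0]

raw_again = [50, 100]
copied = normalized_copy(raw_again)
print(copied is raw_again, raw_again)  # False [50, 100]
```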
Notes to self 0:36 using with 1:06 bare except 3:51 range len instead of enumerate 4:48 dict item methods 5:43 logging 6:33 package structure 7:34 PEP8
To anybody worried about this: most of these genuinely do not matter. Is somebody looking over your code going to think you’re trash because you don’t use enumerate? Or zip? No. You should be worrying about readability, but please focus more on efficiency (unless you’re on a hugely collaborative project)
One of the most useful python oriented channels i came across youtube. Watching this video for the nth time too. Thanks for the continued great content!
Great insights, with at least one observation, and forgive me if I come off noobish, but @3:37, wherein the context of the function presented, for x is unclear both in terms of expectation, and the specified function's purpose. The function looks to be a validator, and depending on what it's meant to validate, simply refactoring the code to a single if x. may or may not do the intent of the presented code. Take for example if it's meant to ensure that X evaluated to true or false as a value, but more specifically true. The first test would pass if x evaluated to true, but not if it evaluated to 0 or false. However, the second test may pass x if it were a string value with a length > 0. But since it's not converting x to a type wherein a length attribute exists, the context will not evaluate to true for a numeric type (as is). so a value of 0 will not be given a length of 1 allowing it to pass. So a string or number evaluating to 0 will never pass these tests, and will not pass "if x" either by my understanding for 0 translates to false. If this function were to have behavior based on x instead of validating, that's also a matter that a simple "if x" test would not address as the two tests can deduce type after a fashion and behave differently based upon which test "passes", and in both scenarios, what a pass does in that context. I mention that because the pass case doesn't seem winner take all how it's coded. so both tests should execute and do something, again depending on what behavior that pass is meant to do. @Derek Youngman presented the following in his response, which highlights some of the nature of my point, that if x will evaluate to true if there's a non-zero evaluation of x and that includes its behavior with True and None, when not checking their identity with the is operator in those cases, but say x = None. if x will not evaluate to true, but is not specifically checked for with is in the case presented. 
@Derek Youngman response:
If x is True doesn’t do the same thing as if x.
x = 4
if x -> True
if x is True -> False
y = True
if y -> True
if y is True -> True
/end response
So, I would conclude that whether this case was noob goodness or not depends on what the objective of the code is. The case may also have been kept vague in the interest of the claim, but then the example should have been made more compelling, to show where the claim had its validity in a more IRL context than a hypothetical one, for clarity's sake, as the claim of one not knowing the language was pretty strong, given the lead-in stating there was nothing particularly wrong with the statements themselves. If a bool or null check is what's called for then by all means they should be used, it's like the folks that hated the or tags, seems a bit personal and maybe a little pedantic. And to that end, the function name should have changed to reflect that drastic change of the function's purpose if it is no longer checking for bool or length, but for non-zero instead, which again doesn't address whether the original code was fit for purpose and whether the person "optimizing/refactoring" understands the why of the original logic well enough to conclude incompetence of the original Developer. But I ain't one to gossip... .
Almost 2 years into python, and I've programmed a lot of things in python. Well,... there are some of these things I knew but didn't apply (enumerate instead of range(len)), and some I simply didn't know. I'm baffled. Finally a video of this kind, or the kind "things I wish I knew", that is actually helpful! congratz! (Also shame on me for picking up bad habits, this isn't going to be easy to fix)
Thanks. But honestly, I disagree on many. For e.g. point 1 - Manual String Formatting. If it works, it works. While I personally use f strings, have no problem with anyone or myself using + operator as well. It is there for that reason - because it works. Nothing wrong with using it or makes one a "noob" which is derogatory when it makes not much difference other than pleasing ego for some people as if they're elite somehow because they use f string over string concats. So I think it is bad to say such a thing when the language allows it. And "prone to error"? Code would break and the person would correct it or maybe they know what they're doing and it doesn't break.
"Readability" is difficult thing to tackle. But I guess one should tailor readability of his/her code to the preferences of whoever is signing off their wage cheque. Some "best practices to improve readability" make the code less readable for everybody, but those few who are used to this or that specific way of coding..
Sometimes I code "if len(x):" or "if len(x) > 0:" when it makes it more clear. I think that's the only exception to your rules that you haven't mentioned.
little pushback at 5:14 -- this is presuming you already have a well-defined iterative constraint that's designed to be used in very limited scope. sometimes it's helpful to have an external iterator that you're adjusting in multiple places. and as this is programming and you can accomplish any task one thousand different ways, of course it's possible to extend the functionality of a typical enumerate iterator. but when it comes to simplicity and readability and the way you're structuring the logic of an algo, sometimes the option is convenient.
I had no idea about #6. I'm sure you just saved me from an eventual debugging nightmare; I can imagine a lot of cases where this subtly messes up things in a way that is hard to track down.
Yeah I learned that one relatively late in my Python education and was quite surprised at what a footgun it is. It may or may not be justifiable that it's there, but it's definitely a potent footgun.
[2:01] In Lua we do something similar: in Lua there aren't default values for function parameters. By default all parameters are nil, so we do something like: var = var or some_value Or first we can assert(var) to check if you placed the correct type.
Almost turned this to 1.5 speed at the start, but probably could have watched at .75 actually! Great quick introduction to some ways to make my code less nooby
1. 00:25: Manual string formatting
2. 00:36: Manually closing a file
3. 00:53: Using "try:" and "finally:" instead of a context manager
4. 01:07: Using a bare "except:" clause
5. 01:32: Thinking that "^" in python means exponentiation (it's a bitwise XOR, exponentiation is "a ** b")
6. 01:42: Use of default mutable arguments
7. 02:07: Never using comprehensions, or only using list comprehensions
8. 02:24: ALWAYS using comprehensions
9. 02:50: Checking for a type using "=="
10. 03:23: Using "==" to check for None, True, or False
11. 03:37: Using an "if bool(x):" or "if len(x)" check
12. 03:51: Using the "range(len(a))" idiom
13. 04:30: Looping over the keys of a dictionary
14. 04:48: Not knowing about the dictionary items methods
15. 05:01: Not using tuple unpacking
16. 05:11: Creating your own index counter variable
17. 05:21: Using "time.time()" to time things
18. 05:43: Littering your code with print statements instead of using the logging module
19. 05:59: Using "shell = True" on any function in the subprocess library
20. 06:13: Doing maths or any kind of data analysis in Python
21. 06:23: Using "import *" outside of an interactive session
22. 06:33: Depending on a specific directory structure for your project
23. 07:07: The common misconception that Python is not compiled
24. 07:34: Not following PEP 8
25. 07:56: Doing anything to do with Python 2
Good man/person!
Thanks for this!
hey u can add 0:00 and put the text in the description so we get the topics also when using the skipbar
I have to admit, some of the items in that list actually make sense.
gods work
"you don't need to turn every loop into a comprehension"
I do not comprehend.
Me, who confidently wrote Python as one of my skills in resume: Visible sweating
+1 :))
well the fact you were open to watching speaks well of you
enumerate(l)
And, how did you fare?
@@m.sierra5258 It was beside Java. So none cared😂
As an ex-lecturer (software engineering) I was reasonably happy with most of these. The "if len(x) != 0:" example is an exception though, as it explicitly mentions the important property (length). I find this more readable and I've had student code using "if x:" cause problems when they have passed the wrong type of data in and it's not clear what condition they are trying to test.
In general, a well-read programmer specialising in one language will be fine with compact styles of expression. A non-specialist reader (maintainer, manager, QC checker) can have real problems with some of the denser language-specific constructs. When I was a programmer I had to educate some members of my coding team on complex regular expressions and the trickier bits of Oracle's flavour of SQL. They just couldn't follow the code - they were not front-line programmers and it was above their more generalist knowledge.
Sometimes reducing eloquence for readability is a good call.
Yes. Even code you've written yourself can be hard to decipher a couple of years later and you often don't have the luxury of sitting down and untangling it all because you have other crap to be doing. These days, I write code with the intent that it be at least not difficult to understand and somewhat easy to fix or modify. Being concise and economical is for when you have 1k or less of memory and slow CPUs (though it is definitely possible to go too far in the other direction with that).
As Bjarne Stroustrup often says, the compiler doesn't read comments -- and neither do I :)
Using a less general construct that more explicitly delineates your intent and saves you from writing a comment is a definite positive in my view.
@@isodoubIet Not only that but the compiler will (if it's any good) optimize it to the same machine code (or byte code).
I personally agree that “if len(x) == 0” is more informative and readable, but Python’s official style guide PEP8 recommends using the fact that empty sequences evaluate to False and writing “if not x”
@@rrr-mi9kv IMO the purpose of any style guide is to make code more readable by reducing the amount of time you need to get used to different conventions. There's little value in the style consistency beyond that, which means if violating a style guide (which, by its nature, is "dumb" in that it can't account for every conceivable situation) improves readability, you should do it.
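For what it's worth, here is a small sketch of the trade-off this thread is arguing about, using two made-up functions. With "if items:" a wrong-typed argument such as an int sails through silently, while "len(items)" fails loudly at the point of the mistake.

```python
def report_implicit(items):
    # PEP 8 style: rely on empty sequences being falsy.
    if items:
        return "has items"
    return "empty"


def report_explicit(items):
    # Explicit style: states that length is the property being tested.
    if len(items) != 0:
        return "has items"
    return "empty"


print(report_implicit([]), report_explicit([]))  # empty empty
print(report_implicit(42))  # has items (wrong type goes unnoticed)

try:
    report_explicit(42)
except TypeError as error:
    print("TypeError:", error)  # len() rejects the int right away
```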
One of the things that I don’t see very often is using underscores when writing big numbers. Example: 1000000 vs 1_000_000. The former is much harder to get at first sight than the latter
Wow, u change my life of coding, but well not something u can save in an int
@@Absoluto777 Yes you can!
a = 1_000_000 * 2
print(a)
OUTPUT:
2000000
You can also format in the print function to get a formatted output.
I would add that for larger powers of 10, it's just simpler to use exponentiation as in 10**9 (or even 1e9, though this gives you a float) for a billion.
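A quick sketch pulling the points above together: underscores in literals, the format specs that put separators back in the output, and the int versus float difference between 10**9 and 1e9. The variable names are only for illustration.

```python
a = 1_000_000 * 2        # underscores are ignored by the parser
with_underscores = f"{a:_}"
with_commas = f"{a:,}"

billion = 10**9          # an int
billion_float = 1e9      # a float, as noted above

print(a)                 # 2000000
print(with_underscores)  # 2_000_000
print(with_commas)       # 2,000,000
print(billion == 1_000_000_000, type(billion_float).__name__)  # True float
```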
Tbh many companies still use old af compilers, which is not a big problem in python, but if you are taking the code to another language then it's rough
I dislike how the underscores look tho
watching this video really shows how far you can come in only 4 years of work. When I started learning to code I was way too worried about memorizing these things, when in the end all I needed was more coding experience. I'm glad I found some fun projects to work on while basically ddosing stack overflow with questions until the best way to go about things just got ingrained in my brain. We got this bros. My advice is to just find a project that looks fun and make it to the best of your ability, constantly pushing your ability little by little. The rest comes naturally. :)
o]
Can you give me some examples of projects a beginner can work on please? I'm looking at getting into Python soon.
@@bth1279 I recommend building a web scraper. There are a lot of tutorials on here, don't overthink it, just pick the first one you see and finish it. That's the important part
Sounds good. I will now redesign the Free CAD Gui using python and or C++. I am learning python just for this :) Im also trying to learn QT in C++.
I have just started a couple of weeks ago and every now and then I get a bit down because I feel like I have to copy everything from other sources of information like tutorials and stack overflow. I’m self-learning so of course I do.. how would I know these things? But it still feels like I’m an impostor and might just be too stupid for it. Like I had to watch a video on implementing jumping into my first mario like pygame project and when I looked at how it should be done I felt so down because I was way off to solve a problem that other programmers make seem effortless and straight forward. I guess I should not expect much after couple of weeks but still feels bad 😂 Will keep trying tho as it’s fun!
2 things on this list that I need to do from now on:
1. replace time.time() with time.perf_counter() for measuring program execution time
2. learn how to install my modules as a package
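For the first item, a minimal sketch of timing with time.perf_counter(). The timed() helper is a made-up name for this example; only the two perf_counter() calls matter.

```python
import time


def timed(func, *args):
    # perf_counter() is a monotonic, high-resolution clock meant for
    # measuring intervals; only the difference between two readings is meaningful.
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


total, seconds = timed(sum, range(1_000_000))
print(f"sum={total} took {seconds:.6f}s")
```

For anything more than a quick check, the timeit module or a profiler will probably give more reliable numbers than a single run like this.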
Notes for myself:
0:27 f string
3:51 range len
4:30 Loop dict
5:11 index counter
5:43 logs instead of prints
1. 00:26 - Don't concatenate strings, use f"{string}"
2. 00:36 - Don't use f.close(), use with open
3. 00:54 - Don't use finally as context manager, use with
4. 01:08 - Don't use bare except, use except Exception or except ValueError
5. 01:33 - ^ is bitwise XOR, not exponentiation
6. 01:42 - Argument defaults are defined when the function is defined, not when it's run! Don't use l=[] as empty list, use l=None
7. 02:07 - Use comprehensions
8. 02:24 - Don't overuse comprehensions!
9. 02:49 - Using == to check a variable's type of a Class is tricky!
10. 03:22 - Instead of ==, use is
11. 03:39 - Using booleans with if, Don't act like a noob!
12. 03:51 - Using the range length inside the loop, instead, learn to use enumerate and zip
13. 04:31 - Looping over the keys() of a dictionary
14. 04:49 - Use items() to get key, value pairs of a dictionary
15. 05:01 - Use tuple unpacking to get all its values
16. 05:11 - Creating your own counter variable (enumerate)
17. 05:21 - Don't use time.time() to time your code, use time.perf_counter()
18. 05:43 - Instead of using print, use logging
19. 06:00 - Don't use shell = True
20. 06:14 - Learn to use numpy for math operations
21. 06:24 - Don't use import *
22. 06:34 - Importing files from other directories are tricky! Learn to Package your code & install it into your current environment
23. 07:08 - Python codes are compiled! (pyc, pycache)
24. 07:35 - Use pep8 to avoid nagging
25. 07:57 - Python 2-3, Notes about range() & keys()
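A compact sketch of items 12 to 16 from the list above (enumerate, zip, items() and tuple unpacking), using made-up sample data:

```python
names = ["ada", "alan", "grace"]
years = [1815, 1912, 1906]

# 12 and 16: enumerate replaces range(len(names)) and a manual counter.
numbered = [f"{index}: {name}" for index, name in enumerate(names)]

# 12: zip walks two sequences in step, no indexing needed.
pairs = [f"{name} ({year})" for name, year in zip(names, years)]

# 13 and 14: items() yields key and value together.
born = dict(zip(names, years))
earliest = min(born.items(), key=lambda item: item[1])

# 15: tuple unpacking, with a starred name for the remainder.
first, *rest = names

print(numbered)   # ['0: ada', '1: alan', '2: grace']
print(pairs)      # ['ada (1815)', 'alan (1912)', 'grace (1906)']
print(earliest)   # ('ada', 1815)
print(first, rest)  # ada ['alan', 'grace']
```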
Thx!
Thanks!😁 Legend😊
After many years of python, I still didn't know that socket had a context manager, and also that range had a custom __contains__. Great vid! Also great pacing
Same here. Reinforcing my imposter syndrome ;)
@@Stoney-g1o sus
@@kenonerboy What does "sus" mean?
@@Stoney-g1o suspicious
That's because you'll never need to check if a number is in a range (I can't imagine even one case to use it in). I have 7 years of python programming experience and all this time I knew "x in range" feature. How many times I used it? ZERO
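For anyone curious about the `x in range` feature being discussed: for integers, range answers membership arithmetically instead of walking through every element. A minimal sketch (the bounds and step are arbitrary values picked for illustration):

```python
# range implements __contains__, so an int membership test is computed
# from start/stop/step rather than by iterating over a trillion values.
big = range(0, 10**12, 3)

print(999_999_999_999 in big)  # a multiple of 3 inside the bounds
print(10 in big)               # inside the bounds, but not a multiple of 3
print(10**12 in big)           # the stop value itself is excluded
```

Whether you ever need it is a separate question, but it does mean `n in range(a, b, step)` is a cheap way to spell a bounds-and-stride check.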
Some of these "mistakes" I make because I write in 5 programming languages due to legacy code. Sometimes it is difficult to remember the "proper" shortcuts or code style of each of those languages. Still, I don't consider myself a noob programmer. And when switching from one version to the next, it is sometimes hard to shake old habits. That's why I like to watch and listen to these kinds of videos in my free time.
Tell me about it. I will work with JavaScript for 2 or 3 weeks, then move on to maintaining some code written in Python 2, then I will do some other job function for a few months, a bit of bash scripting, then on to Python 3. I have the general syntax of 5 or 6 languages bouncing around in my head. Every project I go back to, I spend a few hours picking up the nuances of a particular language again.
Every IDE has code formatting; you don't need to remember everything.
This is probably the most immediately useful (to me) video I've seen on YouTube. I watched this two days ago and have since gone through two scripts I've been working on and edited them to incorporate 7 of these suggestions (yes I counted). There's at least like 3 other nooby mistakes I think I've made in the past but I'm not going back to check lol, so probably will just reference the video in the future. This also made me watch the video about looping in python, which motivated me to rework a loop in one of the scripts, and the 21 other nooby mistakes video, which I think is much less applicable to my current habits but may also be relevant later. Thanks for making this video and the whole channel!
All excellent tips, except one: "if x is True" is proscribed by PEP 8. Doing this violates LSP, which you had just mentioned. This is because a Boolean subclass should work just as well. Just write "if x"; never write "if x is True".
but that doesn't imply if you are checking for a False boolean, or a 'falsy' value, like None, 0 or an empty data structure
@@cristian-bull then you use "if not x:" instead of "if x == False:"
In dynamic languages, people keep making silly "overloaded" apis where
True => {a_key: True}
False => {a_key: False}
but you can also submit a whole object
{other: 8} => {a_key: False, other: 8}
{a_key: True, other: 8} => {a_key: True, other: 8}
to supply more advanced options for the same argument
"if x" is not the same as "if x is True". And "if not x" is not the same as "if x is False".
Christian is right.
@Certyfikowany Przewracacz Hulajnóg Elektrycznych That's because many C programmers have a bad habit to compare boolean things with true or false.
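To make the difference between the three checks in this thread concrete, here is a small sketch. `describe` is a hypothetical helper written only for this comparison, not something from the video:

```python
def describe(x):
    """Report how x fares under truthiness, equality and identity checks."""
    return {
        "if x": bool(x),          # truthiness: what a plain "if x:" tests
        "x == True": x == True,   # equality: shown only for comparison
        "x is True": x is True,   # identity: only the True singleton passes
    }

for value in (True, 1, "yes", [], None):
    print(repr(value), describe(value))
```

Note that 1 is truthy and equal to True but is not the True object, while "yes" is truthy without being equal to True at all. So the three spellings really do answer three different questions.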
Number 26: Usage of single character variable names. Single character variable names lead to confusion and bugs in your code as it's hard to discern beyond the definition point of a variable what it is being used for. Opt for a more descriptive name as, after all, variable names are free. (As a SDET by trade, this is my biggest pet peeve)
i honestly think people who only use 1-2 letter variable names are just flexing and don't want anyone to be able to read or manage their code down the road. I've straight up just deleted an old script and rewritten it just to avoid dealing with understanding the previous coder's impatience with adding 3-4 extra letters.
@@RobGodMC If you're looping over the indices, it'd be "index". Never end a variable name with "_variable", it's always redundant.
`for i_hate_long_names in range(10):`
@@itellyouforfree7238 First of all: Not a descriptive name, refactor.
Second of all: Sorry not sorry. Quality goes beyond just finding bugs. It also is about preventing bugs and single character variables are a gateway for bugs.
Number 27: thinking that long_names and 1-2 characters are the only options.
The "default gets evaluated on module loading and not during call with undefined argument" is insane.
You are greatly underestimating the sanity of job security.
@@DemPilafian , you greatly overestimate how much the people who design Python care about job security.
i 100% agree. is it really even a default at this point? i think not.
@@willpeterson3943 And for when people do want it to be a reference to a mutable global, the default expression can, just, well, reference that named mutable global.
I can't imagine how this happened. This global they are introducing is literally anonymous. What could be worse?
The parameters passed to a function are outside the scope of the function, even if they're the default parameters. If you want the default to be within the scope of the definition then you have to put it inside the body of the definition. It's a consistent and reasonable choice. More so if you're using a mutable data type it implies you want it to persist. If not use an immutable type.
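For readers who have not been bitten by this yet, a minimal sketch of the behaviour this thread is arguing about. The function names are made up for illustration:

```python
def append_shared(item, bucket=[]):   # the default list is created once, at def time
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):  # None sentinel: a new list is made per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_shared(1)
second = append_shared(2)
print(first is second, second)            # both calls returned the very same list
print(append_shared.__defaults__)         # the default object lives on the function
print(append_fresh(1), append_fresh(2))   # independent lists
```

The default is stored on the function object itself (`__defaults__`), which is why it persists between calls.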
I'm so nooby I didn't understand most of this video... Lol.
3:45 This advice is *questionable*. Sometimes it is good to use "if len(x)" just to clarify that x is a list/set/tuple, or "if x > 0" just to clarify that x is a number. Not always, obviously, but when you're so deep in the code that you can't easily verify what your variable is supposed to be, it's good to have something that explicitly shows what the variable means.
Not to mention that this advice would fail on numpy arrays, which are recommended in the same video.
one could argue that if "you're too deep that you can't verify what your variable is supposed to be", your code is already unreadable. However, I also think that "if len(x)" is not that bad. Keep in mind that "if x > 0" actually has a different (and in some cases more useful) meaning than "if x", at least for negative x.
I think it’s called type annotations. The something that explicitly shows what the variable mean.
All your assumptions with this checks may fail if X implements appropriate magic methods for this operations.
@Jazzy Jones it's because an empty array is a thing that exists and is defined, so it can't be false. If anything, it's the "[] is False" that is weird in Python. And none of them are treated as None, obviously. Nothing is None except for None.
@Jazzy Jones Here's the result of evaluating an empty array:
DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
#pycharm, by the way :D
3:37 Honestly, writing "if x" to check if "x" is a non-empty list hurts readability quite a lot. I always like to write "if len(x) > 0", which reads like plain English (#pycharm by the way)
Came here to say this. You shouldn't be trying to prove how smart you are with your code, you should be making it readable to even the dumbest person
Of course, but the _real_ problem is that you're using x as a name for a container. _That_ is what you should change, and then "if errors:" or "if tasks_remaining:" should become _a lot_ more natural.
Vedran is right. "if errors" reads much more like english than "if len(errors) > 0".
@@Jeyekomon Yeah sure, but at this point wouldn't it be better to have a boolean variable called "errors_occurred" (defined as len(errors) > 0) and directly check that? In general, using a non-boolean expression in an if statement is fundamentally wrong in my opinion
@@francescocariaggi1145 Opinions are nice, but have you ever coded like that? My opinions are based on decades of experience, working on systems of various complexity -- and they align with the opinions of people who have written PEP 8. I've seen, and written, "if errors" or analogous constructs, many times. I've never seen your "errors_occurred" approach.
I can even guess why: because it's very easy for such a separate piece of knowledge to get out of sync with the knowledge already present in "errors". Having a single point of truth is a powerful concept, especially in complicated systems.
About your "fundamentally wrong", not many such ivory-tower opinions survive the collision with the real world. We do think of various operations polymorphically, and denying that in the name of some simplistic type alignment is not useful. It is the same idea as "using a non-iterator in for-loops is fundamentally wrong", or "using a float in floor division is fundamentally wrong", yet we really want to write (and think) about "for error in errors", not "for error in iter(errors)", and "n_frames // rate" instead of "n_frames // int(rate)". Practicality beats purity in many occasions. It's exactly the same with "if errors" instead of "if bool(errors)".
Can’t believe you are naming variables the lowercase “L”. That is one of the things that was stamped into my brain when learning programming. Now I feel weird if I use it lol. Anyways, super helpful video. Never knew about enumerate(). This is gonna be a total lifesaver!
@meme why single character vars are bad? I'm newbie
@@gustanoid shortened var names make you memorize stuff. It's fine if it's something important that will be seen a lot, but for most purposes variables should have some sort of description of themselves. You don't want to have to remember a new encyclopedia for each member of your team when doing code reviews.
@@kidneybean5688 I'd like to give an example to extend your argument;
For example, you could have a list named illustrations and would like to use map function to map into another list. While using lambda/anonymous functions/closures/whatever you name it, using the name illustration as parameter makes the code too long and hard to comprehend. It is clear that you are iterating over illustrations, so simple single letter like x or a would come in handy.
names = map(lambda illustration: illustration.name, illustrations)
names = map(lambda x: x.name, illustrations)
This is also applicable to other languages, don't duplicate the obvious naming. Adding player_money field to Player class is usually pointless, don't name it over explicitly unless it is needed to be.
@@kulkalkul I've been writing a lot of code in Elm lately (great language), which definitely loves its newlines, so I can get a little more spacey with my names. My solution to this in a language that takes up more horizontal space would either be to shorten "illustration" to a mnemonic like "ill", "iln", "ilstn", or to make some newlines w/in the function call.
@@kidneybean5688 I definitely agree. I also started to use Rust for almost a year and libraries in Rust tend to use long names for sure, as being able to vertically space out code helps greatly. Though, I still prefer shorter names. It wasn't fun to use InteractionApplicationCommandCallbackDataFlags tbh :D
Had this pop up in my recommended, figured I'd watch it for fun.
It is now several hours later and I've incorporated #1(f-strings), #16(index counters), and #17(time.perf_counter()) into all of the scripts I use for work.
Fun story: I was hired as a PHP developer, but soon was tasked to overtake a Python project from a freelancer that was leaving in just a couple of months. Lambdas, list compression, confusing yet powerful multi-inheritance, wrapper functions and attributes, constructors and initializators, bytecode compilation, tuples, dictionaries. I quickly fell in love with the language. Now, I'm back dealing with legacy PHP code at another company and miss Python dearly. Still need to figure out a pet project, so I could finally scratch my nostalgia itch. #pycharm
Look at Ruby, it is Python done properly.
@@transientaardvark6231 Thanks! I'll look into it. For now I'm having fun with F#.
It's list comprehension, not compression
@@transientaardvark6231 It is also Python abandoned
@@itellyouforfree7238 yeah, looks like it got autocorrected, and I missed it
I'd love to hear more detailed explanation for 3:23. Since it's for noobs, people won't get why they shouldn't use == instead of is (or vice versa). AFAIK there are at least two reasons (1. speed--since "is" checks reference while "==" checks value, 2. custom equality definition), but I'm not sure if there are more.
The reason for using "is" with None instead of "==" is ostensibly speed, but the speed increase would be totally unnoticeable under normal circumstances. Another reason is that None is a singleton, so using "is" acknowledges it as such. This is one of those things that in actuality makes basically no difference, but people will nag at you anyway, so it tends to be noobs that still do it.
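The "custom equality definition" reason mentioned in the question can be shown in a few lines. This is a sketch with a deliberately silly, hypothetical class:

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)  # True: == asks the object's __eq__, which can say anything
print(obj is None)  # False: identity cannot be overridden by the class
```

So "is None" asks whether this is the None object, and no user-defined class can change the answer.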
@UCAl4YW8UbOfjz9ZxaQcXlzw I think you are talking about the example prior to what the commenter was mentioning, and I can assure you LSP is relevant there. Furthermore, LSP absolutely does NOT say anything about the relationship of two children of the same parent. Substituting one child class for another from the same parent is only safe to do if the function/method/etc was expecting the parent class, not if it was expecting a specific child type. In general you cannot substitute one child class for another, regardless of their common parent. E.g. in a print_id function that expects an object, it's fine to pass a list or an int because they are both objects, but in a print_all that expects a list, you certainly cannot expect it to work if you pass an int just because int shares the same parent type (object). I think you are also mixing up LSP, aka "behavioral subtyping", with Python's notion of structural or "duck typing". LSP is a general principle not specific to Python. It is a stricter requirement than duck typing. I am definitely not saying you should always try to adhere to LSP, or even to duck typing for that matter. However, checking a type for equality is a violation of both LSP and duck typing (well, assuming you actually do something different in the two branches of the if statement). The "Pythonic" answer is to only enforce duck typing, not the stronger LSP, but in practice this becomes very difficult to maintain on anything but small projects. Many developers therefore choose to abide by the stricter LSP most of the time. Moreover, if someone (especially a noob) tried to check a type for equality, say to overload behavior if they got a str vs bytes, an isinstance check is morally probably closer to what they wanted than "just see what happens if you call quack()".
If you start to worry about speed then Python is not your language.
I always loathed that there is the “is” statement and the == it’s poor design. After all, functionally you do the same thing.
A trademark of a bad language IMO.
@@CallousCoder There are subtle differences, but I do agree; even the reference comparison is easier to just do it like what one would do with C (pointer dereferencing).
@@arduous222 exactly!
Love all the tips except for a couple. I like writing things to be crystal-clear, and I find that `if len(x) == 0` takes just a little less mental processing power to understand. Same for `for k in my_dict.keys()`.
yes, I thought the same thing with both examples
And the fact that you "code highlight"ed your code passages even though they are not rendered on YT is oddly satisfying :D
It's also worth noting that "if len(x)" is necessary for some data types. Pandas famously throws an error if you check the object itself: "The truth value of a (Series/DataFrame) is ambiguous." (Thanks for the headaches, Jeff)
Agree with the .keys() one, save for specific scenarios, e.g., codebase used mainly by python senior devs.
In general, I'd vouch for explicit intent, as I could see reasonable arguments and counter arguments behind multiple conflicting answers to "what should be returned by default when iterating over a dict"
Yep
You can tell m has never passed 0 instead of [0] into a function they wrote and spent half an hour trying to figure out why their iterator wasn't working
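The 0-versus-[0] situation described above can be sketched like this. Both functions are hypothetical stand-ins for "a function you wrote that expects a list":

```python
def process_truthy(items):
    if items:                  # 0 is falsy, so a wrongly passed 0 looks like "empty"
        return [i * 2 for i in items]
    return []

def process_len(items):
    if len(items) > 0:         # len(0) raises TypeError right at the check
        return [i * 2 for i in items]
    return []

print(process_truthy(0))       # silently takes the "empty" branch
try:
    process_len(0)
except TypeError as exc:
    print("caught:", exc)      # the mistake surfaces immediately
```

With the truthiness check the wrong type goes unnoticed, while the len() version fails at the line where the assumption is made. Type annotations plus a type checker would be another way to catch the same mistake.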
This actually made me feel a lot more confident about where I am in my Python learning journey. I expected to fail all of these but I actually consistently do most of the things you recommend!
I think doing a full "Python 2 + 3 in the same module" video would be pretty valuable, you'd be surprised how many programs still use Python 2 to some capacity for scripting
exactly. i am stuck with python 2.7 at work because the program that we use has not yet updated their API to be compatible with python 3...
Bit of a necro, but I actually came down to the comments to make almost exactly this point. In my day job, I deal with ArcGIS, but being a municipal government job with the usual attendant budget and manpower, we are still realistically a painfully-long way away from the main IT department completing their migration to ArcGIS Pro, which is when the developers finally moved to using Python 3. So I write almost exclusively for Python 2.7, because that's what's supported in the older ArcGIS Desktop.
Constant references to "Why are you still using Python 2? Upgrade already!" when I'm chasing a problem are more than a little tiresome, as that decision is many, many levels away from being close to being my call.
@@altasilvapuer Came here for exactly this comment. At my job I just use ArcGIS Pro on my machine, so most of my work is done in Python 3. However we use ArcGIS Server instead of Enterprise and all the environments use ArcGIS Desktop. So if I want to write any tools to sit on the server side I write in Python 2! Enterprise is in the works though...
Haven't even expected how "noob"ish I was.
The channel is underrated, because it's rare to see someone who digs deep into the topic with lots of details. Usually creators just run through basics of Python and that's it.
#pycharm
That's just a tip of the iceberg. Next you find out it's not possible to use Python in a non-noobish way. It just makes it impossible to write a program properly.
this is one of those videos where I learn about a bunch of things I'll immediately forget without usage. I reckon I ought to save it so I can reference it the next time I'm pythoning
Now even if you save it, you're just going to let it sit there and some things will change by the next version of python. Then, after you review it, you'll still forget it later. And the likely scenarios where you do need it won't come up. Info dumps, man. It's all the same.
As for PEP8, please do read the intro: "Many projects have their own coding style guidelines. In the event of any conflicts, such project-specific guides take precedence for that project."
I thought I was ok, but you got me with perf_counter. Though I'm usually measuring time in minutes not microseconds, I had no idea I was being so inaccurate this whole time. Great video!
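For reference, a minimal timing sketch using time.perf_counter(). The `timed` helper is a made-up name for illustration:

```python
import time

def timed(func, *args):
    """Return (result, elapsed_seconds) measured with perf_counter."""
    start = time.perf_counter()   # monotonic, high-resolution clock
    result = func(*args)
    return result, time.perf_counter() - start

total, elapsed = timed(sum, range(1_000_000))
print(total, f"{elapsed:.6f}s")
```

Unlike time.time(), perf_counter() is not affected by system clock adjustments, so the difference of two readings is never negative.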
1 Snooty coding habit you need to ditch:
Coming up with lists that denigrate other coders.
Don't pretend you're instructing. You're not.
18: No, that does NOT look a lot better. In logging format, just use {} formatting instead of %. It's more consistent with the earlier referenced f-strings and therefore much more readable.
Agreed. If data such as the time capture is required, or the ability to pipe to a file is necessary then logging is the way to go. Otherwise, just use modern formatting.
In either case, debugging through standard out messages alone is the real problem, and switching to logging isn't an improvement there.
agreed.
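On the `{}` versus `%` point: the logging module does accept `{}`-style format strings for the formatter via `style="{"`. A small sketch, where the logger name "demo" and the in-memory stream are assumptions made so the example is self-contained:

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# style="{" switches the formatter's template to str.format syntax
handler.setFormatter(logging.Formatter("{levelname}:{name}:{message}", style="{"))

log = logging.getLogger("demo")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False             # keep the demo output out of the root logger

user = "alice"
log.info(f"logged in: {user}")    # f-string: formatted eagerly, before the call
log.info("logged in: %s", user)   # %-args: formatted only if the record is emitted
print(stream.getvalue())
```

Note that `style` applies to the formatter template; the arguments passed to the individual `log.info(...)` calls are still merged with `%` by default, which is why many people simply pass an f-string there.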
I am no beginner - I started with python back in college circa 1991. But I do a lot of mathematics in pure python because I need the bignum integers and mpfr support for high precision floats.
Numbers 6, 7, 9, 12, 15, 17 are really useful and well illustrated. Thanks for the explanations & comparisons
About the 19 tip: There's in fact a function which I use to avoid typing in manually the command inside a list. Instead of just typing the list, when using big commands, I just import "split" from "shlex" module, which can be used to split text into lists (i.e. subprocess.run(shlex.split("find / -type f 2>/dev/null")) and this will handle the problem
That solves many of the problems with 'shell=True', but not all of them - in particular you still have to worry about things like spaces in filenames if you're generating the command from strings. It's potentially more exploitable with malicious user input as well. It really is best to just bite the bullet and form your command as an array.
Also FYI your example doesn't work - it will execute the command "find" with four arguments: "/", "-type", "f", "2>/dev/null", which isn't what you meant. If you're not using 'shell=True', you can't use shell constructs like "2>/dev/null" or "|" or ";" etc.
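To illustrate the reply above: shlex.split keeps "2>/dev/null" as a plain argument, and without a shell the redirection has to be requested through subprocess keyword arguments instead. A sketch, where the child command is just the current Python interpreter so the example runs anywhere:

```python
import shlex
import subprocess
import sys

args = shlex.split("find / -type f 2>/dev/null")
print(args)   # the redirection survives as an ordinary argument string

# Without shell=True, ask subprocess to do the redirection itself.
child = "import sys; print('out'); print('err', file=sys.stderr)"
result = subprocess.run(
    [sys.executable, "-c", child],
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,    # equivalent in spirit to 2>/dev/null
    text=True,
    check=True,
)
print(result.stdout)
```

Pipes and `;` need the same treatment: either chain separate subprocess calls in Python or accept the trade-offs of a shell.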
There are 25 tips, not 19
@@csanadtemesvari9251 would seem they meant tip 19/25
One more point that I think is a must in the list: defining class properties and expecting them to behave as instance properties. That is, defining properties on a class outside of init.
Seen this one to be genuinely confusing people that come from other languages.
Is it possible that you're confusing properties (i.e. @property) with attributes (e.g. self.x or cls.x)? they aren't the same thing, though I do believe you make a good point.
@@valorien1 no, it's about defining attributes in class definition vs. assigning inside init
@@kristobaljunta Which was precisely my point. you wrote "defining class properties..." when you meant "attributes". They aren't the same thing.
@@valorien1 what would i google to learn more of this?
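The usual search terms are "class attributes vs instance attributes". A minimal sketch of the surprise being described, with invented class names:

```python
class Shared:
    tags = []                   # class attribute: one list shared by all instances

    def add(self, tag):
        self.tags.append(tag)   # mutates the shared list found via the class

class PerInstance:
    def __init__(self):
        self.tags = []          # instance attribute: a new list per object

    def add(self, tag):
        self.tags.append(tag)

a, b = Shared(), Shared()
a.add("x")
print(b.tags)                   # b sees the tag added through a

c, d = PerInstance(), PerInstance()
c.add("x")
print(d.tags)                   # d is unaffected
```

People coming from languages where fields are declared in the class body tend to expect the first form to behave like the second.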
3:36 Don't use if x is True, use if x. Don't use if x is False, use if not x.
6. 01:42: Use of default mutable arguments
9. 02:50: Checking for a type using "=="
17. 05:21: Using "time.time()" to time things
18. 05:43: Littering your code with print statements instead of using the logging module
19. 05:59: Using "shell = True" on any function in the subprocess library
22. 06:33: Depending on a specific directory structure for your project
23. 07:07: The common misconception that Python is not compiled
24. 07:34: Not following PEP 8
3:43 I find "if len(x) > 0: ..." more readable than "if x: ..."
4:27 Nested tuple unpacking is new to me though, thank you :)
5:51 I also didn't know about logging. I always rolled my own logger class.
It depends how you use x. Sometimes it makes more sense to use len in order to understand the code
@@Grivian Of course it depends on how you use x. But semantically speaking, the if statement expects some kind of condition to be evaluated, and x probably being a list in this context is not a very prototypical condition, unlike len(x)>0 which to me actually looks like a condition.
Somehow this channel is extremely underrated. I’m just a beginner in python but few of the videos have helped me to such a extent that has reduced my months of work.
Just to give an example the video on for loop Vs while loop is a gem.
Thanks a lot sir for sharing this valuable information.
#pycharm
2:40 Use "yield" and turn that function into an iterator.
3:30 Use "if x", "if not x" instead of literal "if x is True", "if x is False".
“if x is True” is totally valid, because what if x isn’t a Boolean True/False? Sometimes we want to make sure it points to that singleton object “True”
@@yjc5931 You want to make sure that your variable is of type bool, not anything else, like a list of three strings, which is also True. And this exactly is not a good coding practice in most cases. Generally your code should not need to test type of your variables.
@@Jeyekomon agree and disagree. I agree that in most cases we won’t need to check it, as we can just use a type hint on the function argument, but I don’t think we can assume that, in the absence of type hints, we will never need this. Just pointing out the video isn’t wrong either
Great video. I so much agree with 8, few times I stumbled across the projects that overused list comprehensions, sometimes it was painful to read such code. List comprehension is a fantastic Python feature, but people should learn when to use it and when to avoid it. Readability should always come first imo #pycharm
Well said!
Readability can be improved by using multiple lines, in this format:
[ this
for this in that
condition
condition
]
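Filled in with real code, the layout above might look like the following sketch. The word list and the two conditions are invented for illustration:

```python
words = ["apple", "Banana", "cherry", "kiwi", "Avocado"]

selected = [
    word.lower()                  # the value that goes into the result
    for word in words             # the loop
    if len(word) > 4              # first condition
    if word[0].lower() in "ab"    # second condition (implicitly AND-ed)
]
print(selected)
```

Each clause gets its own line, so the comprehension reads top to bottom much like the equivalent for loop would.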
@@kilianmio6243 Python is regarded as self-documenting code. As soon as you need to explain what your code is doing, it's not good code. Comments should only be used for why your code is doing something.
@@mCoding #26, not installing black and having it run on save...
With great power comes great responsibility. Python has a ton of features that can both make and ruin your code. Knowing what tool to use in each situation ensures the code is readable
Woow, as a total newbie - started with python this semester at college - this was an information overload, great vid though.
P.S. Loved the "smash the like button an odd number of times", but I'm at even and it is currently liked now, so I have to apologize for even momentarily disliking to achieve this greatness
3:39 even if len will cause a little bit more execution time, it is much more readable compared to if something itself. Also, if you want a separate *raise* call and/or logic for each case, you have to do a *separate if ... is None* check and *if len(...) == 0* check anyways.
As an experienced Python user (5+ years, 10k+ lines experience), I am happy to know that I already follow most of the rules mentioned in this video. I do manually open and close files in the REPL, because otherwise you have to type 4 more spaces on every line afterward if you open a file using a context manager (with statement). I also do this when I am writing a script in a hurry and need it to work right away. I didn’t know that perf_counter is the correct way of timing code (though I rarely measure the running time of Python code at all). As for creating my own index variable, I have to say that it is sometimes necessary if you are using a while loop and the update condition is not straightforward (for example, when implementing some algorithms). Also, while I agree that numpy is powerful, there are a few cases where the built-in Python math is better, such as big integers by default, so you never need to worry about overflow.
5:43 "number 18: littering your code with print statements instead of using the logging module" #pycharm
I felt that... I've been doing personal python stuff for at least 3-4 years ... yet I actually never spent the time to learn how to use logging and how to format stuff with it, etc.
same here! ive been writing python for almost 8 years now and i still use print statements rather than logging lmao, logging looks so much cleaner though
It looks nice but I can live without it
Meh, I've been using logging for a decade. Most of the time I find myself just relying on a mix of assert/break/print anyways. It's just quicker and easier and 90% of the time I'm just doing a temporary inspection. Logging is super great for end-user information, but for debugging break/assert is the right tool IMO. What's the point in all that pretty printing when you're just querying about internal state, etc. Obviously this changes a bit in collaborative projects, but even at that, I'd like to argue asserts are descriptive enough.
I am learning Python and all your videos are amazing! Thank you so much for helping me in the python learning process. I really like your teaching style. Some of your python topics are above my python comprehension because they are intermediate or advanced and that is okay. I will come back to those videos in the near future when I gain knowledge in my learning process.
In many of your videos there are hidden gems. Those gems make me realize that noobies need to ditch another bad habit: not reading the "What's New in Python" documentation when a new Python version comes out. I learn so much from reading the PEP documentation, with example code, that is referenced under the new features/implementations. Plus, I feel this will keep me in sync better with the Python language updates.
Great tips and I learned some stuff myself. For 6:14 / tip 20 (always use a math library like NumPy) though, I really need to caution against doing it blindly. For example, libraries like NumPy are mostly designed for handling large arrays. I don't want to get into the specific context where I encountered this, but I have seen situations where lots of people just blindly assumed NumPy is the best and used it for doing math on 3D vectors, resulting in really slow performance. From what I have found, NumPy is not always faster than just doing the math in Python directly if you need to do lots and lots of 2D / 3D vector calculations, because in Python you could just represent them as a simple struct, and you don't have to wrap / unwrap them as NumPy arrays (which, again, are optimally designed for handling large arrays, not for length == 3). If you are doing data analysis type work with a long array of 100+ items, that's another story.
So bottom-line is: try to understand the performance characteristics of your library and understand if it actually fits your needs. Just because a library like NumPy is popular and fast for its intended use cases, does not mean you should just use it for everything math related without thinking.
The thing is, if you have to do "lots and lots of 2D / 3D vector calculations" you would probably stack them in a two dimensional `numpy.ndarray` and apply the same transformation to all of them simultaneously.
I was very happy when I first found out that you can use comprehensions in python because I am absolutely obsessed with the set builder method. I have been using comprehensions since I was a newbie.
excellent video, yet laconic, thanks a lot! I'll add to your #22 that usually the most convenient way is to install a package you're working on as "editable", with pip install -e . (dot)
took me a while to wrap my head around package/module importing rules
#6 can be confusing for people coming from languages like C#. In C#, we call these "default arguments" (which have to be specified at the end of the parameter list) and they are always that value if nothing was passed in. Unlike Python, they aren't stored in memory to be used every single time that method is called if nothing else is passed. I think that's a horrible way for Python to handle it.
That's pretty much every language except Python, which decided to be weird. Python has a lot of quirks like this.
That's an extremely stupid way of Python doing it. I actually had to rewind and listen to #6 again, because I thought I misunderstood something because no way it can be THAT stupid. Apparently it can.
@@sgtGiggsy It seems really stupid until you realize you can (ab)use it. It's a pretty neat way to do caching, for instance.
@joje86 Independently of mutables, I would gladly trade this trick for the ability to use dynamic expressions of previous parameters in the defaults instead of having to set default=None and have default initializations take half of the function body.
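The caching trick mentioned a couple of replies up looks roughly like this. It is a sketch only; `_cache` is an arbitrary parameter name, and functools.lru_cache is usually the more conventional tool:

```python
def fib(n, _cache={0: 0, 1: 1}):   # the dict is created once and persists across calls
    if n not in _cache:
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]

print(fib(30))
print(len(fib.__defaults__[0]))    # the cache now holds every value computed so far
```

It works precisely because the default is evaluated at definition time, which is also why the `default=None` boilerplate is needed when you want a fresh object on every call.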
3:26
Using "is" to check for booleans is also a noob mistake or even a bug, because with values from other libraries "is" will return False no matter what. So in the case of
a = np.array([True])
a[0] is True
returns False. Just use if a[0]: or if you want False statements, if not a[0]:
Oh yes. I just checked it. It's because
>>> type(a[0])
The 'is' keyword checks if two things are the very same. Two things which are not even the same type cannot be the very same.
>>> np.bool_(True)
True
>>> np.bool_(True) == True
True
>>> np.bool_(True) is True
False
This is not a bug but intended behavior!
"if a[0]" is something completely different than "if a[0] is True".
What about sorting a 10-gigabyte data set just to get a max or a min element of it?
Or assigning values of entirely different types to the same variable in different branches of code?
Formatting strings, comparing booleans, looping over dictionaries or using comprehensions are nowhere near real problems with Python shitcode (most of which is actually promoted by the language). Imports should be handled by the IDE; if it doesn't then that's a shitty IDE, perhaps exactly matching the language quality.
#23 mostly happens because people expect bytecode-interpreted implementations to work almost as fast as compiled implementations. But due to Python's dynamic typing and slow function calls, it lags behind a lot more than other languages
Holy crap it is soooo slow haha. Doesn't matter most of the time though.
Most of those come with a bunch of asterisks, and a couple of them are not necessarily what you want. For example, while you can often get away with "if x", you run into the issue that the boolean value is implicitly inferred, but one of the basic lines of the Zen of Python states that explicit is better than implicit, and in this case you can often increase the readability of your code by explicitly stating what kind of property you expect, and it will tend to also give you better errors closer to the problem when you get a "you cannot use len on this thing", rather than it just performing the code for 0 length collections. That is an intermediate error, and not a noob one though.
Here are some examples of the asterisks (as in cases where it is not quite right):
1) When nesting string construction into each other, things quickly become complicated and you will want to start making use of more of your string building tools than just format strings.
3) When you are building context managers yourself or you want custom local behaviour.
4) The bare except can come up when dealing with dangerous multiprocessing, and you want to ensure that you do not leave zombies behind, though you would generally raise the exception again after doing graceful shutdowns. For the except Exception: case, it is especially useful for multiprocessing calls on the worker side, where you then have the option to gracefully send back information of the error.
6) There are some very rare cases very similar to singleton cases where you would want this. That said you always have to think really carefully about singleton cases, so this would be a super rare case where it would make sense.
7&8) They kind of already go there, and sometimes comprehensions are just not really good for the thing you are making, and sometimes you really should be using numpy instead.
9) Even isinstance is far from broad enough for a lot of cases, often you want to use hasattr instead to work better for duck typing situations. Naturally there are also the rare cases where you need to make sure it has not been inherited.
10) "==" has a different behaviour than "is" when the arguments are of different type, and especially the == True or == False has uses. Also be aware that the latter 2 can be vectorized with numpy, while the "is" format cannot (at least nicely).
11) See above about clarity/location of errors.
12) The idiom is used when you need to do alterations on a mutable collection and those alterations are not limited to operations on mutable internals. It also much more often happens as part of algorithms, though those are more commonly done with just range(n). A typical example (where it is double nested) is when you apply a local filter to an image, and you loop over the elements of the filter, and then do numpy operations to construct the image component from each part of the filter.
13) Okay this one is a bit far fetched, but the dictionary.keys() object can be a lot smaller (iirc), so if you want to build something based on those keys outside of the shared memory area, you could package it down and send that instead of the larger full dictionary. This would mainly be for multiprocessing.
14) Most of the time you actually want the dic.values() instead, if all you want are the items.
15) Tuple packing and unpacking can take time, especially if you have something long, and it will create the parts you ask for. If all you want to check is the first 3 elements of a 100+ length thing, then do not spend the time writing a, b, c, *_ = items, just write first_three = items[:3].
16) The case here is when you do not necessarily increment it every time as it refers to something slightly different, but I am unsure of whether that really is needed, and that kind of algorithm code is much more often written in c style languages.
17) Most of the time, you should run cProfile on your code instead of checking manually this way, it will give you much more information.
18) Logging can be problematic in multiprocessing situations, whereas print is usually pretty safe there (logging can get into nasty deadlocks).
20) Numpy is nice, but a lot of good algorithms cannot make good use of it for at least some parts, and sometimes going with O(n log(n)) is better than vectorizable O(n^2). Neither numpy nor pandas uses GPU acceleration, so once you start wanting that you also need to move away from them. Pandas is also kind of weird, in the sense that if you take a look at benchmarks, it tends to be quite poor compared to just base numpy, so you have to really want the other features of it for it to make sense to use, because a lot of your basics are just going to be slower because of it.
21) The general python style advice is to import modules and not component out of modules in most cases.
22) Making everything as packages makes it very hard to run parts of the code as scripts, so you should not just convert everything into packages.
23) This is blurring the waters a bit, as it is interpreter dependent, and you will not find compiled files for all python code. I for instance have not seen such files for interactive sessions and other equivalent things. It is correct that python code is built into python byte-code at least in memory though.
25) In python 2, print was a statement and not a function. Iirc generator comprehension also first came into python in one of the 3.x versions.
Of all those habits, the only one I would say I fall into is 17, as I was not aware of time.perf_counter.
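For anyone else who had not met `time.perf_counter` (item 17 above), a rough sketch of the basic pattern. `busy_work` is a made-up workload that only exists to give the timer something to measure.

```python
import time


def busy_work(n):
    # Hypothetical workload, only here to have something to time.
    return sum(i * i for i in range(n))


start = time.perf_counter()
result = busy_work(100_000)
elapsed = time.perf_counter() - start

print(f"busy_work took {elapsed:.6f} s")
```

Only the difference between two `perf_counter()` readings is meaningful; the absolute value has no defined reference point. For a breakdown by function rather than a single number, the standard library's `cProfile` module is the tool item 17 is pointing at.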
8:30 not necessarily true. *Only* works if the value you are checking is exactly of type int (or bool). Does not work for other int subclasses, any other object, and, most importantly, float. For those it will still fall back to checking all values one by one by iterating over them. That is very very slow for large ranges.
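A small sketch of the difference described here, assuming the 8:30 remark is about `x in range(...)`. As far as I know this is how CPython behaves; the ranges are arbitrary examples.

```python
big = range(0, 10**12, 2)

# Exact ints are answered arithmetically, so this returns immediately
# even though the range has 500 billion elements.
print(10**11 in big)      # True
print(10**11 + 1 in big)  # False (odd number)

# A float compares equal to an int, but membership is found by iterating,
# so keep the range small when trying this.
small = range(0, 1_000, 2)
print(10.0 in small)      # True, found by scanning
```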
2:47 wait can you do that “ij-entry” thing and have it store the actual values of i and j? I had no idea
#pycharm
1. In my defense, print is a perfectly fine way to debug a code. It's simple and easy assuming it's only temporary.
Also, I'm surprised you didn't include using top-level programming as noobish. It's fine for simple code but some beginners still make their code looks like it's not an OOP language
After years of writing code professionally, I still can't be bothered to use the logging module for debugging. print statements all the way, they're easily added and easily removed.
@jurgenhaan7652 One advantage of logging instead of printing is that you don't have to add or remove print statements manually - you can easily config logging to only print debug information in debug mode, but disable everything except critical errors in production mode.
I place at the beginning of my code: DEBUG = print
When complete, I do a line replace and get rid of all of them. And DEBUG gets its own highlight, and so stands out from needed "print" statements.
@k.chriscaldwell4141 "when complete" is not always a hard statement. That's why I also do as you, but don't remove the line, instead replace DEBUG for #DEBUG....just in case.
" still make their code looks like it's not an OOP language"
Well, it's not.
#pycharm I'm migrating from primarily C# and C++, and I really appreciate this video as a guide to get into good Python habits. Thank you for this!
why the migration ?
#pycharm Gotta love how your channel's python tips consistently manage to surprise, been hooked since the video about the cache decorator!
Owo
I had no idea about the issue with mutable defaults. That's wildly counterintuitive.
Of course, they probably can't change that behavior because some major library somewhere is probably using it as a core feature.
The enumerate thing is a life saver never knew this existed
Glad to help!
then you are a real noob😂
#pycharm I hopped in this video thinking I was the expert then the 'perf_counter' hit me hard... Amazing video as always :D
#pycharm
Really like your videos!
#pycharm I'd really love an expansion on the logging and use of perf_counter. Timing stuff always seems like a pain
Thanks a lot for this. I'm an engineer with a lot of experience in other languages but Python is pretty new to me. This helped me easily understand most of the more niche concepts and the dos/dont's. Thanks!
Most of these things are really specific to Python. I’ve been learning this language by myself doing some personal projects and I see that I don’t know most of these pythonic stuff yet, but I can code many things following what I already know in other languages like javascript, C, C# and Java for example.
When looking for beginner courses I usually don’t get many of those features that were shown in this video, instead usually those basic courses teach about introductory things in a more procedural way and in some cases in object oriented fashion.
Yeah I learned python before java and for a long time most of my python code looked like java code. It's not the end of the world if you do this, but it is still very important to learn the "pythonic" way of doing things
5:43 this isn't a nooby thing to do. The logging module requires some thought and setting up (especially if your code is spread across multiple files?), while the print function is always just there. Even the documentation states that in some cases *print* is the best tool for the job.
Also the logging module can litter your output with unsolicited logs at times. I'm not sure where they come from (they're a bit cryptic), but I've been getting that while using imgui with pygame.
Using print for logging is absolutely a nooby thing to do. If you have "unsolicited" logs, that is what logging levels are for.
@michaelpuskar6975 changing levels will turn off those unsolicited logs AND the ones I'm soliciting of the same level.
@skaruts Then you need to rethink your strategy. But debug, info, warning, critical are industry standard. At least with logging you can turn off all debug messaging in one place. With print statements all over your code, you're going one by one.
The logging module is very flexible.
@michaelpuskar6975 I think you're missing the point here. No one is saying logging is inferior or that it can't be the best option in certain circumstances. The point is simply that it's not always the best option, and claiming that using prints is a nooby thing is just masturbatory elitism.
I do simple utilitarian scripting in python, for example. I don't often delve into complex projects with it. For the most part, print statements are more than enough for me and it would be a complete waste of my time to be configuring the logging system.
*_"Then you need to rethink your strategy."_*
Indeed. I use print statements. 👍
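On the "changing levels turns off my own logs too" problem raised in this thread: levels can be set per logger name, not only globally. A minimal sketch follows; the logger names `myapp` and `noisy_lib` are hypothetical, and the real name of a chatty dependency's logger depends on the library.

```python
import logging

# Configure the root logger once, near the program's entry point.
logging.basicConfig(level=logging.DEBUG, format="%(name)s %(levelname)s: %(message)s")

# Hypothetical logger names: "myapp" for your own code, "noisy_lib" for a dependency.
app_log = logging.getLogger("myapp")
lib_log = logging.getLogger("noisy_lib")

# Raise the threshold for the dependency only; "myapp" keeps inheriting the root level.
lib_log.setLevel(logging.WARNING)

app_log.debug("shown: our own debug message")
lib_log.debug("hidden: the dependency's debug message")
lib_log.warning("shown: warnings from the dependency still get through")
```

None of this makes print wrong for a throwaway script; it only shows that the two kinds of output do not have to share one switch.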
Awesome, thank you! These were really easy to correct with refactoring tools, or at worst, search/replace. The surprising thing was how often some of these nooby habits occurred in my very well supported dependencies!
Love your format. Thanks for taking the time to put these videos together for us! I've decided to move over to a new career in python and data analysis after working for years with php. Every day is a school day👍 #PyCharm
Well I have now learnt that I am officially a noob lmao. #pycharm
6. You shouldn't hand over mutable objects anyway in most cases. This can always cause side effects.
Counter for knowing nothing about what was discussed in this video...
3:26 I use “==” all the time for None. Why make it a special case?
As for “True” and “False”, I do sometimes compare explicitly with these values, where the variable might have a third state (“undetermined”, represented by, wait for it, “None”).
I also used explicit compare with True and False, although in most case an enum will be appropriate (or explicitly compare to None).
About why using * is *, I think that in those cases it is mostly a convention. There might be a minor speed advantage for it as well, but it is mostly irrelevant. Still, the * is * format is the generally accepted way, so if you write code for other people as well, you should probably use it.
it's mostly special cased because of the fact that they're all global constants and could be compared by identity
but imo it's not a valid reason to special case it as it's rather implementation aspect which doesn't really affect user code semantically
also as i can see only None is guaranteed to be constant and singular, True and False aren't defined this way, but only happen to be in CPython
and PEP8 actually states that checking boolean values with 'is' is incorrect and correct way is using implicit behaviour of 'if expr' or 'if not expr', so, this vid actually contradicts itself :)
tho i would say using 'is' for None checking is preferable as it's the way standard library is written, but it's not necessary at all
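One concrete difference between the two spellings discussed in this thread, as a minimal sketch: `==` goes through `__eq__`, which a class is free to override, while `is` compares identity and cannot be customised. `AlwaysEqual` is a made-up class for illustration.

```python
class AlwaysEqual:
    # Hypothetical class whose __eq__ claims equality with everything.
    def __eq__(self, other):
        return True


obj = AlwaysEqual()

print(obj == None)  # True: __eq__ decides, and it can say anything
print(obj is None)  # False: identity cannot be overridden
```

For plain built-in values the two checks agree, which is why `== None` usually works in practice.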
I'd like to talk about (list) comprehensions.
The reason you should be using them instead of basic for-loops is because the latter have issues with respect to concurrency and readability (independent of language).
But this doesn't mean that there are no better alternatives. To me an issue with comprehensions is that they extend the language syntax while only providing few benefits (Just like "lambda" in my opinion, just define a function). This is mostly because functions already exist for this kind of functionality, most importantly "map" and "filter". Also, comprehensions are inherently limited such that conventional syntax often needs to be used on top of them anyways, like sorting. This leads to a huge mess of syntax rules.
There is unfortunately the problem that Python does not offer a lot of inbuilt support for chaining functions neatly (like java streams), which is why I see people preferring comprehensions. However, i tend to think general, simple solutions are almost always better.
Have you not tried chaining decorators or explored the functools module?
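To make the comparison in this thread concrete, here is the same filter-and-transform step written both ways. This is only a sketch with arbitrary example numbers; which form reads better is exactly the matter of taste being argued about.

```python
numbers = [3, 8, 1, 6, 7, 2]

# Comprehension: filter and transform in one expression.
squares_of_evens = [n * n for n in numbers if n % 2 == 0]

# Same result with the built-in functions mentioned above.
squares_via_map = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

# Sorting sits outside either form, as the comment points out.
print(sorted(squares_of_evens))  # [4, 36, 64]
```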
I also disagree about the "d not d.keys()" iteration at 4:40. I think it improves readability actually
do what you think is better, it's a really minor thing in general :p
tho for some reason with python implicit things are preferred
@Tochka What exactly "in dictionary" does was not an easy decision to Guido van Rossum (the creator of Python) as far as I am informed. This should imply that "in dictionary" has ambiguous intuitive meaning. Hence, it is a problem when intuitively reading code, which is why sometimes using ".keys()" explicitly can help.
On a side note, if you are using "if x in dictionary:" or the like, there is a chance you are also retrieving the associated value. In that case, using "dictionary.get()" and comparing against the default return value is safer and most probably faster.
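A minimal sketch of the `.get()` suggestion above. The dictionary contents and the `_MISSING` sentinel are made up for illustration; the sentinel is only needed when `None` can be a legitimate stored value. Whether this is actually faster than `in` followed by indexing is something to measure, not assume.

```python
_MISSING = object()  # private sentinel, so a stored None is not mistaken for "absent"

ages = {"ada": 36, "bob": None}


def lookup(name):
    # Single lookup instead of `if name in ages:` followed by `ages[name]`.
    value = ages.get(name, _MISSING)
    if value is _MISSING:
        return f"{name}: no entry"
    return f"{name}: {value}"


print(lookup("ada"))    # ada: 36
print(lookup("bob"))    # bob: None
print(lookup("carol"))  # carol: no entry
```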
I am amazed by how many things I actually did right even though I just learned the syntax.
#pycharm best channel I've watched for python! Graduated recently and know python at a standard level and your channel gives such easy access to more in-depth topics
1- Using python
That's the only nooby python habit you need to ditch
00:04 I have the cracked version of pycharm lol😂
2:25 generator comprehension
5:57 logging format
5:42 using time.perf_counter() in place of time.time()
0:12 VS Code better than PyCharm
agree
Vim is even better
Formatting with string concatenation is the worst thing you can do, whenever I learn a new language, the first thing I look at is how to format string.
A sign of great experience! I've been burned by string formatting in every language i know too many times to count!
One of the biggest ones is not realizing that mutable objects are always treated as references when passed as arguments to functions. I cannot count how many times my coworkers passed a mutable object to a function, modified that object within the function, and then _returned_ that object, not realizing that the object passed was already modified. And often they would write the script with the assumption that the original passed object didn't change, and things would go wrong without them realizing.
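The situation described above, as a short sketch. Both function names are hypothetical; the first mutates its argument in place and returns it, the second builds a new list.

```python
def add_total(row):
    # Mutates the caller's list in place and also returns it.
    row.append(sum(row))
    return row


def with_total(row):
    # Builds a new list, leaving the caller's object untouched.
    return row + [sum(row)]


original = [1, 2, 3]
returned = add_total(original)
print(returned is original, original)  # True [1, 2, 3, 6]

untouched = [1, 2, 3]
copy = with_total(untouched)
print(untouched, copy)  # [1, 2, 3] [1, 2, 3, 6]
```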
Notes to self
0:36 using with
1:06 bare except
3:51 range len instead of enumerate
4:48 dict item methods
5:43 logging
6:33 package structure
7:34 PEP8
To anybody worried about this: most of these genuinely do not matter. Is somebody looking over your code going to think you’re trash because you don’t use enumerate? Or zip? No.
You should be worrying about readability, but please focus more on efficiency (unless you’re on a hugely collaborative project)
One of the most useful python oriented channels i came across youtube. Watching this video for the nth time too. Thanks for the continued great content!
0:35 I know how to use f-strings, it was the same story in JS with the backticks, I JUST PREFER MANUAL STRING FORMATTING
Great insights, with at least one observation, and forgive me if I come off noobish, but @3:37, wherein the context of the function presented, for x is unclear both in terms of expectation, and the specified function's purpose. The function looks to be a validator, and depending on what it's meant to validate, simply refactoring the code to a single if x. may or may not do the intent of the presented code. Take for example if it's meant to ensure that X evaluated to true or false as a value, but more specifically true. The first test would pass if x evaluated to true, but not if it evaluated to 0 or false. However, the second test may pass x if it were a string value with a length > 0. But since it's not converting x to a type wherein a length attribute exists, the context will not evaluate to true for a numeric type (as is). so a value of 0 will not be given a length of 1 allowing it to pass. So a string or number evaluating to 0 will never pass these tests, and will not pass "if x" either by my understanding for 0 translates to false.
If this function were to have behavior based on x instead of validating, that's also a matter that a simple "if x" test would not address as the two tests can deduce type after a fashion and behave differently based upon which test "passes", and in both scenarios, what a pass does in that context. I mention that because the pass case doesn't seem winner take all how it's coded. so both tests should execute and do something, again depending on what behavior that pass is meant to do.
@Derek Youngman presented the following in his response, which highlights some of the nature of my point, that if x will evaluate to true if there's a non-zero evaluation of x and that includes its behavior with True and None, when not checking their identity with the is operator in those cases, but say x = None. if x will not evaluate to true, but is not specifically checked for with is in the case presented. @Derek Youngman response:
If x is True doesn’t do the same thing as if x.
x = 4
if x -> True
if x is True -> False
y = True
if y -> True
if y is True -> True
/end response
So, I would conclude that it would depend on what the objective of the code is on whether this case was noob goodness or not. But the case may have also been vague in the interest of the claim, but that further should have made the example more compelling to show where the claim had its validity in a more IRL context than a hypothetical one, for clarity's sake as the claim of one not knowing the language was pretty strong, given the lead-in stating there was nothing particularly wrong with the statements themselves. If a bool or null check is what's called for then by all means they should be used, it's like the folks that hated the or tags, seems a bit personal and maybe a little pedantic. And to that end, the function name should have changed to reflect that drastic change of the function's purpose if no longer checking for bool or length, but if non-zero instead, which again, doesn't address if the original code was fit for purpose and the person "optimizing\refactoring" understands the why of the original logic to conclude incompetence of the original Developer.
But I ain't one to gossip... .
Almost 2 years into python, and i've programmed a lot of things in python.
Well,... there's some of these things I knew but didn't apply ( enumerate instead of range(len)), and some I simply didn't know.
I'm baffled. Finally a video of this kind, or the kind "things I wish I knew" that is actually helpful! congratz! (Also shame on me, for taking bad habits, this isn't going to be easy to fix)
Thanks. But honestly, I disagree on many. For e.g. point 1 - Manual String Formatting. If it works, it works. While I personally use f strings, have no problem with anyone or myself using + operator as well. It is there for that reason - because it works. Nothing wrong with using it or makes one a "noob" which is derogatory when it makes not much difference other than pleasing ego for some people as if they're elite somehow because they use f string over string concats. So I think it is bad to say such a thing when the language allows it. And "prone to error"? Code would break and the person would correct it or maybe they know what they're doing and it doesn't break.
"Readability" is difficult thing to tackle. But I guess one should tailor readability of his/her code to the preferences of whoever is signing off their wage cheque.
Some "best practices to improve readability" make the code less readable for everybody, but those few who are used to this or that specific way of coding..
Sometimes I code "if len(x):" or "if len(x) > 0:" when it makes it more clear.
I think that's the only exception to your rules that you haven't mentioned.
Returning to this video for 25th time, so helpful, thanks!
Solid video - thank you for making and sharing!
Although now these things look natural for me, it helped me a lot when I was a beginner, first time I saw this video. Thanks for making videos.
You are very welcome!
little pushback at 5:14 -- this is presuming you already have a well-defined iterative constraint that's designed to be used in very limited scope.
sometimes it's helpful to have an external iterator that you're adjusting in multiple places. and as this is programming and you can accomplish any task one thousand different ways, of course it's possible to extend the functionality of a typical enumerate iterator. but when it comes to simplicity and readability and the way you're structuring the logic of an algo, sometimes the option is convenient.
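A sketch of both sides of this pushback, with made-up data. `enumerate` fits when every item gets an index; a hand-maintained counter still seems reasonable when the count only advances under some condition.

```python
lines = ["alpha", "", "beta", "", "gamma"]

# enumerate covers the "one index per item" case, optionally starting at 1.
numbered = [(i, text) for i, text in enumerate(lines, start=1)]

# A manual counter is still reasonable when it does not advance on every item,
# for example when only non-empty lines should be counted.
non_empty_seen = 0
labelled = []
for text in lines:
    if not text:
        continue
    non_empty_seen += 1
    labelled.append((non_empty_seen, text))

print(labelled)  # [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
```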
I had no idea about #6. I'm sure you just saved me from an eventual debugging nightmare; I can imagine a lot of cases where this subtly messes up things in a way that is hard to track down.
Yeah I learned that one relatively late in my Python education and was quite surprised at what a footgun it is. It may or may not be justifiable that it's there, but it's definitely a potent footgun.
3:34 why is this a good idea? Shouldn't you just look at truthiness? (i.e., `if boolean_value:` instead of `if boolean_value is True:`)
[2:01] In Lua we do something similar: in Lua there aren't default values for function parameters. By default all parameters are nil, so we do something like:
var = var or some_value
Or first we can assert(var) to check if you placed the correct type.
it's important to note assert(false) will also fail, not just nil values
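The same `var = var or some_value` idiom exists in Python, with a wider trap: Lua treats only nil and false as falsy, while Python also treats 0, empty strings and empty containers that way. A minimal sketch with hypothetical function names:

```python
def retries_with_or(retries=None):
    # Lua-style fallback: replaces every falsy value, including a deliberate 0.
    return retries or 3


def retries_with_none_check(retries=None):
    # Only falls back when the argument was actually omitted (or None).
    return 3 if retries is None else retries


print(retries_with_or(0))          # 3, the caller's 0 is lost
print(retries_with_none_check(0))  # 0
```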
Almost turned this to 1.5 speed at the start, but probably could have watched at .75 actually! Great quick introduction to some ways to make my code less nooby
So glad to see these types of vids for any language.