I really enjoy Arjan code content. I found this channel late 2022 and it's helped me deeply. A lot of time when you are being taught clean Code, they make it seem like OOP is the way to go. But obviously as Arjan has pointed out, in python, there are times when you are better off with a module or with just functions.
The difference between getters/setters in Java and Python is that in Java, if you don't use get/set and make the member itself public, you cannot change that in the future: you're stuck with a public member forever as part of the interface. In Python, you can start with a public member, but if you want to refine the behaviour later, you can make it private and add a property. The interface doesn't change for users.
This is heretical, but the presence of get methods for attributes is itself a bad sign. Mostly, your attributes should not be private and modified only by the regular methods. Set methods are less problematic but using them a lot is also a bad sign.
@@brandonblackwell1216 Yeah,.I know. But thinking about state as private is still mostly the right thing. And the leading underscore convention to mark an instance variable as hands off seems reasonablely well understood in python.
Great video! I have a request: could you create a follow-up video explaining when it makes sense to move from using just functions to adopting OOP, particularly when you should start incorporating methods (not just properties) in your classes?
I think that I suggested to you in the discord the bad practice of a class only containing static methods. Really happy to see that it got included in a video aha 😀
Another excellent presentation, Arjan! I've recommended it to our team as a good refresher. Some of your refactoring also was a good demo of dependency inversion. SRP definitely is the single (pun intended) most important guide star I stress to my colleagues - and to refactor, repeatedly and brutally, until you almost giggle because your code is so concise, readable and beautiful. Which means it's most likely also much more maintainable and reusable, perhaps even in ways you haven't planned for.
So, a lot of these focus on people using classes where they should perhaps use something else, like functions & modules, which is valid, but something I see quite a lot is people using something else when they should actually use a class. For example, using a dictionary to structure complex, nested data when that data structure is known to the developer at compile (write) time, like requests with a known return structure and config files where I feel like you as the developer should really be prescribing how it is structured. So instead of having to hunt around a code base to find what keys are being accesed, classes allow for a much more declarative way of defining how the data should be structured. Another, albeit rarer example, is a module with a set of functions and a global variable that the other functions are reading and updating. Dont do that, please use a class.
100%. I create a class even when my config has one parameter :) Type hinting + data wrappers is really the way to go. class Config: a: str b: int @classmethod def from_dict(cls, data: dict) -> "Config": return cls(...) When I see a class that accepts "config: dict", I wanna cry.
@@garrywreck4291 I totally agree with you. Another example: If i read a csv i usually gather the column names in a class: class CSVHeader(object): date: str = "date" id: str = "id" name: str = "name" with open(pl.Path("data.csv"), mode = "r") as in_stream: dr: csv.DictReader = csv.DictReader(in_stream) for line in dr: print(f"date: {line[CSVHeader.date]}") So I can reuse them as often as i want (e.g. CSVHeader.date) ... and if i need to change a column name in my csv-data later on, then there is exactly one place where i need to cahnge my code (namely in my class). Hence I don't need to go through my code line by line and by hand and by [CTRL]+[F]. Plus my IDE (pycharm) can support me better with code completion on the csv header names.
@@garrywreck4291 It's all good until you reach a situation that you have different initialization parameters for, for example, neural networks, and on the project you're doing experiments, change and add parameters constantly. It's much simpler to manage all these configuration via something like Hydra, and while the resulting DictConfig is opaque, it saves a lot of time rather than trying to trace where each dataclass is called. Doing dataclass configs are good when you are 100% certain about what goes there.
Protip: TypedDict I only add config classes when I actually need some functionality of a class otherwise just a typed dict for passing typed data around to functions works great. But say you want to encode defaults, add some calculated fields, etc. Then turn it into a data class (or even better pydantic model/dataclass)
In the sixth example with getters and setters, you should not really have setters at all. I think it’s better practice to only have immutable data structures. Why would someone instantiate a Person and then change the name ? Just make a new Person. In Python this is hard to enforce but you can encourage it with underscore names and the absence of “public” methods to do the mutation.
I had huge discussions about topic 1 (wrapping functions into classes just because it looks “more professional”) with my team mates, who are computer scientists (and I am a data scientist). CS people are socialized on Java and they are educated to wrap the full world into classes, whether it is actually needed or not.
Not entirely OOP related, but rather general python related. Whenever I am refactoring or dealing with bug fixes or new features I find myself dreading methods that return a tuple of objects. Most of the time this is only 2 different things, which isn't going to be too problematic. But if there's more than 2 elements in a tuple, especially if some of them are of the same type, this will soon become problematic. Sure, we can use (e.g) `mypy` to check for types, but if there's multiple elements of the same type, these types of checks won't find the error in the intent of the code (though tests should do). So I find myself then creating data classes to make sure these things are handled correctly. And while on the other hand, the underlying issue might have to do with poor design, sometimes the coupled natured of the code doesn't allow for an easy fix for this within the confines of the current maintenance task. Other times even changing the return type could be problematic.
One of the eye openers for me was tarting to think domain centric, so thinking of what aspect you're trying to represent in the real world and then copy that in the code. So if you do a library system, the book may be objects, but checking in and out could be functions, rather than an object. Doing this made my code cleaner and more robust
Regarding the first two (changing classes to functions when they are simple) I think the main reason we usually stick to classes is regarding to testing. It's easier to inject a class as a dependency of another, then you can easily switch it for a test double. Sure, you can also use `patch`
Regarding functions masquerading as a class: I once found a UA-cam video in which the presenter said, "If a class contains 2 methods and one of them is __init__ it probably isn't a class". Which is effectively what Arjan said of his DataLoader class.
"The signature of 'This shouldn't be a class' is that it has two methods, one of which is init. Anytime you see that, you should probably think: 'Hey, maybe I just need the one method.'" --Jack Diederich, "Stop Writing Classes", PyConUS 2012
One ok-ish example are interpolators. At construction they find coefficients of polynomials and store them. Then the only method interpolates using these polinomials. So that coefficients are not recomputed at each interpolation.
1. Classes with a single method and init can still be useful in situations when initializing and action are controlled by a different code. Say, you need to load some data from file and you query filename in one function and actually read it way after. You could pass around raw filename, but often it's safer to wrap it in a class with a reasonable interface. It will be easier to mock for testing or enchance for some other format later. Basicaly it replaces string filename with a protocol with one load method that can have different implementations. 2. Using modules instead of classes with static methods is for sure a good idea, but there are rare cases when you actually can't do that easily. For example if you need alternative implementations of those functions and could benefit from inheritance. Something like pathlib with identical interface for manipulating with paths on different platforms. 3. I disagree that converting an attribute to a property by itself improves encapsulation in any way. What does improve it is domain-specific method like "withdraw", but leaving external interface as is and implementing limitations in methods like __iadd__ is fine as well.
Composition over inheritance! My old C++ job had a load of code that was at least 10 levels deep of inheritance on embedded code. Debugging and reasoning was impossible to learn, you had to have grown up with it. Superb video! And of course, don't forget that sometimes these bad practices are the right tool for the job, it's just rare. A good example is _global_, it's almost always bad apart from the rare time when it isn't (e.g. logging library logger mapping) but you must be very aware of the caveats (e.g. they're process local, not truly "global.")
I love this one 🎉. I noticed minor issues on the first example (pathlib has its own open … path.read_text :) so there is no need for `with open …`) . What would be cooler for Bank example is using setters and getter(return self._balance). with setter on balance, we could create checks without introducing extra function). Beyond that, I loved everything
FunctionClass: a class with a lowercase verb describing the core operation, which is implemented in the __call__ method of it’s metaclass, and invariant operations in static methods of the main class, named as adverbs, nouns or qualifiers that compose with the class’s verb name. Usage looks like: from utils import crop crop(img) crop.with_margins(img) crop.without_headers(img) crop.as_png(in) crop.fit_to_page(img, page)
For the inheritance and role: but if you create a lib then the enum is under your control. So adding addtional roles will be hard for someone using you lib. If you keep a virutal function getRole -> string someone can derive from you class and add a new role
The only reason I chose using a class over a function even thought it is static, is Dependency injection. I use FastAPI quite often, and more often than not, I inject service classes into other services or routes. Unittesting is much easier when you inject a class instead of a function. Yes, you can use patch/monkeypatch, but that is a skewed way of mocking (IMO). I would like to hear what your thoughts are on this.
My thinking is that: - If this module describes data, use a class - If that class doesn’t use a lot of methods inherent to that data, that isn’t just data validation, use a data class - If you must use inheritance, keep it really flat and stick to interfaces where possible to define expected behavior. Abstract classes can really help here. - Otherwise use modules with functions
good content i just subscribe! (just a tip: English its not my first language and see your videos with subtitle so when you type something in bottom of screen its goes under the subtitle please try up the screen if possible)
Hi Arjan, I've finally implemented CI/CD pipeline using github actions into my python libraries. Can you please discuss about `act` to run github workflows locally - specifically running tests on different python versions.
As for complex inheritence structures, sometimes you are kind forced to do this instead of repeating the code a bunch of times. For example if working with (Pytorch) Lightning, you create a custom Lightning Module by inheriting from a L.LightningModule class, and in itself it contains a lot of functionality that you need in order for it to train your neural network. You can get easily into situations that you have multiple models that use share some basic methods between them, but differ in others, like for example a custom "forward" method per each model. It can get even more complicated if you're building a meta-learning model or a GAN, but you still need the basic L.LightningModule functionality so you may find yourself with multiple layers of inheritance. If your lowest level classes don't rely on inheriting from outside packages then I agree, better not to have this hierarchy and strive to inherit only from a ABC.
Inner me is crying as I have to deal with Java daily and it's boilerplate oriented programming. I am so happy when "I have to adapt some Python scripts" for data analysis :) Thank you for the great video!
As far as I know, the use of typing is being phased out in favor of using the actual objects directly, rather than their proxies through typing. For example, instead of from "typing import Callable", it's better to use from collections(dot)abc import Callable. Similarly, instead of from typing import Tuple, you should simply use tuple without any imports.
In Typescript I like hard currying dependencies like saver and logger to create a readymade processOrder function that is easier to use, then in tests I can inject other things to the original processOrderFactory function. It kind of looks like inheritance but I found that it does not lead to the same problems as a proliferation of classes does, because they all have the same type (generics will be evaluated differently) and they don't require their own initializer or method overrides, so they are very lightweight. This is the same kind of difference between a function decorator and a higher order function. everytime you use a HOF as a decorator you loose flexibility. The only problem with this pattern is the naming convention.
7:27 Arjan, do you have more examples of code, that goes better with composition. I think I kind of get the concept, and then I realize I haven't really figured out how to do it.
When I was learning Java, I couldn't internalize all the loops, smoke and mirrors that you had to go through to "emuilate" functions and procedural code. Python refreshingly gives you the option of using both at the same time. But like anything, you can abuse the system. Thanks for your wisdom!
Really nice video! What code formatter do you use? I have Black configured in VS Code and it does not highlight the code when the function parameters have no type
In some languages, you ALWAYS create getters and setters for data members, because of the pain that will be incurred when you change the underlying data structures and / or add algorithms. In Python, this is absolutely not necessary. If at some point you decide that accessing an attribute needs to be replaced with a function call, in Python you can make the attribute a `property`.
A module can still contain static methods on a class. Say you wanted 3 sets of related but distinct functions available. You might create a module and simply use the class name as a namespace. The reason why static methods are useful is because they provide another level of abstraction. As long as your style is consistent, there are no issues with them. They can even improve readability.
If all you want to do is structuring, you can simply use submodules. Then it is the user who can decide which names should remain and which shouldn't. Or whether an abbreviation of the name should be used.
@@__christopher__ Sure you can use a sub-package, and you should anyway. It's a matter of personal preference. Static methods provide another layer of abstraction without creating yet another level to a package, so there is no reason not to do both when it seems right. There are many approaches and as long as it's readable, reusable and minimal, these choices can vary.
What vscode extensions do you use? The interface for renaming a variable seems different from what i get. Also, what autocomplete shadow hints are you using?
What is the reason to avoid namespacing imported methods ? I find it much easier to read code if an imported method is namespaced to a class, e.gx StringProcessor.uppercase vs from stringutils import uppercase. With the namespace qualifier you immediately know where the method is coming from anywhere in the code. And it cannot conflict with a variable accidentally named the same way.
Usually your program isnt going to be just a main function and some classes/functions, what happens when you have a function that takes a class call another function that calls a class, i should add both class instances at the main function level? It seems like it could get out of hand after 3 or so class instances being passed in
I'm having a lot of trouble using Protocol. There is something about it that does not work with mypy, or maybe is it Pylance. I start with a Protocol, it seems to be working for a while, but then comes a scenario in which type narrowing fails, or type inference maybe, then I have a useless and painful code poking session and I end up replacing the Protocol with a normal base class and it works as expected.
@@ArjanCodes By the way, have you ever considered doing a Kivy tutorial series? I recently made a mobile app and was really struggling because there arent any videos out there that get you to the finish line of having an app on the play store. I'd be happy to share everything i know for free but I don't have an audience :) I'm pretty confident Mobile app development in Python is gonna get huge, but right now information scarcity is holding people back. Quite a shame really
@@jonaskarlsson5901 It can. Class methods and static methods could be used on class directly, same for class attributes. That should be avoided, but it's possible for sure.
Related to type annotations: I often end up passing all of my **kwargs to the same inside function, but then you are basically blind when you call the outside function. Is there a way to wrap a function like this and preserve type and documentation info?
when I have a project and do the yet know how big it’s gonna be, I usually write procedural, then functional first and only start to introduce OOP when that’s actually helpful
I guess, we can change the self._balance to self.balance in the init method so that we can call the property in the init. In that case any change in the property will be applied to init as well.
I always like 'getter/setter' when there are restrictions on what values should be allowed. Like your bank balance example. If an instance of a class would be 'invalid' in some way when one of its member elements has a bogus value, then don't trust users of said objects to modify directly.
The second is not completly correct, I'm a lazy, and want to import with autocomplete and forget about module name, actually second case, it's if I have a few methods with the same name
3:12 when I try to use relative imports I get the relative import error. I don't quite understand why this worked in your case when I also have the same folder structure
First example with "composition" about managers and employees - is not composition in common sense. Second example with "composition" about mixins is more like it, because of most popular languages nowadays use references for all objects, for example: Python, JavaScript, Java and so on. But initially, composition is when object have everything of other objects as its parts. What you show was called "delegation" some time ago. I don't know what it's called nowadays. The difference is that your object delegate some work relying on other objects, which often shared by other objects. Easier to explain it for C/C++ users: composition is when you have other class as a field, delegation is when you have a pointer or reference to other class as a field. In the end, the video is most about which patterns to choose, and how to use them correctly.
Not necessarily. Cancelling an order, maybe, but processing (or fulfilling) an order might involve many steps (packing, shipping, inventory management etc) and have many dependencies. IMO that should not be the responsibility of the Order class.
@@ArjanCodesI was referring to the order's inner data only (like delivery/billing, status, eta, etc). I wasn't trying to stuff unrelated processing inside. My question was made in this sense, my apologies if it wasn't detailed enough in the first place.
If one can start from A and finish at B by just using function, then why use class? Classes look and feel complicating, more codes are needed to use classes, with no major benefit compared to functions. Seems like classes are only useful in some really edge cases. Thanks for listening to my rants.
A pet peeve of mine is seeing a dataclass with application methods and it's own __init__ constructor. Like why even make it a dataclass when plain old python class would do. Another one is a property getter that mutates class state.
Good point. I’ll use the new Callable location in upcoming videos (even though I don’t find ‘collections.abc’ a logical place for something like Callable, which is neither an ABC nor a collection).
isn't a python module signified by the fact that it's a folder that contains __init__.py? a folder with just python files is just a bunch of files or am i wrong? 🤔
Hey, I have one question - how is the way to properly model a data from server in python if for example response is json have allot of keys and more than that - inside this are a lot of inner dicts like: dict and inside that are a lot of dicts inside dicts inside dicts inside list I think that it will be nice material for video, many times when I model a data like this sometimes some inner dicts is null or it's empty and it throwing me error, need to make a lot of conditions and "ifs" checking if this key in dictionary exists or not etc.
That sounds like the format of the data output by the server should be changed be because that's pretty bad. If you are not able to change how the server outputs though, I would create a recursive function that automatically iterates through all the nested dictionaries to find a specific key and have the function output the key/value pair or None if it's not found.
I still don't get why you don't like OOP anymore. Even in your example, OOP would be a better choice. Like "process order" and "cancel_order" could be reduced as two methods ("process" and "cancel") from the Order class... Or maybe am I missing something?
The thing is that both processing and cancelling an order in the real world is probably quite a few lines of code. If you put all of that in a single Order class, this quickly becomes hard to work with. By the way, I still like OOP, but I do try to keep classes small. If functionality is complex (like in the case of processing or cancelling an order), I prefer to put that in a separate function. That also allows you to introduce an abstraction layer so writing tests for those functions becomes easier.
I love "fake" or "proxy" classes. Python lets you do all the things, so just creating a static or data class to use it purely as proxy for some other underlying technology (executing a command, listening or writing to a socket, etc) is a great fit for the debate for whether it makes sense to make it a class ends with "yes" :D
19:29 your preferred solution about Mixins just seems like more work? "They can create all sort of problems" is not enough of a statement because any fool can say that but only a genius can explain why. And in your example I don't see any problems with it
💡 Get my FREE 7-step guide to help you consistently design great software: arjancodes.com/designguide.
Man I love these “slashing over engineered OOP cruft into nice little functions videos” 😂
Haha, glad you enjoy them 😎
I really enjoy Arjan code content. I found this channel late 2022 and it's helped me deeply. A lot of time when you are being taught clean Code, they make it seem like OOP is the way to go. But obviously as Arjan has pointed out, in python, there are times when you are better off with a module or with just functions.
The difference between getters/setters in Java and Python is that in Java, if you don't use get/set and make the member itself public, you cannot change that in the future: you're stuck with a public member forever as part of the interface. In Python, you can start with a public member, but if you want to refine the behaviour later, you can make it private and add a property. The interface doesn't change for users.
This is heretical, but the presence of get methods for attributes is itself a bad sign. Mostly, your attributes should not be private and modified only by the regular methods. Set methods are less problematic but using them a lot is also a bad sign.
Technically speaking, there’s no such thing as a “private” member in Python.
@@brandonblackwell1216 Yeah,.I know. But thinking about state as private is still mostly the right thing. And the leading underscore convention to mark an instance variable as hands off seems reasonablely well understood in python.
why compare it with Java and not with C#? C# has 100 times better getters/setters than Java
Great video! I have a request: could you create a follow-up video explaining when it makes sense to move from using just functions to adopting OOP, particularly when you should start incorporating methods (not just properties) in your classes?
I think that I suggested to you in the discord the bad practice of a class only containing static methods. Really happy to see that it got included in a video aha 😀
Yes!👍🏻 That discussion in Discord was actually my motivation to do this video in the first place 😁.
Another excellent presentation, Arjan! I've recommended it to our team as a good refresher. Some of your refactoring also was a good demo of dependency inversion. SRP definitely is the single (pun intended) most important guide star I stress to my colleagues - and to refactor, repeatedly and brutally, until you almost giggle because your code is so concise, readable and beautiful. Which means it's most likely also much more maintainable and reusable, perhaps even in ways you haven't planned for.
So, a lot of these focus on people using classes where they should perhaps use something else, like functions & modules, which is valid, but something I see quite a lot is people using something else when they should actually use a class. For example, using a dictionary to structure complex, nested data when that data structure is known to the developer at compile (write) time, like requests with a known return structure and config files where I feel like you as the developer should really be prescribing how it is structured. So instead of having to hunt around a code base to find what keys are being accesed, classes allow for a much more declarative way of defining how the data should be structured. Another, albeit rarer example, is a module with a set of functions and a global variable that the other functions are reading and updating. Dont do that, please use a class.
100%. I create a class even when my config has one parameter :)
Type hinting + data wrappers is really the way to go.
class Config:
a: str
b: int
@classmethod
def from_dict(cls, data: dict) -> "Config":
return cls(...)
When I see a class that accepts "config: dict", I wanna cry.
@@garrywreck4291 I totally agree with you. Another example: If i read a csv i usually gather the column names in a class:
class CSVHeader(object):
date: str = "date"
id: str = "id"
name: str = "name"
with open(pl.Path("data.csv"), mode = "r") as in_stream:
dr: csv.DictReader = csv.DictReader(in_stream)
for line in dr:
print(f"date: {line[CSVHeader.date]}")
So I can reuse them as often as i want (e.g. CSVHeader.date) ... and if i need to change a column name in my csv-data later on, then there is exactly one place where i need to cahnge my code (namely in my class). Hence I don't need to go through my code line by line and by hand and by [CTRL]+[F]. Plus my IDE (pycharm) can support me better with code completion on the csv header names.
@@garrywreck4291 don't forget about `@dataclass`
@@garrywreck4291 It's all good until you reach a situation that you have different initialization parameters for, for example, neural networks, and on the project you're doing experiments, change and add parameters constantly. It's much simpler to manage all these configuration via something like Hydra, and while the resulting DictConfig is opaque, it saves a lot of time rather than trying to trace where each dataclass is called. Doing dataclass configs are good when you are 100% certain about what goes there.
Protip: TypedDict
I only add config classes when I actually need some functionality of a class otherwise just a typed dict for passing typed data around to functions works great.
But say you want to encode defaults, add some calculated fields, etc. Then turn it into a data class (or even better pydantic model/dataclass)
In the sixth example with getters and setters, you should not really have setters at all. I think it’s better practice to only have immutable data structures. Why would someone instantiate a Person and then change the name ? Just make a new Person. In Python this is hard to enforce but you can encourage it with underscore names and the absence of “public” methods to do the mutation.
I had huge discussions about topic 1 (wrapping functions into classes just because it looks “more professional”) with my team mates, who are computer scientists (and I am a data scientist). CS people are socialized on Java and they are educated to wrap the full world into classes, whether it is actually needed or not.
Not entirely OOP related, but rather general python related.
Whenever I am refactoring or dealing with bug fixes or new features I find myself dreading methods that return a tuple of objects. Most of the time this is only 2 different things, which isn't going to be too problematic. But if there's more than 2 elements in a tuple, especially if some of them are of the same type, this will soon become problematic. Sure, we can use (e.g) `mypy` to check for types, but if there's multiple elements of the same type, these types of checks won't find the error in the intent of the code (though tests should do). So I find myself then creating data classes to make sure these things are handled correctly.
And while on the other hand, the underlying issue might have to do with poor design, sometimes the coupled natured of the code doesn't allow for an easy fix for this within the confines of the current maintenance task. Other times even changing the return type could be problematic.
One of the eye openers for me was tarting to think domain centric, so thinking of what aspect you're trying to represent in the real world and then copy that in the code. So if you do a library system, the book may be objects, but checking in and out could be functions, rather than an object. Doing this made my code cleaner and more robust
Thanks for sharing re: Mixin. I like how you keep coming back to what is simple, effective and easy to understand.
Love it! Anything to remove unecessary oop is always welcome! Great examples.
Regarding the first two (changing classes to functions when they are simple) I think the main reason we usually stick to classes is regarding to testing. It's easier to inject a class as a dependency of another, then you can easily switch it for a test double. Sure, you can also use `patch`
Regarding functions masquerading as a class: I once found a UA-cam video in which the presenter said, "If a class contains 2 methods and one of them is __init__ it probably isn't a class". Which is effectively what Arjan said of his DataLoader class.
PROBABLY! is the key word here
Good comment man!
The default checking for class in Pylint has something like: Your class has too few public methods (
"The signature of 'This shouldn't be a class' is that it has two methods, one of which is init. Anytime you see that, you should probably think: 'Hey, maybe I just need the one method.'"
--Jack Diederich, "Stop Writing Classes", PyConUS 2012
One ok-ish example are interpolators. At construction they find coefficients of polynomials and store them. Then the only method interpolates using these polinomials. So that coefficients are not recomputed at each interpolation.
17:12, you need to learn ctrl+d whilst selecting some text, it will create multiple cursors enabling you easier editing
1. Classes with a single method and init can still be useful in situations when initializing and action are controlled by a different code. Say, you need to load some data from file and you query filename in one function and actually read it way after. You could pass around raw filename, but often it's safer to wrap it in a class with a reasonable interface. It will be easier to mock for testing or enchance for some other format later. Basicaly it replaces string filename with a protocol with one load method that can have different implementations.
2. Using modules instead of classes with static methods is for sure a good idea, but there are rare cases when you actually can't do that easily. For example if you need alternative implementations of those functions and could benefit from inheritance. Something like pathlib with identical interface for manipulating with paths on different platforms.
3. I disagree that converting an attribute to a property by itself improves encapsulation in any way. What does improve it is domain-specific method like "withdraw", but leaving external interface as is and implementing limitations in methods like __iadd__ is fine as well.
I love your OOP stuff/mindset. It changed my life.
Glad to hear you find it helpful!
Composition over inheritance! My old C++ job had a load of code that was at least 10 levels deep of inheritance on embedded code. Debugging and reasoning was impossible to learn, you had to have grown up with it.
Superb video!
And of course, don't forget that sometimes these bad practices are the right tool for the job, it's just rare. A good example is _global_, it's almost always bad apart from the rare time when it isn't (e.g. logging library logger mapping) but you must be very aware of the caveats (e.g. they're process local, not truly "global.")
Glad you liked the video! And good point regarding that in rare cases, doing these things might make sense.
I love this one 🎉. I noticed minor issues on the first example (pathlib has its own open … path.read_text :) so there is no need for `with open …`) . What would be cooler for Bank example is using setters and getter(return self._balance). with setter on balance, we could create checks without introducing extra function). Beyond that, I loved everything
FunctionClass: a class with a lowercase verb describing the core operation, which is implemented in the __call__ method of it’s metaclass, and invariant operations in static methods of the main class, named as adverbs, nouns or qualifiers that compose with the class’s verb name.
Usage looks like:
from utils import crop
crop(img)
crop.with_margins(img)
crop.without_headers(img)
crop.as_png(in)
crop.fit_to_page(img, page)
Yeah, that sounds like an excellent case for a module named crop with a bunch of functions inside.
For the inheritance and role: but if you create a lib then the enum is under your control. So adding addtional roles will be hard for someone using you lib. If you keep a virutal function getRole -> string someone can derive from you class and add a new role
Thank you, very helpful video! Your clarity is unmatched.
Happy you liked it!
Thank you, I enjoyed this videao. I would have liked to see some coverage of property setters in the encapsulation section.
good stuff. definitely going through these issues. didn't know about the callable or protocol so I'm going to look into those more.
The only reason I chose using a class over a function even thought it is static, is Dependency injection. I use FastAPI quite often, and more often than not, I inject service classes into other services or routes. Unittesting is much easier when you inject a class instead of a function. Yes, you can use patch/monkeypatch, but that is a skewed way of mocking (IMO). I would like to hear what your thoughts are on this.
My thinking is that:
- If this module describes data, use a class
- If that class doesn’t use a lot of methods inherent to that data, that isn’t just data validation, use a data class
- If you must use inheritance, keep it really flat and stick to interfaces where possible to define expected behavior. Abstract classes can really help here.
- Otherwise use modules with functions
Amazing ,thanks Arjan, already using all these practices
You’re welcome!
good content i just subscribe! (just a tip: English its not my first language and see your videos with subtitle so when you type something in bottom of screen its goes under the subtitle please try up the screen if possible)
Hi Arjan, I've finally implemented CI/CD pipeline using github actions into my python libraries. Can you please discuss about `act` to run github workflows locally - specifically running tests on different python versions.
Will there be a video about all modern python tooling, comparison and use cases? For example: Poetry, Hatch, uv, rye etc.
As for complex inheritence structures, sometimes you are kind forced to do this instead of repeating the code a bunch of times. For example if working with (Pytorch) Lightning, you create a custom Lightning Module by inheriting from a L.LightningModule class, and in itself it contains a lot of functionality that you need in order for it to train your neural network. You can get easily into situations that you have multiple models that use share some basic methods between them, but differ in others, like for example a custom "forward" method per each model. It can get even more complicated if you're building a meta-learning model or a GAN, but you still need the basic L.LightningModule functionality so you may find yourself with multiple layers of inheritance.
If your lowest level classes don't rely on inheriting from outside packages then I agree, better not to have this hierarchy and strive to inherit only from a ABC.
Inner me is crying as I have to deal with Java daily and it's boilerplate oriented programming. I am so happy when "I have to adapt some Python scripts" for data analysis :)
Thank you for the great video!
As far as I know, the use of typing is being phased out in favor of using the actual objects directly, rather than their proxies through typing. For example, instead of from "typing import Callable", it's better to use from collections(dot)abc import Callable. Similarly, instead of from typing import Tuple, you should simply use tuple without any imports.
In Typescript I like hard currying dependencies like saver and logger to create a readymade processOrder function that is easier to use, then in tests I can inject other things to the original processOrderFactory function.
It kind of looks like inheritance but I found that it does not lead to the same problems as a proliferation of classes does, because they all have the same type (generics will be evaluated differently) and they don't require their own initializer or method overrides, so they are very lightweight.
This is the same kind of difference between a function decorator and a higher order function. everytime you use a HOF as a decorator you loose flexibility.
The only problem with this pattern is the naming convention.
7:27 Arjan, do you have more examples of code, that goes better with composition. I think I kind of get the concept, and then I realize I haven't really figured out how to do it.
When I was learning Java, I couldn't internalize all the loops, smoke and mirrors that you had to go through to "emuilate" functions and procedural code. Python refreshingly gives you the option of using both at the same time. But like anything, you can abuse the system. Thanks for your wisdom!
Really nice video! What code formatter do you use? I have Black configured in VS Code and it does not highlight the code when the function parameters have no type
You are so settle and easy on explanation wow, thanks 👍
In some languages, you ALWAYS create getters and setters for data members, because of the pain that will be incurred when you change the underlying data structures and / or add algorithms.
In Python, this is absolutely not necessary. If at some point you decide that accessing an attribute needs to be replaced with a function call, in Python you can make the attribute a `property`.
A module can still contain static methods on a class. Say you wanted 3 sets of related but distinct functions available. You might create a module and simply use the class name as a namespace.
The reason why static methods are useful is because they provide another level of abstraction. As long as your style is consistent, there are no issues with them. They can even improve readability.
If all you want to do is structuring, you can simply use submodules. Then it is the user who can decide which names should remain and which shouldn't. Or whether an abbreviation of the name should be used.
@@__christopher__ Sure you can use a sub-package, and you should anyway. It's a matter of personal preference.
Static methods provide another layer of abstraction without creating yet another level to a package, so there is no reason not to do both when it seems right.
There are many approaches and as long as it's readable, reusable and minimal, these choices can vary.
What vscode extensions do you use? The interface for renaming a variable seems different from what i get. Also, what autocomplete shadow hints are you using?
Very good summary, Sir! Thank you
You’re welcome!
note with the `_balance` example, if you use two underscores it becomes effectively protected. Try it out :D
What about double underscore for private fields? I think it has more protection against accidental changing.
What is the reason to avoid namespacing imported methods ? I find it much easier to read code if an imported method is namespaced to a class, e.gx StringProcessor.uppercase vs from stringutils import uppercase.
With the namespace qualifier you immediately know where the method is coming from anywhere in the code. And it cannot conflict with a variable accidentally named the same way.
Usually your program isnt going to be just a main function and some classes/functions, what happens when you have a function that takes a class call another function that calls a class, i should add both class instances at the main function level? It seems like it could get out of hand after 3 or so class instances being passed in
Your content is very good. I am proud to be your subscriber.Wish best luck❤❤❤❤❤❤
Thank you so much! ❤
I absolutely love "functions as classes". I use them for partial application though.
I'm having a lot of trouble using Protocol.
There is something about it that does not work with mypy, or maybe is it Pylance.
I start with a Protocol, it seems to be working for a while, but then comes a scenario in which type narrowing fails, or type inference maybe, then I have a useless and painful code poking session and I end up replacing the Protocol with a normal base class and it works as expected.
3:36 classes should have instances
6:46 Proceeds to use Enum
Of course, while abiding to the regular way of doing things in Python.
@@ArjanCodes By the way, have you ever considered doing a Kivy tutorial series? I recently made a mobile app and was really struggling because there arent any videos out there that get you to the finish line of having an app on the play store.
I'd be happy to share everything i know for free but I don't have an audience :)
I'm pretty confident Mobile app development in Python is gonna get huge, but right now information scarcity is holding people back. Quite a shame really
Fields of enum are actually it's instances, so no contradiction here.
a class can't be used without making an instance of it so idk what you mean lol
@@jonaskarlsson5901 It can. Class methods and static methods could be used on class directly, same for class attributes. That should be avoided, but it's possible for sure.
thank you so much
can i ask you a video about decorators?
Regarding Employee, Manager, etc. I would have them all inherit from a base class and return a class attribute that describes roll.
Related to type annotations: I often end up passing all of my **kwargs to the same inside function, but then you are basically blind when you call the outside function. Is there a way to wrap a function like this and preserve type and documentation info?
when I have a project and do the yet know how big it’s gonna be, I usually write procedural, then functional first and only start to introduce OOP when that’s actually helpful
I guess, we can change the self._balance to self.balance in the init method so that we can call the property in the init. In that case any change in the property will be applied to init as well.
6:46 Does this method work with ORM objects like when you use sqlalchemy?
I always like 'getter/setter' when there are restrictions on what values should be allowed. Like your bank balance example. If an instance of a class would be 'invalid' in some way when one of its member elements has a bogus value, then don't trust users of said objects to modify directly.
a setter with validation is the most fundamental example of "business logic" and is exactly what classes are intended for
Can we use data classes instead of sqlalchemy data class
The second is not completly correct, I'm a lazy, and want to import with autocomplete and forget about module name, actually second case, it's if I have a few methods with the same name
in the email service example if you had more than one email service how would the protocol know which one to use ?
Great tips!!
Glad you think so!
Has Arjan, covered situations where different providers (protocol implementations) rely on different arguments? Can anyone link a vid if he did?
3:12 when I try to use relative imports I get the relative import error. I don't quite understand why this worked in your case when I also have the same folder structure
I would suggest another topi to cover Arjan: "underscores in Python"
Dataclasses in Python can be a fantastic way to start and are fairly easy to convert to a class if need be. Normally, though, you never need to. :)
14:40 Every viewer who has ever worked on banking software just yelled at their screen
Already happened long before, because a debit account can't be created with a negative amount.
Very useful. Thanks
Glad it was helpful!
First example with "composition" about managers and employees - is not composition in common sense. Second example with "composition" about mixins is more like it, because of most popular languages nowadays use references for all objects, for example: Python, JavaScript, Java and so on. But initially, composition is when object have everything of other objects as its parts. What you show was called "delegation" some time ago. I don't know what it's called nowadays. The difference is that your object delegate some work relying on other objects, which often shared by other objects. Easier to explain it for C/C++ users: composition is when you have other class as a field, delegation is when you have a pointer or reference to other class as a field.
In the end, the video is most about which patterns to choose, and how to use them correctly.
But, since both "process" and "cancel" are referring to the Order, shouldn't better to have them inside Order?
Not necessarily. Cancelling an order, maybe, but processing (or fulfilling) an order might involve many steps (packing, shipping, inventory management etc) and have many dependencies. IMO that should not be the responsibility of the Order class.
@@ArjanCodesI was referring to the order's inner data only (like delivery/billing, status, eta, etc). I wasn't trying to stuff unrelated processing inside. My question was made in this sense, my apologies if it wasn't detailed enough in the first place.
13:57 AI caught stealing our money in 4k
Underrated comment lol
If one can start from A and finish at B by just using function, then why use class? Classes look and feel complicating, more codes are needed to use classes, with no major benefit compared to functions. Seems like classes are only useful in some really edge cases.
Thanks for listening to my rants.
A pet peeve of mine is seeing a dataclass with application methods and it's own __init__ constructor. Like why even make it a dataclass when plain old python class would do. Another one is a property getter that mutates class state.
wow thanks for sharing.
My rule of thumb is always:
use classes if something has a state that changes
Use a function otherwise
It's worth noting that TypedDict can be a lighter alternative to dataclass when you don't need to have a class.
1:59 line 9: using a list comprehension when you only need a generator expression is also bad practice
Your employee is a a bad example of how to simplify since an employee cannot do the same things as a manager.
typing.Callable is deprecated since Python 3.9. The docs say it is now an alias to collections.abc.Callable.
Good point. I’ll use the new Callable location in upcoming videos (even though I don’t find ‘collections.abc’ a logical place for something like Callable, which is neither an ABC nor a collection).
@@ArjanCodes agreed! But I'm not a core python dev so who knows 🤪
isn't a python module signified by the fact that it's a folder that contains __init__.py?
a folder with just python files is just a bunch of files or am i wrong? 🤔
In Python, a module is a file. A package is a folder normally containing a __init__.py file and other files and subfolders.
Superuseful!
So, quit using oop and write functions instead. What a Joy!)))
i am recomending you so much
Nice! 👌
Hey, I have one question - how is the way to properly model a data from server in python if for example response is json have allot of keys and more than that - inside this are a lot of inner dicts like: dict and inside that are a lot of dicts inside dicts inside dicts inside list
I think that it will be nice material for video, many times when I model a data like this sometimes some inner dicts is null or it's empty and it throwing me error, need to make a lot of conditions and "ifs" checking if this key in dictionary exists or not etc.
That sounds like the format of the data output by the server should be changed be because that's pretty bad. If you are not able to change how the server outputs though, I would create a recursive function that automatically iterates through all the nested dictionaries to find a specific key and have the function output the key/value pair or None if it's not found.
New Drinking Game to really retain the learning: Take a drink every time he says 'code'.
Love it!
I never heard of mixins before this video and I am utterly horrified by them. 😐
3:50 He destroyed Java in a classy manner 😂😂
you should have started with "stop using print in your prod"
The copilot is strong with this one 😂
If the class is just data why use a class at all when a dictionary would do?
I still don't get why you don't like OOP anymore. Even in your example, OOP would be a better choice. Like "process order" and "cancel_order" could be reduced as two methods ("process" and "cancel") from the Order class... Or maybe am I missing something?
The thing is that both processing and cancelling an order in the real world is probably quite a few lines of code. If you put all of that in a single Order class, this quickly becomes hard to work with.
By the way, I still like OOP, but I do try to keep classes small. If functionality is complex (like in the case of processing or cancelling an order), I prefer to put that in a separate function. That also allows you to introduce an abstraction layer so writing tests for those functions becomes easier.
I’ve missed this kind of video where you really hack away at code 😊
Wait wait wait, but Arjan was the OOP guy, no?
Imo almost all data should be verified, even the name should note contain numbers
I love "fake" or "proxy" classes. Python lets you do all the things, so just creating a static or data class to use it purely as proxy for some other underlying technology (executing a command, listening or writing to a socket, etc) is a great fit for the debate for whether it makes sense to make it a class ends with "yes" :D
19:29 your preferred solution about Mixins just seems like more work?
"They can create all sort of problems" is not enough of a statement because any fool can say that but only a genius can explain why. And in your example I don't see any problems with it
enterprise-ai AI fixes this. Avoid these bad practices OOP.
👍