In terms of other ways to handle database testing: Rather setup and teardown of the entire database contents each time you run the tests, we have a Docker Image for our test data. That means we can test against a consistent dataset each time, and easily reset it (i.e. destroy the container) ready for next time (spin up a new container from the image). This works well for the things like: - We have a very large schema and set of testing data - Handles the fact we are still more monolithic than microservices (when/if we can fully split out, we would need separate images for the different services) - We want to test against specific MySQL versions (i.e. testing against 5.7 with one image and 8.0 with another image) - Standard Docker benefits of being platform independent, and indeed installation independent (i.e. our server can run the tests on Linux whilst devs can run happily on Windows) We store the Dockerfile with the schema and initial data in a repository that therefore allows versioning, plus new data can be added as a new file at the end of a list of scripts to be run to setup the data. Thoughts? Are there any particular weaknesses of this approach? Or better benefits of other methods?
To me it seems that destroying your containers and subsequently restarting it for each test, doesn't scale so well. The first question that pops up is: Do you really need all that data in your db for each of your tests? Far from an expert here, but: - You could have a single session for read only tests, as they are not making mutations to the database (and hence no need to reload or reprovision). - Like I said earlier, ask yourself if you need all the data in the first place - as part of a pytest fixture, you could copy the OG database to a unique file name, and passing that to the testclient/test, hence yielding a fresh playground db for each test, all without needing to restart your container. Of course it is different when you are testing on API level, in that case your approach makes a lot more sense already. Some (dirty) tricks that I built into my application (when in test environment) is a seperate API, to load or alter databases in my application, hence giving me more control when I run e2e tests.
@@erikvanraalte4557 Why not initialize all the data the beginning of the test suite and run each test in a transaction that rolls back after each test? If your tests require transactions you can use nested transactions.
@@erikvanraalte4557 Good point - that is the question I need to ask but have probably been ignoring as it is a bit of hard one! I should clarify that we spin up the container only once for each test run (i.e. once for 2500+ tests, and growing as our coverage increases). However, I'm starting to run into problems where the data change from previous tests can affect later tests, so your scaling point does hold a lot of weight too. It's leading me to think we'd need to rewrite the tests (e.g. why do we care about the next id in the table being a certain value?) Or to split the suite up and spin down and up between these sections. Still feels like our method isn't great, but is a starting point... I like the idea of refining/splitting up the data better, plus it makes sense to have a cleaner dataset when testing anyway to be very clear that we are testing what we think we are testing.
Thank you for the practical videos that can be used in real world work projects. The writing test could not be explained in detail more easily than this.
It's better not to create items needed just in some tests in the common setup. Every test should be atomic and work with an empty database. So if you need an item for an update test, you should create it within that test
Pytest-sqlalchemy-mock creates a "mocked_session" fixture which can be used in tests. Also useful is that mocked tables/records can defined as lists of dicts.
Since it’s testing. I suggest you simply use not async version to test. Same as alembic. So you will have async db and stuff in your prod environment and not async while doing migration and testing. Wonder why you want async with testing?
Speed? For me I used xdist. Database stuff wise you get issues so you need separate databases (or schemas) for each pytest session. This allows you to run close to X time the number of CPUs you have. Downside is the set up time so I also have a single pytest runner when running single tests. Config for creating a database client that uses separate sessions per thread goes in the conftest file.
@@po6577 I get what you are saying. The problem I encounter is that when I want to test my endpoints, they are all written with async functions and methods. Meaning that the way I handle my CRUD operations is async. I have never tried this up to this point of writing, and i'm not sure why now that I think about it. But since all of it is using async sqlalchemy, I can't use a dependency override that inserts a sync_engine. (Will test this later, but I assume it's incompatible). So to actually test my endpoint, I need to setup everything async with pytest. Now I have been able to do this, but how I'm not ENTIRELY sure how I did it, so I consider it a hack.
Not sure I follow, The controller can just be hit with a test client. That will not care if your function is sync or async. Not tested async service layer before though is that what you mean ?
if you have multiple assertions in one test that are not contingent on each other, I would recommend either splitting up the test case (sometimes a bit much) or using a package like pytest_check, which allows pytest to handle multiple checks without ending a test as soon as the first assertion fails. that way, you can get a more detailed report of everything that fails
Nice video as always! And examples are shown is pretty useful as long as your API is such simple one. In my case almost every API endpoint contains quite complicated business logic over the data that are gathered with several SQL requests that uses some results from previous ones. All kind of guides I've read about testing an API is always playing around kind of 'hello world' examples. The second issue for me is that some complex API endpoints requires DB filled with pretty complex set of data. Minimal viable data set is about 6 to 10 GB on disk, besides that it is additional effort to create it from production DB contents and ensure it is consistent.
the "setup" and "teardown" doesn't ran automatically here. I had to put that on the test function. It is possible to run it automatically like in your code? Without fixtures?
Very structured and helpful video giving some new aspects I was unsure about how to handle before. Many thanks! :) Was creating a FastAPI Project these days again for a first pitch of a small - specific - control module in a radio station. I was happy not having the need of using a database this time. In the project before (with DB) some it felt kind of unsatisfactory, as I had the feeling of writing some stuff two times (for API and for DB). Also I was not happy about the binding: The will of decoupling was strong in me, but I failed. Can't wait for refactoring this
Another issue with this approach of using sqlite is database compatibility: I had a similar setup and sqlite did not support ARRAY, which is very common in postgresql. The solution I came up with is to write a dialect adapter for sqlalchemy that emulates the ARRAY command. But that custom code may have bugs that hide real bugs (false positives) and you don't want to start testing the test framework :)
Docker makes it so easy to spin up a real database that for advanced features it makes sense to us the real thing. Great for when You need to test your app on the next database version or library. Bumping from sqlalchemy v1 to V2 for example.
@@TheRich107 Yep, I just have it on create, it will spin up a mariadb instance, create all the tables, Once it's done, i can just clone the production database to it then i can be as destructive as i want.
Does pytest auto-recognise the setup and teardown functions? If so does it call them before and after each test, or before and after each test session?
At 4:35 on line 72, you only close the DB connection if you don’t raise an error. Could that not creat problems with many sessions if the application is long-running?
This is such a quality content. Thanks a lot. One question I have though is if we're using a layered architecture... say DDD with layers like the infrustructure, domain (usecases) and such a thing how do we test them independent of the other layer?
That yield inside the try-finally is to check for errors inside the application and not with creating a session. With that try and yield you can assert that when an error occurs in your app the database connection is properly closed
There was small discussion that we had on Coding Crashkurse channel on this topic. I think the approach to have dummy DBs (using say, SQLlite) may not be the best approach for doing Unit-tests for APIs or utilities. We should rather test with mock functions that can be checked for how often they've been called! Testing with DB might fit well with an integration test - say by the QA environment.
I have a fastapi built on top of a Snowflake db (snowflake connector), which gives the team headaches to test because we are using Snowflake particular types like OBJECT in our db model, so we cannot reproduce it in a sqllite db for testing. Any suggestion?
Great video, thanks for sharing that! 1 major thing about converting different layers of models: `return Item(**db_item.__dict__)` might throw `TypeError: API.__init__() got an unexpected keyword argument 'hash_value' ` for adding another field in DB: ``` @dataclass class DB: id: str name: str # adding hash_value to DB, not meant to be shared with API hash_value: bytearray @dataclass class API: id: str name: str d = DB(id="1", name="db", hash_value=bytearray([125, 125])) api = API(**d.__dict__) ``` tests will find this bug before production, but for that you need good coverage. I prefer having it explicitly done with a convertor function i.e: create_api(db) even if it's boring as hell. What do you think about c'tors receiving unpacked arguments with **?
Nice vid. Are you interested in doing a similar video from a data engineering perspective, let’s say using polars and duckdb? Should be fun an probably a little more straightforward.
Is starting a db session for each route something that is done when using relational databases? I only know how to use MongoDB and a connection to the db is done once on startup.
Hi! I currently don't offer any type of mentorship service, for the time being. However, my Software Designer Mindset course offers two different services that might interest you. When enrolling, you'll get access to my private student community on Discord, which I interact with quite often. Depending on the package you choose, you might also get a one-time custom code review by me in either Python, JavaScript, or Typescript! You can check my website for more information if you wish! :) Hope this comment helped.
I don't see why the need of creating a custom not found error, why don't just let the operation return none so the API part can handle it and raise the proper exception? That would be easier with the same separation of concerns. Great video BTW most tutorials don't address good practices as you do, appreciated.
Do you mean dict(**db_item) ? or db_item.dict() ? Because of my knowledge sqlalchemy does not have that method. **__dict__** is a dunder that represents class attributes as a dictionary. Unpacking with ** results in initializing the pydantic model with keyword arguments. In this situation, he could have added a response_model=Item to the decorator function of the route. Fastapi would have handled the rest.
You can group the functions that do the the database operation into a class which will be repository pattern. I would also like to see how to set up the deepest model objects creation. For instance one need to write test for a project management tool like Jira. If we want to test commet. We need to create project, team, member, status, task and finally a comment. Could you make a tutorial for deepest hieararchy model object creation?
By the way the the rest o the application other than repositories classes can decoupled completely from the orm by returning the DTO from repository class methods.
Hi! I would not call it Integration test, I would call just "integrationy". As it is not tested against the same type of db as the life system, and also it is hosted a bit differently. About testing my api: In a next comment, after a sleep 🙂
Well this is a good intro,.But in production grade app things are done quite differently. FastAPI is async framework and sadly almost all tutorials show how to use FAstAPI with sync python only. In real life FastAPI app calls to DB are async, DB manages a pool of connections and everything gets quite a bit more complex. It would be nice to see a tutorial on more realistic setup of FastAPI DB layer
To be transparent, I am a bit Java-biased as I use both Java and Python daily. The solution explained in this video is not clean enough because you inject DB sessions into routers. But what the routers depend on are not DB sessions. The routers depend on operations. DB sessions are just internal details of the operations. From my Java-biased point of view, defining and injecting service objects providing operations is a cleaner solution. routers -> service objects -> DB sessions In this approach, we don't need in-memory SQLite for testing the routers. What we need is a mock of the service class. This approach makes tests of routers independent of DB access logic. By the way, what this video illustrates is something we have seen repeatedly in various programming languages and frameworks - fat controllers. And the mitigation for this problem is also always the same - the classic MVC model from the 1980s'.
I agree with your approach however I think there is value is including the db in the scope of the controller tests. The python type system can mess up decimals and float types and it only surfaces when passed to the database. The value comes from checking that when that data has come from JSON into the controller and translated by the service, it might not be as you expect. I have had a type of decimal accept floats from the controller just fine. Individual unit tests passed but actually writing controller sourced data from the controller to the database through the service failed. In that case test would pass with TCP connection but socket failed.
@@TheRich107 I understand your point, but I prefer having a clear separation of concerns. The ambiguity in type handling is a real pitfall in Python. But in this example of Web application tests, we should ensure strict type handling at the controller level (this is why we use Pydantic) and in assertions and mocks used in unit tests. The controller should focus on interfacing between the external world (Web clients) and business logic, and the business logic should be located somewhere else, like a service class. The injection of DB sessions may work in small and independent Web services. But generally, a Web service can rely on multiple databases and other external dependencies, including microservices in a company, files in S3 buckets and various services in cloud platforms. A DB session is just one of many dependencies maintaining application status. In our team, we encourage members to use predefined FastAPI project templates that include skeletons of routers and service classes with unit tests. Our templates aim to enforce a clear separation between controllers and models (services) from the beginning of every project. The overhead for introducing a service layer is literally zero. But its benefits are a lot.
@@buchi8449 Do you have an example repository with that structure? I'm not referring to your company's repo, but perhaps you've come across something very similar to what you're discussing in public repositories?
Yeah I like having a separate service layer certainly makes for easier testing. I get your point not sure I follow on the multi dependency. Let me check my dependency, It makes sense that I don't have dependency injections for S3 buckets in the controller and that is Purley a service layer concern. I might have to play around with doing the same for the database. The fast api author does know what he is doing. I would be interested to hear the pros of having the DB injected into the controller and why that is the documented way to use the framework. When I have instantiated a database connection at the service layer before I didn't find a way to override it in the tests. (I have several DB sessions running side by side in the tests)
I put a monitor while you watch monitor. 2x different mappers to same SQL table... just to update row Look: JSON -> model.Pydantic -> serialize in dict -> serialize in SQLAlchemy ORM mapper -> serializes in dict for query -> sql query. Instead could be Pydantic -> SQL query or SQLAlchemy Core.
IMHO HTTP handlers should know nothing about DB! There should be some "service" entity which you inject into handlers. Then you test HTTP-related stuff with a fake service. The "service" itself is tested separately and it knows nothing about HTTP. My personal favorite way is to have a class like "Application" which encapsulates HTTP app (e.g. Flask, FastAPI, etc.) and some service (e.g. Service class). The Application accepts the service via initializer. HTTP handlers have access to the service. Then you init the "Application" in tests passing it some FakeService, initialize a test HTTP client, and test your requests.
I made the same comment. Injecting DB sessions to routers is not clean. We should inject service objects into routers and use mocks of the service objects to test the routers. It is not difficult to modify the current code in the video to have a service class. In the video, the "operations" are extracted as pure functions. However, this is not a good model since the operations have side effects on the state of the application. The better approach is extracting the "operations" as methods of a service class. Then we can inject DB sessions to __init__ of the service class, and inject instances of the service class into routers (FastAPI's DI can recursively inject dependencies).
Unsurprisingly, a yotuber without a successful programming career has 0 skills in software architecture. You're helping some, but you disregard talking about some of the more important decisions you must make before you even start typing.
👷 Join the FREE Code Diagnosis Workshop to help you review code more effectively using my 3-Factor Diagnosis Framework: www.arjancodes.com/diagnosis
In terms of other ways to handle database testing:
Rather setup and teardown of the entire database contents each time you run the tests, we have a Docker Image for our test data.
That means we can test against a consistent dataset each time, and easily reset it (i.e. destroy the container) ready for next time (spin up a new container from the image).
This works well for the things like:
- We have a very large schema and set of testing data
- Handles the fact we are still more monolithic than microservices (when/if we can fully split out, we would need separate images for the different services)
- We want to test against specific MySQL versions (i.e. testing against 5.7 with one image and 8.0 with another image)
- Standard Docker benefits of being platform independent, and indeed installation independent (i.e. our server can run the tests on Linux whilst devs can run happily on Windows)
We store the Dockerfile with the schema and initial data in a repository that therefore allows versioning, plus new data can be added as a new file at the end of a list of scripts to be run to setup the data.
Thoughts? Are there any particular weaknesses of this approach? Or better benefits of other methods?
To me it seems that destroying your containers and subsequently restarting it for each test, doesn't scale so well. The first question that pops up is: Do you really need all that data in your db for each of your tests?
Far from an expert here, but:
- You could have a single session for read only tests, as they are not making mutations to the database (and hence no need to reload or reprovision).
- Like I said earlier, ask yourself if you need all the data in the first place
- as part of a pytest fixture, you could copy the OG database to a unique file name, and passing that to the testclient/test, hence yielding a fresh playground db for each test, all without needing to restart your container.
Of course it is different when you are testing on API level, in that case your approach makes a lot more sense already. Some (dirty) tricks that I built into my application (when in test environment) is a seperate API, to load or alter databases in my application, hence giving me more control when I run e2e tests.
@@erikvanraalte4557 Why not initialize all the data the beginning of the test suite and run each test in a transaction that rolls back after each test? If your tests require transactions you can use nested transactions.
Do you mean docker image or docker volume with the db data on ?
For those with smaller database fixture needs... Factory boy library is your friend
@@erikvanraalte4557 Good point - that is the question I need to ask but have probably been ignoring as it is a bit of hard one!
I should clarify that we spin up the container only once for each test run (i.e. once for 2500+ tests, and growing as our coverage increases).
However, I'm starting to run into problems where the data change from previous tests can affect later tests, so your scaling point does hold a lot of weight too. It's leading me to think we'd need to rewrite the tests (e.g. why do we care about the next id in the table being a certain value?) Or to split the suite up and spin down and up between these sections.
Still feels like our method isn't great, but is a starting point... I like the idea of refining/splitting up the data better, plus it makes sense to have a cleaner dataset when testing anyway to be very clear that we are testing what we think we are testing.
@@TheRich107Thanks, that looks excellent!
I don't know if it is on purpose or not, but you have been timing your video releases with my Friday lunch breaks and I love it.
Thank you for the practical videos that can be used in real world work projects. The writing test could not be explained in detail more easily than this.
Thank you! I'm glad it was helpful :)
I’m a simple man, Arjan posts a new video, I watch it! Congrats for the good work, Arjan!
Haha, thank you!😊
It's better not to create items needed just in some tests in the common setup. Every test should be atomic and work with an empty database. So if you need an item for an update test, you should create it within that test
Agreed
Pytest-sqlalchemy-mock creates a "mocked_session" fixture which can be used in tests. Also useful is that mocked tables/records can defined as lists of dicts.
A very good practical guide to testing in python in general, including some comments on integration tests. Thanks!!
I'm glad you enjoyed the content!
Your videos are amazing, you are an amazing teacher, please never stop making them!
Thank you for the kind words, Elliot! Will do!
Love how you introduced the concepts incrementally. Any plans to cover SQLModel? It plays very nicely with fastapi
Dude the way you explain stuff is out of this world
Thank you so much! Glad you enjoyed the video.
Would love to see async tests with sqlalchemy
Since it’s testing. I suggest you simply use not async version to test. Same as alembic. So you will have async db and stuff in your prod environment and not async while doing migration and testing. Wonder why you want async with testing?
Speed? For me I used xdist. Database stuff wise you get issues so you need separate databases (or schemas) for each pytest session. This allows you to run close to X time the number of CPUs you have. Downside is the set up time so I also have a single pytest runner when running single tests.
Config for creating a database client that uses separate sessions per thread goes in the conftest file.
@@po6577 I get what you are saying. The problem I encounter is that when I want to test my endpoints, they are all written with async functions and methods. Meaning that the way I handle my CRUD operations is async. I have never tried this up to this point of writing, and i'm not sure why now that I think about it. But since all of it is using async sqlalchemy, I can't use a dependency override that inserts a sync_engine. (Will test this later, but I assume it's incompatible).
So to actually test my endpoint, I need to setup everything async with pytest. Now I have been able to do this, but how I'm not ENTIRELY sure how I did it, so I consider it a hack.
Not sure I follow,
The controller can just be hit with a test client. That will not care if your function is sync or async.
Not tested async service layer before though is that what you mean ?
Nicely done! Short but very solid and specific one. Thank you very much!
Thank you for the kind words! I'm glad it was helpful.
Very useful video! also a simple quiz from a learn tail is a chef’s touch ❤ Thank you!
I'm glad you're enjoying Learntail :)
Very helpful.
Thank you
Glad the video helped!
Exactly the problem I was facing! Thank you, Arjan.
Glad the video helped!
if you have multiple assertions in one test that are not contingent on each other, I would recommend either splitting up the test case (sometimes a bit much) or using a package like pytest_check, which allows pytest to handle multiple checks without ending a test as soon as the first assertion fails. that way, you can get a more detailed report of everything that fails
Is that not what the -vv flat does with pytest?
Nice video as always!
And examples are shown is pretty useful as long as your API is such simple one.
In my case almost every API endpoint contains quite complicated business logic over the data that are gathered with several SQL requests that uses some results from previous ones.
All kind of guides I've read about testing an API is always playing around kind of 'hello world' examples.
The second issue for me is that some complex API endpoints requires DB filled with pretty complex set of data. Minimal viable data set is about 6 to 10 GB on disk, besides that it is additional effort to create it from production DB contents and ensure it is consistent.
the "setup" and "teardown" doesn't ran automatically here. I had to put that on the test function. It is possible to run it automatically like in your code? Without fixtures?
Amazing, I always learn so much. Thank you
Thank you for the kind words, James!
Very structured and helpful video giving some new aspects I was unsure about how to handle before. Many thanks! :) Was creating a FastAPI Project these days again for a first pitch of a small - specific - control module in a radio station. I was happy not having the need of using a database this time. In the project before (with DB) some it felt kind of unsatisfactory, as I had the feeling of writing some stuff two times (for API and for DB). Also I was not happy about the binding: The will of decoupling was strong in me, but I failed. Can't wait for refactoring this
Just what I was looking for. Thanks 👍🏻
Glad it was helpful!
Another issue with this approach of using sqlite is database compatibility: I had a similar setup and sqlite did not support ARRAY, which is very common in postgresql.
The solution I came up with is to write a dialect adapter for sqlalchemy that emulates the ARRAY command. But that custom code may have bugs that hide real bugs (false positives) and you don't want to start testing the test framework :)
Docker makes it so easy to spin up a real database that for advanced features it makes sense to us the real thing. Great for when You need to test your app on the next database version or library. Bumping from sqlalchemy v1 to V2 for example.
@@TheRich107 Yep, I just have it on create, it will spin up a mariadb instance, create all the tables, Once it's done, i can just clone the production database to it then i can be as destructive as i want.
Does pytest auto-recognise the setup and teardown functions? If so does it call them before and after each test, or before and after each test session?
I have the same doubt
At 4:35 on line 72, you only close the DB connection if you don’t raise an error. Could that not creat problems with many sessions if the application is long-running?
This is such a quality content. Thanks a lot. One question I have though is if we're using a layered architecture... say DDD with layers like the infrustructure, domain (usecases) and such a thing how do we test them independent of the other layer?
Where are the setup and teardown methods called from?
I have the same doubt
9:00 Shouldn't the expression on the 48th line be inside the try-finally block? Seems like the `yield database` on its own cannot throw an exception.
That yield inside the try-finally is to check for errors inside the application and not with creating a session. With that try and yield you can assert that when an error occurs in your app the database connection is properly closed
Flawless ❤
There was small discussion that we had on Coding Crashkurse channel on this topic. I think the approach to have dummy DBs (using say, SQLlite) may not be the best approach for doing Unit-tests for APIs or utilities. We should rather test with mock functions that can be checked for how often they've been called! Testing with DB might fit well with an integration test - say by the QA environment.
I have a fastapi built on top of a Snowflake db (snowflake connector), which gives the team headaches to test because we are using Snowflake particular types like OBJECT in our db model, so we cannot reproduce it in a sqllite db for testing. Any suggestion?
A video about testcontainers would be very nice
Thank you for the video! Where did you get this cool T-shirt from? :)
Hi Arjan, this video it amazing (I already commented the other day), but I can't find the test_operation file in your git repository
thank you
Glad you enjoyed it, Daniel!
Beautiful
Thank you!
@ArianCodes could you do a video doing the same but with unittest instead?
Great video, thanks for sharing that!
1 major thing about converting different layers of models:
`return Item(**db_item.__dict__)`
might throw `TypeError: API.__init__() got an unexpected keyword argument 'hash_value' `
for adding another field in DB:
```
@dataclass
class DB:
id: str
name: str
# adding hash_value to DB, not meant to be shared with API
hash_value: bytearray
@dataclass
class API:
id: str
name: str
d = DB(id="1", name="db", hash_value=bytearray([125, 125]))
api = API(**d.__dict__)
```
tests will find this bug before production, but for that you need good coverage.
I prefer having it explicitly done with a convertor function i.e: create_api(db) even if it's boring as hell.
What do you think about c'tors receiving unpacked arguments with **?
Could I use unittest instead of pytest with the code?
Nice vid. Are you interested in doing a similar video from a data engineering perspective, let’s say using polars and duckdb? Should be fun an probably a little more straightforward.
Is starting a db session for each route something that is done when using relational databases? I only know how to use MongoDB and a connection to the db is done once on startup.
@ArjanCodes do you offer a mentorship service?
Hi! I currently don't offer any type of mentorship service, for the time being.
However, my Software Designer Mindset course offers two different services that might interest you.
When enrolling, you'll get access to my private student community on Discord, which I interact with quite often. Depending on the package you choose, you might also get a one-time custom code review by me in either Python, JavaScript, or Typescript!
You can check my website for more information if you wish! :)
Hope this comment helped.
I don't see why the need of creating a custom not found error, why don't just let the operation return none so the API part can handle it and raise the proper exception? That would be easier with the same separation of concerns. Great video BTW most tutorials don't address good practices as you do, appreciated.
What if your api needs to talk to external things, like third party apis?
I’m very curious how you would approach this using SQLModel, a new ORM made by the fastAPI author and built on top of fastAPI itself and SQLAlchemy
Hi Arjan, I see you're using Item(**db_item.__dict__). Is there a reason behind using __dict__ instead of dict()?
Do you mean dict(**db_item) ? or db_item.dict() ? Because of my knowledge sqlalchemy does not have that method. **__dict__** is a dunder that represents class attributes as a dictionary. Unpacking with ** results in initializing the pydantic model with keyword arguments.
In this situation, he could have added a response_model=Item to the decorator function of the route. Fastapi would have handled the rest.
👌
You can group the functions that do the the database operation into a class which will be repository pattern. I would also like to see how to set up the deepest model objects creation. For instance one need to write test for a project management tool like Jira. If we want to test commet. We need to create project, team, member, status, task and finally a comment. Could you make a tutorial for deepest hieararchy model object creation?
It would also be good dealing with user authentication when sending API requests.
By the way the the rest o the application other than repositories classes can decoupled completely from the orm by returning the DTO from repository class methods.
what about a video regarding Testcontainers 😜
i like writing code from scratch rather than showing a screen shot of code
please make video on to use sqlalchemy 2.0 with fatapi
Hi! I would not call it Integration test, I would call just "integrationy". As it is not tested against the same type of db as the life system, and also it is hosted a bit differently. About testing my api: In a next comment, after a sleep 🙂
Well this is a good intro,.But in production grade app things are done quite differently. FastAPI is async framework and sadly almost all tutorials show how to use FAstAPI with sync python only. In real life FastAPI app calls to DB are async, DB manages a pool of connections and everything gets quite a bit more complex. It would be nice to see a tutorial on more realistic setup of FastAPI DB layer
To be transparent, I am a bit Java-biased as I use both Java and Python daily.
The solution explained in this video is not clean enough because you inject DB sessions into routers. But what the routers depend on are not DB sessions. The routers depend on operations. DB sessions are just internal details of the operations.
From my Java-biased point of view, defining and injecting service objects providing operations is a cleaner solution.
routers -> service objects -> DB sessions
In this approach, we don't need in-memory SQLite for testing the routers. What we need is a mock of the service class. This approach makes tests of routers independent of DB access logic.
By the way, what this video illustrates is something we have seen repeatedly in various programming languages and frameworks - fat controllers. And the mitigation for this problem is also always the same - the classic MVC model from the 1980s'.
I agree with your approach however I think there is value is including the db in the scope of the controller tests. The python type system can mess up decimals and float types and it only surfaces when passed to the database. The value comes from checking that when that data has come from JSON into the controller and translated by the service, it might not be as you expect. I have had a type of decimal accept floats from the controller just fine. Individual unit tests passed but actually writing controller sourced data from the controller to the database through the service failed. In that case test would pass with TCP connection but socket failed.
@@TheRich107 I understand your point, but I prefer having a clear separation of concerns.
The ambiguity in type handling is a real pitfall in Python. But in this example of Web application tests, we should ensure strict type handling at the controller level (this is why we use Pydantic) and in assertions and mocks used in unit tests. The controller should focus on interfacing between the external world (Web clients) and business logic, and the business logic should be located somewhere else, like a service class.
The injection of DB sessions may work in small and independent Web services. But generally, a Web service can rely on multiple databases and other external dependencies, including microservices in a company, files in S3 buckets and various services in cloud platforms. A DB session is just one of many dependencies maintaining application status.
In our team, we encourage members to use predefined FastAPI project templates that include skeletons of routers and service classes with unit tests. Our templates aim to enforce a clear separation between controllers and models (services) from the beginning of every project. The overhead for introducing a service layer is literally zero. But its benefits are a lot.
@@buchi8449 Do you have an example repository with that structure? I'm not referring to your company's repo, but perhaps you've come across something very similar to what you're discussing in public repositories?
Yeah I like having a separate service layer certainly makes for easier testing.
I get your point not sure I follow on the multi dependency. Let me check my dependency, It makes sense that I don't have dependency injections for S3 buckets in the controller and that is Purley a service layer concern. I might have to play around with doing the same for the database.
The fast api author does know what he is doing. I would be interested to hear the pros of having the DB injected into the controller and why that is the documented way to use the framework.
When I have instantiated a database connection at the service layer before I didn't find a way to override it in the tests. (I have several DB sessions running side by side in the tests)
@maihuynhtuanvu2254
Not got sample code, was production code 😅 good old sentry had my back though.
I put a monitor while you watch monitor. 2x different mappers to same SQL table... just to update row
Look: JSON -> model.Pydantic -> serialize in dict -> serialize in SQLAlchemy ORM mapper -> serializes in dict for query -> sql query. Instead could be Pydantic -> SQL query or SQLAlchemy Core.
IMHO HTTP handlers should know nothing about DB! There should be some "service" entity which you inject into handlers. Then you test HTTP-related stuff with a fake service. The "service" itself is tested separately and it knows nothing about HTTP.
My personal favorite way is to have a class like "Application" which encapsulates HTTP app (e.g. Flask, FastAPI, etc.) and some service (e.g. Service class). The Application accepts the service via initializer. HTTP handlers have access to the service.
Then you init the "Application" in tests passing it some FakeService, initialize a test HTTP client, and test your requests.
That´s what I also say. Unittests are there to test a code unit. Unit tests should not be there for integration tests
I made the same comment. Injecting DB sessions to routers is not clean. We should inject service objects into routers and use mocks of the service objects to test the routers.
It is not difficult to modify the current code in the video to have a service class. In the video, the "operations" are extracted as pure functions. However, this is not a good model since the operations have side effects on the state of the application.
The better approach is extracting the "operations" as methods of a service class. Then we can inject DB sessions to __init__ of the service class, and inject instances of the service class into routers (FastAPI's DI can recursively inject dependencies).
Hey man are you on the Carnivore Diet ? You look like you lost some weight!
So much glue code and scaffolding. Maybe you should learn SQL. Reminds me of people who rely on Microsoft GUI vs know how to use bash.
Unsurprisingly, a yotuber without a successful programming career has 0 skills in software architecture.
You're helping some, but you disregard talking about some of the more important decisions you must make before you even start typing.