This is why I always, always, ALWAYS use data mappers to all my endpoints. This allows me to - dynamically filter field based on some business / authorization rules, - add additional computed fields, etc. - rename database colums if / when the needs arises without breaking everything. You could also define those mappers as zod chemas that disallow extra properties
agreed, using a allowlist (select in your query) is more easy to debug than using a blocklist (omit in your query) in most situation, and we have more control about the behavior.
woah, I didn't know that. I should check the network tab next time. Also, it does make sense since passing props pretty much makes the props accessible to the component we're passing it on. Thank u!
With 'select' you should always keep it in mind, especially when you query entity and you have to JOIN or populate User entity. For me it's better to use DTO with class-transform and 'serialize' response data. And the best decision will be choose better ORM, where prive field alredy implemented)
Another approach to totally forgot to mention in this video is to create a separate CodeRacerUser model which stores all information unrelated to authentication. This would obviously be a much larger refactor, but I think that would be the BEST solution over what I did here.
I'm glad you said that because I was thinking it. 🙂 -- because you said blacklisting everywhere would be a pain...but then recommended whitelisting everywhere...which also sounds like a pain. 😁 I was also curious about how whitelisting works with types...do you also have to make a separate type with only the whitelisted pieces?
@@drewbird87 whitelisting / selecting is a pain as well, but it has LESS risk. If you forget to omit in one place, you've leaked all your users email. If you fail to select the columns you need, your UI will break and you'll know. You could also use DTOs and Mappers to make sure nothing bad is returned from your functions you await on inside your RSC, but that also runs the risk that you forget to pass something through a mapper. Basically it all boils down to you need to double check you're not leaking stuff, and that's a human problem.
@@veenallo yup, a DTO is also a good solution. The people writing Java code have a decade more experience writing large systems than this entire javascript ecosystem
After seeing the video, I was going to ask about this very thing. In one of my projects, I had seperate User and ClientUser models. ClientUser only had public info. I ended up changing to the solution in the video because I felt it made more sense.
It always baffles me when I get other dev telling me, "it doesn't matter if i request too much, i'll never know i might need it". The burden on the network and the database is often overlooked and definitely when there is sensitive data you should take extra care !
Generally, I just separate out sensitive information in the data model into a separate table. Users can only view their own data, but if they have a "profile" this is publically available.
React Sever Components: Welcome back to the ages of PHP, table leaks, and SQL injections….. Combining a view (render logic) with a controller (data access) is a bad idea and the reason why the industry advanced to API first designs. At least that was the case… phew
This is honestly why too much abstraction can be a bad thing, you're allowing some library to control what data it retrieves (or at least the default is project everything), and having a default like that is dangerous in my opinion, whereas with some query language, like SQL, you can do "SELECT *" and you explicitly know what you are projecting from the given table. Not to say it is entirely Prisma's fault either, although that default is just poor, it is also partly a Next thing where it makes it feel like the client and server are just a single application, I think sometimes that divide is important and necessary. What do other people think?
it's a couple of things. First, the code base could be setup to use a repository pattern with data mappers so that we have more obvious control over what data comes back from the database. Second, we could split our the non-sensitive data into a separate model so that our code never needs to fetch from the user model. Third, I do agree having a separate API makes it a lot more obvious what data is being leaked. Using RSC with next does obscure what is accessible publicly or not. I think prisma could contribute to the issue, but honestly I think all ORMs are setup like this to return ALL the data, and let's be honest, who wants to waste time writing raw SQL queries.
I found a website doing something similar once.. it was even worse, though. They were returning SO much stuff from the API, like even hashed passwords. I emailed them and they fixed it.
Great tips. Besides, Backenders have lotta responsibilities here. Working with Nest js, you simply serialize the response. That seals everything :) What do you think?
For example, when you hover your mouse cursor over the line of code, an automatic text appears on the right side. For example, at 2:51, on the right of line 56, "You, 4 weeks ago" and so on.
No I mean you’d migrate all the columns unrelated to authentication to a new table. A lot of this data such as averageCpm doesn’t need to live directly on the user table
So can use sterilization on your route handler instead ?! Sterilization does intercept the response before the route handler send it so you can manipulate the response object how ever you want This could be useful if you don't want to always select the same data over multiple end points
I'm not sure exactly what you mean by route handler. We are using next with RSC which means we are invoking async functions directly from our RSC. Are you saying just use a DTO / Mapper / Serializer before that function returns data to make sure it's only the fields we care about?
I believe you should always use select to fetch the specific parts of the table that you need for your use case and never the entire entry. Imo using ommit is just nonsense - don't bother with it
doesn't exist in prisma 🤣 they do provide some documentation on how to write a typesafe exclude, but the question is why isn't this baked into the ORM by default?
it's not calling prisma inside next that is a mistake, is that you unfiltered the unwanted data from the query or just didn't explicitly stated the fields you want. exactly the same stuff you should do even if the data is not fetched inside of Next!
right, but it's easier to manage a code base if you keep all your persistence layer calls in separate functions and transfer information using DTOs. For example, in prisma if you accidently do an includes: { user: true}, that'll bring over all the user info. It's simple to make a silly mistake and leak all that data wouldn't you say?
this is just a lack of logic issue. You have to show some user details but not everything, so obviously you cant return a raw findMany query. A simple `select` field solves this problem
With this trends on the last 6-7 years to use Frontend technology like a JS for Strictly Backend stuff you end here. The topic is endless and I NO want to starting flames, but just want to ask you - do you and I ask honestly with pure heart without any sarcasm, do you understand how, how ridiculous, dumb and absurd is the problem in this video that you show if you think from the perspective of someone who will use PHP, Python or C# for this?!? It is .... not just Impossible for us to spit the data this way, but the whole thing is filtered already on a SQL level and we no select data that we do not need in the working scope.
what ORMs are you speaking about? I haven't used an ORM that automatically strips email from your returned database query. The language (php, python, c#) isn't going to automatically make your data secure, so when you say "the whole thing is filtered", I do not understand what you are saying.
@@WebDevCody I not use ORM and write my SQL manually. And, Yes I Perfectly know what I saying when I write my SELECT id, name, family, phone from.... by hand and do not include email in the query... Such a poor response from you, I do not expect that...
@@vasiovasio a poor response? I'm just asked what ORM you were using. You didn't mention in the original comment that you were writing raw SQL, all you did was try to ridicule javascript developer (which is a poor comment by you). All of the languages you mention have ORMs, and every professional software I've worked with uses some type of ORM.
This is why I always, always, ALWAYS use data mappers to all my endpoints. This allows me to
- dynamically filter field based on some business / authorization rules,
- add additional computed fields, etc.
- rename database colums if / when the needs arises without breaking everything.
You could also define those mappers as zod chemas that disallow extra properties
Dear @darialyphia, do you mean a data mappers pattern for connection your database to application?
Hi @daria can you elaborate on your logic?
As someone who contributed to the project I never imagined such a big security issue. Thank you so much for this I learned something new!
Rookie mistake many of us made in the past. I once leaked password hash in JWT token. Ok, it was bcrypt hash and not plain password but still...
agreed, using a allowlist (select in your query) is more easy to debug than using a blocklist (omit in your query) in most situation, and we have more control about the behavior.
In Mongoose, you can unselect fields by default so they only get returned if you explicitly request them.
Must be nice to have such a simple feature to add inside an orm 😂
@@WebDevCody I learned about it when the exact same thing happened to me. Public leaderboard that leaked users' emails 😅
woah, I didn't know that. I should check the network tab next time. Also, it does make sense since passing props pretty much makes the props accessible to the component we're passing it on. Thank u!
With 'select' you should always keep it in mind, especially when you query entity and you have to JOIN or populate User entity. For me it's better to use DTO with class-transform and 'serialize' response data. And the best decision will be choose better ORM, where prive field alredy implemented)
Nice profile picture 😂
Thank you for pointing this out, Cody. This is really important
Another approach to totally forgot to mention in this video is to create a separate CodeRacerUser model which stores all information unrelated to authentication. This would obviously be a much larger refactor, but I think that would be the BEST solution over what I did here.
reminds me of DTOs in Java
I'm glad you said that because I was thinking it. 🙂 -- because you said blacklisting everywhere would be a pain...but then recommended whitelisting everywhere...which also sounds like a pain. 😁
I was also curious about how whitelisting works with types...do you also have to make a separate type with only the whitelisted pieces?
@@drewbird87 whitelisting / selecting is a pain as well, but it has LESS risk. If you forget to omit in one place, you've leaked all your users email. If you fail to select the columns you need, your UI will break and you'll know. You could also use DTOs and Mappers to make sure nothing bad is returned from your functions you await on inside your RSC, but that also runs the risk that you forget to pass something through a mapper. Basically it all boils down to you need to double check you're not leaking stuff, and that's a human problem.
@@veenallo yup, a DTO is also a good solution. The people writing Java code have a decade more experience writing large systems than this entire javascript ecosystem
After seeing the video, I was going to ask about this very thing. In one of my projects, I had seperate User and ClientUser models. ClientUser only had public info. I ended up changing to the solution in the video because I felt it made more sense.
It always baffles me when I get other dev telling me, "it doesn't matter if i request too much, i'll never know i might need it". The burden on the network and the database is often overlooked and definitely when there is sensitive data you should take extra care !
I dont think omitting id from the DB fetch would cause a performance issue
Generally, I just separate out sensitive information in the data model into a separate table. Users can only view their own data, but if they have a "profile" this is publically available.
Thank you. Very important topic. You covered the problem and query solution well.
React Sever Components: Welcome back to the ages of PHP, table leaks, and SQL injections….. Combining a view (render logic) with a controller (data access) is a bad idea and the reason why the industry advanced to API first designs. At least that was the case… phew
back to the future
Nice catch babe!! Good job ❤
you're a nice catch 😘
Nice, I always follow this practice to save DB bandwidth.
This is honestly why too much abstraction can be a bad thing, you're allowing some library to control what data it retrieves (or at least the default is project everything), and having a default like that is dangerous in my opinion, whereas with some query language, like SQL, you can do "SELECT *" and you explicitly know what you are projecting from the given table.
Not to say it is entirely Prisma's fault either, although that default is just poor, it is also partly a Next thing where it makes it feel like the client and server are just a single application, I think sometimes that divide is important and necessary. What do other people think?
it's a couple of things. First, the code base could be setup to use a repository pattern with data mappers so that we have more obvious control over what data comes back from the database. Second, we could split our the non-sensitive data into a separate model so that our code never needs to fetch from the user model. Third, I do agree having a separate API makes it a lot more obvious what data is being leaked. Using RSC with next does obscure what is accessible publicly or not. I think prisma could contribute to the issue, but honestly I think all ORMs are setup like this to return ALL the data, and let's be honest, who wants to waste time writing raw SQL queries.
That’s crazy they didn’t implemented DTOs on their API
Nope, it was just a small side project, but now it probably needs some
I found a website doing something similar once.. it was even worse, though. They were returning SO much stuff from the API, like even hashed passwords. I emailed them and they fixed it.
Did an email go out to all users saying data was exposed incorrectly?
Yes I sent out emails to the users affected
Great tips.
Besides, Backenders have lotta responsibilities here.
Working with Nest js, you simply serialize the response. That seals everything :)
What do you think?
Imo it makes some sense for email to be public to other users, like on mailing lists. But maybe not if you aren't logged in..
A persons email should never be public unless they explicitly allow it to be
For example, when you hover your mouse cursor over the line of code, an automatic text appears on the right side. For example, at 2:51, on the right of line 56, "You, 4 weeks ago" and so on.
that is the gitlens plugin which allows for inline git blame
Thank you @@WebDevCody
Unfortunately in Prisma if you use the include statement it selects all the field from the entity. You have always to keep that in mind.
it includes all the fields from the nested entity right? I guess we'd always need a DTO in that case
Although title is misleading in a way, you do talk about very important stuff to remember 😂
Cody as you mentioned about creating another model
Would there be any issue with redundant data?
I prefer select solution
No I mean you’d migrate all the columns unrelated to authentication to a new table. A lot of this data such as averageCpm doesn’t need to live directly on the user table
So can use sterilization on your route handler instead ?!
Sterilization does intercept the response before the route handler send it so you can manipulate the response object how ever you want
This could be useful if you don't want to always select the same data over multiple end points
I'm not sure exactly what you mean by route handler. We are using next with RSC which means we are invoking async functions directly from our RSC. Are you saying just use a DTO / Mapper / Serializer before that function returns data to make sure it's only the fields we care about?
network tab search bars have been so buggy recently, driving me crazy
for real I do cmd + f and it highlights some other panels
could this be resolved using dto kinda pattern?
yes, we could potentially pass all data that the RSC is trying to fetch into a DTO which has a whitelist of fields. Some call a DTO a mapper.
Nextjs & React Server component is never a safe option. Enterprises will never pick this as their primary backend.
I believe you should always use select to fetch the specific parts of the table that you need for your use case and never the entire entry. Imo using ommit is just nonsense - don't bother with it
will using zod with prisma helps this kind of problem?
Yeah it can help
js dev figuring out normal vulnerability
I always select
RSC will be the end of NextJS, i tell you.
can you do a video about different types of SQL joins and other general things about SQL that are needed in a professional setting
That’s why you use drizzle, so this never happens unless you do a select *
yeah, that is a good argument, making it explicit to get all the data on a column might be an important selling point.
Omit or exclude ??
doesn't exist in prisma 🤣 they do provide some documentation on how to write a typesafe exclude, but the question is why isn't this baked into the ORM by default?
bro i don't even know what this video is about i just clicked on it to tell you that i thought you were wolfy playz LMAO
it's not calling prisma inside next that is a mistake, is that you unfiltered the unwanted data from the query or just didn't explicitly stated the fields you want.
exactly the same stuff you should do even if the data is not fetched inside of Next!
right, but it's easier to manage a code base if you keep all your persistence layer calls in separate functions and transfer information using DTOs. For example, in prisma if you accidently do an includes: { user: true}, that'll bring over all the user info. It's simple to make a silly mistake and leak all that data wouldn't you say?
@@WebDevCody please fix the misleading title then
this is just a lack of logic issue. You have to show some user details but not everything, so obviously you cant return a raw findMany query. A simple `select` field solves this problem
you dont even bother to change the content of your clickbaits, just the titles
With this trends on the last 6-7 years to use Frontend technology like a JS for Strictly Backend stuff you end here.
The topic is endless and I NO want to starting flames, but just want to ask you - do you and I ask honestly with pure heart without any sarcasm, do you understand how, how ridiculous, dumb and absurd is the problem in this video that you show if you think from the perspective of someone who will use PHP, Python or C# for this?!?
It is .... not just Impossible for us to spit the data this way, but the whole thing is filtered already on a SQL level and we no select data that we do not need in the working scope.
what ORMs are you speaking about? I haven't used an ORM that automatically strips email from your returned database query. The language (php, python, c#) isn't going to automatically make your data secure, so when you say "the whole thing is filtered", I do not understand what you are saying.
@@WebDevCody I not use ORM and write my SQL manually.
And, Yes I Perfectly know what I saying when I write my SELECT id, name, family, phone from.... by hand and do not include email in the query...
Such a poor response from you, I do not expect that...
@@vasiovasio he literally gave the same solution in the video..
@@vasiovasio a poor response? I'm just asked what ORM you were using. You didn't mention in the original comment that you were writing raw SQL, all you did was try to ridicule javascript developer (which is a poor comment by you). All of the languages you mention have ORMs, and every professional software I've worked with uses some type of ORM.
@@WebDevCody ok