I rarely write any comments, but I just wanted to thank you for the content. I've had very basic understanding of webscraping with requests, but recently I've had a project to scrape as many websites as possible and thanks to you I learned about scrapy, pipelining to database, user-agents and much more. Thank you so much!
Thank you! I’m glad you’ve enjoyed my videos and that they were useful to you, thanks for watching!
Quick note of thanks: I just landed my first programming job which is django development. Before I learned any web dev though, I was always using your scraper tutorials and building my own scrapers; working on my python chops. Those were some of the most formative things in terms of actually getting the confidence that I could learn and write code. I've been quiet on this channel while learning all the web dev stuff but I wanted to say thanks again for all the help you've provided along the way.
That’s great Matt, congratulations on your new job! I’m really pleased to have helped you in some way on your path to becoming a django dev - thank you for letting me know I appreciate it.
I've been using Pydantic for years. The validator is a killer feature that needs to be built into Python in a future version.
I stopped Python for a few weeks and have been learning HTML, CSS, and JS for about 3 weeks. It really helps a lot with understanding Scrapy. Now I'm back to learn more Scrapy skills!
I just scraped all your video titles and want to work through them one by one. You uploaded 146 videos, a heavy job omg. I couldn't scrape YouTube directly; maybe there is something wrong with my proxy (I live in CN and have to use a VPN to get to YouTube or Google). So I just copied down the HTML element and used bs4 to parse it.
Not in a programming mood, but I like watching a master hone his craft.
You might not know the true impact you are making upon us when making these videos, but it is substantial. I can’t thank you enough for what you are doing.
John you are amazing, thank you very much for all the knowledge you share, cheers from Chile!
Many thanks!
It's like dataclasses with validation. The only issue I have is that it's a bit rigid. Once it's a Pydantic model you cannot change anything inside it, so you need to make any transformations to the incoming data (and yes, there can be some) before you instantiate it. But if you're doing that, then you might as well use dataclasses to begin with and perform the transformations as you go. The problem with that is that you lose the validation and have to write it yourself.
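A minimal sketch of the transform-then-instantiate pattern described above (the Product model, fields, and data are hypothetical, invented purely for illustration):

from pydantic import BaseModel

class Product(BaseModel):          # hypothetical model for illustration
    name: str
    price: float

raw = {"name": "  Widget  ", "price": "19.99"}     # incoming, untidy data
cleaned = {**raw, "name": raw["name"].strip()}     # transform before instantiating
product = Product(**cleaned)                       # validation and coercion happen here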
I had actually heard of Pydantic several months back but never explored it further, don't know why, just kind of gave it a first glance and moved on.
I did not know how useful it could be hahaha I'm definitely going to use this!
You do a fantastic job explaining concepts and giving great examples. Thank you
Very nice tip and intro to Pydantic, I'll give it a try, thank you
Best channel ever, thank you so much
Thank you so much for this John! 🙏🥳 Great topic.
I also watched your SQLModel video: thanks for that too!
More videos about storing more complex data structures in an ORM setup are appreciated.
My goal is to store scraped data that is enhanced with API data to the Django database.
Not sure how to setup the route from JSON + other JSON to the Django ORM yet.
Thanks very kind!
FYI: There is a Pydantic PyCharm plugin that will autocomplete the class variables you type at about 3:33 where you create the item object.
Oh great thanks I didn’t know about that
Not sure if you have tried this, but you can create a top-level class with a root attribute to hold the list of items, instead of using a list comprehension. In fact this is what the Pydantic JSON-to-model generators do.
Like a Products model to hold a list of Product models
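A minimal sketch of that root-attribute idea, assuming Pydantic v1 syntax (the Product fields and sample data are invented for illustration; in Pydantic v2 the equivalent is a RootModel):

from typing import List
from pydantic import BaseModel

class Product(BaseModel):              # hypothetical item model
    name: str
    price: float

class Products(BaseModel):
    __root__: List[Product]            # the root attribute holds the whole list

data = [{"name": "Widget", "price": 19.99}, {"name": "Gadget", "price": 4.5}]
products = Products.parse_obj(data)    # instead of [Product(**item) for item in data]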
THAT WAS GOOD! Thanks man
Optional doesn't mean the field can be omitted. You need to set a default value for it to be truly optional; if the default is None, then Optional allows the value to be either a list or None.
Yes you are right I made a mistake there
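To illustrate the correction above, a minimal sketch (the model and field names are invented for this example): Optional only widens the accepted type to include None, while the default value is what lets the field be left out entirely.

from typing import List, Optional
from pydantic import BaseModel

class Product(BaseModel):                  # hypothetical model for illustration
    name: str
    tags: Optional[List[str]] = None       # the default is what makes it truly optional

Product(name="Widget")                     # fine: tags falls back to None
Product(name="Widget", tags=["sale"])      # fine: a list is also accepted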
Thanks for explaining everything so clearly!
great work, this is definitely a handy tip
Brother, you should make more videos on libraries like this, not just web scraping. I love your videos but I couldn't find any other library videos.
I had literally just said aloud "yeah, I don't want to type all that out" and then you said "I can hear you saying 'I don't want to type that out all the time'". :))
Haha yup!
What code editor are you using, and how did you get "show context actions" to add missing modules?
Nice 🤠 but I have a question: how do you modify the JSON (e.json()) when the exception is raised?
So is this similar to a JSON schema validator, except that it operates on a dict? For incoming requests, we're using a JSON schema validator to ensure the structure and types match a JSON schema. I'm wondering what this brings to the table if I replace schema validation with Pydantic.
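The thread doesn't answer this directly, but as a rough sketch of the comparison (the Request model and fields are invented): Pydantic expresses the schema as a Python class and raises a structured error when a dict doesn't match, which plays a role similar to a JSON schema document plus a validator.

from pydantic import BaseModel, ValidationError

class Request(BaseModel):          # hypothetical schema for an incoming request
    user_id: int
    email: str

try:
    Request(**{"user_id": "not-a-number", "email": "a@b.com"})
except ValidationError as e:
    print(e.json())                # structured error report, much like a schema validator's output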
You can do this same thing with dataclass mixin module
How about taking it a step further and using dataclasses instead of BaseModel? Would that work too?
from pydantic import dataclasses
Perfect thank you
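For anyone following along, a minimal sketch of what that submodule provides (the Product fields are invented for illustration): Pydantic ships a dataclass decorator that keeps the standard dataclass syntax but adds validation and coercion.

from pydantic.dataclasses import dataclass

@dataclass
class Product:                              # hypothetical dataclass for illustration
    name: str
    price: float

Product(name="Widget", price="19.99")       # the string price is coerced to 19.99
# Product(name="Widget", price="oops")      # would raise a ValidationError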
For example, you have 1 Excel sheet and it contains 10,000 rows of data. Later, when we import that Excel file in PyCharm or a Jupyter notebook and run it, we get an index range, also known as row labels. My Python code should be able to read those ten thousand row labels and split them into 10 different Excel files, with 1,000 rows in each of the 10 separate files.
Another example: if there are 9,999 rows in 1 sheet, then my Python code should put 9,000 rows into 9 files and the other 999 into another file, without any mistakes.
I am asking this because my data has no unique values for my code to split the files on using .unique.
Please help, I have searched the whole of YouTube, Stack Overflow, and GitHub too for 3 days.
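A minimal sketch of one way to do what this comment asks, assuming pandas (with an Excel engine such as openpyxl) is installed; the file names are hypothetical:

import pandas as pd

df = pd.read_excel("data.xlsx")                  # hypothetical input file
chunk_size = 1000

for i, start in enumerate(range(0, len(df), chunk_size), start=1):
    chunk = df.iloc[start:start + chunk_size]    # rows start .. start+999
    chunk.to_excel(f"data_part_{i}.xlsx", index=False)

With 10,000 rows this writes 10 files of 1,000 rows each; with 9,999 rows the last file simply gets the remaining 999.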
Hey John, are you going to do more tutorials on Go? You are a Gopher now after all 😁!
Haha yes I will, don’t worry
Love your video and learned something today. I'm not clear on Product(**item) at 9:37 - will you please explain what "**" means here in the list comprehension? Looking forward to your reply. Thanks
In Python, ** is used to unpack dictionaries/key-value pairs. This is useful in a few areas; I mostly see it used in function/method parameters. def function(*args, **kwargs): would let you call function(blah=True), and kwargs would be a dictionary that looks like {"blah": True}
@@VolundMush thank you so much
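A small sketch of both uses of ** mentioned above (the Product model and values are invented): ** at a call site unpacks a dict into keyword arguments, and **kwargs in a function signature collects keyword arguments back into a dict.

from pydantic import BaseModel

class Product(BaseModel):              # hypothetical model for illustration
    name: str
    price: float

item = {"name": "Widget", "price": 19.99}
product = Product(**item)              # same as Product(name="Widget", price=19.99)

def show(*args, **kwargs):
    print(args, kwargs)

show(1, 2, blah=True)                  # prints: (1, 2) {'blah': True}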
Thanks!
You can achieve all of this with dataclasses too
awesome
I want to thank you for teaching the world.
I'm a Python beginner and my next step is to build a web scraper. Can you please help me with how to go about it?
Thanks! I have loads of web scraping with Python content on my channel if you have a look; I did a basics scraping guide not that long ago
Hey, what tech stack do you use for your blog?
Hugo static site generator, GitHub and netlify
John, please explain what Product(**item) means in the list comprehension?
We are loading the data into the Pydantic model called Product
@@JohnWatsonRooney thx
@@JohnWatsonRooney I guess you are trying to unpack the dictionary/json data structure item?
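Expanding on that reply with a minimal sketch (the fields and data are invented): each dict in the scraped list is unpacked into keyword arguments, so the comprehension produces a list of validated Product instances.

from pydantic import BaseModel

class Product(BaseModel):              # hypothetical model for illustration
    name: str
    price: float

data = [
    {"name": "Widget", "price": 19.99},
    {"name": "Gadget", "price": 4.50},
]
products = [Product(**item) for item in data]    # each dict becomes a validated Product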
I find this video shallow and pydantic
-Peter Griffin
amazing!
Thank you!
Why not simply use a dataclass? They have type hints as well.
Ha ha, Python is moving towards Java...
Don't show off your typing skills; use copy and paste instead. I find it particularly annoying to cancel and correct your typing in my mind while you do it!
Great content, but watching someone type is annoying AF! Why not type it up beforehand, then hide and reveal lines as you go?!
Different styles, but fair enough - my typing skills aren’t the best
@validator("sku")
def sku_length(cls, value):
    assert len(value) == 7, "sku must be 7 characters"
    return value
must be better
I’ve always worked on the assumption that assert is for use in tests not code logic
@@JohnWatsonRooney fair enough
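Following up on that exchange, a minimal sketch of the raise-based alternative (Pydantic v1 validator syntax; the Product model and sku field are the same hypothetical example as above). One practical reason to avoid assert outside tests is that assert statements are stripped when Python runs with the -O flag.

from pydantic import BaseModel, validator

class Product(BaseModel):              # hypothetical model for illustration
    sku: str

    @validator("sku")
    def sku_length(cls, value):
        if len(value) != 7:
            raise ValueError("sku must be exactly 7 characters")
        return value

Product(sku="ABC1234")                 # passes validation
# Product(sku="ABC12")                 # would raise a ValidationError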