Some people hate PHP only because someone told them they should.
Thanks Aaron!
literally laughing out loud several times. absolute gold.
(partner's like "what are you watching?!" "a guy seeding a database!" 🤣)
same, I can't hide the smile on my face while watching this 😂
My first thought was to make the command single-threaded and throw GNU Parallel at it. Assuming cross-platform isn't a factor.
PHP for the win btw! It was my first love and still going strong 12+ years later.
Yeah, I was thinking the same. This way you only have to think about a single concern instead of thinking about pooling, etc.
The funny thing is that web servers that run PHP (Apache/Nginx) don't support PHP multithreading, so it can only be used from the command line lol
PHP is wonderful. Your videos even better. Keep rocking.
PHP is underrated atm, so much hate even though it's gotten quite good :)
A nominally-typed TypeScript would blow PHP out of the water. Shame we don't have such a language.
@@parlor3115 in what sense?
@@parlor3115 wym 'nominally'?
@@parlor3115 PHP can be nominally typed 🤔
they hate what they don't understand
I was doing this just last night, and I'm amazed by your video now!
1- You're totally right about not using the Hash facade here. Slower is better for hashing, so it can slow us down dramatically while seeding.
2- Using models is not that efficient here. Why didn't you use the DB facade instead?
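Roughly what I had in mind for point 2 - an untested sketch, assuming the default Laravel users table and the fake() helper:

```php
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Hash;

// Hash the throwaway password once (bcrypt is deliberately slow),
// then push plain arrays through the DB facade instead of hydrating models.
$password = Hash::make('password');
$now = now();

$rows = [];
for ($i = 0; $i < 1_000; $i++) {
    $rows[] = [
        'name'       => fake()->name(),
        'email'      => fake()->unique()->safeEmail(),
        'password'   => $password,
        'created_at' => $now,
        'updated_at' => $now,
    ];
}

DB::table('users')->insert($rows); // one multi-row INSERT, no model overhead
```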
Thanks for your great videos. I learn a lot from you.
I love both PHP and MySQL, and use them every single day, but also C# of course ^^
Same here. I just wish PHP added native async support. Even with Fibers, stuff like sleep() will sleep the main thread, and a lot of native functions like file_get_contents() are IO-blocking.
Love the fact that there is still a place for the PHP experts out there in the field.
I hope you find peace in teaching all these amazing and evolving technologies with a language you love.
True love means you love it no matter what. Even if the language itself is no longer evolving, and the ecosystem around it is in decline, you still love it as a language all by itself.
How about batch inserting directly in the database? Instead of 10,000 insert queries, use one insert query with 10,000 value sets.
I believe it is better for performance.
But this could run into limits on query length or the number of parameters per query, in which case I would cap the amount per batch.
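Something like this is what I mean - a rough sketch, with an arbitrary 2,500-row chunk size to stay under MySQL's prepared-statement placeholder limit and max_allowed_packet:

```php
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Hash;

$password = Hash::make('password'); // reuse one hash for every row

// Build the rows in memory first, then send one multi-row INSERT per chunk
// instead of one query per row. The row values are just placeholders.
$rows = [];
for ($i = 0; $i < 10_000; $i++) {
    $rows[] = [
        'name'     => "User {$i}",
        'email'    => "user{$i}@example.com",
        'password' => $password,
    ];
}

foreach (array_chunk($rows, 2_500) as $chunk) {
    DB::table('users')->insert($chunk); // one INSERT with 2,500 value sets
}
```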
It is better for performance but I like going through the model for casting and convenience
I ran some benchmarks on this a while ago - 1 million entries, batches of 2,500, 1 process, ~7 seconds & 500 MB memory usage. I got the same numbers using SingleStore local & MySQL 8.
It is faster, yes.
Do you have a PHP course? I really enjoy your teaching style, so I'm asking 🥺
Not yet!
I think one improvement is to use a single SQL query to insert multiple records with Model::insert() instead of looping and using Model::create(). This might not affect performance if the database is on localhost, but it should if it's somewhere in the cloud.
I know this is not the channel for it, but please! More Laravel content! (whether it's on your channel or Laravel's). Love your videos!
Go to his personal channel. He posts laravel content there, but agreed, definitely more laravel here too!
The only MySQL tutorials I can watch and enjoy!
I haven't heard back yet. /s
Nice vids, keep 'em coming!
Great video! One way to solve the expensive bcrypting of the password would be to do it once and set the value as a property on the class for reuse.
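Something along these lines - a sketch of memoizing the hash inside the factory class:

```php
use Illuminate\Database\Eloquent\Factories\Factory;
use Illuminate\Support\Facades\Hash;

class UserFactory extends Factory
{
    // Computed once per process, then reused for every generated user.
    protected static ?string $password = null;

    public function definition(): array
    {
        return [
            'name'     => fake()->name(),
            'email'    => fake()->unique()->safeEmail(),
            'password' => static::$password ??= Hash::make('password'),
        ];
    }
}
```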
I'm still waiting on a girl to tell me "I love PHP!", so I can reply "I love Vue!"
My email got rejected, it says your inbox is full
Why not generate a large array of data and bulk insert it at the end? For the dates, generating an array of 1B elements will take a few moments, and this will make the timestamps unique.
This can take up a lot of RAM; you would probably also need to increase the PHP script memory limit.
What about turning off database constraint checks?
This example didn't mention it, but I believe it's fairly common to insert bazillions of rows and not care about checking unique or primary keys on every insert (if you can be sure your code produces valid data).
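For MySQL that could look something like this - just a sketch, and only safe if you trust the generated data:

```php
use Illuminate\Support\Facades\DB;

DB::statement('SET foreign_key_checks = 0'); // skip FK validation during the seed
DB::statement('SET unique_checks = 0');      // let InnoDB defer uniqueness checks on secondary indexes

// ... run the bulk inserts here ...

DB::statement('SET unique_checks = 1');
DB::statement('SET foreign_key_checks = 1');
```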
yes that's what I would do if testing on relational data
Great example of why I use epoch timestamps for my date times in every database.
I use DATETIME, but I store it in UTC (MySQL doesn't store the timezone) and let PHP's DateTime class figure it out for me.
DATETIME has a lot of query features, and you can easily read dates when working directly with the DB.
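i.e. something like this (sketch, assuming $row came straight from a query and created_at was written as UTC):

```php
// MySQL returns the DATETIME as a plain string; tell PHP it's UTC,
// then convert to whatever timezone the app needs.
$createdAt = new DateTimeImmutable($row->created_at, new DateTimeZone('UTC'));

echo $createdAt
    ->setTimezone(new DateTimeZone('Europe/Amsterdam'))
    ->format('Y-m-d H:i:s');
```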
An interesting idea I used a couple of times in the past to seed a lot of records was a cross join (INSERT INTO ... SELECT ... CROSS JOIN ...). It might be for a very specific situation, but if you find a use for it in seeding and have a lot of initial data, then boy, it goes brrrrr
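The rough shape of it, as a sketch - seed_values is a hypothetical helper table of N rows, so the cross join explodes into N x N users, and the password is a precomputed placeholder:

```php
use Illuminate\Support\Facades\DB;

$sql = <<<'SQL'
INSERT INTO users (name, email, password, created_at, updated_at)
SELECT CONCAT('user_', a.n, '_', b.n),
       CONCAT('user_', a.n, '_', b.n, '@example.com'),
       '<precomputed-bcrypt-hash>',
       NOW(),
       NOW()
FROM seed_values AS a
CROSS JOIN seed_values AS b
SQL;

DB::statement($sql);
```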
You can insert 50K rows in one second into MySQL with the LOAD DATA INFILE command. Yes, this needs a little bit of configuration on the MySQL side, but it is possible.
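For example, something like this - a sketch that assumes local_infile is enabled on the server, PDO::MYSQL_ATTR_LOCAL_INFILE is enabled on the client connection, and a pre-generated CSV whose columns match the list:

```php
use Illuminate\Support\Facades\DB;

$csv = storage_path('app/users.csv'); // hypothetical path to the pre-generated CSV

DB::connection()->getPdo()->exec(
    "LOAD DATA LOCAL INFILE '{$csv}'
     INTO TABLE users
     FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
     LINES TERMINATED BY '\\n'
     (name, email, password, created_at, updated_at)"
);
```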
Yep! I'm definitely trading raw speed for convenience. The faker library and the ORM interface make my life super easy
PHP has its own well-deserved place in software development. Yes, it's not for everyone and everything; yes, it has a bad reputation because it used to be very loose with its syntax and types, but since PHP 8 it's really a mature language.
I didn't realize Laravel was that powerful. Nice. :) Yes, I like PHP. :)
I never use frameworks, as they take up a lot of space and time to set up.
For this I just use PCNTL and raw MySQLi. BTW, love your videos on MySQL, learned a lot!
If you don’t use frameworks you always end up making your own. Otherwise you’ll stay inefficient.
@@invinciblemode that’s true though 👍
Setting up Laravel is done in seconds: 'composer create-project laravel/laravel example-app' and then 'php artisan serve'
Question: why use an external service like PlanetScale if I already have a MySQL database with my hosting? How secure is it? I mean, is moving traffic from one place to another secure? The queries? The data?
What's the font used in your terminal? It's so clean
Great video Aaron! PHP for Life! ❤❤❤
With that generic "password", would it be better to make a constant of the hashed password and then use that in the function? Also, make the "created at" before the "email verified", and then "email verified" can be some random number of hours after "created at".
I thought he was gonna do that actually, since he mentioned it was expensive. Calculate once and reuse. I would imagine it would be particularly useful if you wanted to log in as random users to do things, because the login code is gonna compare hashed values.
I'm a PHP/Laravel dev and that 'use (...)' is the most annoying thing ever 😅
When are they adding the full syntax fn () => { // 😥 }
@Planetscale - why do you prefer PHP? Is it something in particular or just what you’ve got the most exp / comfort with?
I prefer it because I prefer Laravel. PHP isn't bad, but if I didn't have Laravel I'd probably be a Ruby guy.
Is this the default robbyrussell prompt?
Must have seen a lot of your videos lately, because I saw you in my sleep too 😅
Tell me I said hi!
I love how you provide a fake email for PHP critics xD
Nice one as always :D
I use a bunch of common seeds like 90% of the time, when I need consistent stuff e.g.
customers + addresses + orders(+items), refunds(+items) etc and want them to have proper relations and consistent timestamps + pk's.
For date(times) I just do something like this:
$from = strtotime('2020-01-01');
$until = strtotime('2023-12-01');
$sets = 1_000_000;
$stepSize = (int) (($until - $from) / $sets);
and then in the loop: `$createdAt = date('Y-m-d H:i:s', $from += $stepSize);`
So is it safe if I use this pool to backfill data in production?
What's the impact anyway? It's new for me.
I usually use chunking to backfill data if there's too much of it, but it's very slow for sure.
In this case, I use Laravel.
How do you choose the number of processes anyway? It can't just be whatever you like, right?
Make the database once, dump it out to an sql.gz file and then when you need a fresh one you can drop the database and import your clean one.
Sorry for this question, but what's your terminal font, sir?
I get deadlocks when seeding my DB with a transaction which does:
1. insert 1 row into table 1
2. insert multiple rows into table 2
3. delete multiple rows from table 3
4. insert multiple rows into table 3
Multiple transactions in parallel don't affect the same rows - there's no overlap except for step 2, but it seems it's step 3 or 4 that causes the deadlock. Using Drizzle ORM - not sure if that matters, though.
Not a big deal, but I would be glad to at least have an intuition as to why that happens - I have to run the seeding sequentially for now :(.
What does the phrase "yonk this" mean? I understand that we are taking part of the code. But what is "yonk"?
Never heard yonk before. But vi has a yank command (copies text)
Yank, steal, grab, pluck. Something like that!
@@PlanetScale Thank you! I'm still learning English and I heard this word for the first time
In vim, the shortcut for copy is 'y', which I've been told stands for yank
You should look at using the GNU Parallel command to run multiple processes.
Maybe! What would that do for me in this setup?
@@PlanetScale Less code that needs to be manually written. No need to hand-write the pool code, decide how many processes to run, and so on - you write the base task and let Parallel handle all of that.
I did this exact thing, but was using queues and jobs - this seems way tidier 🎉
Love it.
love it
What's that db viewer you're using?
😂
TablePlus
@@eleftrik Thanks. I just use the mysql CLI, but I might try that.
Thanks
How can I insert fake data into multiple tables that have various relationships, like one-to-many and many-to-many?
Just add all the related data (blog posts, comments, etc) as you're inserting the user
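With factories, that can look something like this - a sketch that assumes App\Models\User, Post, and Role each have a factory, and that roles are attached through a pivot table:

```php
use App\Models\Post;
use App\Models\Role;
use App\Models\User;

User::factory()
    ->count(100)
    ->has(Post::factory()->count(5))          // one-to-many: each user gets 5 posts
    ->hasAttached(Role::factory()->count(2))  // many-to-many: 2 roles via the pivot table
    ->create();
```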
Will this approach work with SQLite?
Sure!
This is why I love Laravel.
Next time, use Laravel's Model::insert() method to insert many table rows at once - it's so much faster than create(). Do note, though, that auto-filled properties of a model, like created_at, won't be set.
That's why I like going through the model!
What about Model::make() and then a batch insert? That still goes through the model and uses a batch insert.
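A sketch of that idea - since insert() doesn't auto-fill timestamps, they're passed to make() here:

```php
use App\Models\User;

$now = now();

User::factory()
    ->count(10_000)
    ->make(['created_at' => $now, 'updated_at' => $now]) // casts/mutators still run
    ->map(fn (User $user) => $user->getAttributes())     // raw column => value arrays
    ->chunk(2_500)
    ->each(fn ($chunk) => User::insert($chunk->all()));  // one multi-row INSERT per chunk
```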
Before watching the video, I am assuming you will be using model factories from Laravel.
🧠
@@PlanetScale I didn't know Laravel had that pool function. That's so cool.
I sent an e-mail to the address mentioned at the end of the video. Why haven't I heard back from you yet?
Hmmm weird. Try again!
I just did, and NOTHING! Smells fishy @@PlanetScale
PHP stonks 📈📈
Ah yes. Famously, no other modern languages have closures. PHP so good
Nice one
Why doesn't db:seed do this by default? Any downside to it?
You can't accept arguments in a seeder.
MySQL + PHP = Aaron
The joke with the email earned you a like and a comment 😂
I hated PHP but was forced to work with it. Then I picked up Laravel, and it showed me how an ugly language like PHP can be wonderful in some ways. My perspective on PHP has changed ever since, from personally bad to not quite as bad as it sounds.
You don't have to hate PHP in order to love another language. Laravel alone shows how well PHP is doing nowadays.
Wut!? 10_000! I've been programming PHP for years but did not know that one!!
Kinda wild right
Pretty cool mate, I do the same!
Very jealous of your vanity address 😂
No, PHP is not stupid. It is perfect for small websites with less code.
🎉🎉🎉❤❤❤
(P)re (H)istoric (P)rogramming
👍👍
email sent
Hey Aaron do you drive a lamborghini?
Kind of! It's a Honda Odyssey minivan
Or, just access consumer data and use that 🧠
Hire me for more such pragmatic solutions 😎
I fail to understand people who hate PHP 😅
PHP is fantastic; PHP haters are just ignorant
😂
1k like
So many likes
Second!
This one I don't like. Again, people are so framework-dependent it's horrible. You can do this easily with plain PHP. People are confusing PHP with Laravel, and that's bad; it's the same with JS and React or Next. Frameworks are useful for specific projects, but for small things, why be framework-dependent? I don't understand.
That's ok! Hopefully you'll enjoy the next videos more.
👍👍