Thanks for this, I just went through a very similar exercise on production tables before finding this video and almost everything you covered overlapped with my observations. I wasn't creating temporary tables from underlying production data but was generating table copies in the same transaction as the tests, so your comment about vacuum (+analyze) not running within the first minute or so explained why I needed to add an analyze statement to pick an appropriate join strategy, other than e.g. merge joins. I've also experienced joining my own (large) column metadata tables to pg_catalog tables e.g. pg_class and have seen that blow up with a nested loop. I'm guessing you can't add an index or run analyze on those system tables (need to check) so the only route open to me was to encourage the optimiser to use hash joins with enable_nestloop=off
Great talk thanks for sharing
Thanks for this, I just went through a very similar exercise on production tables before finding this video and almost everything you covered overlapped with my observations. I wasn't creating temporary tables from underlying production data but was generating table copies in the same transaction as the tests, so your comment about vacuum (+analyze) not running within the first minute or so explained why I needed to add an analyze statement to pick an appropriate join strategy, other than e.g. merge joins. I've also experienced joining my own (large) column metadata tables to pg_catalog tables e.g. pg_class and have seen that blow up with a nested loop. I'm guessing you can't add an index or run analyze on those system tables (need to check) so the only route open to me was to encourage the optimiser to use hash joins with enable_nestloop=off
What happens if you set vacuum/analyze script that runs weekly, do have to tune autovacuum