48:10 "In the general problems, neural nets have been out of fashion for a while..." I assume he meant that back in 2015, when this was presumably recorded and there weren't as many useful deep learning packages as we have today, you were better off using the tree-based models he talked about so much. Anyone disagree? Are neural nets still inferior to boosted trees for non-image/NLP problems?
Good point. That comment would be outdated by now. Anyway, from what I've seen, GBM variants (e.g. XGBoost) are still the best choice in many tabular cases (not always, but often).
Brazilian too?
@guilamacie we're all good, bro!! Adding to my comment (which now seems to be a year old): we've been using neural nets with great success in click prediction for online advertising (RTB). They came out several points better than all the other models we tried, and we tried quite a few. The data is fairly tabular (a training set in the tens of millions of observations), but several features have cardinality approaching 1,000,000 distinct values. All of the high-cardinality features are encoded using embeddings. Also, with so much data, training on all (or most) of the available training set would be painful if the framework could not leverage hardware accelerators. PS: Palmeiras has no world title :)
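For anyone wondering what "encoded using embeddings" looks like in practice, here is a minimal pure-Python sketch, not the setup described above. The feature names (`user_id`, `site_domain`), the `EmbeddingTable` helper, the embedding size, and the fixed weights are all assumptions made up for illustration. It only shows the forward pass: each distinct value gets a row in a lookup table, and the rows are concatenated into a small dense input. In a real model the table rows and weights would be learned by backpropagation in a deep learning framework, ideally on an accelerator.

```python
import math
import random


class EmbeddingTable:
    """Hypothetical helper: maps raw categorical values to dense vectors."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.index = {}   # raw value -> row number
        self.rows = []    # row number -> list of floats
        self.rng = random.Random(seed)

    def lookup(self, value):
        # A value seen for the first time gets a new, randomly initialised row.
        if value not in self.index:
            self.index[value] = len(self.rows)
            self.rows.append(
                [self.rng.uniform(-0.05, 0.05) for _ in range(self.dim)]
            )
        return self.rows[self.index[value]]


def predict_click(example, tables, weights, bias):
    # Concatenate one embedding per categorical feature,
    # then apply a single logistic layer on top.
    x = []
    for name, table in tables.items():
        x.extend(table.lookup(example[name]))
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Assumed feature names and sizes, for illustration only.
tables = {
    "user_id": EmbeddingTable(dim=4, seed=1),
    "site_domain": EmbeddingTable(dim=4, seed=2),
}
weights = [0.1] * 8  # untrained placeholder weights (4 + 4 inputs)
example = {"user_id": "u_918273", "site_domain": "example.com"}
p = predict_click(example, tables, weights, bias=0.0)
```

The point is that a feature with a million distinct values becomes a million-row table of short vectors instead of a million-column one-hot input, which is what keeps the model's input small.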
Anyone know where to get the PDF of the presentation?
I searched and found that you can download the slides at this link: www.slideshare.net/ShangxuanZhang/winning-data-science-competitions-presented-by-owen-zhang
Thanks for sharing
Great!