Not using a validation split is actually a pretty big difference. For some additional perspective, adding more training examples typically matters more than model architecture differences. You shouldn't use the split in TensorFlow if you want to make a comparison like this, even if you explain your reason for it.
Yeah, ditto, this is actually a huge difference in the comparison
Exactly! I wonder why he did this, since I was really intrigued to see the results :(
I don't see why the results should be different between the two if everything else is the same. After all, the math to compute the weight update is the same. I was much more interested in the speed comparison. But even for that, I think it would be better to try some larger models and run them multiple times using state-of-the-art features, instead of just simple examples...
@@loftyTHEOWNER Overfitting. The TensorFlow model could learn the training data really well but not be as good on the test data. If the validation data had a picture of a red car and there was no red car in the training data, then the model may get confused, since it may have overfitted and relied heavily on the color of the car. In general, you almost always want a validation split, as it catches overfitting early on.
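For reference, a minimal sketch of what that looks like in Keras (the toy data and model here are my own stand-ins, not the video's):

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins for the real dataset (hypothetical shapes).
x_train = np.random.rand(1000, 32, 32, 3).astype("float32")
y_train = np.random.randint(0, 10, size=(1000,))

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Hold out 10% of the training data; the per-epoch val_loss/val_accuracy
# make overfitting visible early.
model.fit(x_train, y_train, epochs=3, batch_size=64, validation_split=0.1)
```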
He didn't even control the order of the images used for training, which differs between runs. You need averages over multiple retrainings to quantify performance.
The results are also different because the initialization of the weights and biases is random. Unless you make them identical you can never expect to get the same results.
If you used all the data to train the torch model, there is no point in comparing metrics
+ you should set a fixed random seed as well
Saying that PyTorch is better than TF by just looking at the results of one run? Deep learning models have a random component … so this result is completely led by chance. When I saw the title of the video, I thought it was about doing a real comparison regarding usability, memory and GPU consumption, etc.
I don't think this video was meant to be a benchmark of the two frameworks; he literally said "performance wise both frameworks are on the same level"
I think you missed the point, my guy
It looks like this video was meant for newbies who don't even know the basics yet, or even what CUDA is. So no wonder that this is only "oh look, how the code looks"
What about weight initialization? ... That could affect the performance pretty dramatically in some cases
Precisely. Running TF twice won't end up with the same result either, unless one sets a fixed random seed.
As a beginner, TensorFlow feels way easier; PyTorch looks and feels like some deep engineering. I guess I might have to learn both, but I still prefer TensorFlow, it feels like it's available for everyone
Yeah, even I started off with TensorFlow, seeing how easy it was to use, but most research projects are implemented in PyTorch. Since I had to analyze a lot of open-source research code during my projects, it was critical to know PyTorch as well, in addition to TensorFlow
Now that ChatGPT can write all the code you need, I feel there is not much difference anymore. But for more complex stuff, I feel PyTorch becomes easier, because you don't need to go and modify classes with some weird inheritances. I just wrote some custom classes for a learning rate schedule and early stopping in PyTorch and it completely felt like normal Python...
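To give an idea, an early-stopping class along those lines can be as plain as this (a sketch with names of my own choosing, no framework hooks needed):

```python
class EarlyStopping:
    """Stop training when the validation loss stops improving."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum change that counts as improvement
        self.best_loss = float("inf")
        self.counter = 0

    def step(self, val_loss):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience

# Usage inside an ordinary PyTorch training loop:
# stopper = EarlyStopping(patience=5)
# for epoch in range(max_epochs):
#     val_loss = validate(model, val_loader)  # your own validation function
#     if stopper.step(val_loss):
#         break
```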
Nice video! Just want to add, at 2:09, there are 3 methods to create the TF model. I think you missed mentioning the functional API, which is intermediate between Sequential and subclassing
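For anyone curious, the functional API looks roughly like this (a toy model of my own, not the one from the video):

```python
import tensorflow as tf

# Functional API: build the graph by calling layers on tensors.
# More flexible than Sequential (branches, multiple inputs/outputs)
# without requiring full subclassing.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(inputs)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(10)(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.summary()
```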
Comparing accuracy doesn't make sense here. First, the initial weights are random. Second, PyTorch was trained on the full training dataset. If anything, this just shows TF is much, much faster to get a model ready. PyTorch is only good if you are getting paid for finding new stuff.
He said the result could be skewed; it's not "could be", it was actually skewed. Imagine doing this type of comparison in normal research!!
Yes, I mentioned that the results could be skewed and that performance is on the same level. Showing the accuracy should mainly demonstrate that the loss and accuracy values lie in the same range :)
@@rogerab1792 And create a bunch of bugs along the way? PyTorch and TensorFlow are heavily tested and verified tools. Trusting your academic reasoning AND your engineering skills? Just no
@@rogerab1792 I don't think it's VERY useful, I think it's fun. It's like recoding libc and the C++ containers. They already exist, but we recode them to prove something to ourselves. Not useful.
@patloeber Hi sir, can you kindly provide the PyTorch and TensorFlow code links in the description?
What about training and evaluation time? Which one needs less?
I was thinking the same thing.
Use the ONNX format for serving, then it doesn't matter if it's TF or PyTorch
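e.g., the PyTorch-side export is just a few lines (a sketch with a stand-in model, assuming the onnx package is installed):

```python
import torch
import torch.nn as nn

# Any trained model works; this tiny one is just a stand-in.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
model.eval()

dummy = torch.randn(1, 3, 32, 32)  # example input with the expected shape
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["input"], output_names=["logits"])
# model.onnx can now be served with ONNX Runtime, independent of the
# framework that trained it.
```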
A YouTube comment section with well-thought-out, intelligent comments making valid arguments in favor and opposition! 🤯 Great input worth pondering!
Thank you SO MUCH for making this video! I'm transitioning from TF to PyTorch and REALLY needed a "same model, both frameworks" video like this one!
No JIT or mixed-precision optimizations? Comparing the accuracy of the same model on different frameworks and not understanding that the difference is due to the randomness of the initialization? Weird video.
Using all data without a validation split was the biggest factor in the difference.
Given enough time and a simple task, those networks are very likely to still end up near the same local minimum. But I agree that initializing all weights and biases to the same values would be cooler.
Kind of a pity, I really wish I could see the experiment properly repeated
As far as I can tell, the research community heavily favors PyTorch these days, at least for the field I'm in (NLP). Mostly because of Hugging Face and fairseq, I think 😂
Also, with TensorFlow and Keras, even with eager mode enabled, the error messages are not as helpful as PyTorch's. Another thing I like about torch is that everything inherits from nn.Module, unlike in Keras, where Model and Layer are two separate classes.
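A toy example of what I mean (my own, minimal on purpose): a "layer" and a "model" are the same kind of object and compose freely.

```python
import torch
import torch.nn as nn

class MyLayer(nn.Module):          # a "layer" is just an nn.Module
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.linear(x))

class MyModel(nn.Module):          # ...and so is a "model"
    def __init__(self):
        super().__init__()
        self.block = MyLayer(16)   # modules nest inside modules
        self.head = nn.Linear(16, 10)

    def forward(self, x):
        return self.head(self.block(x))

print(MyModel()(torch.randn(2, 16)).shape)  # torch.Size([2, 10])
```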
TF is simpler for building well-researched models, and PyTorch has better data loaders and lower-level controls for the cutting edge.
In my experience, because torch tensors can be manually placed on the CPU or co-processors, PyTorch can be significantly faster than TF when managed correctly. Newer versions of Nvidia's VideoProcessingFramework even provide bindings for direct conversion between video and PyTorch tensors on the GPU, so working with video is real-time
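Roughly what I mean by manual placement (a sketch; pinning host memory only makes sense on a CUDA build, hence the guard):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Explicit placement: you decide when tensors cross the host/device boundary.
x_cpu = torch.randn(64, 3, 224, 224,
                    pin_memory=torch.cuda.is_available())  # page-locked host memory
x_dev = x_cpu.to(device, non_blocking=True)  # async copy when source is pinned

model = torch.nn.Conv2d(3, 8, 3).to(device)
y = model(x_dev)      # result stays on the device...
result = y.cpu()      # ...until you explicitly copy it back
```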
Do you find PyTorch to be better for NLP projects? I got in with PyTorch because our engineers from Stanford were taught in it.
@@doords I would say so, just because most of the recent work that publishes its code as open source is in PyTorch, except for Google's papers (for obvious reasons).
It's similar to recommending React. It's not that it is dramatically better than its competitors, just that it is more popular in the community now, which means you can get more help, a more established codebase/libraries, etc.
@@Imboredas That is good to know. By the way, which online communities do you recommend in case we run into problems working on NLP projects?
great vid, it was cool for me to see what initializing and training a model in PyTorch looks like
Great video! But one observation: you should compare model creation using TensorFlow in its advanced (subclassing) mode, because PyTorch is object-oriented.
Great video and editing, you covered a lot of ground in 13 minutes.
thank you :)
Why didn't you include PyTorch Lightning, as you included Keras for TensorFlow? It would have been a better comparison, and in my usage Lightning beats Keras in terms of ease of use.
In this video, you basically did Keras vs PyTorch.
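For comparison, a minimal LightningModule sketch (a toy example of my own; the Trainer then owns the loop, device handling and checkpointing):

```python
import torch
import torch.nn as nn
import pytorch_lightning as pl

class LitClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
        self.loss_fn = nn.CrossEntropyLoss()

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = self.loss_fn(self.net(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# trainer = pl.Trainer(max_epochs=10)
# trainer.fit(LitClassifier(), train_dataloader)  # train_dataloader: your DataLoader
```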
He can't cover in depth everything in one video! Please be patient.
Which framework do you like the most?
I have the feeling that every single one is wrong
agree 100%
Hello there, how do you set up both PyTorch and TensorFlow on the same machine using one CUDA toolkit? Or in separate environments sharing the same CUDA toolkit?
Thank you.
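Not the asker, but in my experience separate environments usually just work, since the pip wheels tend to bundle their own CUDA runtime; a quick sanity check that both frameworks see the GPU:

```python
import torch
import tensorflow as tf

# Run inside each environment to verify GPU visibility.
print("PyTorch CUDA available:", torch.cuda.is_available())
print("TensorFlow GPUs:", tf.config.list_physical_devices("GPU"))
```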
I think you should have used nn.Sequential in PyTorch, or the GradientTape approach in TensorFlow, for a better comparison.
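For reference, a GradientTape training step looks roughly like this (a sketch with a toy model, not the video's code):

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(x, y):
    # Record ops for autodiff; the explicit gradient/apply pair mirrors
    # PyTorch's loss.backward() + optimizer.step().
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

x = tf.random.normal((8, 32))
y = tf.random.uniform((8,), maxval=10, dtype=tf.int32)
print(train_step(x, y).numpy())
```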
Nice and clean presentation in 13 minutes. Great video, thank you.
Can I use "crossentropy" as a metric, as we also do for the loss?
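In Keras at least it seems you can; the same quantity can be passed as a metric alongside accuracy, e.g.:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="softmax")])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    # Track the crossentropy as a metric too, next to accuracy:
    metrics=["accuracy", tf.keras.metrics.SparseCategoricalCrossentropy()],
)
```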
Wow. Great video doing the side by side. Momentum and adoption seem to favor PyTorch and, in general, many FB frameworks over Google frameworks. I could definitely see that if you're a very good Python developer (or want to be), PyTorch would be a very "easy" choice. TensorFlow seems very easy for the "incremental" command workflows that many data scientists use, especially if they aren't really good with OOP.
So in essence, TensorFlow feels a little bit like R for Stats, a "researcher first" tool... you get a ton of stuff just included, whereas with Python, you have to build and customize all your outputs, which feels more SWE driven than Researcher/Academic driven, e.g. like R.
JAX is very flexible but also fast, faster than PyTorch, and you can still run on TPUs. Yes, TPU support has now been added to PyTorch, but that was done recently; I tried it, and it's still rough.
Which software should I use for both?
For a JavaScript developer like me, TensorFlow/Keras is easier to learn because its syntax is almost the same as JS
You would be writing TensorFlow-style Python as if it were JavaScript?! Please don't.
@@Bengt.Lueers there is TensorFlow.js, I think that's what he meant
In PyTorch, in the fully connected layer, why was it multiplied by 4*4?
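Not sure which exact architecture the video used, but that multiplier is the feature map's height times width right before flattening; an easy way to check is to print the shape with a dummy input (the conv stack below is hypothetical):

```python
import torch
import torch.nn as nn

# Run a dummy input through the conv/pool stack and inspect the shape
# just before flattening (layers here are hypothetical).
features = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=5), nn.MaxPool2d(2),
    nn.Conv2d(16, 16, kernel_size=5), nn.MaxPool2d(2),
)
out = features(torch.randn(1, 3, 32, 32))
print(out.shape)  # channels * height * width = in_features of the first Linear
```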
Comparing two frameworks on different levels of abstraction is not a fair comparison.
From another YouTube video I learned that PyTorch is more commonly used for computer vision and NLP than for "traditional" data (numbers, categories, etc.). What do you say about using PyTorch with traditional data?
On one hand, Python programmers want to write less code, but here they want to write more code. How?
I don't think a performance test based on accuracy and loss is appropriate, because you didn't set a seed for Python or the random package, so the initial state of the neuron weights will be different. Anyway, I still think this is a great video, thanks for your effort.
Agree. Cross-validation and averaging across multiple runs could've solved this issue.
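Something like this would already be a fairer protocol (a sketch; train_and_eval is a placeholder for the full pipeline):

```python
import numpy as np
import torch

def train_and_eval(seed: int) -> float:
    """Placeholder for the full train/test pipeline; returns test accuracy."""
    torch.manual_seed(seed)
    np.random.seed(seed)
    # ... build the model, train, evaluate on the test set ...
    return 0.0  # replace with the real test accuracy

# Report mean and spread over several seeded runs instead of one lucky run.
accs = [train_and_eval(seed) for seed in range(5)]
print(f"accuracy: {np.mean(accs):.3f} +/- {np.std(accs):.3f}")
```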
Where can I see the full code?
The answer, like with many other topics in computer science: it depends
Very informative and crisp video. Love the effort put into it. Keep up the good work !
Super helpful information and very clearly communicated
All the randomness needs to be under control: seeds for Python and CUDA, weight initialization, the optimizer, etc.
How about another head-to-head benchmark on Apple M1 Pro/Max?
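Roughly the checklist I'd run before any head-to-head (and even this doesn't guarantee bit-identical results across frameworks):

```python
import random
import numpy as np
import torch
import tensorflow as tf

SEED = 42
random.seed(SEED)                  # Python's built-in RNG
np.random.seed(SEED)               # NumPy (shuffling, some weight inits)
torch.manual_seed(SEED)            # PyTorch CPU RNG
torch.cuda.manual_seed_all(SEED)   # PyTorch CUDA RNGs
torch.backends.cudnn.deterministic = True   # deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False      # no nondeterministic autotuning
tf.random.set_seed(SEED)           # TensorFlow
```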
Very good explanation
I think both are pretty good ;-)
Thank you very much for this comparison !!!
Does anyone even use R in the real world? It has a TF option as well
Good overview. TU.
I employ both frameworks and often prototype on Apple Silicon as well as Intel + NVIDIA...
Which is better for a beginner?
Flip a coin and go with it either way. Both have plenty of tutorials, Q&A forums, etc. and you can't go wrong with the one you choose. Just stick with it for long enough you can 'think' in it and have done a small project or demo worthy of showing others before diving into the other.
Great video. Thanks!
With my little experience (mostly with torch), my opinion is biased, but I felt torch is much more user-friendly when debugging, whereas TensorFlow can give you hypertension at times.
Very nice video. Thanks
Thanks man for this amazing tutorial...
Possibly the most important decision we make without knowing it! ^_^
Hi,
Aishwarya here. I was looking around for data scientists and found your profile.
Currently I am working on a 3D CT reconstruction project that uses TensorFlow. I am facing a few errors in the code and am unable to arrive at a result. Could you please help me with this if possible?
TF was super confusing when they introduced 2.0
love learning by doing! love this channel!
Thank you.
what about MLflow?
It could be the random seed
I haven't tried TF, I use Pytorch
You should try TF.
Perfect!
Thanks for all the great info. Not sure the results are meaningful; if you're not doing a 1:1 test, it's not very informative
Tensorflow is much better and well polished imo
It’s basically dead
@@maelstrom254 you're trolling, right? xD
@@elyasaf755 look at usage statistics. Plus I personally don’t know who would prefer Tensorflow (we have several floors of research department)
#TeamPytorch
If you are insecure, guess what? The rest of the world is, too. Do not overestimate the competition and underestimate yourself. You are better than you think.
I miss your channel's previous name
what was it
The entire content and narration are done through an AI voice, and a few screens are incorporated manually...
print('Thanks')
this video was unfair and inconclusive
Jesus Christ 😭🙆
Are you giving a session or just reading something out? Sorry... teaching is not reading something out fast like this!
So bad