Hey James, will you be sharing the notebook?
How can we put two sentences closer together in vector space? Can we use this approach?
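One common approach (a minimal sketch, assuming the sentence-transformers library; the model name and sentence pair are placeholders): fine-tune on the pair with a target cosine similarity of 1.0 so the two embeddings get pulled together.

```python
# Sketch: pull a pair of sentences together by fine-tuning with
# CosineSimilarityLoss and a target similarity of 1.0.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('bert-base-nli-mean-tokens')

# Pairs you want close in vector space, labeled 1.0
train_examples = [
    InputExample(texts=['How old are you?', 'What is your age?'], label=1.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```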
In the paper they talk about "trainable" weights for the softmax classification. Can you not access those weights after training from the SBERT model (e.g. to make predictions)? Or is cosine distance the only way to use the model?
So you don't update the weights of `ffnn`?
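For anyone wondering the same: a minimal sketch (assuming the sentence-transformers `SoftmaxLoss` is used) showing that the trainable head lives on the loss object rather than on the saved model, so it is still reachable after training. In practice, though, the SBERT paper discards this head at inference and uses cosine similarity on the embeddings instead.

```python
# Sketch (assumes sentence-transformers' SoftmaxLoss): the trainable softmax
# head is stored on the loss object, not on the saved SentenceTransformer,
# so keep a reference to the loss if you want those weights afterwards.
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer('bert-base-uncased')
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,  # e.g. NLI: contradiction / entailment / neutral
)
# ... model.fit(train_objectives=[(train_dataloader, train_loss)], ...) ...

head = train_loss.classifier   # a torch.nn.Linear
print(head.weight.shape)       # (num_labels, 3 * embedding_dim) with the default flags
```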
@james Briggs: I'm not clear on this aspect: when training the PyTorch way, you perform the concatenation operations explicitly in your code, but when training with the Sentence Transformers framework we don't see any concatenation. Is it being handled automatically by the library?
Yes, that's right, the library handles it automatically.
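For reference, a sketch of the relevant knobs: `SoftmaxLoss` builds the (u, v, |u - v|) concatenation internally, controlled by these flags (shown with their defaults), so no explicit `torch.cat` appears in user code.

```python
# Sketch: SoftmaxLoss handles the concatenation from the SBERT paper itself.
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer('bert-base-uncased')
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
    concatenation_sent_rep=True,              # include u and v
    concatenation_sent_difference=True,       # include |u - v|
    concatenation_sent_multiplication=False,  # optionally also u * v
)
```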
@jamesbriggs Thank you so much for the clarification and the wonderful tutorial. One more follow-up question: in the PyTorch implementation you added an FFNN after the concatenated tensor, but for Sentence-BERT there is no dense layer after the pooling operation. Is my understanding correct?
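For context, a sketch of how the modules compose in sentence-transformers: by default the saved model is just a transformer plus pooling (the classification FFNN belongs to the training loss, as above), and a dense layer is an optional extra module.

```python
# Sketch: default SBERT composition is transformer + pooling, with no dense
# layer; one can be appended explicitly as an optional module.
from torch import nn
from sentence_transformers import SentenceTransformer, models

word_embedding_model = models.Transformer('bert-base-uncased')
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())

# The common SBERT setup: no dense layer after pooling.
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Optional dense layer, if you want one:
dense_model = models.Dense(
    in_features=pooling_model.get_sentence_embedding_dimension(),
    out_features=256,
    activation_function=nn.Tanh(),
)
model_with_dense = SentenceTransformer(
    modules=[word_embedding_model, pooling_model, dense_model]
)
```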
Why don't you freeze the layers?
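(In case it helps anyone: freezing in PyTorch is just a `requires_grad` toggle. A minimal sketch below, not code from the video.)

```python
# Sketch: freeze the pretrained transformer so its weights are excluded
# from gradient updates; only layers added on top would then train.
from transformers import AutoModel

bert = AutoModel.from_pretrained('bert-base-uncased')
for param in bert.parameters():
    param.requires_grad = False  # the optimizer will skip these parameters
```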
Hello James,
I tried to fine-tune the model. I have a 3050 with 4 GB, and when I try to fit the model with a batch size of 16 I get a CUDA out-of-memory error. I ran your exact code from the video; the only difference is that my data is just 5,000 rows. Could you please advise how to solve this?
I had the same issue. I reduced the data size and then it went through. I am wondering how to fix the problem.
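Some common workarounds for OOM on a 4 GB card (a sketch that assumes the `model`, `train_examples`, and `train_loss` objects from the video's setup are already defined): shrink the batch, shorten the sequence length, and enable mixed precision.

```python
# Sketch: memory-saving knobs, assuming `model`, `train_examples`, and
# `train_loss` are already defined as in the video.
from torch.utils.data import DataLoader

model.max_seq_length = 128  # truncate long inputs to cut activation memory

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=4)  # down from 16
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    use_amp=True,  # fp16 mixed precision roughly halves activation memory
)
```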
Love this intro!
Can you make that title card into a PNG I can use for my background?
Another great video! Thanks James! I'm hoping my team can take your Udemy course soon!
That's awesome, it's included in Udemy for Business too :)
Take me on your team!