Can using dropout during inference be another way to set the temperature and perform sampling? E.g., if training used a 10% dropout rate, why not apply a similar random dropout during inference? The neurons that get zeroed out could be chosen from some distribution, for example selected uniformly, weighted toward the earlier layers, or targeting attention heads in specific layers. One might expect the resulting token distributions to be more varied than what beam search alone could find.
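A minimal sketch of the idea in the comment above (not from the lecture): keep the dropout layers active at inference so repeated forward passes use different masks, in the spirit of Monte Carlo dropout, and sample a token from each pass. The toy model, vocabulary size, and rates below are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, hidden = 100, 32
model = nn.Sequential(
    nn.Embedding(vocab_size, hidden),
    nn.Dropout(p=0.1),          # same 10% rate as assumed during training
    nn.Linear(hidden, hidden),
    nn.ReLU(),
    nn.Dropout(p=0.1),
    nn.Linear(hidden, vocab_size),
)

model.eval()
# Re-enable only the dropout modules so everything else stays in eval mode.
for m in model.modules():
    if isinstance(m, nn.Dropout):
        m.train()

token = torch.tensor([42])
with torch.no_grad():
    for i in range(3):
        logits = model(token)                 # a different dropout mask each pass
        probs = torch.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        print(i, next_token.item())
```

Each pass yields a slightly different distribution over the vocabulary, so the stochasticity comes from the dropout masks rather than from a softmax temperature.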
Great lecture! Thanks!
Ah, interesting… I had wondered about the distinction between NLU and NLP and now it makes sense! Cheers!
Great lecture!
Great lecture! :)
Hi there, based on the schedule on your official course website, maybe this should be lecture 10, and "Prompting, Reinforcement Learning from Human Feedback" (by Jesse Mu) should be lecture 11?
Yeah, it seems there is a mistake.
thanks!!
The TA seems to attend Ng's class a lot; he imitates "ok cool" a lot. 😀
1:02:05