Hey man, this was just awesome. Can you recommend the prerequisites I need to study before this? Should I take a deep learning course to understand it? Can you please show me the path for it? Subbed.
thank you sir
absolutely amazing
Thanks
You're doing great, bro
Thanks for the support
Why couldn't it run on a Colab T4? The problem occurs when installing accelerate.
Can a multimodal LLM be fine-tuned for sentiment analysis?
please make a video about Qwen2-Math
Soon
Perfect😍
Thanks
Brother, can this work with any other vision model, or is it specific to the Qwen family?
Qwen family mostly. For others, you would need to change the code a bit.
Brother, can you fine-tune Llama 3.1 8B for a Hindi translation task?
I am confused about one thing. Is it transcribing the video into text and then answering based on that, or is it splitting the video into frames and using its vision capabilities to understand the images?
It's using vision to understand directly from the image data.
Question: does it work on a Colab T4?
I tried to run the code on the form page and it gives errors
Error with T4