Whispering AI
Nepal
Joined Nov 13, 2022
"Greetings! I'm a machine learning engineer with a passion for innovation and creativity. With my extensive technical knowledge and experience, I've worked on a variety of exciting projects, from developing advanced natural language processing models to creating cutting-edge image recognition algorithms.
As a lover of all things tech and programming, I'm thrilled to share my knowledge and insights with you through my UA-cam channel. With each video, I'll guide you through the exciting world of machine learning, offering tips, tricks, and tutorials that will help you master this complex and dynamic field.
So if you're ready to dive into the world of machine learning and unlock the potential of this incredible technology, I invite you to join me on this journey of discovery and exploration. Let's innovate together!"
To Help: ko-fi.com/whisperingai
GPT-4o: Create your own AI girlfriend that talks ❤️ Crazy or Creepy ?
I built a conversational AI girlfriend that listens to your speech and talks back to you, along with a live translator that converts English to Hindi, all using OpenAI's new model GPT-4o. Here we will be using LangChain, ElevenLabs and GPT-4o.
GPT-4o (“o” for “omni”) is the most advanced model. It is multimodal (accepting text or image inputs and outputting text), and it has the same high intelligence as GPT-4 Turbo but is much more efficient: it generates text 2x faster and is 50% cheaper. Additionally, GPT-4o has the best vision and performance across non-English languages of any of our models.
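Below is a rough sketch of the loop the description above implies (transcribe speech, reply in persona, speak the reply), assuming the OpenAI v1 Python SDK; synthesize_speech() is a hypothetical stand-in for the ElevenLabs text-to-speech call used in the video:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PERSONA = "You are a caring, playful girlfriend. Keep replies short and warm."

def reply(history: list[dict], audio_path: str) -> str:
    # 1. Speech-to-text with Whisper.
    with open(audio_path, "rb") as f:
        text = client.audio.transcriptions.create(model="whisper-1", file=f).text
    history.append({"role": "user", "content": text})
    # 2. Chat completion with GPT-4o, conditioned on the persona prompt.
    out = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": PERSONA}, *history],
    )
    answer = out.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

# 3. synthesize_speech(reply([], "input.wav"))  # hypothetical ElevenLabs TTS call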
Similar to The AI Advantage's video "26 Incredible Use Cases for the New GPT-4o"
Similar to TheAIGRID's video on "How To Use GPT-4o (GPT4o Tutorial) Complete Guide With Tips and Tricks"
🔗 Links
WhisperKit: github.com/argmaxinc/WhisperKit
huggingface: huggingface.co/Systran/faster-distil-whisper-small.en
Wav2Lip: github.com/Rudrabha/Wav2Lip
github: github.com/ashishjamarkattel/ai_girlfriend
Visemes: learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis-viseme?tabs=visemeid&pivots=programming-language-javascript
⏱️ Timestamps
0:00 Intro
1:30 Prompt for GPT to behave like an AI girlfriend
8:04 Creating a web server using FastAPI
9:00 Implementing the web app
11:33 Creating a real-time translator from English to Hindi (see the sketch after these timestamps)
14:04 Text-to-speech
17:27 Thoughts on virtual-avatar use cases using Wav2Lip / 3D-based models
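For the 8:04 and 11:33 steps, here is a hedged sketch of what the FastAPI translation endpoint could look like; the route name, payload shape and system prompt are my own assumptions, not necessarily what the video uses:

from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI()  # reads OPENAI_API_KEY from the environment

class TranslateRequest(BaseModel):
    text: str

@app.post("/translate")
def translate(req: TranslateRequest) -> dict:
    # Ask GPT-4o to translate the posted English text into Hindi.
    out = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Translate the user's English text to Hindi."},
            {"role": "user", "content": req.text},
        ],
    )
    return {"hindi": out.choices[0].message.content}

# Run with: uvicorn main:app --reload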
#langchain #autogpt #ai #nocode #tutorial #stepbystep #aigirlfriend #texttospeech #elevenlabs
Views: 3,075
Videos
Hugging Face + Langchain+ Upwork | How to Solve Real World AI Job in UPWORK
2.5K views · 6 months ago
Learn how to use Hugging Face and get access to 200k AI models while building in LangChain for FREE. In this video we see how to leverage your knowledge to do real-world work that actually pays you money, rather than just watching projects. Here we will see how to solve an AI job posted on Upwork and try solving it on our own. It contains problems from Stable Diffusion, large language models and ...
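A minimal sketch of the Hugging Face + LangChain wiring this video is about, using the classic langchain API (newer releases moved these imports into langchain_community):

# Run a free Hugging Face model locally inside LangChain.
from transformers import pipeline
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate

generator = pipeline("text-generation", model="gpt2", max_new_tokens=64)
llm = HuggingFacePipeline(pipeline=generator)  # wrap the HF pipeline as a LangChain LLM

prompt = PromptTemplate.from_template("Write a one-line product pitch for: {item}")
print(llm(prompt.format(item="an AI resume screener")))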
Build Your Own RAG for Unstructured PDF, Website via chatgpt & LangChain
1.6K views · 6 months ago
Advanced RAG 101 - build agentic RAG. Learn how to build a RAG (Retrieval-Augmented Generation) app in Python that lets you query/chat with your website using generative AI. This project contains some more advanced topics, like how to run RAG apps locally (with Ollama), how to update a vector DB with new items, how to use RAG with PDFs (or any other files), and how to test the quality of AI g...
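A bare-bones sketch of the RAG flow described here, using the classic langchain API with Chroma and OpenAI embeddings; the file name is a placeholder for any scraped page or PDF text:

# Bare-bones RAG: split -> embed -> retrieve -> answer.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI

text = open("site_dump.txt").read()  # placeholder input document
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_text(text)
db = Chroma.from_texts(chunks, OpenAIEmbeddings())  # embed chunks into a vector DB

question = "What does this page say about pricing?"
context = "\n\n".join(d.page_content for d in db.similarity_search(question, k=3))
llm = ChatOpenAI(model_name="gpt-3.5-turbo")
print(llm.predict(f"Answer using only this context:\n{context}\n\nQ: {question}"))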
Serve a Custom LLM Trained with RLHF - FREE COLAB 📓
813 views · 10 months ago
This is the second part of the video, where I show you how to run inference with the model trained using reinforcement learning from human feedback. In the first video we fine-tuned LLaMA 2 (and other LLMs like Mistral 7B) for your specific use case. This allows your GPT model to perform much better for your business or personal use case. Give a GPT-like model such as Mistral detailed information that it doesn'...
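A minimal inference sketch for a checkpoint saved by the first video; the my_rlhf_model/ path is a placeholder for wherever your trained policy was saved:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "my_rlhf_model/"  # placeholder path to the trained checkpoint
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float16, device_map="auto")

inputs = tok("Summarize: the meeting moved to Friday.", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=80, do_sample=True, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))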
How I hacked my Gmail with an AutoGPT - Unveiling the secret
598 views · 11 months ago
I built an AI agent to take over my inbox, sharing my learnings & how I built it. The agent summarizes all the emails for me and speaks them out loud. In this video you will learn how to integrate ElevenLabs, OpenAI and the Gmail API into your project. Gmail automation can help you keep your inbox clean and save so much time. But what exactly can you automate in Gmail and how do you set it up? I...
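A hedged sketch of the Gmail side of such an agent, assuming OAuth credentials were already obtained (token.json) and that summarize() stands in for your OpenAI call; both names are placeholders, not the video's exact code:

from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

def summarize(text: str) -> str:
    # Placeholder: swap in your OpenAI chat-completion call here.
    return text[:120]

creds = Credentials.from_authorized_user_file("token.json")  # OAuth done beforehand
gmail = build("gmail", "v1", credentials=creds)

msgs = gmail.users().messages().list(userId="me", maxResults=5).execute().get("messages", [])
for m in msgs:
    full = gmail.users().messages().get(userId="me", id=m["id"]).execute()
    print(summarize(full["snippet"]))  # "snippet" is Gmail's short preview of the email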
🤖How I created these VIRAL POWERPOINTS 🥵🥵🥵 With AI ?
1.2K views · 1 year ago
ChatGPT can be a valuable tool to support the content-creation process for a PowerPoint presentation, providing inspiration, research assistance, language refinement, and visual aids to help you create a compelling and effective presentation. Are you tired of spending hours creating lackluster presentations? Look no further than Tome for AI storytelling. In this step-by-step tutorial, we'll sho...
Fine Tune GPT In FIVE MINUTES with RLHF! - "Perform 10x Better For My Use Case" - FREE COLAB 📓
4.1K views · 1 year ago
In this video, I show you how to fine-tune LLaMA 2 (and other LLMs) for your specific use case. This allows your GPT model to perform much better for your business or personal use case. Give LLaMA detailed information that it doesn't already have, make it respond in a specific tone/personality, and much more. In this tutorial, we will be using reinforcement learning from human feedback (RLHF) t...
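A skeleton of the PPO step that RLHF runs, using the classic trl API (the same PPOTrainer/step interface the video's code follows); reward_fn below is a placeholder for the trained reward model:

import torch
from transformers import AutoTokenizer
from trl import PPOConfig, PPOTrainer, AutoModelForCausalLMWithValueHead

# Policy with a value head, plus a frozen copy used as the KL reference.
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token

ppo = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1), model, ref_model, tok)

def reward_fn(text: str) -> torch.Tensor:
    # Placeholder: in the video this score comes from the trained reward model.
    return torch.tensor(float("friday" in text.lower()))

query = tok.encode("Summarize: the meeting moved to Friday.", return_tensors="pt")[0]
response = ppo.generate(query, max_new_tokens=30).squeeze()[-30:]
reward = reward_fn(tok.decode(response))
stats = ppo.step([query], [response], [reward])  # PPO update vs. the KL reference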
"How to give GPT my knowledge?" | Openai, Langchain &Mongodb | Knowledge embedding 101
4.9K views · 1 year ago
A step-by-step tutorial on how to enhance your chatbot user experience by enabling it to remember your users 🔥🚀. Here we will create a bot using OpenAI, LangChain and MongoDB. We will go into the core of development with the whole thought process, how the building actually happens. We will look at how to use the SmartGPT framework in LangChain to power your own LLM apps. 🔗Links: github.com/ashishja...
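A minimal sketch of the MongoDB memory piece, assuming a local mongod; the collection layout is my own choosing, not necessarily the video's:

from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["chatbot"]

def remember(user_id: str, role: str, content: str) -> None:
    db.history.insert_one({"user": user_id, "role": role, "content": content})

def recall(user_id: str, limit: int = 10) -> list[dict]:
    # Newest first from Mongo, then flipped back into chronological order.
    rows = db.history.find({"user": user_id}).sort("_id", -1).limit(limit)
    return [{"role": r["role"], "content": r["content"]} for r in reversed(list(rows))]

remember("u1", "user", "hi!")
print(recall("u1"))  # prepend these messages to the OpenAI `messages` list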
🐐Llama 3 Fine-Tune with RLHF [Free Colab 👇🏽]
17K views · 1 year ago
In this video, I'll show you the easiest, simplest and fastest way to fine-tune Llama 2 on your local machine for a custom dataset! You can also use the tutorial to train/fine-tune any other large language model (LLM). In this tutorial, we will be using reinforcement learning from human feedback (RLHF) to train our Llama, which will improve its performance. This technique is how these models ar...
🦙 LLAMA-2: EASIEST WAY To FINE-TUNE ON YOUR DATA Using Reinforcement Learning with Human Feedback 🙌
10K views · 1 year ago
In this video, I'll show you the easiest, simplest and fastest way to fine-tune Llama 2 on your local machine for a custom dataset! You can also use the tutorial to train/fine-tune any other large language model (LLM). In this tutorial, we will be using reinforcement learning from human feedback (RLHF) to train our Llama, which will improve its performance. This technique is how these models are trai...
Faster LLM Inference: Speeding up Falcon 7b For CODE: FalCODER 🦅👩💻
1.8K views · 1 year ago
Falcon-7B fine-tuned on the CodeAlpaca 20k instructions dataset using the QLoRA method with the PEFT library. Also, we will see how you can speed up your LLM inference time. In this video, we'll optimize the inference time of our Falcon 7B model fine-tuned with QLoRA and the PEFT library. Falcoder 7B Full Model - huggingface.co/mrm8488/falcoder-7b Falcoder Adapter - huggingface.co/mrm8488/...
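A sketch of loading the QLoRA adapter onto the base model and merging it for faster inference; the adapter repo id is truncated in the description above, so the string below is a placeholder:

from peft import PeftModel
from transformers import AutoModelForCausalLM

BASE = "tiiuae/falcon-7b"
ADAPTER = "mrm8488/your-falcoder-adapter"  # placeholder: real id truncated above

base = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(base, ADAPTER)
model = model.merge_and_unload()  # bake LoRA weights in: no adapter overhead at inference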
FALCON LLM in LangChain and HuggingFace. Is Falcon the OpenAI Alternative?
2.9K views · 1 year ago
In this video, we cover the new FALCON-40B LLM from TII, UAE. This model is able to beat all the open-source models on the Open LLM Leaderboard by Hugging Face. This launch is great news for language-model enthusiasts, industry experts and businesses, as it presents many opportunities for new use cases. In this video, we are going to compare the new Falcon-7B model against OpenAI's GPT-3.5 mo...
How to Build an AI Document Chatbot in 10 Minutes ? |🦜️ 🔗LangFlow & Flowise
7K views · 1 year ago
Utilizing react-flow technology, LangFlow is a user interface (UI) created exclusively for LangChain. Its goal is to provide a seamless environment for simple flow testing and prototyping. Drag-and-drop components and a chat-box functionality are available to make the user experience more convenient. Build LangChain LLM apps 10x faster without code - in this video, we are going to explore...
LangChain In Action: Real-World Use Case With Step-by-Step Guide : OpenSource
2.6K views · 1 year ago
👋 Hello my dear coders! In this video, I'll demonstrate how to solve a real-world business problem with the concepts of LangChain, without the requirement for OpenAI APIs. We will see how to use open-source embeddings like SentencePiece and Instruct embeddings to build a powerful product for our business. With the concepts of LangChain, every step will be completed using FREE & open-source technologies....
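A minimal sketch of the open-source embedding idea with sentence-transformers (used here as a stand-in for the SentencePiece/Instruct embeddings named above):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, free, runs on CPU
corpus = ["Refund policy: 30 days.", "Shipping takes 5 days.", "We are hiring."]
emb = model.encode(corpus, convert_to_tensor=True)

query = model.encode("How long do refunds take?", convert_to_tensor=True)
best = int(util.cos_sim(query, emb).argmax())  # rank chunks by cosine similarity
print(corpus[best])  # -> "Refund policy: 30 days."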
Talk to ANY PDF without OpenAI APIs: LangChain, it's not what you think
6K views · 1 year ago
Everything You Wanted to Know About Talk to ANY WEBPAGE without OpenAI APIs : LangChain
2.7K views · 1 year ago
SMS Spam Detection Using Machine Learning
548 views · 1 year ago
Text Summarization Web App with Streamlit : A Step-by-Step Guide!
1.1K views · 1 year ago
Is there by chance any possibility you can make this AI girlfriend with local resources - like a video or so? So that one doesn't have to use GPT?
Yeah, we can do that using an open-source model. But we need Ollama or llama.cpp to run it, unless you have a GPU.
@@WhisperingAI Like LM Studio? "What I was thinking about was whether one could build a small, locally based AI model capable of voice-to-voice input that is self-learning. A model that you can train yourself with reasoning, perhaps a model that builds its own registers. From something small and functional to potentially something larger after training..."
@@Suketh Yeah, we can do that. This would probably be a piece of software in itself, so making a video on it will not be possible, but the general idea will be the same as in the video. I would be glad to discuss the potential.
Nice video 👍🏻... Can I use OpenAI GPT-4o as my pre-trained model?
GPT-4o is not open source; you can use it only through the API, so GPT-2 or lower versions are possible.
You can use it for the reward model. And if we use GPT-4o it becomes RLAIF.
pls provide source code 🙏
@@Hexzit Sure, will update in an hour or two.
github.com/ashishjamarkattel/Mdchat here you go
Thanks for the informative video! I have a question: do you have any idea how to store the information/main ideas of the conversation, so that the AI "remembers" the previous conversation/topic? Should it be added inside the template variable?
We should add the previous conversation to the template. This might help, you can check this: ua-cam.com/video/srTiN30QwSY/v-deo.html. Please let me know if it did not. You can change the prompt to make it more robust.
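A small sketch of what "add the previous conversation to the template" can look like in plain Python; the template wording is illustrative:

history: list[tuple[str, str]] = []  # (speaker, text) pairs

TEMPLATE = """You are a caring girlfriend. Stay in character.

Previous conversation:
{history}

User: {message}
Girlfriend:"""

def build_prompt(message: str) -> str:
    lines = "\n".join(f"{who}: {text}" for who, text in history[-10:])  # last 10 turns
    return TEMPLATE.format(history=lines, message=message)

history.append(("User", "hi, how was your day?"))
history.append(("Girlfriend", "so good now that you're here!"))
print(build_prompt("did you miss me?"))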
Great video, please make more stuff with such examples.
Glad you like the video. Sure!
Where does all the information that is entered into the LLM go? Who controls that info? What does OpenAI do with it?
Are you talking in general, then? LLMs are black boxes, just math that predicts the probability of what is right and wrong. Nobody controls the info; OpenAI uses the data to retrain the LLM, making it more general across use cases and more human.
@@WhisperingAI I can just see how it could be misused.
Are you Nepali?
Thanks for the video. Why can't we use the GPT-4o API for the voice and audio input instead of ElevenLabs and Whisper?
We can do that. I am not sure we can use the GPT-4o API to get audio output, so I was using ElevenLabs. But Whisper can be replaced by GPT-4o.
What about doing a translation task with the LLMs and reinforcing it with RLHF?
We can do that.
There is an error message when I try to install trl. I don't know why, I am stuck... Can I have your email to discuss this issue with you?
Can you raise an issue on the GitHub repo?
I got this error: "PermissionError: [Errno 13] Permission denied: 'audio.mp3'". Can you tell me what's wrong?
Try changing the version of playsound to playsound==1.2.2.
Great video, very informative. Instead of using ChatGPT, is it possible to use a local LLM? I have a few that I run through LM Studio, so it would be great to be able to use them.
Yes, you can use them too. The GPT-4o call should be changed to your local LLM.
Horrible, I couldn't follow. Your voice is not suitable for twqvhiu. You have a sound problem. Unsubscribe
Sorry to hear that. Twqvhui?
Can you guys think of new use cases for this?
can you share code bro?
github.com/ashishjamarkattel/ai_girlfriend here you go
@@WhisperingAI 🙏
can she feel fear?
Interesting... We might change the prompt so she can express that; also, we could replace ElevenLabs with OpenAI's text-to-speech model. In that case it might work. I will give it a try.
Man, your videos keep getting better every time I look. You have a great mind and your presentation is excellent. Thank you very much, again, for sharing! Can you share the resources for the avatar (Wav2Lip and the 3D-based model) you discussed?
Thanks, will do!
thanks
glad you liked it
Great video, make a multi-agent using Hugging Face.
Sure will. What do you want your agents to do?
thanks
Glad you liked it
Shit Kind of video ….
Why so
@@WhisperingAI What problem are you trying to solve?? This is damn basic stuff.
Yeah, I guess you have some experience working on stuff like this, then. But most people are still not aware of it. Also, these were real jobs on Upwork and were at the top at the time of making the video. So I guess someone needs it, and I showed how it's done.
By any chance, are you Nepali?
Why so ?
Hey, Just watched your video and subscribed. Good content! Keep it up!
Thanks for the sub! Means a lot
Very good video. The example is also good. Can you give some more examples?
Yeah, sure. ❤️
Does this use a reference model and KL divergence?
Yes, it uses both.
Thank you for the video.
Thanks for watching!
Bro, your explanation is clear and neat to understand. Thanks, keep sharing knowledge, bro.
Glad you like it ❤️
GitHub?
Soon, I am maintaining the GitHub repo right now.
@@WhisperingAI Please provide the source code, it would be of great help. If possible, could you provide the whole code of the demo website too?
I think this is RLAIF instead of RLHF, because the feedback is generated using a BERT model instead of a human, which forms the reward model.
You are somewhat right and somewhat wrong as well. Mostly we need to train the reward model on human-labeled data in order for it to give feedback. So in that case it's RLHF. So happy that you pointed out something interesting. ❤️
hello, can u help with debugging? I got this error during the PPO scoring step:

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token` (`tokenizer.pad_token = tokenizer.eos_token` e.g.) or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`.

It is raised from tokenizer.encode_plus(...) inside get_score(starcoder_model, starcoder_tokenizer, texts), called as logits = get_score(starcoder_model, starcoder_tokenizer, texts) in the training loop. In my code I do call tokenizer.add_special_tokens({'pad_token': '[PAD]'}), but on a GPT2TokenizerFast loaded separately, not on starcoder_tokenizer, and the error persists.
Just set the padding token while defining the tokenizer (the same tokenizer you pass to get_score). Also, try changing the model's max embedding size according to the tokenizer, if you are using a different tokenizer from the model's.
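A sketch of the fix this reply describes: set the pad token on the tokenizer that is actually passed to get_score, and keep the embedding table in sync if you add a brand-new token:

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bigcode/tiny_starcoder_py")
model = AutoModelForCausalLM.from_pretrained("bigcode/tiny_starcoder_py")

tok.pad_token = tok.eos_token  # reuse EOS as padding: no new embedding needed
# or: tok.add_special_tokens({"pad_token": "[PAD]"})
model.resize_token_embeddings(len(tok))  # keep the embedding table in sync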
If I want to add human feedback, can I do that? And if yes, then how?
Human feedback is the dataset that is created in steps 1 and 2. So you can create your own dataset that matches that format to train all 3 steps.
@@WhisperingAI Like, at model training time can I put in some kind of feedback, like some label selection during model training?
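For reference, a sketch of the preference-data shape such human feedback usually takes; the field names follow the common chosen/rejected convention and the rows are made up:

# Each row pairs a preferred ("chosen") and a rejected completion for one prompt.
preference_data = [
    {
        "prompt": "Summarize: the meeting moved to Friday.",
        "chosen": "The meeting was rescheduled to Friday.",
        "rejected": "Meetings are generally useful for teams.",
    },
    # ... more human-labeled comparisons
]
# The reward model is trained so that score(prompt + chosen) > score(prompt + rejected).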
Hello Brother, your video is very informative and it covers each and every part: the theory is there, the coding is there, the explanations are there. But could you possibly make one good demo, like providing one paragraph and showing how the data flows and how the output is generated - a demo of the model before and after?
Great video. Just a quick question: is it possible to intercept just the reward model's output for an LLM response, before the reward produced for each response goes from the reward model into the LLM? Meaning, is there any way to use just the reward model to see which LLM responses were good vs. bad and store those results?
Yes, you can. In step 3 there is a line which takes the result from the policy model and passes it to the reward model for a score. You can print that output.
How is the reward model trained - can anyone explain in detail? I know that we used the StarCoder model with chosen and rejected input ids, but how are these mapped to a particular score, since the output of the reward model is not always binary - it returns logits as its output. How is it done here?
Nice video, thanks!
Hi, has anyone encountered this problem with model installation? huggingface_hub raises the following from _validators.py in validate_repo_id:

Please provide either the path to a local folder or the repo_id of a model on the Hub.

It looks like the printed LlamaForCausalLM module (ending in "(lm_head): Linear(in_features=4096, out_features=32000, bias=False)") was passed where a path or repo_id was expected.
It's because the path to the model you are trying to fine-tune is not there.
Can we use the same code for Llama 2?
Yes, you can, but I guess you cannot run it on Google Colab unless you use LoRA or 4-bit.
@@WhisperingAI I'm using Kaggle notebooks. I have created the policy model, but reward training gives IndexError: index out of range in self. Why?
@@WhisperingAI And I have executed your same code in a high-RAM environment, but it gives the same error: IndexError: index out of range in self. I want to apply RLHF to Llama 2. Your video is the only one I found that relates to RLHF.
There might be some issue while loading the dataset, or while tokenizing. Can you share at which step you are facing this issue?
Please check your dataloader and try running each step individually.
I notice the reward model structure is the same as the fine-tuned model. As someone said, we could use a small model with far fewer parameters and layers for the reward model - that would work too, right?
That works. In the case of the reward model it's basically a sequence classification model with one head, so the output produced is only one logit, but I guess that is handled internally by the trl library.
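A small sketch of that idea: a reward model as a one-head sequence classifier whose single logit is the scalar score (distilbert here is a light stand-in for the StarCoder-based reward model in the video):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
rm = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=1)

def score(text: str) -> float:
    batch = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return rm(**batch).logits.squeeze().item()  # one logit = the scalar reward

print(score("this response is helpful and polite"))  # untrained head: value is arbitrary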
For the demo, why is the model giving the same response for the second part, like "i love it"?
That's because the model was not trained for a long period and the quality of the data was not that good. It will work fine if you tune it for a much longer period.
Thank you for the video. Was waiting for this ❤
Thanks for watching!
Can you use local models only, without OpenAI?
Yes, we can, simply by changing the OpenAI call to a local model that takes the input.
@@WhisperingAI For repos like AutoGPT, apparently it was designed to work specifically with OpenAI models, so it does not work as well with local models. Maybe this can be fixed by fine-tuning local models.
I tried to recreate AutoGPT, so it might be possible here. Not a full AutoGPT, but I hardcoded some of the functionality, like summarization, that does all the hard work for us.
Hey, this video is really helpful. Can you please tell me how to give input and generate output after step 3? Also, when we create a UI, how will feedback from the UI be given to the policy model? Can you please make a video on it, it will be really helpful!!!! Thanks :)
Sure, I will try creating a short video for it within a couple of days.
That will really be helpful!!!!! Thanks :) @@WhisperingAI
Hey, it will really be helpful if you make it... please help me.
@@shrutidayama8193 It will be uploaded tomorrow. Thanks.
Hi, it is a nice video. I got an error when trying to download the pretrained model. Can you help me? The code

tokenizer = AutoTokenizer.from_pretrained("bigcode/tiny_starcoder_py")
model = AutoModelForCausalLM.from_pretrained("bigcode/tiny_starcoder_py", use_cache=False, device_map="mps")

raises: SafetensorError: Error while deserializing header: HeaderTooLarge.
Weird error. Try reinstalling the transformers library along with PyTorch. Or, if you cannot solve it even that way, download the same model from Hugging Face manually and load it from disk.
Great video, I am also working on a similar project. Can we have a talk?
How can I help you?
What else should I build for this AI inbox agent? Leave some comments!
nice
Thank you so much for your kind comment! I'm glad you enjoyed the video.
So what is MongoDB used for here?
Mongodb here is used to store user chat history
Thank you so much for sharing, it is very helpful. I am using the Colab to run the code, but I keep getting the error "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu" in the PPO training section, for this line:

stats = ppo_trainer.step(query_tensors, response_tensors, rewards_list)

I checked query_tensors[0].device, response_tensors[0].device, rewards_list[0].device, next(starcoder_model.parameters()).device and next(starcoder_model_ref.parameters()).device. All of them show cuda except for starcoder_model and starcoder_model_ref. Then I used to(device) to move them back to cuda and re-ran

ppo_trainer = PPOTrainer(config, starcoder_model, starcoder_model_ref, starcoder_tokenizer, dataset=dataset, data_collator=collator, optimizer=optimizer)

but somehow this line moves the models back to the CPU again instead of CUDA. Do you have any suggestions on how to deal with this situation? Is it because of the config? I use your config as

config = PPOConfig(model_name=model_path, steps=51200, learning_rate=1.41e-5, remove_unused_columns=True)

with model_path = "summarization_policy_new/".
Sorry for the late reply. When I try to run the code, it seems to work fine, but I am aware of the issue; I tried to reproduce it myself and it worked perfectly fine in my case. Have you tried loading the tensors on the CPU - was that working fine? Have you rerun the Colab after training the reward model? (You should.) Did you load the models onto the GPU at the beginning, when loading them with AutoModelForCausalLMWithValueHead? Which model are you using?
@@WhisperingAI Thank you so much for your help. It works after I restarted the Colab after the reward-model training step, before step 3. But may I ask why this would be the case? Since we reinitialize every model again in step 3, wouldn't it be the same without the rerun? Is it because of cache memory? (Sorry, I think YouTube deleted my previous reply, but it was asking the same question. By the way, for step 3 the training dataset we used was the same as step 2, right, not step 1? The title says the data is the same as step 1.)
@@Marry-c2r Glad it solved your problem. It might be due to the same variable name being reused to initialize the model, or some other instance. Yes, you can use whichever dataset suits your need; it does not matter.
The Colab code did not work... it shows a "cuda" error in step 3... can you please help?
Can you please provide an explanation of the issue?
@@WhisperingAI Here is the full error from step 3:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument tensors in method wrapper_CUDA_cat)

It is raised in the training loop at:

query_tensors = [torch.cat((ctrl_tokens[t], input_ids)) for t, input_ids in zip(task_list, batch["input_ids"])]