55
25 201

21:33

Zombies & RL - Part 6 - Loot Drops

16:58

Zombies & RL - Part 5 - Hunting Zombies

28:05

Zombies & RL - Part 4 - Bullets!

20:28

Zombies & RL - Part 3 - Player Controls

25:49

Zombies & RL - Part 2 - Player and Zombies

26:45

Zombies & RL - Part 8 - Converting to OpenAI Gym

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. In part 8, we'll be starting the conversion to an OpenAI gym interface.
Starter project can be downloaded here -
github.com/bobcowher/youtube-zombie-shooter-starter

Відео

21:33

Zombies & RL - Part 7 - Adding Sounds

Переглядів 142 години тому

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. In part 7, we'll be adding sounds to our game. Starter project can be downloaded here - github.com/bobcowher/youtube-zombie-shooter-starter

16:58

Zombies & RL - Part 6 - Loot Drops

Переглядів 222 години тому

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. In part 6, we'll be building out the health and loot drop functionality. Starter project can be downloaded here - github.com/bobcowher/youtube-zombie-shooter-...

28:05

Zombies & RL - Part 5 - Hunting Zombies

Переглядів 82 години тому

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. In this video, we're going to build out the logic to move zombies towards the player and allow the player to shoot the zombies. Starter project can be downloa...

20:28

Zombies & RL - Part 4 - Bullets!

Переглядів 134 години тому

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. In this video, we're going to code out the Bullet classes and the basic functionality for the shotgun. Starter project can be downloaded here - github.com/bob...

25:49

Zombies & RL - Part 3 - Player Controls

Переглядів 84 години тому

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. This video will cover the main loop and basic player controls. Starter project can be downloaded here - github.com/bobcowher/youtube-zombie-shooter-starter

Zombies & RL - Part 2 - Player and Zombies

26:45

Zombies & RL - Part 2 - Player and Zombies

Переглядів 194 години тому

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. This video will cover the creation of the zombie and player classes. Starter project can be downloaded here - github.com/bobcowher/youtube-zombie-shooter-starter

Zombies & RL - Custom Gym Environments with Pygame - Part 1 - Intro & Main Loop

29:24

Zombies & RL - Custom Gym Environments with Pygame - Part 1 - Intro & Main Loop

Переглядів 194 години тому

In this series on Zombies and Reinforcement Learning, we're going to build a custom Zombie game with Pygame, convert that game into an OpenAI gym environment, and use double Q learning with PyTorch to train an autonomous agent to play it. The first video is an intro and we'll start the main loop. Starter project can be downloaded here - github.com/bobcowher/youtube-zombie-shooter-starter

Autonomous Code Generation with ChatGPT and Python

29:44

Autonomous Code Generation with ChatGPT and Python

Переглядів 12814 годин тому

In this video we're going to use the OpenAI API to automatically write code that passes a set of tests. This was a proof of concept to see how a code-writing agent might work. github.com/bobcowher/youtube-openai-dev-agent-1-starter

Training AI models remotely with RunPod + Tensorboard

30:30

Training AI models remotely with RunPod + Tensorboard

Переглядів 77День тому

In this video, we're going to be walking through building a demo python script , deploying that script to RunPod on a GPU enabled instance, and getting logs back with Tensorboard

Cracking /etc/shadow passwords with C++ (yescrypt)

41:40

Cracking /etc/shadow passwords with C++ (yescrypt)

Переглядів 4872 місяці тому

Learn to crack modern yescrypt passwords in Linux's /etc/shadow file with C . This is a great exercise for Ethical Hacking students. github.com/bobcowher/youtube-password-cracker-shadow-starter

Building a Web App with RAG & ChatGPT - Part 4 - Querying the Model

48:08

Building a Web App with RAG & ChatGPT - Part 4 - Querying the Model

Переглядів 1642 місяці тому

This is part 4 of a series on building a web app with RAG(Retrieval Augmented Generation) and ChatGPT. In this video, we'll build the text search and ask methods. Starter code: github.com/bobcowher/youtube-rag-web-starter/tree/main Completed code: github.com/bobcowher/youtube-rag-web Original series by Daniel Bourke: ua-cam.com/video/qN_2fnOPY-M/v-deo.html Human nutrition text: pressbooks.oer.h...

Building a Web App with RAG & ChatGPT - Part 3 - Building the File Processor

40:04

Building a Web App with RAG & ChatGPT - Part 3 - Building the File Processor

Переглядів 2012 місяці тому

This is part 3 of a series on building a web app with RAG(Retrieval Augmented Generation) and ChatGPT. In this video, we'll start building the RAG model & file processor. Starter code: github.com/bobcowher/youtube-rag-web-starter/tree/main Human nutrition text: pressbooks.oer.hawaii.edu/humannutrition2/open/download?type=pdf Precalculus textbook: www.opentextbookstore.com/precalc/

Building a Web App with RAG & ChatGPT - Part 2 - Building the Web Interface

14:45

Building a Web App with RAG & ChatGPT - Part 2 - Building the Web Interface

Переглядів 2283 місяці тому

This is part 2 of a series on building a web app with RAG(Retrieval Augmented Generation) and ChatGPT. In this video, we'll start by building the framework for the web interface. Starter code: github.com/bobcowher/youtube-rag-web-starter/tree/main Human nutrition text: pressbooks.oer.hawaii.edu/humannutrition2/open/download?type=pdf Precalculus textbook: www.opentextbookstore.com/precalc/

Building a Web App with RAG & ChatGPT - Part 1 - Intro

11:37

Building a Web App with RAG & ChatGPT - Part 1 - Intro

Переглядів 4733 місяці тому

We're going to build a RAG(Retrieval Augmented Generation) application with ChatGPT. This is based on the work done by Daniel Bourke under - ua-cam.com/video/qN_2fnOPY-M/v-deo.html For the starter files for this video, visit - github.com/bobcowher/youtube-rag-web-starter/tree/main

Templating CloudFormation with Python(plus C++ comparison)

20:51

Templating CloudFormation with Python(plus C++ comparison)

Переглядів 1043 місяці тому

Templating CloudFormation with Python(plus C comparison)

Robotic Arm Manipulation with Human Experiences & HRL - Part 9 - Training the Meta Agent

34:08

Robotic Arm Manipulation with Human Experiences & HRL - Part 9 - Training the Meta Agent

Переглядів 1243 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 9 - Training the Meta Agent

Robotic Arm Manipulation with Human Experiences & HRL - Part 10 - The End

6:10

Robotic Arm Manipulation with Human Experiences & HRL - Part 10 - The End

Переглядів 1093 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 10 - The End

Robotic Arm Manipulation with Human Experiences & HRL - Part 8 - Building the Meta Agent

35:13

Robotic Arm Manipulation with Human Experiences & HRL - Part 8 - Building the Meta Agent

Переглядів 833 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 8 - Building the Meta Agent

Robotic Arm Manipulation with Human Experiences & HRL - Part 7 - Training the Agent

40:32

Robotic Arm Manipulation with Human Experiences & HRL - Part 7 - Training the Agent

Переглядів 1143 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 7 - Training the Agent

Robotic Arm Manipulation with Human Experiences & HRL - Part 6 - Building the Agent

34:03

Robotic Arm Manipulation with Human Experiences & HRL - Part 6 - Building the Agent

Переглядів 1253 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 6 - Building the Agent

Robotic Arm Manipulation with Human Experiences & HRL - Part 5 - Building the Model

24:57

Robotic Arm Manipulation with Human Experiences & HRL - Part 5 - Building the Model

Переглядів 1064 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 5 - Building the Model

Robotic Arm Manipulation with Human Experiences & HRL - Part 4 - Collecting Human Experiences

32:19

Robotic Arm Manipulation with Human Experiences & HRL - Part 4 - Collecting Human Experiences

Переглядів 2254 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 4 - Collecting Human Experiences

Robotic Arm Manipulation with Human Experiences & HRL - Part 2 - Setting up the Environment

23:35

Robotic Arm Manipulation with Human Experiences & HRL - Part 2 - Setting up the Environment

Переглядів 1834 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 2 - Setting up the Environment

Robotic Arm Manipulation with Human Experiences & HRL - Part 3 - Building the Replay Buffer

27:57

Robotic Arm Manipulation with Human Experiences & HRL - Part 3 - Building the Replay Buffer

Переглядів 1304 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 3 - Building the Replay Buffer

Robotic Arm Manipulation with Human Experiences & HRL - Part 1 - Intro (Advanced)

10:55

Robotic Arm Manipulation with Human Experiences & HRL - Part 1 - Intro (Advanced)

Переглядів 4974 місяці тому

Robotic Arm Manipulation with Human Experiences & HRL - Part 1 - Intro (Advanced)

Solving Mazes with Reinforcement Learning - Part 11 - Results & Next Steps

8:09

Solving Mazes with Reinforcement Learning - Part 11 - Results & Next Steps

Переглядів 2074 місяці тому

Solving Mazes with Reinforcement Learning - Part 11 - Results & Next Steps

Solving Mazes with Reinforcement Learning - Part 10 - Building the Intrinsic Curiosity Module

31:17

Solving Mazes with Reinforcement Learning - Part 10 - Building the Intrinsic Curiosity Module

Переглядів 1114 місяці тому

Solving Mazes with Reinforcement Learning - Part 10 - Building the Intrinsic Curiosity Module

Solving Mazes with Reinforcement Learning - Part 9 - Building the Test Method

13:45

Solving Mazes with Reinforcement Learning - Part 9 - Building the Test Method

Переглядів 844 місяці тому

Solving Mazes with Reinforcement Learning - Part 9 - Building the Test Method

Solving Mazes with Reinforcement Learning - Part 8 - Building the Train Method

33:02

Solving Mazes with Reinforcement Learning - Part 8 - Building the Train Method

Переглядів 675 місяців тому

Solving Mazes with Reinforcement Learning - Part 8 - Building the Train Method

КОМЕНТАРІ

@robertcowher 2 дні тому
For questions about the videos, or just to come talk about robots and AI, join me on Discord - discord.gg/dnhsk3pD2V If you like what I'm doing here, don't forget to subscribe. Enjoy the series.
@kbkhalsabro3423 2 дні тому
Greate Thanks a lot 💖
@walkingpeepo 4 дні тому
fantastic video
@walkingpeepo 4 дні тому
is this a continuing series to continue training ai models?
@robertcowher 3 дні тому
Glad you're enjoying the videos. This one isn't part of a series, but if you look through my channel you'll find AI training videos centered around robots, maze solving, and even Atari games like Breakout. Have fun!
@walkingpeepo 3 дні тому
@robertcowher sincerely amazing, and is it tutorials? Can we get a code along? Because I want to make an AI for a project that uses fundamental analysis
@TheCho5enJuan 6 днів тому
UA-cam algorithm led me to your video. As a learning dev who uses ChatGPT to co-code i find this concept incredible. Like in all jobs people feel they are irreplaceable but i always say it just takes time or money. This will be the future and for the good devs it will be so rewarding.
@ronaknarkhede674 19 днів тому
Great video! appreciate all the work you put in.
@robertcowher 19 днів тому
Glad you enjoyed the series :)
@NhanLe-lq7xs 2 місяці тому
the training stopped after ~ 1500 epochs due to RAM limit (16GB). Is there anyway to prevent that?
@NhanLe-lq7xs 2 місяці тому
shiet I put memory_capacity = 1000000 :))
@robertcowher 2 місяці тому
Yes. The easy answer is to reduce the buffer size. It’s what uses most of the memory. Keep in mind the buffer is there for stability so reducing it too much can have a negative impact on training. Play with that value and see where it gets you.
@NhanLe-lq7xs 2 місяці тому
I am following this series, but I dont know why the Arcade Learning Environment usually go blank and show nothing. After restart the computer, I got the game once and then it show nothing for latter run. Do you have any ideas?
@robertcowher 2 місяці тому
Are you in the training or testing phase? If you're in training, it should stay blank. If you're in testing, check the render mode.
@NhanLe-lq7xs 2 місяці тому
@@robertcowher I was in testing phase. I dont know if this is because of the code or the hardware on my laptop... In the 1st run after restart the laptop ALE can show breakout, but for 2nd time it show nothing
@NhanLe-lq7xs 2 місяці тому
@@robertcowher ah, I fixed it, it is because of the env.ale.* , I have to use env.unwrapper.ale.* instead
@robertcowher 2 місяці тому
@@NhanLe-lq7xs Nice!
@CamilleCowher 2 місяці тому
Amazing!!
@kbkhalsabro3423 2 місяці тому
Hi i want to ask how can i make a real robotic arm and teach this instead of simulation. I mean i need to work with real world robotic arm. Can you guide me pls ??
@robertcowher 2 місяці тому
So what you're describing is a career journey, not a single project, and I'm not there yet but I can describe the path. If you wanted to do something like this in the real world, your path would be 1) Learn ROS, to control real world robots. 2) Learn how to physically build a robot arm. 3) Learn how to simulate your robot arm in ROS and Gazebo. 4) Learn how to port your simulated arm into a Gym-like environment for reinforcement learning. 5) Update this approach to work with your new environment and train the robot in simulation. 6) Take that trained model and move to your physical robot to complete training. If you're ready to start working with real robots, Antonio Brandi's Udemy courses are a great place to start - www.udemy.com/course/robotics-and-ros-2-learn-by-doing-manipulators
@kbkhalsabro3423 2 місяці тому
@@robertcowher Thanks a lot sir. It means a lot 🙏🙏🙏🙏😊😊
@WeiLi-s8x 2 місяці тому
Hello, is there a corresponding paper for this case? Or can you make a video to explain its environment, such as state space, action space, and reward function?
@mubasharsaeed6044 2 місяці тому
Great video. Could you please make a video on manipulation using VLM
@robertcowher 2 місяці тому
I haven't played with VLM's before but it sounds like an interesting concept. Is there a specific environment or challenge you'd like to see solved?
@bleah4321 2 місяці тому
Hello. thank you for the lesson. I'm getting an error that says "undefined reference to 'crypt' collect2 : error : ld returned 1 exit status"
@robertcowher 2 місяці тому
In your g++ command, make sure "-lcrypt" is set. Missing that could cause the error you're describing. I'm guessing you're either using VSCode without my tasks.json, or using another IDE.
@texwiller7577 2 місяці тому
That with the mutli objective single policy is really interesting and I'm looking forward to see the progress, no matter how good/bad it is. Can you at least anticipate, how do assign the rewards? Are there different reward for different tasks? How do you set the task execution order? Many thanks for all those wonderful videos.
@robertcowher 2 місяці тому
So for now, I'm not even worrying about task execution order, just whether or not it can take an environment with multiple goals and accomplish them at all, and letting the environment hand out one reward per successfully completed task. I ended up taking a long break from the problem, but I'm going to pick it up in the next week or so and take another swing at it. Sometimes it helps to just..not work on something for a little while and look at it with fresh eyes.
@texwiller7577 2 місяці тому
Hey Robert. In the playlist this video (part 2) comes after part 3. Probably you should swap them. Thanks again for your videos
@robertcowher 2 місяці тому
Good catch! Fixed.
@Patrick-wn6uj 2 місяці тому
Its like you read my mind hhhhh I have been trying to make this. boom you post it
@robertcowher 2 місяці тому
Glad to help ;) I had a lot of fun with this one.
@WeiLi-s8x 3 місяці тому
I hope you can create a tutorial video on creating a simulation environment and writing code for env and tasks.
@robertcowher 19 днів тому
I'm not building one specific to robots, but your comment sparked an idea and I've been working on a custom Pygame environment. At some point in the next month I'm hoping to put out a full series on building the game from scratch, converting it into a gym environment, and then training an agent on it.
@WeiLi-s8x 3 місяці тому
After following all the courses with you, I have gained a lot of help. I have a few questions: First, I have successfully trained all tasks using your dataset except for the hinge_cabinet task, which is not working, and I don't know why. Second, why did I use PyCharm before, but in this series, I used VSCode? What is the difference between the two? Third, could you later release a tutorial on how to import and define my own robot and environment for training?
@robertcowher 3 місяці тому
Great questions. 1) For hinge cabinet, what behavior are you seeing? Have you tried deleting the weights file for that network and retraining? 2) You can use whatever IDE you're most comfortable with for most of my videos. I switched over to VSCode full time because I needed something that supported Python, Jupyter notebooks, C++, and ROS and integrated well with ChatGPT. Jetbrains(the company that makes Pycharm) has solutions for all of those problems but it pushes you into their paid tier and they bundle some of those capabilities into different products(PyCharm v.s. CLion). 3) That's something I'm currently trying to figure out. When I do, I'll definitely make a video :)
@WeiLi-s8x 3 місяці тому
I'm very excited to receive your reply, thank you very much, and I look forward to your new work@@robertcowher
@tonyho6882 3 місяці тому
Such a greate series. I have looked up this content for a long. Thank you so much for your efforts.
@robertcowher 3 місяці тому
Glad you're enjoying the videos. I've also had a hard time finding good robotics-specific RL content. I may not be the best person to fill that gap, but I'm definitely having the most fun with it :)
@karamdaaboul6528 3 місяці тому
Thank you for your description. After setting the `last_action` for the first goal, you already obtain the next state, which could be used as input for the policy of the second task. Instead of reusing the action from task 1, wouldn’t it be better to sample a fresh action based on the task-specific policy for task 2, given the last state from task 1? This new action should be more appropriate for task 2.
@robertcowher 3 місяці тому
Good observation, and the answer is simple enough. The environment returns different observation shapes for some of the goals. You can absolutely take the last state, instead of last action, and generate a new action from the new policy, but it will be the wrong shape for some of the models and cause the agent to crash. I've been playing with solving that problem other ways(for example, padding the observations so all goals/models have the same shape), but that didn't make this implementation.
@karamdaaboul6528 3 місяці тому
@@robertcowher Thank you, it is now clear to me. You talked about your omni agent in the last video; for this agent, you might have a large observation space consisting of the observation spaces of all objects in the environment.
@scott089 3 місяці тому
With this approach do you collect manipulation data first and then train the model on this collected dataset, also another question I have is only manipulation data or some visual input too?
@robertcowher 3 місяці тому
Great question. I'm not collecting visual data for this one, but the observations include a numerical representation of the environment. What you'll find a bit later in the course is that I'm using a hybrid approach, starting live SAC training with a buffer of experiences, and weighting the buffer towards picking up more of those live experiences for the first few hundred episodes.
@Patrick-wn6uj 3 місяці тому
Make a video on decision mamba - reinforcement learning that uses mamba network
@robertcowher 3 місяці тому
I haven't tried MAMBA, but I'll check it out, and make a video if it looks interesting. Thanks for the tip!
@texwiller7577 3 місяці тому
YESSSS!!!! Now it works!
@robertcowher 3 місяці тому
Hey, thanks for the problem report. I wouldn't have noticed for weeks. :)
@texwiller7577 4 місяці тому
Totally underestimated video series. I cannot stop watching it. Anyway, if I understood correctly, we need a _super_ policy, which selects one of the sub policies for accomplishing a complex task (e.g open microwave, turn on stove, close microwave, etc.), right?
@robertcowher 4 місяці тому
Correct. Though in this case, the super policy is going to be static(i.e. looping through a list of goals and selecting the appropriate sub-policy). We'll get there in the next few videos :)
@robertcowher 4 місяці тому
I strongly encourage everyone to generate their own data, but I've also posted my dataset to HuggingFace here - huggingface.co/datasets/robertcowher/farama-kitchen-sac-hrl-youtube/tree/main
@texwiller7577 4 місяці тому
Top videos! Did I understand correctly? You took 30.000 iteractions using the joystick?
@robertcowher 4 місяці тому
Great question, and yes, but there's some nuance. Every timestep is considered a memory so continuous joystick actions tend to generate a lot of them very quickly. My average was right around 200 timesteps per task once I got good at it, so filling a 30K memory buffer takes about 150 successful completions of that task. I experimented with human memory buffers from 20K to 60K for various tasks and found 30K to be a good minimum buffer size to succeed at all tasks. To that end, the can_sample method we've coded here looks for batch_size(64) * 500, or 32,000. You could tweak that multiplier and experiment with less.
@robertcowher 4 місяці тому
I was going to do this at the end of the series, but I went ahead and pinned a comment with my data set. I still recommend doing some of your own data generation to get the full process, but it's there to save you some time.
@robertcowher 4 місяці тому
For questions about the videos, or just to come talk about robots and AI, join me on Discord - discord.gg/dnhsk3pD2V
@moshoodolawale3591 4 місяці тому
Interestingly amazing tutorials. Looking forward to more of this. Thanks
@robertcowher 4 місяці тому
Glad you enjoyed the series :) Any specific environments you'd like to see solved? I'm already working on object sorting with a virtual Emika Panda robot arm, but that's been a challenge so far.
@WilliamChen-pp3qs 4 місяці тому
why are we only using the forward model instead of both forward and inverse model of ICM?
@Patrick-wn6uj 4 місяці тому
I wish you could get a bunch of views
@robertcowher 4 місяці тому
Hey, me too :) Honestly though, UA-cam helps me feel like there's a clear end goal when building these projects, and that's a big part of why I'm doing it. If it's useful to a few people, I've hit my goals.
@Patrick-wn6uj 5 місяців тому
keep posting
@robertcowher 3 місяці тому
If you enjoyed these, you might like the new series I'm working on. It's a similar robotic arm in a much more complex environment with a human experience component. ua-cam.com/video/ma6fbvy77Uo/v-deo.html
@EthanTownsend-t7o 5 місяців тому
Hi I just followed your tutorial and it worked! Thank you for making it clear and easy to follow. If I wanted to change the rewards policy in the future (such as rewarding proximity to the ball or longevity) would I be able to? Or is the reward policy pretty much set in place?
@robertcowher 5 місяців тому
Glad to hear it, and great question. What you're describing is an observation wrapper. They're part of the gym library and here for exactly these kinds of changes. An observation wrapper let's you "wrap" the environment, and then add your custom logic on top of it. My recommendation would be to get a wrapper working for something really simple(say, doubling the reward when the agent hits a ball) to make sure you've got the wiring right, and then getting into the logic of what you really want to do. They sound complicated, but a basic observation wrapper can be 6 lines of code. Here's the doc to follow -> gymnasium.farama.org/api/wrappers/observation_wrappers/
@robertcowher 5 місяців тому
Also worth noting - An observation wrapper is what I should have used instead of processing the observation in the training loop.
@peterpettigrew5972 6 місяців тому
please provide gthub repo for this code
@mobina3017 6 місяців тому
I get this error for the forward method of the ActorNetwork class, any idea how I can solve it? `RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x128 and 46x256)`
@robertcowher 6 місяців тому
That error usually means that the output of one layer doesn't match the input of the next layer. Check for typos in the layer input/output sizes, and if nothing jumps out at you, start printing out the variables and their sizes as you go through the forward method in the network. One of them is likely going to be a different size than expected and that's your culprit.
@Paranoid_mp3 7 місяців тому
Только учусь python и как раз искал нечто похожее. Спасибо за видео.
@robertcowher 7 місяців тому
Absolutely. Glad you enjoyed it.
@asitrath8838 7 місяців тому
Hi, I keep getting the below error after a few iterations. I have combed through the code its the same as yours and chatgpt was not able to resolve the error. Can you please help? line 97, in train state_b, action_b, reward_b, done_b, next_state_b = self.memory.sample( File "C:\Users\asitr\Downloads\agent.py", line 37, in sample return [torch.cat(items).to(self.device) for items in batch] File "C:\Users\asitr\Downloads\agent.py", line 37, in <listcomp> return [torch.cat(items).to(self.device) for items in batch] TypeError: expected Tensor as element 29 in argument 0, but got tuple
@robertcowher 7 місяців тому
Would you mind posting your code to GitHub so I can do a quick comparison?
@asitrath8838 7 місяців тому
@@robertcowher I have posted my GitHub link 3 times now and keeps getting deleted. Can I give it to you in any other way?
@asitrath8838 7 місяців тому
I am getting the below error when i am trying to run atari_breakout.py. Have you encountered this? x = torch.Tensor(x) ValueError: expected sequence of length 210 at dim 1 (got 3)
@robertcowher 7 місяців тому
That usually means you've either typod a dimension, a batch size, or missed one of the torch.unsqueeze lines. General advice - 1) Go through it line by line and make sure you didn't miss anything. 2) Print out what's being passed through the model. Look at the actual output and see if it makes the error make more sense. 3) Plug the whole block of code + the error into ChatGPT. It's really good at spotting these kinds of issues.
@SachinJadhav-p8w 7 місяців тому
Content on this topic seems to be quite sparse on internet. Your videos have helped me a lot. Thanks for sharing the knowledge !!!
@robertcowher 7 місяців тому
Glad you're enjoying them, and I've noticed the same trying to learn this stuff myself. There's plenty of robotics content(ROS2, etc) and plenty of information on RL, but it's hard to find information on the intersection of those two topics. I'm working on getting SAC working with more complex tasks(like object sorting), as well as using human experience(piloting the robot by hand) to bootstrap more quickly. I'll post that project if I can get it running.
@robertcowher 8 місяців тому
I've had a few people ask about a place to talk about these projects, compare notes, and get help, so I'm starting a Discord server. Please consider this a general space to talk about your "autonomous agent" projects. I'm happy to help you get the things I've built working, and I'd love to see what you're building. Join at discord.gg/dnhsk3pD2V.
@robertcowher 8 місяців тому
I've had a few people ask about a place to talk about these projects, compare notes, and get help, so I'm starting a Discord server. Please consider this a general space to talk about your "autonomous agent" projects. I'm happy to help you get the things I've built working, and I'd love to see what you're building. Join at discord.gg/dnhsk3pD2V.
@robertcowher 8 місяців тому
I've had a few people ask about a place to talk about these projects, compare notes, and get help, so I'm starting a Discord server. Please consider this a general space to talk about your "autonomous agent" projects. I'm happy to help you get the things I've built working, and I'd love to see what you're building. Join at discord.gg/dnhsk3pD2V.
@robertcowher 8 місяців тому
I've had a few people ask about a place to talk about these projects, compare notes, and get help, so I'm starting a Discord server. Please consider this a general space to talk about your "autonomous agent" projects. I'm happy to help you get the things I've built working, and I'd love to see what you're building. Join at discord.gg/dnhsk3pD2V.
@robertcowher 8 місяців тому
I've had a few people ask about a place to talk about these projects, compare notes, and get help, so I'm starting a Discord server. Please consider this a general space to talk about your "autonomous agent" projects. I'm happy to help you get the things I've built working, and I'd love to see what you're building. Join at discord.gg/dnhsk3pD2V.
@robertcowher 8 місяців тому
I've had a few people ask about a place to talk about these projects, compare notes, and get help, so I'm starting a Discord server. Please consider this a general space to talk about your "autonomous agent" projects. I'm happy to help you get the things I've built working, and I'd love to see what you're building. Join at discord.gg/dnhsk3pD2V.
@박보형학생공과대학기 9 місяців тому
Hi, thanks for the great videos. Despite following the videos exactly (I also added "observation = next_observation"), when I run main.py, the score does not increase and converges flat to a value of around 0.4, 0.5. I've given it enough training time, but the model isn't improving. I can't find what's wrong, can you help or advise me?
@robertcowher 8 місяців тому
Out of curiosity, how much training time did you give it, and did you put your code in a GitHub repo somewhere? I don't mind taking a quick look at it. If you'd rather share that information outside of the UA-cam comment section, I just set up a Discord server for discussion on these projects, and AI/autonomous agents more generally. You're welcome to join there for assistance as well. discord.gg/dnhsk3pD2V
@Ankara_pharao 9 місяців тому
Robert, can you please inform us on versions of python, tensorflow, keras and keras-rl modules you use? It would be even cooler if you post how you install this libraries (simple pip install keras-rl or install through git clone). Great videos and explanation but hard to reproduce on current versions of libraries.
@Ankara_pharao 9 місяців тому
It wolud be even cooler if i read the video description before asking the questions. Python 3.9.16 gym - 0.21.0 tensorflow: 2.10.0 ale-py: 0.7.5 keras-rl2: 1.0.5
@robertcowher 4 місяці тому
@@Ankara_pharao No problem at all(and sorry for the late reply). In newer videos, I'm including versions, and my requirements.txt, in the video, because you're not the only one to follow up and ask. There's also a discord now for Q&A. discord.gg/dnhsk3pD2V, and if you haven't watched anything of mine in a while, I just finished posting a series on Maze-solving with SAC, and another on robotic arms with TD3.
@WaltWhite71100 9 місяців тому
excellent tutorial, I've been following along and have got the model to produce logs, as well as save and load checkpoints, however, when looking at my tensorboard, rather than a real-time graph, my tensorboard consists of a long series of scores with a single data point in each. The only other notable difference I see, is on the top of your graph, there is a string of hyperparameters (after score 0) which I don't have on mine. Any ideas what may be happening on my side to treat the data points individually rather than as a graph?
@robertcowher 9 місяців тому
So, the string of hyperparameters is mostly for my own convenience. I tend to run lots of experiments, and they all take 10+ hours, so I forget what params I ran them with if I don't put together some kind of output. I'm able to reproduce something similar to what you're describing by removing the "global_step=i" variable from writer.add_scalar. That's the part that actually tells it which "step" it's on, and lets it show data over time. That line should end up looking something like this - writer.add_scalar(f"Score - {episode_identifier} alpha={alpha} - beta={beta} - batch_size={batch_size} - Critic AdamW - l1={layer1_size} l2={layer2_size} noise={starting_noise}", score, global_step=i). Something else you'll notice, I always go directly to the "Scalars" tab, and set smoothing to 0.95 or so. The "Time Series" tab has a bunch of other options I don't try to mess with.
@Mutrino 9 місяців тому
Looks like this will not work in Python 12. The package gymnasium (previously gym) is not available for Python 12
@robertcowher 4 місяці тому
Apologies for missing this back when you posted it but yes, most of my code is running on 3.10 or 3.11, and I haven't tested for compatibility any later. I should probably start including compatible python versions for new projects.
@homataha5626 9 місяців тому
Hi can you please share the code?
@homataha5626 9 місяців тому
Hi, Thanks for the videos can you share the code?
@robertcowher 9 місяців тому
My GitHub is a bit disorganized, but yes. Please note that this was the initial implementation I based the videos on, so it should be very close but may not quite be line for line. github.com/bobcowher/duelling-dqn-breakout-pytorch

Robert Cowher - DevOps, Python, AI

КОМЕНТАРІ