Thank you for your description. After setting the `last_action` for the first goal, you already obtain the next state, which could be used as input for the policy of the second task. Instead of reusing the action from task 1, wouldn’t it be better to sample a fresh action based on the task-specific policy for task 2, given the last state from task 1? This new action should be more appropriate for task 2.
Good observation, and the answer is simple enough: the environment returns different observation shapes for some of the goals. You can absolutely take the last state, instead of the last action, and generate a new action from the new policy, but it will be the wrong shape for some of the models and will crash the agent. I've been playing with solving that problem in other ways (for example, padding the observations so all goals/models have the same shape), but that didn't make it into this implementation.
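For anyone curious, here's a minimal sketch of that padding idea in Python. The helper name `pad_observation` and the bound `MAX_OBS_DIM` are made up for illustration; this isn't code from the actual implementation:

```python
import numpy as np

# Assumed upper bound on observation size across all goals (illustrative).
MAX_OBS_DIM = 32

def pad_observation(obs: np.ndarray, target_dim: int = MAX_OBS_DIM) -> np.ndarray:
    """Zero-pad a 1-D observation so every goal's model sees the same shape."""
    assert obs.shape[0] <= target_dim, "observation larger than padded size"
    padded = np.zeros(target_dim, dtype=obs.dtype)
    padded[: obs.shape[0]] = obs
    return padded

# With a common shape, the last state from task 1 could be fed to task 2's
# policy instead of reusing task 1's action:
#   action = task2_policy(pad_observation(last_state))
```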
@robertcowher Thank you, it's clear to me now. You talked about your omni agent in the last video; for that agent, you might need a large observation space made up of the observation spaces of all the objects in the environment.
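To make that concrete, here's a rough sketch of what such a combined space could look like with gymnasium spaces. The object names and dimensions are invented purely for illustration:

```python
import numpy as np
from gymnasium import spaces

# Hypothetical per-object observation spaces; the shapes are made up.
object_spaces = {
    "arm":   spaces.Box(low=-np.inf, high=np.inf, shape=(7,), dtype=np.float32),
    "block": spaces.Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float32),
    "door":  spaces.Box(low=-np.inf, high=np.inf, shape=(2,), dtype=np.float32),
}

# One "omni" observation space covering every object in the environment.
omni_space = spaces.Dict(object_spaces)

# Or a single flat vector, at the cost of losing the per-object structure:
flat_dim = sum(int(np.prod(s.shape)) for s in object_spaces.values())
flat_space = spaces.Box(low=-np.inf, high=np.inf, shape=(flat_dim,), dtype=np.float32)
```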
Hi, I want to ask how I can build a real robotic arm and train this on it instead of in simulation. I need to work with a real-world robotic arm. Can you guide me, please?
So what you're describing is a career journey, not a single project, and I'm not there yet, but I can describe the path. If you wanted to do something like this in the real world, your path would be:
1) Learn ROS, to control real-world robots.
2) Learn how to physically build a robot arm.
3) Learn how to simulate your robot arm in ROS and Gazebo.
4) Learn how to port your simulated arm into a Gym-like environment for reinforcement learning (see the sketch after this list).
5) Update this approach to work with your new environment and train the robot in simulation.
6) Take that trained model and move it to your physical robot to complete training.
If you're ready to start working with real robots, Antonio Brandi's Udemy courses are a great place to start - www.udemy.com/course/robotics-and-ros-2-learn-by-doing-manipulators
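For step 4, here's a bare-bones sketch of what wrapping a simulated arm as a Gym-style environment can look like, using the gymnasium API. The joint count, reward, and dynamics are placeholders; in a real setup, step() and reset() would talk to ROS/Gazebo:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class SimArmEnv(gym.Env):
    """Skeleton environment around a simulated arm (placeholder physics)."""

    def __init__(self, n_joints: int = 6):  # joint count is an assumption
        super().__init__()
        self.n_joints = n_joints
        self.action_space = spaces.Box(-1.0, 1.0, shape=(n_joints,), dtype=np.float32)
        self.observation_space = spaces.Box(-np.pi, np.pi, shape=(n_joints,), dtype=np.float32)
        self.state = np.zeros(n_joints, dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.state = np.zeros(self.n_joints, dtype=np.float32)
        return self.state.copy(), {}

    def step(self, action):
        # Placeholder dynamics: in practice, publish the command to the
        # simulator and read the joint states back.
        self.state = np.clip(self.state + 0.05 * action, -np.pi, np.pi)
        reward = -float(np.linalg.norm(self.state))  # toy reward: stay near zero
        return self.state.copy(), reward, False, False, {}
```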
@robertcowher Thanks a lot, sir. It means a lot 🙏🙏🙏🙏😊😊