On the question of whether PAIRED does more than Domain Randomization: if you get a policy that adapts to all the environments proposed by DR, it might still fail to generalize to environments outside the domain that DR can generate, right? It could simply have memorized all the proposed environments. But with PAIRED we constrain the situations the agent encounters, and in that sense force it to learn skills that (hopefully) generalize better?
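To make the contrast concrete, here is a minimal toy sketch, not the talk's actual code. It assumes the adversary simply picks, from a fixed list of candidate environments, the one with the highest regret (antagonist return minus protagonist return), whereas in PAIRED the adversary is itself a learned policy rewarded with regret. The difficulty levels and return values are made up for illustration.

```python
import random

def regret(protagonist_return, antagonist_return):
    # Regret: how much better the antagonist does than the protagonist.
    return antagonist_return - protagonist_return

def domain_randomization(candidates, rng):
    # DR: sample an environment uniformly, ignoring how the agents perform.
    return rng.choice(candidates)

def regret_maximizing_choice(candidates, protagonist_returns, antagonist_returns):
    # Simplified stand-in for the PAIRED adversary: pick the highest-regret environment.
    return max(
        candidates,
        key=lambda env: regret(protagonist_returns[env], antagonist_returns[env]),
    )

# Hypothetical difficulty levels: 0 is trivial, 4 is unsolvable for both agents.
levels = [0, 1, 2, 3, 4]
protagonist_returns = {0: 1.0, 1: 0.9, 2: 0.4, 3: 0.0, 4: 0.0}
antagonist_returns = {0: 1.0, 1: 1.0, 2: 0.9, 3: 0.3, 4: 0.0}

sampled = domain_randomization(levels, random.Random(0))
chosen = regret_maximizing_choice(levels, protagonist_returns, antagonist_returns)
print("DR sampled level:", sampled)
print("Regret-maximizing level:", chosen)
```

In this toy setup the trivial level and the unsolvable level both have zero regret, so the regret-based choice lands on level 2, which the antagonist can solve but the protagonist cannot yet. DR would sample all five levels equally often.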
Good
Speaking nicely
Where's the code? #papersandtalkswithcode