Thanks for the video, very helpful for getting up to speed on this topic. Question: if one wants to use distributional statistics other than the expectation, is it just a matter of changing the policy from the argmax of the means of the action return distributions to the argmax of some risk-adjusted statistic?
What you describe in your question sounds like dynamic risk. If you are interested in dynamic risk, its result is not easily interpretable with respect to the cumulative return, but because it is convenient, people just apply it by replacing the expectation with the risk statistic. However, if you are concerned with a risk statistic over the cumulative return (static risk), other measures may not satisfy the "tower rule" or "positive homogeneity", so you may not be able to simply swap the statistic of interest in the policy optimization. If a risk statistic is positively homogeneous, convex, and satisfies the tower rule, then it may be used directly. However, I am not aware of any such risk statistics other than "min", "mean", and "max".
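To make the tower-rule point concrete, here is a toy check (my own made-up numbers, not from the video). It assumes a two-stage problem with two equally likely branches, each with two equally likely outcomes, and uses a simple lower-tail CVaR as the example risk statistic. The helper names `cvar`, `nested`, and `static` are hypothetical. "Nested" applies the statistic per branch and then again across branches (the dynamic style); "static" applies it once to the full return distribution.

```python
import numpy as np

def cvar(samples, alpha):
    """Mean of the worst alpha-fraction of equally weighted samples."""
    s = np.sort(np.asarray(samples, dtype=float))
    n_tail = max(1, int(np.ceil(alpha * len(s))))
    return float(s[:n_tail].mean())

# Toy assumption: two equally likely branches, two equally likely outcomes each.
branches = [np.array([0.0, 10.0]), np.array([4.0, 6.0])]
all_outcomes = np.concatenate(branches)

def nested(stat):
    # Apply the statistic inside each branch, then across the branch values.
    return stat([stat(b) for b in branches])

def static(stat):
    # Apply the statistic once to the full distribution of outcomes.
    return stat(all_outcomes)

stats = {
    "mean": lambda x: float(np.mean(x)),
    "min": lambda x: float(np.min(x)),
    "max": lambda x: float(np.max(x)),
    "cvar_0.5": lambda x: cvar(x, 0.5),
}

for name, stat in stats.items():
    print(f"{name}: nested={nested(stat)}, static={static(stat)}")
```

In this toy case mean, min, and max give the same value either way, while the nested CVaR (0.0) differs from the static CVaR (2.0), which is the kind of mismatch I mean.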
@@monkiedeinhau557 thanks for your reply. My question was more about dynamic risk, for example applying the standard deviation, which can indirectly act as a measure of uncertainty.
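For what it's worth, a minimal sketch of that per-step substitution, assuming each action's return distribution is available as a handful of equally weighted quantile samples. The function `select_action`, the penalty weight `k`, and the tail level `alpha` are hypothetical names and values chosen for illustration, not anything from the video.

```python
import numpy as np

def select_action(quantiles, statistic="mean", k=1.0, alpha=0.25):
    """Pick the action maximizing a statistic of its return distribution.

    quantiles: array of shape (n_actions, n_quantiles), assumed to hold
    equally weighted samples of each action's return distribution.
    """
    q = np.sort(np.asarray(quantiles, dtype=float), axis=1)
    if statistic == "mean":
        scores = q.mean(axis=1)                       # risk-neutral
    elif statistic == "mean_std":
        scores = q.mean(axis=1) - k * q.std(axis=1)   # penalize spread
    elif statistic == "cvar":
        n_tail = max(1, int(np.ceil(alpha * q.shape[1])))
        scores = q[:, :n_tail].mean(axis=1)           # mean of the worst tail
    else:
        raise ValueError(f"unknown statistic: {statistic}")
    return int(np.argmax(scores)), scores

# Toy numbers: action 0 has the higher mean but a much wider spread.
quantiles = np.array([
    [-10.0, 0.0, 10.0, 20.0],
    [2.0, 3.0, 4.0, 5.0],
])

for name in ("mean", "mean_std", "cvar"):
    action, scores = select_action(quantiles, name)
    print(name, action, scores)
```

With these toy numbers the mean picks action 0, while both the standard-deviation penalty and the tail statistic pick action 1. Note that this is the dynamic-style swap applied at each decision; it says nothing about the risk of the cumulative return.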
"The video content is really wonderful, congrat…