
[Question] How can I check the range of action values in PPO with rsl-rl? #1486

Open
H-Hisamichi opened this issue Dec 1, 2024 · 2 comments
Labels: question (Further information is requested)

Comments

@H-Hisamichi

Hello everyone,

I am working on training a control policy for a hexapod using PPO.
My robot has joints with very different ranges of motion, so I am trying to remap the actions from the policy.
However, the action values output by the PPO policy in rsl-rl seem to fall outside the range [-1, 1].

Where can I check the range of the action values from the policy?

Thank you!
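For reference, a minimal sketch of one place to log the raw values, assuming the direct workflow's _pre_physics_step hook (the _raw_actions attribute is only illustrative):

import torch

# Minimal sketch (assumes the Isaac Lab direct-workflow hook _pre_physics_step):
# log the raw policy output before any scaling or remapping is applied.
def _pre_physics_step(self, actions: torch.Tensor):
    # actions: (num_envs, num_actions) tensor straight from the rsl-rl policy
    self._raw_actions = actions.clone()
    print(f"raw action range: [{actions.min().item():.3f}, {actions.max().item():.3f}]")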

@H-Hisamichi H-Hisamichi changed the title [Question] What is the range of action values in RSL-RL PPO algorithm? [Question] How can I check the range of action values in PPO with rsl-rl? Dec 1, 2024
@RandomOakForest
Collaborator

It seems a normalization step may be missing. Could you share how you are setting up your rsl-rl wrapper? Thanks!
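For comparison, the usual rsl-rl environment setup in the Isaac Lab training script looks roughly like this (a sketch only; the task id and env_cfg are placeholders):

import gymnasium as gym

from omni.isaac.lab_tasks.utils.wrappers.rsl_rl import RslRlVecEnvWrapper

# Rough sketch of the usual rsl-rl environment setup
# ("Isaac-Velocity-Flat-Anymal-C-Direct-v0" and env_cfg are placeholders).
env = gym.make("Isaac-Velocity-Flat-Anymal-C-Direct-v0", cfg=env_cfg)
env = RslRlVecEnvWrapper(env)  # exposes the vectorized interface rsl-rl expects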

@RandomOakForest RandomOakForest added the question Further information is requested label Dec 6, 2024
@H-Hisamichi
Author

Hello @RandomOakForest,

I'm using the direct-workflow ANYmal-C demo config rsl_rl_ppo_cfg.py essentially as is (only the class and experiment names are changed):

# Copyright (c) 2022-2024, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

from omni.isaac.lab.utils import configclass

from omni.isaac.lab_tasks.utils.wrappers.rsl_rl import (
    RslRlOnPolicyRunnerCfg,
    RslRlPpoActorCriticCfg,
    RslRlPpoAlgorithmCfg,
)


@configclass
#class AnymalCFlatPPORunnerCfg(RslRlOnPolicyRunnerCfg):
class AT3RFlatPPORunnerCfg(RslRlOnPolicyRunnerCfg):
    num_steps_per_env = 24
    max_iterations = 500
    save_interval = 50
    experiment_name = "AT3R_flat_direct"
    empirical_normalization = False
    policy = RslRlPpoActorCriticCfg(
        init_noise_std=1.0,
        actor_hidden_dims=[128, 128, 128],
        critic_hidden_dims=[128, 128, 128],
        activation="elu",
    )
    algorithm = RslRlPpoAlgorithmCfg(
        value_loss_coef=1.0,
        use_clipped_value_loss=True,
        clip_param=0.2,
        entropy_coef=0.005,
        num_learning_epochs=5,
        num_mini_batches=4,
        learning_rate=1.0e-3,
        schedule="adaptive",
        gamma=0.99,
        lam=0.95,
        desired_kl=0.01,
        max_grad_norm=1.0,
    )


@configclass
#class AnymalCRoughPPORunnerCfg(RslRlOnPolicyRunnerCfg):
class AT3RRoughPPORunnerCfg(RslRlOnPolicyRunnerCfg):
    num_steps_per_env = 24
    max_iterations = 10000
    save_interval = 50
    experiment_name = "AT3R_rough_direct" # def: anymal_c_rough_direct
    empirical_normalization = False
    policy = RslRlPpoActorCriticCfg(
        init_noise_std=1.0,
        actor_hidden_dims=[512, 256, 128],
        critic_hidden_dims=[512, 256, 128],
        activation="elu",
    )
    algorithm = RslRlPpoAlgorithmCfg(
        value_loss_coef=1.0,
        use_clipped_value_loss=True,
        clip_param=0.2,
        entropy_coef=0.005,
        num_learning_epochs=5,
        num_mini_batches=4,
        learning_rate=1.0e-3,
        schedule="adaptive",
        gamma=0.99,
        lam=0.95,
        desired_kl=0.01,
        max_grad_norm=1.0,
    )
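As far as I can tell, nothing in this runner config bounds the policy output: the rsl-rl actor-critic samples actions from a Gaussian (with the standard deviation initialized by init_noise_std), so the raw values are not restricted to [-1, 1]. If that is the cause, the remapping would have to happen on the environment side. A minimal, illustrative sketch (function and argument names are placeholders, not Isaac Lab API):

import torch

# Illustrative sketch of environment-side remapping: squash the unbounded
# Gaussian actions into (-1, 1), then map them to each joint's range of motion.
def remap_actions(actions: torch.Tensor,
                  joint_lower: torch.Tensor,
                  joint_upper: torch.Tensor) -> torch.Tensor:
    squashed = torch.tanh(actions)                  # now in (-1, 1)
    mid = 0.5 * (joint_upper + joint_lower)         # per-joint midpoint
    half_range = 0.5 * (joint_upper - joint_lower)  # per-joint half range
    return mid + squashed * half_range              # per-joint targets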
