Action Space#
URBAN-SIM supports multiple types of action spaces depending on the robot embodiment. Each robot defines its own control interface by providing a corresponding action configuration class. This configuration determines the action dimension, control type, and scaling behavior used during RL training.
Taking COCO (a wheeled robot) as an example:
The action space is defined in the urbansim/primitives/robot/coco.py
.
COCO Action Binding#
When robot_name = “coco”, the following modules are loaded:
from urbansim.primitives.robot.coco import COCOVelocityActionsCfg
action_cfg = COCOVelocityActionsCfg
The class COCOVelocityActionsCfg defines a continuous 2D velocity command space, consisting of:
linear_velocity (forward/backward)
angular_velocity (rotation rate)
It is used in the full environment config:
@configclass
class EnvCfg(ManagerBasedRLEnvCfg):
...
actions = COCOVelocityActionsCfg()
More specifically, the action space is defined as:
class ClassicalCarAction(ActionTerm):
r"""Pre-trained policy action term.
This action term infers a pre-trained policy and applies the corresponding low-level actions to the robot.
The raw actions correspond to the commands for the pre-trained policy.
"""
cfg: ClassicalCarActionCfg
"""The configuration of the action term."""
def __init__(self, cfg: ClassicalCarActionCfg, env: ManagerBasedRLEnv) -> None:
# initialize the action term
super().__init__(cfg, env)
self.robot: Articulation = env.scene[cfg.asset_name]
self._counter = 0
self.last_wheel_angle = torch.zeros(self.num_envs, 1, device=self.device)
self.axle_names = ["base_to_front_axle_joint"]
self.wheel_names = ["front_left_wheel_joint","front_right_wheel_joint", "rear_left_wheel_joint", "rear_right_wheel_joint"]
self.shock_names = [".*shock_joint"]
self._raw_actions = torch.zeros(self.num_envs, 2, device=self.device)
# prepare low level actions
self.acceleration_action: JointVelocityAction = JointVelocityAction(JointVelocityActionCfg(asset_name="robot", joint_names=[".*_wheel_joint"], scale=10.0, use_default_offset=False), env)
self.steering_action: JointPositionAction = JointPositionAction(JointPositionActionCfg(asset_name="robot", joint_names=self.axle_names, scale=1., use_default_offset=True), env)
"""
Properties.
"""
@property
def action_dim(self) -> int:
return 2
@property
def raw_actions(self) -> torch.Tensor:
return self._raw_actions
@property
def processed_actions(self) -> torch.Tensor:
return self.raw_actions
"""
Operations.
"""
def process_actions(self, actions: torch.Tensor):
self._raw_actions[:] = actions
def apply_actions(self):
if self._counter % ACTION_INTERVAL == 0:
max_wheel_v = 4.
wheel_base = 1.5
radius_rear = 0.3
max_ang = 40 * torch.pi / 180
velocity = self.raw_actions[..., :1].clamp(0.0, max_wheel_v) / radius_rear
angular = self.raw_actions[..., 1:2].clamp(-max_ang, max_ang)
angular[angular.abs() < 0.05] = torch.zeros_like(angular[angular.abs() < 0.05])
R = wheel_base / torch.tan(angular)
left_wheel_angle = torch.arctan(wheel_base / (R - 0.5 * 1.8))
right_wheel_angle = torch.arctan(wheel_base / (R + 0.5 * 1.8))
self.steering_action.process_actions(((right_wheel_angle + left_wheel_angle) / 2.))
self.acceleration_action.process_actions(torch.cat([velocity, velocity, velocity, velocity], dim=1))
self.steering_action.apply_actions()
self.acceleration_action.apply_actions()
self._counter += 1
If you want to build your own action term, you can subclass ActionTerm and implement the required methods.
Action Application#
In each simulation step, the RL policy outputs a 2D vector [v_lin, v_ang]
, which is interpreted as:
v_lin
: forward speed (e.g., mapped tobase_command.v_target
)v_ang
: yaw rate (e.g., mapped tobase_command.w_target
)
This command is then used to drive the robot within the simulation loop.
Other Robot Types#
Other robot embodiments use different action configs:
Unitree Go2: GO2NavActionsCfg → joint velocity commands or high-level velocity
Unitree G1: G1NavActionsCfg → biped locomotion controls
These robots take pretrained neural networks as actions, which are applied to the robot’s joints or high-level velocity commands. More details can be found in the respective robot configuration files under urbansim/primitives/robot/ and we will provide pretrained model weights.