Creating a Manager-Based Base Environment#

Environments bring together different aspects of the simulation, such as the scene, observation and action spaces, and reset events, to create a coherent interface for various applications. In Isaac Lab, manager-based environments are implemented as the envs.ManagerBasedEnv and envs.ManagerBasedRLEnv classes. The two classes are very similar, but envs.ManagerBasedRLEnv is geared toward reinforcement learning tasks and additionally handles rewards, terminations, curricula and command generation. The envs.ManagerBasedEnv class is useful for traditional robot control and does not contain rewards and terminations.

In this tutorial, we will look at the base class envs.ManagerBasedEnv and its corresponding configuration class envs.ManagerBasedEnvCfg for the manager-based workflow. We will use the cartpole environment from earlier to illustrate the different components in creating a new envs.ManagerBasedEnv environment.

The Code#

This tutorial corresponds to the create_cartpole_base_env.py script in the scripts/tutorials/03_envs directory.

Code for create_cartpole_base_env.py
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

"""
This script demonstrates how to create a simple environment with a cartpole. It combines the concepts of
scene, action, observation and event managers to create an environment.

.. code-block:: bash

    ./isaaclab.sh -p scripts/tutorials/03_envs/create_cartpole_base_env.py --num_envs 32

"""

"""Launch Isaac Sim Simulator first."""


import argparse

from isaaclab.app import AppLauncher

# add argparse arguments
parser = argparse.ArgumentParser(description="Tutorial on creating a cartpole base environment.")
parser.add_argument("--num_envs", type=int, default=16, help="Number of environments to spawn.")

# append AppLauncher cli args
AppLauncher.add_app_launcher_args(parser)
# parse the arguments
args_cli = parser.parse_args()

# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

"""Rest everything follows."""

import math
import torch

import isaaclab.envs.mdp as mdp
from isaaclab.envs import ManagerBasedEnv, ManagerBasedEnvCfg
from isaaclab.managers import EventTermCfg as EventTerm
from isaaclab.managers import ObservationGroupCfg as ObsGroup
from isaaclab.managers import ObservationTermCfg as ObsTerm
from isaaclab.managers import SceneEntityCfg
from isaaclab.utils import configclass

from isaaclab_tasks.manager_based.classic.cartpole.cartpole_env_cfg import CartpoleSceneCfg


@configclass
class ActionsCfg:
    """Action specifications for the environment."""

    joint_efforts = mdp.JointEffortActionCfg(asset_name="robot", joint_names=["slider_to_cart"], scale=5.0)


@configclass
class ObservationsCfg:
    """Observation specifications for the environment."""

    @configclass
    class PolicyCfg(ObsGroup):
        """Observations for policy group."""

        # observation terms (order preserved)
        joint_pos_rel = ObsTerm(func=mdp.joint_pos_rel)
        joint_vel_rel = ObsTerm(func=mdp.joint_vel_rel)

        def __post_init__(self) -> None:
            self.enable_corruption = False
            self.concatenate_terms = True

    # observation groups
    policy: PolicyCfg = PolicyCfg()


@configclass
class EventCfg:
    """Configuration for events."""

    # on startup
    add_pole_mass = EventTerm(
        func=mdp.randomize_rigid_body_mass,
        mode="startup",
        params={
            "asset_cfg": SceneEntityCfg("robot", body_names=["pole"]),
            "mass_distribution_params": (0.1, 0.5),
            "operation": "add",
        },
    )

    # on reset
    reset_cart_position = EventTerm(
        func=mdp.reset_joints_by_offset,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("robot", joint_names=["slider_to_cart"]),
            "position_range": (-1.0, 1.0),
            "velocity_range": (-0.1, 0.1),
        },
    )

    reset_pole_position = EventTerm(
        func=mdp.reset_joints_by_offset,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("robot", joint_names=["cart_to_pole"]),
            "position_range": (-0.125 * math.pi, 0.125 * math.pi),
            "velocity_range": (-0.01 * math.pi, 0.01 * math.pi),
        },
    )


@configclass
class CartpoleEnvCfg(ManagerBasedEnvCfg):
    """Configuration for the cartpole environment."""

    # Scene settings
    scene = CartpoleSceneCfg(num_envs=1024, env_spacing=2.5)
    # Basic settings
    observations = ObservationsCfg()
    actions = ActionsCfg()
    events = EventCfg()

    def __post_init__(self):
        """Post initialization."""
        # viewer settings
        self.viewer.eye = [4.5, 0.0, 6.0]
        self.viewer.lookat = [0.0, 0.0, 2.0]
        # step settings
        self.decimation = 4  # env step every 4 sim steps: 200Hz / 4 = 50Hz
        # simulation settings
        self.sim.dt = 0.005  # sim step every 5ms: 200Hz


def main():
    """Main function."""
    # parse the arguments
    env_cfg = CartpoleEnvCfg()
    env_cfg.scene.num_envs = args_cli.num_envs
    env_cfg.sim.device = args_cli.device
    # setup base environment
    env = ManagerBasedEnv(cfg=env_cfg)

    # simulate physics
    count = 0
    while simulation_app.is_running():
        with torch.inference_mode():
            # reset
            if count % 300 == 0:
                count = 0
                env.reset()
                print("-" * 80)
                print("[INFO]: Resetting environment...")
            # sample random actions
            joint_efforts = torch.randn_like(env.action_manager.action)
            # step the environment
            obs, _ = env.step(joint_efforts)
            # print current orientation of pole
            print("[Env 0]: Pole joint: ", obs["policy"][0][1].item())
            # update counter
            count += 1

    # close the environment
    env.close()


if __name__ == "__main__":
    # run the main function
    main()
    # close sim app
    simulation_app.close()

The Code Explained#

The base class envs.ManagerBasedEnv wraps around many intricacies of the simulation interaction and provides a simple interface for the user to run the simulation and interact with it. It is composed of the following components:

  • scene.InteractiveScene - The scene that is used for the simulation.

  • managers.ActionManager - The manager that handles the actions applied to the simulated assets.

  • managers.ObservationManager - The manager that computes the observations from the simulation state.

  • managers.EventManager - The manager that schedules operations (such as domain randomization) at specified simulation events, for instance, on startup or on reset.

By configuring these components, the user can create different variations of the same environment with minimal effort. In this tutorial, we will go through the different components of the envs.ManagerBasedEnv class and how to configure them to create a new environment.

Designing the scene#

The first step in creating a new environment is to configure its scene. For the cartpole environment, we will be using the scene from the previous tutorial. Thus, we omit the scene configuration here. For more details on how to configure a scene, see Using the Interactive Scene.

Defining actions#

In the previous tutorial, we directly input the action to the cartpole using the assets.Articulation.set_joint_effort_target() method. In this tutorial, we will use the managers.ActionManager to handle the actions.

The action manager can comprise multiple managers.ActionTerm terms. Each action term is responsible for applying control over a specific aspect of the environment. For instance, for a robotic arm, we can have two action terms: one for controlling the joints of the arm, and the other for controlling the gripper. This composition allows the user to define different control schemes for different aspects of the environment.

In the cartpole environment, we want to control the force applied to the cart to balance the pole. Thus, we will create an action term that controls the force applied to the cart.

@configclass
class ActionsCfg:
    """Action specifications for the environment."""

    joint_efforts = mdp.JointEffortActionCfg(asset_name="robot", joint_names=["slider_to_cart"], scale=5.0)
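
The cartpole needs only this single term, but the composition described above scales to richer robots. As a sketch, a hypothetical arm-with-gripper configuration might compose two terms like this (the asset name, joint-name patterns and command values are illustrative placeholders, not part of this tutorial's scene):

```python
# Hypothetical two-term action configuration: one term for the arm joints,
# one for the gripper. All names and values below are illustrative.
import isaaclab.envs.mdp as mdp
from isaaclab.utils import configclass


@configclass
class ArmGripperActionsCfg:
    """Sketch of an action configuration with separate arm and gripper terms."""

    # joint-position control for the arm joints (regex patterns select joints)
    arm_joints = mdp.JointPositionActionCfg(asset_name="robot", joint_names=["panda_joint.*"], scale=1.0)
    # binary open/close command mapped onto the finger joints
    gripper = mdp.BinaryJointPositionActionCfg(
        asset_name="robot",
        joint_names=["panda_finger.*"],
        open_command_expr={"panda_finger_.*": 0.04},
        close_command_expr={"panda_finger_.*": 0.0},
    )
```

At runtime, the action manager splits the flat action tensor across the terms in the order they are declared, so each term processes only its own slice.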

Defining observations#

While the scene defines the state of the environment, the observations define the states that are observable by the agent. These observations are used by the agent to make decisions on what actions to take. In Isaac Lab, the observations are computed by the managers.ObservationManager class.

Similar to the action manager, the observation manager can comprise multiple observation terms. These are further grouped into observation groups, which are used to define different observation spaces for the environment. For instance, for hierarchical control, we may want to define two observation groups: one for the low-level controller and the other for the high-level controller. It is assumed that all the observation terms in a group have the same dimensions.

For this tutorial, we define only one observation group, named "policy". While the name itself is not prescribed by the base class, a group with this name is required by various wrappers in Isaac Lab. We define a group by inheriting from the managers.ObservationGroupCfg class. This class collects the different observation terms and helps define common properties for the group, such as enabling noise corruption or concatenating the observations into a single tensor.

The individual terms are defined by inheriting from the managers.ObservationTermCfg class. This class takes in managers.ObservationTermCfg.func, which specifies the function or callable class that computes the observation for that term. It also includes parameters for defining the noise model, clipping, scaling, etc. However, we leave these parameters at their default values for this tutorial.

@configclass
class ObservationsCfg:
    """Observation specifications for the environment."""

    @configclass
    class PolicyCfg(ObsGroup):
        """Observations for policy group."""

        # observation terms (order preserved)
        joint_pos_rel = ObsTerm(func=mdp.joint_pos_rel)
        joint_vel_rel = ObsTerm(func=mdp.joint_vel_rel)

        def __post_init__(self) -> None:
            self.enable_corruption = False
            self.concatenate_terms = True

    # observation groups
    policy: PolicyCfg = PolicyCfg()
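
For reference, a term that enables the optional parameters left at their defaults above might look like the following sketch. The noise range, clip bounds and scale are illustrative values, not part of this tutorial's configuration:

```python
# Sketch of an observation term with the optional parameters filled in:
# additive uniform noise, clipping and scaling. Values are illustrative.
import isaaclab.envs.mdp as mdp
from isaaclab.managers import ObservationTermCfg as ObsTerm
from isaaclab.utils.noise import AdditiveUniformNoiseCfg as Unoise

joint_pos_noisy = ObsTerm(
    func=mdp.joint_pos_rel,
    noise=Unoise(n_min=-0.01, n_max=0.01),  # corrupt the reading with uniform noise
    clip=(-5.0, 5.0),  # clamp the observation to a bounded range
    scale=1.0,  # optional rescaling of the term
)
```

Note that noise corruption is only applied when the group's enable_corruption flag is set to True.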

Defining events#

At this point, we have defined the scene, actions and observations for the cartpole environment. The general idea for all these components is to define the configuration classes and then pass them to the corresponding managers. The event manager is no different.

The managers.EventManager class is responsible for events corresponding to changes in the simulation state. This includes resetting (or randomizing) the scene, randomizing physical properties (such as mass, friction, etc.), and varying visual properties (such as colors, textures, etc.). Each of these is specified through the managers.EventTermCfg class, which takes in managers.EventTermCfg.func, the function or callable class that performs the event.

Additionally, it expects the mode of the event, which specifies when the event term should be applied. It is possible to define your own modes; for this, you will need to adapt the ManagerBasedEnv class. However, out of the box, Isaac Lab provides three commonly used modes:

  • "startup" - Event that takes place only once at environment startup.

  • "reset" - Event that occurs on environment termination and reset.

  • "interval" - Events that are executed at a given interval, i.e., periodically after a certain number of steps.

For this example, we define an event that randomizes the pole's mass on startup. This is done only once, since the operation is expensive and we do not want to repeat it on every reset. We also create events that randomize the initial joint states of the cart and the pole at every reset.

@configclass
class EventCfg:
    """Configuration for events."""

    # on startup
    add_pole_mass = EventTerm(
        func=mdp.randomize_rigid_body_mass,
        mode="startup",
        params={
            "asset_cfg": SceneEntityCfg("robot", body_names=["pole"]),
            "mass_distribution_params": (0.1, 0.5),
            "operation": "add",
        },
    )

    # on reset
    reset_cart_position = EventTerm(
        func=mdp.reset_joints_by_offset,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("robot", joint_names=["slider_to_cart"]),
            "position_range": (-1.0, 1.0),
            "velocity_range": (-0.1, 0.1),
        },
    )

    reset_pole_position = EventTerm(
        func=mdp.reset_joints_by_offset,
        mode="reset",
        params={
            "asset_cfg": SceneEntityCfg("robot", joint_names=["cart_to_pole"]),
            "position_range": (-0.125 * math.pi, 0.125 * math.pi),
            "velocity_range": (-0.01 * math.pi, 0.01 * math.pi),
        },
    )
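
The configuration above uses only the "startup" and "reset" modes. As a sketch of the third mode, a hypothetical "interval" term that periodically pushes the cart with a random external force might look like this (the term, its firing interval and force range are illustrative and not part of this tutorial):

```python
# Hypothetical "interval" event: push the cart with a random external force
# every 10-15 simulated seconds. Values are illustrative.
import isaaclab.envs.mdp as mdp
from isaaclab.managers import EventTermCfg as EventTerm
from isaaclab.managers import SceneEntityCfg

push_cart = EventTerm(
    func=mdp.apply_external_force_torque,
    mode="interval",
    interval_range_s=(10.0, 15.0),  # sample the next trigger time from this range
    params={
        "asset_cfg": SceneEntityCfg("robot", body_names=["cart"]),
        "force_range": (-5.0, 5.0),
        "torque_range": (0.0, 0.0),
    },
)
```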

Tying it all together#

Having defined the scene and manager configurations, we can now define the environment configuration through the envs.ManagerBasedEnvCfg class. This class takes in the scene, action, observation and event configurations.

In addition to these, it also takes in envs.ManagerBasedEnvCfg.sim, which defines the simulation parameters such as the timestep, gravity, etc. This is initialized to default values but can be modified as needed. We recommend doing so by defining the __post_init__() method in the envs.ManagerBasedEnvCfg subclass, which is called after the configuration is initialized.

@configclass
class CartpoleEnvCfg(ManagerBasedEnvCfg):
    """Configuration for the cartpole environment."""

    # Scene settings
    scene = CartpoleSceneCfg(num_envs=1024, env_spacing=2.5)
    # Basic settings
    observations = ObservationsCfg()
    actions = ActionsCfg()
    events = EventCfg()

    def __post_init__(self):
        """Post initialization."""
        # viewer settings
        self.viewer.eye = [4.5, 0.0, 6.0]
        self.viewer.lookat = [0.0, 0.0, 2.0]
        # step settings
        self.decimation = 4  # env step every 4 sim steps: 200Hz / 4 = 50Hz
        # simulation settings
        self.sim.dt = 0.005  # sim step every 5ms: 200Hz
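
As a quick sanity check of the arithmetic in the comments above, the control rate follows directly from these two settings:

```python
# The environment advances the physics `decimation` times per env.step() call,
# so the control timestep is decimation * sim.dt.
sim_dt = 0.005  # physics timestep in seconds (200 Hz)
decimation = 4  # physics steps per environment step

control_dt = decimation * sim_dt  # 0.02 s per environment step
control_hz = 1.0 / control_dt  # 50 Hz control frequency

print(f"control step: {control_dt:.3f} s -> {control_hz:.0f} Hz")
```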

Running the simulation#

Lastly, we revisit the simulation execution loop. This is now much simpler since we have abstracted away most of the details into the environment configuration. We only need to call envs.ManagerBasedEnv.reset() to reset the environment and envs.ManagerBasedEnv.step() to step it. Both functions return the observations and an info dictionary, which may contain additional information provided by the environment. These can be used by an agent for decision-making.

The envs.ManagerBasedEnv class has no notion of terminations, since that concept is specific to episodic tasks. Thus, the user is responsible for defining the termination condition for the environment. In this tutorial, we reset the simulation at regular intervals.

def main():
    """Main function."""
    # parse the arguments
    env_cfg = CartpoleEnvCfg()
    env_cfg.scene.num_envs = args_cli.num_envs
    env_cfg.sim.device = args_cli.device
    # setup base environment
    env = ManagerBasedEnv(cfg=env_cfg)

    # simulate physics
    count = 0
    while simulation_app.is_running():
        with torch.inference_mode():
            # reset
            if count % 300 == 0:
                count = 0
                env.reset()
                print("-" * 80)
                print("[INFO]: Resetting environment...")
            # sample random actions
            joint_efforts = torch.randn_like(env.action_manager.action)
            # step the environment
            obs, _ = env.step(joint_efforts)
            # print current orientation of pole
            print("[Env 0]: Pole joint: ", obs["policy"][0][1].item())
            # update counter
            count += 1

    # close the environment
    env.close()

An important thing to note above is that the entire simulation loop is wrapped inside the torch.inference_mode() context manager. This is because the environment uses PyTorch operations under the hood, and we want to ensure that the simulation is not slowed down by the overhead of PyTorch's autograd engine and that gradients are not computed for the simulation operations.

The Code Execution#

To run the base environment made in this tutorial, you can use the following command:

./isaaclab.sh -p scripts/tutorials/03_envs/create_cartpole_base_env.py --num_envs 32

This should open a stage with a ground plane, light source, and cartpoles. The simulation should be playing with random actions on the cartpole. Additionally, it opens a UI window on the bottom right corner of the screen named "Isaac Lab". This window contains different UI elements that can be used for debugging and visualization.

result of create_cartpole_base_env.py

To stop the simulation, you can either close the window, or press Ctrl+C in the terminal where you started the simulation.

In this tutorial, we learned about the different managers that help define a base environment. We include more examples of defining the base environment in the scripts/tutorials/03_envs directory. For completeness, they can be run using the following commands:

# Floating cube environment with custom action term for PD control
./isaaclab.sh -p scripts/tutorials/03_envs/create_cube_base_env.py --num_envs 32

# Quadrupedal locomotion environment with a policy that interacts with the environment
./isaaclab.sh -p scripts/tutorials/03_envs/create_quadruped_base_env.py --num_envs 32

In the following tutorial, we will look at the envs.ManagerBasedRLEnv class and how to use it to create a Markovian Decision Process (MDP).