Recording video clips during training#

Isaac Lab supports recording video clips during training using the gymnasium.wrappers.RecordVideo class. When the --video flag is enabled, Isaac Lab captures a perspective view of the scene. If a Kit or Newton visualizer is active, that visualizer selects the video backend by default. Otherwise, the backend is chosen automatically from the active physics and renderer stack: an Isaac Sim Kit camera or a Newton GL headless viewer.

This feature can be enabled by installing ffmpeg and using the following command line arguments with the training script:

  • --video: enables video recording during training

  • --video_length: length of each recorded video (in steps)

  • --video_interval: interval between each video recording (in steps)

Note that enabling recording also enables rendering during training, which slows down both startup and runtime performance.

Example usage:

python scripts/reinforcement_learning/rl_games/train.py --task=Isaac-Cartpole-v0 --headless --video --video_length 100 --video_interval 500

The recorded videos will be saved in the same directory as the training checkpoints, under IsaacLab/logs/<rl_workflow>/<task>/<run>/videos/train.
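Conceptually, `--video_length` and `--video_interval` map onto the `step_trigger` and `video_length` arguments of `gymnasium.wrappers.RecordVideo`. The sketch below is illustrative (the exact wrapping code in the training scripts may differ); the wrapper starts a new clip whenever the trigger fires and records `video_length` frames from that point:

```python
# How --video_length / --video_interval translate into RecordVideo arguments.
video_length = 100    # --video_length: frames per clip
video_interval = 500  # --video_interval: steps between clip starts

def step_trigger(step: int) -> bool:
    """Start a new recording every `video_interval` environment steps."""
    return step % video_interval == 0

# A training script would then wrap the environment roughly like:
#
#   env = gymnasium.wrappers.RecordVideo(
#       env,
#       video_folder="logs/<rl_workflow>/<task>/<run>/videos/train",
#       step_trigger=step_trigger,
#       video_length=video_length,
#   )
```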

Overview#

The video recording feature is implemented using the VideoRecorder class. This class is responsible for resolving the video backend from the scene, capturing the video frames, and saving them to a file.

  • VideoRecorderCfg (isaaclab.envs.utils.video_recorder_cfg) holds resolution, backend source, and world-space perspective parameters eye and lookat (defaults to a diagonal view of the scene).

  • VideoRecorder (isaaclab.envs.utils.video_recorder) picks a video backend from the scene (Kit vs Newton GL), builds the matching low-level capture object, and returns RGB frames via render_rgb_array().

  • Direct RL, Direct MARL and manager-based RL environments copy the task’s ViewerCfg eye and lookat into those fields before the recorder is constructed, so training clips align with the task’s intended viewport when origin_type is "world".
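The copy step described above can be sketched with simplified stand-in dataclasses (illustrative only; the real `ViewerCfg` and `VideoRecorderCfg` live in `isaaclab.envs`):

```python
from dataclasses import dataclass

# Simplified stand-ins for the real config classes (illustrative only).
@dataclass
class ViewerCfg:
    origin_type: str = "world"
    eye: tuple[float, float, float] = (5.0, 5.0, 5.0)
    lookat: tuple[float, float, float] = (0.0, 0.0, 1.0)

@dataclass
class VideoRecorderCfg:
    eye: tuple[float, float, float] = (7.5, 7.5, 7.5)
    lookat: tuple[float, float, float] = (0.0, 0.0, 0.0)

def sync_recorder_from_viewer(viewer: ViewerCfg, recorder: VideoRecorderCfg) -> None:
    """Copy the task viewport into the recorder, as the RL base classes do."""
    # Only world-space viewer coordinates are meaningful for the recorder camera.
    if viewer.origin_type == "world":
        recorder.eye = viewer.eye
        recorder.lookat = viewer.lookat
```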

Configuration: VideoRecorderCfg#

The dataclass lives in isaaclab.envs.utils.video_recorder_cfg. Fields eye and lookat are the perspective camera position and target in meters.



@configclass
class VideoRecorderCfg:
    """Configuration for :class:`~isaaclab.envs.utils.video_recorder.VideoRecorder`."""

    class_type: type = VideoRecorder
    """Recorder class to instantiate; must accept ``(cfg, scene)``."""

    env_render_mode: str | None = None
    """Gym render mode forwarded from the environment constructor (``"rgb_array"`` when ``--video`` is active).

    Set automatically by the environment base classes; do not set manually.
    """

    eye: tuple[float, float, float] = (7.5, 7.5, 7.5)
    """Perspective camera position in world space (metres).

    Direct RL / MARL and manager-based RL environments overwrite this from
    :attr:`~isaaclab.envs.common.ViewerCfg.eye` before recording so ``--video`` matches the
    task viewport for both Kit (PhysX / Isaac RTX) and Newton GL (Newton / OVRTX / etc.).
    """

    lookat: tuple[float, float, float] = (0.0, 0.0, 0.0)
    """Perspective camera look-at target in world space (metres). Set from ``ViewerCfg.lookat`` at env init."""

    backend_source: Literal["visualizer", "renderer"] = "visualizer"
    """Source used to resolve the video capture backend.

    ``"visualizer"`` records from the active Kit or Newton visualizer when one is enabled, and falls back to the
    physics/renderer stack otherwise. ``"renderer"`` ignores active visualizers and records from the backend implied by
    the physics/renderer stack.
    """

    window_width: int = 1280
    """Width in pixels of the recorded frame."""

    window_height: int = 720
    """Height in pixels of the recorded frame."""

Task framing: ViewerCfg#

Tasks define the interactive viewer with ViewerCfg. The eye and lookat tuples are the same values the RL base classes copy into VideoRecorderCfg (see below). If your task uses origin_type="world", those tuples are world-space positions and match what the perspective recorder expects.



def _viewer_cfg_value_matches_default(current: object, default: object) -> bool:
    """Return True if ``current`` matches the dataclass field default (including list/tuple equivalence)."""
    if current == default:
        return True
    if isinstance(current, (list, tuple)) and isinstance(default, (list, tuple)):
        if len(current) != len(default):
            return False
        return all(c == d for c, d in zip(current, default))
    return False
Backend selection: Kit vs Newton GL#

VideoRecorder resolves the implementation from the live InteractiveScene. With the default VideoRecorderCfg.backend_source = "visualizer", an active --visualizer kit selects the Kit path (omni.replicator on /OmniverseKit_Persp), and an active --visualizer newton selects the Newton GL path. If both visualizers are active, Kit takes precedence and only one --video stream is recorded. Rerun records .rrd replay data through the Rerun visualizer rather than producing --video clips, and Viser does not currently provide a --video recording backend.

Set VideoRecorderCfg.backend_source = "renderer" to ignore active visualizers and choose from the physics/renderer stack instead. In that mode, PhysX physics (presets=physx,...) or Isaac RTX (presets=isaac_rtx_renderer,...) selects the Kit path. Newton physics (presets=newton_mjwarp,...) or the Newton Warp renderer (presets=newton_renderer,...) selects the Newton GL path when no Kit signal is present. OVRTX (presets=ovrtx_renderer,... from isaaclab_ov) can pair with IsaacSim or Newton physics; in that case the video backend is selected via the physics preset. If both Kit and Newton GL signals are present, the Kit path is chosen.
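These selection rules can be condensed into a standalone sketch (hypothetical helper; the actual implementation, excerpted below, inspects the live scene rather than raw strings):

```python
def resolve_video_backend(visualizers: list[str], presets: list[str]) -> str:
    """Illustrative restatement of the video-backend selection rules."""
    # 1) An active Kit or Newton visualizer wins; Kit takes precedence.
    for preferred in ("kit", "newton"):
        if preferred in visualizers:
            return "kit" if preferred == "kit" else "newton_gl"
    # 2) Otherwise fall back to the physics/renderer preset stack.
    #    (viser/rerun visualizers have no capture API and fall through to here.)
    use_kit = any(p.startswith(("physx", "isaac_rtx")) for p in presets)
    use_newton_gl = any(p.startswith("newton") for p in presets)
    if use_kit:  # Kit wins when both signals are present
        return "kit"
    if use_newton_gl:
        return "newton_gl"
    raise RuntimeError("No supported --video backend detected")
```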

import logging
from typing import Literal

from isaaclab.scene import InteractiveScene

from .video_recorder_cfg import VideoRecorderCfg

logger = logging.getLogger(__name__)

_VideoBackend = Literal["kit", "newton_gl"]

# visualizer types that map to a supported video backend.
# viser and rerun are intentionally absent - they have no video-capture API.
_VISUALIZER_TO_VIDEO_BACKEND: dict[str, _VideoBackend] = {
    "kit": "kit",
    "newton": "newton_gl",
}


def _resolve_video_backend(
    scene: InteractiveScene, backend_source: str = "visualizer"
) -> tuple[_VideoBackend, str | None]:
    """Return ``(backend, matched_visualizer_type)`` for the active scene.

    ``matched_visualizer_type`` is ``"kit"`` / ``"newton"`` when a visualizer drove the
    selection, or ``None`` when the physics/renderer preset stack was used instead.
    """
    # Prefer the visualizer backend when --visualizer is active alongside --video.
    visualizer_types: list[str] = scene.sim.resolve_visualizer_types() if backend_source == "visualizer" else []
    if visualizer_types:
        # kit takes priority when multiple visualizers are active
        for preferred in ("kit", "newton"):
            if preferred in visualizer_types:
                backend = _VISUALIZER_TO_VIDEO_BACKEND[preferred]
                logger.debug("[VideoRecorder] Using '%s' backend from active '%s' visualizer.", backend, preferred)
                return backend, preferred
        # only unsupported visualizer types (viser, rerun) are active.
        logger.warning(
            "[VideoRecorder] Active visualizer(s) %s do not support video capture; "
            "falling back to physics/renderer stack detection.",
            visualizer_types,
        )

    # fall back to physics/renderer preset stack detection.
    sim = scene.sim
    physics_name = sim.physics_manager.__name__.lower()
    renderer_types: list[str] = scene._sensor_renderer_types()

    use_kit = "physx" in physics_name or "isaac_rtx" in renderer_types
    use_newton_gl = "newton" in physics_name or "newton_warp" in renderer_types

    if use_kit:
        return "kit", None
    if use_newton_gl:
        return "newton_gl", None
    raise RuntimeError(
        "Video recording (--video) requires a supported backend: "
        "PhysX or Isaac RTX renderer (Kit camera), or Newton physics / Newton Warp renderer (GL viewer). "
        "No supported backend detected; do not use --video for this setup."
    )

Construction and dispatch#

When env_render_mode is "rgb_array" (as when wrappers or scripts request RGB frames for video), the recorder instantiates the backend-specific helper and passes through eye, lookat, and window size.


def _sync_camera_from_visualizer(
    scene: InteractiveScene,
    visualizer_type: str,
    cfg: VideoRecorderCfg,
) -> None:
    """Overwrite ``cfg.eye`` and ``cfg.lookat`` from the active visualizer.

    Args:
        scene: The interactive scene that owns the sim context.
        visualizer_type: The visualizer driving backend selection (``"kit"`` or ``"newton"``).
        cfg: The video recorder configuration to update in place.
    """
Customising the camera view#

When --video is passed, the recording camera uses the same configured position and look-at target as the active Kit or Newton visualizer when that visualizer drives backend selection. Otherwise, the defaults come from ViewerCfg:

  • eye = (7.5, 7.5, 7.5) — camera position in world space (metres)

  • lookat = (0.0, 0.0, 0.0) — camera look-at target in world space (metres)

  • Resolution 1280x720

To change the recording angle without a visualizer, override the viewer field in your task’s environment config. The RL base classes automatically copy eye and lookat into VideoRecorderCfg before recording starts (when origin_type is "world"), so the video clip uses the same configured viewpoint as the interactive viewport:

from isaaclab.envs import ManagerBasedRLEnvCfg
from isaaclab.envs.common import ViewerCfg
from isaaclab.utils import configclass

@configclass
class MyTaskCfg(ManagerBasedRLEnvCfg):
    viewer: ViewerCfg = ViewerCfg(
        eye=(5.0, 5.0, 5.0),
        lookat=(0.0, 0.0, 1.0),
    )

Summary#

| Stack example (presets=...) | Video backend | Capture mechanism |
| --- | --- | --- |
| physx,... or isaac_rtx_renderer,... | Kit ("kit") | /OmniverseKit_Persp + Replicator RGB |
| newton_mjwarp,... or newton_renderer,... (no Kit signals) | Newton GL ("newton_gl") | newton.viewer.ViewerGL on the SDP Newton model |
| newton_mjwarp,...,ovrtx_renderer,... (OVRTX + Newton physics) | Newton GL ("newton_gl") | newton.viewer.ViewerGL on the SDP Newton model |
| --visualizer kit with default backend_source | Kit ("kit") | Visualizer eye / lookat copied to /OmniverseKit_Persp + Replicator RGB |
| --visualizer newton with default backend_source | Newton GL ("newton_gl") | Visualizer eye / lookat initially, then live Newton viewer camera sync per frame |

See also#