Recording video clips during training#
Isaac Lab supports recording video clips during training using the
gymnasium.wrappers.RecordVideo class.
When the --video flag is enabled, Isaac Lab captures a perspective view of the scene. If a Kit or
Newton visualizer is active, that visualizer selects the video backend by default. Otherwise, the
backend is chosen automatically from the active physics and renderer stack: an Isaac Sim Kit camera or
a Newton GL headless viewer.
This feature can be enabled by installing ffmpeg and using the following command line arguments with the training
script:
- --video: enables video recording during training
- --video_length: length of each recorded video (in steps)
- --video_interval: interval between each video recording (in steps)
Note that enabling recording is equivalent to enabling rendering during training, which will slow down both startup and runtime performance.
Example usage:
python scripts/reinforcement_learning/rl_games/train.py --task=Isaac-Cartpole-v0 --headless --video --video_length 100 --video_interval 500
The recorded videos will be saved in the same directory as the training checkpoints, under
IsaacLab/logs/<rl_workflow>/<task>/<run>/videos/train.
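How the two length/interval flags interact can be sketched as a step trigger of the kind `gymnasium.wrappers.RecordVideo` accepts. This is a simplified illustration; the exact trigger Isaac Lab constructs lives in the training scripts:

```python
# Hypothetical values mirroring the command line above.
video_length = 100    # --video_length: steps captured per clip
video_interval = 500  # --video_interval: steps between clip starts


def step_trigger(step: int) -> bool:
    """Start a new clip every ``video_interval`` environment steps."""
    return step % video_interval == 0


# The training scripts pass a trigger of this shape to the wrapper, roughly:
#   env = gymnasium.wrappers.RecordVideo(
#       env, video_folder=..., step_trigger=step_trigger, video_length=video_length
#   )
```

With these values, a 100-step clip begins at steps 0, 500, 1000, and so on.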
Overview#
The video recording feature is implemented using the VideoRecorder class. This class is responsible for resolving the video backend from the scene, capturing the video frames, and saving them to a file.
- VideoRecorderCfg (isaaclab.envs.utils.video_recorder_cfg) holds resolution, backend source, and world-space perspective parameters eye and lookat (defaults to a diagonal view of the scene).
- VideoRecorder (isaaclab.envs.utils.video_recorder) picks a video backend from the scene (Kit vs Newton GL), builds the matching low-level capture object, and returns RGB frames via render_rgb_array().
- Direct RL, Direct MARL and manager-based RL environments copy the task’s ViewerCfg eye and lookat into those fields before the recorder is constructed, so training clips align with the task’s intended viewport when origin_type is "world".
Configuration: VideoRecorderCfg#
The dataclass lives in isaaclab.envs.utils.video_recorder_cfg. Fields eye and lookat are
the perspective camera position and target in meters.
@configclass
class VideoRecorderCfg:
    """Configuration for :class:`~isaaclab.envs.utils.video_recorder.VideoRecorder`."""

    class_type: type = VideoRecorder
    """Recorder class to instantiate; must accept ``(cfg, scene)``."""

    env_render_mode: str | None = None
    """Gym render mode forwarded from the environment constructor (``"rgb_array"`` when ``--video`` is active).

    Set automatically by the environment base classes; do not set manually.
    """

    eye: tuple[float, float, float] = (7.5, 7.5, 7.5)
    """Perspective camera position in world space (metres).

    Direct RL / MARL and manager-based RL environments overwrite this from
    :attr:`~isaaclab.envs.common.ViewerCfg.eye` before recording so ``--video`` matches the
    task viewport for both Kit (PhysX / Isaac RTX) and Newton GL (Newton / OVRTX / etc.).
    """

    lookat: tuple[float, float, float] = (0.0, 0.0, 0.0)
    """Perspective camera look-at target in world space (metres). Set from ``ViewerCfg.lookat`` at env init."""

    backend_source: Literal["visualizer", "renderer"] = "visualizer"
    """Source used to resolve the video capture backend.

    ``"visualizer"`` records from the active Kit or Newton visualizer when one is enabled, and falls back to the
    physics/renderer stack otherwise. ``"renderer"`` ignores active visualizers and records from the backend implied by
    the physics/renderer stack.
    """

    window_width: int = 1280
    """Width in pixels of the recorded frame."""

    window_height: int = 720
    """Height in pixels of the recorded frame."""
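To make the field semantics concrete, here is a minimal stand-in built with plain `dataclasses` (the real class uses `@configclass`; field names and defaults follow the listing above, but this sketch is illustration only):

```python
from dataclasses import dataclass


@dataclass
class VideoRecorderCfgSketch:
    """Plain-dataclass stand-in for ``VideoRecorderCfg`` (illustration only)."""

    eye: tuple = (7.5, 7.5, 7.5)
    lookat: tuple = (0.0, 0.0, 0.0)
    backend_source: str = "visualizer"
    window_width: int = 1280
    window_height: int = 720


# Record a 1080p clip from a custom angle, ignoring any active visualizer.
cfg = VideoRecorderCfgSketch(
    eye=(4.0, 4.0, 2.0),
    backend_source="renderer",
    window_width=1920,
    window_height=1080,
)
```

Unset fields keep their defaults, so the look-at target above remains the scene origin.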
Task framing: ViewerCfg#
Tasks define the interactive viewer with ViewerCfg. The eye and
lookat tuples are the same values the RL base classes copy into VideoRecorderCfg (see below).
If your task uses origin_type="world", those tuples are world-space positions and match what the
perspective recorder expects.
def _viewer_cfg_value_matches_default(current: object, default: object) -> bool:
    """Return True if ``current`` matches the dataclass field default (including list/tuple equivalence)."""
    if current == default:
        return True
    if isinstance(current, (list, tuple)) and isinstance(default, (list, tuple)):
        if len(current) != len(default):
            return False
        # compare element-wise so a list value matches an equal tuple default
        return all(a == b for a, b in zip(current, default))
    return False
Backend selection: Kit vs Newton GL#
VideoRecorder resolves the implementation from the live InteractiveScene.
With the default VideoRecorderCfg.backend_source = "visualizer", an active --visualizer kit
selects the Kit path (omni.replicator on /OmniverseKit_Persp), and an active
--visualizer newton selects the Newton GL path. If both visualizers are active, Kit takes
precedence and only one --video stream is recorded. Rerun records .rrd replay data through
the Rerun visualizer rather than producing --video clips, and Viser does not currently provide a
--video recording backend.
Set VideoRecorderCfg.backend_source = "renderer" to ignore active visualizers and choose from the
physics/renderer stack instead. In that mode, PhysX physics (presets=physx,...) or Isaac RTX
(presets=isaac_rtx_renderer,...) selects the Kit path. Newton physics (presets=newton_mjwarp,...) or
the Newton Warp renderer (presets=newton_renderer,...) selects the Newton GL path when no Kit
signal is present. OVRTX (presets=ovrtx_renderer,... from isaaclab_ov) can pair with IsaacSim
or Newton physics; in that case the video backend is selected via the physics preset. If both Kit and
Newton GL signals are present, the Kit path is chosen.
import logging
from typing import Literal

from isaaclab.scene import InteractiveScene

from .video_recorder_cfg import VideoRecorderCfg

logger = logging.getLogger(__name__)

_VideoBackend = Literal["kit", "newton_gl"]

# visualizer types that map to a supported video backend.
# viser and rerun are intentionally absent - they have no video-capture API.
_VISUALIZER_TO_VIDEO_BACKEND: dict[str, _VideoBackend] = {
    "kit": "kit",
    "newton": "newton_gl",
}
def _resolve_video_backend(
    scene: InteractiveScene, backend_source: str = "visualizer"
) -> tuple[_VideoBackend, str | None]:
    """Return ``(backend, matched_visualizer_type)`` for the active scene.

    ``matched_visualizer_type`` is ``"kit"`` / ``"newton"`` when a visualizer drove the
    selection, or ``None`` when the physics/renderer preset stack was used instead.
    """
    # Prefer the visualizer backend when --visualizer is active alongside --video.
    visualizer_types: list[str] = scene.sim.resolve_visualizer_types() if backend_source == "visualizer" else []
    if visualizer_types:
        # kit takes priority when multiple visualizers are active
        for preferred in ("kit", "newton"):
            if preferred in visualizer_types:
                backend = _VISUALIZER_TO_VIDEO_BACKEND[preferred]
                logger.debug("[VideoRecorder] Using '%s' backend from active '%s' visualizer.", backend, preferred)
                return backend, preferred
        # only unsupported visualizer types (viser, rerun) are active.
        logger.warning(
            "[VideoRecorder] Active visualizer(s) %s do not support video capture; "
            "falling back to physics/renderer stack detection.",
            visualizer_types,
        )
    # fall back to physics/renderer preset stack detection.
    sim = scene.sim
    physics_name = sim.physics_manager.__name__.lower()
    renderer_types: list[str] = scene._sensor_renderer_types()
    use_kit = "physx" in physics_name or "isaac_rtx" in renderer_types
    use_newton_gl = "newton" in physics_name or "newton_warp" in renderer_types
    if use_kit:
        return "kit", None
    if use_newton_gl:
        return "newton_gl", None
    raise RuntimeError(
        "Video recording (--video) requires a supported backend: "
        "PhysX or Isaac RTX renderer (Kit camera), or Newton physics / Newton Warp renderer (GL viewer). "
        "No supported backend detected; do not use --video for this setup."
    )
Construction and dispatch#
When env_render_mode is "rgb_array" (as when wrappers or scripts request RGB frames for
video), the recorder instantiates the backend-specific helper and passes through eye, lookat,
and window size.
def _sync_camera_from_visualizer(
    scene: InteractiveScene,
    visualizer_type: str,
    cfg: VideoRecorderCfg,
) -> None:
    """Overwrite ``cfg.eye`` and ``cfg.lookat`` from the active visualizer.

    Args:
        scene: The interactive scene that owns the sim context.
        visualizer_type: The matched visualizer type (``"kit"`` or ``"newton"``).
        cfg: The video recorder configuration to update in place.
    """
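The dispatch step described above can be sketched as a lookup from the resolved backend to a capture helper. The class names here are hypothetical stand-ins for the internal Kit and Newton GL helpers:

```python
class _CaptureSketch:
    """Minimal stand-in for a backend-specific capture helper."""

    def __init__(self, eye, lookat, width, height):
        self.eye, self.lookat = eye, lookat
        self.width, self.height = width, height


class KitCaptureSketch(_CaptureSketch):
    backend = "kit"  # stands in for omni.replicator capture on /OmniverseKit_Persp


class NewtonGLCaptureSketch(_CaptureSketch):
    backend = "newton_gl"  # stands in for the Newton GL headless viewer


def build_capture(backend: str, eye, lookat, width: int, height: int) -> _CaptureSketch:
    """Pick the helper for ``backend`` and pass through framing and frame size."""
    impl = {"kit": KitCaptureSketch, "newton_gl": NewtonGLCaptureSketch}[backend]
    return impl(eye, lookat, width, height)
```

The table-driven dispatch keeps the recorder itself backend-agnostic: it only ever talks to the helper's shared interface.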
Customising the camera view#
When --video is passed and an active Kit or Newton visualizer drives backend selection, the
recording camera adopts that visualizer's configured position and look-at target. Otherwise, the
defaults come from
ViewerCfg:
- eye = (7.5, 7.5, 7.5): camera position in world space (metres)
- lookat = (0.0, 0.0, 0.0): camera look-at target in world space (metres)
- Resolution: 1280x720
To change the recording angle without a visualizer, override the viewer field in your task’s
environment config. The RL base classes automatically copy eye and lookat into
VideoRecorderCfg before recording starts (when origin_type is "world"), so the video clip
uses the same configured viewpoint as the interactive viewport:
from isaaclab.envs import ManagerBasedRLEnvCfg
from isaaclab.envs.common import ViewerCfg
from isaaclab.utils import configclass


@configclass
class MyTaskCfg(ManagerBasedRLEnvCfg):
    viewer: ViewerCfg = ViewerCfg(
        eye=(5.0, 5.0, 5.0),
        lookat=(0.0, 0.0, 1.0),
    )
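What the base classes do with these values can be sketched with plain-dataclass stand-ins (hypothetical names; the actual copy happens inside the RL environment constructors):

```python
from dataclasses import dataclass


@dataclass
class ViewerCfgSketch:
    """Stand-in for ``isaaclab.envs.common.ViewerCfg`` (illustration only)."""

    eye: tuple = (7.5, 7.5, 7.5)
    lookat: tuple = (0.0, 0.0, 0.0)
    origin_type: str = "world"


@dataclass
class VideoRecorderCfgSketch:
    """Stand-in for ``VideoRecorderCfg`` (illustration only)."""

    eye: tuple = (7.5, 7.5, 7.5)
    lookat: tuple = (0.0, 0.0, 0.0)


def sync_recorder_from_viewer(viewer: ViewerCfgSketch, video: VideoRecorderCfgSketch) -> None:
    """Copy the viewer framing into the recorder config before it is built."""
    # Only world-frame viewer poses map directly onto the perspective recorder.
    if viewer.origin_type == "world":
        video.eye = viewer.eye
        video.lookat = viewer.lookat
```

A viewer pose tied to an environment origin or an asset is skipped, which is why world-frame framing is the case where clips and viewport match.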
Summary#
| Stack example | Video backend | Capture mechanism |
|---|---|---|
| Kit (`presets=physx,...` or `presets=isaac_rtx_renderer,...`) | Kit | `omni.replicator` camera on `/OmniverseKit_Persp` |
| Newton GL (`presets=newton_mjwarp,...`) | Newton GL | Newton GL headless viewer |
| Newton GL (`presets=newton_renderer,...`) | Newton GL | Newton GL headless viewer |
| Kit (`--visualizer kit`) | Visualizer | `omni.replicator` camera on `/OmniverseKit_Persp` |
| Newton GL (`--visualizer newton`) | Visualizer | Newton GL headless viewer |
See also#
Visualization - interactive visualizers