Find How Many/What Cameras You Should Train With#
Currently in Isaac Lab, there are several camera types: USD Cameras (standard), Tiled Cameras,
and Ray Caster cameras. These camera types differ in functionality and performance. The benchmark_cameras.py
script can be used to understand the differences between camera types, as well as to characterize their relative performance
at different parameters such as camera quantity, image dimensions, and data types.
This utility is provided so that one can easily find the camera type/parameters that are the most performant while meeting the requirements of the user’s scenario. This utility also helps estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.
This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. Also,
if you install pynvml, you can let this utility automatically find the maximum
number of cameras that can run in your task environment up to a
certain specified system resource utilization threshold (without training; taking zero actions
at each timestep).
This guide accompanies the benchmark_cameras.py script in the scripts/benchmarks
directory.
Code for benchmark_cameras.py
# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

"""
This script might help you determine how many cameras your system can realistically run
at different desired settings.

You can supply different task environments to inject cameras into, or just test a sample scene.
Additionally, you can automatically find the maximum amount of cameras you can run a task with
through the auto-tune functionality.

.. code-block:: bash

    # Usage with GUI
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h

    # Usage with headless
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h --headless

"""

"""Launch Isaac Sim Simulator first."""

import argparse
from collections.abc import Callable

from isaaclab.app import AppLauncher

# parse the arguments
args_cli = argparse.Namespace()

parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")

"""
The following arguments only need to be supplied for when one wishes
to try injecting cameras into their environment, and automatically determining
the maximum camera count.
"""
parser.add_argument(
    "--task",
    type=str,
    default=None,
    required=False,
    help="Supply this argument to spawn cameras within a known manager-based task environment.",
)

parser.add_argument(
    "--autotune",
    default=False,
    action="store_true",
    help=(
        "Autotuning is only supported for provided task environments."
        " Supply this argument to increase the number of environments until a desired threshold is reached."
        " Install pynvml in your environment: ./isaaclab.sh -p -m pip install pynvml"
    ),
)

parser.add_argument(
    "--task_num_cameras_per_env",
    type=int,
    default=1,
    help="The number of cameras per environment to use when using a known task.",
)

parser.add_argument(
    "--use_fabric", action="store_true", default=False, help="Enable Fabric instead of using USD I/O operations."
)

parser.add_argument(
    "--autotune_max_percentage_util",
    nargs="+",
    type=float,
    default=[100.0, 80.0, 80.0, 80.0],
    required=False,
    help=(
        "The system utilization percentage thresholds to reach before an autotune is finished. "
        "If any one of these limits is hit, the autotune stops. "
        "Thresholds are, in order, maximum CPU percentage utilization, "
        "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
        "and maximum GPU memory utilization."
    ),
)

parser.add_argument(
    "--autotune_max_camera_count", type=int, default=4096, help="The maximum number of cameras allowed in an autotune."
)

parser.add_argument(
    "--autotune_camera_count_interval",
    type=int,
    default=25,
    help=(
        "The number of cameras to try to add to the environment if the current camera count"
        " falls within permitted system resource utilization limits."
    ),
)

"""
The following arguments are shared for when injecting cameras into a task environment,
as well as when creating cameras independent of a task environment.
"""

parser.add_argument(
    "--num_tiled_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_standard_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_ray_caster_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--tiled_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "depth"],
    help="The data types rendered by the tiled camera",
)

parser.add_argument(
    "--standard_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "distance_to_image_plane", "distance_to_camera"],
    help="The data types rendered by the standard camera",
)

parser.add_argument(
    "--ray_caster_camera_data_types",
    nargs="+",
    type=str,
    default=["distance_to_image_plane"],
    help="The data types rendered by the ray caster camera.",
)

parser.add_argument(
    "--ray_caster_visible_mesh_prim_paths",
    nargs="+",
    type=str,
    default=["/World/ground"],
    help="WARNING: Ray Caster can currently only cast against a single, static, object",
)

parser.add_argument(
    "--convert_depth_to_camera_to_image_plane",
    action="store_true",
    default=True,
    help=(
        "Enable undistorting from perspective view (distance to camera data_type) "
        "to orthogonal view (distance to plane data_type) for depth. "
        "This is currently needed to create undistorted depth images/point cloud."
    ),
)

parser.add_argument(
    "--keep_raw_depth",
    dest="convert_depth_to_camera_to_image_plane",
    action="store_false",
    help=(
        "Disable undistorting from perspective view (distance to camera) "
        "to orthogonal view (distance to plane data_type) for depth."
    ),
)

parser.add_argument(
    "--height",
    type=int,
    default=120,
    required=False,
    help="Height in pixels of cameras",
)

parser.add_argument(
    "--width",
    type=int,
    default=140,
    required=False,
    help="Width in pixels of cameras",
)

parser.add_argument(
    "--warm_start_length",
    type=int,
    default=3,
    required=False,
    help=(
        "Number of steps to run the sim before starting benchmark. "
        "Needed to avoid blank images at the start of the simulation."
    ),
)

parser.add_argument(
    "--experiment_length",
    type=int,
    default=15,
    required=False,
    help="Number of steps to average over",
)

# This argument is only used when a task is not provided.
parser.add_argument(
    "--num_objects",
    type=int,
    default=10,
    required=False,
    help="Number of objects to spawn into the scene when not using a known task.",
)


AppLauncher.add_app_launcher_args(parser)
args_cli = parser.parse_args()
args_cli.enable_cameras = True

if args_cli.autotune:
    import pynvml

if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
    print("[WARNING]: Ray Casting is only currently supported for a single, static object")
# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

"""Rest everything follows."""

import gymnasium as gym
import numpy as np
import random
import time
import torch

import psutil

import isaaclab.sim as sim_utils
from isaaclab.assets import RigidObject, RigidObjectCfg
from isaaclab.scene.interactive_scene import InteractiveScene
from isaaclab.sensors import (
    Camera,
    CameraCfg,
    RayCasterCamera,
    RayCasterCameraCfg,
    TiledCamera,
    TiledCameraCfg,
    patterns,
)
from isaaclab.utils.math import orthogonalize_perspective_depth, unproject_depth

from isaaclab_tasks.utils import load_cfg_from_registry

"""
Camera Creation
"""


def create_camera_base(
    camera_cfg: type[CameraCfg | TiledCameraCfg],
    num_cams: int,
    data_types: list[str],
    height: int,
    width: int,
    prim_path: str | None = None,
    instantiate: bool = True,
) -> Camera | TiledCamera | CameraCfg | TiledCameraCfg | None:
    """Generalized function to create a camera or tiled camera sensor."""
    # Determine prim prefix based on the camera class
    name = camera_cfg.class_type.__name__

    if instantiate:
        # Create the necessary prims
        for idx in range(num_cams):
            sim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
    if prim_path is None:
        prim_path = f"/World/{name}_.*/{name}"
    # If valid camera settings are provided, create the camera
    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cfg = camera_cfg(
            prim_path=prim_path,
            update_period=0,
            height=height,
            width=width,
            data_types=data_types,
            spawn=sim_utils.PinholeCameraCfg(
                focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
            ),
        )
        if instantiate:
            return camera_cfg.class_type(cfg=cfg)
        else:
            return cfg
    else:
        return None


def create_tiled_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> TiledCamera | None:
    """Defines the tiled camera sensor to add to the scene."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=TiledCameraCfg,
        num_cams=num_cams,
        data_types=data_types,
        height=height,
        width=width,
    )


def create_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> Camera | None:
    """Defines the Standard cameras."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
    )


def create_ray_caster_cameras(
    num_cams: int = 2,
    data_types: list[str] | None = None,
    mesh_prim_paths: list[str] | None = None,
    height: int = 100,
    width: int = 120,
    prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
    instantiate: bool = True,
) -> RayCasterCamera | RayCasterCameraCfg | None:
    """Create the raycaster cameras; different configuration than Standard/Tiled camera"""
    # Avoid mutable default arguments
    if data_types is None:
        data_types = ["distance_to_image_plane"]
    if mesh_prim_paths is None:
        mesh_prim_paths = ["/World/ground"]

    if instantiate:
        # Only create the prims when the cameras are instantiated directly (mirrors create_camera_base)
        for idx in range(num_cams):
            sim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")

    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cam_cfg = RayCasterCameraCfg(
            prim_path=prim_path,
            mesh_prim_paths=mesh_prim_paths,
            update_period=0,
            offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
            data_types=data_types,
            debug_vis=False,
            pattern_cfg=patterns.PinholeCameraPatternCfg(
                focal_length=24.0,
                horizontal_aperture=20.955,
                # Use the requested resolution instead of a hard-coded 480x640
                height=height,
                width=width,
            ),
        )
        if instantiate:
            return RayCasterCamera(cfg=cam_cfg)
        else:
            return cam_cfg

    else:
        return None


def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
    """Grab a simple tiled camera config for injecting into task environments."""
    return create_camera_base(
        TiledCameraCfg,
        num_cams=args_cli.num_tiled_cameras,
        data_types=args_cli.tiled_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
    """Grab a simple standard camera config for injecting into task environments."""
    return create_camera_base(
        CameraCfg,
        num_cams=args_cli.num_standard_cameras,
        data_types=args_cli.standard_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
    """Grab a simple ray caster config for injecting into task environments."""
    return create_ray_caster_cameras(
        num_cams=args_cli.num_ray_caster_cameras,
        data_types=args_cli.ray_caster_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        # Return the config (not a sensor instance) so it can be attached to the scene config
        instantiate=False,
    )


"""
Scene Creation
"""


def design_scene(
    num_tiled_cams: int = 2,
    num_standard_cams: int = 0,
    num_ray_caster_cams: int = 0,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    height: int = 100,
    width: int = 200,
    num_objects: int = 20,
    mesh_prim_paths: list[str] = ["/World/ground"],
) -> dict:
    """Design the scene."""
    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Populate scene
    # -- Ground-plane
    cfg = sim_utils.GroundPlaneCfg()
    cfg.func("/World/ground", cfg)
    # -- Lights
    cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
    cfg.func("/World/Light", cfg)

    # Create a dictionary for the scene entities
    scene_entities = {}

    # Xform to hold objects
    sim_utils.create_prim("/World/Objects", "Xform")
    # Random objects
    for i in range(num_objects):
        # sample random position
        position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
        position *= np.asarray([1.5, 1.5, 0.5])
        # sample random color
        color = (random.random(), random.random(), random.random())
        # choose random prim type
        prim_type = random.choice(["Cube", "Cone", "Cylinder"])
        common_properties = {
            "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
            "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
            "collision_props": sim_utils.CollisionPropertiesCfg(),
            "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
            "semantic_tags": [("class", prim_type)],
        }
        if prim_type == "Cube":
            shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
        elif prim_type == "Cone":
            shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
        elif prim_type == "Cylinder":
            shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
        # Rigid Object
        obj_cfg = RigidObjectCfg(
            prim_path=f"/World/Objects/Obj_{i:02d}",
            spawn=shape_cfg,
            init_state=RigidObjectCfg.InitialStateCfg(pos=position),
        )
        scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)

    # Sensors
    standard_camera = create_cameras(
        num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
    )
    tiled_camera = create_tiled_cameras(
        num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
    )
    ray_caster_camera = create_ray_caster_cameras(
        num_cams=num_ray_caster_cams,
        data_types=ray_caster_camera_data_types,
        mesh_prim_paths=mesh_prim_paths,
        height=height,
        width=width,
    )
    # return the scene information
    if tiled_camera is not None:
        scene_entities["tiled_camera"] = tiled_camera
    if standard_camera is not None:
        scene_entities["standard_camera"] = standard_camera
    if ray_caster_camera is not None:
        scene_entities["ray_caster_camera"] = ray_caster_camera
    return scene_entities


def inject_cameras_into_task(
    task: str,
    num_cams: int,
    camera_name_prefix: str,
    camera_creation_callable: Callable,
    num_cameras_per_env: int = 1,
) -> gym.Env:
    """Loads the task, sticks cameras into the config, and creates the environment."""
    cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
    cfg.sim.device = args_cli.device
    cfg.sim.use_fabric = args_cli.use_fabric
    scene_cfg = cfg.scene

    num_envs = int(num_cams / num_cameras_per_env)
    scene_cfg.num_envs = num_envs

    for idx in range(num_cameras_per_env):
        suffix = "" if idx == 0 else str(idx)
        name = camera_name_prefix + suffix
        setattr(scene_cfg, name, camera_creation_callable(name))
    cfg.scene = scene_cfg
    env = gym.make(task, cfg=cfg)
    return env


"""
System diagnosis
"""


# NOTE: the mutable default for ``max_values`` is intentional here; it keeps the
# running maxima between calls until ``reset`` is set to True.
def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
    """Get the maximum CPU, RAM, GPU utilization (processing), and
    GPU memory usage percentages since the last time reset was true."""
    if reset:
        max_values[:] = [0, 0, 0, 0]  # Reset the max values

    # CPU utilization
    cpu_usage = psutil.cpu_percent(interval=0.1)
    max_values[0] = max(max_values[0], cpu_usage)

    # RAM utilization
    memory_info = psutil.virtual_memory()
    ram_usage = memory_info.percent
    max_values[1] = max(max_values[1], ram_usage)

    # GPU utilization using pynvml
    if torch.cuda.is_available():

        if args_cli.autotune:
            pynvml.nvmlInit()  # Initialize NVML
            for i in range(torch.cuda.device_count()):
                handle = pynvml.nvmlDeviceGetHandleByIndex(i)

                # GPU Utilization
                gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
                gpu_processing_utilization_percent = gpu_utilization.gpu  # GPU core utilization
                max_values[2] = max(max_values[2], gpu_processing_utilization_percent)

                # GPU Memory Usage
                memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
                gpu_memory_total = memory_info.total
                gpu_memory_used = memory_info.used
                gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
                max_values[3] = max(max_values[3], gpu_memory_utilization_percent)

            pynvml.nvmlShutdown()  # Shutdown NVML after usage
    else:
        gpu_processing_utilization_percent = None
        gpu_memory_utilization_percent = None
    return max_values


"""
Experiment
"""


def run_simulator(
    sim: sim_utils.SimulationContext | None,
    scene_entities: dict | InteractiveScene,
    warm_start_length: int = 10,
    experiment_length: int = 100,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    depth_predicate: Callable = lambda x: "to" in x or x == "depth",
    perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
    convert_depth_to_camera_to_image_plane: bool = True,
    max_cameras_per_env: int = 1,
    env: gym.Env | None = None,
) -> dict:
    """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""

    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Initialize camera lists
    tiled_cameras = []
    standard_cameras = []
    ray_caster_cameras = []

    # Dynamically extract cameras from the scene entities up to max_cameras_per_env
    for i in range(max_cameras_per_env):
        # Build the lookup keys for each camera type
        tiled_camera_key = f"tiled_camera{i}" if i > 0 else "tiled_camera"
        standard_camera_key = f"standard_camera{i}" if i > 0 else "standard_camera"
        ray_caster_camera_key = f"ray_caster_camera{i}" if i > 0 else "ray_caster_camera"

        # Use try/except here: checking "key in scene_entities" errors out even if the key is present
        try:
            tiled_cameras.append(scene_entities[tiled_camera_key])
            standard_cameras.append(scene_entities[standard_camera_key])
            ray_caster_cameras.append(scene_entities[ray_caster_camera_key])
        except KeyError:
            break

    # Initialize camera counts
    camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
    camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
    labels = ["tiled", "standard", "ray_caster"]

    if sim is not None:
        # Set camera world poses
        for camera_list in camera_lists:
            for camera in camera_list:
                num_cameras = camera.data.intrinsic_matrices.size(0)
                positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
                targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
                camera.set_world_poses_from_view(positions, targets)

    # Initialize timing variables
    timestep = 0
    total_time = 0.0
    valid_timesteps = 0
    sim_step_time = 0.0

    while simulation_app.is_running() and timestep < experiment_length:
        print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
        get_utilization_percentages()

        # Measure the total simulation step time
        step_start_time = time.time()

        if sim is not None:
            sim.step()

        if env is not None:
            with torch.inference_mode():
                # compute zero actions
                actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
                # apply actions
                env.step(actions)

        # Update cameras and process vision data within the simulation step
        clouds = {}
        images = {}
        depth_images = {}

        # Loop through all camera lists and their data_types
        for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
            for cam_idx, camera in enumerate(camera_list):

                if env is None:  # No env, need to step cams manually
                    # Only update the camera if it hasn't been updated as part of scene_entities.update ...
                    camera.update(dt=sim.get_physics_dt())

                for data_type in data_types:
                    data_label = f"{label}_{cam_idx}_{data_type}"

                    if depth_predicate(data_type):  # is a depth image, want to create cloud
                        depth = camera.data.output[data_type]
                        depth_images[data_label + "_raw"] = depth
                        if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
                            depth = orthogonalize_perspective_depth(
                                camera.data.output[data_type], camera.data.intrinsic_matrices
                            )
                            depth_images[data_label + "_undistorted"] = depth

                        pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
                        clouds[data_label] = pointcloud
                    else:  # rgb image, just save it
                        image = camera.data.output[data_type]
                        images[data_label] = image

        # End timing for the step
        step_end_time = time.time()
        sim_step_time += step_end_time - step_start_time

        if timestep > warm_start_length:
            get_utilization_percentages(reset=True)
            total_time += step_end_time - step_start_time
            valid_timesteps += 1

        timestep += 1

    # Calculate average timings
    if valid_timesteps > 0:
        avg_timestep_duration = total_time / valid_timesteps
        avg_sim_step_duration = sim_step_time / experiment_length
    else:
        avg_timestep_duration = 0.0
        avg_sim_step_duration = 0.0

    # Package timing analytics in a dictionary
    timing_analytics = {
        "average_timestep_duration": avg_timestep_duration,
        "average_sim_step_duration": avg_sim_step_duration,
        "total_simulation_time": sim_step_time,
        "total_experiment_duration": sim_step_time,
    }

    system_utilization_analytics = get_utilization_percentages()

    print("--- Benchmark Results ---")
    print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
    print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
    print(f"Total simulation time: {sim_step_time:.6f} seconds")
    print("\nSystem Utilization Statistics:")
    print(
        f"| CPU:{system_utilization_analytics[0]}% | "
        f"RAM:{system_utilization_analytics[1]}% | "
        f"GPU Compute:{system_utilization_analytics[2]}% | "
        f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
    )

    return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}


def main():
    """Main function."""
    # Load simulation context
    if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
        raise ValueError("You must select at least one camera.")
    if (
        (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
    ):
        print("[WARNING]: You have elected to use more than one camera type.")
        print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
        print(
            "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark, "
            "num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
        )
        raise ValueError("Benchmark one camera at a time.")

    print("[INFO]: Designing the scene")
    if args_cli.task is None:
        print("[INFO]: No task environment provided, creating random scene.")
        sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
        sim = sim_utils.SimulationContext(sim_cfg)
        # Set main camera
        sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
        scene_entities = design_scene(
            num_tiled_cams=args_cli.num_tiled_cameras,
            num_standard_cams=args_cli.num_standard_cameras,
            num_ray_caster_cams=args_cli.num_ray_caster_cameras,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            height=args_cli.height,
            width=args_cli.width,
            num_objects=args_cli.num_objects,
            mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
        )
        # Play simulator
        sim.reset()
        # Now we are ready!
        print("[INFO]: Setup complete...")
        # Run simulator
        run_simulator(
            sim=sim,
            scene_entities=scene_entities,
            warm_start_length=args_cli.warm_start_length,
            experiment_length=args_cli.experiment_length,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
        )
    else:
        print("[INFO]: Using known task environment, injecting cameras.")
        autotune_iter = 0
        max_sys_util_thresh = [0.0, 0.0, 0.0]
        max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
        cur_num_cams = max_num_cams
        cur_sys_util = max_sys_util_thresh
        interval = args_cli.autotune_camera_count_interval

        if args_cli.autotune:
            max_sys_util_thresh = args_cli.autotune_max_percentage_util
            max_num_cams = args_cli.autotune_max_camera_count
            print("[INFO]: Auto tuning until any of the following thresholds are met")
            print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
            print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
        # Determine which camera is being tested...
        tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
        standard_camera_cfg = create_standard_camera_cfg("standard_camera")
        ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
        camera_name_prefix = ""
        camera_creation_callable = None
        num_cams = 0
        if tiled_camera_cfg is not None:
            camera_name_prefix = "tiled_camera"
            camera_creation_callable = create_tiled_camera_cfg
            num_cams = args_cli.num_tiled_cameras
        elif standard_camera_cfg is not None:
            camera_name_prefix = "standard_camera"
            camera_creation_callable = create_standard_camera_cfg
            num_cams = args_cli.num_standard_cameras
        elif ray_caster_camera_cfg is not None:
            camera_name_prefix = "ray_caster_camera"
            camera_creation_callable = create_ray_caster_camera_cfg
            num_cams = args_cli.num_ray_caster_cameras

        while (
            all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
            and cur_num_cams <= max_num_cams
        ):
            cur_num_cams = num_cams + interval * autotune_iter
            autotune_iter += 1

            env = inject_cameras_into_task(
                task=args_cli.task,
                num_cams=cur_num_cams,
                camera_name_prefix=camera_name_prefix,
                camera_creation_callable=camera_creation_callable,
                num_cameras_per_env=args_cli.task_num_cameras_per_env,
            )
            env.reset()
            print(f"Testing with {cur_num_cams} {camera_name_prefix}")
            analysis = run_simulator(
                sim=None,
                scene_entities=env.unwrapped.scene,
                warm_start_length=args_cli.warm_start_length,
                experiment_length=args_cli.experiment_length,
                tiled_camera_data_types=args_cli.tiled_camera_data_types,
                standard_camera_data_types=args_cli.standard_camera_data_types,
                ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
                convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
                max_cameras_per_env=args_cli.task_num_cameras_per_env,
                env=env,
            )

            cur_sys_util = analysis["system_utilization_analytics"]
            print("Triggering reset...")
            env.close()
            sim_utils.create_new_stage()
        print("[INFO]: DONE! Feel free to CTRL + C Me ")
        print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
        print("Keep in mind, this is without any training running on the GPU.")
        print("Set lower utilization thresholds to account for training.")

    if not args_cli.autotune:
        print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")


if __name__ == "__main__":
    # run the main function
    main()
    # close sim app
    simulation_app.close()
Possible Parameters#
First, run
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h
to see all possible parameters you can vary with this utility.
See the command line parameters related to autotune for more information about
automatically determining maximum camera count.
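As a rough illustration of how the autotune thresholds are read, the sketch below reproduces only the --autotune_max_percentage_util argument with plain argparse and labels the four values in the order given by the script's help text (CPU, RAM, GPU compute, GPU memory). The THRESHOLD_NAMES tuple, the parse_thresholds helper, and the length check are illustrative additions, not part of benchmark_cameras.py.

```python
import argparse

# Order taken from the script's help text; the names themselves are illustrative.
THRESHOLD_NAMES = ("cpu", "ram", "gpu_compute", "gpu_memory")


def parse_thresholds(argv: list[str]) -> dict[str, float]:
    """Parse the autotune thresholds and label them by resource."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--autotune_max_percentage_util",
        nargs="+",
        type=float,
        default=[100.0, 80.0, 80.0, 80.0],
    )
    args = parser.parse_args(argv)
    values = args.autotune_max_percentage_util
    # Hypothetical sanity check: the real script does not validate the count.
    if len(values) != len(THRESHOLD_NAMES):
        raise ValueError(f"expected {len(THRESHOLD_NAMES)} thresholds, got {len(values)}")
    return dict(zip(THRESHOLD_NAMES, values))


print(parse_thresholds(["--autotune_max_percentage_util", "100", "80", "50", "50"]))
```

Supplying all four values explicitly avoids any ambiguity about which resource a given threshold applies to.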
Compare Performance in Task Environments and Automatically Determine Task Max Camera Count#
Currently, tiled cameras are the most performant camera that can handle multiple dynamic objects.
For example, to see how your system could handle 100 tiled cameras in the cartpole environment, with 2 cameras per environment (so 50 environments total) only in RGB mode, run
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb
If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also
find the maximum number of cameras that you could run in the specified environment up to
a certain performance threshold (specified by max CPU utilization percent, max RAM utilization percent,
max GPU compute percent, and max GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, you could run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50
Autotune may lead to the program crashing, which means that it tried to run too many cameras at once. However, the max percentage utilization parameter is meant to prevent this from happening.
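The idea behind autotuning can be sketched as a simple search loop: start at an initial camera count, measure utilization, and keep adding cameras until any threshold is exceeded. This is a simplified sketch, not the script's implementation. The measure callable is a stand-in for rebuilding the environment and sampling utilization, fake_measure is made-up data, and unlike the script, this version returns the last count that stayed within every limit.

```python
from collections.abc import Callable, Sequence


def autotune_camera_count(
    measure: Callable[[int], Sequence[float]],
    start: int,
    interval: int,
    thresholds: Sequence[float],
    max_cameras: int,
) -> int:
    """Return the largest tested camera count that stayed within every threshold."""
    best = 0
    count = start
    while count <= max_cameras:
        usage = measure(count)  # (cpu, ram, gpu_compute, gpu_memory) in percent
        if any(used > limit for used, limit in zip(usage, thresholds)):
            break  # at least one resource went over its limit
        best = count
        count += interval
    return best


def fake_measure(num_cameras: int) -> tuple[float, float, float, float]:
    """Hypothetical utilization model: GPU load grows linearly with camera count."""
    return (10.0, 20.0, num_cameras / 5.0, num_cameras / 10.0)


print(autotune_camera_count(fake_measure, start=100, interval=25, thresholds=(100, 80, 50, 50), max_cameras=4096))
```

With this made-up model, the search stops once GPU compute passes 50 percent, so the reported count is 250.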
The output of the benchmark doesn’t include the overhead of training the network, so consider decreasing the maximum utilization percentages to account for this overhead. The final output camera count is for all cameras, so to get the total number of environments, divide the output camera count by the number of cameras per environment.
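For instance, the conversion from a total camera count to an environment count can be written as below. The num_environments helper is an illustrative name; it follows the same arithmetic as the script's int(num_cams / num_cameras_per_env).

```python
def num_environments(total_cameras: int, cameras_per_env: int) -> int:
    """Convert a total camera count into a number of environments."""
    if cameras_per_env <= 0:
        raise ValueError("cameras_per_env must be positive")
    # Floor division matches int(total / per_env) for non-negative counts.
    return total_cameras // cameras_per_env


print(num_environments(100, 2))  # 50 environments, as in the cartpole example
```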
Compare Camera Type and Performance (Without a Specified Task)#
This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras, one could run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100
If your system cannot handle this due to performance reasons, then the process will be killed.
It’s recommended to monitor CPU/RAM utilization and GPU utilization while running this script, to get
an idea of how many resources rendering the desired camera requires. In Ubuntu, you can use tools like htop and nvtop
to live monitor resources while running this script, and in Windows, you can use the Task Manager.
If your system has a hard time handling the desired cameras, you can try the following:
Switch to headless mode (supply --headless)
Ensure you are using the GPU pipeline, not the CPU!
If you aren’t using Tiled Cameras, switch to Tiled Cameras
Decrease camera resolution
Decrease how many data_types there are for each camera.
Decrease the number of cameras
Decrease the number of objects in the scene
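To get a feel for why resolution, data types, and camera count matter, the sketch below estimates the size of the raw output images produced per step. It is only a back-of-the-envelope lower bound: the per-pixel sizes in BYTES_PER_PIXEL are assumptions (3 bytes for an 8-bit RGB image, 4 bytes for a 32-bit float depth image), and actual renderer memory use is typically much larger than the output buffers alone.

```python
# Assumed per-pixel storage sizes in bytes; these are illustrative, not measured.
BYTES_PER_PIXEL = {
    "rgb": 3,  # assumes 3 channels of uint8
    "depth": 4,  # assumes 1 channel of float32
    "distance_to_image_plane": 4,
    "distance_to_camera": 4,
}


def estimate_output_bytes(num_cameras: int, height: int, width: int, data_types: list[str]) -> int:
    """Estimate the raw size of all camera outputs for a single step."""
    bytes_per_pixel = sum(BYTES_PER_PIXEL[name] for name in data_types)
    return num_cameras * height * width * bytes_per_pixel


full = estimate_output_bytes(100, 120, 140, ["rgb", "depth"])
rgb_only = estimate_output_bytes(100, 120, 140, ["rgb"])
print(f"rgb + depth: {full / 1e6:.2f} MB per step, rgb only: {rgb_only / 1e6:.2f} MB per step")
```

The estimate scales linearly with each factor, so halving both height and width cuts the output size to a quarter.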
If your system is able to handle the number of cameras, then the time statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.
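The average timestep duration excludes the warm-start steps. A minimal sketch of that averaging is shown below, assuming the per-step durations have already been recorded in a list. The average_step_time helper is an illustrative name; it mirrors the script's timestep > warm_start_length condition, which skips steps 0 through warm_start_length inclusive.

```python
def average_step_time(step_durations: list[float], warm_start_length: int) -> float:
    """Average the step durations recorded after the warm-start steps."""
    # Steps 0..warm_start_length are skipped, matching `timestep > warm_start_length`.
    measured = step_durations[warm_start_length + 1 :]
    if not measured:
        return 0.0  # same fallback as the script when no step was measured
    return sum(measured) / len(measured)


# Made-up durations in seconds: slow warm-up steps followed by steady-state steps.
durations = [1.0, 1.0, 1.0, 1.0, 0.25, 0.25, 0.25, 0.25]
print(average_step_time(durations, warm_start_length=3))
```

Because the first steps often include one-time setup costs, make sure --experiment_length is comfortably larger than --warm_start_length so that enough steps are averaged.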