Find How Many/What Cameras You Should Train With#
Currently in Isaac Lab, there are several camera types: USD Cameras (standard), Tiled Cameras,
and Ray Caster Cameras. These camera types differ in functionality and performance. The benchmark_cameras.py
script can be used to understand the differences between camera types, as well as to characterize their relative performance
at different parameters, such as camera quantity, image dimensions, and data types.
This utility is provided so that one can easily find the camera type and parameters that are the most performant while meeting the requirements of the user’s scenario. This utility also helps estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.
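As a rough way to reason about these parameters before running a full benchmark, note that rendering cost scales with the total number of pixel values produced per step. The helper below is a back-of-envelope sketch, not part of the script; the per-data-type channel counts are illustrative assumptions:

```python
def pixels_per_step(num_cameras: int, width: int, height: int, channels_per_data_type: list[int]) -> int:
    """Back-of-envelope count of pixel values rendered per simulation step,
    summed over all cameras and all requested data types."""
    return num_cameras * width * height * sum(channels_per_data_type)


# 100 cameras at the script's default 140x120 resolution, each rendering
# "rgb" (assumed 3 channels) and "depth" (assumed 1 channel)
total = pixels_per_step(100, width=140, height=120, channels_per_data_type=[3, 1])
print(total)  # prints 6720000
```

Doubling the image width or the camera count doubles this figure, which is why the benchmark varies one parameter at a time.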
This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. Also,
if you install pynvml, you can let this utility automatically find the maximum
number of cameras that can run in your task environment, up to a
specified system resource utilization threshold (without training; taking zero actions
at each timestep).
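The autotune stopping rule described above can be summarized in isolation: camera count grows until any utilization threshold is exceeded or the camera cap is reached. Below is a minimal sketch of that check as a hypothetical helper (in the actual script the utilization readings come from psutil and pynvml):

```python
def may_add_cameras(utilization: list[float], thresholds: list[float], camera_count: int, max_cameras: int) -> bool:
    """Keep growing the camera count only while every utilization reading
    (CPU %, RAM %, GPU compute %, GPU memory %) is at or below its matching
    threshold and the count has not exceeded the allowed maximum."""
    within_limits = all(u <= t for u, t in zip(utilization, thresholds))
    return within_limits and camera_count <= max_cameras


# the script's --autotune_max_percentage_util default thresholds
thresholds = [100.0, 80.0, 80.0, 80.0]
may_add_cameras([45.0, 60.0, 70.0, 50.0], thresholds, 100, 4096)  # True: all readings under their limits
may_add_cameras([45.0, 85.0, 70.0, 50.0], thresholds, 100, 4096)  # False: RAM exceeds its 80% threshold
```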
This guide accompanies the benchmark_cameras.py script in the scripts/benchmarks
directory.
Code for benchmark_cameras.py
1# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
2# All rights reserved.
3#
4# SPDX-License-Identifier: BSD-3-Clause
5
6"""
7This script might help you determine how many cameras your system can realistically run
8at different desired settings.
9
10You can supply different task environments to inject cameras into, or just test a sample scene.
11Additionally, you can automatically find the maximum number of cameras you can run a task with
12through the auto-tune functionality.
13
14.. code-block:: bash
15
16 # Usage with GUI
17 ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h
18
19 # Usage with headless
20 ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h --headless
21
22"""
23
24"""Launch Isaac Sim Simulator first."""
25
26import argparse
27from collections.abc import Callable
28
29from isaaclab.app import AppLauncher
30
31# parse the arguments
32args_cli = argparse.Namespace()
33
34parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")
35
36"""
37The following arguments only need to be supplied when one wishes
38to inject cameras into their environment and automatically determine
39the maximum camera count.
40"""
41parser.add_argument(
42 "--task",
43 type=str,
44 default=None,
45 required=False,
46 help="Supply this argument to spawn cameras within a known manager-based task environment.",
47)
48
49parser.add_argument(
50 "--autotune",
51 default=False,
52 action="store_true",
53 help=(
54 "Autotuning is only supported for provided task environments."
55 " Supply this argument to increase the number of environments until a desired threshold is reached."
56 " Install pynvml in your environment: ./isaaclab.sh -m pip install pynvml"
57 ),
58)
59
60parser.add_argument(
61 "--task_num_cameras_per_env",
62 type=int,
63 default=1,
64 help="The number of cameras per environment to use when using a known task.",
65)
66
67parser.add_argument(
68 "--use_fabric", action="store_true", default=False, help="Enable fabric and use USD I/O operations."
69)
70
71parser.add_argument(
72 "--autotune_max_percentage_util",
73 nargs="+",
74 type=float,
75 default=[100.0, 80.0, 80.0, 80.0],
76 required=False,
77 help=(
78 "The system utilization percentage thresholds to reach before an autotune is finished. "
79 "If any one of these limits is hit, the autotune stops. "
80 "Thresholds are, in order: maximum CPU percentage utilization, "
81 "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
82 "and maximum GPU memory utilization."
83 ),
84)
85
86parser.add_argument(
87 "--autotune_max_camera_count", type=int, default=4096, help="The maximum number of cameras allowed in an autotune."
88)
89
90parser.add_argument(
91 "--autotune_camera_count_interval",
92 type=int,
93 default=25,
94 help=(
95 "The number of cameras to try to add to the environment if the current camera count"
96 " falls within permitted system resource utilization limits."
97 ),
98)
99
100"""
101The following arguments are shared between injecting cameras into a task environment
102and creating cameras independent of a task environment.
103"""
104
105parser.add_argument(
106 "--num_tiled_cameras",
107 type=int,
108 default=0,
109 required=False,
110 help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
111)
112
113parser.add_argument(
114 "--num_standard_cameras",
115 type=int,
116 default=0,
117 required=False,
118 help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
119)
120
121parser.add_argument(
122 "--num_ray_caster_cameras",
123 type=int,
124 default=0,
125 required=False,
126 help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
127)
128
129parser.add_argument(
130 "--tiled_camera_data_types",
131 nargs="+",
132 type=str,
133 default=["rgb", "depth"],
134 help="The data types rendered by the tiled camera.",
135)
136
137parser.add_argument(
138 "--standard_camera_data_types",
139 nargs="+",
140 type=str,
141 default=["rgb", "distance_to_image_plane", "distance_to_camera"],
142 help="The data types rendered by the standard camera.",
143)
144
145parser.add_argument(
146 "--ray_caster_camera_data_types",
147 nargs="+",
148 type=str,
149 default=["distance_to_image_plane"],
150 help="The data types rendered by the ray caster camera.",
151)
152
153parser.add_argument(
154 "--ray_caster_visible_mesh_prim_paths",
155 nargs="+",
156 type=str,
157 default=["/World/ground"],
158 help="WARNING: Ray Caster can currently only cast against a single, static object.",
159)
160
161parser.add_argument(
162 "--convert_depth_to_camera_to_image_plane",
163 action="store_true",
164 default=True,
165 help=(
166 "Enable undistorting from perspective view (distance to camera data_type) "
167 "to orthogonal view (distance to plane data_type) for depth. "
168 "This is currently needed to create undistorted depth images/point clouds."
169 ),
170)
171
172parser.add_argument(
173 "--keep_raw_depth",
174 dest="convert_depth_to_camera_to_image_plane",
175 action="store_false",
176 help=(
177 "Disable undistorting from perspective view (distance to camera) "
178 "to orthogonal view (distance to plane data_type) for depth."
179 ),
180)
181
182parser.add_argument(
183 "--height",
184 type=int,
185 default=120,
186 required=False,
187 help="Height in pixels of cameras",
188)
189
190parser.add_argument(
191 "--width",
192 type=int,
193 default=140,
194 required=False,
195 help="Width in pixels of cameras",
196)
197
198parser.add_argument(
199 "--warm_start_length",
200 type=int,
201 default=3,
202 required=False,
203 help=(
204 "Number of steps to run the sim before starting the benchmark. "
205 "Needed to avoid blank images at the start of the simulation."
206 ),
207)
208
209parser.add_argument(
210 "--experiment_length",
211 type=int,
212 default=15,
213 required=False,
214 help="Number of steps to average over",
215)
216
217# This argument is only used when a task is not provided.
218parser.add_argument(
219 "--num_objects",
220 type=int,
221 default=10,
222 required=False,
223 help="Number of objects to spawn into the scene when not using a known task.",
224)
225
226# Benchmark arguments
227parser.add_argument(
228 "--benchmark_backend",
229 type=str,
230 default="omniperf",
231 choices=["json", "osmo", "omniperf", "summary"],
232 help="Benchmarking backend options; defaults to omniperf.",
233)
234parser.add_argument("--output_path", type=str, default=".", help="Path to output benchmark results.")
235
236
237AppLauncher.add_app_launcher_args(parser)
238args_cli = parser.parse_args()
239args_cli.enable_cameras = True
240
241if args_cli.autotune:
242 import pynvml
243
244if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
245 print("[WARNING]: Ray Casting is currently only supported for a single, static object")
246# launch omniverse app
247app_launcher = AppLauncher(args_cli)
248simulation_app = app_launcher.app
249
250"""Rest everything follows."""
251
252import random
253import time
254
255import gymnasium as gym
256import numpy as np
257import psutil
258import torch
259
260import isaaclab.sim as sim_utils
261from isaaclab.assets import RigidObject, RigidObjectCfg
262from isaaclab.scene.interactive_scene import InteractiveScene
263from isaaclab.sensors import (
264 Camera,
265 CameraCfg,
266 RayCasterCamera,
267 RayCasterCameraCfg,
268 TiledCamera, TiledCameraCfg, patterns,
269)
270from isaaclab.test.benchmark import BaseIsaacLabBenchmark, DictMeasurement, SingleMeasurement
271from isaaclab.utils.math import orthogonalize_perspective_depth, unproject_depth
272
273from isaaclab_tasks.utils import load_cfg_from_registry
274
275"""
276Camera Creation
277"""
278
279
280def create_camera_base(
281 camera_cfg: type[CameraCfg],
282 num_cams: int,
283 data_types: list[str],
284 height: int,
285 width: int,
286 prim_path: str | None = None,
287 instantiate: bool = True,
288) -> Camera | CameraCfg | None:
289 """Generalized function to create a camera or tiled camera sensor."""
290 # Determine prim prefix based on the camera class
291 name = camera_cfg.class_type.__name__
292
293 if instantiate:
294 # Create the necessary prims
295 for idx in range(num_cams):
296 sim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
297 if prim_path is None:
298 prim_path = f"/World/{name}_.*/{name}"
299 # If valid camera settings are provided, create the camera
300 if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
301 cfg = camera_cfg(
302 prim_path=prim_path,
303 update_period=0,
304 height=height,
305 width=width,
306 data_types=data_types,
307 spawn=sim_utils.PinholeCameraCfg(
308 focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
309 ),
310 )
311 if instantiate:
312 return camera_cfg.class_type(cfg=cfg)
313 else:
314 return cfg
315 else:
316 return None
317
318
319def create_tiled_cameras(
320 num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
321) -> TiledCamera | None:
322 """Defines the tiled camera sensor to add to the scene."""
323 if data_types is None:
324 data_types = ["rgb", "depth"]
325 return create_camera_base(
326 camera_cfg=TiledCameraCfg,
327 num_cams=num_cams,
328 data_types=data_types,
329 height=height,
330 width=width,
331 )
332
333
334def create_cameras(
335 num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
336) -> Camera | None:
337 """Defines the Standard cameras."""
338 if data_types is None:
339 data_types = ["rgb", "depth"]
340 return create_camera_base(
341 camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
342 )
343
344
345def create_ray_caster_cameras(
346 num_cams: int = 2,
347 data_types: list[str] = ["distance_to_image_plane"],
348 mesh_prim_paths: list[str] = ["/World/ground"],
349 height: int = 100,
350 width: int = 120,
351 prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
352 instantiate: bool = True,
353) -> RayCasterCamera | RayCasterCameraCfg | None:
354 """Create the raycaster cameras; different configuration than Standard/Tiled camera"""
355 for idx in range(num_cams):
356 sim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")
357
358 if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
359 cam_cfg = RayCasterCameraCfg(
360 prim_path=prim_path,
361 mesh_prim_paths=mesh_prim_paths,
362 update_period=0,
363 offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
364 data_types=data_types,
365 debug_vis=False,
366 pattern_cfg=patterns.PinholeCameraPatternCfg(
367 focal_length=24.0,
368 horizontal_aperture=20.955,
369 height=480,
370 width=640,
371 ),
372 )
373 if instantiate:
374 return RayCasterCamera(cfg=cam_cfg)
375 else:
376 return cam_cfg
377
378 else:
379 return None
380
381
382def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
383 """Grab a simple tiled camera config for injecting into task environments."""
384 return create_camera_base(
385 TiledCameraCfg,
386 num_cams=args_cli.num_tiled_cameras,
387 data_types=args_cli.tiled_camera_data_types,
388 width=args_cli.width,
389 height=args_cli.height,
390 prim_path="{ENV_REGEX_NS}/" + prim_path,
391 instantiate=False,
392 )
393
394
395def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
396 """Grab a simple standard camera config for injecting into task environments."""
397 return create_camera_base(
398 CameraCfg,
399 num_cams=args_cli.num_standard_cameras,
400 data_types=args_cli.standard_camera_data_types,
401 width=args_cli.width,
402 height=args_cli.height,
403 prim_path="{ENV_REGEX_NS}/" + prim_path,
404 instantiate=False,
405 )
406
407
408def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
409 """Grab a simple ray caster config for injecting into task environments."""
410 return create_ray_caster_cameras(
411 num_cams=args_cli.num_ray_caster_cameras,
412 data_types=args_cli.ray_caster_camera_data_types,
413 width=args_cli.width,
414 height=args_cli.height,
415 prim_path="{ENV_REGEX_NS}/" + prim_path,
416 )
417
418
419"""
420Scene Creation
421"""
422
423
424def design_scene(
425 num_tiled_cams: int = 2,
426 num_standard_cams: int = 0,
427 num_ray_caster_cams: int = 0,
428 tiled_camera_data_types: list[str] | None = None,
429 standard_camera_data_types: list[str] | None = None,
430 ray_caster_camera_data_types: list[str] | None = None,
431 height: int = 100,
432 width: int = 200,
433 num_objects: int = 20,
434 mesh_prim_paths: list[str] = ["/World/ground"],
435) -> dict:
436 """Design the scene."""
437 if tiled_camera_data_types is None:
438 tiled_camera_data_types = ["rgb"]
439 if standard_camera_data_types is None:
440 standard_camera_data_types = ["rgb"]
441 if ray_caster_camera_data_types is None:
442 ray_caster_camera_data_types = ["distance_to_image_plane"]
443
444 # Populate scene
445 # -- Ground-plane
446 cfg = sim_utils.GroundPlaneCfg()
447 cfg.func("/World/ground", cfg)
448 # -- Lights
449 cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
450 cfg.func("/World/Light", cfg)
451
452 # Create a dictionary for the scene entities
453 scene_entities = {}
454
455 # Xform to hold objects
456 sim_utils.create_prim("/World/Objects", "Xform")
457 # Random objects
458 for i in range(num_objects):
459 # sample random position
460 position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
461 position *= np.asarray([1.5, 1.5, 0.5])
462 # sample random color
463 color = (random.random(), random.random(), random.random())
464 # choose random prim type
465 prim_type = random.choice(["Cube", "Cone", "Cylinder"])
466 common_properties = {
467 "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
468 "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
469 "collision_props": sim_utils.CollisionPropertiesCfg(),
470 "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
471 "semantic_tags": [("class", prim_type)],
472 }
473 if prim_type == "Cube":
474 shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
475 elif prim_type == "Cone":
476 shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
477 elif prim_type == "Cylinder":
478 shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
479 # Rigid Object
480 obj_cfg = RigidObjectCfg(
481 prim_path=f"/World/Objects/Obj_{i:02d}",
482 spawn=shape_cfg,
483 init_state=RigidObjectCfg.InitialStateCfg(pos=position),
484 )
485 scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)
486
487 # Sensors
488 standard_camera = create_cameras(
489 num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
490 )
491 tiled_camera = create_tiled_cameras(
492 num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
493 )
494 ray_caster_camera = create_ray_caster_cameras(
495 num_cams=num_ray_caster_cams,
496 data_types=ray_caster_camera_data_types,
497 mesh_prim_paths=mesh_prim_paths,
498 height=height,
499 width=width,
500 )
501 # return the scene information
502 if tiled_camera is not None:
503 scene_entities["tiled_camera"] = tiled_camera
504 if standard_camera is not None:
505 scene_entities["standard_camera"] = standard_camera
506 if ray_caster_camera is not None:
507 scene_entities["ray_caster_camera"] = ray_caster_camera
508 return scene_entities
509
510
511def inject_cameras_into_task(
512 task: str,
513 num_cams: int,
514 camera_name_prefix: str,
515 camera_creation_callable: Callable,
516 num_cameras_per_env: int = 1,
517) -> gym.Env:
518 """Loads the task, sticks cameras into the config, and creates the environment."""
519 cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
520 cfg.sim.device = args_cli.device
521 cfg.sim.use_fabric = args_cli.use_fabric
522 scene_cfg = cfg.scene
523
524 num_envs = int(num_cams / num_cameras_per_env)
525 scene_cfg.num_envs = num_envs
526
527 for idx in range(num_cameras_per_env):
528 suffix = "" if idx == 0 else str(idx)
529 name = camera_name_prefix + suffix
530 setattr(scene_cfg, name, camera_creation_callable(name))
531 cfg.scene = scene_cfg
532 env = gym.make(task, cfg=cfg)
533 return env
534
535
536"""
537System diagnosis
538"""
539
540
541def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
542 """Get the maximum CPU, RAM, GPU utilization (processing), and
543 GPU memory usage percentages since the last time reset was true."""
544 if reset:
545 max_values[:] = [0, 0, 0, 0] # Reset the max values
546
547 # CPU utilization
548 cpu_usage = psutil.cpu_percent(interval=0.1)
549 max_values[0] = max(max_values[0], cpu_usage)
550
551 # RAM utilization
552 memory_info = psutil.virtual_memory()
553 ram_usage = memory_info.percent
554 max_values[1] = max(max_values[1], ram_usage)
555
556 # GPU utilization using pynvml
557 if torch.cuda.is_available():
558 if args_cli.autotune:
559 pynvml.nvmlInit() # Initialize NVML
560 for i in range(torch.cuda.device_count()):
561 handle = pynvml.nvmlDeviceGetHandleByIndex(i)
562
563 # GPU Utilization
564 gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
565 gpu_processing_utilization_percent = gpu_utilization.gpu # GPU core utilization
566 max_values[2] = max(max_values[2], gpu_processing_utilization_percent)
567
568 # GPU Memory Usage
569 memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
570 gpu_memory_total = memory_info.total
571 gpu_memory_used = memory_info.used
572 gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
573 max_values[3] = max(max_values[3], gpu_memory_utilization_percent)
574
575 pynvml.nvmlShutdown() # Shutdown NVML after usage
576 else:
577 gpu_processing_utilization_percent = None
578 gpu_memory_utilization_percent = None
579 return max_values
580
581
582"""
583Experiment
584"""
585
586
587def run_simulator(
588 sim: sim_utils.SimulationContext | None,
589 scene_entities: dict | InteractiveScene,
590 warm_start_length: int = 10,
591 experiment_length: int = 100,
592 tiled_camera_data_types: list[str] | None = None,
593 standard_camera_data_types: list[str] | None = None,
594 ray_caster_camera_data_types: list[str] | None = None,
595 depth_predicate: Callable = lambda x: "to" in x or x == "depth",
596 perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
597 convert_depth_to_camera_to_image_plane: bool = True,
598 max_cameras_per_env: int = 1,
599 env: gym.Env | None = None,
600) -> dict:
601 """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""
602
603 if tiled_camera_data_types is None:
604 tiled_camera_data_types = ["rgb"]
605 if standard_camera_data_types is None:
606 standard_camera_data_types = ["rgb"]
607 if ray_caster_camera_data_types is None:
608 ray_caster_camera_data_types = ["distance_to_image_plane"]
609
610 # Initialize camera lists
611 tiled_cameras = []
612 standard_cameras = []
613 ray_caster_cameras = []
614
615 # Dynamically extract cameras from the scene entities up to max_cameras_per_env
616 for i in range(max_cameras_per_env):
617 # Extract tiled cameras
618 tiled_camera_key = f"tiled_camera{i}" if i > 0 else "tiled_camera"
619 standard_camera_key = f"standard_camera{i}" if i > 0 else "standard_camera"
620 ray_caster_camera_key = f"ray_caster_camera{i}" if i > 0 else "ray_caster_camera"
621
622 try: # use try/except: a "key in scene_entities" membership check can raise even when the key is present
623 tiled_cameras.append(scene_entities[tiled_camera_key])
624 standard_cameras.append(scene_entities[standard_camera_key])
625 ray_caster_cameras.append(scene_entities[ray_caster_camera_key])
626 except KeyError:
627 break
628
629 # Initialize camera counts
630 camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
631 camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
632 labels = ["tiled", "standard", "ray_caster"]
633
634 if sim is not None:
635 # Set camera world poses
636 for camera_list in camera_lists:
637 for camera in camera_list:
638 num_cameras = camera.data.intrinsic_matrices.size(0)
639 positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
640 targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
641 camera.set_world_poses_from_view(positions, targets)
642
643 # Initialize timing variables
644 timestep = 0
645 total_time = 0.0
646 valid_timesteps = 0
647 sim_step_time = 0.0
648
649 while simulation_app.is_running() and timestep < experiment_length:
650 print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
651 get_utilization_percentages()
652
653 # Measure the total simulation step time
654 step_start_time = time.time()
655
656 if sim is not None:
657 sim.step()
658
659 if env is not None:
660 with torch.inference_mode():
661 # compute zero actions
662 actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
663 # apply actions
664 env.step(actions)
665
666 # Update cameras and process vision data within the simulation step
667 clouds = {}
668 images = {}
669 depth_images = {}
670
671 # Loop through all camera lists and their data_types
672 for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
673 for cam_idx, camera in enumerate(camera_list):
674 if env is None: # No env, need to step cams manually
675 # Only update the camera if it hasn't been updated as part of scene_entities.update ...
676 camera.update(dt=sim.get_physics_dt())
677
678 for data_type in data_types:
679 data_label = f"{label}_{cam_idx}_{data_type}"
680
681 if depth_predicate(data_type): # is a depth image, want to create cloud
682 depth = camera.data.output[data_type]
683 depth_images[data_label + "_raw"] = depth
684 if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
685 depth = orthogonalize_perspective_depth(
686 camera.data.output[data_type], camera.data.intrinsic_matrices
687 )
688 depth_images[data_label + "_undistorted"] = depth
689
690 pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
691 clouds[data_label] = pointcloud
692 else: # rgb image, just save it
693 image = camera.data.output[data_type]
694 images[data_label] = image
695
696 # End timing for the step
697 step_end_time = time.time()
698 sim_step_time += step_end_time - step_start_time
699
700 if timestep > warm_start_length:
701 get_utilization_percentages(reset=True)
702 total_time += step_end_time - step_start_time
703 valid_timesteps += 1
704
705 timestep += 1
706
707 # Calculate average timings
708 if valid_timesteps > 0:
709 avg_timestep_duration = total_time / valid_timesteps
710 avg_sim_step_duration = sim_step_time / experiment_length
711 else:
712 avg_timestep_duration = 0.0
713 avg_sim_step_duration = 0.0
714
715 # Package timing analytics in a dictionary
716 timing_analytics = {
717 "average_timestep_duration": avg_timestep_duration,
718 "average_sim_step_duration": avg_sim_step_duration,
719 "total_simulation_time": sim_step_time,
720 "total_experiment_duration": sim_step_time,
721 }
722
723 system_utilization_analytics = get_utilization_percentages()
724
725 print("--- Benchmark Results ---")
726 print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
727 print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
728 print(f"Total simulation time: {sim_step_time:.6f} seconds")
729 print("\nSystem Utilization Statistics:")
730 print(
731 f"| CPU:{system_utilization_analytics[0]}% | "
732 f"RAM:{system_utilization_analytics[1]}% | "
733 f"GPU Compute:{system_utilization_analytics[2]}% | "
734 f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
735 )
736
737 return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}
738
739
740def main():
741 """Main function."""
742 # Load simulation context
743 if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
744 raise ValueError("You must select at least one camera.")
745 if (
746 (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
747 or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
748 or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
749 ):
750 print("[WARNING]: You have elected to use more than one camera type.")
751 print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
752 print(
753 "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark,"
754 "num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
755 )
756 raise ValueError("Benchmark one camera at a time.")
757
758 # Determine which camera type is being used
759 camera_type = "tiled"
760 num_cameras = args_cli.num_tiled_cameras
761 if args_cli.num_standard_cameras > 0:
762 camera_type = "standard"
763 num_cameras = args_cli.num_standard_cameras
764 elif args_cli.num_ray_caster_cameras > 0:
765 camera_type = "ray_caster"
766 num_cameras = args_cli.num_ray_caster_cameras
767
768 # Create the benchmark
769 backend_type = args_cli.benchmark_backend
770 benchmark = BaseIsaacLabBenchmark(
771 benchmark_name="benchmark_cameras",
772 backend_type=backend_type,
773 output_path=args_cli.output_path,
774 use_recorders=True,
775 frametime_recorders=backend_type in ("summary", "omniperf"),
776 output_prefix="benchmark_cameras",
777 workflow_metadata={
778 "metadata": [
779 {"name": "task", "data": args_cli.task},
780 {"name": "camera_type", "data": camera_type},
781 {"name": "num_cameras", "data": num_cameras},
782 {"name": "height", "data": args_cli.height},
783 {"name": "width", "data": args_cli.width},
784 {"name": "experiment_length", "data": args_cli.experiment_length},
785 {"name": "autotune", "data": args_cli.autotune},
786 ]
787 },
788 )
789
790 print("[INFO]: Designing the scene")
791 final_analysis = None
792
793 if args_cli.task is None:
794 print("[INFO]: No task environment provided, creating random scene.")
795 sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
796 sim = sim_utils.SimulationContext(sim_cfg)
797 # Set main camera
798 sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
799 scene_entities = design_scene(
800 num_tiled_cams=args_cli.num_tiled_cameras,
801 num_standard_cams=args_cli.num_standard_cameras,
802 num_ray_caster_cams=args_cli.num_ray_caster_cameras,
803 tiled_camera_data_types=args_cli.tiled_camera_data_types,
804 standard_camera_data_types=args_cli.standard_camera_data_types,
805 ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
806 height=args_cli.height,
807 width=args_cli.width,
808 num_objects=args_cli.num_objects,
809 mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
810 )
811 # Play simulator
812 sim.reset()
813 # Now we are ready!
814 print("[INFO]: Setup complete...")
815 # Run simulator
816 final_analysis = run_simulator(
817 sim=sim,
818 scene_entities=scene_entities,
819 warm_start_length=args_cli.warm_start_length,
820 experiment_length=args_cli.experiment_length,
821 tiled_camera_data_types=args_cli.tiled_camera_data_types,
822 standard_camera_data_types=args_cli.standard_camera_data_types,
823 ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
824 convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
825 )
826 else:
827 print("[INFO]: Using known task environment, injecting cameras.")
828 autotune_iter = 0
829 max_sys_util_thresh = [0.0, 0.0, 0.0]
830 max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
831 cur_num_cams = max_num_cams
832 cur_sys_util = max_sys_util_thresh
833 interval = args_cli.autotune_camera_count_interval
834
835 if args_cli.autotune:
836 max_sys_util_thresh = args_cli.autotune_max_percentage_util
837 max_num_cams = args_cli.autotune_max_camera_count
838 print("[INFO]: Auto-tuning until any of the following thresholds is met:")
839 print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
840 print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
841 # Determine which camera is being tested...
842 tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
843 standard_camera_cfg = create_standard_camera_cfg("standard_camera")
844 ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
845 camera_name_prefix = ""
846 camera_creation_callable = None
847 num_cams = 0
848 if tiled_camera_cfg is not None:
849 camera_name_prefix = "tiled_camera"
850 camera_creation_callable = create_tiled_camera_cfg
851 num_cams = args_cli.num_tiled_cameras
852 elif standard_camera_cfg is not None:
853 camera_name_prefix = "standard_camera"
854 camera_creation_callable = create_standard_camera_cfg
855 num_cams = args_cli.num_standard_cameras
856 elif ray_caster_camera_cfg is not None:
857 camera_name_prefix = "ray_caster_camera"
858 camera_creation_callable = create_ray_caster_camera_cfg
859 num_cams = args_cli.num_ray_caster_cameras
860
861 while (
862 all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
863 and cur_num_cams <= max_num_cams
864 ):
865 cur_num_cams = num_cams + interval * autotune_iter
866 autotune_iter += 1
867
868 env = inject_cameras_into_task(
869 task=args_cli.task,
870 num_cams=cur_num_cams,
871 camera_name_prefix=camera_name_prefix,
872 camera_creation_callable=camera_creation_callable,
873 num_cameras_per_env=args_cli.task_num_cameras_per_env,
874 )
875 env.reset()
876 print(f"Testing with {cur_num_cams} {camera_name_prefix}")
877 analysis = run_simulator(
878 sim=None,
879 scene_entities=env.unwrapped.scene,
880 warm_start_length=args_cli.warm_start_length,
881 experiment_length=args_cli.experiment_length,
882 tiled_camera_data_types=args_cli.tiled_camera_data_types,
883 standard_camera_data_types=args_cli.standard_camera_data_types,
884 ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
885 convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
886 max_cameras_per_env=args_cli.task_num_cameras_per_env,
887 env=env,
888 )
889
890 cur_sys_util = analysis["system_utilization_analytics"]
891 final_analysis = analysis
892 print("Triggering reset...")
893 env.close()
894 sim_utils.create_new_stage()
895 print("[INFO]: DONE! Feel free to CTRL + C Me ")
896 print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
897 print("Keep in mind, this is without any training running on the GPU.")
898 print("Set lower utilization thresholds to account for training.")
899
900 if not args_cli.autotune:
901 print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")
902
903 # Log benchmark measurements
904 if final_analysis is not None:
905 timing = final_analysis["timing_analytics"]
906 sys_util = final_analysis["system_utilization_analytics"]
907
908 # Log timing measurements
909 benchmark.add_measurement(
910 "runtime",
911 measurement=SingleMeasurement(
912 name="Average Timestep Duration", value=timing["average_timestep_duration"] * 1000, unit="ms"
913 ),
914 )
915 benchmark.add_measurement(
916 "runtime",
917 measurement=SingleMeasurement(
918 name="Average Simulation Step Duration", value=timing["average_sim_step_duration"] * 1000, unit="ms"
919 ),
920 )
921 benchmark.add_measurement(
922 "runtime",
923 measurement=SingleMeasurement(
924 name="Total Simulation Time", value=timing["total_simulation_time"] * 1000, unit="ms"
925 ),
926 )
927
928 # Log system utilization
929 benchmark.add_measurement(
930 "runtime",
931 measurement=DictMeasurement(
932 name="System Utilization",
933 value={
934 "cpu_percent": sys_util[0],
935 "ram_percent": sys_util[1],
936 "gpu_compute_percent": sys_util[2],
937 "gpu_memory_percent": sys_util[3],
938 },
939 ),
940 )
941
942 # Finalize benchmark
943 benchmark.update_manual_recorders()
944 benchmark._finalize_impl()
945
946
947if __name__ == "__main__":
948 # run the main function
949 main()
950 # close sim app
951 simulation_app.close()
Possible Parameters#
First, run
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h
to see all possible parameters you can vary with this utility.
See the command line parameters related to autotune for more information about
automatically determining maximum camera count.
Compare Performance in Task Environments and Automatically Determine Task Max Camera Count#
Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.
For example, to see how your system handles 100 tiled cameras in the cartpole environment, with 2 cameras per environment (so 50 environments total), outputting only RGB data, run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb
If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also
find the maximum number of cameras that can run in the specified environment without exceeding
a performance threshold (specified as maximum CPU utilization percent, maximum RAM utilization percent,
maximum GPU compute percent, and maximum GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, you could run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50
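The four values passed to --autotune_max_percentage_util correspond to CPU, RAM, GPU compute, and GPU memory utilization percentages, in that order. As a minimal sketch (the function name here is illustrative, not the script's API), the autotune loop's stopping condition boils down to a check like this:

```python
def under_thresholds(cur_sys_util, max_sys_util_thresh):
    """Return True while every utilization metric (CPU %, RAM %,
    GPU compute %, GPU memory %) is at or below its threshold."""
    return all(cur <= max_t for cur, max_t in zip(cur_sys_util, max_sys_util_thresh))

# With --autotune_max_percentage_util 100 80 50 50:
print(under_thresholds([42.0, 61.5, 48.0, 33.0], [100, 80, 50, 50]))  # True: keep adding cameras
print(under_thresholds([42.0, 61.5, 55.0, 33.0], [100, 80, 50, 50]))  # False: GPU compute exceeds 50%
```

The script keeps injecting more cameras and re-running the benchmark while this check passes, so the first configuration that trips a threshold ends the search.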
Autotune may lead to the program crashing, which means that it tried to run too many cameras at once; the maximum percentage utilization parameters are meant to prevent this from happening.
The benchmark output doesn't include the overhead of training a network, so consider decreasing the maximum utilization percentages to account for that overhead. The final camera count reported is the total across all environments, so to get the number of environments, divide the reported camera count by the number of cameras per environment.
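With hypothetical numbers, converting the reported camera count back to an environment count is a single integer division:

```python
# Example values only: the utility reports a total camera count,
# and environments = total cameras // cameras per environment.
total_cameras = 100   # final camera count reported by autotune (hypothetical)
cameras_per_env = 2   # value passed via --task_num_cameras_per_env
num_envs = total_cameras // cameras_per_env
print(num_envs)  # 50
```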
Compare Camera Type and Performance (Without a Specified Task)#
This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras, one could run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100
If your system cannot handle this workload, the process will be killed.
It's recommended to monitor CPU, RAM, and GPU utilization while running this script, to get
an idea of how many resources rendering the desired cameras requires. On Ubuntu, you can use tools like htop and nvtop
to monitor resources live while running this script; on Windows, you can use the Task Manager.
If your system has a hard time handling the desired cameras, you can try the following:
- Switch to headless mode (supply --headless)
- Ensure you are using the GPU pipeline, not the CPU pipeline
- If you aren't using Tiled Cameras, switch to Tiled Cameras
- Decrease the camera resolution
- Decrease the number of data types requested for each camera
- Decrease the number of cameras
- Decrease the number of objects in the scene
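A rough back-of-envelope calculation shows why resolution, camera count, and data types dominate memory use. This sketch estimates only the raw uint8 RGB output buffers, ignoring renderer-internal buffers, so treat it as a lower bound:

```python
def rgb_buffer_mb(num_cams, height, width, channels=3, bytes_per_val=1):
    """Rough per-frame output buffer size in MB for uint8 RGB images.

    Ignores renderer-internal buffers and other data types, so the
    real footprint will be larger; useful only for relative comparison.
    """
    return num_cams * height * width * channels * bytes_per_val / 1024**2

print(round(rgb_buffer_mb(100, 100, 100), 2))  # 100 cameras at 100x100 -> 2.86
print(round(rgb_buffer_mb(100, 480, 640), 2))  # same cameras at 640x480 -> 87.89
```

Each additional data type (depth, normals, segmentation, and so on) adds a comparable buffer per camera, which is why trimming data types and resolution often helps more than removing a single camera.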
If your system is able to handle the number of cameras, the timing statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.