Find How Many/What Cameras You Should Train With
Currently in Isaac Lab, there are several camera types: USD Cameras (standard), Tiled Cameras,
and Ray Caster cameras. These camera types differ in functionality and performance. The benchmark_cameras.py
script can be used to understand the differences between camera types, as well as to characterize their relative
performance at different parameters such as camera quantity, image dimensions, and data types.
This utility is provided so that you can easily find the camera type and parameters that are the most performant while meeting the requirements of your scenario. It also helps estimate the maximum number of cameras you can realistically run, assuming that you want to maximize the number of environments while minimizing step time.
This utility can inject cameras into an existing task from the gym registry,
which can be useful for benchmarking cameras in a specific scenario. Also,
if you install pynvml, you can let this utility automatically find the maximum
number of cameras that can run in your task environment up to a
specified system resource utilization threshold (without training; taking zero actions
at each timestep).
This guide accompanies the benchmark_cameras.py script in the scripts/benchmarks directory.
Code for benchmark_cameras.py
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

"""
This script might help you determine how many cameras your system can realistically run
at different desired settings.

You can supply different task environments to inject cameras into, or just test a sample scene.
Additionally, you can automatically find the maximum amount of cameras you can run a task with
through the auto-tune functionality.

.. code-block:: bash

    # Usage with GUI
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h

    # Usage with headless
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h --headless

"""

"""Launch Isaac Sim Simulator first."""

import argparse
from collections.abc import Callable

from isaaclab.app import AppLauncher

# parse the arguments
args_cli = argparse.Namespace()

parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")

"""
The following arguments only need to be supplied for when one wishes
to try injecting cameras into their environment, and automatically determining
the maximum camera count.
"""
parser.add_argument(
    "--task",
    type=str,
    default=None,
    required=False,
    help="Supply this argument to spawn cameras within a known manager-based task environment.",
)

parser.add_argument(
    "--autotune",
    default=False,
    action="store_true",
    help=(
        "Autotuning is only supported for provided task environments."
        " Supply this argument to increase the number of environments until a desired threshold is reached."
        " Install pynvml in your environment; ./isaaclab.sh -p -m pip install pynvml"
    ),
)

parser.add_argument(
    "--task_num_cameras_per_env",
    type=int,
    default=1,
    help="The number of cameras per environment to use when using a known task.",
)

parser.add_argument(
    "--use_fabric", action="store_true", default=False, help="Enable fabric and use USD I/O operations."
)

parser.add_argument(
    "--autotune_max_percentage_util",
    nargs="+",
    type=float,
    default=[100.0, 80.0, 80.0, 80.0],
    required=False,
    help=(
        "The system utilization percentage thresholds to reach before an autotune is finished. "
        "If any one of these limits is hit, the autotune stops. "
        "Thresholds are, in order, maximum CPU percentage utilization, "
        "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
        "and maximum GPU memory utilization."
    ),
)

parser.add_argument(
    "--autotune_max_camera_count", type=int, default=4096, help="The maximum amount of cameras allowed in an autotune."
)

parser.add_argument(
    "--autotune_camera_count_interval",
    type=int,
    default=25,
    help=(
        "The number of cameras to try to add to the environment if the current camera count"
        " falls within permitted system resource utilization limits."
    ),
)

"""
The following arguments are shared for when injecting cameras into a task environment,
as well as when creating cameras independent of a task environment.
"""

parser.add_argument(
    "--num_tiled_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_standard_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_ray_caster_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--tiled_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "depth"],
    help="The data types rendered by the tiled camera",
)

parser.add_argument(
    "--standard_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "distance_to_image_plane", "distance_to_camera"],
    help="The data types rendered by the standard camera",
)

parser.add_argument(
    "--ray_caster_camera_data_types",
    nargs="+",
    type=str,
    default=["distance_to_image_plane"],
    help="The data types rendered by the ray caster camera.",
)

parser.add_argument(
    "--ray_caster_visible_mesh_prim_paths",
    nargs="+",
    type=str,
    default=["/World/ground"],
    help="WARNING: Ray Caster can currently only cast against a single, static object",
)

parser.add_argument(
    "--convert_depth_to_camera_to_image_plane",
    action="store_true",
    default=True,
    help=(
        "Enable undistorting from perspective view (distance to camera data_type) "
        "to orthogonal view (distance to plane data_type) for depth. "
        "This is currently needed to create undistorted depth images/point cloud."
    ),
)

parser.add_argument(
    "--keep_raw_depth",
    dest="convert_depth_to_camera_to_image_plane",
    action="store_false",
    help=(
        "Disable undistorting from perspective view (distance to camera) "
        "to orthogonal view (distance to plane data_type) for depth."
    ),
)

parser.add_argument(
    "--height",
    type=int,
    default=120,
    required=False,
    help="Height in pixels of cameras",
)

parser.add_argument(
    "--width",
    type=int,
    default=140,
    required=False,
    help="Width in pixels of cameras",
)

parser.add_argument(
    "--warm_start_length",
    type=int,
    default=3,
    required=False,
    help=(
        "Number of steps to run the sim before starting benchmark. "
        "Needed to avoid blank images at the start of the simulation."
    ),
)

parser.add_argument(
    "--experiment_length",
    type=int,
    default=15,
    required=False,
    help="Number of steps to average over",
)

# This argument is only used when a task is not provided.
parser.add_argument(
    "--num_objects",
    type=int,
    default=10,
    required=False,
    help="Number of objects to spawn into the scene when not using a known task.",
)


AppLauncher.add_app_launcher_args(parser)
args_cli = parser.parse_args()
args_cli.enable_cameras = True

if args_cli.autotune:
    import pynvml

if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
    print("[WARNING]: Ray Casting is only currently supported for a single, static object")
# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

"""Rest everything follows."""

import gymnasium as gym
import numpy as np
import random
import time
import torch

import isaacsim.core.utils.prims as prim_utils
import psutil
from isaacsim.core.utils.stage import create_new_stage

import isaaclab.sim as sim_utils
from isaaclab.assets import RigidObject, RigidObjectCfg
from isaaclab.scene.interactive_scene import InteractiveScene
from isaaclab.sensors import (
    Camera,
    CameraCfg,
    RayCasterCamera,
    RayCasterCameraCfg,
    TiledCamera,
    TiledCameraCfg,
    patterns,
)
from isaaclab.utils.math import orthogonalize_perspective_depth, unproject_depth

from isaaclab_tasks.utils import load_cfg_from_registry

"""
Camera Creation
"""


def create_camera_base(
    camera_cfg: type[CameraCfg | TiledCameraCfg],
    num_cams: int,
    data_types: list[str],
    height: int,
    width: int,
    prim_path: str | None = None,
    instantiate: bool = True,
) -> Camera | TiledCamera | CameraCfg | TiledCameraCfg | None:
    """Generalized function to create a camera or tiled camera sensor."""
    # Determine prim prefix based on the camera class
    name = camera_cfg.class_type.__name__

    if instantiate:
        # Create the necessary prims
        for idx in range(num_cams):
            prim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
    if prim_path is None:
        prim_path = f"/World/{name}_.*/{name}"
    # If valid camera settings are provided, create the camera
    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cfg = camera_cfg(
            prim_path=prim_path,
            update_period=0,
            height=height,
            width=width,
            data_types=data_types,
            spawn=sim_utils.PinholeCameraCfg(
                focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
            ),
        )
        if instantiate:
            return camera_cfg.class_type(cfg=cfg)
        else:
            return cfg
    else:
        return None


def create_tiled_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> TiledCamera | None:
    """Defines the tiled camera sensor to add to the scene."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=TiledCameraCfg,
        num_cams=num_cams,
        data_types=data_types,
        height=height,
        width=width,
    )


def create_cameras(
    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
) -> Camera | None:
    """Defines the Standard cameras."""
    if data_types is None:
        data_types = ["rgb", "depth"]
    return create_camera_base(
        camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
    )


def create_ray_caster_cameras(
    num_cams: int = 2,
    data_types: list[str] = ["distance_to_image_plane"],
    mesh_prim_paths: list[str] = ["/World/ground"],
    height: int = 100,
    width: int = 120,
    prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
    instantiate: bool = True,
) -> RayCasterCamera | RayCasterCameraCfg | None:
    """Create the raycaster cameras; different configuration than Standard/Tiled camera"""
    if instantiate:
        # Only create prims when the sensor itself is instantiated (not when only a config is requested)
        for idx in range(num_cams):
            prim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")

    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
        cam_cfg = RayCasterCameraCfg(
            prim_path=prim_path,
            mesh_prim_paths=mesh_prim_paths,
            update_period=0,
            offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
            data_types=data_types,
            debug_vis=False,
            pattern_cfg=patterns.PinholeCameraPatternCfg(
                focal_length=24.0,
                horizontal_aperture=20.955,
                # Use the requested resolution (the original listing hard-coded 480x640 here)
                height=height,
                width=width,
            ),
        )
        if instantiate:
            return RayCasterCamera(cfg=cam_cfg)
        else:
            return cam_cfg

    else:
        return None


def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
    """Grab a simple tiled camera config for injecting into task environments."""
    return create_camera_base(
        TiledCameraCfg,
        num_cams=args_cli.num_tiled_cameras,
        data_types=args_cli.tiled_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
    """Grab a simple standard camera config for injecting into task environments."""
    return create_camera_base(
        CameraCfg,
        num_cams=args_cli.num_standard_cameras,
        data_types=args_cli.standard_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,
    )


def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
    """Grab a simple ray caster config for injecting into task environments."""
    return create_ray_caster_cameras(
        num_cams=args_cli.num_ray_caster_cameras,
        data_types=args_cli.ray_caster_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        # Return a config rather than a sensor instance, matching the other *_cfg helpers
        instantiate=False,
    )


"""
Scene Creation
"""


def design_scene(
    num_tiled_cams: int = 2,
    num_standard_cams: int = 0,
    num_ray_caster_cams: int = 0,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    height: int = 100,
    width: int = 200,
    num_objects: int = 20,
    mesh_prim_paths: list[str] = ["/World/ground"],
) -> dict:
    """Design the scene."""
    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Populate scene
    # -- Ground-plane
    cfg = sim_utils.GroundPlaneCfg()
    cfg.func("/World/ground", cfg)
    # -- Lights
    cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
    cfg.func("/World/Light", cfg)

    # Create a dictionary for the scene entities
    scene_entities = {}

    # Xform to hold objects
    prim_utils.create_prim("/World/Objects", "Xform")
    # Random objects
    for i in range(num_objects):
        # sample random position
        position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
        position *= np.asarray([1.5, 1.5, 0.5])
        # sample random color
        color = (random.random(), random.random(), random.random())
        # choose random prim type
        prim_type = random.choice(["Cube", "Cone", "Cylinder"])
        common_properties = {
            "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
            "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
            "collision_props": sim_utils.CollisionPropertiesCfg(),
            "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
            "semantic_tags": [("class", prim_type)],
        }
        if prim_type == "Cube":
            shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
        elif prim_type == "Cone":
            shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
        elif prim_type == "Cylinder":
            shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
        # Rigid Object
        obj_cfg = RigidObjectCfg(
            prim_path=f"/World/Objects/Obj_{i:02d}",
            spawn=shape_cfg,
            init_state=RigidObjectCfg.InitialStateCfg(pos=position),
        )
        scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)

    # Sensors
    standard_camera = create_cameras(
        num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
    )
    tiled_camera = create_tiled_cameras(
        num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
    )
    ray_caster_camera = create_ray_caster_cameras(
        num_cams=num_ray_caster_cams,
        data_types=ray_caster_camera_data_types,
        mesh_prim_paths=mesh_prim_paths,
        height=height,
        width=width,
    )
    # return the scene information
    if tiled_camera is not None:
        scene_entities["tiled_camera"] = tiled_camera
    if standard_camera is not None:
        scene_entities["standard_camera"] = standard_camera
    if ray_caster_camera is not None:
        scene_entities["ray_caster_camera"] = ray_caster_camera
    return scene_entities


def inject_cameras_into_task(
    task: str,
    num_cams: int,
    camera_name_prefix: str,
    camera_creation_callable: Callable,
    num_cameras_per_env: int = 1,
) -> gym.Env:
    """Loads the task, sticks cameras into the config, and creates the environment."""
    cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
    cfg.sim.device = args_cli.device
    cfg.sim.use_fabric = args_cli.use_fabric
    scene_cfg = cfg.scene

    num_envs = int(num_cams / num_cameras_per_env)
    scene_cfg.num_envs = num_envs

    for idx in range(num_cameras_per_env):
        suffix = "" if idx == 0 else str(idx)
        name = camera_name_prefix + suffix
        setattr(scene_cfg, name, camera_creation_callable(name))
    cfg.scene = scene_cfg
    env = gym.make(task, cfg=cfg)
    return env


"""
System diagnosis
"""


# Note: the mutable default for max_values is what keeps the running maxima between calls.
def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
    """Get the maximum CPU, RAM, GPU utilization (processing), and
    GPU memory usage percentages since the last time reset was true."""
    if reset:
        max_values[:] = [0, 0, 0, 0]  # Reset the max values

    # CPU utilization
    cpu_usage = psutil.cpu_percent(interval=0.1)
    max_values[0] = max(max_values[0], cpu_usage)

    # RAM utilization
    memory_info = psutil.virtual_memory()
    ram_usage = memory_info.percent
    max_values[1] = max(max_values[1], ram_usage)

    # GPU utilization using pynvml
    if torch.cuda.is_available():

        if args_cli.autotune:
            pynvml.nvmlInit()  # Initialize NVML
            for i in range(torch.cuda.device_count()):
                handle = pynvml.nvmlDeviceGetHandleByIndex(i)

                # GPU Utilization
                gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
                gpu_processing_utilization_percent = gpu_utilization.gpu  # GPU core utilization
                max_values[2] = max(max_values[2], gpu_processing_utilization_percent)

                # GPU Memory Usage
                memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
                gpu_memory_total = memory_info.total
                gpu_memory_used = memory_info.used
                gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
                max_values[3] = max(max_values[3], gpu_memory_utilization_percent)

            pynvml.nvmlShutdown()  # Shutdown NVML after usage
    else:
        gpu_processing_utilization_percent = None
        gpu_memory_utilization_percent = None
    return max_values


"""
Experiment
"""


def run_simulator(
    sim: sim_utils.SimulationContext | None,
    scene_entities: dict | InteractiveScene,
    warm_start_length: int = 10,
    experiment_length: int = 100,
    tiled_camera_data_types: list[str] | None = None,
    standard_camera_data_types: list[str] | None = None,
    ray_caster_camera_data_types: list[str] | None = None,
    depth_predicate: Callable = lambda x: "to" in x or x == "depth",
    perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
    convert_depth_to_camera_to_image_plane: bool = True,
    max_cameras_per_env: int = 1,
    env: gym.Env | None = None,
) -> dict:
    """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""

    if tiled_camera_data_types is None:
        tiled_camera_data_types = ["rgb"]
    if standard_camera_data_types is None:
        standard_camera_data_types = ["rgb"]
    if ray_caster_camera_data_types is None:
        ray_caster_camera_data_types = ["distance_to_image_plane"]

    # Initialize camera lists
    tiled_cameras = []
    standard_cameras = []
    ray_caster_cameras = []

    # Dynamically extract cameras from the scene entities up to max_cameras_per_env
    for i in range(max_cameras_per_env):
        # Extract tiled cameras
        tiled_camera_key = f"tiled_camera{i}" if i > 0 else "tiled_camera"
        standard_camera_key = f"standard_camera{i}" if i > 0 else "standard_camera"
        ray_caster_camera_key = f"ray_caster_camera{i}" if i > 0 else "ray_caster_camera"

        try:  # if instead you checked ... if key is in scene_entities... # errors out always even if key present
            tiled_cameras.append(scene_entities[tiled_camera_key])
            standard_cameras.append(scene_entities[standard_camera_key])
            ray_caster_cameras.append(scene_entities[ray_caster_camera_key])
        except KeyError:
            break

    # Initialize camera counts
    camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
    camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
    labels = ["tiled", "standard", "ray_caster"]

    if sim is not None:
        # Set camera world poses
        for camera_list in camera_lists:
            for camera in camera_list:
                num_cameras = camera.data.intrinsic_matrices.size(0)
                positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
                targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
                camera.set_world_poses_from_view(positions, targets)

    # Initialize timing variables
    timestep = 0
    total_time = 0.0
    valid_timesteps = 0
    sim_step_time = 0.0

    while simulation_app.is_running() and timestep < experiment_length:
        print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
        get_utilization_percentages()

        # Measure the total simulation step time
        step_start_time = time.time()

        if sim is not None:
            sim.step()

        if env is not None:
            with torch.inference_mode():
                # compute zero actions
                actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
                # apply actions
                env.step(actions)

        # Update cameras and process vision data within the simulation step
        clouds = {}
        images = {}
        depth_images = {}

        # Loop through all camera lists and their data_types
        for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
            for cam_idx, camera in enumerate(camera_list):

                if env is None:  # No env, need to step cams manually
                    # Only update the camera if it hasn't been updated as part of scene_entities.update ...
                    camera.update(dt=sim.get_physics_dt())

                for data_type in data_types:
                    data_label = f"{label}_{cam_idx}_{data_type}"

                    if depth_predicate(data_type):  # is a depth image, want to create cloud
                        depth = camera.data.output[data_type]
                        depth_images[data_label + "_raw"] = depth
                        if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
                            depth = orthogonalize_perspective_depth(
                                camera.data.output[data_type], camera.data.intrinsic_matrices
                            )
                            depth_images[data_label + "_undistorted"] = depth

                        pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
                        clouds[data_label] = pointcloud
                    else:  # rgb image, just save it
                        image = camera.data.output[data_type]
                        images[data_label] = image

        # End timing for the step
        step_end_time = time.time()
        sim_step_time += step_end_time - step_start_time

        if timestep > warm_start_length:
            get_utilization_percentages(reset=True)
            total_time += step_end_time - step_start_time
            valid_timesteps += 1

        timestep += 1

    # Calculate average timings
    if valid_timesteps > 0:
        avg_timestep_duration = total_time / valid_timesteps
        avg_sim_step_duration = sim_step_time / experiment_length
    else:
        avg_timestep_duration = 0.0
        avg_sim_step_duration = 0.0

    # Package timing analytics in a dictionary
    timing_analytics = {
        "average_timestep_duration": avg_timestep_duration,
        "average_sim_step_duration": avg_sim_step_duration,
        "total_simulation_time": sim_step_time,
        "total_experiment_duration": sim_step_time,
    }

    system_utilization_analytics = get_utilization_percentages()

    print("--- Benchmark Results ---")
    print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
    print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
    print(f"Total simulation time: {sim_step_time:.6f} seconds")
    print("\nSystem Utilization Statistics:")
    print(
        f"| CPU:{system_utilization_analytics[0]}% | "
        f"RAM:{system_utilization_analytics[1]}% | "
        f"GPU Compute:{system_utilization_analytics[2]}% | "
        f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
    )

    return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}


def main():
    """Main function."""
    # Load simulation context
    if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
        raise ValueError("You must select at least one camera.")
    if (
        (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
    ):
        print("[WARNING]: You have elected to use more than one camera type.")
        print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
        print(
            "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark, "
            "num_standard_cameras should be 0, and num_ray_caster_cameras should be 0"
        )
        raise ValueError("Benchmark one camera at a time.")

    print("[INFO]: Designing the scene")
    if args_cli.task is None:
        print("[INFO]: No task environment provided, creating random scene.")
        sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
        sim = sim_utils.SimulationContext(sim_cfg)
        # Set main camera
        sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
        scene_entities = design_scene(
            num_tiled_cams=args_cli.num_tiled_cameras,
            num_standard_cams=args_cli.num_standard_cameras,
            num_ray_caster_cams=args_cli.num_ray_caster_cameras,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            height=args_cli.height,
            width=args_cli.width,
            num_objects=args_cli.num_objects,
            mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
        )
        # Play simulator
        sim.reset()
        # Now we are ready!
        print("[INFO]: Setup complete...")
        # Run simulator
        run_simulator(
            sim=sim,
            scene_entities=scene_entities,
            warm_start_length=args_cli.warm_start_length,
            experiment_length=args_cli.experiment_length,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
        )
    else:
        print("[INFO]: Using known task environment, injecting cameras.")
        autotune_iter = 0
        max_sys_util_thresh = [0.0, 0.0, 0.0]
        max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
        cur_num_cams = max_num_cams
        cur_sys_util = max_sys_util_thresh
        interval = args_cli.autotune_camera_count_interval

        if args_cli.autotune:
            max_sys_util_thresh = args_cli.autotune_max_percentage_util
            max_num_cams = args_cli.autotune_max_camera_count
            print("[INFO]: Auto tuning until any of the following threshold are met")
            print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
            print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
        # Determine which camera is being tested...
        tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
        standard_camera_cfg = create_standard_camera_cfg("standard_camera")
        ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
        camera_name_prefix = ""
        camera_creation_callable = None
        num_cams = 0
        if tiled_camera_cfg is not None:
            camera_name_prefix = "tiled_camera"
            camera_creation_callable = create_tiled_camera_cfg
            num_cams = args_cli.num_tiled_cameras
        elif standard_camera_cfg is not None:
            camera_name_prefix = "standard_camera"
            camera_creation_callable = create_standard_camera_cfg
            num_cams = args_cli.num_standard_cameras
        elif ray_caster_camera_cfg is not None:
            camera_name_prefix = "ray_caster_camera"
            camera_creation_callable = create_ray_caster_camera_cfg
            num_cams = args_cli.num_ray_caster_cameras

        while (
            all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
            and cur_num_cams <= max_num_cams
        ):
            cur_num_cams = num_cams + interval * autotune_iter
            autotune_iter += 1

            env = inject_cameras_into_task(
                task=args_cli.task,
                num_cams=cur_num_cams,
                camera_name_prefix=camera_name_prefix,
                camera_creation_callable=camera_creation_callable,
                num_cameras_per_env=args_cli.task_num_cameras_per_env,
            )
            env.reset()
            print(f"Testing with {cur_num_cams} {camera_name_prefix}")
            analysis = run_simulator(
                sim=None,
                scene_entities=env.unwrapped.scene,
                warm_start_length=args_cli.warm_start_length,
                experiment_length=args_cli.experiment_length,
                tiled_camera_data_types=args_cli.tiled_camera_data_types,
                standard_camera_data_types=args_cli.standard_camera_data_types,
                ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
                convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
                max_cameras_per_env=args_cli.task_num_cameras_per_env,
                env=env,
            )

            cur_sys_util = analysis["system_utilization_analytics"]
            print("Triggering reset...")
            env.close()
            create_new_stage()
        print("[INFO]: DONE! Feel free to CTRL + C Me ")
        print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
        print("Keep in mind, this is without any training running on the GPU.")
        print("Set lower utilization thresholds to account for training.")

        if not args_cli.autotune:
            print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")


if __name__ == "__main__":
    # run the main function
    main()
    # close sim app
    simulation_app.close()
Possible Parameters
First, run
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h
to see all possible parameters you can vary with this utility.
See the command line parameters related to autotune for more information about
automatically determining the maximum camera count.
Compare Performance in Task Environments and Automatically Determine Task Max Camera Count
Currently, tiled cameras are the most performant camera type that can handle multiple dynamic objects.
For example, to see how your system could handle 100 tiled cameras in the cartpole environment, with 2 cameras per environment (so 50 environments total) and only in RGB mode, run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb
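As a quick check of the arithmetic in this example, the environment count follows from the total camera count and the cameras per environment. The helper name below is hypothetical and only illustrates the integer division; it is not part of the benchmark script.

```python
def num_envs_for(num_cameras: int, cameras_per_env: int) -> int:
    """Hypothetical helper: environments created for a total camera count."""
    if cameras_per_env <= 0:
        raise ValueError("cameras_per_env must be positive")
    # Integer division: a camera count that is not a multiple of
    # cameras_per_env is rounded down to whole environments.
    return num_cameras // cameras_per_env


print(num_envs_for(100, 2))  # 50
```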
If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also
find the maximum number of cameras that you could run in the specified environment up to
a certain performance threshold (specified by max CPU utilization percent, max RAM utilization percent,
max GPU compute percent, and max GPU memory percent). For example, to find the maximum number of cameras
you can run with cartpole, you could run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50
Autotune may lead to the program crashing, which means that it tried to run too many cameras at once. However, the max percentage utilization parameter is meant to prevent this from happening.
The output of the benchmark doesn’t include the overhead of training the network, so consider decreasing the maximum utilization percentages to account for this overhead. The final output camera count is for all cameras, so to get the total number of environments, divide the output camera count by the number of cameras per environment.
Compare Camera Type and Performance (Without a Specified Task)
This tool can also assess performance without a task environment. For example, to view 100 random objects with 2 standard cameras, one could run:
./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100
If your system cannot handle this due to performance reasons, then the process will be killed.
It’s recommended to monitor CPU/RAM utilization and GPU utilization while running this script, to get
an idea of how many resources rendering the desired camera requires. In Ubuntu, you can use tools like htop
and nvtop to live monitor resources while running this script, and in Windows, you can use the Task Manager.
If your system has a hard time handling the desired cameras, you can try the following:
Switch to headless mode (supply --headless)
Ensure you are using the GPU pipeline, not the CPU pipeline
If you aren’t using Tiled Cameras, switch to Tiled Cameras
Decrease camera resolution
Decrease how many data_types there are for each camera.
Decrease the number of cameras
Decrease the number of objects in the scene
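To get a feel for why resolution, data types, and camera count matter, the raw size of the camera outputs can be estimated with simple arithmetic. This is a rough sketch only: the bytes-per-pixel table is an illustrative assumption (8-bit RGB, 32-bit float depth), not something read from Isaac Lab, and the renderer's actual memory use will be considerably higher than the output buffers alone.

```python
# Assumed bytes per pixel for each data type (illustrative values only).
ASSUMED_BYTES_PER_PIXEL = {"rgb": 3, "depth": 4, "distance_to_image_plane": 4}


def estimate_output_bytes(num_cameras: int, height: int, width: int, data_types: list[str]) -> int:
    """Hypothetical helper: raw output size per frame across all cameras."""
    per_pixel = sum(ASSUMED_BYTES_PER_PIXEL[name] for name in data_types)
    return num_cameras * height * width * per_pixel


full = estimate_output_bytes(100, 120, 140, ["rgb", "depth"])
half_resolution = estimate_output_bytes(100, 60, 70, ["rgb", "depth"])
rgb_only = estimate_output_bytes(100, 120, 140, ["rgb"])
print(full, half_resolution, rgb_only)  # 11760000 2940000 5040000
```

Under these assumptions, halving both image dimensions cuts the output size by a factor of four, while dropping the depth channel cuts it by a little more than half.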
If your system is able to handle the number of cameras, then the time statistics will be printed to the terminal. After the simulation stops, the script can be closed with CTRL+C.
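The printed average timestep duration excludes the warm-start steps. A minimal sketch of that averaging is shown below, assuming a hypothetical `step` callable in place of the simulator; the `timestep > warm_start_length` comparison mirrors the one in the script, which means the first `warm_start_length + 1` steps are left out of the average.

```python
import time
from collections.abc import Callable


def average_step_time(
    step: Callable[[], None], experiment_length: int = 15, warm_start_length: int = 3
) -> tuple[float, int]:
    """Return (average seconds per counted step, number of counted steps)."""
    total = 0.0
    counted = 0
    for timestep in range(experiment_length):
        start = time.perf_counter()
        step()  # hypothetical stand-in for stepping the simulation and cameras
        elapsed = time.perf_counter() - start
        # Skip warm-start steps so start-up costs do not skew the average.
        if timestep > warm_start_length:
            total += elapsed
            counted += 1
    average = total / counted if counted else 0.0
    return average, counted


average, counted = average_step_time(lambda: sum(range(1000)))
print(f"Averaged over {counted} steps")  # Averaged over 11 steps
```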