Find How Many/What Cameras You Should Train With#

There are currently several camera types in Isaac Lab: the USD camera (standard), the tiled camera, and the ray caster camera. These camera types differ in functionality and performance. The benchmark_cameras.py script can be used to understand the differences between the camera types, as well as to characterize their relative performance at different parameters, such as camera quantity, image dimensions, and data types.
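For reference, these three camera types correspond to the following sensor classes and configs in isaaclab.sensors, as imported by the script below:

from isaaclab.sensors import (
    Camera,                # USD camera (standard)
    CameraCfg,
    RayCasterCamera,       # ray caster camera
    RayCasterCameraCfg,
    TiledCamera,           # tiled camera
    TiledCameraCfg,
)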

The purpose of this utility is to let users easily find the camera type/parameters that are the most performant while still meeting the requirements of their scenario. This utility can also help estimate the maximum number of cameras one can realistically run, assuming that one wants to maximize the number of environments while minimizing step time.

This utility can inject cameras into an existing task from the gym registry, which can be useful for benchmarking cameras in a specific scenario. Additionally, if you have pynvml installed, this utility can automatically find the maximum number of cameras that can run in your task environment, up to a specified system resource utilization threshold (without training; no actions are taken at each time step).
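At its core, the injection works by loading the task's configuration from the gym registry, attaching camera configs to the scene config, and only then building the environment. The following condensed sketch of the script's inject_cameras_into_task function illustrates the idea; the task name and camera counts are illustrative, and it assumes the script's context (the app has been launched and create_tiled_camera_cfg is defined):

import gymnasium as gym

from isaaclab_tasks.utils import load_cfg_from_registry

# Load the task's config from the gym registry (task name is illustrative).
cfg = load_cfg_from_registry("Isaac-Cartpole-v0", "env_cfg_entry_point")
# 100 cameras at 2 cameras per environment -> 50 environments.
cfg.scene.num_envs = 100 // 2
# Attach a camera config under a known attribute name before creating the env.
setattr(cfg.scene, "tiled_camera", create_tiled_camera_cfg("tiled_camera"))
env = gym.make("Isaac-Cartpole-v0", cfg=cfg)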

This guide accompanies the benchmark_cameras.py script in the scripts/benchmarks directory.

Code for benchmark_cameras.py
# Copyright (c) 2022-2026, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause

"""
This script might help you determine how many cameras your system can realistically run
at different desired settings.

You can supply different task environments to inject cameras into, or just test a sample scene.
Additionally, you can automatically find the maximum amount of cameras you can run a task with
through the auto-tune functionality.

.. code-block:: bash

    # Usage with GUI
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h

    # Usage with headless
    ./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h --headless

"""

"""Launch Isaac Sim Simulator first."""

import argparse
from collections.abc import Callable

from isaaclab.app import AppLauncher

# placeholder for the parsed CLI arguments (replaced by parse_args() below)
args_cli = argparse.Namespace()

parser = argparse.ArgumentParser(description="This script can help you benchmark how many cameras you could run.")

"""
The following arguments only need to be supplied when one wishes
to inject cameras into their environment and automatically determine
the maximum camera count.
"""
parser.add_argument(
    "--task",
    type=str,
    default=None,
    required=False,
    help="Supply this argument to spawn cameras within a known manager-based task environment.",
)

parser.add_argument(
    "--autotune",
    default=False,
    action="store_true",
    help=(
        "Autotuning is only supported for provided task environments."
        " Supply this argument to increase the number of environments until a desired threshold is reached."
        " Install pynvml in your environment: ./isaaclab.sh -m pip install pynvml"
    ),
)

parser.add_argument(
    "--task_num_cameras_per_env",
    type=int,
    default=1,
    help="The number of cameras per environment to use when using a known task.",
)

parser.add_argument(
    "--use_fabric", action="store_true", default=False, help="Enable fabric and use USD I/O operations."
)

parser.add_argument(
    "--autotune_max_percentage_util",
    nargs="+",
    type=float,
    default=[100.0, 80.0, 80.0, 80.0],
    required=False,
    help=(
        "The system utilization percentage thresholds to reach before an autotune is finished. "
        "If any one of these limits is hit, the autotune stops. "
        "Thresholds are, in order, maximum CPU percentage utilization, "
        "maximum RAM percentage utilization, maximum GPU compute percent utilization, "
        "and maximum GPU memory utilization."
    ),
)

parser.add_argument(
    "--autotune_max_camera_count", type=int, default=4096, help="The maximum number of cameras allowed in an autotune."
)

parser.add_argument(
    "--autotune_camera_count_interval",
    type=int,
    default=25,
    help=(
        "The number of cameras to try to add to the environment if the current camera count"
        " falls within permitted system resource utilization limits."
    ),
)

"""
The following arguments are shared for when injecting cameras into a task environment,
as well as when creating cameras independent of a task environment.
"""

parser.add_argument(
    "--num_tiled_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of tiled cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_standard_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of standard cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--num_ray_caster_cameras",
    type=int,
    default=0,
    required=False,
    help="Number of ray caster cameras to create. For autotuning, this is how many cameras to start with.",
)

parser.add_argument(
    "--tiled_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "depth"],
    help="The data types rendered by the tiled camera.",
)

parser.add_argument(
    "--standard_camera_data_types",
    nargs="+",
    type=str,
    default=["rgb", "distance_to_image_plane", "distance_to_camera"],
    help="The data types rendered by the standard camera.",
)

parser.add_argument(
    "--ray_caster_camera_data_types",
    nargs="+",
    type=str,
    default=["distance_to_image_plane"],
    help="The data types rendered by the ray caster camera.",
)

parser.add_argument(
    "--ray_caster_visible_mesh_prim_paths",
    nargs="+",
    type=str,
    default=["/World/ground"],
    help="WARNING: The ray caster can currently only cast against a single, static object.",
)

parser.add_argument(
    "--convert_depth_to_camera_to_image_plane",
    action="store_true",
    default=True,
    help=(
        "Enable undistorting from perspective view (distance to camera data_type)"
        " to orthogonal view (distance to plane data_type) for depth."
        " This is currently needed to create undistorted depth images/point clouds."
    ),
)

parser.add_argument(
    "--keep_raw_depth",
    dest="convert_depth_to_camera_to_image_plane",
    action="store_false",
    help=(
        "Disable undistorting from perspective view (distance to camera)"
        " to orthogonal view (distance to plane data_type) for depth."
    ),
)

parser.add_argument(
    "--height",
    type=int,
    default=120,
    required=False,
    help="Height in pixels of cameras",
)

parser.add_argument(
    "--width",
    type=int,
    default=140,
    required=False,
    help="Width in pixels of cameras",
)

parser.add_argument(
    "--warm_start_length",
    type=int,
    default=3,
    required=False,
    help=(
        "Number of steps to run the sim before starting the benchmark."
        " Needed to avoid blank images at the start of the simulation."
    ),
)

parser.add_argument(
    "--experiment_length",
    type=int,
    default=15,
    required=False,
    help="Number of steps to average over",
)

# This argument is only used when a task is not provided.
parser.add_argument(
    "--num_objects",
    type=int,
    default=10,
    required=False,
    help="Number of objects to spawn into the scene when not using a known task.",
)


AppLauncher.add_app_launcher_args(parser)
args_cli = parser.parse_args()
args_cli.enable_cameras = True

if args_cli.autotune:
    import pynvml

if len(args_cli.ray_caster_visible_mesh_prim_paths) > 1:
    print("[WARNING]: Ray Casting is only currently supported for a single, static object")
# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app

240"""Rest everything follows."""
241
242import gymnasium as gym
243import numpy as np
244import random
245import time
246import torch
247
248import psutil
249
250import isaaclab.sim as sim_utils
251from isaaclab.assets import RigidObject, RigidObjectCfg
252from isaaclab.scene.interactive_scene import InteractiveScene
253from isaaclab.sensors import (
254    Camera,
255    CameraCfg,
256    RayCasterCamera,
257    RayCasterCameraCfg,
258    TiledCamera,
259    TiledCameraCfg,
260    patterns,
261)
262from isaaclab.utils.math import orthogonalize_perspective_depth, unproject_depth
263
264from isaaclab_tasks.utils import load_cfg_from_registry
265
266"""
267Camera Creation
268"""
269
270
271def create_camera_base(
272    camera_cfg: type[CameraCfg | TiledCameraCfg],
273    num_cams: int,
274    data_types: list[str],
275    height: int,
276    width: int,
277    prim_path: str | None = None,
278    instantiate: bool = True,
279) -> Camera | TiledCamera | CameraCfg | TiledCameraCfg | None:
280    """Generalized function to create a camera or tiled camera sensor."""
281    # Determine prim prefix based on the camera class
282    name = camera_cfg.class_type.__name__
283
284    if instantiate:
285        # Create the necessary prims
286        for idx in range(num_cams):
287            sim_utils.create_prim(f"/World/{name}_{idx:02d}", "Xform")
288    if prim_path is None:
289        prim_path = f"/World/{name}_.*/{name}"
290    # If valid camera settings are provided, create the camera
291    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
292        cfg = camera_cfg(
293            prim_path=prim_path,
294            update_period=0,
295            height=height,
296            width=width,
297            data_types=data_types,
298            spawn=sim_utils.PinholeCameraCfg(
299                focal_length=24, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 1e4)
300            ),
301        )
302        if instantiate:
303            return camera_cfg.class_type(cfg=cfg)
304        else:
305            return cfg
306    else:
307        return None
308
309
310def create_tiled_cameras(
311    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
312) -> TiledCamera | None:
313    if data_types is None:
314        data_types = ["rgb", "depth"]
315    """Defines the tiled camera sensor to add to the scene."""
316    return create_camera_base(
317        camera_cfg=TiledCameraCfg,
318        num_cams=num_cams,
319        data_types=data_types,
320        height=height,
321        width=width,
322    )
323
324
325def create_cameras(
326    num_cams: int = 2, data_types: list[str] | None = None, height: int = 100, width: int = 120
327) -> Camera | None:
328    """Defines the Standard cameras."""
329    if data_types is None:
330        data_types = ["rgb", "depth"]
331    return create_camera_base(
332        camera_cfg=CameraCfg, num_cams=num_cams, data_types=data_types, height=height, width=width
333    )
334
335
336def create_ray_caster_cameras(
337    num_cams: int = 2,
338    data_types: list[str] = ["distance_to_image_plane"],
339    mesh_prim_paths: list[str] = ["/World/ground"],
340    height: int = 100,
341    width: int = 120,
342    prim_path: str = "/World/RayCasterCamera_.*/RayCaster",
343    instantiate: bool = True,
344) -> RayCasterCamera | RayCasterCameraCfg | None:
345    """Create the raycaster cameras; different configuration than Standard/Tiled camera"""
346    for idx in range(num_cams):
347        sim_utils.create_prim(f"/World/RayCasterCamera_{idx:02d}/RayCaster", "Xform")
348
349    if num_cams > 0 and len(data_types) > 0 and height > 0 and width > 0:
350        cam_cfg = RayCasterCameraCfg(
351            prim_path=prim_path,
352            mesh_prim_paths=mesh_prim_paths,
353            update_period=0,
354            offset=RayCasterCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(1.0, 0.0, 0.0, 0.0)),
355            data_types=data_types,
356            debug_vis=False,
357            pattern_cfg=patterns.PinholeCameraPatternCfg(
358                focal_length=24.0,
359                horizontal_aperture=20.955,
360                height=480,
361                width=640,
362            ),
363        )
364        if instantiate:
365            return RayCasterCamera(cfg=cam_cfg)
366        else:
367            return cam_cfg
368
369    else:
370        return None
371
372
373def create_tiled_camera_cfg(prim_path: str) -> TiledCameraCfg:
374    """Grab a simple tiled camera config for injecting into task environments."""
375    return create_camera_base(
376        TiledCameraCfg,
377        num_cams=args_cli.num_tiled_cameras,
378        data_types=args_cli.tiled_camera_data_types,
379        width=args_cli.width,
380        height=args_cli.height,
381        prim_path="{ENV_REGEX_NS}/" + prim_path,
382        instantiate=False,
383    )
384
385
386def create_standard_camera_cfg(prim_path: str) -> CameraCfg:
387    """Grab a simple standard camera config for injecting into task environments."""
388    return create_camera_base(
389        CameraCfg,
390        num_cams=args_cli.num_standard_cameras,
391        data_types=args_cli.standard_camera_data_types,
392        width=args_cli.width,
393        height=args_cli.height,
394        prim_path="{ENV_REGEX_NS}/" + prim_path,
395        instantiate=False,
396    )
397
398
def create_ray_caster_camera_cfg(prim_path: str) -> RayCasterCameraCfg:
    """Grab a simple ray caster config for injecting into task environments."""
    return create_ray_caster_cameras(
        num_cams=args_cli.num_ray_caster_cameras,
        data_types=args_cli.ray_caster_camera_data_types,
        width=args_cli.width,
        height=args_cli.height,
        prim_path="{ENV_REGEX_NS}/" + prim_path,
        instantiate=False,  # return the config only; the task environment constructs the sensor
    )


410"""
411Scene Creation
412"""
413
414
415def design_scene(
416    num_tiled_cams: int = 2,
417    num_standard_cams: int = 0,
418    num_ray_caster_cams: int = 0,
419    tiled_camera_data_types: list[str] | None = None,
420    standard_camera_data_types: list[str] | None = None,
421    ray_caster_camera_data_types: list[str] | None = None,
422    height: int = 100,
423    width: int = 200,
424    num_objects: int = 20,
425    mesh_prim_paths: list[str] = ["/World/ground"],
426) -> dict:
427    """Design the scene."""
428    if tiled_camera_data_types is None:
429        tiled_camera_data_types = ["rgb"]
430    if standard_camera_data_types is None:
431        standard_camera_data_types = ["rgb"]
432    if ray_caster_camera_data_types is None:
433        ray_caster_camera_data_types = ["distance_to_image_plane"]
434
435    # Populate scene
436    # -- Ground-plane
437    cfg = sim_utils.GroundPlaneCfg()
438    cfg.func("/World/ground", cfg)
439    # -- Lights
440    cfg = sim_utils.DistantLightCfg(intensity=3000.0, color=(0.75, 0.75, 0.75))
441    cfg.func("/World/Light", cfg)
442
443    # Create a dictionary for the scene entities
444    scene_entities = {}
445
446    # Xform to hold objects
447    sim_utils.create_prim("/World/Objects", "Xform")
448    # Random objects
449    for i in range(num_objects):
450        # sample random position
451        position = np.random.rand(3) - np.asarray([0.05, 0.05, -1.0])
452        position *= np.asarray([1.5, 1.5, 0.5])
453        # sample random color
454        color = (random.random(), random.random(), random.random())
455        # choose random prim type
456        prim_type = random.choice(["Cube", "Cone", "Cylinder"])
457        common_properties = {
458            "rigid_props": sim_utils.RigidBodyPropertiesCfg(),
459            "mass_props": sim_utils.MassPropertiesCfg(mass=5.0),
460            "collision_props": sim_utils.CollisionPropertiesCfg(),
461            "visual_material": sim_utils.PreviewSurfaceCfg(diffuse_color=color, metallic=0.5),
462            "semantic_tags": [("class", prim_type)],
463        }
464        if prim_type == "Cube":
465            shape_cfg = sim_utils.CuboidCfg(size=(0.25, 0.25, 0.25), **common_properties)
466        elif prim_type == "Cone":
467            shape_cfg = sim_utils.ConeCfg(radius=0.1, height=0.25, **common_properties)
468        elif prim_type == "Cylinder":
469            shape_cfg = sim_utils.CylinderCfg(radius=0.25, height=0.25, **common_properties)
470        # Rigid Object
471        obj_cfg = RigidObjectCfg(
472            prim_path=f"/World/Objects/Obj_{i:02d}",
473            spawn=shape_cfg,
474            init_state=RigidObjectCfg.InitialStateCfg(pos=position),
475        )
476        scene_entities[f"rigid_object{i}"] = RigidObject(cfg=obj_cfg)
477
478    # Sensors
479    standard_camera = create_cameras(
480        num_cams=num_standard_cams, data_types=standard_camera_data_types, height=height, width=width
481    )
482    tiled_camera = create_tiled_cameras(
483        num_cams=num_tiled_cams, data_types=tiled_camera_data_types, height=height, width=width
484    )
485    ray_caster_camera = create_ray_caster_cameras(
486        num_cams=num_ray_caster_cams,
487        data_types=ray_caster_camera_data_types,
488        mesh_prim_paths=mesh_prim_paths,
489        height=height,
490        width=width,
491    )
492    # return the scene information
493    if tiled_camera is not None:
494        scene_entities["tiled_camera"] = tiled_camera
495    if standard_camera is not None:
496        scene_entities["standard_camera"] = standard_camera
497    if ray_caster_camera is not None:
498        scene_entities["ray_caster_camera"] = ray_caster_camera
499    return scene_entities
500
501
502def inject_cameras_into_task(
503    task: str,
504    num_cams: int,
505    camera_name_prefix: str,
506    camera_creation_callable: Callable,
507    num_cameras_per_env: int = 1,
508) -> gym.Env:
509    """Loads the task, sticks cameras into the config, and creates the environment."""
510    cfg = load_cfg_from_registry(task, "env_cfg_entry_point")
511    cfg.sim.device = args_cli.device
512    cfg.sim.use_fabric = args_cli.use_fabric
513    scene_cfg = cfg.scene
514
515    num_envs = int(num_cams / num_cameras_per_env)
516    scene_cfg.num_envs = num_envs
517
518    for idx in range(num_cameras_per_env):
519        suffix = "" if idx == 0 else str(idx)
520        name = camera_name_prefix + suffix
521        setattr(scene_cfg, name, camera_creation_callable(name))
522    cfg.scene = scene_cfg
523    env = gym.make(task, cfg=cfg)
524    return env
525
526
527"""
528System diagnosis
529"""
530
531
532def get_utilization_percentages(reset: bool = False, max_values: list[float] = [0.0, 0.0, 0.0, 0.0]) -> list[float]:
533    """Get the maximum CPU, RAM, GPU utilization (processing), and
534    GPU memory usage percentages since the last time reset was true."""
535    if reset:
536        max_values[:] = [0, 0, 0, 0]  # Reset the max values
537
538    # CPU utilization
539    cpu_usage = psutil.cpu_percent(interval=0.1)
540    max_values[0] = max(max_values[0], cpu_usage)
541
542    # RAM utilization
543    memory_info = psutil.virtual_memory()
544    ram_usage = memory_info.percent
545    max_values[1] = max(max_values[1], ram_usage)
546
547    # GPU utilization using pynvml
548    if torch.cuda.is_available():
549
550        if args_cli.autotune:
551            pynvml.nvmlInit()  # Initialize NVML
552            for i in range(torch.cuda.device_count()):
553                handle = pynvml.nvmlDeviceGetHandleByIndex(i)
554
555                # GPU Utilization
556                gpu_utilization = pynvml.nvmlDeviceGetUtilizationRates(handle)
557                gpu_processing_utilization_percent = gpu_utilization.gpu  # GPU core utilization
558                max_values[2] = max(max_values[2], gpu_processing_utilization_percent)
559
560                # GPU Memory Usage
561                memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
562                gpu_memory_total = memory_info.total
563                gpu_memory_used = memory_info.used
564                gpu_memory_utilization_percent = (gpu_memory_used / gpu_memory_total) * 100
565                max_values[3] = max(max_values[3], gpu_memory_utilization_percent)
566
567            pynvml.nvmlShutdown()  # Shutdown NVML after usage
568    else:
569        gpu_processing_utilization_percent = None
570        gpu_memory_utilization_percent = None
571    return max_values
572
573
574"""
575Experiment
576"""
577
578
579def run_simulator(
580    sim: sim_utils.SimulationContext | None,
581    scene_entities: dict | InteractiveScene,
582    warm_start_length: int = 10,
583    experiment_length: int = 100,
584    tiled_camera_data_types: list[str] | None = None,
585    standard_camera_data_types: list[str] | None = None,
586    ray_caster_camera_data_types: list[str] | None = None,
587    depth_predicate: Callable = lambda x: "to" in x or x == "depth",
588    perspective_depth_predicate: Callable = lambda x: x == "distance_to_camera",
589    convert_depth_to_camera_to_image_plane: bool = True,
590    max_cameras_per_env: int = 1,
591    env: gym.Env | None = None,
592) -> dict:
593    """Run the simulator with all cameras, and return timing analytics. Visualize if desired."""
594
595    if tiled_camera_data_types is None:
596        tiled_camera_data_types = ["rgb"]
597    if standard_camera_data_types is None:
598        standard_camera_data_types = ["rgb"]
599    if ray_caster_camera_data_types is None:
600        ray_caster_camera_data_types = ["distance_to_image_plane"]
601
602    # Initialize camera lists
603    tiled_cameras = []
604    standard_cameras = []
605    ray_caster_cameras = []
606
607    # Dynamically extract cameras from the scene entities up to max_cameras_per_env
608    for i in range(max_cameras_per_env):
609        # Extract tiled cameras
610        tiled_camera_key = f"tiled_camera{i}" if i > 0 else "tiled_camera"
611        standard_camera_key = f"standard_camera{i}" if i > 0 else "standard_camera"
612        ray_caster_camera_key = f"ray_caster_camera{i}" if i > 0 else "ray_caster_camera"
613
614        try:  # if instead you checked ... if key is in scene_entities... # errors out always even if key present
615            tiled_cameras.append(scene_entities[tiled_camera_key])
616            standard_cameras.append(scene_entities[standard_camera_key])
617            ray_caster_cameras.append(scene_entities[ray_caster_camera_key])
618        except KeyError:
619            break
620
621    # Initialize camera counts
622    camera_lists = [tiled_cameras, standard_cameras, ray_caster_cameras]
623    camera_data_types = [tiled_camera_data_types, standard_camera_data_types, ray_caster_camera_data_types]
624    labels = ["tiled", "standard", "ray_caster"]
625
626    if sim is not None:
627        # Set camera world poses
628        for camera_list in camera_lists:
629            for camera in camera_list:
630                num_cameras = camera.data.intrinsic_matrices.size(0)
631                positions = torch.tensor([[2.5, 2.5, 2.5]], device=sim.device).repeat(num_cameras, 1)
632                targets = torch.tensor([[0.0, 0.0, 0.0]], device=sim.device).repeat(num_cameras, 1)
633                camera.set_world_poses_from_view(positions, targets)
634
635    # Initialize timing variables
636    timestep = 0
637    total_time = 0.0
638    valid_timesteps = 0
639    sim_step_time = 0.0
640
641    while simulation_app.is_running() and timestep < experiment_length:
642        print(f"On timestep {timestep} of {experiment_length}, with warm start of {warm_start_length}")
643        get_utilization_percentages()
644
645        # Measure the total simulation step time
646        step_start_time = time.time()
647
648        if sim is not None:
649            sim.step()
650
651        if env is not None:
652            with torch.inference_mode():
653                # compute zero actions
654                actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
655                # apply actions
656                env.step(actions)
657
658        # Update cameras and process vision data within the simulation step
659        clouds = {}
660        images = {}
661        depth_images = {}
662
663        # Loop through all camera lists and their data_types
664        for camera_list, data_types, label in zip(camera_lists, camera_data_types, labels):
665            for cam_idx, camera in enumerate(camera_list):
666
667                if env is None:  # No env, need to step cams manually
668                    # Only update the camera if it hasn't been updated as part of scene_entities.update ...
669                    camera.update(dt=sim.get_physics_dt())
670
671                for data_type in data_types:
672                    data_label = f"{label}_{cam_idx}_{data_type}"
673
674                    if depth_predicate(data_type):  # is a depth image, want to create cloud
675                        depth = camera.data.output[data_type]
676                        depth_images[data_label + "_raw"] = depth
677                        if perspective_depth_predicate(data_type) and convert_depth_to_camera_to_image_plane:
678                            depth = orthogonalize_perspective_depth(
679                                camera.data.output[data_type], camera.data.intrinsic_matrices
680                            )
681                            depth_images[data_label + "_undistorted"] = depth
682
683                        pointcloud = unproject_depth(depth=depth, intrinsics=camera.data.intrinsic_matrices)
684                        clouds[data_label] = pointcloud
685                    else:  # rgb image, just save it
686                        image = camera.data.output[data_type]
687                        images[data_label] = image
688
689        # End timing for the step
690        step_end_time = time.time()
691        sim_step_time += step_end_time - step_start_time
692
693        if timestep > warm_start_length:
694            get_utilization_percentages(reset=True)
695            total_time += step_end_time - step_start_time
696            valid_timesteps += 1
697
698        timestep += 1
699
700    # Calculate average timings
701    if valid_timesteps > 0:
702        avg_timestep_duration = total_time / valid_timesteps
703        avg_sim_step_duration = sim_step_time / experiment_length
704    else:
705        avg_timestep_duration = 0.0
706        avg_sim_step_duration = 0.0
707
708    # Package timing analytics in a dictionary
709    timing_analytics = {
710        "average_timestep_duration": avg_timestep_duration,
711        "average_sim_step_duration": avg_sim_step_duration,
712        "total_simulation_time": sim_step_time,
713        "total_experiment_duration": sim_step_time,
714    }
715
716    system_utilization_analytics = get_utilization_percentages()
717
718    print("--- Benchmark Results ---")
719    print(f"Average timestep duration: {avg_timestep_duration:.6f} seconds")
720    print(f"Average simulation step duration: {avg_sim_step_duration:.6f} seconds")
721    print(f"Total simulation time: {sim_step_time:.6f} seconds")
722    print("\nSystem Utilization Statistics:")
723    print(
724        f"| CPU:{system_utilization_analytics[0]}% | "
725        f"RAM:{system_utilization_analytics[1]}% | "
726        f"GPU Compute:{system_utilization_analytics[2]}% | "
727        f" GPU Memory: {system_utilization_analytics[3]:.2f}% |"
728    )
729
730    return {"timing_analytics": timing_analytics, "system_utilization_analytics": system_utilization_analytics}
731
732
def main():
    """Main function."""
    # Load simulation context
    if args_cli.num_tiled_cameras + args_cli.num_standard_cameras + args_cli.num_ray_caster_cameras <= 0:
        raise ValueError("You must select at least one camera.")
    if (
        (args_cli.num_tiled_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_standard_cameras > 0)
        or (args_cli.num_ray_caster_cameras > 0 and args_cli.num_tiled_cameras > 0)
    ):
        print("[WARNING]: You have elected to use more than one camera type.")
        print("[WARNING]: For a benchmark to be meaningful, use ONLY ONE camera type at a time.")
        print(
            "[WARNING]: For example, if num_tiled_cameras=100, for a meaningful benchmark, "
            "num_standard_cameras should be 0, and num_ray_caster_cameras should be 0."
        )
        raise ValueError("Benchmark one camera at a time.")

    print("[INFO]: Designing the scene")
    if args_cli.task is None:
        print("[INFO]: No task environment provided, creating random scene.")
        sim_cfg = sim_utils.SimulationCfg(device=args_cli.device)
        sim = sim_utils.SimulationContext(sim_cfg)
        # Set main camera
        sim.set_camera_view([2.5, 2.5, 2.5], [0.0, 0.0, 0.0])
        scene_entities = design_scene(
            num_tiled_cams=args_cli.num_tiled_cameras,
            num_standard_cams=args_cli.num_standard_cameras,
            num_ray_caster_cams=args_cli.num_ray_caster_cameras,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            height=args_cli.height,
            width=args_cli.width,
            num_objects=args_cli.num_objects,
            mesh_prim_paths=args_cli.ray_caster_visible_mesh_prim_paths,
        )
        # Play simulator
        sim.reset()
        # Now we are ready!
        print("[INFO]: Setup complete...")
        # Run simulator
        run_simulator(
            sim=sim,
            scene_entities=scene_entities,
            warm_start_length=args_cli.warm_start_length,
            experiment_length=args_cli.experiment_length,
            tiled_camera_data_types=args_cli.tiled_camera_data_types,
            standard_camera_data_types=args_cli.standard_camera_data_types,
            ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
            convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
        )
    else:
        print("[INFO]: Using known task environment, injecting cameras.")
        autotune_iter = 0
        max_sys_util_thresh = [0.0, 0.0, 0.0]
        max_num_cams = max(args_cli.num_tiled_cameras, args_cli.num_standard_cameras, args_cli.num_ray_caster_cameras)
        cur_num_cams = max_num_cams
        cur_sys_util = max_sys_util_thresh
        interval = args_cli.autotune_camera_count_interval

        if args_cli.autotune:
            max_sys_util_thresh = args_cli.autotune_max_percentage_util
            max_num_cams = args_cli.autotune_max_camera_count
            print("[INFO]: Auto tuning until any of the following thresholds are met")
            print(f"|CPU: {max_sys_util_thresh[0]}% | RAM {max_sys_util_thresh[1]}% | GPU: {max_sys_util_thresh[2]}% |")
            print(f"[INFO]: Maximum number of cameras allowed: {max_num_cams}")
        # Determine which camera is being tested...
        tiled_camera_cfg = create_tiled_camera_cfg("tiled_camera")
        standard_camera_cfg = create_standard_camera_cfg("standard_camera")
        ray_caster_camera_cfg = create_ray_caster_camera_cfg("ray_caster_camera")
        camera_name_prefix = ""
        camera_creation_callable = None
        num_cams = 0
        if tiled_camera_cfg is not None:
            camera_name_prefix = "tiled_camera"
            camera_creation_callable = create_tiled_camera_cfg
            num_cams = args_cli.num_tiled_cameras
        elif standard_camera_cfg is not None:
            camera_name_prefix = "standard_camera"
            camera_creation_callable = create_standard_camera_cfg
            num_cams = args_cli.num_standard_cameras
        elif ray_caster_camera_cfg is not None:
            camera_name_prefix = "ray_caster_camera"
            camera_creation_callable = create_ray_caster_camera_cfg
            num_cams = args_cli.num_ray_caster_cameras

        while (
            all(cur <= max_thresh for cur, max_thresh in zip(cur_sys_util, max_sys_util_thresh))
            and cur_num_cams <= max_num_cams
        ):
            cur_num_cams = num_cams + interval * autotune_iter
            autotune_iter += 1

            env = inject_cameras_into_task(
                task=args_cli.task,
                num_cams=cur_num_cams,
                camera_name_prefix=camera_name_prefix,
                camera_creation_callable=camera_creation_callable,
                num_cameras_per_env=args_cli.task_num_cameras_per_env,
            )
            env.reset()
            print(f"Testing with {cur_num_cams} {camera_name_prefix}")
            analysis = run_simulator(
                sim=None,
                scene_entities=env.unwrapped.scene,
                warm_start_length=args_cli.warm_start_length,
                experiment_length=args_cli.experiment_length,
                tiled_camera_data_types=args_cli.tiled_camera_data_types,
                standard_camera_data_types=args_cli.standard_camera_data_types,
                ray_caster_camera_data_types=args_cli.ray_caster_camera_data_types,
                convert_depth_to_camera_to_image_plane=args_cli.convert_depth_to_camera_to_image_plane,
                max_cameras_per_env=args_cli.task_num_cameras_per_env,
                env=env,
            )

            cur_sys_util = analysis["system_utilization_analytics"]
            print("Triggering reset...")
            env.close()
            sim_utils.create_new_stage()
        print("[INFO]: DONE! Feel free to CTRL + C Me ")
        print(f"[INFO]: If you've made it this far, you can likely simulate {cur_num_cams} {camera_name_prefix}")
        print("Keep in mind, this is without any training running on the GPU.")
        print("Set lower utilization thresholds to account for training.")

        if not args_cli.autotune:
            print("[WARNING]: GPU Util Statistics only correct while autotuning, ignore above.")


if __name__ == "__main__":
    # run the main function
    main()
    # close sim app
    simulation_app.close()

Possible Parameters#

First, run

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py -h

to see all of the possible parameters you can vary with this utility.

See the autotune-related command line arguments for more information on automatically determining the maximum camera count.

Compare Performance in Task Environments and Automatically Determine Task Max Camera Count#

Currently, the tiled camera is the most performant camera type that can handle multiple dynamic objects.

For example, to see how your system handles 100 tiled cameras in the cartpole environment, with 2 cameras per environment (so 50 environments total), rendering only in RGB mode, run:

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb

If you have pynvml installed (./isaaclab.sh -p -m pip install pynvml), you can also find the maximum number of cameras you can run in the specified environment up to a certain performance threshold (specified by the maximum CPU utilization percentage, maximum RAM utilization percentage, maximum GPU compute percentage, and maximum GPU memory percentage). For example, to find the maximum number of cameras you can run with cartpole, you could run:

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--task Isaac-Cartpole-v0 --num_tiled_cameras 100 \
--task_num_cameras_per_env 2 \
--tiled_camera_data_types rgb --autotune \
--autotune_max_percentage_util 100 80 50 50

An autotune may crash the program, which means that it tried to run too many cameras at once. However, the max percentage utilization parameters are meant to prevent this from happening.

The benchmark's output does not include the overhead of training a network, so consider lowering the maximum utilization percentages to account for that overhead. The final reported camera count is the total across all environments; to get the number of environments, divide the reported camera count by the number of cameras per environment (for example, 100 cameras at 2 cameras per environment corresponds to 50 environments).

Compare Camera Type and Performance (Without a Specified Task)#

This utility can also profile performance without a task environment. For example, to view 100 random objects through two standard cameras, you can run:

./isaaclab.sh -p scripts/benchmarks/benchmark_cameras.py \
--height 100 --width 100 --num_standard_cameras 2 \
--standard_camera_data_types instance_segmentation_fast normals --num_objects 100 \
--experiment_length 100

If your system cannot handle this for performance reasons, the process will be killed. It's recommended to monitor CPU/RAM utilization and GPU utilization while running this script, to get an idea of how many resources rendering the desired cameras requires. On Ubuntu, you can use tools like htop and nvtop to monitor resources live while running this script; on Windows, you can use the Task Manager.
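If you would rather log utilization programmatically than watch a separate monitor, a minimal sketch along the lines of the script's own get_utilization_percentages function (assuming psutil and pynvml are installed and an NVIDIA GPU is present) could look like this:

import psutil
import pynvml


def utilization_snapshot() -> str:
    """Return a one-line CPU/RAM/GPU utilization summary."""
    cpu = psutil.cpu_percent(interval=0.1)
    ram = psutil.virtual_memory().percent
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU only
    gpu = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    vram = 100.0 * mem.used / mem.total
    pynvml.nvmlShutdown()
    return f"| CPU: {cpu:.1f}% | RAM: {ram:.1f}% | GPU: {gpu}% | GPU Memory: {vram:.1f}% |"


print(utilization_snapshot())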

If your system has a hard time handling the desired cameras, you can try the following:

  • Switch to headless mode (supply --headless)

  • Ensure you are using the GPU pipeline, not the CPU!

  • If you aren't using the tiled camera, switch to it.

  • Decrease the camera resolution

  • Decrease how many data types there are per camera.

  • Decrease the number of cameras

  • Decrease the number of objects in the scene

If your system is able to handle the number of cameras, the timing statistics will be printed to the terminal. After the simulation stops, it can be closed with CTRL+C.