omni.isaac.lab_tasks.utils.data_collector#
Sub-module for data collection utilities.
All post-processed robomimic-compatible datasets share the same data structure: a single dataset is a single HDF5 file, and the stored data follows the structure defined by robomimic.
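For a concrete picture, the following is a minimal sketch of inspecting such a file with h5py. It assumes robomimic's documented layout (a data group holding demo_0, demo_1, ... sub-groups with obs, next_obs, actions, rewards, and dones entries, plus total and num_samples attributes) and reuses the file name from the example further below:

import h5py

with h5py.File("hdf_dataset.hdf5", "r") as f:
    data = f["data"]  # root group that holds all demonstrations
    print("total samples:", data.attrs["total"])
    for demo_name, demo in data.items():  # "demo_0", "demo_1", ...
        print(demo_name, "->", demo.attrs["num_samples"], "samples")
        print("  obs keys:", list(demo["obs"].keys()))
        print("  actions shape:", demo["actions"].shape)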
The collector takes batched input data and stores it as separate demonstrations, each corresponding to a given environment index. The demonstrations are flushed to disk when RobomimicDataCollector.flush() is called for the respective environments. All remaining data is saved when RobomimicDataCollector.close() is called.
The following sample shows how to use the RobomimicDataCollector to store random data in a dataset.
import os
import torch

from omni.isaac.lab_tasks.utils.data_collector import RobomimicDataCollector

# name of the environment (needed by robomimic)
task_name = "Isaac-Franka-Lift-v0"
# specify directory for logging experiments
test_dir = os.path.dirname(os.path.abspath(__file__))
log_dir = os.path.join(test_dir, "logs", "demos")
# name of the file to save data
filename = "hdf_dataset.hdf5"
# number of episodes to collect
num_demos = 10
# number of environments to simulate
num_envs = 4

# create data-collector
collector_interface = RobomimicDataCollector(task_name, log_dir, filename, num_demos)

# reset the collector
collector_interface.reset()

while not collector_interface.is_stopped():
    # generate random data to store
    # -- obs
    obs = {
        "joint_pos": torch.randn(num_envs, 10),
        "joint_vel": torch.randn(num_envs, 10),
    }
    # -- actions
    actions = torch.randn(num_envs, 10)
    # -- rewards
    rewards = torch.randn(num_envs)
    # -- dones
    dones = torch.rand(num_envs) > 0.5

    # store signals
    # -- obs
    for key, value in obs.items():
        collector_interface.add(f"obs/{key}", value)
    # -- actions
    collector_interface.add("actions", actions)
    # -- next_obs (in a real rollout these come from after the environment step;
    #    here we reuse the random observations for brevity)
    for key, value in obs.items():
        collector_interface.add(f"next_obs/{key}", value)
    # -- rewards
    collector_interface.add("rewards", rewards)
    # -- dones
    collector_interface.add("dones", dones)

    # flush data from the collector for environments that are done
    # note: with random dones, this flushes roughly half the environments each step
    reset_env_ids = dones.nonzero(as_tuple=False).squeeze(-1)
    collector_interface.flush(reset_env_ids)

# close collector
collector_interface.close()
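Once the collector is closed, the dataset can be consumed by robomimic. As a hedged sketch, assuming the collector writes robomimic's standard env_args metadata and that robomimic's file utilities are available in your installed version:

import robomimic.utils.file_utils as FileUtils

# read back the environment metadata stored in the dataset
env_meta = FileUtils.get_env_metadata_from_dataset("logs/demos/hdf_dataset.hdf5")
print(env_meta["env_name"])  # e.g. "Isaac-Franka-Lift-v0"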
Classes:

RobomimicDataCollector
    Data collection interface for robomimic.
Robomimic Data Collector#
- class omni.isaac.lab_tasks.utils.data_collector.RobomimicDataCollector[source]#
  Bases: object
Data collection interface for robomimic.
This class implements a data collector interface for saving simulation states to disk. The data is stored in HDF5 binary data format. The class is useful for collecting demonstrations. The collected data follows the structure from robomimic.
All datasets in robomimic require both the observations from before the environment step and the next observations from after it. These are stored as dictionaries of observations under the keys “obs” and “next_obs” respectively.
For certain agents in robomimic, the episode data should have the following additional keys: “actions”, “rewards”, “dones”. This behavior can be altered by changing the dataset keys required in the training configuration for the respective learning agent.
For reference on datasets, please check the robomimic documentation.
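As a hedged sketch of overriding the required dataset keys, the snippet below follows robomimic's configuration schema (config_factory, train.dataset_keys, and observation modalities); verify the field names against your installed robomimic version:

from robomimic.config import config_factory

# build a default behavior-cloning config and adjust which per-step keys
# are loaded from the dataset during training
config = config_factory(algo_name="bc")
config.train.data = "logs/demos/hdf_dataset.hdf5"
config.train.dataset_keys = ("actions", "rewards", "dones")
# observation keys must also be declared so "obs"/"next_obs" entries are read
config.observation.modalities.obs.low_dim = ["joint_pos", "joint_vel"]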
Methods:

__init__(env_name, directory_path[, ...])
    Initializes the data collection wrapper.

is_stopped()
    Whether data collection is stopped or not.

reset()
    Reset the internals of the data logger.

add(key, value)
    Add a key-value pair to the dataset.

flush([env_ids])
    Flush the episode data based on environment indices.

close()
    Stop recording and save the file at its current state.

Attributes:

demo_count
    The number of demos collected so far.
- __init__(env_name: str, directory_path: str, filename: str = 'test', num_demos: int = 1, flush_freq: int = 1, env_config: dict | None = None)[source]#
Initializes the data collection wrapper.
- Parameters:
env_name – The name of the environment.
directory_path – The path to store collected data.
filename – The basename of the saved file. Defaults to “test”.
num_demos – Number of demonstrations to record until stopping. Defaults to 1.
flush_freq – Frequency to dump data to disk. Defaults to 1.
env_config – The configuration for the environment. Defaults to None.
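For illustration, a hedged sketch that passes the optional arguments explicitly; the env_config contents here are hypothetical and the flush_freq comment simply restates the docstring above:

from omni.isaac.lab_tasks.utils.data_collector import RobomimicDataCollector

collector_interface = RobomimicDataCollector(
    env_name="Isaac-Franka-Lift-v0",
    directory_path="logs/demos",
    filename="hdf_dataset",
    num_demos=10,
    flush_freq=4,                      # how often collected data is dumped to disk
    env_config={"device": "cuda:0"},   # hypothetical metadata stored with the dataset
)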
- is_stopped() → bool[source]#
Whether data collection is stopped or not.
- Returns:
True if data collection has stopped.
- add(key: str, value: np.ndarray | torch.Tensor)[source]#
Add a key-value pair to the dataset.
The key can be nested by using the “/” character. For example: “obs/joint_pos”. Currently only two-level nesting is supported.
- Parameters:
key – The key name.
value – The corresponding value of shape (N, …), where N is the number of environments.
- Raises:
ValueError – When the provided key contains more than two levels of nesting, e.g. “obs/joints/pos” instead of the supported “obs/joint_pos”.
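A hedged usage sketch, reusing the collector, torch import, and num_envs from the sample above:

# two-level nested key: stored under the "obs" group in the dataset
collector_interface.add("obs/joint_pos", torch.randn(num_envs, 10))
# top-level key
collector_interface.add("actions", torch.randn(num_envs, 10))
# more than two levels raises a ValueError:
# collector_interface.add("obs/joints/pos", ...)  # not allowed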