Make Environment Utility#
This module defines a utility function to create environments with support for noise injection, reward shaping, action rescaling, and consistent seeding. Environment names are mapped to standardized strings, using Gymnasium IDs for MuJoCo tasks and custom identifiers for DM Control tasks.
Function#
- objectrl.utils.make_env.make_env(env_name: str, seed: int, env_config, eval_env: bool = False, num_envs: int = 1) → Env | VectorEnv
Create and configure a Gymnasium environment with optional wrappers for noise, reward shaping, and consistent seeding.
This function supports:
  - Gymnasium MuJoCo tasks
  - DM Control tasks, automatically wrapped for Gymnasium compatibility
  - MetaWorld tasks, with optional sparse rewards
  - Action rescaling to [-1, 1]
  - Noisy action and/or observation wrappers
  - Delayed reward and control cost penalties via PositionDelayWrapper
  - Reproducibility via consistent seeding for Gym, NumPy, and PyTorch
- Parameters:
env_name (str) – Name of the environment. Must be present in env_mappings and can belong to the Gymnasium MuJoCo, DM Control, or MetaWorld suites.
seed (int) – Base random seed for reproducibility.
env_config – Configuration object with nested attributes:
  - env_config.noisy.noisy_act (float): Std of Gaussian noise for actions.
  - env_config.noisy.noisy_obs (float): Std of Gaussian noise for observations.
  - env_config.position_delay (int): Delay threshold for reward.
  - env_config.control_cost_weight (float): Weight for control cost in reward.
eval_env (bool, optional) – If True, modifies the seed so that evaluation is seeded separately from training. Defaults to False.
num_envs (int, optional) – Number of environments to create. If greater than 1, the environments are run in parallel. Defaults to 1.
- Returns:
The fully constructed and wrapped Gymnasium environment instance.
- Return type:
gym.Env | VectorEnv
- Raises:
gym.error.Error – If env_name is not registered in Gym.
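The nested attributes that make_env reads from env_config can be pictured with a small stand-in object. This is only a sketch: the class names NoisyConfig and EnvConfig and all the values are hypothetical, chosen for illustration, and the project's real configuration class may be structured differently. Only the attribute names come from the parameter description above.

```python
from dataclasses import dataclass


@dataclass
class NoisyConfig:
    # Hypothetical container for the env_config.noisy.* attributes.
    noisy_act: float  # std of Gaussian noise for actions
    noisy_obs: float  # std of Gaussian noise for observations


@dataclass
class EnvConfig:
    # Hypothetical stand-in for the object passed as env_config.
    noisy: NoisyConfig
    position_delay: int  # delay threshold for reward
    control_cost_weight: float  # weight for control cost in reward


# Illustrative values only; they are not defaults taken from the library.
env_config = EnvConfig(
    noisy=NoisyConfig(noisy_act=0.1, noisy_obs=0.0),
    position_delay=2,
    control_cost_weight=0.001,
)

# Attribute access matches the paths listed in the parameter description.
print(env_config.noisy.noisy_act, env_config.position_delay)

# With the library installed, the object would be passed along as:
# env = make_env("HalfCheetah-v4", seed=0, env_config=env_config)
```

Any object exposing the same attribute paths should work equally well for this illustration; dataclasses are used here only because they make the nesting explicit.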
Key Features#
- Rescales continuous action spaces to the range [-1, 1].
- Optionally adds:
  - Noisy actions via NoisyActionWrapper
  - Noisy observations via NoisyObservationWrapper
  - Reward shaping via PositionDelayWrapper
- Reproducibility through seeding for:
  - Gym environment
  - Action and observation spaces
  - NumPy and PyTorch RNGs
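The ideas behind these features can be illustrated without any dependencies. The sketch below is not the code used by make_env or its wrappers. It assumes that rescaling is an affine map from the agent's [-1, 1] range to the environment's native bounds and that noise is zero-mean Gaussian, and the helper names rescale_action and add_gaussian_noise are hypothetical. It also shows why a fixed seed gives reproducible noise.

```python
import random


def rescale_action(action, low, high):
    """Map each component from [-1, 1] to [low_i, high_i] with an affine transform."""
    return [lo + (a + 1.0) * 0.5 * (hi - lo) for a, lo, hi in zip(action, low, high)]


def add_gaussian_noise(values, std, rng):
    """Add zero-mean Gaussian noise with standard deviation `std` to each component."""
    return [v + rng.gauss(0.0, std) for v in values]


# Hypothetical native action bounds for a two-dimensional action space.
low, high = [0.0, -2.0], [10.0, 2.0]

# The endpoints of [-1, 1] land on the native bounds, and 0 lands on the midpoint.
corners = rescale_action([-1.0, 1.0], low, high)
centre = rescale_action([0.0, 0.0], low, high)

# Two generators built from the same seed produce identical noise sequences.
noisy_a = add_gaussian_noise(centre, 0.1, random.Random(0))
noisy_b = add_gaussian_noise(centre, 0.1, random.Random(0))
reproducible = noisy_a == noisy_b
```

Passing an explicit generator rather than relying on global state keeps the noise stream tied to one seed, which is the same motivation as seeding the environment, its spaces, and the NumPy and PyTorch RNGs together.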
Usage Example#
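from objectrl.utils.make_env import make_env

# config.env is the env_config object described above
env = make_env("HalfCheetah-v4", 0, config.env)
# Gymnasium's reset() returns (observation, info)
obs, info = env.reset()
action = env.action_space.sample()
# Gymnasium's step() returns five values: terminated and truncated replace done
next_obs, reward, terminated, truncated, info = env.step(action)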