DMC Wrappers

Description

This module provides Gymnasium-compatible wrappers for DeepMind Control Suite (DMC) environments. It allows DMC environments to be used interchangeably with Gymnasium-based reinforcement learning algorithms by providing observation_space, action_space, and standard step/reset/render APIs.

Key features:

  • Convert dm_env specifications to Gymnasium spaces (Box or Dict) via dmc_spec2gym_space.

  • Wrap DMC tasks so they can be used with Gymnasium-compatible RL code.

  • Provide standard Gymnasium-style step and reset methods.

  • Support deterministic seeding and optional rendering.

Functions

objectrl.utils.environment.dmc_wrappers.dmc_spec2gym_space(spec: Array | dict | OrderedDict) -> Space

Convert a dm_env spec (an Array or BoundedArray, or a dict or OrderedDict of such specs) into a Gymnasium Space (Box or Dict).

Parameters:

spec (Union[dm_env.specs.Array, Dict, OrderedDict]) – The dm_env spec to convert.

Returns:

A corresponding Gymnasium space.

Return type:

spaces.Space
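The conversion described above can be pictured with a minimal sketch. It is not the library's implementation: ArraySpec, BoundedArraySpec, BoxSpace, DictSpace, and spec_to_space are hypothetical stand-ins for dm_env.specs.Array, dm_env.specs.BoundedArray, gymnasium.spaces.Box, gymnasium.spaces.Dict, and dmc_spec2gym_space, so that the example runs without either package installed. It assumes that bounded specs keep their limits, unbounded specs get infinite limits, and dict specs are converted key by key.

```python
from collections import OrderedDict
from dataclasses import dataclass
import math


# Hypothetical stand-ins for dm_env.specs.Array / BoundedArray.
@dataclass
class ArraySpec:
    shape: tuple
    dtype: str = "float64"


@dataclass
class BoundedArraySpec(ArraySpec):
    minimum: float = -1.0
    maximum: float = 1.0


# Hypothetical stand-ins for gymnasium.spaces.Box / Dict.
@dataclass
class BoxSpace:
    low: float
    high: float
    shape: tuple
    dtype: str


@dataclass
class DictSpace:
    spaces: dict


def spec_to_space(spec):
    """Recursively map a spec, or a dict of specs, to a space (assumed logic)."""
    if isinstance(spec, dict):  # OrderedDict is a dict subclass
        return DictSpace({key: spec_to_space(sub) for key, sub in spec.items()})
    if isinstance(spec, BoundedArraySpec):  # check the subclass before its base
        return BoxSpace(spec.minimum, spec.maximum, spec.shape, spec.dtype)
    if isinstance(spec, ArraySpec):  # unbounded array: infinite limits
        return BoxSpace(-math.inf, math.inf, spec.shape, spec.dtype)
    raise TypeError(f"Unsupported spec type: {type(spec).__name__}")


observation_spec = OrderedDict(position=ArraySpec((3,)), velocity=ArraySpec((2,)))
observation_space = spec_to_space(observation_spec)
action_space = spec_to_space(BoundedArraySpec((1,), minimum=-1.0, maximum=1.0))
```

Checking the bounded subclass before its base class matters here, because a bounded spec is also an instance of the unbounded one and would otherwise lose its limits.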

Classes

class objectrl.utils.environment.dmc_wrappers.DMCEnv(domain_name: str | None = None, task_name: str | None = None, env: Environment | None = None, task_kwargs: dict | None = None, environment_kwargs: dict | None = None)

Bases: Env

A Gymnasium-compatible wrapper for DeepMind Control Suite environments.

This class adapts dm_control environments to the Gymnasium API by exposing observation_space, action_space, and standard step/reset/render methods.

domain_name

The DMC domain name (e.g., “cartpole”).

Type:

str

task_name

The DMC task name (e.g., “swingup”).

Type:

str

action_space

The Gymnasium action space.

Type:

spaces.Space

observation_space

The Gymnasium observation space.

Type:

spaces.Space

__init__(domain_name: str | None = None, task_name: str | None = None, env: Environment | None = None, task_kwargs: dict | None = None, environment_kwargs: dict | None = None) -> None

Initialize the DMCEnv wrapper.

Parameters:
  • domain_name (Optional[str]) – Name of the control suite domain.

  • task_name (Optional[str]) – Name of the task in the domain.

  • env (Optional[dm_env.Environment]) – Pre-created dm_env environment.

  • task_kwargs (Optional[Dict]) – Keyword arguments for task creation. Must include a ‘random’ entry holding the seed if deterministic behavior is required.

  • environment_kwargs (Optional[Dict]) – Extra arguments for environment.

step(action: ndarray) -> tuple[dict[str, ndarray], float, bool, bool, dict]

Take a step in the environment.

Parameters:

action (np.ndarray) – Action to apply.

Returns:

A Gymnasium-style tuple:

(observation, reward, terminated, truncated, info)

Return type:

Tuple[Dict[str, np.ndarray], float, bool, bool, Dict]
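The page does not say how terminated and truncated are derived from the underlying dm_env time step, so the following is only a sketch of one common convention: a final step with a discount of 0 counts as terminated, and a final step with any other discount (for example, a time limit) counts as truncated. FakeTimeStep and to_gym_step are hypothetical names; FakeTimeStep mirrors the reward, discount, and observation fields of dm_env.TimeStep, with the step type simplified to a boolean.

```python
from typing import Any, NamedTuple, Optional


class FakeTimeStep(NamedTuple):
    """Hypothetical stand-in for dm_env.TimeStep (step type reduced to a flag)."""
    is_last: bool
    reward: Optional[float]
    discount: Optional[float]
    observation: Any


def to_gym_step(ts):
    """Map a time step to (observation, reward, terminated, truncated, info)."""
    # The first step of an episode carries no reward; treat it as zero.
    reward = float(ts.reward) if ts.reward is not None else 0.0
    # Assumed convention: discount 0 on the last step means a true terminal state.
    terminated = ts.is_last and ts.discount == 0.0
    # Any other end of episode is treated as truncation.
    truncated = ts.is_last and not terminated
    return ts.observation, reward, terminated, truncated, {}


obs = {"position": [0.0]}
mid_episode = to_gym_step(FakeTimeStep(False, 0.5, 1.0, obs))
terminal = to_gym_step(FakeTimeStep(True, 1.0, 0.0, obs))
time_limit = to_gym_step(FakeTimeStep(True, 1.0, 1.0, obs))
```

Keeping the two flags separate lets a learning algorithm bootstrap from the value of the next state after a truncation while treating a real termination as having no future return.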

reset(seed: int | None = None, options: dict | None = None) -> tuple[dict[str, ndarray], dict]

Reset the environment.

Parameters:
  • seed (Optional[int]) – Random seed for reproducibility.

  • options (Optional[Dict]) – Extra reset options.

Returns:

Initial observation and info dict.

Return type:

Tuple[Dict[str, np.ndarray], Dict]
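Because reset and step follow the Gymnasium signatures listed above, a generic rollout loop should work with the wrapper unchanged. The sketch below is illustrative: run_episode is a hypothetical helper, and StubEnv is a tiny stand-in environment included only so the loop runs without dm_control installed.

```python
import random


class _StubSpace:
    """Hypothetical action space exposing only sample()."""

    def sample(self):
        return random.uniform(-1.0, 1.0)


class StubEnv:
    """Minimal Gymnasium-style stand-in; ends by truncation after `horizon` steps."""

    action_space = _StubSpace()

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return {"position": [0.0]}, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon
        return {"position": [float(action)]}, 1.0, False, truncated, {}


def run_episode(env, seed=0, max_steps=1000):
    """Roll out one episode with random actions and return the summed reward."""
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        if terminated or truncated:
            break
    return total_reward


episode_return = run_episode(StubEnv(horizon=5))

# With the wrapper documented here, the call would presumably look like:
#   env = DMCEnv("cartpole", "swingup", task_kwargs={"random": 0})
#   print(run_episode(env))
```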

render(mode: str = 'rgb_array', height: int = 84, width: int = 84, camera_id: int = 0) -> ndarray

Render the environment as an RGB array.

Parameters:
  • mode (str) – Must be “rgb_array”.

  • height (int) – Image height (default: 84).

  • width (int) – Image width (default: 84).

  • camera_id (int) – Camera ID to render from.

Returns:

Rendered image of shape (height, width, 3).

Return type:

np.ndarray