Harvest Utilities

This module provides the Harvester class to aggregate, analyze, and visualize evaluation results across multiple reinforcement learning models and environments.

It computes metrics like Final Return, IQM (Interquartile Mean), and AULC (Area Under Learning Curve), and generates publication-ready plots and CSV/Markdown tables.

Classes

class objectrl.utils.harvest_utils.Harvester(config: HarvestConfig)

Bases: object

Collects, processes, and visualizes evaluation results for multiple models across various environments.

This class is designed to aggregate performance metrics like Final Return, Interquartile Mean (IQM), and Area Under the Learning Curve (AULC), and generate publication-ready plots and tables.

Parameters:

config (HarvestConfig) – Configuration object containing paths, model/env names, seeds, plotting parameters, and verbosity settings.

config

Configuration object with harvesting settings.

Type:

HarvestConfig

metrics

List of metrics to compute and visualize.

Type:

list[str]

results

Nested dictionary to store metric results.

Type:

dict[str, dict[str, dict[str, list[float]]]]

curves

Nested dictionary to store learning curves for each model and environment.

Type:

dict[str, dict[str, dict[str, Any]]]

__init__(config: HarvestConfig) -> None

Initialize the Harvester with a configuration object.

Parameters:

config (HarvestConfig) – Configuration containing paths, model/env names, seeds, plotting parameters, and verbosity settings.

Returns:

None

initialize_data_stores() -> None

Prepare internal data structures to store metric results and learning curves.

Parameters:

None

Returns:

None
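As a rough illustration of the nested stores described by the results and curves types above, the sketch below builds empty containers. The key order (metric, then environment, then model for results; environment, then model for curves), the metric names, and the helper name make_data_stores are assumptions made for illustration; the layout actually used by Harvester may differ.

```python
from typing import Any


def make_data_stores(
    metrics: list[str], envs: list[str], models: list[str]
) -> tuple[dict[str, dict[str, dict[str, list[float]]]], dict[str, dict[str, dict[str, Any]]]]:
    """Build empty nested stores (hypothetical helper, not part of objectrl)."""
    # Assumed layout: results[metric][env][model] -> list of per-seed values.
    results = {
        metric: {env: {model: [] for model in models} for env in envs}
        for metric in metrics
    }
    # Assumed layout: curves[env][model] -> dict holding curve data.
    curves = {env: {model: {} for model in models} for env in envs}
    return results, curves


# Placeholder names and value, used only to show how the store is indexed.
results, curves = make_data_stores(["final", "iqm", "aulc"], ["env_a"], ["sac", "td3"])
results["final"]["env_a"]["sac"].append(123.4)
```

Each innermost list is created separately by the comprehension, so appending a value for one model does not affect the others.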

get_result_file_path(env: str, model: str, seed: int) -> PathLike | None

Get the file path to the latest evaluation results for a given environment, model, and seed.

Parameters:
  • env (str) – Environment name.

  • model (str) – Model name.

  • seed (int) – Seed index.

Returns:

Full path to result .npy file, or None if not found.

Return type:

os.PathLike | None
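The lookup can be pictured with the following sketch. The directory layout (root/env/model/seed_NN), the interpretation of "latest" as newest modification time, and the helper name find_latest_result are all assumptions; the source only states that the method returns the path to a .npy file or None.

```python
from pathlib import Path
from typing import Optional


def find_latest_result(root: Path, env: str, model: str, seed: int) -> Optional[Path]:
    """Return the most recently modified .npy file for one run, or None.

    Hypothetical helper; the directory layout below is an assumption.
    """
    run_dir = root / env / model / f"seed_{seed:02d}"
    if not run_dir.is_dir():
        return None
    candidates = list(run_dir.rglob("*.npy"))
    if not candidates:
        return None
    # "Latest" is interpreted here as the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Returning None rather than raising lets a caller skip missing runs and continue with the remaining seeds.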

smooth_curve(x: ndarray) -> ndarray

Smooth a learning curve using a moving average defined in config.

Parameters:

x (np.ndarray) – Raw curve array (e.g., rewards over time).

Returns:

Smoothed version of input array.

Return type:

np.ndarray

Raises:

ValueError – If window is invalid or larger than the array length.
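A moving average with this kind of validation might look like the sketch below. The function name moving_average, the explicit window argument (the real method reads its window from config), and the choice of "valid" convolution mode are assumptions for illustration.

```python
import numpy as np


def moving_average(x: np.ndarray, window: int) -> np.ndarray:
    """Smooth a 1-D curve with a simple moving average (illustrative sketch)."""
    if window < 1 or window > len(x):
        raise ValueError(f"window must be in [1, {len(x)}], got {window}")
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps the
    # data, so the output has len(x) - window + 1 points.
    return np.convolve(x, kernel, mode="valid")


raw = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
smoothed = moving_average(raw, window=3)  # averages of each 3-point window
```

With "valid" mode the smoothed curve is shorter than the input; an implementation that needs equal lengths would have to pad or use a different mode.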

collect_results() -> None

Load results from disk and compute summary metrics (Final, IQM, AULC). Also smooth and store all evaluation curves for each model and seed.

Parameters:

None

Returns:

None
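The three summary metrics can be sketched with plain NumPy as follows. The function names are hypothetical, and several details are assumptions rather than the module's confirmed behavior: Final is taken as the last curve value, IQM trims a whole number of points from each end (an approximation when the sample count is not divisible by four), and AULC uses the unnormalized trapezoidal rule. Which values each metric is computed over (per seed, per evaluation step) is not specified in the source.

```python
import numpy as np


def final_return(curve: np.ndarray) -> float:
    """Last evaluation value of the curve."""
    return float(curve[-1])


def interquartile_mean(values: np.ndarray) -> float:
    """Mean of the middle half of the values (roughly 25% trimmed per end)."""
    ordered = np.sort(np.asarray(values, dtype=float).ravel())
    cut = len(ordered) // 4  # number of points trimmed from each end
    return float(ordered[cut : len(ordered) - cut].mean())


def area_under_curve(steps: np.ndarray, curve: np.ndarray) -> float:
    """Trapezoidal-rule area under the curve, not normalized."""
    widths = np.diff(steps)
    heights = (curve[1:] + curve[:-1]) / 2.0
    return float(np.sum(widths * heights))


# Toy curve used only to exercise the functions; not real evaluation data.
steps = np.array([0, 1, 2, 3])
curve = np.array([0.0, 1.0, 2.0, 3.0])
summary = {
    "final": final_return(curve),
    "iqm": interquartile_mean(curve),
    "aulc": area_under_curve(steps, curve),
}
```

IQM is less sensitive to outlier seeds than the plain mean, while AULC rewards models that learn quickly even when their final returns match.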

format_model_name(model: str) -> str

Convert model name to uppercase for presentation.

Parameters:

model (str) – Original model identifier.

Returns:

Formatted model name (uppercase).

Return type:

str

_plot_model_metrics(env: str, model: str, ax: Any, ax_env: Any, df_env: DataFrame, df_all: DataFrame) -> tuple[DataFrame, DataFrame]

Plot the learning curve for a single model and add its metrics to the dataframes.

Parameters:
  • env (str) – Name of the environment.

  • model (str) – Name of the model.

  • ax (Any) – Axes for combined figure (all environments).

  • ax_env (Any) – Axes for individual environment figure.

  • df_env (pd.DataFrame) – Environment-specific metrics table.

  • df_all (pd.DataFrame) – Aggregated table for all metrics.

Returns:

Updated environment-specific and aggregated metrics tables.

Return type:

tuple[pd.DataFrame, pd.DataFrame]

plot_results() -> None

Generate all visualizations and metric tables:

  • One plot per environment.

  • One aggregate plot for all environments.

  • CSV and Markdown outputs.

Parameters:

None

Returns:

None
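To illustrate the table outputs, the sketch below renders the same rows as CSV and as a Markdown table using only the standard library. The helper names, column headers, and values are placeholders chosen for illustration; the source does not specify the column layout Harvester writes, and its dataframe-based implementation may serialize tables differently.

```python
import csv
import io


def rows_to_csv(header: list[str], rows: list[list[str]]) -> str:
    """Serialize a header and rows as CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_markdown(header: list[str], rows: list[list[str]]) -> str:
    """Serialize a header and rows as a Markdown pipe table (hypothetical helper)."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines) + "\n"


header = ["Model", "Final", "IQM", "AULC"]
rows = [["SAC", "1.00", "0.90", "0.80"]]  # placeholder numbers, not real results
csv_text = rows_to_csv(header, rows)
md_text = rows_to_markdown(header, rows)
```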

harvest() -> None

Execute the full harvesting pipeline:

  • Collect results from files.

  • Compute statistics.

  • Generate plots and output tables.

Parameters:

None

Returns:

None

Key Features

  • Collect evaluation results from disk for multiple seeds.

  • Compute metrics such as Final Return, IQM, and AULC.

  • Smooth learning curves using a configurable moving average.

  • Generate per-environment and global plots.

  • Output results as .csv, .md, .png, and .pdf.

Typical Workflow

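from objectrl.utils.harvest_utils import Harvester
# `config` is a HarvestConfig instance prepared beforehand.
harvester = Harvester(config)
harvester.harvest()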

This will:

  1. Load and process evaluation results.

  2. Compute statistical summaries.

  3. Save visualizations and tables in config.result_path.