Common Functions

In addition to the ETAx class, which completely automates the rolling horizon optimization process and the learning process for reinforcement learning algorithms, the eta_x module also provides functions which simplify the creation of optimization runs and the required environments and algorithms. These functions also provide the interfaces for reading eta_x configuration files and for the logging capabilities of eta_x.

Instantiating Environments

Environments can be instantiated with the vectorize_environment function. The function will automatically wrap the environments with normalization wrappers and monitoring wrappers if required and it can create both interaction environments and normal environments.

eta_utility.eta_x.common.vectorize_environment(env: type[BaseEnv], config_run: ConfigOptRun, env_settings: EnvSettings, callback: Callable[[BaseEnv], None], verbose: int = 2, vectorizer: type[DummyVecEnv] = DummyVecEnv, n: int = 1, *, training: bool = False, monitor_wrapper: bool = False, norm_wrapper_obs: bool = False, norm_wrapper_reward: bool = False) → VecNormalize | VecEnv

Vectorize the environment and automatically apply normalization wrappers if configured. If the environment is initialized as an interaction_env, it will not have normalization wrappers and will automatically use the appropriate configuration.

Parameters:

env – Environment class which will be instantiated and vectorized.
config_run – Configuration for a specific optimization run.
env_settings – Configuration settings dictionary for the environment which is being initialized.
callback – Callback to call with an environment instance.
verbose – Logging verbosity to use in the environment.
vectorizer – Vectorizer class to use for vectorizing the environments.
n – Number of vectorized environments to create.
training – Flag to identify whether the environment should be initialized for training or playing. If true, it will be initialized for training.
monitor_wrapper – Flag to determine whether the environments should be wrapped in a monitoring wrapper.
norm_wrapper_obs – Flag to determine whether observations from the environments should be normalized.
norm_wrapper_reward – Flag to determine whether rewards from the environments should be normalized.

Returns:

Vectorized environments, possibly also wrapped in a normalizer.

The vectorize_environment function will automatically add a callback function as a parameter during environment instantiation. The callback should be called by the environment after each episode and after each step. It will create logging output depending on the configuration.

The callback generally used by ETAx is CallbackEnvironment, which allows for logging in specified intervals. A callback can be anything that is callable and takes an environment instance as its only argument.
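The callback contract described above can be illustrated without eta_x itself. The following is a minimal sketch, not the library's implementation: IntervalCallback, StandInEnv, finish_episode and the n_episodes attribute are hypothetical names introduced only for this illustration.

```python
class IntervalCallback:
    """Callable that acts on every `interval`-th call (hypothetical helper)."""

    def __init__(self, interval: int) -> None:
        self.interval = interval
        self.calls = 0    # state is kept on the instance, not in global variables
        self.logged = []  # episode numbers for which the callback produced output

    def __call__(self, env) -> None:
        self.calls += 1
        if self.calls % self.interval == 0:
            # A real callback might render the environment or write a log entry here.
            self.logged.append(env.n_episodes)


class StandInEnv:
    """Minimal stand-in for an environment (assumption, not a BaseEnv)."""

    def __init__(self, callback) -> None:
        self.callback = callback
        self.n_episodes = 0

    def finish_episode(self) -> None:
        self.n_episodes += 1
        self.callback(self)  # the environment passes itself as the only argument


callback = IntervalCallback(interval=2)
env = StandInEnv(callback)
for _ in range(5):
    env.finish_episode()
print(callback.logged)
```

Because the only requirement is a callable that accepts the environment instance, a plain function or a lambda would satisfy the same contract.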

class eta_utility.eta_x.common.CallbackEnvironment(plot_interval: int)

This callback should be called at the end of each episode. When multiprocessing is used, no global variables are available because each process runs its own Python instance.

Parameters: plot_interval – Number of episodes to pass between render calls.

__call__(env: BaseEnv) → None

Call the callback at the end of an episode.

Parameters: env – Instance of the environment where the callback was triggered.
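Putting the pieces together, a call to vectorize_environment with a CallbackEnvironment might look like the sketch below. It uses only the signatures documented in this section. The wrapper function build_vectorized_env and its n_envs parameter are hypothetical, and env_class, config_run and env_settings are assumed to have been created elsewhere (for example from an eta_x configuration file).

```python
def build_vectorized_env(env_class, config_run, env_settings, n_envs=1):
    """Create normalized, vectorized training environments (illustrative sketch)."""
    # Imported lazily so that the sketch can be loaded without eta_utility installed.
    from eta_utility.eta_x.common import CallbackEnvironment, vectorize_environment

    # Render every 10 episodes; the interval value is an arbitrary choice.
    callback = CallbackEnvironment(plot_interval=10)

    return vectorize_environment(
        env_class,
        config_run,
        env_settings,
        callback,
        n=n_envs,
        training=True,
        norm_wrapper_obs=True,
        norm_wrapper_reward=True,
    )
```

The last four arguments must be passed by keyword or, in the case of training and the wrapper flags, are keyword-only according to the signature above.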

Instantiating Algorithms

Algorithms (also referred to as models or agents) can be instantiated with the initialize_model and load_model functions. The initialize_model function creates a new model from scratch, while the load_model function loads an existing model from a file created by a stable_baselines3 algorithm. Both functions ensure that parameters are passed to the algorithm correctly and that logging output is correctly initialized.

eta_utility.eta_x.common.initialize_model(algo: type[BaseAlgorithm], policy: type[BasePolicy], envs: VecEnv | VecNormalize, algo_settings: AlgoSettings, seed: int | None = None, *, tensorboard_log: bool = False, log_path: Path | None = None) → BaseAlgorithm

Initialize a new model or algorithm.

Parameters:

algo – Algorithm to initialize.
policy – The policy that should be used by the algorithm.
envs – The environment which the algorithm operates on.
algo_settings – Additional settings for the algorithm.
seed – Random seed to be used by the algorithm.
tensorboard_log – Flag to enable logging to tensorboard.
log_path – Path for the tensorboard log. Only required if tensorboard_log is true.

Returns:

Initialized model.

eta_utility.eta_x.common.load_model(algo: type[BaseAlgorithm], envs: VecEnv | VecNormalize, algo_settings: AlgoSettings, path_model: Path, *, tensorboard_log: bool = False, log_path: Path | None = None) → BaseAlgorithm

Load an existing model.

Parameters:

algo – Algorithm type of the model to be loaded.
envs – The environment which the algorithm operates on.
algo_settings – Additional settings for the algorithm.
path_model – Path to load the model from.
tensorboard_log – Flag to enable logging to tensorboard.
log_path – Path for the tensorboard log. Only required if tensorboard_log is true.

Returns:

Loaded model.
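The choice between the two functions can be sketched as follows. This is an illustration under assumptions, not a prescribed workflow: the helper create_or_load_model is hypothetical, PPO with MlpPolicy from stable_baselines3 is an arbitrary choice of algorithm, and envs and algo_settings are assumed to exist already.

```python
def create_or_load_model(envs, algo_settings, path_model=None, seed=None):
    """Load a stored model if one exists, otherwise create a new one (sketch)."""
    # Imported lazily so that the sketch can be loaded without the libraries installed.
    from stable_baselines3 import PPO
    from stable_baselines3.ppo import MlpPolicy

    from eta_utility.eta_x.common import initialize_model, load_model

    # path_model is expected to be a pathlib.Path or None.
    if path_model is not None and path_model.is_file():
        return load_model(PPO, envs, algo_settings, path_model)

    return initialize_model(PPO, MlpPolicy, envs, algo_settings, seed)
```

Note that load_model takes no policy argument, since the policy is part of the stored model.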

Logging information

There are also functions for logging information about the optimization runs, such as the configuration and the network architecture.

eta_utility.eta_x.common.log_run_info(config: ConfigOpt, config_run: ConfigOptRun) → None

Save run configuration to the run_info file.

Parameters:

config – Configuration for the framework.
config_run – Configuration for this optimization run.

eta_utility.eta_x.common.log_net_arch(model: BaseAlgorithm, config_run: ConfigOptRun) → None

Store network architecture or policy information in a file. This requires the model to be initialized; otherwise, a ValueError is raised.

Parameters:

model – The algorithm whose network architecture is stored.
config_run – Optimization run configuration (which contains info about the file to store info in).

Raises:

ValueError – If the model is not initialized.
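A possible way to combine both logging functions, including handling of the documented ValueError, is sketched below. The helper log_run and its boolean return value are hypothetical; config, config_run and model are assumed to have been created beforehand.

```python
def log_run(config, config_run, model):
    """Log run configuration and network architecture (illustrative sketch).

    Returns True if the network architecture could be stored, False otherwise.
    """
    # Imported lazily so that the sketch can be loaded without eta_utility installed.
    from eta_utility.eta_x.common import log_net_arch, log_run_info

    log_run_info(config, config_run)
    try:
        log_net_arch(model, config_run)
    except ValueError:
        # Documented behaviour: raised when the model has not been initialized.
        return False
    return True
```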

Other helpful functions

eta_utility.eta_x.common.episode_name_string(run_name: str, episode: int, env_id: int = 1) → str

Generate a name which can be used as a prefix or postfix for files from a specific episode and run of an environment.

Name is of the format: ThisRun_001_01 (run name _ episode number _ environment id)

Parameters:

run_name – Name of the optimization run.
episode – Number of the episode the environment is working on.
env_id – Identification of the environment.
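The naming scheme can be reproduced with a format string. The sketch below is a stand-in written for illustration, not the library function: format_episode_name is a hypothetical name, and the zero-padding widths (three digits for the episode, two for the environment id) are inferred from the example ThisRun_001_01.

```python
def format_episode_name(run_name: str, episode: int, env_id: int = 1) -> str:
    """Build '<run name>_<episode>_<env id>' with zero-padded numbers (assumed widths)."""
    return f"{run_name}_{episode:03d}_{env_id:02d}"


print(format_episode_name("ThisRun", 1))
print(format_episode_name("ThisRun", 12, env_id=3))
```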

eta_utility.eta_x.common.episode_results_path(series_results_path: Path, run_name: str, episode: int, env_id: int = 1) → pathlib.Path

Generate a filepath which can be used for storing episode results of a specific environment as a csv file.

Name is of the format: ThisRun_001_01.csv (run name _ episode number _ environment id .csv)

Parameters:

series_results_path – Path for results of the series of optimization runs.
run_name – Name of the optimization run.
episode – Number of the episode the environment is working on.
env_id – Identification of the environment.
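How such a path could be assembled with pathlib is sketched below. This is an assumption-laden illustration rather than the library code: format_results_path is a hypothetical name, the padding widths are inferred from the example file name, and placing the file directly inside series_results_path is an assumption.

```python
from pathlib import PurePosixPath


def format_results_path(series_results_path, run_name, episode, env_id=1):
    """Join the results directory with '<run name>_<episode>_<env id>.csv' (sketch)."""
    file_name = f"{run_name}_{episode:03d}_{env_id:02d}.csv"
    return series_results_path / file_name


# PurePosixPath keeps the printed result identical on every operating system.
path = format_results_path(PurePosixPath("results/series"), "ThisRun", 1)
print(path)
```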

eta_utility.eta_x.common.is_env_closed(env: BaseEnv | VecEnv | VecNormalize | None) → bool

Check whether an environment has been closed.

Parameters: env – The environment to check.

eta_utility.eta_x.common.is_vectorized_env(env: BaseEnv | VecEnv | VecNormalize | None) → bool

Check whether an environment is vectorized.

Parameters: env – The environment to check.