imitation.util.video_wrapper

Wrapper to record rendered video frames from an environment.

Classes

VideoWrapper(env, directory[, single_video])

Creates videos from the wrapped environment by calling render after each timestep.

class imitation.util.video_wrapper.VideoWrapper(env, directory, single_video=True)

Bases: Wrapper

Creates videos from the wrapped environment by calling render after each timestep.

__init__(env, directory, single_video=True)

Builds a VideoWrapper.

Parameters
  • env (Env) – the wrapped environment.

  • directory (Path) – the output directory.

  • single_video (bool) – if True, generates a single video file with all episodes concatenated. If False, creates a new video file for each episode. A single video file is usually what is wanted. Separate files are useful when searching for an interesting episode (for example, by looking at the metadata).
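The effect of single_video can be illustrated with a small, library-free sketch. ToyEnv and FrameCollector below are hypothetical stand-ins, not part of imitation: they keep rendered frames in in-memory lists instead of writing video files, and the zero-padded names used as keys are an assumption made for illustration only.

```python
from typing import Dict, List, Optional, Tuple


class ToyEnv:
    """Hypothetical environment with the (observation, reward, done, info) API."""

    def __init__(self, episode_length: int = 3) -> None:
        self.episode_length = episode_length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 1.0, done, {}

    def render(self) -> str:
        # A real environment would return an image; a string keeps this runnable.
        return f"frame@t={self.t}"


class FrameCollector:
    """Hypothetical wrapper that groups frames the way single_video describes."""

    def __init__(self, env: ToyEnv, single_video: bool = True) -> None:
        self.env = env
        self.single_video = single_video
        self.episode_id = 0
        self.videos: Dict[str, List[str]] = {}
        self._current: Optional[str] = None

    def reset(self) -> int:
        # Open a new "file" for the first episode, or for every episode
        # when single_video is False.
        if self._current is None or not self.single_video:
            self._current = f"{self.episode_id:06}"  # assumed naming scheme
            self.videos[self._current] = []
        self.episode_id += 1
        return self.env.reset()

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        result = self.env.step(action)
        # Render once after each timestep and append to the current "file".
        self.videos[self._current].append(self.env.render())
        return result


def run_episodes(wrapper: FrameCollector, n_episodes: int) -> Dict[str, List[str]]:
    for _ in range(n_episodes):
        wrapper.reset()
        done = False
        while not done:
            _, _, done, _ = wrapper.step(0)
    return wrapper.videos


one_file = run_episodes(FrameCollector(ToyEnv(), single_video=True), 2)
per_episode = run_episodes(FrameCollector(ToyEnv(), single_video=False), 2)
print({name: len(frames) for name, frames in one_file.items()})
print({name: len(frames) for name, frames in per_episode.items()})
```

With two three-step episodes, the first mapping holds one entry with six frames and the second holds two entries with three frames each, mirroring the concatenated and per-episode modes described above.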

close()

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

Return type

None

directory: Path

episode_id: int

reset()

Resets the environment to an initial state and returns an initial observation.

Note that this function should not reset the environment’s random number generator(s); random variables in the environment’s state should be sampled independently between multiple calls to reset(). In other words, each call of reset() should yield an environment suitable for a new episode, independent of previous episodes.

Returns

observation (object) – the initial observation.

Return type

object

single_video: bool

step(action)

Runs one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters

action (object) – an action provided by the agent

Returns

A tuple (observation, reward, done, info):

  • observation (object) – the agent’s observation of the current environment.

  • reward (float) – amount of reward returned after the previous action.

  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results.

  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning).

Return type

tuple
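The step() contract described above (unpack the four-element tuple, call reset() once done is True, and call close() when finished) can be sketched as follows. CountdownEnv and rollout are hypothetical names used only for illustration and are not part of imitation. The loop relies only on reset(), step() and close(), so a wrapped environment exposing the same methods should fit the same pattern.

```python
from typing import Any, Dict, List, Tuple


class CountdownEnv:
    """Hypothetical environment: each episode lasts `horizon` steps."""

    def __init__(self, horizon: int = 4) -> None:
        self.horizon = horizon
        self.remaining = horizon
        self.closed = False

    def reset(self) -> int:
        self.remaining = self.horizon
        return self.remaining

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        self.remaining -= 1
        done = self.remaining == 0
        return self.remaining, float(action), done, {"remaining": self.remaining}

    def close(self) -> None:
        self.closed = True


def rollout(env: CountdownEnv, total_steps: int) -> List[float]:
    """Steps `env` for `total_steps` and returns the return of each finished episode."""
    returns: List[float] = []
    episode_return = 0.0
    env.reset()
    try:
        for _ in range(total_steps):
            _obs, reward, done, _info = env.step(1)
            episode_return += reward
            if done:
                # Further step() calls are undefined after done, so reset first.
                returns.append(episode_return)
                episode_return = 0.0
                env.reset()
    finally:
        # Release resources even if stepping raises.
        env.close()
    return returns


env = CountdownEnv(horizon=4)
episode_returns = rollout(env, total_steps=10)
print(episode_returns, env.closed)
```

Calling close() in a finally block is a design choice in this sketch: it avoids depending on garbage collection or program exit to trigger cleanup.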

video_recorder: Optional[VideoRecorder]