tensortrade.env.generic.environment module

class tensortrade.env.generic.environment.TradingEnv(action_scheme: tensortrade.env.generic.components.action_scheme.ActionScheme, reward_scheme: tensortrade.env.generic.components.reward_scheme.RewardScheme, observer: tensortrade.env.generic.components.observer.Observer, stopper: tensortrade.env.generic.components.stopper.Stopper, informer: tensortrade.env.generic.components.informer.Informer, renderer: tensortrade.env.generic.components.renderer.Renderer, min_periods: int = None, random_start_pct: float = 0.0, **kwargs)

Bases: gym.core.Env, tensortrade.core.base.TimeIndexed

A trading environment made for use with Gym-compatible reinforcement learning algorithms.

Parameters:
  • action_scheme (ActionScheme) – A component for generating an action to perform at each step of the environment.
  • reward_scheme (RewardScheme) – A component for computing reward after each step of the environment.
  • observer (Observer) – A component for generating observations after each step of the environment.
  • stopper (Stopper) – A component for determining whether the episode should stop after each step of the environment.
  • informer (Informer) – A component for providing information after each step of the environment.
  • renderer (Renderer) – A component for rendering the environment.
  • kwargs (keyword arguments) – Additional keyword arguments needed to create the environment.
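
The division of labour among the components listed above can be illustrated with a toy environment. This is a minimal sketch, not TensorTrade's implementation: ToyEnv, every Toy* class, and their method names are hypothetical stand-ins, and plain lists stand in for NumPy arrays so the block runs without any dependencies. It only shows how a single step might delegate to each component in turn.

```python
from typing import Any, Dict, List, Tuple


class ToyActionScheme:
    """Hypothetical stand-in: interprets the action (0 = flat, 1 = long)."""

    def perform(self, env: "ToyEnv", action: int) -> None:
        env.position = action


class ToyRewardScheme:
    """Hypothetical stand-in: reward is the last price change times the position."""

    def reward(self, env: "ToyEnv") -> float:
        return float(env.position * (env.prices[env.t] - env.prices[env.t - 1]))


class ToyObserver:
    """Hypothetical stand-in: the observation is just the current price."""

    def observe(self, env: "ToyEnv") -> List[float]:
        return [env.prices[env.t]]


class ToyStopper:
    """Hypothetical stand-in: the episode ends when the prices run out."""

    def stop(self, env: "ToyEnv") -> bool:
        return env.t >= len(env.prices) - 1


class ToyInformer:
    """Hypothetical stand-in: reports the step index and current position."""

    def info(self, env: "ToyEnv") -> Dict[str, Any]:
        return {"step": env.t, "position": env.position}


class ToyEnv:
    """Toy environment that delegates each part of a step to one component."""

    def __init__(self, prices, action_scheme, reward_scheme, observer, stopper, informer):
        self.prices = list(prices)
        self.action_scheme = action_scheme
        self.reward_scheme = reward_scheme
        self.observer = observer
        self.stopper = stopper
        self.informer = informer
        self.t = 0
        self.position = 0

    def reset(self) -> List[float]:
        self.t = 0
        self.position = 0
        return self.observer.observe(self)

    def step(self, action: int) -> Tuple[List[float], float, bool, Dict[str, Any]]:
        self.action_scheme.perform(self, action)  # apply the action
        self.t += 1                               # advance time by one period
        obs = self.observer.observe(self)         # new observation
        reward = self.reward_scheme.reward(self)  # reward for this step
        done = self.stopper.stop(self)            # should the episode end?
        info = self.informer.info(self)           # extra diagnostics
        return obs, reward, done, info


env = ToyEnv([10.0, 11.0, 9.0], ToyActionScheme(), ToyRewardScheme(),
             ToyObserver(), ToyStopper(), ToyInformer())
first_obs = env.reset()
obs, reward, done, info = env.step(1)  # go long for one period
```

Keeping each concern in its own object means a reward rule or stopping rule can be swapped without touching the rest of the environment, which is the motivation for the component-based constructor documented here.
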

agent_id = None

close() → None

Closes the environment.

components

The components of the environment. (Dict[str,Component], read-only)
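
Since components is documented as a name-to-component mapping, it can be walked for logging or debugging. The helper below is a small sketch; summarize_components is a hypothetical name, not part of TensorTrade, and it works on any dictionary.

```python
from typing import Any, Dict, List


def summarize_components(components: Dict[str, Any]) -> List[str]:
    """Return one 'name: ClassName' line per component, sorted by name."""
    return [
        f"{name}: {type(component).__name__}"
        for name, component in sorted(components.items())
    ]
```

With an environment instance named env (an assumption), calling summarize_components(env.components) would return one line per component.
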

episode_id = None

render(**kwargs) → None

Renders the environment.

reset() → numpy.array

Resets the environment.

Returns: obs (np.array) – The first observation of the environment.

save() → None

Saves the rendered view of the environment.

step(action: Any) → Tuple[numpy.array, float, bool, dict]

Makes one step through the environment.

Parameters: action (Any) – An action to perform on the environment.

Returns:
  • np.array – The observation of the environment after the action is performed.
  • float – The computed reward for performing the action.
  • bool – Whether or not the episode is complete.
  • dict – The information gathered after completing the step.
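
The reset() and step() contract described above is typically driven by a loop like the following. This is a sketch under assumptions: run_episode, its max_steps guard, and the CountdownEnv stand-in are hypothetical and exist only so the block runs on its own. Any object whose reset() returns an observation and whose step() returns the four-element tuple documented here should work in place of the stand-in.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_episode(env: Any, policy: Callable[[Any], Any],
                max_steps: int = 1000) -> Tuple[float, int]:
    """Run one episode and return (total reward, number of steps taken)."""
    obs = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done and steps < max_steps:
        action = policy(obs)                         # choose an action
        obs, reward, done, info = env.step(action)   # documented 4-tuple
        total_reward += float(reward)
        steps += 1
    return total_reward, steps


class CountdownEnv:
    """Hypothetical stand-in: pays 1.0 per step and ends after n steps."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.remaining = n

    def reset(self) -> List[int]:
        self.remaining = self.n
        return [self.remaining]

    def step(self, action: Any) -> Tuple[List[int], float, bool, Dict[str, Any]]:
        self.remaining -= 1
        return [self.remaining], 1.0, self.remaining == 0, {}


# A constant policy is enough to exercise the loop.
total, steps = run_episode(CountdownEnv(3), lambda obs: 0)
```

The max_steps guard is a design choice for the sketch: it keeps the loop finite even if an environment never reports that the episode is complete.
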