tensortrade.env.generic.environment module
class tensortrade.env.generic.environment.TradingEnv(action_scheme: tensortrade.env.generic.components.action_scheme.ActionScheme, reward_scheme: tensortrade.env.generic.components.reward_scheme.RewardScheme, observer: tensortrade.env.generic.components.observer.Observer, stopper: tensortrade.env.generic.components.stopper.Stopper, informer: tensortrade.env.generic.components.informer.Informer, renderer: tensortrade.env.generic.components.renderer.Renderer, **kwargs)
Bases: gym.core.Env, tensortrade.core.base.TimeIndexed
A trading environment made for use with Gym-compatible reinforcement learning algorithms.
Parameters:
- action_scheme (ActionScheme) – A component for generating an action to perform at each step of the environment.
- reward_scheme (RewardScheme) – A component for computing reward after each step of the environment.
- observer (Observer) – A component for generating observations after each step of the environment.
- stopper (Stopper) – A component for determining whether the episode should end after each step of the environment.
- informer (Informer) – A component for providing information after each step of the environment.
- renderer (Renderer) – A component for rendering the environment.
- kwargs (keyword arguments) – Additional keyword arguments needed to create the environment.
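How the components listed above could cooperate inside a single step is easier to see in a small sketch. The code below does not use TensorTrade: every class and method name in it (`ToyActionScheme.perform`, `ToyObserver.observe`, `ToyRewardScheme.reward`, `ToyStopper.stop`, `ToyInformer.info`, `ToyEnv`) is a hypothetical stand-in, the order of calls inside `step` is an assumption, observations are plain lists rather than NumPy arrays, and the renderer is left out for brevity.

```python
from typing import Any, Dict, List, Tuple

State = Dict[str, Any]


class ToyActionScheme:
    """Hypothetical: applies an action (0 = hold, 1 = buy, 2 = sell) to the state."""

    def perform(self, state: State, action: int) -> None:
        price = state["price"]
        if action == 1 and state["cash"] >= price:
            state["cash"] -= price
            state["units"] += 1
        elif action == 2 and state["units"] > 0:
            state["cash"] += price
            state["units"] -= 1


class ToyObserver:
    """Hypothetical: turns the state into an observation."""

    def observe(self, state: State) -> List[float]:
        return [state["price"], state["cash"], state["units"]]


class ToyRewardScheme:
    """Hypothetical: rewards the change in net worth since the last step."""

    def reward(self, state: State) -> float:
        net_worth = state["cash"] + state["units"] * state["price"]
        change = net_worth - state["last_net_worth"]
        state["last_net_worth"] = net_worth
        return change


class ToyStopper:
    """Hypothetical: ends the episode when the price series is exhausted."""

    def stop(self, state: State) -> bool:
        return state["t"] >= len(state["prices"]) - 1


class ToyInformer:
    """Hypothetical: reports extra information about the current step."""

    def info(self, state: State) -> dict:
        return {"step": state["t"]}


class ToyEnv:
    """Hypothetical environment that delegates each concern to a component."""

    def __init__(self, prices, action_scheme, reward_scheme, observer, stopper, informer):
        self.prices = list(prices)
        self.action_scheme = action_scheme
        self.reward_scheme = reward_scheme
        self.observer = observer
        self.stopper = stopper
        self.informer = informer
        self.state: State = {}

    def reset(self) -> List[float]:
        self.state = {
            "prices": self.prices, "t": 0, "price": self.prices[0],
            "cash": 100.0, "units": 0, "last_net_worth": 100.0,
        }
        return self.observer.observe(self.state)

    def step(self, action: int) -> Tuple[List[float], float, bool, dict]:
        # Assumed order: act, advance time, then observe, reward, stop, inform.
        self.action_scheme.perform(self.state, action)
        self.state["t"] += 1
        self.state["price"] = self.prices[self.state["t"]]
        obs = self.observer.observe(self.state)
        reward = self.reward_scheme.reward(self.state)
        done = self.stopper.stop(self.state)
        info = self.informer.info(self.state)
        return obs, reward, done, info
```

Keeping each concern in its own object means that, in a design like this, a reward definition or a stopping rule can be swapped without touching the environment class itself.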
agent_id = None
components
The components of the environment. (Dict[str, Component], read-only)
episode_id = None
reset() → numpy.array
Resets the environment.
Returns: obs (np.array) – The first observation of the environment.
step(action: Any) → Tuple[numpy.array, float, bool, dict]
Makes one step through the environment.
Parameters: action (Any) – An action to perform on the environment.
Returns:
- np.array – The observation of the environment after the action is performed.
- float – The computed reward for performing the action.
- bool – Whether or not the episode is complete.
- dict – The information gathered after completing the step.
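The `reset()` and `step()` signatures follow the classic Gym pattern: one call to `reset()` starts an episode, then `step()` is called repeatedly until the returned boolean is true. The sketch below shows that loop against a hypothetical stand-in, `CountdownEnv`, which only mimics the documented return shapes (with a list in place of a NumPy array); `run_episode` and the policy callables are likewise illustrative names, not part of TensorTrade.

```python
from typing import Any, Callable, List, Tuple


class CountdownEnv:
    """Hypothetical stand-in with the same reset()/step() shape as documented above."""

    def __init__(self, n_steps: int = 3) -> None:
        self.n_steps = n_steps
        self.t = 0

    def reset(self) -> List[float]:
        self.t = 0
        return [0.0]

    def step(self, action: Any) -> Tuple[List[float], float, bool, dict]:
        self.t += 1
        obs = [float(self.t)]
        reward = 1.0 if action == 1 else 0.0  # toy rule: action 1 earns a reward
        done = self.t >= self.n_steps
        info = {"step": self.t}
        return obs, reward, done, info


def run_episode(env: Any, policy: Callable[[List[float]], Any]) -> Tuple[float, int]:
    """Run one episode and return (total reward, number of steps taken)."""
    obs = env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = policy(obs)
        # Unpack the four documented return values of step().
        obs, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps
```

Any object exposing the same two methods could be passed to `run_episode` in place of the stand-in, which is the point of keeping the environment Gym-compatible.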