A trading environment is a reinforcement learning environment that follows OpenAI’s
gym.Env specification. This allows us to leverage many of the existing reinforcement learning models in our trading agent, if we’d like.
TradingEnv steps through the various interfaces of the tensortrade library in a consistent way, and is unlikely to change often even as other parts of tensortrade change. Below is an overview of the trading environment.
Trading environments are fully configurable gym environments with highly composable components:
ActionScheme interprets and applies the agent’s actions to the environment.
RewardScheme computes the reward for each time step based on the agent’s performance.
Observer generates the next observation for the agent.
Stopper determines whether or not the episode is over.
Informer generates useful monitoring information at each time step.
Renderer renders a view of the environment and interactions.
That’s all there is to it. Now it’s just a matter of composing these components into a complete environment.
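As a rough illustration of how the six components could be composed, the sketch below wires hypothetical stand-ins into a gym-style step loop. None of these classes are tensortrade's own: `SketchEnv`, `BuyOrHold`, `PriceChangeReward` and the other names, their method signatures, and the price list are all assumptions made for this example. The class only mimics the shape of a gym.Env step and does not import gym.

```python
from typing import Any, Dict, List, Tuple


class BuyOrHold:
    """Hypothetical action scheme: action 1 buys one unit, action 0 holds."""

    def perform(self, env: "SketchEnv", action: int) -> None:
        if action == 1:
            env.position += 1


class PriceChangeReward:
    """Hypothetical reward scheme: position size times the latest price change."""

    def reward(self, env: "SketchEnv") -> float:
        change = env.prices[env.clock] - env.prices[env.clock - 1]
        return env.position * change


class LastPriceObserver:
    """Hypothetical observer: current price and current position."""

    def observe(self, env: "SketchEnv") -> List[float]:
        return [env.prices[env.clock], float(env.position)]


class EndOfDataStopper:
    """Hypothetical stopper: the episode ends at the last price."""

    def stop(self, env: "SketchEnv") -> bool:
        return env.clock >= len(env.prices) - 1


class StepInformer:
    """Hypothetical informer: reports the step index and position."""

    def info(self, env: "SketchEnv") -> Dict[str, Any]:
        return {"step": env.clock, "position": env.position}


class TextRenderer:
    """Hypothetical renderer: one-line text view of the environment."""

    def render(self, env: "SketchEnv") -> str:
        price = env.prices[env.clock]
        return f"step={env.clock} price={price} position={env.position}"


class SketchEnv:
    """Illustrative environment that only delegates to its components."""

    def __init__(self, prices, action_scheme, reward_scheme,
                 observer, stopper, informer, renderer) -> None:
        self.prices = prices
        self.action_scheme = action_scheme
        self.reward_scheme = reward_scheme
        self.observer = observer
        self.stopper = stopper
        self.informer = informer
        self.renderer = renderer
        self.clock = 0       # index into prices
        self.position = 0    # units currently held

    def step(self, action: int) -> Tuple[List[float], float, bool, Dict[str, Any]]:
        self.action_scheme.perform(self, action)  # apply the agent's action
        self.clock += 1                           # advance one time step
        obs = self.observer.observe(self)
        reward = self.reward_scheme.reward(self)
        done = self.stopper.stop(self)
        info = self.informer.info(self)
        return obs, reward, done, info

    def render(self) -> str:
        return self.renderer.render(self)


env = SketchEnv(
    prices=[10.0, 12.0, 11.0],  # made-up price series
    action_scheme=BuyOrHold(),
    reward_scheme=PriceChangeReward(),
    observer=LastPriceObserver(),
    stopper=EndOfDataStopper(),
    informer=StepInformer(),
    renderer=TextRenderer(),
)
obs, reward, done, info = env.step(1)  # buy one unit, then advance one step
```

Because the environment holds no trading logic of its own, swapping in a different reward or action component changes its behaviour without touching the step loop, which is the point of keeping the components composable.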
When the reset method of a
TradingEnv is called, all of its child components are also reset. The internal state of each action scheme, reward scheme, observer, stopper, and informer is set back to its default values, ready for the next episode.
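The reset behaviour described above can be sketched as a parent object that forwards `reset()` to each child. The classes here (`CountingComponent`, `ResettableEnv`) are hypothetical and are not part of tensortrade. They only show the delegation pattern, assuming every component exposes a `reset()` method. The renderer is left out because the text above names only the other five components.

```python
class CountingComponent:
    """Hypothetical component whose per-episode state is a call counter."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.calls = 0

    def use(self) -> None:
        self.calls += 1  # state that accumulates during an episode

    def reset(self) -> None:
        self.calls = 0   # back to the default value


class ResettableEnv:
    """Illustrative environment that resets all of its child components."""

    def __init__(self) -> None:
        names = ("action_scheme", "reward_scheme", "observer", "stopper", "informer")
        self.components = [CountingComponent(name) for name in names]
        self.clock = 0

    def step(self) -> None:
        for component in self.components:
            component.use()
        self.clock += 1

    def reset(self) -> None:
        self.clock = 0
        for component in self.components:
            component.reset()  # forward the reset to every child


env = ResettableEnv()
env.step()
env.step()
calls_before = [c.calls for c in env.components]  # state built up over two steps
env.reset()
calls_after = [c.calls for c in env.components]   # state after the reset
```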
What if I can’t make a particular environment?
If none of the environments available in the codebase serve your needs, let us know! We would love to hear about it so we can keep improving the quality of our framework and keep up with the needs of the people using it.