TradingEnvironment(portfolio, action_scheme, reward_scheme, feed=None, window_size=1, use_internal=True, renderers='screenlog', **kwargs)¶
A trading environment made for use with Gym-compatible reinforcement learning algorithms.
__init__(portfolio, action_scheme, reward_scheme, feed=None, window_size=1, use_internal=True, renderers='screenlog', **kwargs)¶
- portfolio (Union[Portfolio, str]) – The Portfolio of wallets used to submit and execute orders from.
- action_scheme (Union[ActionScheme, str]) – The component for transforming an action into an Order at each time step.
- reward_scheme (Union[RewardScheme, str]) – The component for determining the reward at each time step.
- feed (optional) – The pipeline of features to pass the observations through.
- renderers (optional) – A single renderer or a list of renderers, given by name or as objects. String values: ‘screenlog’, ‘filelog’, or ‘plotly’. Use None for no rendering.
- price_history (optional) – OHLCV price history feed used for rendering the chart. Required if render_mode is ‘plotly’.
- kwargs (optional) – Additional arguments for tuning the environments, logging, etc.
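The action and reward schemes can be supplied either as registered string names or as component objects. A minimal plain-Python sketch of the two roles (the class names below are illustrative stand-ins, not the library's actual base classes):

```python
class SimpleActionScheme:
    """Stand-in action scheme: maps an integer action to an order at each step."""

    def get_order(self, action, portfolio):
        # Illustrative encoding only: 0 = hold, 1 = buy, 2 = sell.
        return {0: None, 1: "buy", 2: "sell"}.get(action)


class SimpleRewardScheme:
    """Stand-in reward scheme: rewards the change in net worth between steps."""

    def __init__(self):
        self.previous_net_worth = None

    def get_reward(self, net_worth):
        # First call establishes a baseline and yields no reward.
        if self.previous_net_worth is None:
            self.previous_net_worth = net_worth
            return 0.0
        reward = net_worth - self.previous_net_worth
        self.previous_net_worth = net_worth
        return reward
```

The environment calls the action scheme to turn each agent action into an Order and the reward scheme to score the resulting portfolio state, which is why both are constructor arguments rather than internals.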
action_scheme¶
The component for transforming an action into an Order at each time step.
close()¶
Utility method to clean up the environment before closing.
compile()¶
Sets the observation space and the action space of the environment, creates the internal feed, and initializes the remaining components.
episode_trades¶
A dictionary of the trades made this episode, organized by order ID.
render(episode=None)¶
Renders the environment.

Parameters: episode (Optional[int]) – Current episode number (0-based).
reset()¶
Resets the state of the environment and returns an initial observation.

Returns: The episode’s initial observation.
save(episode=None)¶
Saves the environment.

Parameters: episode (Optional[int]) – Current episode number (0-based).
step(action)¶
Runs one time step within the environment based on the specified action.

Parameters: action (int) – The trade action provided by the agent for this time step.

Returns:
- observation (pandas.DataFrame) – Provided by the environment’s exchange, often OHLCV or tick trade history data points.
- reward (float) – An amount corresponding to the benefit earned by the action taken this time step.
- done (bool) – If True, the environment is complete and should be restarted.
- info (dict) – Any auxiliary, diagnostic, or debugging information to output.
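A typical agent loop over this Gym-style `reset`/`step` contract can be sketched against a minimal stand-in environment (constructing the real one requires a portfolio and data feed, so the stub below only mimics the return signature):

```python
import random


class StubTradingEnvironment:
    """Minimal stand-in exposing the same reset/step contract as the environment."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.current_step = 0

    def reset(self):
        # Returns the episode's initial observation.
        self.current_step = 0
        return {"close": 100.0}

    def step(self, action):
        # Returns the Gym-style 4-tuple: (observation, reward, done, info).
        self.current_step += 1
        observation = {"close": 100.0 + self.current_step}
        reward = 1.0 if action == 1 else 0.0  # toy reward: pay out for "buy"
        done = self.current_step >= self.max_steps
        info = {"step": self.current_step}
        return observation, reward, done, info


env = StubTradingEnvironment()
observation = env.reset()
total_reward = 0.0
done = False
while not done:
    action = random.choice([0, 1, 2])  # a trained agent would choose here
    observation, reward, done, info = env.step(action)
    total_reward += reward
```

The loop terminates when `done` is True, after which `reset()` must be called before stepping again.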