tensortrade.env.default.rewards module

class tensortrade.env.default.rewards.PBR(price: tensortrade.feed.core.base.Stream)[source]

Bases: tensortrade.env.default.rewards.TensorTradeRewardScheme

A reward scheme for position-based returns.

  • Let \(p_t\) denote the price at time \(t\).
  • Let \(x_t\) denote the position at time \(t\).
  • Let \(R_t\) denote the reward at time \(t\).

The reward is then defined as \(R_{t} = (p_{t} - p_{t-1}) \cdot x_{t}\).

Parameters: price (Stream) – The price stream to use for computing rewards.
get_reward(portfolio: Portfolio) → float[source]

Gets the reward associated with the current step of the episode.

Parameters: portfolio (Portfolio) – The portfolio associated with the TensorTradeActionScheme.
Returns: float – The reward for the current step of the episode.
on_action(action: int) → None[source]

Updates the scheme's current position based on the latest action taken in the environment.
registered_name = 'pbr'
reset() → None[source]

Resets the position and feed of the reward scheme.
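The position-based return above can be sketched in plain Python. This is an illustrative reimplementation of the formula \(R_t = (p_t - p_{t-1}) \cdot x_t\), not the library's internal code; in practice PBR consumes a price Stream and the positions come from the action scheme.

```python
def pbr_rewards(prices, positions):
    """Compute R_t = (p_t - p_{t-1}) * x_t for each step t >= 1.

    `positions` holds x_t, e.g. 1 for long and -1 for short.
    """
    return [(prices[t] - prices[t - 1]) * positions[t]
            for t in range(1, len(prices))]

prices = [100.0, 102.0, 101.0]
positions = [1, 1, -1]  # long, long, then short
print(pbr_rewards(prices, positions))  # [2.0, 1.0]
```

Note that holding a short position (\(x_t = -1\)) turns a price drop into a positive reward, as in the last step above.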

class tensortrade.env.default.rewards.RiskAdjustedReturns(return_algorithm: str = 'sharpe', risk_free_rate: float = 0.0, target_returns: float = 0.0, window_size: int = 1)[source]

Bases: tensortrade.env.default.rewards.TensorTradeRewardScheme

A reward scheme that rewards the agent for increasing its net worth, while penalizing more volatile strategies.

Parameters:
  • return_algorithm ({'sharpe', 'sortino'}, Default 'sharpe') – The risk-adjusted return metric to use.
  • risk_free_rate (float, Default 0.0) – The risk-free rate of return to use when calculating the metrics.
  • target_returns (float, Default 0.0) – The target return per period, used when calculating the Sortino ratio.
  • window_size (int, Default 1) – The size of the look-back window for computing the reward.
get_reward(portfolio: Portfolio) → float[source]

Computes the reward corresponding to the selected risk-adjusted return metric.

Parameters: portfolio (Portfolio) – The current portfolio being used by the environment.
Returns: float – The reward corresponding to the selected risk-adjusted return metric.
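The two supported metrics can be sketched as follows. These are textbook Sharpe and Sortino formulas over a window of per-step returns, written as a minimal illustration rather than the library's exact implementation (the library may differ in annualization and smoothing constants).

```python
import statistics

def sharpe(returns, risk_free_rate=0.0):
    # Sharpe ratio: mean excess return over its standard deviation.
    # A small epsilon avoids division by zero on flat windows.
    excess = [r - risk_free_rate for r in returns]
    return statistics.mean(excess) / (statistics.stdev(excess) + 1e-9)

def sortino(returns, target_returns=0.0):
    # Sortino ratio: like Sharpe, but only downside deviation below
    # the target return counts as risk.
    downside_sq = [min(0.0, r - target_returns) ** 2 for r in returns]
    downside_dev = (sum(downside_sq) / len(returns)) ** 0.5
    return (statistics.mean(returns) - target_returns) / (downside_dev + 1e-9)
```

Because the Sortino ratio ignores upside volatility, a strategy with large gains and small losses scores better under 'sortino' than under 'sharpe'.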
class tensortrade.env.default.rewards.SimpleProfit(window_size: int = 1)[source]

Bases: tensortrade.env.default.rewards.TensorTradeRewardScheme

A simple reward scheme that rewards the agent for incremental increases in net worth.

Parameters: window_size (int) – The size of the look-back window for computing the reward.
window_size

The size of the look-back window for computing the reward.

Type:int
get_reward(portfolio: Portfolio) → float[source]

Rewards the agent for incremental increases in net worth over a sliding window.

Parameters: portfolio (Portfolio) – The portfolio being used by the environment.
Returns: float – The cumulative percentage change in net worth over the previous window_size time steps.
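The windowed percentage change described above can be sketched in a few lines. This is an illustrative stand-in assuming the scheme compares the latest net worth against the net worth window_size steps earlier, clipped to the start of the history; it is not the library's exact code.

```python
def simple_profit(net_worths, window_size=1):
    # Cumulative percentage change in net worth over the last
    # `window_size` steps (clipped when history is shorter).
    start = net_worths[-min(len(net_worths), window_size + 1)]
    return net_worths[-1] / start - 1.0

history = [100.0, 105.0, 110.0]
print(simple_profit(history, window_size=1))  # ~0.0476 (110/105 - 1)
print(simple_profit(history, window_size=2))  # ~0.10   (110/100 - 1)
```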
class tensortrade.env.default.rewards.TensorTradeRewardScheme[source]

Bases: tensortrade.env.generic.components.reward_scheme.RewardScheme

An abstract base class for reward schemes for the default environment.

get_reward(portfolio) → float[source]

Gets the reward associated with the current step of the episode.

Parameters: portfolio (Portfolio) – The portfolio associated with the TensorTradeActionScheme.
Returns: float – The reward for the current step of the episode.
reward(env: tensortrade.env.generic.environment.TradingEnv) → float[source]

Computes the reward for the current step of an episode.

Parameters: env (TradingEnv) – The trading environment.
Returns: float – The computed reward.
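The base-class pattern above (reward delegates to get_reward) suggests how a custom scheme is written: subclass the base and implement the reward hook. The sketch below mimics that pattern with a minimal stand-in interface; the class names, the `env.net_worth` attribute, and the NetWorthDelta scheme are all hypothetical, not part of the library.

```python
from abc import ABC, abstractmethod

class RewardScheme(ABC):
    # Minimal stand-in for the generic RewardScheme interface
    # (illustrative only, not the actual tensortrade base class).
    @abstractmethod
    def reward(self, env) -> float:
        ...

class NetWorthDelta(RewardScheme):
    # Hypothetical scheme: reward is the change in net worth
    # since the previous step.
    def __init__(self):
        self._last_net_worth = None

    def reward(self, env) -> float:
        net_worth = env.net_worth  # assumed attribute on the env stand-in
        delta = 0.0 if self._last_net_worth is None else net_worth - self._last_net_worth
        self._last_net_worth = net_worth
        return delta
```

A real subclass would instead derive from TensorTradeRewardScheme and implement get_reward(portfolio), letting the inherited reward(env) extract the portfolio from the environment.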
tensortrade.env.default.rewards.get(identifier: str) → tensortrade.env.default.rewards.TensorTradeRewardScheme[source]

Gets the RewardScheme that matches the identifier.

Parameters: identifier (str) – The identifier for the RewardScheme.
Returns: TensorTradeRewardScheme – The reward scheme associated with the identifier.
Raises: KeyError – Raised if identifier is not associated with any RewardScheme.
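The get function follows a common string-to-class registry pattern (each scheme's registered_name, e.g. 'pbr', is its identifier). The sketch below illustrates that pattern with hypothetical names; it is not the library's actual registry code.

```python
_registry = {}

def register(name):
    # Decorator mapping a string identifier to a scheme class,
    # mirroring the registry pattern behind `get` (names assumed).
    def decorator(cls):
        _registry[name] = cls
        return cls
    return decorator

def get(identifier):
    # Look up and instantiate the scheme registered under `identifier`.
    if identifier not in _registry:
        raise KeyError(
            f"Identifier {identifier} is not associated with any RewardScheme."
        )
    return _registry[identifier]()

@register("simple")
class SimpleProfit:
    pass
```

With this pattern, get("simple") returns a fresh SimpleProfit instance, while an unknown identifier raises KeyError as documented above.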