Replies: 3 comments 2 replies
-
Same issue applies to Chapter 5.
-
The reward is intentionally set to be the profit/loss. Why do you think this is an issue (other than not being strictly Markovian)?
-
I think there are two options for designing the reward: your current setting (cumulative profit/loss since the start of the episode), or the per-step change in account value (self.account_value - previous_account_value). A minimal sketch contrasting the two is below.
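To make the two options concrete, here is a minimal sketch, not the book's actual code; the RewardOptions class, initial_balance, and account_value are stand-ins for whatever CryptoTradingEnv tracks internally:

```python
# Minimal sketch of the two reward designs, assuming the env tracks the
# initial balance and the account value from the previous step.
class RewardOptions:
    def __init__(self, initial_balance=10_000.0):
        self.initial_balance = initial_balance
        self.account_value = initial_balance

    def step_reward(self, new_account_value):
        # Option 1 (current setting): cumulative profit/loss since episode start.
        cumulative = new_account_value - self.initial_balance
        # Option 2: change in account value caused by this step alone.
        incremental = new_account_value - self.account_value
        self.account_value = new_account_value
        return cumulative, incremental


if __name__ == "__main__":
    r = RewardOptions()
    for value in (10_100.0, 10_050.0, 10_200.0):
        print(r.step_reward(value))  # (100, 100), (50, -50), (200, 150)
```

With option 1 the agent is rewarded at every step for profit it may have earned many steps ago, while option 2 credits each step only with the value change it caused.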
-
Hi, @praveen-palanisamy
In CryptoTradingEnv, the reward is computed based on ALL action steps, instead of the current (single) action:
Tensorflow-2-Reinforcement-Learning-Cookbook/Chapter04/crypto_trading_env.py, line 92 in 31f8376
Do you think computing the reward as self.account_value - previous_account_value might be more appropriate?
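A rough sketch of how the proposed delta reward could be wired into a Gym-style step(); everything here (DeltaRewardTradingEnv, _execute_trade, the all-in/all-out trade logic) is a hypothetical stand-in rather than the actual crypto_trading_env.py implementation:

```python
# Sketch of a per-step (delta) reward in a Gym-style trading env.
# This is an illustrative toy env, not the book's CryptoTradingEnv.
import numpy as np


class DeltaRewardTradingEnv:
    def __init__(self, prices, initial_balance=10_000.0):
        self.prices = np.asarray(prices, dtype=np.float64)
        self.initial_balance = initial_balance
        self.reset()

    def reset(self):
        self.t = 0
        self.cash = self.initial_balance
        self.units = 0.0
        self.account_value = self.initial_balance
        return self._observation()

    def _observation(self):
        return np.array([self.prices[self.t], self.cash, self.units])

    def _execute_trade(self, action):
        # Hypothetical all-in/all-out trade logic: 0 = hold, 1 = buy, 2 = sell.
        price = self.prices[self.t]
        if action == 1 and self.cash > 0:
            self.units += self.cash / price
            self.cash = 0.0
        elif action == 2 and self.units > 0:
            self.cash += self.units * price
            self.units = 0.0

    def step(self, action):
        previous_account_value = self.account_value
        self._execute_trade(action)
        self.t += 1
        self.account_value = self.cash + self.units * self.prices[self.t]
        # Per-step reward: the change in account value caused by this step.
        reward = self.account_value - previous_account_value
        done = self.t >= len(self.prices) - 1
        return self._observation(), reward, done, {}


if __name__ == "__main__":
    env = DeltaRewardTradingEnv(prices=[100.0, 105.0, 102.0, 110.0])
    env.reset()
    print(env.step(1))  # buy at 100 -> reward reflects the move to 105 (+500)
    print(env.step(0))  # hold       -> reward reflects the move to 102 (-300)
```

One side effect worth noting: with the delta reward, the per-episode return (sum of rewards) telescopes to the total profit/loss, so the episode-level objective stays the same while each step's credit assignment becomes local.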