0.2.0

@fastturtle fastturtle released this 23 Oct 15:59
· 941 commits to master since this release

Highlights

  • Now uses stable releases of TensorFlow (>=2.3.0), Reverb, and TensorFlow Probability.
  • Added Critic Regularized Regression (code, paper)
  • Added Discrete Batch-Constrained Deep Q-learning (code, paper)
  • Added EnvironmentLoop.run_episode() for running a single episode.
  • Updated EnvironmentLoop.run() to take num_steps, allowing control of the step count rather than just the episode count.
  • Added more distribution types (e.g. GaussianMixture) which can be used by policies.
  • Added an environment wrapper for action repeats.
  • Improvements/tuning to datasets exposed by make_dataset.
  • Added support for nested / multidimensional rewards and discounts.
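
To illustrate the difference between running a single episode and bounding a run by step count, here is a minimal, self-contained sketch. It does not use the library itself: ToyEnvironment, ToyLoop, the policy callable, and the returned dictionary keys are all hypothetical stand-ins, and the real EnvironmentLoop may differ in its signatures and return values. One assumption worth calling out is that this sketch checks the step budget only between episodes, so the last episode can overshoot num_steps.

```python
from typing import Callable, Optional


class ToyEnvironment:
    """Hypothetical stand-in environment with fixed-length episodes."""

    def __init__(self, episode_length: int = 5):
        self._episode_length = episode_length
        self._t = 0

    def reset(self) -> int:
        self._t = 0
        return self._t  # The observation is just the current timestep.

    def step(self, action: int):
        self._t += 1
        reward = 1.0
        done = self._t >= self._episode_length
        return self._t, reward, done


class ToyLoop:
    """Hypothetical loop mimicking run_episode() and run(num_episodes, num_steps)."""

    def __init__(self, environment: ToyEnvironment, policy: Callable[[int], int]):
        self._environment = environment
        self._policy = policy

    def run_episode(self) -> dict:
        """Runs exactly one episode and returns simple statistics."""
        observation = self._environment.reset()
        episode_return, episode_length, done = 0.0, 0, False
        while not done:
            action = self._policy(observation)
            observation, reward, done = self._environment.step(action)
            episode_return += reward
            episode_length += 1
        return {"episode_length": episode_length, "episode_return": episode_return}

    def run(self, num_episodes: Optional[int] = None,
            num_steps: Optional[int] = None) -> dict:
        """Runs whole episodes until the episode or step budget is reached.

        In this sketch, passing neither argument would loop forever.
        """
        if num_episodes is not None and num_steps is not None:
            raise ValueError("Specify num_episodes or num_steps, not both.")

        episodes, steps = 0, 0

        def finished() -> bool:
            return ((num_episodes is not None and episodes >= num_episodes) or
                    (num_steps is not None and steps >= num_steps))

        # The budget is checked between episodes, so steps may overshoot.
        while not finished():
            result = self.run_episode()
            episodes += 1
            steps += result["episode_length"]
        return {"episodes": episodes, "steps": steps}
```

With 5-step episodes, ToyLoop(ToyEnvironment(5), lambda obs: 0).run(num_steps=12) completes three full episodes (15 steps) in this sketch, because an episode in progress is never cut short.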

Minor changes and fixes

  • ConstantInfo logger for logging constant information.
  • Added a should_update parameter to the EnvironmentLoop.
  • Various modifications and optimizations to the make_reverb_dataset() function.
  • Improvements to typing and pytype usage.
  • Other minor bug and documentation fixes.