maro-0.3.2a1 🚀

@Jinyu-W Jinyu-W released this 30 Mar 02:01
· 5 commits to master since this release
ef2a358
  • Refine RL workflow
    • Add **kwargs to support more problem settings (e.g., graph-based ones) (#589)
      • add **kwargs to RL models' forward funcs and _shape_check()
      • add **kwargs to RL policies' get_action related funcs and _post_check()
      • add **kwargs to choose_actions of AbsEnvSampler; it remains None in the current sample() and eval()
    • Add the detached loss to the return values of update_critic() and update_actor() of the current TrainOps; add an early_stop parameter (default False) to update_actor() of the current TrainOps (#589)
    • Refine random seed setting logic in RL workflow (#584)
    • Refine rollout workflow (#577) to support:
      • Run a specific number of steps in rollout
      • Run a specific number of episodes during evaluation with num_eval_episodes
      • Flexible metrics management during rollout with AbsEnvSampler.metrics
    • Add AbsEnvSampler.metrics to support flexible metrics management during rollout (#577)
    • Add Callback as a general interface to support customized operations in each phase of the workflow.
      • Two instances, Checkpoint and MetricsRecorder, are added. (#577)
      • Add customized_callbacks to RLComponentBundle. (#589)
    • Re-organize RL job's output paths. (#577)
    • Fix several RL algorithm bugs. (#577, #589)
  • Replace NumPy data types with built-in Python data types across the whole project (#571)
  • Add an RL benchmark on MuJoCo as a module under tests/ and compare it with the Spinning Up benchmark; performance results can be found in tests/rl/performance.md (#575, #577, #583, #584)
  • Other minor code refinements
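
The **kwargs change (#589) can be pictured with a small, self-contained sketch. This is not MARO code: ToyModel, ToyPolicy, get_actions, the free function choose_actions, and the edge_weights argument are all hypothetical names made up for illustration. It only shows the general pattern of passing optional, problem-specific inputs (such as graph information) from the sampler level down to the model's forward function without changing any signatures along the way.

```python
from typing import Any, List


class ToyModel:
    """Hypothetical stand-in for an RL model (not a MARO class)."""

    def forward(self, states: List[float], **kwargs: Any) -> List[float]:
        # Optional, problem-specific input; "edge_weights" is an assumed name.
        weights = kwargs.get("edge_weights")
        if weights is None:
            # Default path: no extra information was supplied.
            return [2.0 * s for s in states]
        # Graph-style path: scale each output by its associated weight.
        return [2.0 * s * w for s, w in zip(states, weights)]


class ToyPolicy:
    """Hypothetical stand-in for a policy that wraps a model."""

    def __init__(self, model: ToyModel) -> None:
        self._model = model

    def get_actions(self, states: List[float], **kwargs: Any) -> List[float]:
        # The policy does not inspect kwargs; it only forwards them.
        return self._model.forward(states, **kwargs)


def choose_actions(policy: ToyPolicy, states: List[float], **kwargs: Any) -> List[float]:
    # Sampler-level entry point (hypothetical): extra inputs pass straight through.
    return policy.get_actions(states, **kwargs)


policy = ToyPolicy(ToyModel())
plain = choose_actions(policy, [1.0, 2.0])
weighted = choose_actions(policy, [1.0, 2.0], edge_weights=[0.5, 1.5])
```

Callers that pass no extra arguments keep the old behavior, which is why this kind of change is usually backward compatible.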