TODO.md

Project TODOs

  1. [√] Implement and test distributed agent (agent_distributed.py)

  2. Implement and test diffusion stochastic MuZero

    • Add diffusion model components (Rectified Flow)
      • Decide between JAX and TensorFlow after reading mctx
    • Test Stochastic MuZero (refer to mctx.stochastic_muzero_policy)
    • Implement sampled MuZero mechanism
  3. Implement and test learning MCTS as policy improvement

  4. Environments:

    • [√] OpenSpiel game Go
    • Atari 100k
    • dm_control
    • safety (review)
  5. Experiment with different search policies

    • on OpenSpiel game Go
    • on stochastic multi-armed bandits
    • implement ltr with processing similar to AlphaZero's
    • find the equivalent equation for ltr
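
For the stochastic multi-armed bandit experiments in item 5, a small testbed with a pluggable policy may be a useful starting point. The sketch below is only an illustration under stated assumptions: `BernoulliBandit`, `ucb1_policy`, and `run` are hypothetical names that do not come from this project, and UCB1 stands in as a baseline policy rather than any of the search policies listed above.

```python
import math
import random


class BernoulliBandit:
    """Hypothetical testbed: arm i pays 1.0 with probability probs[i], else 0.0."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)
        self.rng = random.Random(seed)  # fixed seed keeps runs reproducible

    def pull(self, arm):
        return 1.0 if self.rng.random() < self.probs[arm] else 0.0


def ucb1_policy(counts, values, t, c=2.0):
    """Baseline policy (assumption): UCB1. Untried arms are pulled first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Mean reward plus an exploration bonus that shrinks as an arm is pulled more.
    scores = [v + math.sqrt(c * math.log(t) / n) for v, n in zip(values, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run(bandit, policy, steps):
    """Run `policy` for `steps` pulls; return per-arm pull counts and mean rewards."""
    k = len(bandit.probs)
    counts = [0] * k
    values = [0.0] * k
    for t in range(1, steps + 1):
        arm = policy(counts, values, t)
        reward = bandit.pull(arm)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return counts, values


counts, values = run(BernoulliBandit([0.2, 0.5, 0.8]), ucb1_policy, steps=2000)
print(counts)
```

Any other policy with the same `(counts, values, t)` signature can be passed to `run`, which keeps comparisons between search policies on the same bandit instance straightforward.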

Notes

  • Each implementation should include comprehensive testing

Progress Tracking

  • Start Date: [11/6/2024]
  • Target Completion: [12/15/2024]