Releases: google-deepmind/open_spiel
OpenSpiel 1.5
This release adds new games, bug fixes, and process + support changes.
Support and Process changes
- Add Python 3.12 support
- Add Ubuntu 24.04, macos-13, and macos-14 to CI tests
- Remove Python 3.8 support
- Remove use of nox and noxfile.py
- Remove several things that were unused: HIGC, Ludii, eigen
- Remove C++ AlphaZero based on TF (LibTorch C++ implementation still available)
- Remove macos-12 and Ubuntu 20.04 from CI tests
- Remove unfinished implementation of Yacht
Games
- Coalitional games from (Gemp et al., Approximating the Core via Iterative Coalition Sampling)
- Middle-Eastern Dominoes (python)
- Spades
- Team Dominoes (python)
- Twixt
Algorithms
- CFR (JAX implementation)
- Core Lagrangian from (Gemp et al., Approximating the Core via Iterative Coalition Sampling)
- Core via Linear Programming (Yan & Procaccia, If You Like Shapley, then You'll Love the Core)
- EFR from (Morrill et al. Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games)
- max^n (Luckhardt & Irani, An algorithmic solution of N-person games)
- Mean-Field PSRO (by Muller et al. Learning Equilibria in Mean-Field Games: Introducing Mean-Field PSRO)
- MF-PPO (Algumaei et al, Regularization of the policy updates for stabilizing Mean Field Games)
- PIMC search (Long et al. 2010, Understanding the Success of Perfect Information Monte Carlo Sampling in Game Tree Search)
- SGD in adidas_solvers, from (Gemp et al., Approximating Nash Equilibria in Normal-Form Games via Stochastic Optimization)
- Voting-as-Evaluation (VasE) voting methods from (Lanctot et al., Evaluating Agents using Social Choice Theory)
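To illustrate the max^n entry in the list above: at each decision node, the player to move selects the child whose backed-up utility vector is best for that player. The following is a minimal standalone sketch on a hand-built tree. The Leaf and Decision classes and the first-child tie-breaking rule are assumptions made for this example and are not OpenSpiel's API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Leaf:
    utilities: Tuple[float, ...]  # one payoff per player


@dataclass
class Decision:
    player: int  # index of the player to move at this node
    children: List[Union["Decision", Leaf]] = field(default_factory=list)


def maxn(node):
    """Returns the utility vector backed up to `node` by max^n."""
    if isinstance(node, Leaf):
        return node.utilities
    # Evaluate every child, then keep the vector that is best for the mover.
    # Ties go to the first child (an arbitrary choice for this sketch).
    child_values = [maxn(child) for child in node.children]
    return max(child_values, key=lambda u: u[node.player])


# Three-player example: player 0 moves at the root, players 1 and 2 below.
tree = Decision(0, [
    Decision(1, [Leaf((3, 1, 2)), Leaf((0, 4, 1))]),
    Decision(2, [Leaf((2, 2, 0)), Leaf((1, 0, 5))]),
])
root_value = maxn(tree)
print(root_value)  # (1, 0, 5)
```

Player 1 prefers (0, 4, 1), player 2 prefers (1, 0, 5), and player 0 then picks the vector with the larger first component.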
Examples
- Add example of applying VasE to Chatbot Arena
Improvements and other additions
- Several additions to the Chat Game (python)
- Add prisoner's dilemma to the docs
- Add a warning when loading games with known issues
- Add Colab example for how to use OpenSpiel with mean field games
- Add parser for PrefLib data files
- Add missing API methods in Julia API
Fixes
- Fix cards display in Dou Dizhu
- Fix dqn.cc build error
- Fix mean logit calculation in NeuRD loss for R-NaD
- Fix example.py to support games with simultaneous moves
- Fix trade_comm: expose observer's trade proposal if one was made
- Fix numpy incompatibility change for PSRO's joint to marginal probability function
- Fix bridge observation tensor
- Fix chess state serialization
- Several fixes to universal poker (serialization, betting abstraction, observation tensors)
- Fix float tolerance in OOS test
- Normalize GTP responses to lower case
- Fix deprecated use of mask in LibTorch C++ DQN to PyTorch 2.0
- Fix bug in loading of PyTorch DQN checkpoints
- Convert Quoridor movement action IDs to be relative to the state
Several other miscellaneous fixes and improvements.
Acknowledgments
Thanks to Google DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.4
This release adds new games, bug fixes, and process + support changes.
Support and Process changes
- Added (partial) Python 3.12 support: building & testing of wheels only (also distributed on PyPI)
- Added building and availability of Apple Silicon (arm64) binary wheels on PyPI
- Removed support for Python 3.7 due to EOL June 2023
- Upgraded versions of supported extra packages (JAX, PyTorch, TF, etc.)
- Added ml-collections as a required dependency
Games
- Added Dots & Boxes
- Added Chat Game (python)
- Added MFG crowd avoidance game
- Added MFG periodic aversion game
- Modify Predator-Prey MFG to take in initial values
- (Not yet complete) Add partial implementation of Yacht
Algorithms
- Removed PyTorch NFSP (see #1008)
- Remove unnecessary policy reload in outcome sampling MCCFR (see #1115)
- Rewrite Stackelberg equilibrium solver using cvxpy (see #1123)
Examples
- Training TD n-tuple networks on 2048
Improvements and other additions
- Added a build_state_from_history_string helper function for debugging
- GAMUT generator: expand the set of games provided by the wrapper
- Add exclude list in game simulation tests for games that are partially complete
- Refactored all games into individual directories
- Changed 2048 to exclude moves that don't change the board from legal actions
- Introduced number of tricks and changed the order of information in bridge observations (see #1118)
- Added missing functions for C++-wrapped TabularPolicy to pybind11
- Added missing functions for CorrDevBuilder and C(C)E*Dist to pybind11
- Added more examples to help debug game implementations
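The 2048 change in the list above (moves that leave the board unchanged are no longer legal) can be sketched in a few lines. This is an illustration only: the board encoding, the direction names, and the helper functions are hypothetical and do not mirror OpenSpiel's 2048 implementation or its action IDs. Random tile spawning is omitted.

```python
def slide_row_left(row):
    """Slides and merges one row to the left, 2048-style."""
    tiles = [t for t in row if t != 0]
    merged = []
    i = 0
    while i < len(tiles):
        # Merge equal neighbours once; a merged tile cannot merge again.
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))


def apply_move(board, direction):
    """Returns a new board after sliding in `direction`."""
    if direction == "left":
        return [slide_row_left(r) for r in board]
    if direction == "right":
        return [slide_row_left(r[::-1])[::-1] for r in board]
    # Up/down: work on columns, then transpose back.
    cols = [list(c) for c in zip(*board)]
    if direction == "up":
        moved = [slide_row_left(c) for c in cols]
    elif direction == "down":
        moved = [slide_row_left(c[::-1])[::-1] for c in cols]
    else:
        raise ValueError(f"unknown direction: {direction}")
    return [list(r) for r in zip(*moved)]


def legal_moves(board):
    """A move is legal only if it changes the board."""
    directions = ("up", "right", "down", "left")
    return [d for d in directions if apply_move(board, d) != board]


board = [
    [2, 0, 0, 0],
    [2, 0, 0, 0],
    [4, 0, 0, 0],
    [8, 0, 0, 0],
]
print(legal_moves(board))  # ['up', 'right', 'down']
```

Sliding left is excluded here because every tile already sits against the left edge and no row contains a mergeable pair.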
Fixes
- Backgammon: added the dice to the end of the observation vector
- Fixed uses of functions deprecated in NumPy 1.25
- Fixed float comparisons in playthroughs to default to 6 decimal places
- Fixed bug in entropy schedule in R-NaD (see #1076)
- Fixed bug in rho value (see #968)
- Fixed actions in the game definition of Liar's poker (see #1127)
- Fixed castling bug in chess (see #1125)
- Corrected include statements for efg_game (causing C++ DQN to not build)
Several other miscellaneous fixes and improvements.
Acknowledgments
Thanks to Google DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.3
This release adds several games and algorithms, improvements, bug fixes, and documentation updates.
Support and Process changes
- Added Python 3.11 support
- Added Roshambo bot population to wheels
- Removed Python 3.6 support
- Upgraded versions of supported extra packages (OR-Tools, abseil, Jax, TF, Pytorch, etc.)
Games
- Bach or Stravinsky matrix game
- Block Dominoes (python)
- Crazy Eights
- Dou Dizhu
- Liar's poker (python)
- MAEDN (Mensch Ärgere Dich Nicht)
- Nine Men's Morris
Game Transforms
- Add Noisy utility to leaves game transform
- Add Zero-sum game transform
Other environments
- Atari Learning Environment (ALE)
Algorithms
- Boltzmann Policy Iteration (for mean-field games)
- Correlated Q-learning
- Information State MCTS, Cowling et al. '12 (Python)
- LOLA and LOLA-DiCE (Foerster, Chen, Al-Shedivat, et al. '18) and Opponent Shaping (JAX)
- MIP Nash solver (Sandholm, Gilpin, and Conitzer '05)
- Proximal Policy Optimization (PPO); adapted from CleanRL. Supports single-agent use case, tested on ALE.
- Regret-matching (Hart & Mas-Colell '00) for normal-form games and as a PSROv2 meta-solver
- Regularized Nash Dynamics (R-NaD), Perolat & de Vylder et al. '22, Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning
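As a rough illustration of the regret-matching entry in the list above, the sketch below runs regret matching in self-play on a small two-player zero-sum matrix game, using expected payoffs rather than sampled actions. It is a standalone example under stated assumptions: the function names and the payoff matrix are made up for illustration, and it does not reproduce OpenSpiel's implementation or its use as a PSROv2 meta-solver.

```python
def regret_matching_strategy(regrets):
    """Plays each action in proportion to its positive cumulative regret."""
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    n = len(regrets)
    if total <= 0.0:
        return [1.0 / n] * n  # no positive regret yet: play uniformly
    return [p / total for p in positives]


def solve_zero_sum(payoff, iterations=10000):
    """Self-play regret matching; returns average strategies (row, col)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching_strategy(row_regret)
        y = regret_matching_strategy(col_regret)
        # Expected payoff of each pure action against the opponent's mix.
        row_values = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(x[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(y[j] * col_values[j] for j in range(n_cols))
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += x[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += y[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


# Row player's payoffs; the column player receives the negation.
# The unique equilibrium of this game mixes 0.4 / 0.6 for both players.
payoff = [[2.0, -1.0], [-1.0, 1.0]]
row_avg, col_avg = solve_zero_sum(payoff)
print(row_avg, col_avg)
```

The average strategies, not the final iterates, are the quantities that approach the equilibrium.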
Bots
- Simple heuristic Gin Rummy bot
- Roshambo bot population (see python/examples/roshambo_bot_population.py)
Examples
- Opponent shaping on iterated matrix games example
- Roshambo population example
- Using Nash bargaining solution for negotiation example
Improvements and other additions
- Add Bot::Clone() method for cloning bots
- Avoid relying on C++ exceptions for playthrough tests
- Add support for Agent-vs-Task case in Nash averaging
- Add scoring variants to the game Oh Hell
- Add eligibility traces in C++ Q-learning and SARSA
- Allow creation of per-player random policies
- Support simultaneous move games in policy aggregator and exploitability
- Support UCIBot via pybind11
- Add single_tensor observer for all games
- Add used_indices for non-marginal solvers in PSROv2
- Add Flat Dirichlet random policy sampling
- Add several options to bargaining game: probabilistic ending, max turns, discounted utilities
- Add lambda returns support to JAX policy gradient
- Several improvements to Gambit EFG parser / support
- Add support for softmax policies in fictitious play
- Add temperature parameter to fixed point MFG algorithms
- Add information state tensor to battleship
- Add option to tabular BR to return maximum entropy BR
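For readers unfamiliar with the Flat Dirichlet sampling mentioned in the list above: a flat Dirichlet, with all concentration parameters equal to 1, is the uniform distribution over the probability simplex. One standard way to sample from it is to normalize independent Exp(1) draws. The sketch below applies that idea to build a random tabular policy. The function names and the state-to-actions dictionary are hypothetical and are not OpenSpiel's API.

```python
import random


def sample_flat_dirichlet(num_actions, rng):
    """Samples a probability vector uniformly from the simplex."""
    # Normalized i.i.d. Exp(1) draws follow Dirichlet(1, ..., 1).
    draws = [rng.expovariate(1.0) for _ in range(num_actions)]
    total = sum(draws)
    return [d / total for d in draws]


def random_tabular_policy(legal_actions_by_state, seed=0):
    """Draws an independent flat-Dirichlet distribution for each state."""
    rng = random.Random(seed)
    policy = {}
    for state, actions in legal_actions_by_state.items():
        probs = sample_flat_dirichlet(len(actions), rng)
        policy[state] = dict(zip(actions, probs))
    return policy


# Hypothetical information states mapped to their legal action IDs.
policy = random_tabular_policy({"s0": [0, 1, 2], "s1": [1, 3]})
for state, dist in policy.items():
    print(state, dist)
```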
Fixes
- Fix UCIBot compilation in Windows
- Misc fixes to Nash averaging
- RNaD: fix MLP torso in final layer
- Dark hex observation (max length)
- Fix max game length in abstracted poker games
- Fix legal moves in some ACPC(poker) game cases
- Fix joint policy aggregator
- Fix non-uniform chance outcome sampling in Deep CFR (TF2 & Pytorch)
- Fix randomization bug in alpha_zero_torch
Several other miscellaneous fixes and improvements.
Known issues
There are a few known issues that will be fixed in the coming months.
- Collision with pybind11 and version in C++ LibTorch AlphaZero. See #966.
- PyTorch NFSP convergence issue. See #1008.
Acknowledgments
Thanks to Google DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.2
This release adds several games and algorithms, improvements, bug fixes, and documentation updates.
Support and Process changes
- Upgrade support for newer versions of dependencies
- Add dependency to pybind11_abseil
Games
- 2048
- Checkers
- Dynamic routing game
- Euchre
- Mancala
- Nim
- Phantom Go
Algorithms
- Asymmetric Q-learning
- Magnetic Mirror Descent (MMD)
- NeuRD (PyTorch)
- Policy gradients (JAX)
- Sample-based NeuRD loss (PyTorch)
- Stackelberg solver
- WoLF-PHC
Improvements and other additions
- Blackjack: add observation tensor
- C++ DQN: in-memory target net, saving + loading of model
- Core API reference
- Remove hard-coded inclusion of Hanabi and ACPC in setup.py
Fixes
- Colored Trails: fix max utility
- MCTS handling of chance nodes: properly handle them not just at the root
- Nash averaging optimization fix
- Othello: fix the max game length
- Policy aggregator: shallow copy -> deep copy
- pybind11: change game references to shared pointers
Several other miscellaneous fixes and improvements.
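The policy aggregator fix above concerns copy semantics. The difference between a shallow (surface) copy and a deep copy is shown below with Python's standard copy module. The nested dictionary is a made-up stand-in for a tabular policy and is not the aggregator's actual data structure.

```python
import copy

# Hypothetical tabular policy: state -> {action: probability}.
policy = {"state_a": {0: 0.5, 1: 0.5}}

shallow = copy.copy(policy)    # new outer dict, shared inner dicts
deep = copy.deepcopy(policy)   # inner dicts are duplicated as well

# Writing through the shallow copy silently changes the original.
shallow["state_a"][0] = 1.0
original_after_shallow_write = policy["state_a"][0]

# Writing through the deep copy leaves the original untouched.
deep["state_a"][0] = 0.0
original_after_deep_write = policy["state_a"][0]

print(original_after_shallow_write, original_after_deep_write)  # 1.0 1.0
```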
Acknowledgments
Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.1.1
This release adds several algorithms and games, and several process changes.
Support, APIs, and Process changes
- Removed support for Python 3.6
- Added support for Python 3.10
- Upgrade support for newer versions of dependencies
- Rust API: add support for loading bots
- CI tests: added MacOS-11, MacOS-12, and Ubuntu 22.04. Removed CI tests for Ubuntu 18.04.
Games
- Colored Trails
- Dynamic Routing Game: added Sioux Falls network
- Mancala (Kalah)
- Multi-issue Bargaining
- Repeated game transform: add info state strings & tensors, utility sum, finite recall
- Sheriff: add info state tensor
Algorithms
- Boltzmann DQN
- Boltzmann Q-learning
- Correlated Q-learning (Greenwald & Hall)
- Deep average network for FP (mean-field games)
- Nash averaging (Balduzzi et al.)
- Nash Q-learning (Hu & Wellman)
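The Boltzmann variants in the list above replace epsilon-greedy exploration with a softmax over Q-values, controlled by a temperature. The sketch below shows that action distribution together with a plain tabular Q-learning update. It is a minimal illustration: the function names, parameters, and table layout are assumptions and do not correspond to OpenSpiel's classes.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Softmax over Q-values; lower temperature means greedier play."""
    top = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, temperature, rng):
    """Samples an action index from the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_learning_step(q_table, state, action, reward, next_state, done,
                    step_size=0.1, discount=0.99):
    """Standard one-step Q-learning update, applied in place."""
    best_next = 0.0 if done else max(q_table[next_state])
    td_target = reward + discount * best_next
    q_table[state][action] += step_size * (td_target - q_table[state][action])


rng = random.Random(0)
q_table = {"s": [0.0, 0.0], "t": [1.0, 3.0]}
action = sample_action(q_table["s"], temperature=1.0, rng=rng)
q_learning_step(q_table, "s", action, reward=1.0, next_state="t", done=False)
print(action, q_table["s"])
```

With equal Q-values the distribution is uniform, and as the temperature approaches zero it concentrates on the greedy action.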
Improvements and other additions
- Example: support mean-field games
- File wrapper: expose to Python and add WriteContents
- Nash bargaining score example
Fixes
- VR-MCCFR with nonzero baselines
- PyTorch policy gradient clipping
- Promote pawn to queen in RBC
- PyTorch and LibTorch DQN: fix for illegal moves
Many other fixes to docs and code quality.
Acknowledgments
Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.1.0
This release adds some major functionality: new games, new algorithms, several fixes and new features.
Support and APIs
- Windows: native build via Microsoft Visual Studio (experimental)
- Rust API
Games
- Amazons
- Morpion Solitaire
- Gridworld pathfinding (single-agent and multiagent)
- Linear-Quadratic games (mean-field game)
- Pig: Piglet variant added
- Quoridor: 3-player and 4-player support added
- Ultimate Tic-Tac-Toe
Algorithms
- AlphaZero support for games with chance nodes (Python and C++)
- ADIDAS approximate Nash equilibrium solver by Gemp et al. '21
- Boltzmann DQN
- Deep Online Mirror Descent (for mean-field games)
- Expectiminimax (C++)
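Expectiminimax, listed above, extends minimax to games with chance nodes by taking a probability-weighted average at those nodes. The release adds a C++ implementation. The Python sketch below only illustrates the recursion on a hand-built tree, with no depth limit or evaluation function. The node classes are hypothetical and unrelated to the OpenSpiel API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Leaf:
    value: float  # payoff to the maximizing player


@dataclass
class Max:
    children: List[object]


@dataclass
class Min:
    children: List[object]


@dataclass
class Chance:
    outcomes: List[Tuple[float, object]]  # (probability, child) pairs


def expectiminimax(node):
    """Returns the value of `node` for the maximizing player."""
    if isinstance(node, Leaf):
        return node.value
    if isinstance(node, Max):
        return max(expectiminimax(c) for c in node.children)
    if isinstance(node, Min):
        return min(expectiminimax(c) for c in node.children)
    if isinstance(node, Chance):
        # Expected value over the chance outcomes.
        return sum(p * expectiminimax(c) for p, c in node.outcomes)
    raise TypeError(f"unknown node type: {type(node).__name__}")


tree = Max([
    Chance([(0.5, Leaf(4)), (0.5, Min([Leaf(0), Leaf(10)]))]),
    Leaf(1.5),
])
value = expectiminimax(tree)
print(value)  # 2.0
```

The chance node is worth 0.5 * 4 + 0.5 * 0 = 2.0, which beats the sure payoff of 1.5.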
Mean-field Games
- Deep Online Mirror Descent
- Best response value function (instead of only exact)
- Allow specifying learning rate in fictitious play
- Routing game experiment data
- Softmax policy
Bots
- WBridge5 external bot
- Roshambo bots: expose to Python
Fixes
- Chess SAN notation
- get_all_states: support added for games with loops
- Hex and DarkHex bug fixes for even-sized boards
- MCTS sampling from the prior when 0-1 visits specified (Python and C++)
- Pig: 2D observation tensor, ActionString, MaxChanceNodesInHistory
- Stones n' Gems serialization fix
Miscellaneous
- Added SpielFatalErrorWithStateInfo debug helper
- Refactored policies computed by RL into a shared JointRLAgentPolicy
- Custom info state resampling function for IS-MCTS
- Hidden Information Games Competition tournament code: make optional dependency
- Upgrade versions of abseil and OR-Tools and versions in python extra deps
- Python dependency on scipy
- Poker chump policies
Acknowledgments
Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.0.2
This is a minor release: mainly for bug fixes, and also some new functionality and updates to core functionality.
New games and modifications
- Dynamic routing game: change to explicit stochastic (Python MFG), or deterministic (Python)
- New garnet game (randomized MDPs) for mean-field games (C++)
New algorithms and other functionality
- Restricted Nash Response (C++), Johanson et al. '08
- Update mean-field game algorithms to use value functions
- Enable Python best response to work for simultaneous-move games
Bug fixes
- Allow observation tensors for turn-based simultaneous move games
- Fixes to HIGC tournament code, add synchronous mode, explicit calls to bots
- Fix game type in built-in observer
- Fix information type for iterated prisoner's dilemma
- Fix to wheels CI testing: always use python3
Misc
- Add missing algorithms and games to algorithms page
- Add patch to our version of absl to compile with newer compilers (Ubuntu 21.10)
- Add python games to API test (now fully supported alongside all C++ games)
- Enable noisy_policy to work for simultaneous move games
- Added Common Loop Utils (CLU) to python extra deps
Acknowledgments
Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.0.1
This is a minor release: mainly for bug fixes, and also some new functionality and updates to core functionality.
New game
- Dynamic routing (python game and its mean-field limit game)
New functionality
- Allow TabularBestResponseMDP to be computed for a specific player
- Add Hidden Information Game Competition (HIGC) tournament code
- Add expected game score for simultaneous move games
Bug fixes
- Fix to blackjack to use standard policy for dealer
- Several fixes to Reconnaissance Blind Chess (see #695 #696 and #697)
- Update dependency to newer version of Hanabi
- Fix imperfect recall state string in Phantom Tic-Tac-Toe and Dark Hex
- Fix noisy policy (see 2703b20)
- Fix UndoAction for a number of games, add test for it (also remove UndoAction from some games)
Acknowledgments
Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 1.0.0
This is our first major stable release and fully-supported entry into pip/PyPI (binary distribution wheels and build from source).
New functionality since v0.3.1
Games
- Dark Chess
- Dark Hex
- Imperfect recall variants of:
- Dark Hex
- Liar's Dice
- Phantom Tic-Tac-Toe
- Kriegspiel
- Mean-field games:
- Crowd modelling (C++ and Python)
- Crowd modelling 2D (C++)
- Predator prey (Python)
- Python games:
- Iterated prisoner's dilemma
- Tic-Tac-Toe
- Reconnaissance Blind Chess
Algorithms:
- Deep CFR (JAX)
- DQN (C++, via Libtorch)
- DQN (JAX)
- Fixed Strategy Iteration CFR (FSICFR) (Neller & Hnath '11)
- Joint Policy-Space Response Oracles, JPSRO (Marris et al. '21)
- Mean-field game algorithms:
- best response / NashConv
- Fictitious Play
- Mirror Descent
- NFSP (JAX)
- Tabular best response MDP (C++): alternative best response, including proper support for perfect info games and imperfect recall
Bots:
- UCI (chess-playing) Bot
- Gin Rummy: Simple Bot
API
- golang
- Mean-field games
Examples
- DQNBR: computing an approximate best response using DQN
- FSICFR in Liar's Dice
- JPSRO usage example
- MCCFR on imperfect recall games
- Mean-field games: JAX DQN
Support and Process changes:
- Building and testing of pip binary distribution wheels via cibuildwheel (nox tests removed)
- Python dependencies: make most dependencies optional, depend only on those truly required
- pybind11: use smart_holder (and depend on smart_holder branch)
- Support g++ again (used for building bdist wheels)
- Support Python 3.9
Misc
- AlphaZero (C++ Libtorch) support for checkpointing
- Connect Four: add ResampleFromInformationState
- Gin Rummy: observer and parameterizing the game size
- Game-specific functions: chess, backgammon
- Poker: add half-pot abstraction, add total money, support subgames
- Utilities: bit permutation function
Fixes and Documentation
We added two video tutorials (by Marc & Ed) linked from the main site. We also added a link to the main page about building and using OpenSpiel as a C++ library.
- AlphaZero Libtorch: construct loss from policy logits
- AlphaZero (TF-based): document status externally (unsupported)
- Alpha-rank visualization: minor fix to deprecated matplotlib function
- Argslib: fix multiple command-line arguments
- Build: fix to enable optimizations by default
- Docker build: fix to commands and documentation improvements
- CFR tabular average policy computation more efficient
- Refactor and cleanup of Python MCCFR
- OshiZumo: fix side swap
- PolicyBot: use keys instead of legal actions
- Poker: add card set and other tests
- Python games: many improvements and better overall support
- PyTorch Deep CFR: fix MLP initialization issues and policy_net sizes parameter
- Scripts: add caching to install and build scripts
- Tabular Sarsa and Q-Learning (C++)
- Tests: refactoring, add new tests that allow disabling of legal masks check and game-specific state-checker hook
Thanks
Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
OpenSpiel 0.3.1
This addresses the problem that the new Python games were not compatible with the old version of OpenSpiel used by our pip package since it is too far behind (i.e. a fix to #503).
This version has no differences from 0.3.0; it exists only to match the version required for replacing the package hosted on PyPI.
New Functionality (from 0.2.0)
Games
- Add Dark Hex
- Add Kuhn (new Python game)
- Tic-Tac-Toe (Python game, updated to new API)
- Liar's Dice: new bidding variant, and configurable number of faces
- Trade Comm: add info state string
- Backgammon: expose action conversion functions to Python
Algorithms
- Deep CFR (PyTorch)
- EVA (PyTorch)
- Policy gradients (PyTorch)
- Tabular Sarsa (C++)
- Tabular Q-Learning (C++)
- Sequence-form linear programming (C++)
- Variance Reduction baselines (VR-MCCFR) in Python
Metrics
- Distance to correlated and coarse-correlated equilibrium (for extensive-form games): CEDist and CCEDist (C++)
Examples
- Poker "fold, call, pot, all-in" (FCPA) abstracted no-limit example (Python)
- Generating multiple equilibria using CFR with random initial regrets and MCCFR
Process
- Move from Travis to Github Actions for continuous integration (CI)
- Update versions: TF, Jax, Julia
Misc.
- Allow registration of observers
- Distinction between perfect recall and imperfect recall example
- Information state trees
- JupyterLab environment (two Dockerfiles)
- Allow Python games from C++, now fully compatible with main C++ API
- Expose repr (Python)
Fixes and Documentation updates
- Fix network construction in Exploitability Descent example
- Fix GameParameters template compiler causing issues on older compilers
- Fix Observation consistency in Trade Comm
- Fix joining processes in Python AlphaZero
- Fix HistoryString as identifier in HistoryTree
- Improved documentation for the new observation API
- Many other smaller fixes
Thanks
Thanks to DeepMind for continued support of development and maintenance of OpenSpiel.
Thanks to all of our contributors:
- Core Team: https://github.com/deepmind/open_spiel/blob/master/docs/authors.md
- All Contributors: https://github.com/deepmind/open_spiel/graphs/contributors
Files
The open_spiel-source-0.3.1.tar.gz bundles the necessary and some optional dependencies along with the core code (pybind11, Hanabi, ACPC, etc.) and should be able to be built directly without any additional downloads.