
[JMLR-2024] PyPop7: A Pure-Python Library for POPulation-based Black-Box Optimization (BBO), especially their *Large-Scale* versions/variants (-> evolutionary algorithms/swarm-based optimizers/pattern search/random search/...). https://jmlr.org/papers/v25/23-0386.html


PyPop7: A Pure-PYthon library of POPulation-based OPtimization in Black-Box cases (JMLR-2024)


PyPop7 is a Pure-PYthon library of POPulation-based OPtimization for single-objective, real-parameter, black-box problems. Its goal is to provide a unified interface and a set of elegant algorithmic implementations (e.g., evolutionary algorithms, swarm-based optimizers, and pattern search) for Black-Box Optimization (BBO), particularly population-based optimizers, in order to facilitate research repeatability, wide benchmarking of BBO, and especially real-world applications.

More specifically, to alleviate the curse of dimensionality, one focus of PyPop7 is to cover state-of-the-art algorithms for Large-Scale Optimization (LSO), though many of their small- and medium-scale versions or variants are also included here (mainly for theoretical, benchmarking, or educational purposes). For a growing list of public use cases of PyPop7, please refer to this online website for details. Although we have chosen the GPL-3.0 license, anyone can use, modify, and improve this open-source library entirely freely for any purpose (no matter open-source or closed-source).

How to Quickly Use

The following three steps are enough to utilize the black-box optimization power of this open-source pure-Python library PyPop7:

  1. Use pip to install pypop7 in a Python 3 virtual environment created via venv or conda:
$ pip install pypop7

For PyPop7, the number 7 was added simply because the name pypop had already been registered by others on PyPI. The butterfly icon of PyPop7 alludes to the book (with butterflies on its cover) by Fisher ("the greatest of Darwin's successors"): The Genetical Theory of Natural Selection, which directly inspired Prof. Holland's proposal of Genetic Algorithms (GA).

  2. Define the objective/cost/loss/error/fitness function to be minimized for the optimization problem at hand (in this library, the term fitness function is used, following the established terminology tradition of evolutionary computation):
import numpy as np  # for numerical computation, also as the computing engine of pypop7

# Rosenbrock, one notorious test function from the optimization community
def rosenbrock(x):
    return 100.0 * np.sum(np.square(x[1:] - np.square(x[:-1]))) + np.sum(np.square(x[:-1] - 1.0))

# to define the fitness function to be *minimized* and also its problem settings
ndim_problem = 1000
problem = {'fitness_function': rosenbrock,
           'ndim_problem': ndim_problem,  # dimension
           'lower_boundary': -5.0 * np.ones((ndim_problem,)),  # lower search boundary
           'upper_boundary': 5.0 * np.ones((ndim_problem,))}  # upper search boundary
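As a quick sanity check (our own illustration, not part of the library), the Rosenbrock function defined above attains its global minimum of 0 at the all-ones vector, which lies inside the chosen search boundaries:

```python
import numpy as np

def rosenbrock(x):  # same test function as defined above
    return 100.0 * np.sum(np.square(x[1:] - np.square(x[:-1]))) + np.sum(np.square(x[:-1] - 1.0))

ndim_problem = 1000
# the global optimum of Rosenbrock lies at the all-ones vector, with fitness 0
assert rosenbrock(np.ones((ndim_problem,))) == 0.0
# any other point, e.g., the origin, gives a strictly larger fitness
assert rosenbrock(np.zeros((ndim_problem,))) > 0.0
```

Such a spot check is cheap insurance against sign or indexing mistakes before launching a long optimization run.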

Without loss of generality, only minimization is considered here, since maximization can easily be converted to minimization simply by negating the fitness function.
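For instance, a function to be maximized can be handed to any minimizer by wrapping it with a sign flip (the helper and toy function below are hypothetical illustrations, not part of PyPop7's API):

```python
import numpy as np

def sphere_reward(x):  # a toy function to be *maximized* (hypothetical example)
    return -np.sum(np.square(x))  # its maximum value 0 is reached at the origin

def negated(f):  # wrap a maximization objective as a minimization one
    return lambda x: -f(x)

fitness_function = negated(sphere_reward)  # now suitable for any minimizer
assert fitness_function(np.zeros(3)) == 0.0  # minimum of the negated function
assert fitness_function(np.ones(3)) == 3.0   # -(-3.0) = 3.0
```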

  3. Run one or more black-box optimizers on the above optimization problem (rosenbrock):
# to choose e.g., LM-MA-ES owing to its low complexity and metric-learning ability for LSO:
#   -> please refer to https://pypop.readthedocs.io/en/latest/es/lmmaes.html for more details
from pypop7.optimizers.es.lmmaes import LMMAES  # Limited-Memory Matrix Adaptation ES
# to define algorithm options (which may differ in details among different optimizers)
options = {'fitness_threshold': 1e-10,  # to terminate when best-so-far fitness is <= it
           'max_runtime': 3600.0,  # to terminate when actual runtime >= 1 hour (3600 seconds)
           'seed_rng': 0,  # seed of random number generation (should be set for repeatability)
           'x': 4.0 * np.ones((ndim_problem,)),  # initial mean of search/mutation distribution
           'sigma': 3.0,  # initial global step-size of distribution (not necessarily optimal)
           'verbose': 500}
lmmaes = LMMAES(problem, options)  # to initialize the black-box optimizer under a unified API
results = lmmaes.optimize()  # to run its (often time-consuming) randomized evolution process
print(results)

Please refer to https://pypop.rtfd.io/ for the online documentation of this well-designed ("self-boasted") Python library for black-box optimization (with several praises from others).

"In our opinion, the main fact, which should be known to any person dealing with optimization models, is that in general, optimization problems are unsolvable. This statement, which is usually missing in standard optimization courses, is very important for understanding optimization theory and the logic of its development in the past and in the future."

A (Still Growing) Number of Black-Box Optimizers (BBO)

"Optimization algorithms are often designed for a specific type of search space, exploiting its specific structure." ... "The main lesson of the development of our field in the last few decades is that efficient optimization methods can be developed only by intelligently employing the structure of particular instances of problems."


  • lso: indicates the specific BBO version for LSO (e.g., dimension > 100; not an absolute threshold, since it depends on the problem and the available time budget),
  • c: indicates the competitive or de facto BBO version for small- or medium-dimensional problems (though it may also work well under certain LSO circumstances),
  • b: indicates the baseline BBO version, mainly of theoretical and/or educational interest owing to its algorithmic simplicity (usually relatively easy to analyze mathematically).

Note that the above classification, based only on the dimensionality of the objective function, is just a rough guideline for algorithm selection. In practice, perhaps the simplest approach to algorithm selection is trial-and-error. Otherwise, you can try more advanced Automated Algorithm Selection techniques.
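To make the "b" (baseline) category concrete, the sketch below is a minimal pure uniform random search written in plain NumPy. It is our own toy illustration (NOT PyPop7's implementation or API), though it reuses the same problem-dict convention shown in the quick-start above:

```python
import numpy as np

def random_search(problem, max_evaluations=2000, seed=0):
    """A minimal baseline ('b'-type) black-box optimizer: uniform random search.

    Illustrative sketch only; `problem` follows the dict convention used
    earlier in this README (fitness_function, lower/upper_boundary).
    """
    rng = np.random.default_rng(seed)
    f = problem['fitness_function']
    low, high = problem['lower_boundary'], problem['upper_boundary']
    best_x, best_y = None, np.inf
    for _ in range(max_evaluations):
        x = rng.uniform(low, high)  # sample uniformly inside the search box
        y = f(x)
        if y < best_y:  # keep only the best-so-far solution
            best_x, best_y = x, y
    return best_x, best_y

problem = {'fitness_function': lambda x: np.sum(np.square(x)),  # sphere function
           'ndim_problem': 10,
           'lower_boundary': -5.0 * np.ones((10,)),
           'upper_boundary': 5.0 * np.ones((10,))}
best_x, best_y = random_search(problem)
# the best-so-far fitness should at least beat a corner of the search box
assert best_y < problem['fitness_function'](problem['upper_boundary'])
```

Baselines like this are easy to analyze but scale poorly with dimension, which is exactly why the "lso" and "c" entries above exist.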


Clearly, this is an algorithm-centric rather than benchmarking-centric Python library devoted only to black-box optimization, though proper benchmarking is also crucial for BBO.

For new or missing BBO, we have provided a unified API to freely add them, provided they satisfy the design philosophy widely recognized in the scientific research community. Note that currently neither Ant Colony Optimization (ACO) nor Tabu Search (TS) is covered in this library, since they work well mainly in discrete or combinatorial search spaces. Furthermore, both brute-force (exhaustive) search and grid search are also excluded here, since they work only in very low (typically < 10) dimensions. In a future version, we will consider adding others (e.g., Simultaneous Perturbation Stochastic Approximation (SPSA)) to this open-source library. Please refer to the development guide for more details.

Computational Efficiency

For large-scale optimization (LSO), computational efficiency is an indispensable performance criterion of BBO/DFO/ZOO in the post-Moore era. To obtain as much computational performance as possible, NumPy is heavily used in this library as the basis of numerical computation, along with SciPy and scikit-learn. Sometimes Numba is also utilized to further reduce wall-clock time.
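To illustrate why NumPy vectorization matters for LSO, the sketch below (our own illustration) compares a naive pure-Python loop against the vectorized Rosenbrock used in the quick-start; both compute the same value, but the vectorized form avoids Python-level looping over thousands of dimensions:

```python
import numpy as np

def rosenbrock_loop(x):  # naive pure-Python version (slow for large dimensions)
    total = 0.0
    for i in range(len(x) - 1):
        total += 100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
    return total

def rosenbrock_vec(x):  # vectorized version, as used earlier in this README
    return 100.0 * np.sum(np.square(x[1:] - np.square(x[:-1]))) + np.sum(np.square(x[:-1] - 1.0))

x = np.random.default_rng(0).uniform(-5.0, 5.0, size=(1000,))
# both versions agree up to floating-point rounding
assert np.isclose(rosenbrock_loop(x), rosenbrock_vec(x))
```

Since a population-based optimizer may evaluate the fitness function millions of times, this per-evaluation saving dominates total runtime in high dimensions.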

Folder Structure

The first-level folder structure of this library PyPop7 is presented below:

  • .circleci: for automatic testing based on pytest.
    • config.yml: configuration file in CircleCI.
  • .github: all configuration files for GitHub.
  • docs: for online documentations.
  • pypop7: all Python source code of BBO.
  • tutorials: a set of tutorials.
  • .gitignore: for GitHub.
  • .readthedocs.yaml: for readthedocs.
  • CODE_OF_CONDUCT.md: code of conduct.
  • LICENSE: open-source license.
  • README.md: basic information of this library.
  • pyproject.toml: for PyPI (declares the build system; used together with setup.cfg).
  • requirements.txt: only for development.
  • setup.cfg: for PyPI (used via pyproject.toml).

References

For each population-based algorithm family, we provide several representative applications published in some (though not all) top-tier journals and conferences (such as Nature, Science, PNAS, PRL, JACS, JACM, PIEEE, JMLR, ICML, NeurIPS, ICLR, CVPR, ICCV, and RSS, just to name a few), as reported in the (still actively updated) paper list called DistributedEvolutionaryComputation.

Sponsor

PyPop7 is currently supported by the National Natural Science Foundation of China under Grant No. 72401122, and by the Guangdong Basic and Applied Basic Research Foundation under Grant Nos. 2024A1515012241 and 2021A1515110024. From 2021 to 2023, this open-source pure-Python library was supported by the Shenzhen Fundamental Research Program under Grant No. JCYJ20200109141235597 (a total of 2,000,000 Yuan).

Citation

If this open-source pure-Python library PyPop7 is used in your paper or project, it is highly welcomed but NOT mandatory to cite the following paper: Duan, Q., Zhou, G., Shao, C., Wang, Z., Feng, M., Huang, Y., Tan, Y., Yang, Y., Zhao, Q. and Shi, Y., 2024. PyPop7: A pure-Python library for population-based black-box optimization. arXiv preprint arXiv:2212.05652. (The paper was submitted to JMLR and, after three rounds of review from Tue, 28 Mar 2023 through Wed, 01 Nov 2023 to Fri, 05 Jul 2024, was accepted on Fri, 11 Oct 2024.)

Star History


[PyPop7 star-history chart]