Keras 3: Streamlined Backend #159

Merged
merged 655 commits into from
Sep 17, 2024

LarsKue
Collaborator

@LarsKue LarsKue commented Apr 15, 2024

For Users

This PR contains a complete overhaul of BayesFlow for version 2.0. The primary features introduced with this PR are listed below. However, instead of reading all of this, we really recommend just diving straight in with an example.

Multi-Backend support via Keras 3

  • Remove all TensorFlow dependencies
  • Use your backend of choice: JAX, PyTorch, or TensorFlow
  • Quickly switch between backends by setting the KERAS_BACKEND environment variable before importing BayesFlow:
import os
# "jax", "torch", or "tensorflow"
os.environ["KERAS_BACKEND"] = "torch"

import bayesflow as bf  # uses torch backend

Note that this requires:

  1. You have the respective backend installed
  2. None of the code you inject into BayesFlow is backend-specific
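
Because the variable has to be set before the first import, a small guard can catch the common mistake of setting it too late. The following is only a sketch: `select_backend` and `SUPPORTED_BACKENDS` are hypothetical names and not part of BayesFlow or Keras, and the list of backends is simply the three named above.

```python
import os
import sys

# The three backends named in this PR description.
SUPPORTED_BACKENDS = ("jax", "torch", "tensorflow")


def select_backend(name, loaded=None):
    """Set KERAS_BACKEND, refusing to do so once Keras has been imported.

    `loaded` defaults to sys.modules; it is a parameter only so the check
    can be exercised without touching the real import state.
    """
    loaded = sys.modules if loaded is None else loaded
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    if "keras" in loaded:
        # Keras reads the variable when it is imported, so a later change
        # would silently have no effect.
        raise RuntimeError(
            "Keras is already imported; set the backend before importing it"
        )
    os.environ["KERAS_BACKEND"] = name
    return name
```

Calling `select_backend("jax")` at the very top of a script, before `import bayesflow as bf`, would then either set the variable or fail loudly instead of silently keeping the previous backend.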

New to deep learning? We recommend starting out with PyTorch for debugging and moving to JAX once your code runs to gain maximum performance.

Introduce a Keras-idiomatic training pattern

Keras provides access to many deep learning utilities, such as multi-GPU training, worker-process data loading, multi-batch gradient accumulation, and metric logging. To make these available, we

  • Removed Trainer objects
  • Moved training strategy (online vs. offline) into the data-generating process (bayesflow.datasets)
  • Rewrote AmortizedPosterior and similar models as Keras 3 models
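
To illustrate what moving the training strategy into the data-generating process can mean, here is a minimal pure-Python sketch. The classes `OnlineDataset` and `OfflineDataset` and the toy `simulate` function are hypothetical and do not reproduce the actual `bayesflow.datasets` API; they only show the idea that the dataset, not a trainer, decides whether data is simulated on the fly or read from a fixed pool.

```python
import random


def simulate(rng):
    """Hypothetical toy simulator: draw a mean, then one noisy observation."""
    mu = rng.gauss(0.0, 1.0)
    return {"mu": mu, "x": rng.gauss(mu, 0.5)}


class OnlineDataset:
    """Simulates a fresh batch on every access; nothing is stored."""

    def __init__(self, batch_size, batches_per_epoch, seed=0):
        self.batch_size = batch_size
        self.batches_per_epoch = batches_per_epoch
        self.rng = random.Random(seed)

    def __len__(self):
        return self.batches_per_epoch

    def __getitem__(self, index):
        # The index is ignored: every request triggers new simulations.
        return [simulate(self.rng) for _ in range(self.batch_size)]


class OfflineDataset:
    """Serves batches from a fixed, pre-simulated list of samples."""

    def __init__(self, samples, batch_size):
        self.samples = samples
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, counting a final partial batch.
        return -(-len(self.samples) // self.batch_size)

    def __getitem__(self, index):
        start = index * self.batch_size
        return self.samples[start : start + self.batch_size]
```

Both classes expose the same `__len__`/`__getitem__` shape, so the training loop that consumes them does not need to know which strategy is in use.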

Introduce a user-friendly, named-parameter data flow

Many-parameter inference can get confusing, particularly when users have to deal with BayesFlow-internal keys, like prior_draws, sim_data, or most infamously, batchable_context and non_batchable_context. To make this easier, we

  • Removed Configurators entirely
  • Added an interface to easily define simulators
  • Used data adapters so that data handled on the user side is always in dictionary form
    • {parameter_name: parameter_value}
    • Users choose the names of these parameters for their respective application
    • Power users may implement a custom data adapter to use any data structure they like
  • Samples drawn from the posterior distributions are now automatically split back into dictionary form
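
The dictionary round trip described above can be sketched in a few lines. `ToyAdapter` is a hypothetical class written for illustration, not BayesFlow's data adapter; it assumes each named parameter is a flat list of numbers and uses made-up parameter names.

```python
class ToyAdapter:
    """Hypothetical adapter: flattens named parameters into one vector and back."""

    def __init__(self, names):
        self.names = list(names)  # fixed order used for concatenation
        self.sizes = {}

    def forward(self, params):
        """Concatenate the named parameters in a fixed order."""
        flat = []
        for name in self.names:
            values = list(params[name])
            self.sizes[name] = len(values)  # remember the width for the inverse
            flat.extend(values)
        return flat

    def inverse(self, flat):
        """Split a flat vector back into {parameter_name: parameter_value}."""
        out, start = {}, 0
        for name in self.names:
            stop = start + self.sizes[name]
            out[name] = flat[start:stop]
            start = stop
        return out


adapter = ToyAdapter(["drift", "boundary"])
params = {"drift": [0.3, -0.1], "boundary": [1.2]}
flat = adapter.forward(params)      # [0.3, -0.1, 1.2]
restored = adapter.inverse(flat)    # same dictionary as `params`
```

The network only ever sees the flat vector, while the user only ever sees names they chose themselves.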

Examples

Check out the example notebooks. We will add more example notebooks as time goes on.

For Devs

There are many internal changes in this PR aimed at facilitating a fast and structured development process:

  • Modularize tests with pytest, tox, and GitHub workflows
  • Improve general structure of library with sub-packages
  • Add type hints throughout the library
  • Add pre-commit hooks for linting and formatting
  • Improve dev environment setup by providing an environment.yaml
  • Refactor or rewrite messy parts of the library
  • Introduce a host of optimizations alongside Keras 3

@LarsKue added the "refactoring" (Some code shall be redesigned) and "unit tests" (A new set of tests needs to be added.) labels Apr 15, 2024
@LarsKue self-assigned this Apr 15, 2024
@paul-buerkner
Collaborator

paul-buerkner commented Apr 15, 2024

@LarsKue Thank you so much! This already looks amazing!

Could you perhaps add a simple fully runnable example here for people to get started playing around with it? It is kind of there above, but I think it would make things easier to have one chunk of example code to copy and edit from there.

Everyone, please try out the new interface and tell us what you think!

@LarsKue
Collaborator Author

LarsKue commented Apr 15, 2024

@paul-buerkner Yes, I am working on it. I hope I have one ready today.


@marvinschmitt
Collaborator

Great work!! 👏

I'll write down some thoughts on the installation process. Those don't need any changes in the streamlined codebase but are just reminders for our future selves shortly before the release.

  • In addition to keras (which will replace the current tensorflow dependencies), the user has to install their favorite backend. How do we approach this for users who aren't proficient in Python-ecosystem stuff? A few initial thoughts:
    • Extras like pip install bayesflow[torch]. Advantage: Easy interface. Disadvantage: Restricted to pip, no mamba equivalent. I don't like that option.
    • Message during the bayesflow installation. That's annoying (if possible at all?) and I wouldn't do that either.
    • After installation, upon running bayesflow: Catch any errors that relate to missing backend packages and provide a comprehensive error message with concrete pointers on how to install the necessary backend to fix the issue. Currently my favorite option.
  • Python >=3.11 is required for typing.Self
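
The third option above (catching a missing backend at run time and pointing to a fix) could look roughly like the sketch below. `check_backend` and the `BACKENDS` table are hypothetical and not part of BayesFlow; the install commands are generic illustrations and may differ for GPU builds or conda/mamba environments.

```python
import importlib.util

# Import name and an illustrative install command for each backend.
BACKENDS = {
    "jax": ("jax", "pip install jax"),
    "torch": ("torch", "pip install torch"),
    "tensorflow": ("tensorflow", "pip install tensorflow"),
}


def check_backend(name, find_spec=importlib.util.find_spec):
    """Raise a descriptive ImportError if the chosen backend is not installed.

    `find_spec` is injectable only so the check can be tested without
    installing or uninstalling real packages.
    """
    if name not in BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {sorted(BACKENDS)}")
    module, command = BACKENDS[name]
    if find_spec(module) is None:
        raise ImportError(
            f"The Keras backend {name!r} is selected, but the package {module!r} "
            f"is not installed. Install it (for example with `{command}`) or "
            f"choose another backend via the KERAS_BACKEND environment variable."
        )
```

Running such a check once at import time would turn a deep backend-specific traceback into a single message that tells the user what to install.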


Chase-Grajeda and others added 21 commits July 2, 2024 14:49
Cleaned lotka_volterra.py
Added inverse_kinematics.py.
Updated docs in benchmark.py.
Cleaned lotka_volterra.py.
Cleaned sir.py.
Added remaining benchmarks:
bernoulli_glm.py
bernoulli_glm_raw.py
gaussian_linear.py
gaussian_linear_uniform.py
gaussian_mixture.py
slcp.py
slcp_distractors.py
ruff formatting changes
ruff E741: fixed ambiguous variables names
Added test_sequential_simulators to test_simulators/test_simulators.py.
Removed extra whitespace in seqential_simulator.py

Co-authored-by: Lars <lars@kuehmichel.de>
stefanradev93 and others added 27 commits August 30, 2024 21:58
I did some rough manual tests, but we still need automated tests for these
…kend

# Conflicts:
#	bayesflow/amortizers.py
#	bayesflow/configuration.py
#	bayesflow/helper_networks.py
#	bayesflow/losses.py
#	bayesflow/simulation.py
#	examples/TwoMoons_Bimodal_Posterior.ipynb
#	pyproject.toml
#	requirements_dev.txt
@stefanradev93 merged commit 5fc0085 into dev Sep 17, 2024
26 checks passed
@LarsKue
Collaborator Author

LarsKue commented Sep 17, 2024

🚀

@marvinschmitt deleted the streamlined-backend branch September 23, 2024 07:07
6 participants