Doe categorical second attempt #259

Merged Aug 16, 2023 (125 commits)
Changes from 108 commits

Commits
61d2557
setting up bofire
May 24, 2023
9bc67b3
simple relaxed categorical example in jupyter notebook
ufukguenes May 31, 2023
e0b15b3
merging main
ufukguenes May 31, 2023
63f41bb
adding exhaustive search for optimization problems with binary variab…
ufukguenes Jun 15, 2023
27c8ba5
showcasing how to use binary variables
ufukguenes Jun 15, 2023
9af4768
changes in jupyter notebooks
ufukguenes Jun 20, 2023
bbb76c8
merging main
ufukguenes Jun 20, 2023
536ef0e
adding constraint mapper
ufukguenes Jun 28, 2023
6c22b3d
fixing bug with dtype
ufukguenes Jul 3, 2023
86fd245
stated change to one-hot-encoding
ufukguenes Jul 4, 2023
668892e
allowing for multiple groups
ufukguenes Jul 5, 2023
e1a04a1
starting with branch-and-bound
ufukguenes Jul 7, 2023
588900b
rough sketch of bnb
ufukguenes Jul 7, 2023
76c0b90
asserting that design variables is actually a 1d array
ufukguenes Jul 10, 2023
067cc2d
asserting that design variables is actually a 1d array
ufukguenes Jul 10, 2023
3275b96
using the right argument for optimality criteria
ufukguenes Jul 10, 2023
7a7b7d2
finishing first bab implementation
ufukguenes Jul 10, 2023
9313aa1
changing __str__
ufukguenes Jul 10, 2023
a272191
resetting test for valid solution
ufukguenes Jul 10, 2023
ac33853
printing information of branching
ufukguenes Jul 11, 2023
7c5c952
allowing for relaxed discrete variables, and fixing bug with sequence…
ufukguenes Jul 11, 2023
831858e
fixing docstring
ufukguenes Jul 11, 2023
3d14cff
skipping non-valid-designs and branching for discrete values
ufukguenes Jul 12, 2023
09291e1
catching case for list length 1
ufukguenes Jul 12, 2023
f293597
cringe... renaming function
ufukguenes Jul 13, 2023
17499ee
removing redundant code
ufukguenes Jul 13, 2023
fe8d243
bug fix/ removing code with no effect
ufukguenes Jul 13, 2023
9c81814
renaming variable to make intention clear
ufukguenes Jul 13, 2023
cbe6279
start including new doe strategies in api
ufukguenes Jul 18, 2023
26f7741
adding categorical groups to the domain
ufukguenes Jul 18, 2023
d668b52
changing to categorical groups from domain
ufukguenes Jul 18, 2023
d03b51c
correcting docstring
ufukguenes Jul 18, 2023
d9d4da8
start writing tests for categorical and discrete variables
ufukguenes Jul 18, 2023
c3d2782
catching branches in exhaustive search which do not fulfill constraints
ufukguenes Jul 18, 2023
74961dc
simple tests for bab and exhaustive search
ufukguenes Jul 19, 2023
ea300ac
adding optional print statements for information about the optimizati…
ufukguenes Jul 19, 2023
a066fd0
adding optional print statements for information about the optimizati…
ufukguenes Jul 19, 2023
2656f4d
bug fix, allowing arbitrary objective function and model_type
ufukguenes Jul 19, 2023
bca60fe
bug fix, fixing and partially fixing experiments can now be combined
ufukguenes Jul 19, 2023
9306304
skipping branches with already fixed experiments
ufukguenes Jul 19, 2023
44ff17a
skipping branches with already fixed experiments in exhaustive search
ufukguenes Jul 19, 2023
1240e91
removing unused code from binary vars and giving warning when unsuita…
ufukguenes Jul 19, 2023
6380131
adding documentation
ufukguenes Jul 19, 2023
61a7c48
adding documentation
ufukguenes Jul 19, 2023
1bdf2bb
adding documentation
ufukguenes Jul 19, 2023
aa13771
adding documentation
ufukguenes Jul 19, 2023
a3f808f
bug fix, now also testing if solution for discrete variables are also…
ufukguenes Jul 19, 2023
b810036
bug fix, now also testing if solution for discrete variables are also…
ufukguenes Jul 19, 2023
a8398fd
adding documentation
ufukguenes Jul 19, 2023
ba3197a
Merge branch 'main' into doe_categorical
ufukguenes Jul 19, 2023
942169b
bug fix, where error occurred if fixed_experiments was None
ufukguenes Jul 19, 2023
bc5f280
adjusting tolerances for testing for valid solution
ufukguenes Jul 20, 2023
f748993
adjusting tolerances for testing for valid solution
ufukguenes Jul 24, 2023
3d853d0
raising error when to many (partially)-fixed experiments are provided
ufukguenes Jul 25, 2023
609d44c
bug fix, sorting (partially) fixed experiments and initial guess if p…
ufukguenes Jul 25, 2023
81e8f9f
adapting branch-and-bound to new fixed experiments usage
ufukguenes Jul 25, 2023
2da0a43
bug fix, allowing to fix candidates with .tell
ufukguenes Jul 26, 2023
bf5b3ae
allowing to partially fix experiments with .tell and with all strategies
ufukguenes Jul 27, 2023
9cf64be
allowing to use either equality or inequality and changing rhs
ufukguenes Jul 28, 2023
564a472
reverting, we can only do exactly 1,
ufukguenes Jul 28, 2023
c90bedb
bug fix, if partially_fixed_experiments are none, and adding time inf…
ufukguenes Jul 29, 2023
f9f1fef
adding error for using discrete var in exhaustive search, bug fix whe…
ufukguenes Jul 30, 2023
66bc550
renaming
ufukguenes Jul 30, 2023
92e3249
adding information about how many branches have been explored
ufukguenes Aug 1, 2023
0f27616
added NChooseKGroup_with_quantity (helper function) and mapping from …
ufukguenes Aug 2, 2023
4b0ea1d
bug fix, NChooseKGroup_with_quantity (helper function) and allowing t…
ufukguenes Aug 2, 2023
067290d
making some arguments optional
ufukguenes Aug 2, 2023
e540695
fixing optional arguments
ufukguenes Aug 2, 2023
0419bd8
adding documentation
ufukguenes Aug 2, 2023
2f51808
Update documentation bofire/data_models/constraints/nonlinear.py
ufukguenes Aug 3, 2023
e7b1744
Merge branch 'main' into doe_categorical
ufukguenes Aug 3, 2023
1e18faf
refactoring RelaxableBinaryInput, RelaxableDiscreteInput, they are no…
ufukguenes Aug 7, 2023
c0e075c
refactoring generate_mixture_constraint and deleting unused functions
ufukguenes Aug 7, 2023
7e82653
allowing to use the old strategy to solve nchoosek constraints
ufukguenes Aug 7, 2023
aaf4186
reversing accidental commit
ufukguenes Aug 8, 2023
8b75c06
refactoring functions
ufukguenes Aug 8, 2023
5a4d9ff
removing check for initial guess, as we can also allow non-valid init…
ufukguenes Aug 8, 2023
947a020
adding initial guess based on design of previous branch
ufukguenes Aug 8, 2023
69b9321
bug fix
ufukguenes Aug 8, 2023
d9368fb
Merge branch 'main' into doe_categorical
ufukguenes Aug 8, 2023
d85990a
deleting old example
ufukguenes Aug 8, 2023
5a18e81
merge main
ufukguenes Aug 8, 2023
bf444ce
Merge branch 'main' into doe_categorical
ufukguenes Aug 8, 2023
7ec9c9d
fixing typing
ufukguenes Aug 8, 2023
755817f
looser tolerances and pruning branches where ipopt does not satisfy c…
ufukguenes Aug 9, 2023
c80f87a
skipping fixations where ipopt does not satisfy constraints
ufukguenes Aug 9, 2023
4fe97d7
reverting commit, where I skip the is_fulfilled test
ufukguenes Aug 9, 2023
f1a019f
typing
ufukguenes Aug 9, 2023
25a90ae
adding not implemented error
ufukguenes Aug 9, 2023
c4e8d9c
typing
ufukguenes Aug 9, 2023
0ffb9ed
fixing tests
ufukguenes Aug 9, 2023
17bd61b
Merge branch 'doe_categorical' of https://github.com/experimental-des…
ufukguenes Aug 9, 2023
ba29c13
Delete .gitattributes
ufukguenes Aug 9, 2023
1ce4b8c
Delete .idea directory
ufukguenes Aug 9, 2023
bedd26c
typing
ufukguenes Aug 9, 2023
b1c3101
removing outdated tests, and fixing existing ones
ufukguenes Aug 10, 2023
b6a43ee
evaluating with d_optimality requires a 1D array
ufukguenes Aug 10, 2023
1e2f01c
adding test for categorical and discrete doe with nchoosek
ufukguenes Aug 10, 2023
c8f714e
Merge branch 'main' into doe_categorical
ufukguenes Aug 10, 2023
4323b5c
typing
ufukguenes Aug 10, 2023
d27b97b
typing
ufukguenes Aug 10, 2023
1c5e61a
Merge branch 'main' into doe_categorical
ufukguenes Aug 10, 2023
38e8918
adding random seed
ufukguenes Aug 10, 2023
f2f5f5a
adapting test
ufukguenes Aug 10, 2023
5300768
adapting (partially) fixed experiments
ufukguenes Aug 10, 2023
44666e6
typing
ufukguenes Aug 11, 2023
6041d28
ignore typing
ufukguenes Aug 14, 2023
17d0424
ignore typing
ufukguenes Aug 14, 2023
f4804c0
started beautiful fix
ufukguenes Aug 14, 2023
a89bced
reverting beautiful fix
ufukguenes Aug 14, 2023
c98e5ac
quick fix
ufukguenes Aug 14, 2023
e8430b9
Update test_doe.py
ufukguenes Aug 14, 2023
8adfeda
fix test
ufukguenes Aug 14, 2023
385ed9f
merge
ufukguenes Aug 14, 2023
6fa912c
going back to beautiful fix
ufukguenes Aug 14, 2023
40bd6ef
removing Relaxable Features
ufukguenes Aug 14, 2023
0e04efb
deleting unnecessary check
ufukguenes Aug 14, 2023
fa23eb7
adding set candidates and test
ufukguenes Aug 15, 2023
3918ade
typing
ufukguenes Aug 15, 2023
6942e6f
typing
ufukguenes Aug 15, 2023
8769201
bug fix
ufukguenes Aug 15, 2023
ef8ba0f
refactoring
ufukguenes Aug 15, 2023
b5637df
removing unnecessary line
ufukguenes Aug 15, 2023
ed04bce
merging tutorials from main
ufukguenes Aug 15, 2023
5c9366f
merging tutorials from main
ufukguenes Aug 15, 2023
1 change: 1 addition & 0 deletions bofire/data_models/constraints/nchoosek.py
@@ -93,6 +93,7 @@ def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Serie
         Returns:
             bool: True if fulfilled else False.
         """
+
         cols = self.features
         sums = (np.abs(experiments[cols]) > tol).sum(axis=1)

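Note on the hunk above (illustrative, not part of the diff): the `sums` line counts, per experiment, how many of the constraint's features have an absolute value above `tol`. A minimal stdlib-only sketch of that counting step follows. The helper name `count_active` and the sample rows are made up for illustration, and the comparison of the count against the constraint's bounds lies outside the hunk, so it is not reproduced here.

```python
def count_active(row, features, tol=1e-6):
    """Count the features in one experiment whose |value| exceeds tol."""
    return sum(abs(row[name]) > tol for name in features)


# Two hypothetical experiments over three features.
experiments = [
    {"x1": 0.5, "x2": 0.0, "x3": 1e-9},  # x3 is below tol, so it counts as inactive
    {"x1": 0.3, "x2": 0.3, "x3": 0.4},
]

active = [count_active(row, ["x1", "x2", "x3"]) for row in experiments]
print(active)
```

The tolerance matters for relaxed designs: a solver may return values such as `1e-9` instead of an exact zero, and these should not count as a selected component.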
4 changes: 2 additions & 2 deletions bofire/data_models/constraints/nonlinear.py
@@ -73,7 +73,7 @@ def jacobian(self, experiments: pd.DataFrame) -> pd.DataFrame:


 class NonlinearEqualityConstraint(NonlinearConstraint):
-    """Nonlinear inequality constraint of the form 'expression <= 0'.
+    """Nonlinear equality constraint of the form 'expression == 0'.

     Attributes:
         expression: Mathematical expression that can be evaluated by `pandas.eval`.
@@ -91,7 +91,7 @@ def __str__(self):


 class NonlinearInequalityConstraint(NonlinearConstraint):
-    """Linear inequality constraint of the form 'expression == 0'.
+    """Nonlinear inequality constraint of the form 'expression <= 0'.

     Attributes:
         expression: Mathematical expression that can be evaluated by `pandas.eval`.
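Note on the docstring fix above (illustrative, not part of the diff): the corrected docstrings state that the equality constraint has the form `expression == 0` and the inequality constraint the form `expression <= 0`. The sketch below shows the difference on a unit-circle expression. The helper names and the tolerance handling are assumptions made for this example and are not taken from the BoFire sources.

```python
def circle(x1, x2):
    """Value of the example expression 'x1**2 + x2**2 - 1'."""
    return x1**2 + x2**2 - 1


def equality_fulfilled(value, tol=1e-6):
    # 'expression == 0', checked up to a numerical tolerance (assumed here)
    return abs(value) <= tol


def inequality_fulfilled(value, tol=1e-6):
    # 'expression <= 0', checked up to a numerical tolerance (assumed here)
    return value <= tol


on_circle = circle(0.6, 0.8)   # approximately 0
inside = circle(0.1, 0.1)      # negative
outside = circle(1.0, 1.0)     # positive

print(equality_fulfilled(on_circle), inequality_fulfilled(on_circle))
print(equality_fulfilled(inside), inequality_fulfilled(inside))
print(equality_fulfilled(outside), inequality_fulfilled(outside))
```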
2 changes: 1 addition & 1 deletion bofire/data_models/domain/domain.py
@@ -179,7 +179,7 @@ def validate_linear_constraints(cls, v, values):
         # gather continuous inputs in dictionary
         continuous_inputs_dict = {}
         for f in values["inputs"]:
-            if type(f) is ContinuousInput:
+            if isinstance(f, ContinuousInput):
                 continuous_inputs_dict[f.key] = f

         # check if non continuous input features appear in linear constraints
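Note on the change above (illustrative, not part of the diff): `type(f) is ContinuousInput` matches only exact instances, while `isinstance(f, ContinuousInput)` also matches subclasses. The sketch below uses stand-in classes and assumes that the relaxable inputs introduced in this PR derive from `ContinuousInput`. That relationship is not visible in the hunks shown here, so treat it as an assumption.

```python
class ContinuousInput:
    """Stand-in for the real feature class, reduced to a key."""

    def __init__(self, key):
        self.key = key


class RelaxableBinaryInput(ContinuousInput):
    """Hypothetical subclass relationship, assumed for this example."""


inputs = [ContinuousInput("x1"), RelaxableBinaryInput("b1")]

# Exact type check: the subclass instance is skipped.
exact = {f.key for f in inputs if type(f) is ContinuousInput}

# isinstance check: the subclass instance is included.
inclusive = {f.key for f in inputs if isinstance(f, ContinuousInput)}

print(sorted(exact), sorted(inclusive))
```

Under that assumption, the `isinstance` version lets relaxed binary or discrete inputs appear in linear constraints without tripping the "non continuous input" check.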
9 changes: 6 additions & 3 deletions bofire/data_models/strategies/doe.py
@@ -2,8 +2,6 @@

 from bofire.data_models.constraints.api import Constraint
 from bofire.data_models.features.api import (
-    CategoricalInput,
-    DiscreteInput,
     Feature,
     MolecularInput,
 )
@@ -22,14 +20,19 @@ class DoEStrategy(Strategy):
         ],
         str,
     ]
+    optimization_strategy: Literal[
+        "default", "exhaustive", "branch-and-bound", "partially-random", "relaxed"
+    ] = "default"
+
+    verbose: bool = False

     @classmethod
     def is_constraint_implemented(cls, my_type: Type[Constraint]) -> bool:
         return True

     @classmethod
     def is_feature_implemented(cls, my_type: Type[Feature]) -> bool:
-        if my_type in [CategoricalInput, DiscreteInput, MolecularInput]:
+        if my_type in [MolecularInput]:
             return False
         return True

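Note on the new `optimization_strategy` field (illustrative, not part of the diff): the `Literal` annotation restricts the field to five strings, with `"default"` as the fallback. In the data model this restriction is presumably enforced by the model's own field validation. The stdlib-only sketch below just makes the accepted values explicit, and `check_strategy` is a hypothetical helper, not a BoFire function.

```python
from typing import Literal, get_args

# Same value set as the annotation in the hunk above.
OptimizationStrategy = Literal[
    "default", "exhaustive", "branch-and-bound", "partially-random", "relaxed"
]


def check_strategy(name: str = "default") -> str:
    """Return name if it is one of the allowed strategies, else raise ValueError."""
    allowed = get_args(OptimizationStrategy)
    if name not in allowed:
        raise ValueError(f"unknown optimization strategy {name!r}; allowed: {allowed}")
    return name


print(check_strategy())
print(check_strategy("branch-and-bound"))
```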
6 changes: 6 additions & 0 deletions bofire/data_models/strategies/samplers/polytope.py
@@ -15,6 +15,10 @@
     Feature,
 )
 from bofire.data_models.strategies.samplers.sampler import SamplerStrategy
+from bofire.strategies.doe.utils_features import (
+    RelaxableBinaryInput,
+    RelaxableDiscreteInput,
+)


 class PolytopeSampler(SamplerStrategy):
@@ -45,4 +49,6 @@ def is_feature_implemented(cls, my_type: Type[Feature]) -> bool:
             CategoricalInput,
             DiscreteInput,
             CategoricalDescriptorInput,
+            RelaxableBinaryInput,
+            RelaxableDiscreteInput,
         ]
232 changes: 232 additions & 0 deletions bofire/strategies/doe/branch_and_bound.py
@@ -0,0 +1,232 @@
from __future__ import annotations

from functools import total_ordering
from queue import PriorityQueue
from typing import List

import numpy as np
import pandas as pd

from bofire.data_models.constraints.api import ConstraintNotFulfilledError
from bofire.data_models.domain.domain import Domain
from bofire.strategies.doe.design import find_local_max_ipopt
from bofire.strategies.doe.objective import get_objective_class
from bofire.strategies.doe.utils import get_formula_from_string
from bofire.strategies.doe.utils_features import (
    RelaxableBinaryInput,
    RelaxableDiscreteInput,
)


@total_ordering
class NodeExperiment:
    def __init__(
        self,
        partially_fixed_experiments: pd.DataFrame,
        design_matrix: pd.DataFrame,
        value: float,
        categorical_groups: List[List[RelaxableBinaryInput]],
        discrete_vars: List[RelaxableDiscreteInput],
    ):
        """

        Args:
            partially_fixed_experiments: dataframe containing (some) fixed variables for experiments.
            design_matrix: optimal design for given the fixed and partially fixed experiments
            value: value of the objective function evaluated with the design_matrix
            categorical_groups: Represents the different groups of the categorical variables
            discrete_vars: List of discrete variables in the optimization problem
        """
        self.partially_fixed_experiments = partially_fixed_experiments
        self.design_matrix = design_matrix
        self.value = value
        self.categorical_groups = categorical_groups
        self.discrete_vars = discrete_vars

    def get_next_fixed_experiments(self) -> List[pd.DataFrame]:
        """
        Based on the current partially_fixed_experiment DataFrame the next branches are determined. One variable will
        be fixed more than before.
        Returns: List of the next possible branches where only one variable more is fixed

        """
        # branching for the binary/ categorical variables
        for group in self.categorical_groups:
            for row_index, _exp in self.partially_fixed_experiments.iterrows():
                if (
                    self.partially_fixed_experiments.iloc[row_index][group[0].key]
                    is None
                ):
                    current_keys = [elem.key for elem in group]
                    allowed_fixations = np.eye(len(group))
                    branches = [
                        self.partially_fixed_experiments.copy()
                        for i in range(len(allowed_fixations))
                    ]
                    for k, elem in enumerate(branches):
                        elem.loc[row_index, current_keys] = allowed_fixations[k]
                    return branches

        # branching for the discrete variables
        for var in self.discrete_vars:
            for row_index, _exp in self.partially_fixed_experiments.iterrows():
                current_fixation = self.partially_fixed_experiments.iloc[row_index][
                    var.key
                ]
                first_fixation, second_fixation = None, None
                if current_fixation is None:
                    lower_split, upper_split = var.equal_count_split(
                        var.lower_bound, var.upper_bound
                    )
                    first_fixation = (var.lower_bound, lower_split)
                    second_fixation = (upper_split, var.upper_bound)

                elif current_fixation[0] != current_fixation[1]:
                    lower_split, upper_split = var.equal_count_split(
                        current_fixation[0], current_fixation[1]
                    )
                    first_fixation = (current_fixation[0], lower_split)
                    second_fixation = (upper_split, current_fixation[1])

                if first_fixation is not None:
                    first_branch = self.partially_fixed_experiments.copy()
                    second_branch = self.partially_fixed_experiments.copy()

                    first_branch.loc[row_index, var.key] = first_fixation
                    second_branch.loc[row_index, var.key] = second_fixation

                    return [first_branch, second_branch]

        return []

    def __eq__(self, other: NodeExperiment) -> bool:
        return self.value == other.value

    def __ne__(self, other: NodeExperiment) -> bool:
        return self.value != other.value

    def __lt__(self, other: NodeExperiment) -> bool:
        return self.value < other.value

    def __str__(self):
        return (
            "\n ================ Branch-and-Bound Node ================ \n"
            + f"objective value: {self.value} \n"
            + f"design matrix: \n{self.design_matrix.round(4)} \n"
            + f"current fixations: \n{self.partially_fixed_experiments.round(4)} \n"
        )
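
Note on the categorical branching in `get_next_fixed_experiments` (illustrative, not part of the diff): for the first experiment whose group is still unfixed, the method creates one branch per category and writes a row of the identity matrix into that group, so exactly one category is set to 1 in each branch. The stdlib-only sketch below mirrors that step with lists of dictionaries instead of DataFrames. `one_hot_branches` and its arguments are hypothetical names chosen for this example.

```python
def one_hot_branches(fixations, row, group):
    """Return one copy of `fixations` per category, with `row` fixed one-hot."""
    branches = []
    for chosen in group:
        # Copy every experiment so the branches do not share state.
        branch = [dict(experiment) for experiment in fixations]
        for key in group:
            branch[row][key] = 1.0 if key == chosen else 0.0
        branches.append(branch)
    return branches


# Two experiments, one categorical group with three categories, nothing fixed yet.
fixations = [
    {"a": None, "b": None, "c": None},
    {"a": None, "b": None, "c": None},
]

branches = one_hot_branches(fixations, row=0, group=["a", "b", "c"])
for branch in branches:
    print(branch[0], branch[1])
```

Each branch fixes only the chosen experiment, so the second experiment stays free and is branched on in a later step.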


def is_valid(
    design_matrix: pd.DataFrame, domain: Domain, tolerance: float = 1e-2
) -> bool:
    """
    test if a design is a valid solution. i.e. binary and discrete variables are valid
    Args:
        design_matrix (pd.DataFrame): the design to test
        domain (Domain): the domain for which the design should be tested
        tolerance: absolute tolerance between valid values and values in the design

    Returns: True if the design is valid, else False

    """
    categorical_vars = domain.get_features(includes=RelaxableBinaryInput)
    for var in categorical_vars:
        value = design_matrix.get(var.key)
        if not (
            np.logical_or(
                np.isclose(value, 0, atol=tolerance),
                np.isclose(value, 1, atol=tolerance),
            ).all()
        ):
            return False

    discrete_vars = domain.get_features(includes=RelaxableDiscreteInput)
    for var in discrete_vars:
        value = design_matrix.get(var.key)
        if False in [True in np.isclose(v, var.values, atol=tolerance) for v in value]:  # type: ignore
            return False
    return True
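
Note on `is_valid` (illustrative, not part of the diff): a relaxed design counts as valid once every binary column is within `tolerance` of 0 or 1 and every discrete column is within `tolerance` of one of the allowed values. The sketch below restates both checks with `math.isclose` and a purely absolute tolerance. This is a simplification, since `np.isclose` also adds a small relative term by default, and the helper names are made up for this example.

```python
import math


def is_binary(values, tol=1e-2):
    """True if every value lies within tol of 0 or of 1."""
    return all(
        math.isclose(v, 0.0, rel_tol=0.0, abs_tol=tol)
        or math.isclose(v, 1.0, rel_tol=0.0, abs_tol=tol)
        for v in values
    )


def is_discrete(values, allowed, tol=1e-2):
    """True if every value lies within tol of at least one allowed value."""
    return all(
        any(math.isclose(v, a, rel_tol=0.0, abs_tol=tol) for a in allowed)
        for v in values
    )


print(is_binary([0.004, 0.997]))                    # nearly integral column
print(is_binary([0.5, 1.0]))                        # 0.5 is still relaxed
print(is_discrete([1.003, 4.999], [1.0, 2.5, 5.0]))
print(is_discrete([3.0], [1.0, 2.5, 5.0]))
```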


def bnb(
    priority_queue: PriorityQueue,
    verbose: bool = False,
    num_explored: int = 0,
    **kwargs,
) -> NodeExperiment:
    """
    branch-and-bound algorithm for solving optimization problems containing binary and discrete variables
    Args:
        num_explored: keeping track of how many branches have been explored
        priority_queue (PriorityQueue): initial nodes of the branching tree
        verbose (bool): if true, print information during the optimization process
        **kwargs: parameters for the actual optimization / find_local_max_ipopt

    Returns: a branching Node containing the best design found

    """
    if priority_queue.empty():
        raise RuntimeError("Queue empty before feasible solution was found")

    domain = kwargs["domain"]
    n_experiments = kwargs["n_experiments"]

    # get objective function
    model_formula = get_formula_from_string(
        model_type=kwargs["model_type"], rhs_only=True, domain=domain
    )
    objective_class = get_objective_class(kwargs["objective"])
    objective_class = objective_class(
        domain=domain, model=model_formula, n_experiments=n_experiments
    )

    pre_size = priority_queue.qsize()
    current_branch = priority_queue.get()
    # test if current solution is already valid
    if is_valid(current_branch.design_matrix, domain):
        return current_branch

    # branch current solutions in sub-problems
    next_branches = current_branch.get_next_fixed_experiments()

    if verbose:
        print(
            f"current length of branching queue (+ new branches): {pre_size} + {len(next_branches)} currently "
            f"explored branches: {num_explored}, current best value: {current_branch.value}"
        )
    # solve branched problems
    for _i, branch in enumerate(next_branches):
        initial_sample = branch.where(
            ~pd.isnull(branch), current_branch.design_matrix.values
        )
        initial_sample = initial_sample.astype("float64")
        kwargs["sampling"] = initial_sample
        try:
            design = find_local_max_ipopt(partially_fixed_experiments=branch, **kwargs)
            value = objective_class.evaluate(design.to_numpy().flatten())
            new_node = NodeExperiment(
                branch,
                design,
                value,
                current_branch.categorical_groups,
                current_branch.discrete_vars,
            )
            domain.validate_candidates(
                candidates=design.apply(lambda x: np.round(x, 8)),
                only_inputs=True,
                tol=1e-4,
                raise_validation_error=True,
            )

            priority_queue.put(new_node)
        except ConstraintNotFulfilledError:
            if verbose:
                print("skipping branch because of not fulfilling constraints")

    return bnb(
        priority_queue,
        verbose=verbose,
        num_explored=num_explored + len(next_branches),
        **kwargs,
    )
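
Note on `bnb` (illustrative, not part of the diff): the function pops the node with the smallest value, returns it if its design is already valid, and otherwise branches, solves each relaxed sub-problem and pushes the results back onto the queue. It then calls itself, so the call depth grows with the number of explored nodes. The sketch below shows the same pop, check, branch, push cycle written as a loop on a toy problem. The toy picks binary values close to fractional targets and is not the D-optimal design problem. All names are hypothetical, and the relaxation is trivial by construction.

```python
from itertools import count
from queue import PriorityQueue


def relaxed_cost(fixed, targets):
    """Cost of the relaxation: free variables sit on their target (cost 0),
    so only the already fixed binary variables contribute."""
    return sum((x - t) ** 2 for x, t in zip(fixed, targets))


def best_first_bnb(targets):
    tie_breaker = count()  # keeps queue entries ordered when costs are equal
    queue = PriorityQueue()
    queue.put((0.0, next(tie_breaker), ()))  # root node: nothing fixed yet
    explored = 0
    while not queue.empty():
        bound, _, fixed = queue.get()  # node with the smallest bound first
        if len(fixed) == len(targets):
            # Every variable is fixed, so this node is a valid solution.
            return list(fixed), bound, explored
        for choice in (0, 1):  # branch on the next free variable
            child = fixed + (choice,)
            queue.put((relaxed_cost(child, targets), next(tie_breaker), child))
            explored += 1
    raise RuntimeError("Queue empty before feasible solution was found")


solution, cost, explored = best_first_bnb([0.2, 0.7, 0.4])
print(solution, round(cost, 4), explored)
```

Because the relaxed cost never decreases when another variable is fixed, the first fully fixed node taken from the queue is optimal for this toy problem. Whether the same guarantee holds for the design problem depends on the quality of the local solutions returned by the solver, which this sketch does not model.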