-
Notifications
You must be signed in to change notification settings - Fork 82
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #189 from alcides/docker-compose
Updated. GeneticEngine to latest version (2024 competition)
- Loading branch information
Showing
10 changed files
with
185 additions
and
153 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,10 @@ | ||
authors: # the participants | ||
- Guilherme Espada, Leon Ingelse, Paulo Santos, Pedro Barbosa, Alcides Fonseca | ||
email: gjespada@fc.ul.pt, leoningelse@gmail.com, pacsantos@fc.ul.pt, psbarbosa@ciencias.ulisboa.pt, me@alcidesfonseca.com | ||
name: Genetic Engine | ||
- Alcides Fonseca, Guilherme Espada, Leon Ingelse, Eduardo Madeira | ||
email: me@alcidesfonseca.com, gjespada@fc.ul.pt, leoningelse@gmail.com, jmadeira@lasige.di.fc.ul.pt | ||
name: Genetic Engine | ||
description: | ||
Genetic Engine is a framework for using Genetic Programming in different contexts. Genetic Engine allows the user to define trees in terms of Classes and Inheritance, as they would in a regular programming environment. Our framework takes care of generating individuals, mutating them and crossing them over. The user also defines a fitness function that takes a tree and returns a fitness score. This often requires to write (or reuse) a tree interpreter, as it is custom in these types of approaches. | ||
GeneticEngine is an extensible framework for implementing evolutionary algorithms. Users can write their own class structure and select the algorithm they want to run. Each algorithm will generate random trees following the class specification. | ||
Currently, GeneticEngine supports several representations (tree-based, grammatical evolution, structured grammatical evolution, stack-based), several algorithms (hill-climbing, Genetic Programming, 1+1) and several operators (Tournament, Lexicase, different mutations and crossovers). | ||
A distinctive feature of this framework is that it supports dependent types that semantically constraint the tree generation. | ||
|
||
The main different between STGP and GGGP is that the restrictions on what trees are valid are done via types, while in GGGP they are expressed using a grammar. Genetic Engine extracts the grammar from the types and their relationship, allowing to use any technique from GGGP (such as Grammatical Evolution) in STGP. | ||
|
||
The advantages of using STGP are: | ||
|
||
* the user does not need to know grammars, EBNF or any other grammar syntax | ||
* there is no need for a textual representation of programs, as trees can be the only representation (à lá lisp). | ||
* there is no need for parsing a textual program to a tree, to then interpret the tree | ||
* Mutations and Recombination are automatically type-safe, where in a grammar that type-safety is implicit in the structure of the grammar (and hard to reason) | ||
url: "https://github.com/alcides/GeneticEngine/" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
authors: # the participants | ||
- Alcides Fonseca, Guilherme Espada, Leon Ingelse, Eduardo Madeira | ||
email: me@alcidesfonseca.com, gjespada@fc.ul.pt, leoningelse@gmail.com, jmadeira@lasige.di.fc.ul.pt | ||
name: Genetic Engine | ||
description: | ||
GeneticEngine is an extensible framework for implementing evolutionary algorithms. Users can write their own class structure and select the algorithm they want to run. Each algorithm will generate random trees following the class specification. | ||
Currently, GeneticEngine supports several representations (tree-based, grammatical evolution, structured grammatical evolution, stack-based), several algorithms (hill-climbing, Genetic Programming, 1+1) and several operators (Tournament, Lexicase, different mutations and crossovers). | ||
A distinctive feature of this framework is that it supports dependent types that semantically constraint the tree generation. | ||
|
||
url: "https://github.com/alcides/GeneticEngine/" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,156 +1,46 @@ | ||
from audioop import cross | ||
from typing import Union | ||
import geneticengine.off_the_shelf.regressors as gengy_regressors | ||
from sklearn.base import BaseEstimator, TransformerMixin | ||
from sklearn.model_selection import GridSearchCV | ||
# This example submission shows the submission of FEAT (cavalab.org/feat). | ||
from geml.regressors import GeneticProgrammingRegressor | ||
from geml.regressors import model | ||
from sklearn.base import RegressorMixin | ||
|
||
""" | ||
est: a sklearn-compatible regressor. | ||
if you don't have one they are fairly easy to create. | ||
see https://scikit-learn.org/stable/developers/develop.html | ||
""" | ||
est: RegressorMixin = GeneticProgrammingRegressor( | ||
max_time=3, # 8 hrs. Your algorithm should have this feature | ||
) | ||
|
||
class OptimisedGPRegressor(BaseEstimator, TransformerMixin): | ||
def __init__( | ||
self, | ||
population_size = 50, | ||
n_novelties = 10, | ||
favor_less_deep_trees = True, | ||
random_state = 123, | ||
max_time = 100, | ||
optimisation_dedicated_proportion = 0.2, | ||
slack_time = 0.05, | ||
): | ||
self.population_size = population_size | ||
self.n_novelties = n_novelties | ||
self.favor_less_deep_trees = favor_less_deep_trees | ||
self.random_state = random_state | ||
self.max_time = max_time | ||
self.optimisation_dedicated_proportion = optimisation_dedicated_proportion | ||
self.slack_time = slack_time | ||
self.model = None | ||
|
||
|
||
def fit(self, X, y): | ||
n_elites = [ 5, 10 ] | ||
max_depths = [ 10, 15 ] | ||
hill_climbings = [ True, False ] | ||
mutation_probs = [ 0.01, 0.05, 0.1, 0.2 ] | ||
crossover_probs = [ 0.8, 0.9, 0.95 ] | ||
|
||
CVS = 2 | ||
|
||
param_grid_size = len(n_elites) * len(max_depths) * len(hill_climbings) * len(mutation_probs) * len(crossover_probs) | ||
param_alloted_time = int((self.max_time * self.optimisation_dedicated_proportion) / (param_grid_size * CVS)) | ||
if param_alloted_time < 1: # For testing | ||
n_elites = [ 5 ] | ||
max_depths = [ 10 ] | ||
mutation_probs = [ 0.01 ] | ||
crossover_probs = [ 0.8 ] | ||
|
||
param_alloted_time = 1 | ||
|
||
param_grid: Union[dict, list] = { | ||
"population_size": [ self.population_size ], | ||
"n_elites": n_elites, | ||
"n_novelties": [ self.n_novelties ], | ||
"max_depth": max_depths, | ||
"favor_less_deep_trees": [ self.favor_less_deep_trees ], | ||
"seed": [ self.random_state ], | ||
"hill_climbing": hill_climbings, | ||
"probability_mutation": mutation_probs, | ||
"probability_crossover": crossover_probs, | ||
"timer_stop_criteria": [ True ], | ||
"timer_limit": [ param_alloted_time ], | ||
"metric": [ 'r2' ], | ||
} | ||
|
||
grid_search = GridSearchCV(gengy_regressors.GeneticProgrammingRegressor(),param_grid,cv=CVS) | ||
|
||
grid_search.fit(X,y) | ||
model = grid_search.best_estimator_ | ||
|
||
model_alloted_time = int(self.max_time * (1 - self.optimisation_dedicated_proportion - self.slack_time)) | ||
if "timer_limit" in model.get_params(): | ||
model.set_params(timer_limit=model_alloted_time) | ||
|
||
model.fit(X,y) | ||
self.model = model | ||
|
||
self.sympy_compatible_phenotype = model.sympy_compatible_phenotype | ||
def get_population(est) -> list[RegressorMixin]: | ||
""" | ||
Return the final population of the model. This final population should | ||
be a list with at most 100 individuals. Each of the individuals must | ||
be compatible with scikit-learn, so they should have a predict method. | ||
return model | ||
|
||
def predict(self, X): | ||
assert self.model != None | ||
y_pred = self.model.predict(X) | ||
|
||
return y_pred | ||
|
||
def score(self, X, y): | ||
self.model.score(X, y) | ||
Also, it is expected that the `model()` function can operate with them, | ||
so they should have a way of getting a simpy string representation. | ||
|
||
|
||
Returns | ||
------- | ||
A list of scikit-learn compatible estimators | ||
""" | ||
|
||
return est.get_population() | ||
|
||
est = OptimisedGPRegressor( | ||
population_size = 250, | ||
n_novelties = 10, | ||
favor_less_deep_trees = True, | ||
) | ||
|
||
def model(est, X=None): | ||
def get_best_solution(est) -> RegressorMixin: | ||
""" | ||
Return a sympy-compatible string of the final model. | ||
Parameters | ||
---------- | ||
est: sklearn regressor | ||
The fitted model. | ||
X: pd.DataFrame, default=None | ||
The training data. This argument can be dropped if desired. | ||
Return the best solution from the final model. | ||
Returns | ||
------- | ||
A sympy-compatible string of the final model. | ||
Notes | ||
----- | ||
Ensure that the variable names appearing in the model are identical to | ||
those in the training data, `X`, which is a `pd.Dataframe`. | ||
If your method names variables some other way, e.g. `[x_0 ... x_m]`, | ||
you can specify a mapping in the `model` function such as: | ||
``` | ||
def model(est, X): | ||
mapping = {'x_'+str(i):k for i,k in enumerate(X.columns)} | ||
new_model = est.model_ | ||
for k,v in mapping.items(): | ||
new_model = new_model.replace(k,v) | ||
``` | ||
If you have special operators such as protected division or protected log, | ||
you will need to handle these to assure they conform to sympy format. | ||
One option is to replace them with the unprotected versions. Post an issue | ||
if you have further questions: | ||
https://github.com/cavalab/srbench/issues/new/choose | ||
A scikit-learn compatible estimator | ||
""" | ||
|
||
# Here we replace "|" with "" to handle | ||
# protecte sqrt (expressed as sqrt(|.|)) in FEAT) | ||
model_str = est.sympy_compatible_phenotype | ||
|
||
return model_str | ||
return est.get_best_solution() | ||
|
||
def pre_train_fn(est, X, y): | ||
"""set max_time in seconds based on length of X.""" | ||
slack = 20 | ||
if len(X)<=1000: | ||
max_time = 3600 - slack | ||
else: | ||
max_time = 36000 - slack | ||
est.set_params(max_time=max_time) | ||
|
||
# pass the function to eval_kwargs | ||
eval_kwargs = dict( | ||
pre_train=pre_train_fn, | ||
test_params={'max_time': 100, | ||
} | ||
) | ||
|
||
# define eval_kwargs. | ||
eval_kwargs = {} |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# This example submission shows the submission of FEAT (cavalab.org/feat). | ||
from geml.regressors import RandomSearchRegressor | ||
from geml.regressors import model | ||
from sklearn.base import RegressorMixin | ||
|
||
""" | ||
est: a sklearn-compatible regressor. | ||
if you don't have one they are fairly easy to create. | ||
see https://scikit-learn.org/stable/developers/develop.html | ||
""" | ||
est: RegressorMixin = RandomSearchRegressor( | ||
max_time=3, # 8 hrs. Your algorithm should have this feature | ||
) | ||
|
||
|
||
def get_population(est) -> list[RegressorMixin]: | ||
""" | ||
Return the final population of the model. This final population should | ||
be a list with at most 100 individuals. Each of the individuals must | ||
be compatible with scikit-learn, so they should have a predict method. | ||
Also, it is expected that the `model()` function can operate with them, | ||
so they should have a way of getting a simpy string representation. | ||
Returns | ||
------- | ||
A list of scikit-learn compatible estimators | ||
""" | ||
|
||
return est.get_population() | ||
|
||
|
||
def get_best_solution(est) -> RegressorMixin: | ||
""" | ||
Return the best solution from the final model. | ||
Returns | ||
------- | ||
A scikit-learn compatible estimator | ||
""" | ||
|
||
return est.get_best_solution() | ||
|
||
|
||
# define eval_kwargs. | ||
eval_kwargs = {} |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# This example submission shows the submission of FEAT (cavalab.org/feat). | ||
from geml.regressors import HillClimbingRegressor | ||
from geml.regressors import model | ||
from sklearn.base import RegressorMixin | ||
|
||
""" | ||
est: a sklearn-compatible regressor. | ||
if you don't have one they are fairly easy to create. | ||
see https://scikit-learn.org/stable/developers/develop.html | ||
""" | ||
est: RegressorMixin = HillClimbingRegressor( | ||
max_time=3, # 8 hrs. Your algorithm should have this feature | ||
) | ||
|
||
|
||
def get_population(est) -> list[RegressorMixin]: | ||
""" | ||
Return the final population of the model. This final population should | ||
be a list with at most 100 individuals. Each of the individuals must | ||
be compatible with scikit-learn, so they should have a predict method. | ||
Also, it is expected that the `model()` function can operate with them, | ||
so they should have a way of getting a simpy string representation. | ||
Returns | ||
------- | ||
A list of scikit-learn compatible estimators | ||
""" | ||
|
||
return est.get_population() | ||
|
||
|
||
def get_best_solution(est) -> RegressorMixin: | ||
""" | ||
Return the best solution from the final model. | ||
Returns | ||
------- | ||
A scikit-learn compatible estimator | ||
""" | ||
|
||
return est.get_best_solution() | ||
|
||
|
||
# define eval_kwargs. | ||
eval_kwargs = {} |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# This example submission shows the submission of FEAT (cavalab.org/feat). | ||
from geml.regressors import OnePlusOneRegressor | ||
from geml.regressors import model | ||
from sklearn.base import RegressorMixin | ||
|
||
""" | ||
est: a sklearn-compatible regressor. | ||
if you don't have one they are fairly easy to create. | ||
see https://scikit-learn.org/stable/developers/develop.html | ||
""" | ||
est: RegressorMixin = OnePlusOneRegressor( | ||
max_time=3, # 8 hrs. Your algorithm should have this feature | ||
) | ||
|
||
|
||
def get_population(est) -> list[RegressorMixin]: | ||
""" | ||
Return the final population of the model. This final population should | ||
be a list with at most 100 individuals. Each of the individuals must | ||
be compatible with scikit-learn, so they should have a predict method. | ||
Also, it is expected that the `model()` function can operate with them, | ||
so they should have a way of getting a simpy string representation. | ||
Returns | ||
------- | ||
A list of scikit-learn compatible estimators | ||
""" | ||
|
||
return est.get_population() | ||
|
||
|
||
def get_best_solution(est) -> RegressorMixin: | ||
""" | ||
Return the best solution from the final model. | ||
Returns | ||
------- | ||
A scikit-learn compatible estimator | ||
""" | ||
|
||
return est.get_best_solution() | ||
|
||
|
||
# define eval_kwargs. | ||
eval_kwargs = {} |