# symbolic_regression_part2
...that is great BUT my problem needs a particular evaluator / requires a unique data access technique / has a peculiar way of doing things.
No problem at all, you can customize the evaluator!
Given $a$, $b$ and $c$, find a function $f$ such that $b \cdot f(c) = a$ (i.e. such that $f(c) = a/b$).
The problem itself probably isn't of immediate interest, but it illustrates a trait shared by other, more complicated problems and is a good vehicle for explaining a more general problem-solving technique.
```c++
const double a = ultra::random::between(-10.0, 10.0);
const double b = ultra::random::between(-10.0, 10.0);
```
`a` and `b` get two fixed, random values.

`c` is somewhat different: it's a *terminal*. Terminal and function sets are the alphabet of the to-be-evolved program (`f`). The terminal set consists of the variables and the constants; for our problem `c` is the only terminal required (in general we'd also add some numeric constants, as sketched further below):
```c++
class c : public ultra::terminal
{
public:
  c() : ultra::terminal("c") {}

  [[nodiscard]] value_t instance() const noexcept final
  {
    static const double val(ultra::random::between(-10.0, 10.0));
    return val;
  }
};
```
The constructor (`c() : ultra::terminal("c")`) sets the name of the terminal (used for display purposes). The `instance` function returns a fixed random value: since `val` is `static`, the value is drawn once and every subsequent call returns the same number.
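Numeric constants, when needed, can be added with the same mechanism. The following is a minimal sketch built on the `ultra::terminal` interface shown above; the class name `one` is hypothetical:

```c++
// Hypothetical constant terminal: always returns the literal 1.0.
class one : public ultra::terminal
{
public:
  one() : ultra::terminal("1.0") {}

  [[nodiscard]] value_t instance() const noexcept final { return 1.0; }
};
```

Such a terminal would be registered alongside the other symbols with `prob.insert<one>();`.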
```c++
int main()
{
  using namespace ultra;

  problem prob;

  // SETTING UP SYMBOLS
  prob.insert<c>();          // terminal
  prob.insert<real::add>();  // functions
  prob.insert<real::sub>();
  prob.insert<real::mul>();

  // ...
}
```
Note how the base `problem` class is used instead of the derived `src::problem`. `src::problem` offers a lot of ready-to-use functionality (`dataframe`s for training and validation, evaluator functions for scoring a candidate solution...) but `problem` is more general and adapts to different tasks (not only symbolic regression / classification).
Besides the terminal `c`, we use the functions `add`, `sub` and `mul` as building blocks (the function set).
Now only the evaluator (aka fitness function) is missing:
```c++
using candidate_solution = ultra::gp::individual;

// Given an individual (i.e. a candidate solution of the problem), returns a
// score measuring how good it is.
[[nodiscard]] double my_evaluator(const candidate_solution &x)
{
  using namespace ultra;

  const auto ret(run(x));
  const double f(has_value(ret) ? std::get<D_DOUBLE>(ret) : 0.0);

  const double model_output(b * f);
  const double delta(std::fabs(a - model_output));

  return -delta;
}
```
`candidate_solution` is just an alias for `gp::individual`; `gp::individual` is a linear representation (a Straight Line Program) used in genetic programming.

A line-by-line description of the evaluation process follows:
```c++
const auto ret(run(x));
```
Simply gets and stores the output of the `candidate_solution`. `ret` is a `std::variant` (see `value_t` for further details). Variants allow efficient manipulation of different data types: here we're working with real numbers, but Ultra also supports integers and strings.
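As an illustration of the general mechanism, here is a toy analogue written with plain standard-library code (the alias `toy_value` and the helper function are made up for this sketch; they are not Ultra's actual definitions):

```c++
#include <iostream>
#include <string>
#include <variant>

// Toy analogue of value_t: an empty state plus three payload types.
using toy_value = std::variant<std::monostate, double, int, std::string>;

// True when the variant holds an actual payload.
bool toy_has_value(const toy_value &v)
{
  return !std::holds_alternative<std::monostate>(v);
}

int main()
{
  const toy_value v(3.14);

  if (toy_has_value(v))
    std::cout << std::get<double>(v) << '\n';  // prints 3.14
}
```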
```c++
const double f(has_value(ret) ? std::get<D_DOUBLE>(ret) : 0.0);
```
`std::get<D_DOUBLE>(ret)` extracts the real number from the variant. The user must check the variant for the empty state (`has_value(ret)`): this is required since the evolution process generates many nefarious individuals that could blow up for specific input values.
```c++
const double model_output(b * f);
const double delta(std::fabs(a - model_output));
```
`delta` is a measure of the error based on the absolute value. Different norms may give better results (problem dependent).
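For instance, a squared-error norm could be plugged in the same way. This is just a sketch reusing the calls already shown above (the name `my_evaluator_sq` is made up):

```c++
// Squared-error variant: penalises large deviations more heavily than
// the absolute value does.
[[nodiscard]] double my_evaluator_sq(const candidate_solution &x)
{
  using namespace ultra;

  const auto ret(run(x));
  const double f(has_value(ret) ? std::get<D_DOUBLE>(ret) : 0.0);

  const double delta(a - b * f);
  return -delta * delta;  // negated: greater is better (see below)
}
```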
```c++
return -delta;
```
The last instruction can be confusing: `-delta` is used because Ultra adopts standardized fitness (greater is better), not raw fitness. A perfect individual thus scores `0.0`, while worse ones get increasingly negative values. See the comments in `fitness.h` for details.
All that remains is to put the pieces together:
```c++
int main()
{
  // ...

  // AD HOC EVALUATOR
  search s(prob, my_evaluator);

  // SEARCHING
  const auto result(s.run());

  std::cout << "\nCANDIDATE SOLUTION\n"
            << out::c_language << result.best_individual
            << "\n\nFITNESS\n" << *result.best_measurements.fitness << '\n';
}
```
The `search` object (`s`) is instructed to use our evaluator before being launched (`s.run()`).
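As a quick post-search sanity check, one could evaluate the winning individual once more and compare the model output with the target. A sketch reusing only the functions already seen above:

```c++
// Hypothetical check: run the best individual once more and compare
// b*f(c) with the target a.
using namespace ultra;

const auto out(run(result.best_individual));
if (has_value(out))
  std::cout << "b*f(c) = " << b * std::get<D_DOUBLE>(out)
            << "  (target a = " << a << ")\n";
```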
(For your convenience, all the code is in the `examples/symbolic_regression/symbolic_regression03.cc` file.)