sonar_tutorial_part3
Tweaking the search parameters is the usual first approach.
Increasing the search duration (more generations), performing more runs, and enlarging the population size are some ways to trade computing time for solution quality (these values are set via the parameters class):
```cpp
int main()
{
  using namespace ultra;

  src::dataframe::params params;
  params.output_index = src::dataframe::params::index::back;

  src::problem prob("sonar.csv", params);
  prob.setup_symbols();

  // TWEAKING THE PARAMETERS
  prob.params.population.init_subgroups = 3;    // <-- 1
  prob.params.population.individuals = 3000;    // <-- 2
  prob.params.evolution.generations = 200;      // <-- 3

  src::search s(prob);
  s.validation_strategy<src::holdout_validation>(prob);

  const auto result(s.run(5));                  // <-- 4

  std::cout << "\nCANDIDATE SOLUTION\n"
            << out::c_language << result.best_individual
            << "\n\nACCURACY\n" << *result.best_measurements.accuracy * 100.0
            << '%'
            << "\n\nFITNESS\n" << *result.best_measurements.fitness << '\n';
}
```

(the above code is in the examples/sonar03.cc file)
1. `prob.params.population.init_subgroups = 3;`
   The search starts with 3 distinct sub-populations evolving in parallel. Ultra uses an algorithm called ALPS and can take advantage of multithreading / multiprocessing architectures.
2. `prob.params.population.individuals = 3000;`
   A larger population permits a broader exploration of the search space (large populations require more computing power / memory).
3. `prob.params.evolution.generations = 200;`
   Allowing more time to refine a candidate solution increases the likelihood of improvement, but may also waste resources in case of premature convergence.
4. `s.run(5)`
   We want a search consisting of five runs, each an independent evolution cycle. The solution of each run is evaluated on the validation set and the best candidate solution is chosen as the winner.

(full example in examples/sonar04.cc)
Changing some aspects of the evolution process can also be beneficial. Two interesting changes are:

- `prob.params.evolution.brood_recombination = 3`
  Brood recombination is an effective technique for exploring the potential of the crossover operator in greater depth: each crossover produces several offspring and only the fittest survives. While it may occasionally lead to premature convergence, it's worth experimenting with.
- `prob.params.team.individuals = 3`
  The cooperative team approach tends to improve training and generalization performance compared to standard GP. However, caution is necessary here as well: an excessively large team may lead to overfitting.
A sample run produces output similar to:

```
[INFO] Importing dataset...
[INFO] ...dataset imported
[INFO] Examples: 208, features: 60, classes: 2
[INFO] Setting up terminals...
[INFO] Category 0 variables: `X1` `X2` `X3` `X4` `X5` `X6` `X7` `X8` `X9` `X10` `X11` `X12` `X13` `X14` `X15` `X16` `X17` `X18` `X19` `X20` `X21` `X22` `X23` `X24` `X25` `X26` `X27` `X28` `X29` `X30` `X31` `X32` `X33` `X34` `X35` `X36` `X37` `X38` `X39` `X40` `X41` `X42` `X43` `X44` `X45` `X46` `X47` `X48` `X49` `X50` `X51` `X52` `X53` `X54` `X55` `X56` `X57` `X58` `X59` `X60`
[INFO] ...terminals ready
[INFO] Automatically setting up symbol set...
[INFO] Category 0 symbols: `FABS` `FADD` `FDIV` `FLN` `FMUL` `FMOD` `FSUB`
[INFO] ...symbol set ready
[INFO] Holdout validation settings: 70% training (144), 30% validation (64), 0% test (0)
[INFO] Number of layers set to 1
[INFO] Population size set to 736

 0.592    0: -43.8549
 1.063    1: -39.3506
 1.520    2: -35.7575
...
03:04  386: -19.8824
03:06  390: -19.605
03:07  392: -19.3401
03:08  393: -19.0378
03:10  396: -18.913
03:13  401: -18.1754
03:17  407: -18.1338
03:18  408: -18.1247
03:20  411: -17.9289
03:23  415: -17.9113
05:11  578: -17.5204
05:17  586: -17.4367
05:19  589: -17.1784
05:22  593: -17.1744
05:22  594: -17.1669
[INFO] Evolution completed at generation: 602. Elapsed time: 05:27

Run 4 TRAINING.   Fitness: -17.1669  Accuracy: 93.055556%
Run 4 VALIDATION. Fitness: -19.4064  Accuracy: 75.000000%

CANDIDATE SOLUTION
X11+fmod(X47,(X28-X15))
fmod(X42,(X19*X18))
(X36+X31)*X17

ACCURACY
89.0625%
```