diff --git a/CHANGELOG.md b/CHANGELOG.md
index f020ee74..65d21dbc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,6 @@
# Changelog
-## HEAD
+## 1.2.0 - 2022-11-18
### Features
diff --git a/documentation/learners.md b/documentation/learners.md
index 16ce0f74..9d926c95 100644
--- a/documentation/learners.md
+++ b/documentation/learners.md
@@ -1251,3 +1251,52 @@ learner hyper-parameters.
- If true, workers will print training logs.
+
+## HYPERPARAMETER_OPTIMIZER
+
+
+
+### Training configuration
+
+The following protocol buffer definitions are used in the TrainingConfiguration
+to set learner hyper-parameters.
+
+- learner/abstract_learner.proto
+
+### Generic Hyper-parameters
+
+#### [maximum_model_size_in_memory_in_bytes](../yggdrasil_decision_forests/learner/abstract_learner.proto?q=symbol:maximum_model_size_in_memory_in_bytes)
+
+- **Type:** Real **Default:** -1
+
+- Limit the size of the model when stored in RAM. Different algorithms can
+  enforce this limit differently. Note that when a model is compiled into an
+  inference engine, the size of the engine is generally much smaller than that
+  of the original model.
+
+#### [maximum_training_duration_seconds](../yggdrasil_decision_forests/learner/abstract_learner.proto?q=symbol:maximum_training_duration_seconds)
+
+- **Type:** Real **Default:** -1
+
+- Maximum training duration of the model, expressed in seconds. Each learning
+  algorithm is free to use this parameter as it sees fit. Setting a maximum
+  training duration makes the model training non-deterministic.
+
+#### [pure_serving_model](../yggdrasil_decision_forests/learner/abstract_learner.proto?q=symbol:pure_serving_model)
+
+- **Type:** Categorical **Default:** false **Possible values:** true, false
+
+- Clear the model of any information that is not required for model serving.
+  This includes debugging information, model interpretation data and other
+  metadata. The size of the serialized model can be reduced significantly (a
+  50% reduction in model size is common). This parameter has no impact on the
+  quality, serving speed or RAM usage of model serving.
+
+#### [random_seed](../yggdrasil_decision_forests/learner/abstract_learner.proto?q=symbol:random_seed)
+
+- **Type:** Integer **Default:** 123456
+
+- Random seed for the training of the model. Learners are expected to be
+  deterministic given the random seed.
+
+
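+The generic hyper-parameters above correspond to fields of the `TrainingConfig`
+message declared in `learner/abstract_learner.proto`. As a minimal,
+illustrative sketch (the field names are assumed to match the documentation
+above, and the helper name is purely illustrative), they can be set
+programmatically from C++:
+
+```c++
+#include "yggdrasil_decision_forests/learner/abstract_learner.pb.h"
+
+namespace ydf = yggdrasil_decision_forests;
+
+// Builds a partial training configuration carrying only the generic
+// hyper-parameters documented above. A real configuration would also set the
+// learner, label and input features.
+ydf::model::proto::TrainingConfig MakeGenericConfig() {
+  ydf::model::proto::TrainingConfig config;
+  // Stop training after one hour. This makes training non-deterministic.
+  config.set_maximum_training_duration_seconds(3600);
+  // Keep the in-memory model under ~1 GB; enforcement is learner specific.
+  config.set_maximum_model_size_in_memory_in_bytes(1000000000);
+  // Strip debugging / interpretation metadata from the trained model.
+  config.set_pure_serving_model(true);
+  // Learners are deterministic given this seed (when no time limit is set).
+  config.set_random_seed(123456);
+  return config;
+}
+```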
diff --git a/documentation/rtd/hyper_parameters.md b/documentation/rtd/hyper_parameters.md
index b1e09a78..bad63559 100644
--- a/documentation/rtd/hyper_parameters.md
+++ b/documentation/rtd/hyper_parameters.md
@@ -1269,3 +1269,52 @@ learner hyper-parameters.
- If true, workers will print training logs.
+
+## HYPERPARAMETER_OPTIMIZER
+
+
+
+### Training configuration
+
+The following protocol buffer definitions are used in the TrainingConfiguration
+to set learner hyper-parameters.
+
+- learner/abstract_learner.proto
+
+### Generic Hyper-parameters
+
+#### [maximum_model_size_in_memory_in_bytes](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/abstract_learner.proto)
+
+- **Type:** Real **Default:** -1
+
+- Limit the size of the model when stored in RAM. Different algorithms can
+  enforce this limit differently. Note that when a model is compiled into an
+  inference engine, the size of the engine is generally much smaller than that
+  of the original model.
+
+#### [maximum_training_duration_seconds](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/abstract_learner.proto)
+
+- **Type:** Real **Default:** -1
+
+- Maximum training duration of the model, expressed in seconds. Each learning
+  algorithm is free to use this parameter as it sees fit. Setting a maximum
+  training duration makes the model training non-deterministic.
+
+#### [pure_serving_model](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/abstract_learner.proto)
+
+- **Type:** Categorical **Default:** false **Possible values:** true, false
+
+- Clear the model of any information that is not required for model serving.
+  This includes debugging information, model interpretation data and other
+  metadata. The size of the serialized model can be reduced significantly (a
+  50% reduction in model size is common). This parameter has no impact on the
+  quality, serving speed or RAM usage of model serving.
+
+#### [random_seed](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/abstract_learner.proto)
+
+- **Type:** Integer **Default:** 123456
+
+- Random seed for the training of the model. Learners are expected to be
+  deterministic given the random seed.
+
+
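+The generic hyper-parameters above are fields of the `TrainingConfig` message
+declared in `learner/abstract_learner.proto`, so they can also be provided as a
+text-format proto. A minimal sketch (field names assumed to match the
+documentation above, helper name purely illustrative; a complete configuration
+would also set the learner, task and label):
+
+```c++
+#include "google/protobuf/text_format.h"
+#include "yggdrasil_decision_forests/learner/abstract_learner.pb.h"
+
+namespace ydf = yggdrasil_decision_forests;
+
+// Parses a text-format training configuration that only sets the generic
+// hyper-parameters described above.
+ydf::model::proto::TrainingConfig ParseGenericConfig() {
+  const char* kTextConfig = R"pb(
+    maximum_training_duration_seconds: 3600  # Non-deterministic training.
+    maximum_model_size_in_memory_in_bytes: 1000000000  # ~1 GB cap in RAM.
+    pure_serving_model: true  # Drop debugging / interpretation metadata.
+    random_seed: 123456  # Deterministic training given this seed.
+  )pb";
+  ydf::model::proto::TrainingConfig config;
+  google::protobuf::TextFormat::ParseFromString(kTextConfig, &config);
+  return config;
+}
+```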