An hyperparameter tuner for Keras, specifically for tf.keras
with TensorFlow 2.0.
Status: Alpha.
Requirements:
- Python 3.6
- TensorFlow 2.0 Beta
Installation process:
git clone https://github.com/keras-team/keras-tuner.git
cd keras-tuner
pip install .
Here's how to perform hyperparameter tuning for a single-layer dense neural network using random search.
First, we define a model-building function. It takes an argument hp
from which you can
sample hyperparameters, such as hp.Int('units', min_value=32, max_value=512, step=32)
(an integer from a certain range).
This function returns a compiled model.
from tensorflow import keras
from tensorflow.keras import layers
from kerastuner.tuners import RandomSearch
def build_model(hp):
model = keras.Sequential()
model.add(layers.Dense(units=hp.Int('units',
min_value=32,
max_value=512,
step=32),
activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.compile(
optimizer=keras.optimizers.Adam(
hp.Choice('learning_rate',
values=[1e-2, 1e-3, 1e-4])),
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
return model
Next, instantiate a tuner. You should specify the model-building function,
the name of the objective to optimize (whether to minimize or maximize is automatically inferred
for built-in metrics), the total number of trials (max_trials
) to test, and the number
of models that should be built and fit for each trial (executions_per_trial
).
Available tuners are RandomSearch
and Hyperband
.
Note: the purpose of having multiple executions per trial is to reduce results variance
and therefore be able to more accurately assess the performance of a model. If you want to get
results faster, you could set executions_per_trial=1
(single round of training for each model configuration).
tuner = RandomSearch(
build_model,
objective='val_accuracy',
max_trials=5,
executions_per_trial=3,
directory='my_dir',
project_name='helloworld')
You can print a summary of the search space:
tuner.search_space_summary()
Then, start the search for the best hyperparameter configuration.
The call to search
has the same signature as model.fit()
.
tuner.search(x, y,
epochs=5,
validation_data=(val_x, val_y))
Here's what happens in search
: models are built iteratively by calling the model-building function,
which populates the hyperparameter space (search space) tracked by the hp
object. The tuner
progressively explores the space, recording metrics for each configuration.
When search is over, you can retrieve the best model(s):
models = tuner.get_best_models(num_models=2)
Or print a summary of the results:
tuner.results_summary()
You will also find detailed logs, checkpoints, etc, in the folder my_dir/helloworld
, i.e. directory/project_name
.
Below, we have a for
loop creating a tunable number of layers,
which themselves involve a tunable units
parameter.
This can be pushed to any level of parameter interdependency, including recursion.
Note that all parameter names should be unique (here, in the loop over i
,
we name the inner parameters 'units_' + str(i)
).
def build_model(hp):
model = keras.Sequential()
for i in range(hp.Int('num_layers', 2, 20)):
model.add(layers.Dense(units=hp.Int('units_' + str(i),
min_value=32,
max_value=512,
step=32),
activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.compile(
optimizer=keras.optimizers.Adam(
hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
return model
This makes it easy to share and reuse hypermodels.
A HyperModel
subclass only needs to implement a build(self, hp)
method.
from kerastuner import HyperModel
class MyHyperModel(HyperModel):
def __init__(self, num_classes):
self.num_classes = num_classes
def build(self, hp):
model = keras.Sequential()
model.add(layers.Dense(units=hp.Int('units',
min_value=32,
max_value=512,
step=32),
activation='relu'))
model.add(layers.Dense(self.num_classes, activation='softmax'))
model.compile(
optimizer=keras.optimizers.Adam(
hp.Choice('learning_rate',
values=[1e-2, 1e-3, 1e-4])),
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
return model
hypermodel = MyHyperModel(num_classes=10)
tuner = RandomSearch(
hypermodel,
objective='val_accuracy',
max_trials=10,
directory='my_dir',
project_name='helloworld')
tuner.search(x, y,
epochs=5,
validation_data=(val_x, val_y))
These are ready-to-use hypermodels for computer vision.
They come pre-compiled with loss="categorical_crossentropy"
and metrics=["accuracy"]
.
from kerastuner.applications import HyperResNet
from kerastuner.tuners import Hyperband
hypermodel = HyperResNet(input_shape=(128, 128, 3), num_classes=10)
tuner = Hyperband(
hypermodel,
objective='val_accuracy',
max_trials=40,
directory='my_dir',
project_name='helloworld')
tuner.search(x, y,
epochs=20,
validation_data=(val_x, val_y))
If you have an existing hypermodel, and you want to search over only a few parameters
(such as the learning rate), you can do so by passing a hyperparameters
argument
to the tuner constructor, as well as tune_new_entries=False
to specify that parameters
that you didn't list in hyperparameters
should not be tuned. For these parameters, the default
value gets used.
from kerastuner import HyperParameters
hypermodel = HyperXception(input_shape=(128, 128, 3), num_classes=10)
hp = HyperParameters()
# This will override the `learning_rate` parameter with your
# own selection of choices
hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
tuner = Hyperband(
hypermodel,
hyperparameters=hp,
# `tune_new_entries=False` prevents unlisted parameters from being tuned
tune_new_entries=False,
objective='val_accuracy',
max_trials=40,
directory='my_dir',
project_name='helloworld')
tuner.search(x, y,
epochs=20,
validation_data=(val_x, val_y))
Want to know what parameter names are available? Read the code.
Whenever you register a hyperparameter inside a model-building function or the build
method of a hypermodel,
you can specify a default value:
hp.Int('units',
min_value=32,
max_value=512,
step=32,
default=128)
If you don't, hyperparameters always have a default default (for Int
, it is equal to min_value
).
What if you want to do the reverse -- tune all available parameters in a hypermodel, except one (the learning rate)?
Pass a hyperparameters
argument with a Fixed
entry (or any number of Fixed
entries), and specify tune_new_entries=True
.
hypermodel = HyperXception(input_shape=(128, 128, 3), num_classes=10)
hp = HyperParameters()
hp.Fixed('learning_rate', value=1e-4)
tuner = Hyperband(
hypermodel,
hyperparameters=hp,
tune_new_entries=True,
objective='val_accuracy',
max_trials=40,
directory='my_dir',
project_name='helloworld')
tuner.search(x, y,
epochs=20,
validation_data=(val_x, val_y))
If you have a hypermodel for which you want to change the existing optimizer, loss, or metrics, you can do so by passing these arguments to the tuner constructor:
hypermodel = HyperXception(input_shape=(128, 128, 3), num_classes=10)
tuner = Hyperband(
hypermodel,
optimizer=keras.optimizers.Adam(1e-3),
loss='mse',
metrics=[keras.metrics.Precision(name='precision'),
keras.metrics.Recall(name='recall')],
objective='val_precision',
max_trials=40,
directory='my_dir',
project_name='helloworld')
tuner.search(x, y,
epochs=20,
validation_data=(val_x, val_y))