The transformer blocks in this example contain two Conv1D layers, so the input matrix has to be reshaped to add a channel dimension at the end.
A GlobalAveragePooling1D layer follows the transformer blocks:
x = layers.GlobalAveragePooling1D(data_format="channels_last")(x)
This should be correct, since the channel dimension is added last.
However, when I run the example, the third-from-last line of the model summary does not show 64,128 params as expected:
dense (Dense) │ (None, 128) │ 64,128 │ global_average_pool…
Instead, it shows only 256 parameters, which makes the total parameter count far lower. The model also reaches only about 50% accuracy.
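The two counts can be related to the width of the tensor feeding the Dense layer. The sketch below assumes a standard fully connected layer with a bias term and the 128 units shown in the summary row; `dense_param_count` is a hypothetical helper written for illustration, not a Keras function.

```python
def dense_param_count(in_features: int, units: int, use_bias: bool = True) -> int:
    """Trainable parameters of a fully connected layer: weights plus optional biases."""
    return in_features * units + (units if use_bias else 0)


UNITS = 128  # taken from the summary row: dense (Dense), output shape (None, 128)

# Solve in_features * UNITS + UNITS = reported count for each reported value.
for reported in (64_128, 256):
    in_features = (reported - UNITS) // UNITS
    # Sanity check: the recovered width reproduces the reported count exactly.
    assert dense_param_count(in_features, UNITS) == reported
    print(f"{reported:>6,} params -> Dense input width {in_features}")
```

Under these assumptions, 64,128 params corresponds to a Dense input of width 500 and 256 params to a width of 1, which suggests the pooling layer hands a much narrower tensor to the Dense layer in the 256-param case.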
The same thing happens whether I run tensorflow-2.14-gpu or the CPU-only tensorflow-2.18.
However, if I change to data_format="channels_first", everything works. The Dense layer that follows the GlobalAveragePooling1D layer then has 64,128 params, the total params match, and the training accuracy exceeds 90%.
I discovered this after finding a very similar model here.
The only difference is the data_format.
But isn't data_format="channels_last" the right choice?
So what's wrong?
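To narrow this down, it may help to look at which axis each data_format averages over. The sketch below is a NumPy stand-in for the pooling step, not the Keras implementation. It assumes the tensor reaching the pooling layer has shape (batch, 500, 1); the length 500 is inferred from the 64,128-param count and is not stated above. `global_average_pool_1d` is a hypothetical helper that follows the documented convention: "channels_last" reads the input as (batch, steps, channels), "channels_first" as (batch, channels, steps), and the steps axis is averaged away.

```python
import numpy as np


def global_average_pool_1d(x: np.ndarray, data_format: str = "channels_last") -> np.ndarray:
    """NumPy stand-in for global average pooling over the steps axis of a 3D tensor."""
    if data_format == "channels_last":
        steps_axis = 1  # input read as (batch, steps, channels)
    elif data_format == "channels_first":
        steps_axis = 2  # input read as (batch, channels, steps)
    else:
        raise ValueError(f"unknown data_format: {data_format!r}")
    return x.mean(axis=steps_axis)


# Assumed shape: 500 time steps with a single trailing channel.
batch, steps, channels = 4, 500, 1
x = np.random.default_rng(0).normal(size=(batch, steps, channels))

pooled_last = global_average_pool_1d(x, "channels_last")
pooled_first = global_average_pool_1d(x, "channels_first")

print(pooled_last.shape)   # (4, 1): the 500 steps are averaged into one value
print(pooled_first.shape)  # (4, 500): only the size-1 last axis is averaged
```

If that shape assumption holds, "channels_last" is the semantically matching setting but collapses the whole sequence to a single number per sample, leaving the Dense layer one input feature. "channels_first" averages over an axis of length 1, so it effectively just drops that axis and passes all 500 values through, which would be consistent with the larger parameter count and the higher accuracy.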