NB_07 Food101-Final on 2019 MacPro #643
jeffsnell started this conversation in Show and tell
I am working on implementing the Final Food-101 Model, to beat the published paper results, on an Apple Mac Pro (2019) - Intel 3.2 GHz 16-core Xeon W and AMD Radeon Pro Vega II 32 GB - using the Apple Metal GPU library (tensorflow-metal).
I have been able to load/run with:
python==3.10
tensorflow==2.13
tensorflow-hub==0.16.0
tensorflow-datasets==4.9.3
tensorflow-metal==1.0.0 (I have only been able to load 1.1.0 on M*-based Macs)
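
In case it helps anyone reproduce the setup, here is a minimal sanity check I would run to confirm the versions and that tensorflow-metal exposes the GPU (just a sketch; the commented values are what the list above should report):

```python
# Quick environment sanity check (versions should match the list above;
# the exact GPU device string is whatever Metal reports for the Vega II).
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds

print("TensorFlow:", tf.__version__)        # expecting 2.13.x
print("TF Hub:", hub.__version__)           # expecting 0.16.0
print("TFDS:", tfds.__version__)            # expecting 4.9.3
print("GPUs:", tf.config.list_physical_devices("GPU"))  # should list the Metal device
```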
I reached an accuracy of 77.6% - I could probably do better, but I am moving on with the course for now :)
For buffer_size and batch_size, I experimented with different values to see how fit/training performance changes for the 07_efficientnetb0_feature_extract_model_mixed_precision model.
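
For context, this is roughly the shape of the input pipeline those two values feed into (a sketch with example values and my own variable names, assuming the standard TFDS food101 splits and 224x224 inputs; not the exact notebook code):

```python
# Rough sketch of where BUFFER_SIZE and BATCH_SIZE come in
# (values below are examples, not the exact ones from my runs).
import tensorflow as tf
import tensorflow_datasets as tfds

BATCH_SIZE = 32      # example value; I tried several larger ones
BUFFER_SIZE = 1000   # shuffle buffer only; it does not change the model itself

def preprocess_img(image, label, img_size=224):
    """Resize to the model's input size and cast to float32."""
    image = tf.image.resize(image, [img_size, img_size])
    return tf.cast(image, tf.float32), label

(train_data, test_data), ds_info = tfds.load(
    "food101",
    split=["train", "validation"],
    shuffle_files=True,
    as_supervised=True,   # yields (image, label) tuples
    with_info=True,
)

train_data = (train_data
              .map(preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
              .shuffle(buffer_size=BUFFER_SIZE)
              .batch(BATCH_SIZE)
              .prefetch(tf.data.AUTOTUNE))

test_data = (test_data
             .map(preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
             .batch(BATCH_SIZE)
             .prefetch(tf.data.AUTOTUNE))
```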
I found that I am able to run with mixed precision enabled. I used <Policy "mixed_bfloat16"> because, although the AMD GPU does not support mixed precision, the Intel Xeon does for the CPU operations, and bfloat16 is the preferred type for that processor according to what I could find online.
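
Enabling it is a one-liner; a minimal sketch of what I mean (the prints just confirm compute runs in bfloat16 while variables stay float32):

```python
# Enable the mixed_bfloat16 global policy (bfloat16 compute, float32 variables).
import tensorflow as tf
from tensorflow.keras import mixed_precision

mixed_precision.set_global_policy("mixed_bfloat16")

policy = mixed_precision.global_policy()
print(policy)                 # <Policy "mixed_bfloat16">
print(policy.compute_dtype)   # bfloat16
print(policy.variable_dtype)  # float32
```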
Rough (by eye) average over 3 epochs:
The only GPU performance measure available to me is GPU "Load"; the last/bottom row generates the highest CPU & GPU load and the shortest time per epoch.
@mrdbourke - is there a downside to running with these larger BATCH_SIZE and BUFFER_SIZE values? I do not detect any effect on the models' training/evaluation metrics.
As to the Final Food-101 Model - I am not there yet...
NOTE - I found this on batch size:
What is Batch Size?
Batch size is one of the most important hyperparameters in deep learning training. It is the number of samples used in one forward and backward pass through the network, and it has a direct impact on the accuracy and computational efficiency of training. The batch size can be understood as a trade-off between accuracy and speed: large batch sizes can lead to faster training times but may result in lower accuracy and overfitting, while smaller batch sizes can provide better accuracy but can be computationally expensive and time-consuming.
The batch size can also affect the convergence of the model, meaning that it can influence the optimization process and the speed at which the model learns. Small batch sizes can be more susceptible to random fluctuations in the training data, while larger batch sizes are more resistant to these fluctuations but may converge more slowly.
It is important to note that there is no one-size-fits-all answer when it comes to choosing a batch size, as the ideal size will depend on several factors, including the size of the training dataset, the complexity of the model, and the computational resources available.
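
One concrete way the batch size shows up is simple arithmetic: with Food-101's 75,750 training images, a larger batch means far fewer optimizer steps per epoch (a quick sketch, assuming the standard TFDS "train" split size):

```python
# Steps per epoch for a few batch sizes on the Food-101 training split.
import math

NUM_TRAIN_IMAGES = 75_750  # Food-101 TFDS "train" split (101 classes x 750 images)

for batch_size in (32, 64, 128, 256):
    steps = math.ceil(NUM_TRAIN_IMAGES / batch_size)
    print(f"batch_size={batch_size:>3} -> {steps} steps per epoch")
```

Fewer steps per epoch is generally where the wall-clock speedup comes from, at the cost of fewer gradient updates per pass over the data.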