Two significant issues with Segment Anything (SAM) are its model size and inference time: predicting one frame takes 75 sec on an Intel i5 CPU or 6 sec on an Nvidia P100 GPU.
In this experiment I trained Linknet to mimic SAM and got 0.25 sec/frame on a 768x768 image, compared with 75 sec/frame for the original SAM model on the same Intel i5 CPU. That is about a 300x speedup.
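The per-frame figures can be checked with a plain wall-clock timer. The sketch below is only an illustration of how such a measurement might be done, not the code used in the notebooks; `predict` is a hypothetical callable that wraps either model, and `warmup` is an assumed parameter.

```python
import time


def seconds_per_frame(predict, frames, warmup=1):
    """Average wall-clock seconds that `predict` spends on one frame."""
    frames = list(frames)
    # Warm-up calls are not timed, so one-off setup cost does not skew the mean.
    for frame in frames[:warmup]:
        predict(frame)
    start = time.perf_counter()
    for frame in frames:
        predict(frame)
    return (time.perf_counter() - start) / len(frames)


def speedup(baseline_sec, fast_sec):
    """How many times faster the fast model is than the baseline."""
    return baseline_sec / fast_sec


# Figures quoted above: SAM vs. Linknet on an Intel i5 CPU.
print(speedup(75.0, 0.25))  # 300.0
```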
The speedup comes with limitations:
- it works only for a particular territory
- it works only at a particular zoom level
- it is still not very accurate
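One way to see why the zoom level matters: if the imagery comes from standard Web Mercator (slippy map) tiles, which is an assumption here, each zoom step halves the ground size of a pixel, so the same field or forest patch appears at a different pixel scale than the one the model was trained on. A minimal sketch of that relationship, with 256-pixel tiles assumed and the latitude and zoom values chosen only for illustration:

```python
import math

# Equatorial circumference of the WGS84 ellipsoid, in meters.
EARTH_CIRCUMFERENCE = 2 * math.pi * 6378137.0


def meters_per_pixel(lat_deg, zoom, tile_size=256):
    """Ground resolution of a Web Mercator tile pixel at a given latitude."""
    return (EARTH_CIRCUMFERENCE * math.cos(math.radians(lat_deg))
            / (tile_size * 2 ** zoom))


def footprint_m(lat_deg, zoom, image_px=768):
    """Side length, in meters, of the ground area covered by a square image."""
    return image_px * meters_per_pixel(lat_deg, zoom)


# Compare the area a 768x768 image covers at neighbouring zoom levels.
for zoom in (15, 16, 17):
    print(zoom, round(footprint_m(50.0, zoom), 1))
```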
Steps to reproduce:
- Grab imagery: 1_get_data.ipynb
  - download a region from https://download.geofabrik.de
  - filter the landcover classes you need
  - set the zoom level you want
  - download the imagery
- Make predictions with SAM and store the results: 2_predict_data_with_sam.ipynb
- Train the Linknet model: 3_train_linknet.ipynb
- Mine a new data sample with an active learning approach and look at the results: 4_mine_new_data_and_look_on_results.ipynb
- Tune the Linknet model: 5_tune_linknet.ipynb
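The active learning step can be sketched as follows. The notebook's actual selection criterion is not described here, so this uses uncertainty sampling, one common choice, purely as an assumption: tiles where the trained Linknet outputs probabilities near 0.5 are picked first, then sent to SAM for labels and added to the tuning set. The function names and the flat-list mask format are hypothetical.

```python
def mean_uncertainty(probs):
    """Mean per-pixel uncertainty: 0 for confident pixels, 1 when p == 0.5."""
    probs = list(probs)
    return sum(1.0 - abs(2.0 * p - 1.0) for p in probs) / len(probs)


def select_uncertain_tiles(student_probs, k):
    """Return the ids of the k tiles the student model is least sure about.

    `student_probs` maps a tile id to a flat sequence of per-pixel
    foreground probabilities predicted by the student (Linknet) model.
    """
    ranked = sorted(
        student_probs,
        key=lambda tile_id: mean_uncertainty(student_probs[tile_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy 4-pixel "tiles" for illustration only.
predictions = {
    "confident": [0.0, 1.0, 1.0, 0.0],
    "unsure": [0.5, 0.5, 0.4, 0.6],
    "mixed": [0.9, 0.1, 0.5, 1.0],
}
print(select_uncertain_tiles(predictions, k=2))  # ['unsure', 'mixed']
```

Only the selected tiles need a SAM prediction, which keeps the slow model off most of the new imagery.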
Google Colab or Kaggle Notebooks are enough to reproduce the experiments.