jinglun-cn/Landsat-8-image-classification

Landsat-8 Classification


Prerequisites

Python >=3.5

TensorFlow >=2.0

GDAL

Rasterio

Shapely

Sites for Training

The sites of interest consist of 529 locations across 32 US states, listed in the following file:

./sites_train.csv

[Image: sites]

Download and Process Data

python data_processing.py

The script downloads Landsat-8 and National Land Cover Database 2016 (NLCD 2016) data from Amazon S3.

For Landsat-8, each input Lat/Lon is converted to a WRS Path/Row so that it can be matched against scene product IDs. Among the Level-1 scenes from 2016 with the corresponding Path/Row, the one with the least cloud cover is downloaded from Amazon S3. OLI bands 1-7 and 9 are extracted.

The whole process can take a long time and requires at least 22 GB of disk space, depending on the exact input sites.
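The scene selection step can be sketched as follows. This is a minimal illustration, not the code in data_processing.py: the column names (`productId`, `acquisitionDate`, `cloudCover`, `path`, `row`), the sample rows, and the helper `least_cloudy_scene` are all assumptions made for the example, and the Lat/Lon to Path/Row lookup is taken as already done.

```python
import csv
import io

# Hypothetical scene index with made-up rows; column names are assumptions.
SCENE_INDEX = """productId,acquisitionDate,cloudCover,path,row
scene_a,2016-01-12,41.2,15,42
scene_b,2016-04-01,3.7,15,42
scene_c,2017-03-02,0.5,15,42
scene_d,2016-05-10,1.1,16,41
"""


def least_cloudy_scene(rows, path, row, year):
    """Return the record with the lowest cloud cover for a path/row and year.

    Returns None when no scene matches.
    """
    candidates = [
        r for r in rows
        if int(r["path"]) == path
        and int(r["row"]) == row
        and r["acquisitionDate"].startswith(str(year))
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: float(r["cloudCover"]))


rows = list(csv.DictReader(io.StringIO(SCENE_INDEX)))
best = least_cloudy_scene(rows, path=15, row=42, year=2016)
print(best["productId"])
```

Here `scene_c` has less cloud cover but is skipped because it falls outside 2016, and `scene_d` is skipped because it belongs to a different Path/Row.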

For each input site, the script crops a 3840 m x 3840 m square image centered on the site.

The extracted data are saved as a NumPy array with dtype "uint16" and shape (sample_index, x-coord, y-coord, band).
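As a quick check on the crop size: the Landsat-8 OLI multispectral bands have a 30 m pixel size, so a 3840 m square corresponds to 128 x 128 pixels. The sketch below computes the crop bounds around a site; it assumes the site has already been projected to the scene's coordinate system in meters, and `crop_bounds` and the sample coordinates are hypothetical.

```python
PIXEL_SIZE_M = 30    # pixel size of the Landsat-8 OLI multispectral bands
CROP_SIZE_M = 3840   # crop edge length used for each site


def crop_bounds(center_x, center_y, size_m=CROP_SIZE_M):
    """Return (left, bottom, right, top) of a square centered on a point, in meters."""
    half = size_m / 2
    return (center_x - half, center_y - half, center_x + half, center_y + half)


pixels_per_side = CROP_SIZE_M // PIXEL_SIZE_M
bounds = crop_bounds(500000.0, 2800000.0)  # hypothetical easting/northing
print(pixels_per_side, bounds)
```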

Extracted Data Example

Below is an example. The first 8 images are the Landsat-8 bands. The last image is from NLCD.

[Image: example]

Train the CNN

python train.py

The script reads the NumPy array, generates small patches and trains the CNN.

In [Sharma et al., 2017], a patch-based CNN is trained and tested on Landsat-8 data from the Florida Everglades ecosystem. Because our data are significantly larger and more diverse, the original model reached only 78% validation accuracy on them. Our model improves the validation accuracy to 89%.
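Patch generation might look roughly like the sketch below. It is not taken from train.py: the patch size of 5, the choice to label each patch by its center pixel, and the function `extract_patches` are assumptions for illustration, and random data stands in for the extracted array.

```python
import numpy as np


def extract_patches(image, labels, patch_size=5):
    """Cut every full patch out of one image and label it by its center pixel.

    image:  (H, W, bands) array of spectral data
    labels: (H, W) array of land-cover classes
    patch_size is assumed to be odd so that a center pixel exists.
    """
    half = patch_size // 2
    height, width, _ = image.shape
    patches, targets = [], []
    for i in range(half, height - half):
        for j in range(half, width - half):
            patches.append(image[i - half:i + half + 1, j - half:j + half + 1, :])
            targets.append(labels[i, j])
    return np.stack(patches), np.array(targets)


# Random stand-in data: one 16 x 16 image with 8 bands and 4 classes.
rng = np.random.default_rng(0)
image = rng.integers(0, 65535, size=(16, 16, 8), dtype=np.uint16)
labels = rng.integers(0, 4, size=(16, 16), dtype=np.uint8)

x, y = extract_patches(image, labels)
print(x.shape, y.shape)
```

Only positions where the whole patch fits inside the image produce a sample, so a 16 x 16 image yields 12 x 12 patches of size 5.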

Classify sites using pretrained model

python classify.py

The script loads the pretrained model from:

./pretrained.hdf5

The sites for classification are given in:

./sites_classify.csv

Landsat-8 data are downloaded, cropped and classified.

The class labels are stored in the output array:

./arr_cls.npy

Because the model is patch-based, pixels near the image boundary are left unclassified. To classify a larger area, set a larger image size beforehand.
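The size of the unclassified border follows from the patch size. The sketch below assumes a square, odd-sized patch centered on the pixel being classified; the patch size of 5 and both helper functions are hypothetical and may not match classify.py.

```python
def classified_extent(image_size, patch_size):
    """Return (margin, classified_size) in pixels for a centered square patch."""
    margin = patch_size // 2
    return margin, image_size - 2 * margin


def image_size_for(target_size, patch_size):
    """Image size needed so that target_size pixels per side receive a label."""
    return target_size + 2 * (patch_size // 2)


margin, classified = classified_extent(128, 5)
needed = image_size_for(128, 5)
print(margin, classified, needed)
```

Under these assumptions a 128-pixel image loses a 2-pixel border on each side, and a 132-pixel image would be needed to label a full 128 x 128 area.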

* Process LiDAR point cloud (LPC) data to estimate elevation

Input: LiDAR point cloud (LPC) data (.las file) and extracted Landsat8-NLCD data (.tif file).

For each pixel in the extracted raster file, the program estimates the average elevation as the mean of a 10 x 10 square of LPC data. The elevation is then classified into three categories (low, medium and high).

The outputs are the average elevation and the elevation classes, both as 128 x 128 arrays.
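A rough sketch of the averaging and classification steps is given below. It makes several assumptions: the points are already loaded as x, y, z arrays (reading the .las file is not shown), points are averaged per grid cell rather than over the exact 10 x 10 square described above, and the thresholds `low_max` and `medium_max` are hypothetical since the actual class boundaries are not stated here.

```python
import numpy as np


def mean_elevation_grid(x, y, z, left, top, cell_size, n_cells):
    """Average point elevations z inside each cell of an n_cells x n_cells grid.

    Cells that contain no points are set to NaN.
    """
    cols = np.floor((x - left) / cell_size).astype(int)
    rows = np.floor((top - y) / cell_size).astype(int)
    inside = (cols >= 0) & (cols < n_cells) & (rows >= 0) & (rows < n_cells)
    sums = np.zeros((n_cells, n_cells))
    counts = np.zeros((n_cells, n_cells))
    np.add.at(sums, (rows[inside], cols[inside]), z[inside])
    np.add.at(counts, (rows[inside], cols[inside]), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)


def classify_elevation(mean_elev, low_max, medium_max):
    """Map elevations to 0 (low), 1 (medium), 2 (high); empty cells become -1."""
    classes = np.digitize(mean_elev, [low_max, medium_max])
    classes[np.isnan(mean_elev)] = -1
    return classes


# Four made-up points on a 4 x 4 grid of 10 m cells.
x = np.array([5.0, 5.0, 15.0, 35.0])
y = np.array([35.0, 35.0, 35.0, 5.0])
z = np.array([1.0, 3.0, 10.0, 50.0])

mean_elev = mean_elevation_grid(x, y, z, left=0.0, top=40.0, cell_size=10.0, n_cells=4)
classes = classify_elevation(mean_elev, low_max=5.0, medium_max=20.0)
print(mean_elev[0, 0], classes[0, 0], classes[3, 3])
```

With `n_cells=128` and a cell size matching the raster pixels, both outputs would have the 128 x 128 shape described above.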

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
