Machine Learning-Based Processing Proof-of-Concept Pipeline for Semi-Automatic Sentinel-2 Imagery Download, Cloudiness Filtering, Classifications, and Updates of Open Land Use/Land Cover Datasets

Authors

ŘEZNÍK Tomáš, CHYTRÝ Jan, TROJANOVÁ Kateřina

Year of publication 2021
Type Article in Periodical
Magazine / Source ISPRS International Journal of Geo-Information
MU Faculty or unit

Faculty of Science

Citation
web https://doi.org/10.3390/ijgi10020102
Doi http://dx.doi.org/10.3390/ijgi10020102
Keywords machine learning; land use; land cover; satellite imagery; Sentinel 2; image classification; cloud masking; LightGBM estimator
Description Land use and land cover are continuously changing in today's world. Both domains therefore rely on updates from external information sources from which the relevant land use/land cover classification is extracted. Satellite images are frequent candidates because of their temporal and spatial resolution. However, extracting the relevant land use/land cover information is demanding in terms of both knowledge base and time. The presented approach offers a proof-of-concept machine-learning pipeline that handles the entire process as follows. First, the relevant Sentinel-2 images are obtained through the pipeline. Next, cloud masking is performed, including the linear interpolation of merged-feature time frames. Subsequently, four-dimensional arrays are created from all potential training data as a basis for estimators from the scikit-learn library; the LightGBM estimator is then used. Finally, the classified content is applied to the open land use and open land cover databases. The experiment was verified against detailed cadastral data, to which Shannon's entropy was applied since the number of cadastral information classes was naturally consistent. The experiment showed a good overall accuracy (OA) of 85.9%. It yielded a classified land use/land cover map of the study area, which covers 7188 km² in the southern part of the South Moravian Region in the Czech Republic. The developed proof-of-concept machine-learning pipeline can be replicated in any other area of interest, provided the requirements for input data are met.
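Three of the building blocks named in the description (linear interpolation over cloud-masked time frames, Shannon's entropy, and overall accuracy) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `fill_cloudy`, `shannon_entropy`, and `overall_accuracy` are hypothetical, the interpolation works on a single pixel's time series rather than on four-dimensional arrays, and the description does not say exactly how Shannon's entropy was applied to the cadastral data, so it is shown here only as a measure of class mix within a set of labels.

```python
import math
from collections import Counter


def fill_cloudy(series, cloudy):
    """Linearly interpolate cloud-flagged values in one pixel's time series.

    Assumption: frames are evenly spaced in time. Gaps at either end of the
    series are filled with the nearest cloud-free value.
    """
    clear = [i for i, is_cloudy in enumerate(cloudy) if not is_cloudy]
    if not clear:
        raise ValueError("no cloud-free observation in series")
    out = list(series)
    for i, is_cloudy in enumerate(cloudy):
        if not is_cloudy:
            continue
        left = max((j for j in clear if j < i), default=None)
        right = min((j for j in clear if j > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = series[left] + weight * (series[right] - series[left])
    return out


def shannon_entropy(labels):
    """Shannon's entropy (in bits) of the class distribution in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def overall_accuracy(reference, predicted):
    """Share of samples whose predicted class equals the reference class."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)
```

For example, `fill_cloudy([0.2, 0.0, 0.0, 0.8], [False, True, True, False])` replaces the two cloud-flagged frames with values on the straight line between 0.2 and 0.8, and `overall_accuracy` applied to reference and predicted class labels gives the kind of OA figure reported above.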