Semantic segmentation of land cover from high resolution multispectral satellite images by spectral-spatial convolutional neural network

Saralioglu E., Gungor O.

GEOCARTO INTERNATIONAL, vol.37, no.2, pp.657-677, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 37 Issue: 2
  • Publication Date: 2022
  • Doi Number: 10.1080/10106049.2020.1734871
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aqualine, Aquatic Science & Fisheries Abstracts (ASFA), Environment Index, Geobase, INSPEC
  • Page Numbers: pp.657-677
  • Keywords: classification, deep learning, remote sensing, semantic segmentation
  • Karadeniz Technical University Affiliated: Yes


Improving the accuracy of classification algorithms for very high-resolution satellite images remains an active research topic in remote sensing. The success of deep learning methods in areas such as image classification and object detection has led to their application to remote sensing problems. Convolutional Neural Networks (CNNs) are among the most common deep learning methods used in image classification; however, their use in satellite image classification is relatively new. Because 3D CNNs, which aim to extract both spatial and spectral information, have high computational complexity, 2D CNNs that focus on extracting spatial information are often preferred. High-resolution satellite images, however, contain crucial spectral information as well as spatial information. In this study, a 3D-2D CNN model using both spectral and spatial information was applied to extract more accurate land cover information from very high-resolution satellite images. The model was applied to a WorldView-2 satellite image containing agricultural product areas such as tea and hazelnut groves and land use classes such as buildings and roads. The results of the CNN-based model were compared against those of the Support Vector Machine (SVM) and Random Forest (RF) algorithms. Post-classification accuracies were obtained using 800 control points generated through a web interface created for crowdsourcing purposes. The classification accuracy was 95.6% for the 3D-2D CNN model, 89.2% for RF and 86.4% for SVM.
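
The abstract reports classification accuracy measured at control points. As a minimal sketch of how such a figure is commonly computed, the Python snippet below builds a confusion matrix from reference and predicted labels and derives the overall accuracy from it. The function names, class list, and six toy labels are hypothetical and are not data or code from the study; the abstract does not state which accuracy metric the authors used, so overall accuracy is an assumption here.

```python
from collections import Counter


def confusion_matrix(reference, predicted, classes):
    """Count (reference, predicted) label pairs; rows are reference classes."""
    counts = Counter(zip(reference, predicted))
    return [[counts[(r, p)] for p in classes] for r in classes]


def overall_accuracy(matrix):
    """Fraction of control points whose predicted class matches the reference."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical toy labels for six control points (NOT data from the study).
classes = ["tea", "hazelnut", "building", "road"]
reference = ["tea", "tea", "hazelnut", "building", "road", "road"]
predicted = ["tea", "hazelnut", "hazelnut", "building", "road", "road"]

matrix = confusion_matrix(reference, predicted, classes)
accuracy = overall_accuracy(matrix)
print(f"Overall accuracy: {accuracy:.1%}")
```

With real data, `reference` would hold the labels assigned at the control points and `predicted` the class each classifier (CNN, RF, SVM) assigned at the same locations, so one matrix per classifier allows a like-for-like comparison.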