Land-cover classification with high-resolution remote sensing images using transferable deep models

doi:10.1016/j.rse.2019.111322

Remote Sensing of Environment

Volume 237, February 2020, 111322

https://doi.org/10.1016/j.rse.2019.111322

Highlights

• A method to learn a transferable deep model for 5-class land-cover (LC) classification.
• A labeled dataset consisting of 150 Gaofen-2 images for LC classification.
• The method improves LC classification performance by about 20% using multi-source RS images.
• The method shows good transferability across different sensors and geolocations.

Abstract

In recent years, a large amount of high spatial-resolution remote sensing (HRRS) images has become available for land-cover mapping. However, due to the complex information brought by the increased spatial resolution and the data disturbances caused by different image acquisition conditions, it is often difficult to find an efficient method for accurate land-cover classification with high-resolution, heterogeneous remote sensing images. In this paper, we propose a scheme that applies a deep model trained on a labeled land-cover dataset to classify unlabeled HRRS images. The main idea is to rely on deep neural networks to represent the contextual information contained in different land-cover types, and to use a pseudo-labeling and sample selection scheme to improve the transferability of deep models. More precisely, a deep convolutional neural network (CNN) is first pre-trained on a well-annotated land-cover dataset, referred to as the source data. Then, given a target image with no labels, the pre-trained CNN is used to classify the image in a patch-wise manner. Patches classified with high confidence are assigned pseudo-labels and used as queries to retrieve related samples from the source data. The pseudo-labels confirmed by the retrieved results are treated as supervision for fine-tuning the pre-trained deep model. To obtain a pixel-wise land-cover classification of the target image, we rely on the fine-tuned CNN and develop a hybrid classification that combines patch-wise classification with hierarchical segmentation. In addition, we create a large-scale land-cover dataset containing 150 Gaofen-2 satellite images for CNN pre-training. Experiments on multi-source HRRS images, including Gaofen-2, Gaofen-1, Jilin-1, Ziyuan-3, Sentinel-2A, and Google Earth platform data, show encouraging results and demonstrate the applicability of the proposed scheme to land-cover classification with multi-source HRRS images.

Introduction

Land-cover classification with remote sensing (RS) images plays an important role in many applications such as land resource management, urban planning, precision agriculture, and environmental protection (Mathieu et al., 2007; Shi et al., 2015; Ozdarici-Ok et al., 2015; Zhang and Kovacs, 2012; Ardila et al., 2011; Fauvel et al., 2013). In recent years, high-resolution remote sensing (HRRS) images have become increasingly available. Meanwhile, multi-source and multi-temporal RS images can be obtained over different geographical areas (Moser et al., 2013). Such a large amount of heterogeneous HRRS images provides detailed information about the land surface and therefore opens new avenues for large-coverage and multi-temporal land-cover mapping. However, the rich details of objects emerging in HRRS images, such as the geometrical shape and structural content of objects, bring more challenges to land-cover classification (Bruzzone and Carlin, 2006). Furthermore, diverse imaging conditions usually lead to photographic distortions, variations in scale, and changes of illumination in RS images, which often seriously reduce the separability among different classes (Tuia et al., 2016). Because of these influences, classification models learned from certain annotated images often quickly lose their effectiveness on new images captured by different sensors, or by the same sensor at different geo-locations. Therefore, it remains difficult to find an efficient and accurate land-cover classification method for HRRS images with large diversities.

To characterize the image content of different land-cover categories, many methods have investigated the use of spectral and spectral-spatial features to interpret RS images (Jensen and Lulla, 1986; Gong et al., 1992; Casals-Carrasco et al., 2000; Giada et al., 2003; Tarabalka et al., 2010a, b; Zhong et al., 2014; Ma et al., 2017a). However, due to the detailed and structural information brought by the gradually increased spatial resolution, spectral and spectral-spatial features have difficulty describing the contextual information contained in the images (Zhao et al., 2016; Zhong et al., 2017; Hu et al., 2016; Yu et al., 2016), which is often essential for depicting land-cover categories in HRRS images. Recently, it has been reported that effective characterization of contextual information in HRRS images can largely improve classification performance (Shao et al., 2013; Hu et al., 2017; Yang et al., 2015). In particular, deep Convolutional Neural Networks (CNNs) have drawn much attention in the understanding of HRRS images (Hu et al., 2015a; Zhu et al., 2017), mainly because of their strong capability to depict high-level and semantic aspects of images (Krizhevsky et al., 2012; Zeiler and Fergus, 2014). Currently, various deep models have been adopted to cope with challenging issues in RS image understanding, including scene classification (Hu et al., 2015a; Xia et al., 2017c), object detection (Xia et al., 2018), image retrieval (Napoletano, 2018; Jiang et al., 2017; Xia et al., 2017b), and land-cover classification (Zhao and Du, 2016; Zhao et al., 2015; Zhang et al., 2018a; Maggiori et al., 2017b; Kussul et al., 2017; Volpi and Tuia, 2017).

Nevertheless, there are two main problems in applying deep models to land-cover classification with multi-source HRRS images, which are listed below.

- The inadequate transferability of deep learning models: Due to the diverse distributions of objects and the spectral shifts caused by different image acquisition conditions, deep models trained on a certain set of annotated RS images may not be effective when dealing with images acquired by different sensors or from different geo-locations (Othman et al., 2017). To obtain satisfactory land-cover classification on an RS image of interest, referred to as the target image, new annotated samples closely related to it are often necessary for model fine-tuning (Maggiori et al., 2017b). Nevertheless, because manual annotation is labor-intensive and often time-consuming, it is infeasible to label sufficient samples for continuously accumulating multi-source RS images (Lu et al., 2017; Hu et al., 2015b).
- The lack of a well-annotated large-scale land-cover dataset: The identification capability of CNN models relies heavily on the quality and quantity of the training data (Chakraborty et al., 2015). Up to now, several land-cover datasets have been proposed in the community, and they have considerably advanced deep-learning-based land-cover classification approaches (Gerke et al., 2014; Maggiori et al., 2017a; Mattyus et al., 2015). However, the geographic areas covered by most existing land-cover datasets (Ma et al., 2017b; Gerke et al., 2014; Mattyus et al., 2015) do not exceed $10\,\mathrm{km}^{2}$ and are somewhat similar in geographic distribution (Mnih, 2013). The lack of variation in the geographic distribution of annotated HRRS images may cause overfitting in model training and limit the generalization ability of learned models. Overall, insufficient or unqualified training data restrict the applicability of deep models to HRRS images.

In this paper, we propose a scheme to adapt deep models to land-cover classification with multi-source HRRS images that have no labeling information. Considering that the textures and structures of objects are not affected by spectral shifts, we use the contextual information extracted by a CNN to automatically mine samples for deep model fine-tuning. Concretely, unlabeled samples in the target image are classified by a CNN model pre-trained on an annotated HRRS dataset, which is referred to as the source data. A subset of them classified with high confidence is assigned pseudo-labels and used to retrieve similar samples from the source data. Finally, the returned results are used to determine whether the pseudo-labels are reliable. In our classification process, a patch-wise classification is first conducted on the image, relying on the multi-scale contextual information extracted by the CNN. Then, a hierarchical segmentation is used to obtain object boundary information, which is integrated into the patch-wise classification map for accurate results. For pre-training the CNN models, we annotate 150 Gaofen-2 satellite images to construct a land-cover classification dataset, which is named the Gaofen Image Dataset (GID).
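The sample-mining step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `select_pseudo_labels`, the cosine-similarity retrieval, and the parameters `conf_threshold`, `k`, and `agree_ratio` are all assumptions introduced here, and the feature vectors are assumed to come from some layer of the pre-trained CNN.

```python
import numpy as np

def select_pseudo_labels(target_probs, target_feats, source_feats, source_labels,
                         conf_threshold=0.9, k=5, agree_ratio=0.6):
    """Return (indices, labels) of target patches whose pseudo-labels are confirmed.

    target_probs : (n_t, n_classes) softmax outputs of the pre-trained CNN
    target_feats : (n_t, d) CNN features of the target patches
    source_feats : (n_s, d) CNN features of the labeled source samples
    source_labels: (n_s,) integer labels of the source samples
    All thresholds are hypothetical defaults, not values from the paper.
    """
    pseudo = target_probs.argmax(axis=1)        # most likely class per patch
    conf = target_probs.max(axis=1)             # its predicted probability
    candidates = np.flatnonzero(conf >= conf_threshold)

    # L2-normalise so that a dot product equals cosine similarity.
    src = source_feats / (np.linalg.norm(source_feats, axis=1, keepdims=True) + 1e-12)

    keep_idx, keep_lab = [], []
    for i in candidates:
        q = target_feats[i] / (np.linalg.norm(target_feats[i]) + 1e-12)
        top = np.argsort(-(src @ q))[:k]        # k most similar source samples
        agreement = np.mean(source_labels[top] == pseudo[i])
        if agreement >= agree_ratio:            # retrieval confirms the pseudo-label
            keep_idx.append(i)
            keep_lab.append(pseudo[i])
    return np.array(keep_idx, dtype=int), np.array(keep_lab, dtype=int)
```

The confirmed patches and their pseudo-labels would then serve as the training set for fine-tuning; patches with low confidence, or whose retrieved neighbours disagree with the pseudo-label, are simply left out.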

In summary, the contributions of this paper are as follows:

- We propose a scheme to train transferable deep models, which enables land-cover classification of unlabeled multi-source RS images with high spatial resolution. In addition, we develop a hybrid land-cover classification that can simultaneously extract accurate category and boundary information from HRRS images. Experiments conducted on multi-source HRRS images, including Gaofen-2, Gaofen-1, Jilin-1, Ziyuan-3, Sentinel-2A, and Google Earth platform data, obtain promising results and demonstrate the effectiveness of the proposed scheme.
- We present a large-scale land-cover classification dataset, namely GID, which consists of 150 high-resolution Gaofen-2 images and covers more than $50{,}000\,\mathrm{km}^{2}$ in China. To our knowledge, GID is the first and largest well-annotated land-cover classification dataset with high-resolution remote sensing images at up to 4 m resolution. It provides the research community with a high-quality dataset to advance land-cover classification with HRRS images such as Gaofen-2 imagery.
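To illustrate how category and boundary information can be combined in the hybrid classification mentioned in the first contribution, the sketch below relabels every pixel with the majority class of the segment it belongs to. Majority voting is only one simple fusion rule, assumed here for illustration; the function name `refine_with_segments` and its arguments are hypothetical, and the segment map is assumed to come from any segmentation algorithm.

```python
import numpy as np

def refine_with_segments(class_map, segments, num_classes):
    """Assign each pixel the majority class of its segment.

    class_map  : (H, W) integer class per pixel from patch-wise classification
    segments   : (H, W) integer segment id per pixel from a segmentation step
    num_classes: number of land-cover classes
    """
    flat_cls = class_map.ravel()
    # Map arbitrary segment ids to consecutive indices 0..n_seg-1.
    _, inv = np.unique(segments.ravel(), return_inverse=True)
    n_seg = inv.max() + 1

    # votes[s, c] counts how many pixels of segment s were classified as c.
    votes = np.zeros((n_seg, num_classes), dtype=np.int64)
    np.add.at(votes, (inv, flat_cls), 1)

    majority = votes.argmax(axis=1)             # winning class per segment
    return majority[inv].reshape(class_map.shape)
```

Because segment boundaries follow object edges, the refined map inherits sharper boundaries than the blocky patch-wise map, while isolated misclassified pixels inside a segment are outvoted.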

A preliminary version of this work was presented in (Tong et al., 2018).

The remainder of the paper is organized as follows. Section 2 introduces related work. Section 3 presents our land-cover classification algorithm. Section 4 describes the details and properties of GID, together with the other images examined. Sections 5 and 6 present the experimental results and the sensitivity analysis, and Section 7 gives the discussion. Finally, Section 8 concludes the paper.

Section snippets

Related work

Land-cover Classification: Land-cover classification with RS images aims to associate each pixel in an RS image with a pre-defined land-cover category. To this end, classification approaches using spectral information have been intensively studied. These methods can interpret RS images using the spectral features of individual pixels (Jensen and Lulla, 1986; Gong et al., 1992; Casals-Carrasco et al., 2000), but their performance is often heavily affected by intra-class spectral variations and

Methodology

To efficiently conduct land-cover classification with multi-source HRRS images, we propose a scheme to train transferable deep models, which are pre-trained on a labeled land-cover dataset and can be applied to unlabeled HRRS images. Assume that there is a well-annotated large-scale dataset and a newly acquired image without labeling information. We define two domains, called the source domain $D_{S}$ and the target domain $D_{T}$, which are associated with the labeled and unlabeled images, respectively. Our aim is to

Experimental results

We test our algorithm and analyse the experimental results in this section. Two types of land-cover classification issues are examined: 1) transferring deep models to classify HRRS images captured with the same sensor and under different conditions, 2) transferring deep models to classify multi-source HRRS images. For performance comparison, several object-based land-cover classification methods are utilized. The implementation details, comparison methods, and evaluation metrics are introduced

Sensitivity analysis

In the previous section, the experimental results show the promising performance of the proposed method. However, some parameters have an impact on the classification results. In this section, we analyse and discuss these factors through additional experiments, including analyses of the patch size, the segmentation method, and the thresholds of the transfer learning scheme.

Discussion

Land-cover classification is closely tied to the ecological condition of the Earth's surface and has significant implications for global ecosystem health, water quality, and sustainable land management. Most studies on large-scale land-cover classification use low- or medium-spatial-resolution RS images; however, due to the lack of spatial information, these images are insufficient for detailed mapping of highly heterogeneous areas (Hu et al., 2013). By contrast, high-spatial

Conclusion

We present a land-cover classification algorithm that can be applied to classify multi-source HRRS images. The proposed algorithm has the following attractive properties: 1) it automatically selects training samples from the target domain based on the contextual information extracted by the deep model. Consequently, it does not require new manual annotation or algorithm adjustment when applied to multi-source images. 2) it uses multi-scale contextual information for classification.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 61922065, 61771350, 61871299 and 41820104006, in part by the Open Research Fund of the Key Laboratory of Space Utilization, Chinese Academy of Sciences, under Grant LSU-SJLY-2017-01, and in part by the Outstanding Youth Project of Hubei Province under Contract 2017CFA037.

References (107)

J.P. Ardila et al. Markov random field-based super-resolution mapping for identification of urban trees in VHR images. ISPRS J. Photogrammetry Remote Sens. (2011)
U.C. Benz et al. Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS J. Photogrammetry Remote Sens. (2004)
T. Blaschke. Object based image analysis for remote sensing. ISPRS J. Photogrammetry Remote Sens. (2010)
C. Burnett et al. A multi-scale segmentation/object relationship modelling methodology for landscape analysis. Ecol. Model. (2003)
D.C. Duro et al. A comparison of pixel-based and object-based image analysis with selected machine learning algorithms for the classification of agricultural landscapes using SPOT-5 HRG imagery. Remote Sens. Environ. (2012)
P. Gong et al. A comparison of spatial feature extraction algorithms for land-use classification with SPOT HRV data. Remote Sens. Environ. (1992)
B. Huang et al. Urban land-use mapping using a deep convolutional neural network with high spatial resolution multispectral remote sensing imagery. Remote Sens. Environ. (2018)
D.-H. Lee. Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks
L. Ma et al. A review of supervised object-based land-cover image classification. ISPRS J. Photogrammetry Remote Sens. (2017)
L. Ma et al. A review of supervised object-based land-cover image classification. ISPRS J. Photogrammetry Remote Sens. (2017)
R. Mathieu et al. Mapping private gardens in urban areas using object-oriented techniques and very high-resolution satellite imagery. Landsc. Urban Plan. (2007)
S.W. Myint et al. Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery. Remote Sens. Environ. (2011)
P. Olofsson et al. Good practices for estimating area and assessing accuracy of land change. Remote Sens. Environ. (2014)
F. Pacifici et al. A neural network approach using multi-scale textural metrics from very high-resolution panchromatic imagery for urban land-use classification. Remote Sens. Environ. (2009)
Y. Shao et al. An evaluation of time-series smoothing algorithms for land-cover classifications using MODIS-NDVI multi-temporal data. Remote Sens. Environ. (2016)
Y. Sheng et al. Representative lake water extent mapping at continental scales using multi-temporal Landsat-8 imagery. Remote Sens. Environ. (2016)
F. Vuolo et al. How much does multi-temporal Sentinel-2 data improve crop type classification? Int. J. Appl. Earth Obs. Geoinf. (2018)
C. Zhang et al. A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification. ISPRS J. Photogrammetry Remote Sens. (2018)
C. Zhang et al. An object-based convolutional neural network (OCNN) for urban land use classification. Remote Sens. Environ. (2018)
N. Audebert et al. How useful is region-based classification of remote sensing images in a deep learning framework?
V. Badrinarayanan et al. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (2017)
J.A. Benediktsson et al. Classification of hyperspectral data from urban areas based on extended morphological profiles. IEEE Trans. Geosci. Remote Sens. (2005)
T. Blaschke. What's wrong with pixels? Some recent developments interfacing remote sensing and GIS. GeoBIT/GIS (2001)
L. Bruzzone et al. A multilevel context-based system for classification of very high spatial resolution images. IEEE Trans. Geosci. Remote Sens. (2006)
L. Bruzzone et al. A novel transductive SVM for semisupervised classification of remote-sensing images. IEEE Trans. Geosci. Remote Sens. (2006)
L. Bruzzone et al. A novel approach to the selection of spatially invariant features for the classification of hyperspectral images with improved generalization capability. IEEE Trans. Geosci. Remote Sens. (2009)
P. Casals-Carrasco et al. Application of spectral mixture analysis for terrain evaluation studies. Int. J. Remote Sens. (2000)
S. Chakraborty et al. Active batch selection via convex relaxations with guaranteed solution bounds. IEEE Trans. Pattern Anal. Mach. Intell. (2015)
L.C. Chen et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. (2018)
B. Demir et al. Detection of land-cover transitions in multitemporal remote sensing images with active-learning-based compound classification. IEEE Trans. Geosci. Remote Sens. (2012)
B. Demir et al. Definition of effective training sets for supervised classification of remote sensing images by a novel cost-sensitive active learning method. IEEE Trans. Geosci. Remote Sens. (2014)
J. Deng et al. ImageNet: a large-scale hierarchical image database
M. Fauvel et al. Advances in spectral-spatial classification of hyperspectral images. Proc. IEEE (2013)
P.F. Felzenszwalb et al. Efficient graph-based image segmentation. Int. J. Comput. Vis. (2004)
W. Ge et al. Borrowing treasures from the wealthy: deep transfer learning through selective joint fine-tuning
M. Gerke et al. ISPRS semantic labeling contest
S. Giada et al. Information extraction from very high resolution satellite imagery over Lukole refugee camp, Tanzania. Int. J. Remote Sens. (2003)
L. Gómez-Chova et al. Semisupervised image classification with Laplacian support vector machines. IEEE Geosci. Remote Sens. Lett. (2008)
R.M. Haralick et al. Textural features for image classification. IEEE Trans. on Systems, Man, and Cybernetics (1973)
K. He et al. Deep residual learning for image recognition
F. Hu et al. Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery. Remote Sens. (2015)
F. Hu et al. Fast binary coding for the scene classification of high-resolution remote sensing imagery. Remote Sens. (2016)
F. Hu et al. Deep sparse representations for land-use scene classification in remote sensing images
J. Hu et al. A comparative study of sampling analysis in the scene classification of optical high-spatial resolution remote sensing imagery. Remote Sens. (2015)
Q. Hu et al. Exploring the use of Google Earth imagery and object-based methods in land use/cover mapping. Remote Sens. (2013)
E. Izquierdo-Verdiguier et al. Encoding invariances in remote sensing image classification with SVM. IEEE Geosci. Remote Sens. Lett. (2013)
J.R. Jensen et al. Introductory digital image processing: a remote sensing perspective. Geocarto Int. (1986)
T.-B. Jiang et al. Retrieving aerial scene images with learned deep image-sketch features. J. Comput. Sci. Technol. (2017)
G. Jun et al. Spatially adaptive classification of land cover with remote sensing data. IEEE Trans. Geosci. Remote Sens. (2011)
A. Krizhevsky et al. ImageNet classification with deep convolutional neural networks

Cited by (581)

Classifying raw irregular time series (CRIT) for large area land cover mapping by adapting transformer model
2024, Science of Remote Sensing
For Landsat land cover classification, the time series observations are typically irregular in the number of observations in a period (e.g., a year) and acquisition dates due to cloud cover variations over large areas and acquisition plan variations over long periods. Compositing or temporal percentile calculation are usually used to transform the irregular time series to regular temporal variables so that the machine and deep learning classifiers can be applied. Recognizing that the composite and percentile calculations have information loss, this study presents a method directly Classifying the Raw Irregular Time series (CRIT) (‘raw’ means irregular good-quality surface reflectance time series without any composite or temporal percentile derivation) by adapting Transformer. CRIT uses the acquisition day of year as classification input to align time series and also takes the Landsat satellite platform (Landsat 5, 7 and 8) as input to address the inter-sensor reflectance differences.
The CRIT was demonstrated by classifying Landsat analysis ready data (ARD) surface reflectance time series acquired across one year for three years (1985, 2006 and 2018) over the Conterminous United States (CONUS) with both spatial and temporal variations in Landsat availability. 20,047 training and 4949 evaluation 30-m pixel were used where each pixel was annotated as one of seven land cover classes for each year. The CRIT was compared with classifying 16-day composite time series and temporal percentiles and compared with a 1D convolution neural network (CNN) method. Results showed that the CRIT trained with three years of samples had 1.4–1.5% higher overall accuracies with less computation time than classifying 16-day composites and 2.3–2.4% higher than classifying temporal percentiles. The CRIT advantages over 16-day composites were pronounced for developed (0.05 F1-score) and cropland (0.02 F1-score) classes and for mixed or boundary pixels. This was reasonable as the 16-day composites had only on average 7.02, 16.49 and 15.78 good quality observations for the three years, respectively, in contrast to 7.89, 27.72, and 26.60 for the raw irregular time series. The CNN was not as good as CRIT in classifying the raw irregular time series as CNN simply filling temporal positions with no observations as zeros while the CRIT used a masking mechanism to rule out their contribution. The CRIT can also take the pixel coordinates and DEM variables as input which further increased the overall accuracies by 1.1–2.6% and achieved 84.33%, 87.54% and 87.01% overall accuracies for the 1985, 2006 and 2018 classifications, respectively. The CRIT land cover maps were shown consistent with the USGS Land Change Monitoring, Assessment, and Projection (LCMAP) maps. The developed codes, training data and maps were made publicly available.
Evaluating deep learning methods applied to Landsat time series subsequences to detect and classify boreal forest disturbances events: The challenge of partial and progressive disturbances
2024, Remote Sensing of Environment
The monitoring of forest ecosystems is significantly affected by the lack of consistent historical data of low-severity (forest partially disturbed) or gradual disturbance (e.g. eastern spruce budworm epidemic). The goal of this paper is to explore the use of a subset of Landsat time series and deep learning models to identify both the type and the year of disturbances, including low-severity and gradual disturbances, in the boreal forest of eastern Canada at the pixel level. Remote sensing data such as the spectral information from Landsat time series are the best available option for large scale observations of disturbances that go back decades. Traditional modeling approaches, like LandTrendr, require substantial handcrafted pre-processing to remove noise and to extract temporal features from the image sequences before using them as input to a classical machine-learning model. Deep-learning models can autonomously discern which features are relevant within the coarse temporal and spectral information from the Landsat annual dense time series.
We evaluated the performance of TempCNN and Transformer model in detecting and classifying the type and the year of the forest disturbance using Landsat time series subsequences. Our findings resulted in the generation of four disturbance maps outlining the forest history from 1986 to 2021 within the eastern Canadian boreal forest. Our experimental outcomes demonstrate several significant benefits of employing deep learning models. Firstly, using noisy Landsat time series they achieve comparable accuracy for classifying fire and total harvesting than existing publicly available disturbance maps. Secondly, the use of shorter time series subsequence with deep learning models enables to map adequately different overlapping disturbances occurring in the complete time series. Finally, they increase the number of distinguishable disturbance classes by adding partial harvesting, gradual disturbances, and forest recovery from older events, making them useful approaches for obtaining the first remote sensing-based map for areas affected by the eastern spruce budworm.
<sup>Using ZY1-02D satellite hyperspectral remote sensing to monitor landscape diversity and its spatial scaling change in the Yellow River Estuary</sup>
2024, International Journal of Applied Earth Observation and Geoinformation
Monitoring and assessing wetland diversity is crucial for its accurate preservation. Hyperspectral satellites have been proven effective for detailed investigations of plant diversity in many places. However, it's unclear whether spectral diversity invert landscape diversity, and whether the inversion accuracy varies with spatial scale. In this study, the ZY1-02D hyperspectral remote sensing images of the Yellow River Estuary were supervised and classified by the support vector machine. Then, the landscape diversity indices (i.e., community richness, Shannon-Wiener index, Simpson index, and Pielou index) and spectral diversity indices (i.e., coefficient of variation, convex hull volume, and eight vegetation indices) were calculated. A random forest model was used to predict landscape diversity by using spectral diversity. The spatial scale relationship between spectral diversity and landscape diversity were explored lastly. Our results showed that the overall accuracy of plant community classification in the Yellow River Estuary was 91.53 %, with a Kappa coefficient of 0.90. Spectral diversity had the best inversion accuracy on the Shannon-Wiener index (14 ∼ 57 %, average = 38 %), while the intermediate on Pielou index (3 ∼ 56 %, average = 30 %) and community richness (2 ∼ 48 %, average = 30 %), but the lowest on the Simpson index (2 ∼ 43 %, average = 16 %). The inversion accuracy of landscape diversity index increased first and then stabilized with the increase of scales, reaching stability at a sampling size of 2880 m × 2880 m. Our results indicated that ZY1-02D hyperspectral data can be used to monitor spatial changes of landscape diversity in wetland systems. However, its accuracy is affected by diversity index type and spatial scaling effects. Our findings provide a new perspective for the conservation and management of large-scale wetland landscape diversity.
Design of on-chip coded high-resolution 2D imaging via 3D compressed sensing
2024, Optics and Lasers in Engineering
This manuscript introduces a computational imaging approach where a static coded aperture is integrated with the image sensor to replace the extra Fourier optical elements or dynamic modulation in the previous computational high/super-resolution tasks. A two-dimensional (2D) high-resolution image with $N \times N$ pixels is formulated as a three-dimensional (3D) low-resolution cube with $\frac{N}{k} \times \frac{N}{k} \times k^{2}$ voxels in the forward model where N is the spatial resolution and k is a factor, respectively. Thus, the 2D image reconstruction can be performed with a 3D compressed sensing model in the snapshot-compressive-imaging format which has been mathematically proved to be convergent. Our proposed method successfully performs on the scaling factors such as 4× enlargement and the PSNR gains for natural images, remote sensing images and infrared image are 1.8 dB, 3.5 dB and 1.3 dB, respectively with only 1000 paired training images.
Scale-aware deep reinforcement learning for high resolution remote sensing imagery classification
2024, ISPRS Journal of Photogrammetry and Remote Sensing
Land-use/land-cover (LULC) classification of high spatial resolution (HSR) remote sensing imagery has been successfully improved using deep learning techniques. However, the current deep learning-based classification methods necessitate the division of remote sensing imagery into smaller and fixed image patches, primarily due to computational constraints arising from the extensive size of these images. This approach limits the receptive field of the classification network and hinders the handling of different-scale LULC objects. A key problem is how to automatically select the appropriate scale of patch for different objects with a deep learning network. To address this challenge, a scale-aware classification network (SAN) based on deep reinforcement learning (DRL) is proposed. In SAN, the state of each image patch is represented by a reduced-resolution version of the high-spatial-resolution (HSR) remote sensing image, referred to as a 'thumbnail', and a positional encoding. The scale selection actions are performed by a scale control agent. A feature indexing module is also proposed to enhance the ability of the agent to distinguish the location of the current image patch. The action switches the patch scale and the viewing area of context branch of a two-branch classification network, which extracts and fuses the features of the multi-scale images. The SAN framework adjusts the network parameters to perform the appropriate scale selection action based on the mapping reward received for the selected scale. In this way, the SAN framework is able to introduce more appropriate contexts by adjusting the scale of the network input based on RL, without the need for labeled scale selection samples. The experimental results obtained using two publicly available datasets and a newly built dataset demonstrate that SAN outperforms the previous LULC deep learning methods with fixed patches, particularly for large-scale mapping applications. 
When compared to state-of-the-art approaches such as GLNet and WiCoNet, which combine global and local information for segmentation, as well as CascadePSP and MagNet, renowned for their progressive segmentation capabilities, SAN consistently demonstrates approximately 10% higher accuracy. The codes for this research are openly available at http://rsidea.whu.edu.cn/resource_sharing.htm.
Self-training guided disentangled adaptation for cross-domain remote sensing image semantic segmentation
2024, International Journal of Applied Earth Observation and Geoinformation
Remote sensing (RS) image semantic segmentation using deep convolutional neural networks (DCNNs) has shown great success in various applications. However, the high dependence on annotated data makes it challenging for DCNNs to adapt to different RS scenes. To address this challenge, we propose a cross-domain RS image semantic segmentation task that considers ground sampling distance, remote sensing sensor variation, and different geographical landscapes as the main factors causing domain shifts between source and target images. To mitigate the negative impact of domain shift, we propose a self-training guided disentangled adaptation network (ST-DASegNet) that consists of source and target student backbones to extract source-style and target-style features. To align cross-domain single-style features, we adopt feature-level adversarial learning. We also propose a domain disentangled module (DDM) to extract universal and distinct features from single-domain cross-style features. Finally, we fuse these features and generate predictions using source and target student decoders. Moreover, we employ an exponential moving average (EMA) based cross-domain separated self-training mechanism to reduce the instability and adverse effects that arise during adversarial optimization. Our experiments on several prominent RS datasets (Potsdam, Vaihingen, and LoveDA) demonstrate that ST-DASegNet outperforms previous methods and achieves new state-of-the-art results. Visualization and analysis also confirm the interpretability of ST-DASegNet. The code is publicly available at https://github.com/cv516Buaa/ST-DASegNet.
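The EMA-based self-training mentioned in the abstract above is commonly built on a teacher model whose weights track a running average of the student's weights. The following is only a minimal sketch of that generic update rule, not the ST-DASegNet implementation: the function name `ema_update`, the dictionary-of-arrays weight format, and the decay value are all assumptions made for illustration.

```python
import numpy as np


def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights: decay * teacher + (1 - decay) * student.

    `teacher` and `student` are hypothetical dicts mapping parameter names
    to NumPy arrays of matching shape.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy weights (illustrative only): the teacher starts at 0, the student at 1.
teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}

# One EMA step with decay 0.9 moves the teacher 10% of the way to the student.
teacher = ema_update(teacher, student, decay=0.9)
print(teacher["w"])
```

A decay close to 1 makes the teacher change slowly, which is the usual reason such a teacher yields steadier pseudo-labels than the student it averages.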

