Abstract

With ongoing urbanization, land surface temperature (LST), a vital variable for the urban environment, is in high demand for urban-related studies, especially LST with both fine temporal and fine spatial resolution. Thermal sharpening methods have been developed to meet this demand, and several have been proposed specifically for urban surfaces. However, the evaluation of their accuracy has so far been limited to statistical measures and has not included spatial information. It is widely acknowledged that the spatial pattern of the thermal environment in an urban area is critical for urban-related studies (e.g., urban heat island studies). Thus, this paper selected three typical methods from the limited number of thermal sharpening methods designed for urban areas and compared them with a newly proposed thermal sharpening method, the superresolution-based thermal sharpener (SRTS). The four methods were tested with data from different seasons to explore the seasonal impact, and the accuracy for different land covers was also explored. Furthermore, accuracy was evaluated not only with the statistical variables commonly used in other studies but also in terms of the spatial pattern, which is equally important for urban-related studies. The spatial pattern was analyzed qualitatively and was also quantified by variables that allow accuracy to be compared. All methods obtained lower accuracies for winter data than for data from other seasons. Linear water features and the areas along them are difficult for most methods to detect correctly.

1. Introduction

It is widely acknowledged that the urban heat island (UHI) is becoming a detrimental phenomenon, leading to several social and environmental issues such as poor air quality, high energy demand, and even human mortality [1, 2]. Land surface temperature (LST), which can be derived from remote sensing data, is an essential variable for UHI studies and has been widely used in the literature [3]. Unfortunately, current satellite sensor data are still reported to provide inadequate detail for urban-related studies, which demand data with fine resolution in both the spatial and temporal dimensions [4]. Sobrino et al. [5] explored the suitable spatial and temporal resolutions for UHI studies and suggested a spatial resolution finer than 50 m and a revisit interval of 1-2 days. Current satellite thermal data cannot meet this requirement because there is a trade-off between the spatial and temporal resolutions of current remote sensing data. Moreover, this trade-off is difficult to address through advances in hardware because of the physical principles of remote sensing [6-8]. Thus, thermal sharpening techniques have been proposed.

The accuracy of thermal sharpening usually depends strongly on the physical meaning behind the relationship between LST and the sharpening predictors. If the relationship and the sharpening predictors have a strong physical meaning (e.g., the strong correlation between vegetation and LST in summer), the accuracy of a thermal sharpening method is usually acceptable. In light of this, researchers have used more land cover information in thermal sharpening, especially for areas with complicated surface compositions such as urban areas [9, 10]. Studies based on regression models for downscaling have often used spectral indices (e.g., the normalized difference water index (NDWI) and the normalized difference built-up index (NDBI)) to represent different land covers [8, 11, 12], while others have used land cover data directly in the downscaling process [13, 14]. The accuracy assessments show that, for urban areas, methods considering more land covers achieved a higher accuracy than methods based only on the NDVI-LST relationship. However, although the advantage of bringing more land cover information into thermal sharpening has been demonstrated, as Sismanidis et al. [15] pointed out, the evaluation system of current thermal sharpening methods for urban areas may still need some improvements.

First, the results of many studies are based on only one image and hence may be of limited generalizability [8, 16, 17], particularly when the methods are applied to images acquired at different times or under different conditions (e.g., summer or winter) or to different land cover mosaics. Therefore, it is better to use more than one image as the testing dataset to reduce the occurrence of bias. Furthermore, LST is a highly changeable variable that varies hourly, daily, and seasonally. Sismanidis et al. [15] pointed out that most spatial-resolution-enhanced LST would actually be used for UHI monitoring and analysis, which are time series applications. Thus, the evaluation should at least consider the performance of a new method in different biomes, seasons, topography, and climatic conditions, all of which affect the relationship between LST and its predictors in time series studies. Although some studies have reported the use of thermal sharpened LST in time series applications, they mainly used images collected between June and September [15, 18, 19]. This is because these studies usually used methods relying on the NDVI-LST relationship and thus needed to choose months with a reliable NDVI-LST relationship to guarantee the accuracy. Therefore, there is still a lack of research exploring the accuracy of thermal sharpening methods for data from different seasons.

The second issue is the evaluation of the spatial pattern of the thermal sharpened LST. Currently, most studies rely on statistical measures for the assessment (e.g., RMSE, mean absolute error (MAE), and correlation coefficient), which focus on the accuracy of the absolute LST value of each pixel but do not consider any spatial information [9, 15, 20, 21]. However, spatial pattern information is equally critical for UHI studies [15]. How to evaluate it quantitatively and adequately remains a challenging issue. Researchers have tried to measure the spatial extent and magnitude of the UHI or to use variables such as local Moran's I to quantify the spatial information in LST images [15, 22]. Unfortunately, there is still no widely acknowledged method to define the similarity of the spatial pattern between the predicted and reference LSTs.

The third issue is the evaluation of accuracy for different land cover types. Although it is acknowledged that more land cover information can enhance the accuracy of thermal sharpening for urban areas, analyses of the accuracy of the sharpened LST for different land covers are rare. Heat capacity is a significant factor affecting LST. Different materials have different heat capacities, and thus different land covers have different LSTs. Furthermore, land covers may change differently between seasons. For example, the LST of water does not decrease as much as that of an impervious surface from summer to winter, because the heat capacity of water is larger than that of the impervious surface. The LST of land covers has a direct impact on the spatial pattern of LST. This difference may therefore raise questions such as the following: Does the spatial pattern of LST, which is influenced significantly by land covers, change between seasons? Is the spatial pattern of LST in summer the same as it is in winter? If it changes, how does it change? These questions remain unanswered because there is little spatial analysis of LST for different land covers in different seasons.

This research aimed to provide a comprehensive quantitative comparison, both statistical and spatial, that considers the main factors affecting the accuracy of thermal sharpening methods, such as seasons and land covers. It illustrates the advantage of a newly proposed thermal sharpening method and analyzes the accuracy of different thermal sharpening methods proposed for urban areas, including the new method, using data from different seasons, which corresponds to the first issue above. In addition to the statistical assessment of accuracy, the spatial pattern, which is equally important for urban-related studies, was also evaluated. The study used quantitative variables, and not only qualitative description, to compare the accuracy of the spatial pattern of LST, which provides a potential way to deal with the second issue. Furthermore, the spatial analysis was also conducted for different land cover types, which addresses the third issue discussed above.

2. Data and Methods

2.1. Data

Greater London was chosen as the study area because it provides a complex surface composition. The testing data used in this research include Landsat ETM+ and MODIS images, while the validation data are ASTER images (Figure 1).

2.2. Experiment Design

The aim of this study is to provide a comprehensive comparison of different methods in both statistical and spatial aspects and also to analyze the reliability and accuracy of the newly proposed thermal sharpening method SRTS. Thus, three other thermal sharpening methods proposed for the urban surface, namely, Emissivity Modulation (EM) [14], Pixel Block Intensity Modulation (PBIM) [23, 24], and the adjusted stratified stepwise regression method (Stepwise) [11], were chosen for comparison with SRTS. These three methods were chosen because they consider more than three land cover types, which are essential factors in thermal sharpening. Another reason is that their study areas are also metropolises that are comparable with each other (i.e., Hong Kong, Athens, Shanghai, and London). All four methods were tested with data from different seasons. Moreover, LSTs for different land cover types were also compared and analyzed.

The evaluation of thermal sharpening methods was conducted from two perspectives: evaluation by statistical variables, including the RMSE and correlation coefficient of the result produced by each method, and evaluation of the spatial pattern of the thermal sharpened LST results. In this study, fuzzy similarity, which considers the spatial characteristics in the neighborhood of each pixel, was used to evaluate the spatial pattern of LST [25]. It calculates a similarity value for each pixel of the predicted LST image based on the reference image. In addition, the spatial pattern of LST was evaluated by visual comparison with the vector boundaries of selected objects representing different land covers. Here, the impervious surface, water, and vegetation are considered the main land cover types of the urban surface [26]. Objects for water and vegetation are relatively easy to choose, as lakes and parks usually have a fairly clear closed boundary. For the impervious surface, which is usually built as open areas or connected to other impervious surfaces (e.g., roads), a closed boundary is difficult to define. Therefore, in the visual comparison, the analysis of the impervious surface focused on the relative comparison between the LSTs of pixels inside and outside the objects representing the impervious surface, for instance, whether in the daytime in summer the pixels inside the impervious surface objects obtained a higher LST than their surroundings that are not impervious surface. Although qualitative, this can still support the accuracy assessment of the predicted spatial pattern of the LST image to some extent and can reflect the sensitivity of each method to the LST variation between different land covers.
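
The two statistical variables named above can be computed as in the following minimal sketch. It assumes that the sharpened and reference LST images are co-registered NumPy arrays of equal shape; the function names and the toy values are illustrative and are not taken from the study.

```python
import numpy as np

def rmse(predicted, reference):
    """Root mean square error between two co-registered LST arrays."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def correlation(predicted, reference):
    """Pearson correlation coefficient computed over all pixels."""
    p = np.asarray(predicted, dtype=float).ravel()
    r = np.asarray(reference, dtype=float).ravel()
    return float(np.corrcoef(p, r)[0, 1])

# Toy example: the kelvin values are made up for illustration only.
reference = np.array([[290.0, 295.0], [300.0, 305.0]])
predicted = reference + 2.0  # a uniform 2 K warm bias

print(rmse(predicted, reference))         # 2.0
print(correlation(predicted, reference))  # very close to 1
```

In the toy example a uniform bias leaves the correlation coefficient at essentially 1 while the RMSE is 2 K, which illustrates why the two variables carry different information.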

2.3. Superresolution-Based Thermal Sharpener

Currently, there are two main strategies to enhance the spatial resolution of LST. The first is to process the coarse spatial resolution LST directly with its fine resolution predictors by using the empirical relationship between them extracted by statistical algorithms, while the second is to enhance the spatial resolution of the elements from which LST is retrieved (e.g., thermal radiance, atmospheric profiles). The newly proposed SRTS is a framework that first enhances the spatial resolution of the retrieval elements by superresolution mapping (SRM) and superresolution reconstruction (SRR) and then derives the LST from the resolution-enhanced elements. The framework is shown in Figure 2.

2.3.1. Hopfield Neural Network-Based SRM

One of the elements from which LST is retrieved is the land surface emissivity, which can be derived from a land cover map. SRM enhances the spatial resolution of the classification and thus can enhance the resolution of the emissivity.

The Hopfield neural network (HNN) is a fully connected recurrent network and thus can be used to represent the image in image processing [27].

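The energy function of the HNN for SRM is defined as

$$E = \sum_{h=1}^{C} \sum_{i} \sum_{j} \left( k_1 G^{ON}_{hij} + k_2 G^{OFF}_{hij} + k_3 P_{hij} + k_4 M_{hij} \right),$$

where $G^{ON}_{hij}$ and $G^{OFF}_{hij}$ are the goal functions at a neuron $(h, i, j)$; $P_{hij}$ and $M_{hij}$ are the proportional information constraint and the multiclass constraint, respectively; $C$ is the number of land cover types; and $k_1$, $k_2$, $k_3$, and $k_4$ are the weight constants for each element of the energy function. The rate of change for the energy function for neuron $(i, j)$ of class $h$, with output $v_{hij}$, is defined as

$$\frac{dE_{hij}}{dv_{hij}} = k_1 \frac{dG^{ON}_{hij}}{dv_{hij}} + k_2 \frac{dG^{OFF}_{hij}}{dv_{hij}} + k_3 \frac{dP_{hij}}{dv_{hij}} + k_4 \frac{dM_{hij}}{dv_{hij}}.$$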

The two goal functions, $G^{ON}_{hij}$ and $G^{OFF}_{hij}$, in the energy function for SRM represent two forces pushing the output of a neuron to 1 or 0. The aim of the increasing function $G^{ON}_{hij}$ is to raise the output of the neuron when the average output of the eight neighboring subpixels is greater than a threshold, whereas the decreasing function $G^{OFF}_{hij}$ pushes the output toward 0 when the average is less than the threshold.

The proportion constraint is used to guarantee that the proportional information derived from soft classification is maintained while the goal functions and other constraints are satisfied; otherwise, this constraint affects the output of the energy function.

The multiclass constraint plays a role similar to that of the proportional constraint: it adds another limitation that needs to be satisfied when the goal functions achieve their aim. The basic idea is to ensure that each subpixel is assigned only one land cover type, which means that no subpixel is left unclassified or covered by several types.

The rate of change of the neuron can be obtained once the rate of change for the energy function has been derived. The HNN for SRM can then be updated with the Euler method at each time step, $\Delta t$, until the change in the network state falls below $\varepsilon$, where $\varepsilon$ is a very small value, or until the number of iterations reaches a preset limit.
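
The Euler iteration described above can be sketched generically as follows. This is not the SRTS implementation: the sketch assumes the common Hopfield formulation in which the neuron input u evolves as du/dt = -dE/dv and the output is v = 0.5(1 + tanh(gain * u)). The function `run_hopfield`, its parameters, and the toy gradient are hypothetical; a real SRM run would supply the weighted sum of the goal-function and constraint derivatives as `energy_gradient`.

```python
import numpy as np

def run_hopfield(energy_gradient, shape, dt=0.1, gain=1.0, eps=1e-6, max_iter=10000):
    """Iterate neuron states with explicit Euler steps until the outputs settle."""
    u = np.zeros(shape)                    # neuron inputs
    v = 0.5 * (1.0 + np.tanh(gain * u))    # neuron outputs, bounded in [0, 1]
    for _ in range(max_iter):
        u = u - dt * energy_gradient(v)    # Euler step, assuming du/dt = -dE/dv
        v_new = 0.5 * (1.0 + np.tanh(gain * u))
        change = np.max(np.abs(v_new - v))
        v = v_new
        if change < eps:                   # stop once the outputs barely change
            break
    return v

# Toy gradient (made up): pushes neurons toward a fixed binary target.
target = np.array([[1, 0], [0, 1]])

def toy_gradient(v):
    # Negative gradient raises the output, positive gradient lowers it.
    return np.where(target == 1, -1.0, 1.0)

outputs = run_hopfield(toy_gradient, target.shape)
print(np.round(outputs, 3))
```

The two stopping rules in the loop mirror the text: a small tolerance on the change between iterations and a cap on the number of iterations.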

2.3.2. Sparse Representation-Based SRR

SRR enhances a coarse spatial resolution image, using one or a series of images, through the empirical relationship extracted by training algorithms. The basic idea of the sparse representation model is that a vector (signal/image) can be represented by a sparse linear combination of some vectors (prototypes) contained in a dictionary matrix ($D$), which contains all the possible arrangements of the elements in a vector (signal/image) [28]:

$$x = D\alpha,$$

where $x$ is the signal or image that needs to be represented by the sparse representation model and $\alpha$ is a vector containing a small number of nonzero elements, recording the coefficients of the sparse linear combination. To make the features, which are also called the prototypes in a dictionary, as typical as possible, the dictionary is usually trained in patches by dictionary training methods.

This model is primarily popular among data compression techniques, as it reduces the records for images/signals to the minimum. However, researchers have found that images covering the same area but with different spatial resolutions share the same sparse vector but use different dictionaries [28, 29]. This means that each image patch pair containing a coarse spatial resolution patch ($y_l$) and its corresponding fine spatial resolution patch ($y_h$) can be represented as

$$y_l = D_l\alpha, \qquad y_h = D_h\alpha,$$

where $D_l$ and $D_h$ are the coarse and fine resolution dictionaries and $\alpha$ is the sparse vector they share.

This discovery makes spatial resolution enhancement possible. If the coarse and fine resolution dictionaries, $D_l$ and $D_h$, are known from training, they can be used directly with a coarse spatial resolution image to derive the corresponding fine spatial resolution image patch by patch. However, the dictionary training here is for an image pair, which differs from the traditional training process for a single image. Therefore, some modifications are made to train the dictionary.
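
The patch-by-patch use of a trained dictionary pair can be sketched as follows. The sketch assumes that the sparse vector is estimated with a basic orthogonal matching pursuit (OMP), which is one common choice and is not confirmed as the solver used in SRTS; the functions `omp` and `sharpen_patch` and the toy dictionaries are hypothetical.

```python
import numpy as np

def omp(dictionary, signal, n_nonzero):
    """Greedy sparse coding; assumes the dictionary columns have unit norm."""
    signal = np.asarray(signal, dtype=float)
    residual = signal.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(dictionary.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Refit all selected atoms by least squares.
        coef, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    alpha = np.zeros(dictionary.shape[1])
    alpha[support] = coef
    return alpha

def sharpen_patch(coarse_patch, dict_low, dict_high, n_nonzero=3):
    """Code the coarse patch on dict_low, then rebuild it with dict_high."""
    alpha = omp(dict_low, coarse_patch, n_nonzero)
    return dict_high @ alpha

# Toy dictionaries (random, for illustration only).
rng = np.random.default_rng(0)
dict_low, _ = np.linalg.qr(rng.normal(size=(9, 9)))   # orthonormal atoms
dict_high = rng.normal(size=(36, 9))
alpha_true = np.zeros(9)
alpha_true[[2, 7]] = [1.5, -0.5]

coarse = dict_low @ alpha_true
fine = sharpen_patch(coarse, dict_low, dict_high, n_nonzero=2)
print(np.allclose(fine, dict_high @ alpha_true))
```

The toy dictionary is orthonormal so that the recovery is exact; trained dictionaries are usually overcomplete, in which case the sparse code is only an approximation.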

Firstly, because the coarse and fine resolution patches, $y_l$ and $y_h$, share the same sparse vector $\alpha$, the dictionaries $D_l$ and $D_h$ cannot be trained separately as if for two unrelated images; otherwise, they may yield different vectors $\alpha$. The strategy used for dictionary training in this condition is joint dictionary training [28].

The second modification is the extraction strategy for training samples. For traditional SRR, the training images should be image pairs covering the same area at different spatial resolutions. However, to simplify the preparation of training data, the sparse representation-based SRR uses each training image as the fine spatial resolution image and generates its corresponding coarse spatial resolution image by blurring and downsampling. Furthermore, the sparse representation-based SRR does not use the whole training image but randomly extracts sample patches from the training images, which saves training time and reduces the size of the training database.
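
The sample extraction strategy can be illustrated with the following minimal sketch. It assumes that blurring and downsampling are approximated by block averaging, which is one simple choice and not necessarily the degradation used in SRTS; the function names, the scale factor, and the patch size are hypothetical.

```python
import numpy as np

def degrade(fine, factor):
    """Blur and downsample in one step by averaging factor x factor blocks."""
    fine = np.asarray(fine, dtype=float)
    rows = fine.shape[0] - fine.shape[0] % factor   # crop to a multiple of factor
    cols = fine.shape[1] - fine.shape[1] % factor
    blocks = fine[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

def sample_patch_pairs(fine, factor, patch, n_samples, seed=0):
    """Randomly extract co-located (coarse, fine) patch pairs as flat vectors."""
    rng = np.random.default_rng(seed)
    fine = np.asarray(fine, dtype=float)
    coarse = degrade(fine, factor)
    pairs = []
    for _ in range(n_samples):
        r = int(rng.integers(0, coarse.shape[0] - patch + 1))
        c = int(rng.integers(0, coarse.shape[1] - patch + 1))
        low = coarse[r:r + patch, c:c + patch]
        high = fine[r * factor:(r + patch) * factor, c * factor:(c + patch) * factor]
        pairs.append((low.ravel(), high.ravel()))
    return pairs

# Toy training image (made up): an 8 x 8 ramp.
image = np.arange(64, dtype=float).reshape(8, 8)
pairs = sample_patch_pairs(image, factor=2, patch=2, n_samples=5)
print(len(pairs), pairs[0][0].size, pairs[0][1].size)  # 5 4 16
```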

Finally, the third modification is to add feature extraction for the coarse spatial resolution patches. To make the derived sparse coefficients fit the most relevant part of the coarse spatial resolution signal, feature extraction is adopted to highlight the features of interest. Generally, this process can be some kind of high-pass filter [30].
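
As an illustration of such a high-pass step, the sketch below uses first-order central differences along rows and columns. This particular filter is an assumption made for illustration; the text only states that some kind of high-pass filter is used, and the function name is hypothetical.

```python
import numpy as np

def gradient_features(patch):
    """Stack horizontal and vertical central differences of a patch."""
    p = np.asarray(patch, dtype=float)
    gx = np.zeros_like(p)
    gy = np.zeros_like(p)
    gx[:, 1:-1] = p[:, 2:] - p[:, :-2]   # response to changes along columns
    gy[1:-1, :] = p[2:, :] - p[:-2, :]   # response to changes along rows
    return np.concatenate([gx.ravel(), gy.ravel()])

flat = np.full((4, 4), 300.0)                    # uniform patch
ramp = np.tile(np.arange(4, dtype=float), (4, 1))  # increases along columns

print(gradient_features(flat).max())  # 0.0
print(gradient_features(ramp).max())  # 2.0
```

A uniform patch produces zero features, so the sparse coding is driven by local contrast and not by the absolute level of the patch.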

3. Results and Analyses

3.1. Statistical Analyses

To allow comparison with other studies, which usually use statistical variables to assess their results, statistical analyses were carried out first. Table 1 lists the RMSE and correlation coefficient of each method against the reference LST derived from ASTER data.

Table 1 shows that the RMSEs of PBIM and Stepwise appear unacceptable, as they are much higher than those usually reported in previous studies. Even for EM and SRTS, an RMSE of around 5°C does not seem to be a good result either (normally <5°C) [9, 11, 20]. However, previous studies used degraded data and not real data sources for all inputs and reference data. Some studies claimed that using degraded data could avoid the small cross-scale georeferencing inaccuracies caused by the use of data from different sources [16, 31, 32]. However, it is almost unavoidable to use data from different platforms in real applications of thermal sharpened LST [15, 18]. Therefore, the accuracies reported with degraded data, although ideal, may not be achievable in real applications. In contrast, the accuracy assessment reported in this study, though not as good as those reported in previous studies, can be a more practical and reliable reference for those who would like to use the thermal data in real applications, because all the experiments are based on real data from different sources.

The large RMSEs of PBIM and Stepwise might be explained by the correlation analyses (Figure 3). The LST ranges of PBIM and Stepwise are much wider than that of the reference LST. Given that the study area has a temperate oceanic climate, extreme LSTs such as 160 K or 430 K should be impossible. The reference LST also suggests that the range of LST should not be that extreme. Thus, the correlation plots indicate that PBIM and Stepwise produced some incorrect extreme points and that these points significantly increased the RMSEs.

However, the number of those extreme points is not very large, as the correlation coefficients of PBIM and Stepwise are generally similar to those of the other methods, as shown in Table 1. Differences in the correlation coefficient between the compared methods seem to be driven mainly by season, because almost all the results for winter (January) obtained the lowest correlation coefficients among the three seasons. This may be mainly due to the heat capacity of surface materials, which makes the characteristics of LST in winter relatively different from those in other seasons. As the external temperature changes, the temperature of a material with a larger heat capacity changes more slowly than that of a material with a smaller heat capacity. Thus, the LST contrast in winter is much smaller than that in other seasons, because the LST of water does not change as much as that of other land covers from summer to winter. This phenomenon is shown later in Visual Comparison, where the reference images of all three seasons are presented. Therefore, the correlation coefficients of PBIM and Stepwise are lower than those of the other two methods because they failed to predict the small LST contrast in winter, while EM and SRTS performed better.

The only negative value in Table 1 is produced by Stepwise for the October data. Figure 3 shows that the main body of its scatter points has a relatively clear positive trend, which means that the regression line should run from the bottom left corner to the upper right corner of the feature space. However, there is a small green cluster of points (indicating a relatively large number of points) located at the top left corner, and this cluster affected the trend of the regression line significantly, making it even slightly negative. The small green cluster indicates that several points with a low LST in the reference dataset were predicted to have a much higher LST. If analyzed together with the images in Visual Comparison (Figure 4), it can be seen that almost all the water pixels were predicted to have the highest LST in the result of Stepwise for October, which is not consistent with the October reference and leads to a low correlation with the reference. This error arises mainly because Stepwise adopts an automatic mechanism for choosing sharpening indices [11]. For the image from October 2001, although the candidate indices include NDVI, MNDWI, NDBI, and albedo, only MNDWI was chosen as the sharpening index for the stepwise regression, and this cannot be controlled manually. Therefore, the water surface pixels were all given a high LST value in that image, even higher than that of the impervious surface, though the shape of most water bodies (e.g., lakes, rivers) is predicted relatively well (as shown in Figure 4). This may reflect, to some extent, the limitation of the autoselection mechanism for sharpening indices. Although autoselection may avoid human interference in the process, it carries the risk that the chosen indices are unsuitable and cannot be corrected manually.

3.2. Evaluation of Spatial Pattern

As Keramitsoglou et al. [22] claimed, the spatial pattern information from LST images is also important for UHI studies. Sobrino et al. [5] pointed out that using the mean LST to represent an urban area or a rural surrounding area is not reasonable, as their experiment showed that different districts in the city have different LSTs. Thus, the details of the thermal structure of the UHI effect should be considered. Unfortunately, there is a lack of in-depth evaluation of thermal spatial patterns [15]. This research therefore considers the spatial pattern as another essential aspect in the evaluation of the compared methods. Fuzzy similarity was employed as the assessment variable because it provides not only a mean value representing the entire image but also an image in which each pixel holds the similarity value of the spatially corresponding pixel of the original result.
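
The idea of a per-pixel, neighborhood-aware similarity can be illustrated with the sketch below. The exact formulation of the fuzzy similarity in [25] is not reproduced here; the sketch assumes a simplified numerical similarity, 1 - |a - b| / max(|a|, |b|), weighted by an exponential distance decay and evaluated in both directions. The function name, the radius, and the halving distance are hypothetical.

```python
import numpy as np

def fuzzy_similarity(predicted, reference, radius=2, halving=1.0):
    """Per-pixel similarity that tolerates small spatial offsets."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)

    def one_way(a, b):
        rows, cols = a.shape
        best = np.zeros(a.shape)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                r0, r1 = max(0, -dy), rows - max(0, dy)
                c0, c1 = max(0, -dx), cols - max(0, dx)
                if r1 <= r0 or c1 <= c0:
                    continue                 # offset larger than the image
                weight = 2.0 ** (-np.hypot(dy, dx) / halving)  # distance decay
                centre = a[r0:r1, c0:c1]
                neighbour = b[r0 + dy:r1 + dy, c0 + dx:c1 + dx]
                denom = np.maximum(np.abs(centre), np.abs(neighbour))
                safe = np.where(denom > 0, denom, 1.0)
                sim = np.where(denom > 0, 1.0 - np.abs(centre - neighbour) / safe, 1.0)
                # Keep the best weighted match found so far for each pixel.
                best[r0:r1, c0:c1] = np.maximum(best[r0:r1, c0:c1], weight * sim)
        return best

    # Two-way comparison: a pixel is only as similar as its weaker direction.
    return np.minimum(one_way(predicted, reference), one_way(reference, predicted))

# Toy images in kelvin (made up for illustration).
ref_img = np.full((5, 5), 290.0)
pred_img = np.full((5, 5), 300.0)

print(fuzzy_similarity(ref_img, ref_img).mean())   # 1.0
print(fuzzy_similarity(pred_img, ref_img).mean())
```

The returned array plays the role of the similarity image, and its mean gives the single value per result.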

3.2.1. Evaluation by Mean Fuzzy Similarity and the Standard Error of the Mean

Figure 5 illustrates the mean fuzzy similarity of the result produced by each method in each season. It shows that most results obtained a mean similarity above 0.6, except for the result of EM in January and the result of Stepwise in October. This may suggest that SRTS and PBIM are better at detecting the correct spatial pattern than EM and Stepwise.

Table 2 lists more details on the mean fuzzy similarity and the standard error (SE) of the mean for every result. It shows that the SEs of all the results are very small, which means that almost all the mean values are significantly different from each other. In addition, lower values of SE indicate more precise estimates of the population mean. In Table 2, SRTS obtained the lowest SEs of the mean among the results for January and October, while EM obtained the lowest SE of the mean among the results for April. When the mean values and the SEs of the mean are compared for each method across all three seasons, SRTS also obtained the lowest SE of the mean and the highest mean fuzzy similarity. This may suggest that SRTS has a higher accuracy in the evaluation of the spatial pattern than the other three methods.
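
The mean and the SE of the mean of a similarity image can be computed as in the following minimal sketch. It assumes the usual definition of the SE as the sample standard deviation divided by the square root of the number of pixels, which treats the pixels as independent samples; the function name and the toy values are illustrative.

```python
import numpy as np

def mean_and_se(similarity):
    """Mean and standard error of the mean, ignoring NaN (no-data) pixels."""
    values = np.asarray(similarity, dtype=float).ravel()
    values = values[~np.isnan(values)]
    mean = values.mean()
    se = values.std(ddof=1) / np.sqrt(values.size)   # sample SD over sqrt(n)
    return float(mean), float(se)

# Toy similarity values (made up), with one no-data pixel.
mean, se = mean_and_se([0.6, 0.7, 0.8, np.nan])
print(round(mean, 3), round(se, 4))  # 0.7 0.0577
```

Because the SE shrinks with the square root of the pixel count, images with many thousands of pixels yield very small SEs even when the similarity values themselves are widely spread.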

When the seasons are compared, the results of all methods for January together obtained the lowest mean similarity and the highest SE of the mean among the three seasons. Thus, even though it is not apparent in Figure 5, the figures in Table 2 may still suggest that it is more difficult for thermal sharpening methods to obtain a correct spatial pattern of LST from January data than from data of other seasons.

3.2.2. Evaluation by Fuzzy Similarity Imagery

Figures 4-6 show the fuzzy similarity image of the result produced by each method for each season. Fuzzy similarity images reflect the spatial distribution of areas with high or low similarity to the reference. In a fuzzy similarity image, the brighter a pixel is, the higher its similarity value, which indicates that it is more similar to the reference pixel. In addition, by comparing the similarity image with the reference image, the shortcomings of each method in predicting different land covers may be identified.

Figures 6 and 7 illustrate the fuzzy similarity images of the LST results for the seasons other than winter. The coverage area of the result produced by SRTS differs from that of the other methods. For EM, PBIM, and Stepwise, the experiment data must be the common area among the scenes of three platforms (MODIS, ETM+, and ASTER), while SRTS, which does not require a fine spatial resolution input, needs only two sources (MODIS and ASTER). This makes the coverage area in the SRTS experiment different from, and usually larger than, those of the other three methods.

The comparison in Figures 6 and 7, which represent the performance of each method in detecting the spatially correct LST for data not acquired in winter, suggests that Stepwise is not recommended among the compared methods, as its accuracy is not stable owing to the automatic predictor selection of the algorithm. Its result in Figure 7 has clearly more dark pixels than that in Figure 6, indicating a lower spatial similarity. Although this mechanism has advantages, such as reducing human interference and making the process more automatic, it still lacks a means of guaranteeing that the optimal predictors are selected. For EM, PBIM, and SRTS, the performances in the evaluation by fuzzy similarity images seem to be similar. Dark pixels tend to gather in or around the river area, indicating that narrow linear water bodies are difficult to detect correctly. This is mainly because linear features tend to fall within mixed pixels in the coarse spatial resolution images. Thus, in thermal sharpening, even when fine spatial resolution information has been brought in or generated by algorithms, the accuracy of the spatial distribution of those fine resolution details is easily affected by the original coarse resolution mixed pixels. In addition, vegetation in rural areas tends to be predicted wrongly by all compared methods.

Figure 4, which represents the performance of each method for winter data, shows two apparent findings. The first is that EM obtained many more dark areas than the other methods, which is consistent with its low mean fuzzy similarity in the previous evaluation. The second is that almost all the water pixels in the result of Stepwise are predicted incorrectly.

According to the reference data, the LST contrast of the entire study area in winter is much smaller than that in other seasons because of the heat capacities of different land covers. The reference image for LST in winter (Figures 4(d) and 4(f)) shows that the LST difference across the whole image, including water, is only 9 K, which is much smaller than those in other seasons. The errors for vegetation and the impervious surface in the result of EM arise mainly because the LST of the impervious surface was predicted to be higher, and the LST of vegetation lower, than their reference LSTs for winter. The errors for water or the areas along it (as the red boxes show in Figures 4(c) and 4(e)) are similarly caused by the large heat capacity of water and the original coarse resolution mixed pixels. In winter, the LST of water is not significantly lower than that of most other materials according to the reference. However, in the sharpened LST, water still has the lowest LST in the result of Stepwise, leading to a low similarity for the water surface.

In Figure 4, PBIM and SRTS appear to have fewer dark areas than the other two methods. The dark pixels in the result of PBIM are evenly distributed over the study area, while the dark pixels in the result of SRTS tend to gather mainly near the edge of the river and in the vegetated areas in the upper and lower parts of the image. Far fewer dark pixels occur in the central urban area in the result of SRTS (as illustrated by the red polygon in the reference image). Even for the part of the river in the urban area, dark pixels in or at the edge of the river are reduced significantly compared with those at the river edge in the rural area (as shown in the red box in Figure 7(e)). This may suggest that SRTS is more suitable than the other methods for urban studies in winter, as it tends to produce fewer dissimilar pixels for the impervious surface, which is the main land cover type of the urban surface.

From the analyses of Figures 4-6, the following conclusions can be drawn:
(1) Vegetated areas, water, and areas near water tend to be wrongly predicted by all compared methods, as more dark pixels in the fuzzy similarity image tend to occur in these areas.
(2) Among all the compared methods, PBIM and SRTS tend to obtain fewer dark pixels than the other methods, which indicates a higher accuracy of the predicted spatial pattern generated by PBIM and SRTS than of those generated by the others.
(3) The result of SRTS for winter data has far fewer dark pixels in the impervious surface area of the fuzzy similarity image. This may indicate that SRTS is suitable for urban studies, as the main land cover type of the urban area is the impervious surface.
(4) Stepwise tends to have a lower accuracy for water than for other land cover types. In the experiments, the results of Stepwise for LST in October and January clearly obtained an incorrect LST for the water surface (as shown in Figures 6 and 7). This may suggest that Stepwise is not suitable for areas containing a large amount of water.

3.3. Visual Comparison

The general spatial pattern of the entire study area produced by each method in different seasons is compared visually here to give readers a straightforward view of the accuracy of the spatial pattern of each sharpened LST. The comparison is intended to support the evaluation of the spatial pattern, especially by showing the accuracy of each method for the main land cover types of urban areas, namely, vegetation, water, and impervious surface.

Understanding the thermal response of each land cover type is also valuable for district-level studies, as the LST can vary significantly between different land covers [33]. In particular, land cover composition in urban areas is highly complicated and spatially variable, which makes the thermal environment more complex. Anniballe et al. [19] also pointed out that the intraurban UHI spatial variability is closely related to the distribution of buildings, surface materials, and density of green areas. Therefore, some objects are chosen in this research to represent the three land cover types.

The highly built-up objects may be, for example, the airport or commercial areas with intensive roof-related impervious surface.

Figure 8 illustrates the results of all compared methods for data outside winter (i.e., April and October). The general spatial pattern of LST produced by EM is consistent with the reference, yet a blocky effect, inherited from the original coarse resolution data, exists in both EM results. One apparent sign of this rough description of the spatial pattern is that the shapes of the river and the lakes are not well described and largely retain the characteristics of the original coarse pixels.

For the results of SRTS, the blocky effect is eliminated, and the general spatial pattern of LST is consistent with the reference as well. However, the pattern appears to have been smoothed too much, resulting in several small round hot spots at the fringe of the central urban area. This is mainly due to the SRM algorithm used by SRTS: if SRM is overdone, it commonly produces such sparsely distributed round shapes. Nevertheless, variations of LST between different land covers are still distinguishable in the results of SRTS. Most impervious surface objects obtained the highest LST, while parks were assigned lower LST. Water bodies are generally located in the blue or yellow areas.

Although the general pattern, in which the central urban areas are red and the surrounding areas are blue, can be distinguished in the result of PBIM, a large number of blue fragments exist in the central area, while several red fragments appear in the rural areas as well. This might be because the regression method it uses tries to bring the fine spatial resolution information extracted from predictors into the result, while the residual extracted from the original coarse resolution pixels is still used to correct the final sharpened LST. These residual data bring back the impact of the coarse resolution pixels. In the results of PBIM, the shapes of the river and some lakes are described fairly well. However, the relationship between the LSTs of vegetation and impervious surface, where the LST of vegetation should be lower than that of impervious surface, is not described well.
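To make the role of the residual concrete, the following is a minimal sketch of a generic regression-plus-residual sharpening step. It is not the PBIM implementation used in this study: the single linear predictor, the block-mean aggregation, and the names `block_mean` and `sharpen_with_residual` are assumptions made for illustration only. It shows why a residual computed on coarse pixels and copied to every fine pixel inside them can reintroduce block-shaped structure.

```python
import numpy as np

def block_mean(fine, factor):
    """Aggregate a fine-resolution array to the coarse grid by block averaging."""
    h, w = fine.shape
    return fine.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def sharpen_with_residual(lst_coarse, predictor_fine, factor):
    """Hypothetical helper: linear regression at coarse scale plus coarse residual."""
    predictor_coarse = block_mean(predictor_fine, factor)
    # Fit LST = intercept + slope * predictor on the coarse grid.
    slope, intercept = np.polyfit(predictor_coarse.ravel(), lst_coarse.ravel(), 1)
    # Residual: the part of the coarse LST that the regression does not explain.
    residual_coarse = lst_coarse - (intercept + slope * predictor_coarse)
    # Copy each coarse residual to all fine pixels inside that coarse pixel.
    residual_fine = np.kron(residual_coarse, np.ones((factor, factor)))
    # Apply the regression at fine scale and add the block-constant residual.
    return intercept + slope * predictor_fine + residual_fine
```

Because the residual is constant within each coarse pixel, averaging the sharpened image back to the coarse grid reproduces the coarse LST, but any large residual appears as a uniform offset over a whole block at fine resolution.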

The results of Stepwise for data in October are clearly inconsistent with the reference, in which the LST of water is not that high. This is due to the automatic predictor selection mechanism of Stepwise. It again confirms that the accuracy of Stepwise is not stable across different applications.

Figure 9 illustrates the result of each method for data in winter (January). Due to the limited common area between the ASTER and ETM+ imagery for this date, not all the objects have reference background LSTs here. However, the reference shows that the LST contrast of the entire area is much smaller than in other seasons.

In Figure 9, the spatial patterns of LST produced by EM and SRTS are consistent with the reference, where the LST contrast is generally small. In contrast, the result of PBIM shows a fragmented LST spatial pattern and even a trend in which the central area is cooler than the surrounding rural area. Stepwise still predicted the water pixels to have the lowest LSTs, although they should have LSTs similar to those of other land covers.

The comparison in Figure 9 highlights the advantage of using classification information, instead of a limited number of spectral indices, to provide fine spatial resolution details in thermal sharpening. As introduced in Data and Methods, EM extracts fine spatial resolution details from emissivity data, which are in fact produced from classification information, and SRTS uses SRM to sharpen the land cover information first and then brings it into LST estimation. In this section, the results produced by EM and SRTS do not have extreme points like those in the results of PBIM and Stepwise, and they are more sensitive to seasonal changes of the spatial pattern than those of PBIM and Stepwise.

4. Discussions

4.1. The Evaluation of Spatial Pattern

As Sismanidis et al. [15] mentioned, most current studies on thermal sharpening methods lack an evaluation of spatial patterns, which is equally significant for UHI studies. Instead, they prefer to use statistical variables for the evaluation. This might be because these variables are easy to calculate from the absolute LST values and provide a quantitative description of accuracy. However, they take little account of the spatial information of the entire LST map. Quan et al. [21] found that the conclusion derived from an evaluation based on the absolute LST values might be inconsistent with that derived from an evaluation of the LST spatial distribution. In their experiment, the result with the spatial pattern and texture most similar to the reference image obtained the highest RMSE. Therefore, they suggested that the choice of one evaluation or both should depend on the application of the sharpened LST. If the sharpened LST is used as input to a quantitative model, the accuracy of the absolute LST values should be emphasized. If the application focuses on the description of the spatial pattern of the entire thermal environment, evaluation of the LST distribution and texture might be preferred. Consequently, for a comprehensive evaluation of a method, it is better to evaluate both aspects.
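The point that purely statistical variables carry little spatial information can be illustrated with a small synthetic example. The sketch below is not taken from any of the cited studies; the array sizes, the 4 K error, and all variable names are hypothetical. It builds two predictions whose errors have identical values but different spatial arrangements and shows that RMSE cannot tell them apart.

```python
import numpy as np

def rmse(pred, ref):
    """Root mean square error between a predicted and a reference LST image."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

rng = np.random.default_rng(0)
reference = 290.0 + 10.0 * rng.random((20, 20))  # hypothetical LST in kelvin

# Clustered error: one 5 x 5 block is overestimated by 4 K.
error_clustered = np.zeros((20, 20))
error_clustered[:5, :5] = 4.0

# Scattered error: the same 25 error values moved to random locations.
error_scattered = rng.permutation(error_clustered.ravel()).reshape(20, 20)

pred_clustered = reference + error_clustered
pred_scattered = reference + error_scattered

# RMSE depends only on the set of error values, not on where they occur,
# so both predictions receive the same score (about 1 K here).
print(rmse(pred_clustered, reference), rmse(pred_scattered, reference))
```

A clustered error distorts the shape of a hot spot, whereas a scattered one mainly adds noise, which is why a measure that is sensitive to location is needed alongside RMSE.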

The lack of evaluation of the LST spatial pattern might be partly due to the difficulty of defining the spatial pattern of LST. What is usually derived from thermal remote sensing data is a raster LST image consisting of pixels. In contrast, the spatial pattern is a relatively “vectorial” concept, which may require the boundary of an area to be defined. Keramitsoglou et al. [22], for example, extracted the hot spot pixels and then treated them as objects. However, in their study, the extracted objects were more like an LST classification and lost the gradual change of the entire LST pattern. Voogt and Oke [34] have already pointed out that the slow development of thermal remote sensing of urban areas is due largely to the qualitative description of thermal patterns. It is common in the literature that, for comparison of LST spatial patterns or texture, a number of results are presented in an illustration and then described in only a few words [35-37]. This reveals the lack of a widely acknowledged quantitative method for evaluating the LST spatial pattern. Currently, three indices have been tried in related evaluations. The Local Moran Index (LMI) was used in the studies of Sismanidis et al. [15] because it is a classic statistical tool for detecting spatial clusters [38]. CO-RMSE, which is based on the comparison between the LST cooccurrence matrices of the sharpened LST and the reference, was proposed and used by Quan et al. [21] to evaluate the spatial pattern. Fuzzy similarity allocates a similarity value to each pixel based on the information of the neighborhood pixels around it.
The reason we adopted fuzzy similarity in this research is that it provides not only a single value representing the entire study area (e.g., the mean fuzzy similarity of an image) but also a similarity image, which further provides spatial information on where errors occur and how they relate to land cover or other spatial factors. This type of information helped the analyses in our study, allowing us to understand the impact of different land cover types on the accuracy of each method. It is also found in our study that, in the evaluation of the spatial pattern, the accuracies of PBIM and Stepwise were not affected by extreme points as much as in the statistical evaluation. This might be because those extreme values are smoothed by their neighborhoods in the calculation of fuzzy similarity and thus do not significantly reduce the accuracy of PBIM and Stepwise in the evaluation of the spatial pattern. This reflects the advantage of fuzzy similarity and the necessity of evaluating the spatial pattern for a method, as it may reveal accuracy from a different aspect. The evaluation and analyses of the spatial pattern in this study may provide some ideas for further related research.
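As an illustration of how a neighborhood-based similarity image and its mean can be obtained, a minimal sketch follows. It is not the implementation used in this study, whose exact settings are not stated in this section. The per-pixel similarity 1 - |a - b| / max(|a|, |b|), the exponential distance decay, the two-way minimum, and the parameters `radius` and `halving` are all assumptions chosen for illustration, and the inputs are assumed to be non-negative (e.g., LST in kelvin).

```python
import numpy as np

def numerical_similarity(a, b):
    """Per-pixel similarity, 1 - |a - b| / max(|a|, |b|); NaN where b is NaN."""
    denom = np.maximum(np.abs(a), np.abs(b))
    with np.errstate(invalid="ignore", divide="ignore"):
        sim = 1.0 - np.abs(a - b) / denom
    return np.where(denom == 0, 1.0, sim)  # two zeros count as identical

def one_way_similarity(a, b, radius, halving):
    """For each pixel of a, best distance-weighted match within b's neighborhood."""
    h, w = a.shape
    padded = np.pad(b.astype(float), radius, constant_values=np.nan)
    best = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Weight halves every `halving` pixels of distance (assumed decay).
            weight = 2.0 ** (-np.hypot(dy, dx) / halving)
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # fmax ignores NaN, so padded cells outside the image never win.
            best = np.fmax(best, weight * numerical_similarity(a, shifted))
    return best

def fuzzy_similarity(a, b, radius=2, halving=2.0):
    """Similarity image: the lower of the two one-way comparisons."""
    return np.minimum(one_way_similarity(a, b, radius, halving),
                      one_way_similarity(b, a, radius, halving))
```

Under these assumptions, `fuzzy_similarity(sharpened, reference)` returns an image that can be inspected for the location of low-similarity (dark) pixels, and its `.mean()` gives a single value for the whole study area.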

4.2. Application of Thermal Sharpening for Urban Area

In the early years of the development of thermal sharpening technology, most studies emphasized sharpening for large areas covered mainly by vegetation [17, 37, 39]. Also, the predictors commonly used in thermal sharpening algorithms were vegetation indices. Thermal sharpening was found to be especially suitable for urban thermal environment studies because these studies urgently require data with both fine spatial and fine temporal resolutions [5, 26, 40]. However, the methods proposed for large vegetated areas were found to be unsuitable for urban areas because vegetation is not the main factor affecting LST in urban areas [12, 41]. Therefore, more factors, including impervious surface fractions, water indices, and albedo, were considered. Only after 2010 did more method proposals and application reports of thermal sharpening appear in the literature [8, 11, 15, 35, 42, 43]. However, in most of them, the scale factor (or zoom factor) of downscaling is still limited (<10), and the target sharpening resolutions, especially for applications of thermally sharpened data, are 1 km [15, 44], 90 m [35], or 30 m [42], which are the spatial resolutions of MODIS, ASTER, and TM/ETM+, respectively.

This might be due to the limited data sources for fine spatial resolution input. It might also be a strategy to guarantee the accuracy of the sharpened data, because studies usually report that a larger scaling factor corresponds to a lower accuracy of the sharpened data [5, 8, 31]. Another possible reason is the processing time. Applications that dynamically monitor the thermal environment of several urban areas need data with very fine temporal resolution, which are usually acquired from geostationary platforms (e.g., SEVIRI with 15 min resolution). If the spatial resolution is also required to be relatively fine, the processing system may be overburdened.

Although the applications of sharpened LST seem to be limited, this does not mean that the efforts made to expand the diversity of thermal sharpening methods are insignificant. On the contrary, the limited applications may reflect that the current methods are still insufficient or unsuitable for various real applications. Efforts may still be needed to ease the data preparation, to optimize the algorithms to reduce the processing time and burden, and to make the whole process as automatic as possible. These requirements of practical applications still challenge the research community, and some researchers have started to address the above issues. SRTS simplifies the data preparation by removing the requirement of fine spatial resolution input [13]. Weng et al. [42] and Yang et al. [35] are advancing models that try to generate TM-like and ASTER-like daily LST automatically based on a number of inputs. These attempts are still at an early stage and have some limitations. However, they show the efforts made to diversify thermal sharpening development and to fill the gap between research and real applications.

5. Conclusions

This study compared four thermal sharpening methods proposed especially for urban areas by evaluating two aspects. In particular, not only the statistical evaluation commonly used in most thermal sharpening studies but also an evaluation of the LST spatial pattern was carried out.

In both evaluations, the accuracies of all methods are found to be worse in winter than in other seasons. This is mainly because the LST contrast in winter decreases significantly compared with that in other seasons. Most thermal sharpening methods cannot detect this change very well, leading to a decreased accuracy. In the comparison of the different methods, Stepwise is not recommended for areas with a large amount of water, and EM and SRTS performed better than the other two methods. Moreover, SRTS removes the requirement of fine spatial resolution input data, which eases the data preparation, and is thus considered to be more useful than EM. It is also found that linear water features and the areas along them are commonly predicted wrongly by most thermal sharpening methods. Vegetation in rural areas is also prone to being predicted incorrectly.

In this study, we focused on the evaluation of the spatial pattern in accuracy assessment to make the evaluation of each method comprehensive. Although the accuracy of the spatial pattern has been recognized as an essential factor for an LST map, it has long been difficult to quantify. This research may provide an idea of how to evaluate the spatial pattern for further related studies. Other assessment variables, such as LMI and CO-RMSE, are also good alternatives. There is an urgent need for a widely accepted variable that can quantitatively evaluate the spatial pattern and texture of an image. In addition, the development of more thermal sharpening methods that can be used in real applications, especially for urban areas, is encouraged in the future.

Data Availability

The MODIS images used in this study were downloaded from https://lpdaac.usgs.gov/data_access/data_pool, and the Landsat images were downloaded from https://landsat.usgs.gov/landsat-data-access. The authors are grateful for the free access to these data.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

Thanks are due to Dr. Anuar Muad and Yang et al. for providing the MATLAB codes of SRM and SRR. This work is supported by the “Fundamental Research Funds for the Central Universities” (Grant Nos. 310821171014 and GK201903112) in China.