Review
A Review of GAN-Based Super-Resolution Reconstruction for
Optical Remote Sensing Images
Xuan Wang 1 , Lijun Sun 1 , Abdellah Chehri 2, * and Yongchao Song 1
1 School of Computer and Control Engineering, Yantai University, No. 30 Qingquan Road,
Yantai 264005, China; [email protected] (X.W.); [email protected] (L.S.); [email protected] (Y.S.)
2 Department of Mathematics and Computer Science, Royal Military College of Canada,
Kingston, ON K7K 7B4, Canada
* Correspondence: [email protected]
Abstract: High-resolution images have a wide range of applications in image compression, remote
sensing, medical imaging, public safety, and other fields. The primary objective of super-resolution
reconstruction of images is to reconstruct a given low-resolution image into a corresponding high-
resolution image by a specific algorithm. With the emergence and swift advancement of generative
adversarial networks (GANs), image super-resolution reconstruction is experiencing a new era
of progress. Unfortunately, there has been a lack of comprehensive efforts to bring together the
advancements made in the field of super-resolution reconstruction using generative adversarial
networks. Hence, this paper presents a comprehensive overview of the super-resolution image
reconstruction technique that utilizes generative adversarial networks. Initially, we examine the
operational principles of generative adversarial networks, followed by an overview of the relevant
research and background information on reconstructing remote sensing images through super-
resolution techniques. Next, we discuss significant research on generative adversarial networks
in high-resolution image reconstruction. We cover various aspects, such as datasets, evaluation
criteria, and conventional models used for image reconstruction. Subsequently, the super-resolution
reconstruction models based on generative adversarial networks are categorized based on whether
the kernel blurring function is recognized and utilized during training. We provide a brief overview
of the utilization of generative adversarial network models in analyzing remote sensing imagery.
In conclusion, we present a prospective analysis of forthcoming research directions pertaining to
super-resolution reconstruction methods that rely on generative adversarial networks.
Keywords: generative adversarial networks; super-resolution reconstruction; remote sensing;
low-resolution (LR) images; high-resolution (HR) images
1. Introduction
The SR technique was initially proposed by Harris [1]. It is a crucial technology in the
domains of computer vision and digital image processing [2]. It is extensively employed in
medical imaging [3,4], remote sensing [5,6], video analysis [7], and other domains [8–11].
Currently, imaging technology for remote sensing has been utilized in numerous
industries, including but not limited to agriculture, forestry, marine monitoring, meteorology,
and environmental protection [12]. Remote sensing imagery is integral in applications such as land
cover analysis, crop growth identification, disaster and weather prediction, land use man-
agement, and water ecology monitoring. The demand for remote sensing imagery in
various industries is steadily growing, with high-resolution (HR) imagery being particularly sought after.
During the acquisition of remote sensing images, the resolution may be limited by
several factors, including shooting conditions, equipment resolution, and atmospheric
conditions [13]. These limitations have the potential to cause blurring in the resulting
images. Image SR reconstruction technology aims to obtain an HR image by reconstructing
an LR image, which can improve the recognizability of the image and the accuracy of
subsequent recognition tasks.
In the public security field, as society and technology advance, traditional video
surveillance methods are often too limited in clarity and accuracy to adequately meet the
needs of individuals and organizations. The utilization of artificial intelligence in video
surveillance and integrated image processing technology can
significantly enhance public safety measures. Image super-resolution techniques have
wide applications in iris recognition, abnormal behavior detection, license plate recogni-
tion [14,15], etc. This can improve the accuracy of object identification and greatly improve
the safety factor.
Traditional SR reconstruction algorithms can be divided into three main categories.
The initial category is grounded on interpolation algorithms, such as bicubic interpola-
tion [16], nearest neighbor interpolation [17], adaptive image interpolation [18–20], and so
on. The second category of algorithms is reconstruction-based, including methods such
as iterative back-projection [21,22] and convex set projection [23,24]. The third category
comprises learning-based super-resolution algorithms, including sparse coding tech-
niques [25–28], among others. While traditional SR reconstruction techniques may
appear simple at first glance, they are not without drawbacks [29].
The interpolation method exhibits a straightforward and easily comprehensible struc-
ture, making it manageable for users. However, it is important to note that this method
relies solely on the pixel information available in the low-resolution (LR) image. Each pixel
is interpolated using information from surrounding pixels, resulting in a blurred image.
The processing of the image’s edges, texture, and other areas is not optimal, resulting in
accuracy issues.
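For illustration, a minimal sketch of this classical baseline, assuming OpenCV is available; the file path and scale factor are placeholders:

```python
import cv2

def bicubic_upscale(lr_path: str, scale: int = 4):
    """Upscale an LR image with bicubic interpolation, the classical SR baseline."""
    lr = cv2.imread(lr_path)                       # BGR, uint8
    h, w = lr.shape[:2]
    # Each output pixel is interpolated only from a local 4x4 neighborhood,
    # which is why edges and textures tend to come out blurred.
    return cv2.resize(lr, (w * scale, h * scale), interpolation=cv2.INTER_CUBIC)
```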
The reconstruction-based approach can sharpen the details, but its performance de-
creases rapidly as the scale factor increases. Its convergence speed is slow, and its compu-
tational cost is large. The shallow learning approach entails the acquisition of the LR-HR
image connection from an extensive range of training samples, which is then employed
to predict the reconstructed images. While some details can be recovered, the results still show
evident imperfections, and the design process is intricate.
Machine learning is an essential subfield of artificial intelligence [30]. Deep learning is
an algorithm that is widely used in the field of information technology. In the field of remote
sensing imagery, deep learning-based methods for super-resolution (SR) reconstruction
can be classified into three categories: single-image super-resolution approaches [31,32],
multi-image super-resolution techniques [33,34], and multi-/hyperspectral remote sensing
image super-resolution methods [35].
Currently, CNN and GAN-based techniques are commonly employed for SR recon-
struction of single remote sensing pictures. The primary CNN-based approaches for SR
include SRCNN [36] (super-resolution convolutional neural network), VDSR [37] (very
deep convolutional networks for super-resolution), and EDSR [38] (enhanced deep residual
networks for super-resolution). The results of these approaches surpass those of conventional
bicubic interpolation, but the methods are still relatively immature, so the improvement in
reconstruction quality is not particularly pronounced.
The generative adversarial network (GAN) is a deep learning model that was intro-
duced by Goodfellow et al. [39] in 2014. In recent years, this approach has shown great
promise for unsupervised learning with intricate distributions. Since the proposal of GAN,
it has garnered significant attention from both academic and industrial spheres. Through
extensive research on GANs, the technology has rapidly advanced in both theoretical
understanding and model construction. There are numerous applications in the areas of
computer vision and human–computer interaction.
The main inspiration for the GAN model is derived from the idea of zero-sum games
in game theory [40,41]. In particular, GAN comprises two components, the generative
network and the discriminative network, which constantly refine their output through
iterative learning. The authors in [42] primarily conducted a comparative analysis of
various GANs, demonstrating how widely used GAN frameworks are applied to
image samples of varying dimensions. Most earlier reviews focus on utilizing deep learning
technology for reconstructing HR images from a single source. The introduction of the
GAN-based SR reconstruction model is only a part of it.
Although numerous super-resolution techniques have attained satisfactory recon-
struction outcomes, certain limitations still exist in recovering images from actual scenes.
GAN networks possess formidable learning abilities. Nevertheless, there has been limited
research dedicated to comprehensively summarizing the implementation of GAN-based
super-resolution in recent times. Unlike many other papers, this work does not provide a
general overview of deep learning-based SR; instead, it comprehensively analyzes image
super-resolution reconstruction techniques that utilize generative adversarial networks (GANs). Furthermore,
this paper explores the core principles and processing techniques of GANs. It also provides
an overview of the SR (super-resolution) model of GANs, highlighting its reconstruction
performance, strengths, and limitations. The paper’s structural framework is depicted in
Figure 2.
The main contributions of this paper are as follows:
• We offer a thorough overview of the super-resolution process based on GANs, which
covers the working mechanism of GANs, the reconstruction process for SR, and the
GAN application in super-resolution reconstruction. This provides the detailed back-
ground knowledge for this paper.
• We present pertinent datasets of both natural and remotely sensed images, metrics for
assessing image quality, and techniques for inducing degradation in imagery.
• We present the model of GANs on super-resolution reconstruction. We categorize
them as blind super-resolution models and non-blind super-resolution models based
on whether or not the blurred kernel is assumed to be known and applied to the image.
We compare performance on natural images and remote sensing imagery.
(Figure 2: the structural framework of this paper, covering GAN and SR fundamentals, loss functions (GAN loss, pixel loss, perceptual loss), image degradation models (bicubic, BSR degradation, high-order degradation), GAN models for remote sensing (e.g., ISRGAN, TE-SAGAN, NDSRGAN, Enlighten-GAN), network design, and evaluation metrics.)
The subsequent sections of this paper are as follows. In Section 2, we present a concise
overview of GANs, how they are used in the SR reconstruction process, and introduce the
loss function and image degradation process. Section 3 categorizes and briefly describes SR
reconstruction models that rely on GAN. The impact of noise on remotely sensed images
is initially discussed in Section 4. Then, some GAN-based SR models for photos from
remote sensing are presented. Finally, we provide a description of the regions where
super-resolution reconstruction of remote sensing pictures is applied. In Section 5, we
present the commonly used datasets and evaluation metrics. Section 6 compares the
performances of five SR models using two objective evaluation metrics, namely PSNR
and SSIM. Furthermore, this section also analyzes their impact on the reconstruction of
remotely sensed images. Section 7 discusses the present difficulties and future goals in
utilizing GAN for remote sensing super-resolution reconstruction. Finally, we provide a
summary of the research presented in this paper.
2. Background
2.1. GAN and SR
2.1.1. Generative Adversarial Networks
Generative adversarial networks are a trending topic in artificial intelligence research.
The basic idea behind GAN is derived from the zero-sum game of game theory [43]. GAN
mainly comprises a generator G and a discriminator D.
The model is trained using adversarial learning techniques to converge toward a Nash
equilibrium. The term “equilibrium”, also referred to as balance, describes a situation
in which the samples produced by the generator cannot be distinguished from the real
samples. The discriminator is unable to differentiate between the real and generated
samples accurately.
As shown in Figure 3, the basic principle of GAN is straightforward. Using an image
as an example, G is a generative network that takes in random noise and outputs an image,
which is denoted as G (z). The variable z stands for noise, which is arbitrary random data
with the same structure as the real data. D is a discriminative network that determines
the authenticity of the image. The input is an image x, and the output D ( x ) calculates the
likelihood that it depicts a genuine image. If the value is 1, the image is deemed authentic.
If the output is 0, the image is considered fake.
(Figure 3: the basic structure of a GAN. Random noise z is fed to the generator G to produce G(z); the discriminator D receives the real image x and the generated image and outputs a probability (1 for real, 0 for fake), which drives D_loss and G_loss during training.)
The goal of the generator G is to use the produced samples to deceive the discriminator.
Its objective function can be defined as follows:

min(D(x) − D(G(z))). (1)

The goal of the discriminator D is to identify the authenticity of the input samples,
which is defined as

max(D(x) − D(G(z))). (2)

Therefore, the objective function of GAN can be summarized as follows:

min_G max_D (D(x) − D(G(z))). (3)
The three equations provided above serve as a concise introduction to the principles of
GAN. Equation (1) demonstrates that the objective of G is to generate an image that closely
resembles reality in order to deceive the discriminator. The smaller the difference between
D(x) and D(G(z)), the more closely the generated image will resemble
the original image. Equation (2) represents the objective of D, which is to differentiate
between the image generated by G and a real image. A higher value indicates a stronger
judgment from the discriminator. Equation (3) shows that the variables G and D are
involved in a dynamic game process. Both parties are competing against each other to
achieve superior reconstruction results.
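To make the adversarial game concrete, the following PyTorch sketch alternates the discriminator and generator updates; the tiny fully connected networks, batch size, and the standard log-loss form of the objectives are illustrative placeholders rather than the exact formulation above:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())    # generator: noise z -> fake sample G(z)
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())   # discriminator: sample -> probability of "real"

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.rand(16, 784)            # stand-in for a batch of real images x
for step in range(100):
    z = torch.randn(16, 100)          # random noise z

    # Discriminator step: push D(x) toward 1 and D(G(z)) toward 0.
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(16, 1)) + bce(D(G(z).detach()), torch.zeros(16, 1))
    d_loss.backward()
    opt_d.step()

    # Generator step: push D(G(z)) toward 1 in order to fool the discriminator.
    opt_g.zero_grad()
    g_loss = bce(D(G(z)), torch.ones(16, 1))
    g_loss.backward()
    opt_g.step()
```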
In general, the image degradation process in SR can be modeled as

Ix = Deg(Iy, δ), (4)

where Ix denotes the low-resolution image, Iy denotes the high-resolution image, Deg(·)
denotes the degradation function, and δ denotes the relevant parameters of the degrada-
tion process.
Thus, given the current low resolution of Ix , the procedure for constructing a high-
resolution image can be described as follows:
Îy = F ( Ix , θ ), (5)
where Îy denotes the reconstructed result, F is the super-resolution model, and θ is the
model parameter.
The degradation of images in reality is impacted by various factors, including but not
limited to weather conditions, motion blur, and sensor noise. Researchers usually describe
Equation (4) as the following process:
Ix = ( Iy ⊗ k) ↓s +n, (6)
where k denotes the degradation blur kernel, n represents the noise, and ↓s stands for the
downsampling operation with a scaling factor s. Iy ⊗ k denotes the convolution operation
between the HR image Iy and the blur kernel k.
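A minimal sketch of this degradation model for a grayscale image, assuming an isotropic Gaussian blur kernel, simple decimation for ↓s, and additive Gaussian noise (only one possible choice for k, ↓s, and n):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(hr: np.ndarray, blur_sigma: float = 1.5, s: int = 4, noise_sigma: float = 5.0) -> np.ndarray:
    """Synthesize an LR image following Ix = (Iy ⊗ k) ↓s + n."""
    blurred = gaussian_filter(hr.astype(np.float32), sigma=blur_sigma)  # Iy ⊗ k (Gaussian kernel)
    lr = blurred[::s, ::s]                                              # ↓s: decimation by factor s
    lr = lr + np.random.normal(0.0, noise_sigma, lr.shape)              # + n: additive Gaussian noise
    return np.clip(lr, 0, 255).astype(np.uint8)

hr = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # toy grayscale HR image
print(degrade(hr).shape)                                     # (32, 32)
```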
The conventional SR reconstruction model features a singular network structure,
which fails to consider the intricate image degradation process and myriad influencing
factors present in reality. Adapting to complex real-world scenarios can present challenges.
Applying generative adversarial networks to super-resolution reconstruction can make the
output images more natural through adversarial training.
The perceptual loss compares the SR result and the HR image in the feature space of a
pre-trained network:

L_perceptual = ∥φ_j(I_sr) − φ_j(I_y)∥²₂ / (c_j h_j w_j), (7)
where c_j, h_j, and w_j denote the number of channels, height, and width of the feature map,
respectively; φ denotes the pre-trained network, and φ_j represents the high-level features
extracted by the j-th layer of the network.
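A sketch of Equation (7) in PyTorch, assuming a pre-trained VGG-19 from torchvision as φ and an arbitrarily chosen feature layer j (the ImageNet weights are downloaded on first use):

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

# φ: a fixed, pre-trained feature extractor (here the first 36 layers of VGG-19).
phi = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:36].eval()
for p in phi.parameters():
    p.requires_grad_(False)

def perceptual_loss(i_sr: torch.Tensor, i_y: torch.Tensor) -> torch.Tensor:
    f_sr, f_y = phi(i_sr), phi(i_y)                      # φ_j(I_sr), φ_j(I_y)
    c, h, w = f_sr.shape[1:]
    return F.mse_loss(f_sr, f_y, reduction="sum") / (c * h * w)

sr, hr = torch.rand(1, 3, 96, 96), torch.rand(1, 3, 96, 96)  # toy SR/HR pair
print(perceptual_loss(sr, hr).item())
```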
The pixel-wise L2 loss is defined as

L2 = (1/m) ∑_{i=1}^{m} (y_i − f(x_i))², (9)

where f(x_i) and y_i are the output and the label of the model, respectively, and |f(x_i) − y_i|
denotes the difference between them. In the smooth L1 loss, the squared error is used when
this difference is less than 1; otherwise, the linear error is used. Since its reaction to outliers is
smoother, the smooth L1 loss is more resilient than MSE.
When the smooth L1 loss takes a small value, the gradient decreases dynamically. This
addresses the convergence challenges encountered when utilizing the L1 loss and mitigates
gradient explosion in certain circumstances.
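The corresponding pixel losses are one-liners in PyTorch; the tensors below are placeholders for a batch of SR outputs and HR labels:

```python
import torch
import torch.nn.functional as F

sr = torch.rand(4, 3, 64, 64)   # model outputs
hr = torch.rand(4, 3, 64, 64)   # ground-truth labels

l1 = F.l1_loss(sr, hr)                 # mean absolute error
l2 = F.mse_loss(sr, hr)                # Equation (9): mean squared error
smooth_l1 = F.smooth_l1_loss(sr, hr)   # squared error below a difference of 1, linear error above it
print(l1.item(), l2.item(), smooth_l1.item())
```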
Figure 4. LR images employed in SR reconstruction are acquired through the degradation of the HR
images. An M × N image is resized to a smaller dimension of M/s × N/s, where s is the downsampling factor.
(Figure: the practical (BSR-style) degradation pipeline, in which blur, resize, noise, and JPEG compression operations are applied to the HR image to synthesize the LR image.)
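A rough sketch of such a practical degradation pipeline, assuming one fixed order of blur, resizing, noise, and JPEG compression with randomly drawn parameters (real BSR-style pipelines also shuffle the order and mix several kernel and noise types):

```python
import io
import random
import numpy as np
from PIL import Image, ImageFilter

def practical_degrade(hr: Image.Image, scale: int = 4) -> Image.Image:
    """Blur -> downscale -> noise -> JPEG compression on an RGB HR image."""
    img = hr.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.5, 3.0)))   # blur
    img = img.resize((hr.width // scale, hr.height // scale), Image.BICUBIC)      # resize (↓s)
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(0, random.uniform(1, 10), arr.shape)                  # sensor-like noise
    img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=random.randint(30, 90))                  # JPEG compression
    return Image.open(io.BytesIO(buf.getvalue()))

lr = practical_degrade(Image.new("RGB", (256, 256), color=(120, 180, 90)))
print(lr.size)   # (64, 64)
```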
Figure 8. The SRCNN [53] model comprises three important components: image feature extraction,
nonlinear mapping layer, and network reconstruction.
VDSR [37] increases the depth of the network, building on the architecture of SRCNN.
It employs deep neural networks to make predictions and applies residual learning to recon-
struct images with super-resolution. VDSR uses residual learning and an elevated learning
rate to expedite the model’s training process. It demonstrates superior reconstruction
performance compared to SRCNN.
In recent years, deep learning techniques have significantly improved image super-
resolution reconstruction. However, there is still scope for improvement in certain aspects
of the network structure. The EDSR model [38] is an adaptation of SRResNet. It eliminates
the batch normalization (BN) layer to streamline the network architecture and reduce the
consumption of storage and computational resources. The BN layer can destroy the original
contrast information of the image and ignore the absolute difference between image pixels.
This may affect the quality of the reconstructed image. Hence, the BN layer is frequently
omitted in tasks related to super-resolution.
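The difference can be seen in a short PyTorch sketch contrasting an SRGAN-style residual block (with BN) and an EDSR-style block (without BN); the channel counts and kernel sizes here are commonly used defaults rather than values from a specific paper:

```python
import torch
import torch.nn as nn

class SRGANResBlock(nn.Module):
    """SRGAN-style residual block: Conv-BN-PReLU-Conv-BN plus a skip connection."""
    def __init__(self, ch: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.PReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        return x + self.body(x)

class EDSRResBlock(nn.Module):
    """EDSR-style residual block: BN removed to preserve the absolute range of features."""
    def __init__(self, ch: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

x = torch.rand(1, 64, 48, 48)
print(SRGANResBlock()(x).shape, EDSRResBlock()(x).shape)
```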
Figure 9. SRGAN [54] incorporates both generator and discriminator components in its structure,
enabling it to achieve high-quality 4× image reconstruction.
Figure 10. The residual block structure of ESRGAN [55].
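As a rough sketch of the densely connected block from which ESRGAN builds its residual-in-residual dense blocks (RRDBs); the growth rate and residual scaling below are the values commonly quoted for ESRGAN, used here as assumptions:

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Dense block: each 3x3 conv sees the concatenation of all previous feature maps;
    BN is removed and the block's residual is scaled before the skip connection."""
    def __init__(self, ch: int = 64, growth: int = 32, res_scale: float = 0.2):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(ch + i * growth, growth if i < 4 else ch, 3, padding=1) for i in range(5)
        )
        self.lrelu = nn.LeakyReLU(0.2, inplace=True)
        self.res_scale = res_scale

    def forward(self, x):
        feats = [x]
        for i, conv in enumerate(self.convs):
            out = conv(torch.cat(feats, dim=1))
            if i < 4:                      # the last conv has no activation
                out = self.lrelu(out)
            feats.append(out)
        return x + self.res_scale * feats[-1]

x = torch.rand(1, 64, 32, 32)
print(ResidualDenseBlock()(x).shape)   # torch.Size([1, 64, 32, 32])
```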
MFAGAN [61] compresses the super-resolution GAN through knowledge distillation and
hardware-aware evolutionary searches. However, it did not yield satisfactory visual outcomes.
To improve image resolution, especially perceptual quality, G-GANISR was proposed
in [62]. This architecture comprises a generator and a discriminator with distinct loss func-
tions. It generates new results based on quantitative and qualitative measurements. It im-
proves the performance of SISR by gradually increasing the growth factor. Similarly, Zhang et al. [63]
proposed RankSRGAN, which optimizes the generator with respect to a perceptual metric.
There are certain similarities between natural imagery and remote sensing imagery;
therefore, several of the super-resolution reconstruction techniques developed for natural
imagery can be applied to remote sensing imagery. For example, more realistic texture
details can be generated by adding residual layers and random noise, and knowledge
distillation together with hardware-aware search has the potential to produce more
realistic remote sensing images.
and wavelet-transformed to create realistic textures and extract actual image data. Similarly,
we can use multi-scale LR images in the super-resolution reconstruction of remote sensing
images to eliminate the artifact problem in the reconstructed images.
accurate data from it. The reliability of analysis and decision-making processes based on
remote sensing imagery can be affected.
(4) Reduced contrast and dynamic range: Noise can cause random fluctuations in pixel
values, resulting in reduced contrast and dynamic range. This can pose a challenge when
trying to differentiate between various features or detect subtle changes in the environment.
(5) Increased uncertainty: One challenge that arises from noise in remote sensing
data is increased uncertainty. The presence of noise can impact the reliability of any
derived products or analyses. Inaccurate measurements, misinterpretations, and potentially
erroneous conclusions can result from this.
(Figure: a generator architecture built from 23 RRDB blocks with a self-attention module (SAM), weight-normalized (WN) convolutions, and ×2 nearest-neighbor upsampling, mapping an LR input to an SR output.)
(Figure: a generator architecture built from densely connected residual dense blocks (DCRDBs) followed by upsampling layers, mapping an LR input to an SR output.)
LR images are fed into the first convolutional layer to obtain the original feature map.
Then, the feature map is fed into the dense network. The discriminative network of the
model is illustrated in Figure 13. A dense multi-layer network is used to link the remaining
dense blocks. The discriminative network employs a matrix-average discriminator to discern
real images at a local level.
Figure 13. The discriminator architecture of NDSRGAN [84].
As remote sensing images reflect diverse features and information in different regions,
one paper proposed a novel SD-GAN [85] to learn the mapping between LR and HR. This
model employs paired discriminators to assess image quality and minimize the production
of inaccurate textures. EnlightenGAN [86] employs heuristic blocks to facilitate convergence
towards a dependable network output. The generator structure is shown in Figure 14. It
uses self-supervised hierarchical perception to address artifacts. While GAN has made
significant advancements in image SR reconstruction, the resulting images may still exhibit
artifacts and an absence of high-frequency information. TWIST-GAN [87] combines the wavelet
transform with a transferred GAN to obtain high-quality remote sensing images.
Figure 14. The generator architecture of EnlightenGAN [86].
Obtaining LR-HR image pairs in real-world scenes can be challenging, which limits
the applicability of some previously proposed methods. Wang et al. [88] presented an
unsupervised learning framework known as Enhanced Image Prior (EIPGAN). Random
noise is fed into the GAN network to enable SR reconstruction of remote sensing imagery.
Then, the reference image is used as the image prior. Finally, the noise is updated,
and information is transferred from the reference image.
Due to the inherent limitations of remote sensing technology, only a limited number
of high-resolution images are available for training deep neural networks. A GAN network
was introduced in a paper [89]. The generator acquires the SR image and subsequently
downsamples it to create the LR image. The downsampling results are subsequently
utilized to train the discriminator, thereby enhancing the spatial resolution of remote-
sensing images.
Acquiring HR remote-sensing images is a key issue in GIS. Convolutional neural
networks encounter challenges when trying to model larger scales. Tu et al. [90] suggested
the SWCGAN model, which combines the strengths of the Swin Transformer and
convolutional layers. The Swin Transformer layer is combined with convolutional layers to
construct a generative network capable of producing HR images.
Despite the widespread use of deep learning methods for image super-resolution,
they still have limitations when restoring high-frequency edge details in images contam-
inated with noise. A study [91] presented an edge-enhanced generative adversarial network
(EEGAN). EEGAN mainly consists of an ultra-dense subnetwork (UDSN) and an
edge-enhancement subnetwork (EESN), which together improve the robustness of
satellite image reconstruction.
Recently, Zhao et al. [92] presented an SR model called the second-order adversarial at-
tention generator network (SA-GAN), which is based on real-world remote sensing imagery.
The generator network of SA-GAN utilizes a second-order channel attention mechanism
and a region-level nonlocal module to effectively leverage the a priori knowledge in LR
images. In addition, SA-GAN employs region-aware loss to mitigate the generation of
artifacts. The region-aware strategy proposed by the SA-GAN model offers new insight into
addressing the artifact problem that frequently appears in GAN-based reconstruction of
remote sensing images.
Table 1. Commonly used natural and remotely sensed image datasets for super-resolution recon-
struction tasks.
The NWPU-RESISC45 dataset contains 31,500 optical remote sensing images with a pixel
size of 256 × 256. It covers 45 scene categories: airports, basketball courts, palaces, etc.
The RSC11 remote sensing image dataset [106] contains 11 categories, including dense
forests, grasslands, overpasses, and roads, with about 100 images in each group, giving a
total of 1232 images.
Besides the datasets mentioned in Table 1, Manga109 [113], OutdoorScene [114],
VOC2012 [115], and CelebA [116] can also be utilized for SR reconstruction.
Hyperspectral resolution remote sensing is a technique that involves continuously
capturing remote images of features using narrow and continuous spectral channels. Hy-
perspectral images possess a significant level of spectral resolution and encompass a
vast amount of valuable information, encompassing both radiometric and spatial aspects.
The following collection comprises multiple datasets consisting of hyperspectral remote
sensing images:
• Washington DC dataset [117]: The Washington DC data refer to an aerial hyperspectral
image acquired by the HYDICE sensor. The data size is 1208 × 307. Categories of
features include roofs, streets, graveled roads, grassy areas, etc.
• The Berlin–Urban–Gradient dataset [118] contains HyMap hyperspectral imagery at
different resolutions and simulated EnMAP hyperspectral imagery. The real HyMap
data contain 111 bands. The dataset with a spatial resolution of 3.6 m has dimensions
of 6895 × 1803, and the data with a spatial resolution of 9 m is 2722 × 732.
• Airborne hyperspectral datasets [119] contain 128 bands ranging from 343 to 1018
nanometers. There are 19 categories of features, all-encompassing in both urban and
rural areas.
(2) the evaluation process demands substantial labor and resources; it cannot be automated
and is inefficient. In contrast, image quality assessment is considered to be more objective.
Therefore, image quality evaluation is frequently utilized in practical applications.
Image quality evaluation metrics can reflect the reconstruction effect of the model.
In this section, we introduce some image quality evaluation methods.
The peak signal-to-noise ratio (PSNR) is defined through the mean squared error (MSE)
between the reference image and the distorted image:

MSE = (1/(mn)) ∑_{i=0}^{m−1} ∑_{j=0}^{n−1} [I(i, j) − K(i, j)]², (11)

PSNR = 10 · log10(MAX_I² / MSE), (12)
where I and K represent the reference and distorted images, respectively, both of size m × n.
MSE is the mean of the squared differences between corresponding pixels of the two images.
MAX_I is the maximum possible pixel value, typically 255 for 8-bit images. PSNR is a
quantitative measure of image quality based on the sensitivity to errors; it does not consider
the optical properties of the human eye, so the assessment results may differ from human
visual perception.
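A direct NumPy implementation of Equations (11) and (12), with toy 8-bit images standing in for the reference and distorted inputs:

```python
import numpy as np

def psnr(ref: np.ndarray, dist: np.ndarray, max_val: float = 255.0) -> float:
    """PSNR between a reference image I and a distorted image K, Equations (11)-(12)."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
dist = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
print(round(psnr(ref, dist), 2))
```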
The structural similarity (SSIM) index measures the similarity between two images x and y
in terms of luminance, contrast, and structure:

SSIM(x, y) = [l(x, y)]^α · [c(x, y)]^β · [s(x, y)]^γ, (13)

l(x, y) = (2 µx µy + c1) / (µx² + µy² + c1), (14)

c(x, y) = (2 σx σy + c2) / (σx² + σy² + c2), (15)

s(x, y) = (σxy + c3) / (σx σy + c3), (16)
where α, β, and γ are weighting parameters that represent the share of the three components
of the SSIM measure: luminance, contrast, and structure, respectively. l(x, y) is the luminance
comparison, c(x, y) is the contrast comparison, and s(x, y) is the structure comparison. µx and
µy represent the means of x and y, respectively. σx and σy represent the standard deviations
of x and y, respectively. σxy denotes the covariance between x and y. c1, c2, and c3 are
constants that prevent instability when the denominators approach zero.
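A sketch of Equations (13)–(16) computed from global image statistics with α = β = γ = 1 and the commonly used default constants c1 = (0.01·MAX)², c2 = (0.03·MAX)², c3 = c2/2; practical SSIM instead averages these terms over local sliding windows:

```python
import numpy as np

def ssim_global(x: np.ndarray, y: np.ndarray, max_val: float = 255.0) -> float:
    x, y = x.astype(np.float64), y.astype(np.float64)
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    c3 = c2 / 2.0
    mu_x, mu_y = x.mean(), y.mean()
    sigma_x, sigma_y = x.std(), y.std()
    sigma_xy = ((x - mu_x) * (y - mu_y)).mean()
    l = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)               # luminance, Eq. (14)
    c = (2 * sigma_x * sigma_y + c2) / (sigma_x ** 2 + sigma_y ** 2 + c2)   # contrast, Eq. (15)
    s = (sigma_xy + c3) / (sigma_x * sigma_y + c3)                          # structure, Eq. (16)
    return l * c * s                                                        # SSIM, Eq. (13)

a = np.random.randint(0, 256, (64, 64))
b = np.clip(a + np.random.normal(0, 10, a.shape), 0, 255)
print(round(ssim_global(a, b), 4))
```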
The mean opinion score (MOS) is a subjective evaluation metric obtained by averaging the
scores given by human raters:

MOS = (∑_{i=1}^{k} n_i c_i) / (∑_{i=1}^{k} n_i), (17)
where c_i denotes each type of score and n_i is the number of people giving that score. MOS is
affected by various factors, including emotions, motivations, and preferences, which makes it
difficult to guarantee truly equitable evaluation results.
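Equation (17) is a simple weighted average; for example, with hypothetical rating counts:

```python
def mos(counts_by_score):
    """Equation (17): average score, where counts_by_score maps a score c_i to n_i raters."""
    total = sum(counts_by_score.values())
    return sum(score * n for score, n in counts_by_score.items()) / total

# e.g. 3 raters gave a 5, 10 gave a 4, 4 gave a 3, and 1 gave a 2
print(round(mos({5: 3, 4: 10, 3: 4, 2: 1}), 2))   # 3.83
```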
In addition to the evaluation metrics mentioned above, there are many other evalua-
tion criteria [123–125], including learned perceptual image patch similarity (LPIPS) [126].
Perceptual loss is used to assess the dissimilarity between two images. The LPIPS value
decreases as the similarity between two images increases; conversely, the magnitude of
the difference increases as the LPIPS value increases. The natural image quality evaluator
(NIQE) [127] is an objective evaluation metric. It uses natural landscape elements as features
to evaluate test images and predicts their quality based on these “quality-aware” features.
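For reference, LPIPS can be computed with the third-party lpips package (an assumption here, not a tool stated in this review); it expects image tensors scaled to [−1, 1]:

```python
import torch
import lpips   # third-party package: pip install lpips

loss_fn = lpips.LPIPS(net="alex")            # AlexNet-based perceptual features
img0 = torch.rand(1, 3, 64, 64) * 2 - 1      # toy images in [-1, 1]
img1 = torch.rand(1, 3, 64, 64) * 2 - 1
print(loss_fn(img0, img1).item())            # lower value = perceptually more similar
```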
6.1. Comparison and Analysis of Remote Sensing Image Models Using the Same
Degradation Method
Since different reconstruction models have different methods of image degradation,
the initial step is to standardize variables and apply BSR degradation consistently. We
demonstrate the super-resolution reconstruction results of the five models using the RSC11
remote sensing dataset in Table 2. The analysis shows that the GAN reconstruction tech-
nique yields better image metrics than the bicubic method. The SRGAN model achieved
the best performance metrics among them. The effect is shown in Figures 15–17.
In Figures 15–17, (a) is the low-resolution image degraded by BSR and (b) is the original
high-resolution image. The results from (c) to (g) represent the reconstruction outputs of the
bicubic, SRGAN, ESRGAN, RankSRGAN, and BSRGAN models, respectively.
Table 2. Results of PSNR and SSIM for each model on each category of the RSC11 dataset.
As depicted in Figure 15, the BSRGAN model produces highly detailed reconstructions
that offer superior definition compared to the other four models. The images reconstructed
by the bicubic and SRGAN models are low-quality and blurred; these methods concentrate
solely on the scores of the evaluation indicators and disregard the realistic representation
of the image content.
(Figure 15: ×4 reconstruction results for the high buildings category of the RSC11 dataset: HR, bicubic, SRGAN, ESRGAN, RankSRGAN, and BSRGAN.)
Figure 16 displays an image selected from the low buildings category within the RSC11
dataset as a representative example. The bicubic reconstruction results are inferior: the
images appear faint and lack intricate detail. The SRGAN, ESRGAN,
and RankSRGAN algorithms offer more accurate information in the reconstructed results
compared to the bicubic algorithm. However, it is worth noting that some noise and
artifacts may be present around the edges. The BSRGAN model produces superior visual
outcomes, although it does exhibit some degree of excessive smoothing.
As illustrated in Figure 17, the reconstruction result of the bicubic method shows obvious
checkerboard artifacts. The color brightness of the reconstructed image
using BSRGAN is more similar to that of the actual HR image. The noise in the recon-
structed image is minimal, and the details within the graph are more distinct compared to
alternative algorithms.
6.2. Comparison and Analysis of Remote Sensing Image Models Using Different
Degradation Methods
Table 3 shows the reconstruction metrics achieved by the five models on the AID
dataset using various degradation techniques. The reconstruction algorithm based on GAN
outperforms the traditional bicubic algorithm in 31 categories of the AID dataset.
Table 3. Results of PSNR and SSIM for each model on each category of the AID dataset.
Three images were selected from the reconstruction results of the AID test dataset to
demonstrate the effect. For a better view of the reconstruction effect, we zoomed in on the
local details of the reconstructed image. The results are shown in Figures 18–20.
Figure 18 depicts the results of ×4 SR reconstruction on the AID dataset. It can be
seen from the local zoom that the ESRGAN model has the best reconstruction
effect compared to other models. The reconstruction generated by the BSRGAN model
seems overly polished and lacks substantial textural nuances.
As demonstrated in Figure 19, the bicubic, SRGAN, and RankSRGAN algorithms have
limitations in effectively processing noise, and the reconstructed images are blurry with
severe artifacts. The reconstructed image displays a degree of blurriness. The ESRGAN
algorithm enhances image reconstruction by providing vibrant colors, and the edge definition
is sharper and more closely matches the original image.
Figure 18. The outcomes of various super-resolution techniques in × 4 reconstructions for the
Beach group of the AID dataset. (a) Original figure, (b) HR, (c) bicubic [16], (d) SRGAN [54],
(e) ESRGAN [55], (f) RankSRGAN [63], (g) BSRGAN [50].
Figure 19. The outcomes of various super-resolution techniques in × 4 reconstructions for the
bridge group of the AID dataset. (a) Original figure, (b) HR, (c) bicubic [16], (d) SRGAN [54],
(e) ESRGAN [55], (f) RankSRGAN [63], (g) BSRGAN [50].
Figure 20 features an image representative of the Square category in the AID dataset.
By zooming in locally, we can observe that the ESRGAN model produces a reconstruc-
tion effect that closely resembles the original image. The edge texture of the road and
lawn is preserved. Ringing artifacts appear in the SRGAN model reconstruction results.
The reconstructed image produced by BSRGAN displays certain limitations in terms of
spatial details.
Figure 20. The outcomes of various super-resolution techniques in × 4 reconstructions for the
Square group of the AID dataset. (a) Original figure, (b) HR, (c) bicubic [16], (d) SRGAN [54],
(e) ESRGAN [55], (f) RankSRGAN [63], (g) BSRGAN [50].
adverse social consequences and decreased public confidence. Efforts should be undertaken
to establish and enforce measures aimed at mitigating the potential misuse of GANs for the
purpose of producing and distributing fraudulent visual content.
(3) Safety and security: The implementation of expansive network infrastructures
in pivotal sectors like healthcare or transportation necessitates the meticulous evaluation
of potential safety and security hazards. The potential for malicious manipulation of
GAN-generated images by actors with nefarious intentions is a matter of concern. Such ma-
nipulation has the potential to deceive or inflict harm upon the system in question. In order
to uphold the dependability and authenticity of reconstructed images, it is imperative to
incorporate robust security measures and rigorous testing protocols.
8. Conclusions
This paper provides an overview of the super-resolution image reconstruction tech-
nique that utilizes generative adversarial networks, along with its basic principles and
relevant studies. It includes frequently used datasets for both natural and remote sensing
images, metrics for evaluating the quality of reconstructed images, operational principles
of GAN networks, and commonly used loss functions, among others. In addition, this
study presents the reconstruction impacts of several models on both natural and remotely
sensed imagery. Despite the significant advances in image super-resolution techniques,
certain challenges still need to be addressed, particularly in relation to suboptimal re-
construction outcomes. In conclusion, we will provide a concise overview of upcoming
methodological trends and approaches. These may involve the development of image qual-
ity assessment metrics that are in line with human visual perception, as well as the creation
of enhanced super-resolution reconstruction models for improved efficiency. We aim to
deepen researchers' comprehension of GAN techniques for image SR reconstruction, with
particular emphasis on remote sensing images, and thus to promote further progress
and development in this field.
Author Contributions: Conceptualization, X.W. and L.S.; methodology, X.W. and L.S.; software,
X.W. and L.S.; validation, X.W., L.S., A.C. and Y.S.; formal analysis, X.W. and L.S.; investigation,
X.W. and L.S.; resources, X.W.; data curation, X.W. and L.S.; writing—original draft preparation,
X.W. and L.S.; writing—review and editing, A.C.; visualization, X.W., L.S., A.C. and Y.S.; project
administration, X.W.; funding acquisition, X.W. All authors have read and agreed to the published
version of the manuscript.
Funding: This research was funded by the Natural Science Foundation of Shandong Province
(ZR2022QF037, ZR2020QF108).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The datasets are available on Github at https://github.com/SunLijun0
1/datasets, accessed on 25 October 2023.
Acknowledgments: We would like to thank the anonymous reviewers for their supportive comments,
which improved our manuscript.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Harris, J.L. Diffraction and resolving power. J. Opt. Soc. Am. 1964, 54, 931–936. [CrossRef]
2. Wang, Z.; Chen, J.; Hoi, S.C. Deep learning for image super-resolution: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020,
43, 3365–3387. [CrossRef] [PubMed]
3. Greenspan, H. Super-resolution in medical imaging. Comput. J. 2009, 52, 43–63. [CrossRef]
4. Isaac, J.S.; Kulkarni, R. Super resolution techniques for medical image processing. In Proceedings of the 2015 International
Conference on Technologies for Sustainable Development (ICTSD), Mumbai, India, 4–6 February 2015; pp. 1–6.
5. Thornton, M.W.; Atkinson, P.M.; Holland, D. Sub-pixel mapping of rural land cover objects from fine spatial resolution satellite
sensor imagery using super-resolution pixel-swapping. Int. J. Remote Sens. 2006, 27, 473–491. [CrossRef]
6. Lei, S.; Shi, Z.; Zou, Z. Super-resolution for remote sensing images via local–global combined network. IEEE Geosci. Remote Sens.
Lett. 2017, 14, 1243–1247. [CrossRef]
7. Lucas, A.; Lopez-Tapia, S.; Molina, R.; Katsaggelos, A.K. Generative adversarial networks and perceptual losses for video
super-resolution. IEEE Trans. Image Process. 2019, 28, 3312–3327. [CrossRef]
8. Fessler, J.A. Model-based image reconstruction for MRI. IEEE Signal Process. Mag. 2010, 27, 81–89. [CrossRef]
9. Zhu, D.; Qiu, D. Residual dense network for medical magnetic resonance images super-resolution. Comput. Methods Progr.
Biomed. 2021, 209, 106330. [CrossRef]
10. Zhao, X.; Zhang, Y.; Zhang, T.; Zou, X. Channel splitting network for single MR image super-resolution. IEEE Trans. Image Process.
2019, 28, 5649–5662. [CrossRef]
11. Domínguez, C.; Heras, J.; Pascual, V. IJ-OpenCV: Combining ImageJ and OpenCV for processing images in biomedicine. Comput.
Biol. Med. 2017, 84, 189–194. [CrossRef]
12. Ševo, I.; Avramović, A. Convolutional neural network based automatic object detection on aerial images. IEEE Geosci. Remote
Sens. Lett. 2016, 13, 740–744. [CrossRef]
13. Zhang, J.; Shao, M.; Yu, L.; Li, Y. Image super-resolution reconstruction based on sparse representation and deep learning. Signal
Process. Image Commun. 2020, 87, 115925. [CrossRef]
14. Gilani, S.Z.; Mian, A.; Eastwood, P. Deep, dense and accurate 3D face correspondence for generating population specific
deformable models. Pattern Recognit. 2017, 69, 238–250. [CrossRef]
15. Yang, Y.; Bi, P.; Liu, Y. License plate image super-resolution based on convolutional neural network. In Proceedings of the 2018
IEEE 3rd International Conference on Image, Vision and Computing (ICIVC), Chongqing, China, 27–29 June 2018; pp. 723–727.
16. Keys, R. Cubic convolution interpolation for digital image processing. IEEE Trans. Acoust. Speech, Signal Process. 1981,
29, 1153–1160. [CrossRef]
17. Parker, J.A.; Kenyon, R.V.; Troxel, D.E. Comparison of interpolating methods for image resampling. IEEE Trans. Med. Imaging
1983, 2, 31–39. [CrossRef] [PubMed]
18. Mori, T.; Kameyama, K.; Ohmiya, Y.; Lee, J.; Toraichi, K. Image resolution conversion based on an edge-adaptive interpolation
kernel. In Proceedings of the 2007 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, Victoria,
BC, Canada, 22–24 August 2007; pp. 497–500.
19. Han, J.W.; Kim, J.H.; Sull, S.; Ko, S.J. New edge-adaptive image interpolation using anisotropic Gaussian filters. Digit. Signal
Process. 2013, 23, 110–117. [CrossRef]
20. Thévenaz, P.; Blu, T.; Unser, M. Image interpolation and resampling. In Handbook of Medical Imaging, Processing and Analysis;
Elsevier: Amsterdam, The Netherlands, 2000; Volume 1, pp. 393–420.
21. Irani, M.; Peleg, S. Improving resolution by image registration. CVGIP Graph. Model. Image Process. 1991, 53, 231–239. [CrossRef]
22. Yang, X.; Zhang, Y.; Zhou, D.; Yang, R. An improved iterative back projection algorithm based on ringing artifacts suppression.
Neurocomputing 2015, 162, 171–179. [CrossRef]
23. Tekalp, A.M.; Ozkan, M.K.; Sezan, M.I. High-resolution image reconstruction from lower-resolution image sequences and
space-varying image restoration. In Proceedings of the ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech,
and Signal Processing, San Francisco, CA, USA, 23–26 March 1992; Volume 3, pp. 169–172.
24. Patti, A.J.; Altunbasak, Y. Artifact reduction for set theoretic super resolution image reconstruction with edge adaptive constraints
and higher-order interpolants. IEEE Trans. Image Process. 2001, 10, 179–186. [CrossRef]
25. Wang, Z.; Liu, D.; Yang, J.; Han, W.; Huang, T. Deep networks for image super-resolution with sparse prior. In Proceedings of the
IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 370–378.
26. Yang, J.; Wright, J.; Huang, T.S.; Ma, Y. Image super-resolution via sparse representation. IEEE Trans. Image Process. 2010,
19, 2861–2873. [CrossRef]
27. Peleg, T.; Elad, M. A statistical prediction model based on sparse representations for single image super-resolution. IEEE Trans.
Image Process. 2014, 23, 2569–2582. [CrossRef]
28. Dong, W.; Zhang, L.; Shi, G.; Wu, X. Image deblurring and super-resolution by adaptive sparse domain selection and adaptive
regularization. IEEE Trans. Image Process. 2011, 20, 1838–1857. [CrossRef] [PubMed]
29. Baker, S.; Kanade, T. Limits on super-resolution and how to break them. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 1167–1183.
[CrossRef]
30. Arel, I.; Rose, D.C.; Karnowski, T.P. Deep machine learning-a new frontier in artificial intelligence research [research frontier].
IEEE Comput. Intell. Mag. 2010, 5, 13–18. [CrossRef]
31. Haut, J.M.; Fernandez-Beltran, R.; Paoletti, M.E.; Plaza, J.; Plaza, A.; Pla, F. A new deep generative network for unsupervised
remote sensing single-image super-resolution. IEEE Trans. Geosci. Remote Sens. 2018, 56, 6792–6810. [CrossRef]
32. Zhang, J.; Xu, T.; Li, J.; Jiang, S.; Zhang, Y. Single-Image Super Resolution of Remote Sensing Images with Real-World Degradation
Modeling. Remote Sens. 2022, 14, 2895. [CrossRef]
33. Arefin, M.R.; Michalski, V.; St-Charles, P.L.; Kalaitzis, A.; Kim, S.; Kahou, S.E.; Bengio, Y. Multi-image super-resolution for
remote sensing using deep recurrent networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 206–207.
34. Salvetti, F.; Mazzia, V.; Khaliq, A.; Chiaberge, M. Multi-image super resolution of remotely sensed images using residual attention
deep neural networks. Remote Sens. 2020, 12, 2207. [CrossRef]
35. Zhang, H.; Zhang, L.; Shen, H. A super-resolution reconstruction algorithm for hyperspectral images. Signal Process. 2012,
92, 2082–2096. [CrossRef]
36. Liebel, L.; Körner, M. Single-Image Super Resolution For Multispectral Remote Sensing Data Using Convolutional Neural
Networks. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 41, 883–890. [CrossRef]
37. Kim, J.; Lee, J.K.; Lee, K.M. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654.
38. Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144.
39. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial
networks. Commun. ACM 2020, 63, 139–144. [CrossRef]
40. Fudenberg, D.; Tirole, J. Game Theory; MIT Press: Cambridge, MA, USA, 1991.
41. Liang, J.; Wei, J.; Jiang, Z. Generative adversarial networks GAN overview. J. Front. Comput. Sci. Technol. 2020, 14, 1–17.
42. Tian, C.; Zhang, X.; Lin, J.C.W.; Zuo, W.; Zhang, Y.; Lin, C.W. Generative adversarial networks for image super-resolution: A
survey. arXiv 2022, arXiv:2204.13620.
43. Wang, K.; Gou, C.; Duan, Y.; Lin, Y.; Zheng, X.; Wang, F.Y. Generative adversarial networks: Introduction and outlook. IEEE/CAA
J. Autom. Sin. 2017, 4, 588–598. [CrossRef]
44. Johnson, J.; Alahi, A.; Li, F.-F. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the Computer
Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part II 14;
Springer: Berlin/Heidelberg, Germany, 2016; pp. 694–711.
45. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
46. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
47. Tai, Y.; Yang, J.; Liu, X.; Xu, C. Memnet: A persistent memory network for image restoration. In Proceedings of the IEEE
International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4539–4547.
48. Zhao, H.; Gallo, O.; Frosio, I.; Kautz, J. Loss functions for image restoration with neural networks. IEEE Trans. Comput. Imaging
2016, 3, 47–57. [CrossRef]
49. Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December
2015; pp. 1440–1448.
50. Zhang, K.; Liang, J.; Van Gool, L.; Timofte, R. Designing a practical degradation model for deep blind image super-resolution.
In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021;
pp. 4791–4800.
51. Hou, H.; Andrews, H. Cubic splines for image interpolation and digital filtering. IEEE Trans. Acoust. Speech, Signal Process. 1978,
26, 508–517.
52. Wang, X.; Xie, L.; Dong, C.; Shan, Y. Real-esrgan: Training real-world blind super-resolution with pure synthetic data. In
Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021;
pp. 1905–1914.
53. Dong, C.; Loy, C.C.; He, K.; Tang, X. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach.
Intell. 2015, 38, 295–307. [CrossRef]
54. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al.
Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
55. Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Qiao, Y.; Change Loy, C. Esrgan: Enhanced super-resolution generative
adversarial networks. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany,
8–14 September 2018.
56. Zhang, K.; Gool, L.V.; Timofte, R. Deep unfolding network for image super-resolution. In Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 3217–3226.
57. Zhang, M.; Ling, Q. Supervised pixel-wise GAN for face super-resolution. IEEE Trans. Multimed. 2020, 23, 1938–1950. [CrossRef]
58. Yuan, Y.; Liu, S.; Zhang, J.; Zhang, Y.; Dong, C.; Lin, L. Unsupervised image super-resolution using cycle-in-cycle generative
adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake
City, UT, USA, 18–22 June 2018; pp. 701–710.
59. Bell-Kligler, S.; Shocher, A.; Irani, M. Blind super-resolution kernel estimation using an internal-gan. Adv. Neural Inf. Process. Syst.
2019, 32.
60. Rakotonirina, N.C.; Rasoanaivo, A. ESRGAN+: Further improving enhanced super-resolution generative adversarial network. In
Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
Barcelona, Spain, 4–8 May 2020; pp. 3637–3641.
61. Cheng, W.; Zhao, M.; Ye, Z.; Gu, S. Mfagan: A compression framework for memory-efficient on-device super-resolution gan.
arXiv 2021, arXiv:2107.12679.
62. Shamsolmoali, P.; Zareapoor, M.; Wang, R.; Jain, D.K.; Yang, J. G-GANISR: Gradual generative adversarial network for image
super resolution. Neurocomputing 2019, 366, 140–153. [CrossRef]
63. Zhang, W.; Liu, Y.; Dong, C.; Qiao, Y. Ranksrgan: Generative adversarial networks with ranker for image super-resolution. In
Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November
2019; pp. 3096–3105.
64. Chan, K.C.; Wang, X.; Xu, X.; Gu, J.; Loy, C.C. Glean: Generative latent bank for large-factor image super-resolution. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021;
pp. 14245–14254.
65. Indradi, S.D.; Arifianto, A.; Ramadhani, K.N. Face image super-resolution using inception residual network and gan framework.
In Proceedings of the 2019 7th International Conference on Information and Communication Technology (ICoICT), Kuala Lumpur,
Malaysia, 24–26 July 2019; pp. 1–6.
66. Cai, J.; Han, H.; Shan, S.; Chen, X. FCSR-GAN: Joint face completion and super-resolution via multi-task learning. IEEE Trans.
Biom. Behav. Identity Sci. 2019, 2, 109–121. [CrossRef]
67. Ko, S.; Dai, B.R. Multi-laplacian GAN with edge enhancement for face super resolution. In Proceedings of the 2020 25th
International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 3505–3512.
68. Cao, M.; Liu, Z.; Huang, X.; Shen, Z. Research for face image super-resolution reconstruction based on wavelet transform and
SRGAN. In Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference
(IAEAC), Chongqing, China, 12–14 March 2021; Volume 5, pp. 448–451.
69. Wang, Y.; Hu, Y.; Yu, J.; Zhang, J. Gan prior based null-space learning for consistent super-resolution. In Proceedings of the AAAI
Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37; pp. 2724–2732.
70. Ma, J.; Yu, J.; Liu, S.; Chen, L.; Li, X.; Feng, J.; Chen, Z.; Zeng, S.; Liu, X.; Cheng, S. PathSRGAN: Multi-supervised super-resolution
for cytopathological images using generative adversarial network. IEEE Trans. Med. Imaging 2020, 39, 2920–2930. [CrossRef]
[PubMed]
71. Liu, A.; Liu, Y.; Gu, J.; Qiao, Y.; Dong, C. Blind image super-resolution: A survey and beyond. IEEE Trans. Pattern Anal. Mach.
Intell. 2022, 45, 5461–5480. [CrossRef]
72. Ren, H.; Kheradmand, A.; El-Khamy, M.; Wang, S.; Bai, D.; Lee, J. Real-world super-resolution using generative adversarial
networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA,
USA, 14–19 June 2020; pp. 436–437.
73. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232.
74. Bulat, A.; Yang, J.; Tzimiropoulos, G. To learn image super-resolution, use a gan to learn how to do image degradation first. In
Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 185–200.
75. Zhou, Y.; Deng, W.; Tong, T.; Gao, Q. Guided frequency separation network for real-world super-resolution. In Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 428–429.
76. Zhao, T.; Ren, W.; Zhang, C.; Ren, D.; Hu, Q. Unsupervised degradation learning for single image super-resolution. arXiv 2018,
arXiv:1812.04240.
77. Xu, J.; Feng, G.; Fan, B.; Yan, W.; Zhao, T.; Sun, X.; Zhu, M. Landcover classification of satellite images based on an adaptive
interval fuzzy c-means algorithm coupled with spatial information. Int. J. Remote Sens. 2020, 41, 2189–2208. [CrossRef]
78. Ma, W.; Pan, Z.; Yuan, F.; Lei, B. Super-resolution of remote sensing images via a dense residual generative adversarial network.
Remote Sens. 2019, 11, 2578. [CrossRef]
79. Wang, Z.; Li, L.; Xue, Y.; Jiang, C.; Wang, J.; Sun, K.; Ma, H. FeNet: Feature enhancement network for lightweight remote-sensing
image super-resolution. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12. [CrossRef]
80. Kang, X.; Li, J.; Duan, P.; Ma, F.; Li, S. Multilayer degradation representation-guided blind super-resolution for remote sensing
images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12. [CrossRef]
81. Wang, Y.; Bashir, S.M.A.; Khan, M.; Ullah, Q.; Wang, R.; Song, Y.; Guo, Z.; Niu, Y. Remote sensing image super-resolution and
object detection: Benchmark and state of the art. Expert Syst. Appl. 2022, 197, 116793. [CrossRef]
82. Xiong, Y.; Guo, S.; Chen, J.; Deng, X.; Sun, L.; Zheng, X.; Xu, W. Improved SRGAN for remote sensing image super-resolution
across locations and sensors. Remote Sens. 2020, 12, 1263. [CrossRef]
83. Xu, Y.; Luo, W.; Hu, A.; Xie, Z.; Xie, X.; Tao, L. TE-SAGAN: An improved generative adversarial network for remote sensing
super-resolution images. Remote Sens. 2022, 14, 2425. [CrossRef]
84. Guo, M.; Zhang, Z.; Liu, H.; Huang, Y. Ndsrgan: A novel dense generative adversarial network for real aerial imagery
super-resolution reconstruction. Remote Sens. 2022, 14, 1574. [CrossRef]
85. Ma, J.; Zhang, L.; Zhang, J. SD-GAN: Saliency-discriminated GAN for remote sensing image superresolution. IEEE Geosci. Remote
Sens. Lett. 2019, 17, 1973–1977. [CrossRef]
86. Gong, Y.; Liao, P.; Zhang, X.; Zhang, L.; Chen, G.; Zhu, K.; Tan, X.; Lv, Z. Enlighten-GAN for super resolution reconstruction in
mid-resolution remote sensing images. Remote Sens. 2021, 13, 1104. [CrossRef]
87. Dharejo, F.A.; Deeba, F.; Zhou, Y.; Das, B.; Jatoi, M.A.; Zawish, M.; Du, Y.; Wang, X. TWIST-GAN: Towards wavelet transform
and transferred GAN for spatio-temporal single image super resolution. ACM Trans. Intell. Syst. Technol. (TIST) 2021, 12, 1–20.
[CrossRef]
88. Wang, J.; Shao, Z.; Huang, X.; Lu, T.; Zhang, R.; Ma, J. Enhanced image prior for unsupervised remoting sensing super-resolution.
Neural Netw. 2021, 143, 400–412. [CrossRef]
89. Zhang, N.; Wang, Y.; Zhang, X.; Xu, D.; Wang, X. An unsupervised remote sensing single-image super-resolution method based
on generative adversarial network. IEEE Access 2020, 8, 29027–29039. [CrossRef]
90. Tu, J.; Mei, G.; Ma, Z.; Piccialli, F. SWCGAN: Generative adversarial network combining swin transformer and CNN for remote
sensing image super-resolution. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 5662–5673. [CrossRef]
91. Jiang, K.; Wang, Z.; Yi, P.; Wang, G.; Lu, T.; Jiang, J. Edge-enhanced GAN for remote sensing image superresolution. IEEE Trans.
Geosci. Remote Sens. 2019, 57, 5799–5812. [CrossRef]
92. Zhao, J.; Ma, Y.; Chen, F.; Shang, E.; Yao, W.; Zhang, S.; Yang, J. SA-GAN: A Second Order Attention Generator Adversarial
Network with Region Aware Strategy for Real Satellite Images Super Resolution Reconstruction. Remote Sens. 2023, 15, 1391.
[CrossRef]
93. Agustsson, E.; Timofte, R. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 126–135.
94. Timofte, R.; Agustsson, E.; Van Gool, L.; Yang, M.H.; Zhang, L. NTIRE 2017 challenge on single image super-resolution: Methods
and results. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA,
21–26 July 2017; pp. 114–125.
95. Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating
segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on
Computer Vision. ICCV 2001, Vancouver, BC, Canada, 7–14 July 2001; Volume 2, pp. 416–423.
96. Arbelaez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal.
Mach. Intell. 2010, 33, 898–916. [CrossRef]
97. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Li, F.-F. Imagenet: A large-scale hierarchical image database. In Proceedings of the
2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
98. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi-Morel, M.L. Low-complexity single-image super-resolution based on nonnega-
tive neighbor embedding. In Proceedings of the 23rd British Machine Vision Conference (BMVC), Surrey, UK, 3–7 September 2012.
99. Zeyde, R.; Elad, M.; Protter, M. On single image scale-up using sparse-representations. In Proceedings of the Curves and Surfaces:
7th International Conference, Avignon, France, 24–30 June 2010; pp. 711–730.
100. Huang, J.B.; Singh, A.; Ahuja, N. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5197–5206.
101. Cai, J.; Zeng, H.; Yong, H.; Cao, Z.; Zhang, L. Toward real-world single image super-resolution: A new benchmark and a new
model. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2
November 2019; pp. 3086–3095.
102. Xia, G.S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Zhang, L.; Lu, X. AID: A benchmark data set for performance evaluation of
aerial scene classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3965–3981. [CrossRef]
103. Dai, D.; Yang, W. Satellite image classification via two-layer sparse coding with biased image representation. IEEE Geosci. Remote
Sens. Lett. 2010, 8, 173–176. [CrossRef]
104. Cheng, G.; Han, J.; Lu, X. Remote sensing image scene classification: Benchmark and state of the art. Proc. IEEE 2017,
105, 1865–1883. [CrossRef]
105. Zhu, H.; Chen, X.; Dai, W.; Fu, K.; Ye, Q.; Jiao, J. Orientation robust object detection in aerial images using deep convolutional
neural network. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada,
27–30 September 2015; pp. 3735–3739.
106. Zhao, L.; Tang, P.; Huo, L. Feature significance-based multibag-of-visual-words model for remote sensing image scene classifica-
tion. J. Appl. Remote Sens. 2016, 10, 035004. [CrossRef]
107. Zou, Q.; Ni, L.; Zhang, T.; Wang, Q. Deep learning based feature selection for remote sensing scene classification. IEEE Geosci.
Remote Sens. Lett. 2015, 12, 2321–2325. [CrossRef]
108. Yang, Y.; Newsam, S. Bag-of-visual-words and spatial extensions for land-use classification. In Proceedings of the 18th
SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA, 2–5 November 2010;
pp. 270–279.
109. Zhu, Q.; Zhong, Y.; Zhao, B.; Xia, G.S.; Zhang, L. Bag-of-visual-words scene classifier with local and global features for high
spatial resolution remote sensing imagery. IEEE Geosci. Remote Sens. Lett. 2016, 13, 747–751. [CrossRef]
110. Yang, M.Y.; Liao, W.; Li, X.; Rosenhahn, B. Deep learning for vehicle detection in aerial images. In Proceedings of the 2018 25th
IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 3079–3083.
111. Li, K.; Wan, G.; Cheng, G.; Meng, L.; Han, J. Object detection in optical remote sensing images: A survey and a new benchmark.
ISPRS J. Photogramm. Remote Sens. 2020, 159, 296–307. [CrossRef]
112. Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A large-scale dataset for object
detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City,
UT, USA, 18–22 June 2018; pp. 3974–3983.
113. Fujimoto, A.; Ogawa, T.; Yamamoto, K.; Matsui, Y.; Yamasaki, T.; Aizawa, K. Manga109 dataset and creation of metadata. In
Proceedings of the 1st International Workshop on Comics Analysis, Processing and Understanding, Cancun, Mexico, 4 December
2016; pp. 1–5.
114. Wang, X.; Yu, K.; Dong, C.; Loy, C.C. Recovering realistic texture in image super-resolution by deep spatial feature transform. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018;
pp. 606–615.
115. Everingham, M.; Eslami, S.A.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The pascal visual object classes challenge: A
retrospective. Int. J. Comput. Vis. 2015, 111, 98–136. [CrossRef]
116. Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference
on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3730–3738.
117. Baumgardner, M.F.; Biehl, L.L.; Landgrebe, D.A. 220 band AVIRIS hyperspectral image data set: June 12, 1992 Indian Pine Test Site 3. Purdue Univ. Res. Repos. 2015, 10, 991.
118. Okujeni, A.; van der Linden, S.; Hostert, P. Berlin-urban-gradient dataset 2009—An EnMAP preparatory flight campaign. In
EnMAP Flight Campaigns Technical Report; GFZ Data Services: Potsdam, Germany, 2016, p. 9. [CrossRef]
119. Yokoya, N.; Iwasaki, A. Airborne Hyperspectral Data over Chikusei; Technical Report SAL-2016-05-27; Space Application Laboratory,
The University of Tokyo: Tokyo, Japan, 2016; Volume 5, p. 5.
120. Wang, X.; Yi, J.; Guo, J.; Song, Y.; Lyu, J.; Xu, J.; Yan, W.; Zhao, J.; Cai, Q.; Min, H. A review of image super-resolution approaches
based on deep learning and applications in remote sensing. Remote Sens. 2022, 14, 5423. [CrossRef]
121. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE
Trans. Image Process. 2004, 13, 600–612. [CrossRef]
122. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process.
2012, 21, 4695–4708. [CrossRef]
123. Fang, Y.; Zhang, C.; Yang, W.; Liu, J.; Guo, Z. Blind visual quality assessment for image super-resolution by convolutional neural
network. Multimed. Tools Appl. 2018, 77, 29829–29846. [CrossRef]
124. Jiang, Q.; Liu, Z.; Gu, K.; Shao, F.; Zhang, X.; Liu, H.; Lin, W. Single image super-resolution quality assessment: A real-world
dataset, subjective studies, and an objective metric. IEEE Trans. Image Process. 2022, 31, 2279–2294. [CrossRef] [PubMed]
125. Zhang, K.; Zhao, T.; Chen, W.; Niu, Y.; Hu, J. SPQE: Structure-and-Perception-Based Quality Evaluation for Image Super-
Resolution. arXiv 2022, arXiv:2205.03584.
126. Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018;
pp. 586–595.
127. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012,
20, 209–212. [CrossRef]
128. Yang, D.; Li, Z.; Xia, Y.; Chen, Z. Remote sensing image super-resolution: Challenges and approaches. In Proceedings of the 2015
IEEE International Conference on Digital Signal Processing (DSP), Singapore, 21–24 July 2015; pp. 196–200.
129. Cheng, J.; Kuang, Q.; Shen, C.; Liu, J.; Tan, X.; Liu, W. ResLap: Generating high-resolution climate prediction through image
super-resolution. IEEE Access 2020, 8, 39623–39634. [CrossRef]
130. Elfadaly, A.; Attia, W.; Lasaponara, R. Monitoring the environmental risks around Medinet Habu and Ramesseum Temple at
West Luxor, Egypt, using remote sensing and GIS techniques. J. Archaeol. Method Theory 2018, 25, 587–610. [CrossRef]
131. Tatem, A.J.; Lewis, H.G.; Atkinson, P.M.; Nixon, M.S. Super-resolution target identification from remotely sensed images using a
Hopfield neural network. IEEE Trans. Geosci. Remote Sens. 2001, 39, 781–796. [CrossRef]
132. Bai, Y.; Zhang, Y.; Ding, M.; Ghanem, B. SOD-MTGAN: Small object detection via multi-task generative adversarial network. In
Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 206–221.
133. Rabbi, J.; Ray, N.; Schubert, M.; Chowdhury, S.; Chao, D. Small-object detection in remote sensing images with end-to-end
edge-enhanced GAN and object detector network. Remote Sens. 2020, 12, 1432. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Generative adversarial networks (GANs) improve super-resolution through adversarial training: a generator network produces super-resolution images while a discriminator network judges them, and the two are optimized against each other as a two-player zero-sum game. This allows the generator to capture intricate image features, so the reconstructed images resemble genuine high-resolution images far more closely than those produced by interpolation-based methods such as bicubic upsampling, which cannot recover high-frequency detail and cope poorly with real-world degradation.
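To make the adversarial interplay concrete, the following is a minimal training-step sketch in PyTorch; `generator` and `discriminator` stand for any SR generator and discriminator rather than a specific published architecture, and the loss weighting is illustrative only.

```python
# Minimal sketch of one adversarial training step for GAN-based SR.
import torch
import torch.nn as nn

def train_step(generator, discriminator, opt_g, opt_d, lr_img, hr_img,
               pixel_loss=nn.L1Loss(), adv_loss=nn.BCEWithLogitsLoss()):
    # --- Discriminator update: real HR vs. generated SR ---
    sr_img = generator(lr_img).detach()
    d_real = discriminator(hr_img)
    d_fake = discriminator(sr_img)
    loss_d = adv_loss(d_real, torch.ones_like(d_real)) + \
             adv_loss(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # --- Generator update: fool the discriminator while staying close to HR ---
    sr_img = generator(lr_img)
    d_fake = discriminator(sr_img)
    loss_g = pixel_loss(sr_img, hr_img) + 1e-3 * adv_loss(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```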
Traditional super-resolution (SR) reconstruction methods such as interpolation are limited because they rely solely on the pixel information available in the low-resolution image, which yields blurred results: edges, textures, and other high-frequency regions are not handled well, and reconstruction accuracy suffers. Reconstruction-based approaches can sharpen details, but their performance drops as the scale factor increases, they converge slowly, and they are computationally expensive.
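As a point of reference, interpolation-based upscaling amounts to a single resampling call; the sketch below uses `torch.nn.functional.interpolate` in bicubic mode on a dummy tensor to illustrate the baseline that learning-based methods aim to surpass.

```python
# Interpolation-based x4 upscaling: uses only the LR pixel grid,
# so edges and textures remain blurred.
import torch
import torch.nn.functional as F

lr = torch.rand(1, 3, 64, 64)                     # dummy low-resolution image
sr_bicubic = F.interpolate(lr, scale_factor=4,
                           mode="bicubic", align_corners=False)
print(sr_bicubic.shape)                           # torch.Size([1, 3, 256, 256])
```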
KernelGAN brought a significant advance in blind super-resolution by estimating the degradation kernel internally, from the input image itself, without requiring explicit prior information. By generating realistic low-resolution counterparts for training, it is of practical value for SR reconstruction and is particularly effective for degradations that cannot be parameterized in advance.
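A simplified sketch of the KernelGAN idea is given below: the generator is a deep linear network (stacked convolutions without activations) that learns to downscale the input image, and, because it is linear, the learned blur kernel can be read off by passing a delta impulse through it. The layer sizes are illustrative, and the patch-discriminator training loop is omitted.

```python
# Conceptual sketch of KernelGAN-style internal kernel estimation.
import torch
import torch.nn as nn

class LinearDownscaler(nn.Module):
    def __init__(self, scale=2):
        super().__init__()
        self.scale = scale
        # Deep *linear* network: stacked convolutions, no activations, no bias.
        self.convs = nn.Sequential(
            nn.Conv2d(1, 64, 7, padding=3, bias=False),
            nn.Conv2d(64, 64, 5, padding=2, bias=False),
            nn.Conv2d(64, 1, 5, padding=2, bias=False),
        )
    def forward(self, x):
        x = self.convs(x)                             # learned blur
        return x[..., ::self.scale, ::self.scale]     # subsample by the scale factor

def extract_kernel(g, size=13):
    # Because the network is linear, its response to a delta impulse
    # is the effective blur kernel it has learned.
    delta = torch.zeros(1, 1, size, size)
    delta[0, 0, size // 2, size // 2] = 1.0
    with torch.no_grad():
        k = g.convs(delta)
    return k / k.sum()
```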
Image degradation is a critical aspect of super-resolution reconstruction because it determines the quality of the low-resolution images used for training and inference. Degradation arises from imperfections in the imaging system and is modeled to simulate realistic downsampling and noise effects. The conventional degradation model relies on bicubic downsampling, which often fails to reflect real-world degradation; more advanced models therefore introduce blurring, noise, and compression to represent it more faithfully.
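The classical degradation model can be written as y = (x * k)&#8595;s + n; the sketch below implements it with a Gaussian blur, subsampling by the scale factor, and additive noise. The kernel width, scale, and noise level are illustrative values, and JPEG compression would be applied on top in more elaborate models.

```python
# Sketch of the classical degradation model: blur, downsample, add noise.
import torch
import torch.nn.functional as F

def gaussian_kernel(size=21, sigma=2.0):
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    xx, yy = torch.meshgrid(ax, ax, indexing="ij")
    k = torch.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return (k / k.sum()).view(1, 1, size, size)

def degrade(hr, scale=4, sigma=2.0, noise_std=0.01):
    c = hr.shape[1]
    k = gaussian_kernel(sigma=sigma).repeat(c, 1, 1, 1)
    blurred = F.conv2d(hr, k, padding=k.shape[-1] // 2, groups=c)  # x * k
    lr = blurred[..., ::scale, ::scale]                            # downsample by s
    return lr + noise_std * torch.randn_like(lr)                   # + n
```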
The pooling-based decomposition (PD) method improves super-resolution reconstruction by raising the performance of state-of-the-art techniques and accelerating training convergence by a factor of 2–10. It also mitigates color inconsistency between the original and reconstructed images and improves the fidelity of edge and texture details in remote sensing images.
VDSR, EDSR, and SRCNN advanced super-resolution by using deep learning architectures to push image quality beyond traditional interpolation methods: VDSR adopts a very deep convolutional network, EDSR employs enhanced deep residual blocks, and SRCNN applies a shallow convolutional neural network to refine a bicubically upscaled input. However, their reconstructions tend to be over-smoothed and perceptually less convincing, and they do not reach the quality attainable with GAN-based frameworks.
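For orientation, the sketch below reproduces the commonly cited three-layer SRCNN pipeline (bicubic pre-upsampling followed by a 9-1-5 convolutional refinement with 64 and 32 channels); it is a conceptual rendering rather than the authors' released code.

```python
# Sketch of the three-layer SRCNN pipeline.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRCNN(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, 9, padding=4), nn.ReLU(inplace=True),  # patch extraction
            nn.Conv2d(64, 32, 1),                  nn.ReLU(inplace=True),  # non-linear mapping
            nn.Conv2d(32, channels, 5, padding=2),                         # reconstruction
        )
    def forward(self, lr, scale=3):
        # SRCNN refines a bicubically upscaled input rather than learning the upscaling itself.
        x = F.interpolate(lr, scale_factor=scale, mode="bicubic", align_corners=False)
        return self.body(x)
```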
Using multi-scale low-resolution inputs helps suppress artifacts in super-resolution reconstruction by integrating information across different scales, which improves detail fidelity in the reconstructed image. By combining details from multiple resolutions, artifacts can be reduced while the critical visual information in the image is retained.
GAN-based super-resolution models have had a marked impact on medical and remote sensing imaging by improving spatial resolution and perceptual quality. For instance, PathSRGAN provides multi-supervised SR that sharpens medical images, while relativistic average GANs support SR reconstruction in remote sensing applications. These advances also reduce computational and storage costs, making high-quality imaging more accessible.
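The relativistic average formulation can be summarized by the loss sketch below, in which the discriminator judges whether a real image is more realistic than the average generated image and vice versa; `c_real` and `c_fake` denote raw discriminator logits, and the weighting against pixel and perceptual losses is omitted.

```python
# Sketch of relativistic average GAN losses as used in ESRGAN-style training.
import torch
import torch.nn.functional as F

def ragan_d_loss(c_real, c_fake):
    real_rel = c_real - c_fake.mean()
    fake_rel = c_fake - c_real.mean()
    return (F.binary_cross_entropy_with_logits(real_rel, torch.ones_like(real_rel)) +
            F.binary_cross_entropy_with_logits(fake_rel, torch.zeros_like(fake_rel)))

def ragan_g_loss(c_real, c_fake):
    # The generator tries to reverse the relativistic ordering
    # (in practice c_real is detached for this step).
    real_rel = c_real - c_fake.mean()
    fake_rel = c_fake - c_real.mean()
    return (F.binary_cross_entropy_with_logits(fake_rel, torch.ones_like(fake_rel)) +
            F.binary_cross_entropy_with_logits(real_rel, torch.zeros_like(real_rel)))
```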
BSR-style degradation techniques improve degradation modeling by introducing a broad range of degradation effects, allowing real-world scenarios to be simulated more effectively. Randomly permuting the order of these operations further expands the degradation space, giving such models far greater flexibility in representing diverse degradation characteristics than the conventional bicubic model.
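A hedged sketch of such a shuffled degradation pipeline is shown below: a pool of degradation operators with randomly drawn parameters is applied in a random order, and the result is finally resized to the target LR resolution. The operator set and parameter ranges are illustrative rather than taken from a specific implementation.

```python
# Sketch of a shuffled (randomly ordered) degradation pipeline.
import random
import torch
import torch.nn.functional as F

def random_blur(x):
    # Stand-in for a random Gaussian/anisotropic blur (see the earlier sketch).
    k = random.choice([3, 5])
    return F.avg_pool2d(x, kernel_size=k, stride=1, padding=k // 2)

def random_noise(x):
    return x + random.uniform(0.0, 0.05) * torch.randn_like(x)

def random_resize(x):
    s = random.uniform(0.25, 0.75)
    return F.interpolate(x, scale_factor=s,
                         mode=random.choice(["bilinear", "bicubic"]),
                         align_corners=False)

def shuffled_degradation(hr, scale=4):
    ops = [random_blur, random_noise, random_resize]
    random.shuffle(ops)                    # random permutation of the degradation order
    x = hr
    for op in ops:
        x = op(x)
    h, w = hr.shape[-2:]                   # finally resize to the target LR resolution
    return F.interpolate(x, size=(h // scale, w // scale),
                         mode="bicubic", align_corners=False)
```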
Blind super-resolution reconstruction differs from traditional methods in that it does not assume a known degradation type; instead, the degradation is modeled implicitly or explicitly so that low-resolution images with unknown degradations can still be enhanced. This makes blind methods more adaptable to real-world scenarios in which the degradation varies or is unspecified, and hence better suited to unpredictable imaging conditions.