
School of Electrical Engineering and Computing

Department of Computer Science and Engineering


Postgraduate Weekend Program, First Year

Advanced Image Processing (CSEg 6110)


Article Review

Title: Compression Artifacts Reduction by a Deep Convolutional Network
Name                    ID Number
Nuredin Abdela          PGE/28278/15
Getahun Degefa          PGE/28253/15
Abiy Menberu            PGE/28247/15
Wendimu Getachew        PGE/28261/15

Submitted to: Dr. Worku J.


Date: 09/02/2023
Compression Artifacts Reduction by a Deep Convolutional Network
[Chao Dong, Yubin Deng, Chen Change Loy, and Xiaoou Tang]
Introduction
Lossy compression introduces complex compression artifacts, particularly blocking artifacts,
ringing effects and blurring. Existing algorithms either focus on removing blocking artifacts
and produce blurred output, or restore sharpened images that are accompanied by ringing
effects. Inspired by the success of deep convolutional networks (DCNs) in super-resolution [4], we
formulate a compact and efficient network for seamless attenuation of different compression
artifacts. We also demonstrate that a deeper model can be effectively trained with the features
learned in a shallow network. Following a similar "easy to hard" idea, we systematically
investigate several practical transfer settings and show the effectiveness of transfer learning in
low-level vision problems. Our method shows superior performance to state-of-the-art methods
both on benchmark datasets and in a real-world use case (i.e., Twitter). In addition, we show
that our method can be applied as pre-processing to facilitate other low-level vision routines
when they take compressed images as input.

Short Summary of the Work

Lossy compression (e.g., JPEG, WebP and HEVC-MSP) is one class of data encoding methods
that uses inexact approximations to represent the encoded content. In this age of information
explosion, lossy compression is indispensable and inevitable for companies (e.g., Twitter and
Facebook) to save bandwidth and storage space. However, compression by its nature introduces
undesired complex artifacts, which severely degrade the user experience (e.g., Figure 1(b)).
These artifacts not only decrease perceptual visual quality, but also adversely affect various
low-level image processing routines that take compressed images as input, e.g., contrast
enhancement [14], super-resolution [28, 4], and edge detection [2]. Despite this huge demand,
effective compression artifacts reduction remains an open problem.
We take JPEG compression as an example to explain compression artifacts. The JPEG
compression scheme divides an image into 8×8 pixel blocks and applies a block discrete cosine
transform (DCT) on each block individually. Quantization is then applied to the DCT
coefficients to save storage space. This step causes a complex combination of different artifacts,
as depicted in Figure 1(a). Blocking artifacts arise because each block is encoded without
considering its correlation with adjacent blocks, resulting in discontinuities at the 8×8 borders.
Ringing effects along the edges occur due to the coarse quantization of the high-frequency
components (also known as the Gibbs phenomenon [8]). Blurring happens due to the loss of
high-frequency components. To cope with the various compression artifacts, different approaches
have been proposed, some of which can only deal with certain types of artifacts. For instance,
deblocking-oriented approaches [16, 19, 24] perform filtering along the block boundaries to
reduce only blocking artifacts. Liew et al. [15] and Foi et al. [5] use thresholding by wavelet
transform and Shape-Adaptive DCT transform, respectively. These approaches are good at
removing blocking and ringing artifacts, but tend to produce blurred output. Jung et al. [12]
propose a restoration method based on sparse representation. It produces sharpened images,
but they are accompanied by noisy edges and unnaturally smooth regions.
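
As a concrete illustration of this pipeline, the following Python sketch (our own illustrative
example, using a single uniform quantization step rather than the standard JPEG luminance
table) reproduces the block-DCT, quantization and reconstruction steps; running it on any
grayscale image makes the blocking, ringing and blurring artifacts described above visible:

import numpy as np
from scipy.fft import dctn, idctn

def jpeg_like(image, q_step=40.0, block=8):
    """Block DCT + coarse quantization + inverse DCT, as in JPEG.

    Coding each 8x8 block independently causes discontinuities at the
    block borders (blocking artifacts), while coarse quantization
    discards high-frequency coefficients (blurring and ringing)."""
    h, w = image.shape
    out = np.zeros((h, w))
    for i in range(0, h - h % block, block):
        for j in range(0, w - w % block, block):
            patch = image[i:i + block, j:j + block].astype(np.float64)
            coeffs = dctn(patch, norm='ortho')           # per-block 2-D DCT
            coeffs = np.round(coeffs / q_step) * q_step  # quantize coefficients
            out[i:i + block, j:j + block] = idctn(coeffs, norm='ortho')
    return np.clip(out, 0, 255)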
To date, deep learning has shown impressive results on both high-level and low-level vision
problems. In particular, the SRCNN proposed by Dong et al. [4] shows the great potential of an
end-to-end DCN in image super-resolution. The study also points out that the conventional
sparse-coding-based image restoration model can equally be seen as a deep model. However, we
find that the three-layer network is not well suited to restoring compressed images, especially in
dealing with blocking artifacts and handling smooth regions. As various artifacts are coupled
together, the features extracted by the first layer are noisy, causing undesirable noisy patterns
in the reconstruction.
To eliminate the undesired artifacts, we improve the SRCNN by embedding one or more "feature
enhancement" layers after the first layer to clean the noisy features. Experiments show that the
improved model, namely the "Artifacts Reduction Convolutional Neural Network (AR-CNN)", is
exceptionally effective in suppressing blocking artifacts while retaining edge patterns and sharp
details (see Figure 1(b)). However, we encounter difficulties when training a deeper
DCN. "Deeper is better" is widely observed in high-level vision problems, but not in low-level
vision tasks. Specifically, "deeper is not better" has been pointed out in super-resolution [3],
where training a five-layer network becomes a bottleneck. The difficulty of training is partially
due to sub-optimal initialization settings.
The aforementioned difficulty motivates us to investigate a better way to train a deeper model for
low-level vision problems. We find that this can be effectively solved by transferring the features
learned in a shallow network to a deeper one and fine-tuning simultaneously. This strategy has
also been proven successful in learning a deeper CNN for image classification [22]. Following the
same general intuition, easy to hard, we discover other interesting transfer settings in this
low-level vision task: (1) we transfer the features learned in a high-quality compression model
(easier) to a low-quality one (harder), and find that it converges faster than random initialization;
(2) in the real use case, companies tend to apply different compression strategies (including
re-scaling) according to their purposes (e.g., Figure 1(b)). We transfer the features learned in a
standard compression model (easier) to a real use case (harder), and find that it performs better
than learning from scratch.
The contributions of this study are three-fold: (1) we formulate a new deep convolutional
network for the efficient reduction of various compression artifacts. Extensive experiments,
including those on real use cases, demonstrate the effectiveness of our method over state-of-the-art
methods [5, 11] both perceptually and quantitatively. (2) We verify that reusing the features of
shallow networks is helpful in learning a deeper model for compression artifact reduction. Under
the same intuitive idea, easy to hard, we reveal a number of interesting and practical transfer
settings. Our study is the first attempt to show the effectiveness of feature transfer in a low-level
vision problem. (3) We show the effectiveness of AR-CNN in facilitating other low-level vision
routines (i.e., super-resolution and contrast enhancement) when they take JPEG images as input.
The network consists of four convolutional layers, each of which is responsible for a specific
operation. It then optimizes the four operations (i.e., feature extraction, feature enhancement,
mapping and reconstruction) jointly in an end-to-end framework. Example feature maps shown
at each step illustrate the functionality of each operation; they are normalized for better
visualization.

RELATED WORK

Existing algorithms can be classified into deblocking-oriented and restoration-oriented methods.
The deblocking-oriented methods focus on removing blocking and ringing artifacts. In the spatial
domain, different kinds of filters [16, 19, 24] have been proposed to adaptively deal with
blocking artifacts in specific regions (e.g., edge, texture, and smooth regions). In the frequency
domain, Liew et al. [15] utilize the wavelet transform and derive thresholds at different wavelet
scales for denoising. The most successful deblocking-oriented method is perhaps the Pointwise
Shape-Adaptive DCT (SA-DCT) [5], which is widely acknowledged as the state-of-the-art
approach [11, 14]. However, like most deblocking-oriented methods, SA-DCT cannot reproduce
sharp edges and tends to overly smooth texture regions. The restoration-oriented methods regard
the compression operation as distortion and propose restoration algorithms. They include the
projection onto convex sets based method (POCS) [30], solving a MAP problem (FoE) [23], the
sparse-coding-based method [12] and the Regression Tree Fields based method (RTF) [11],
which is the newest state-of-the-art method. The RTF takes the results of SA-DCT [5] as bases and
produces globally consistent image reconstructions with a regression tree field model. It can
also be optimized for any differentiable loss function (e.g., SSIM), but often at the cost of other
evaluation metrics.
The Super-Resolution Convolutional Neural Network (SRCNN) [4] is closely related to our work. In
that study, the independent steps of the sparse-coding-based method are formulated as different
convolutional layers and optimized in a unified network. It shows the potential of deep models in
low-level vision problems like super-resolution. However, compression differs from
super-resolution in that it introduces several different kinds of artifacts. Designing a deep model
for compression restoration requires a deep understanding of these different artifacts. We show
that directly applying the SRCNN architecture to compression restoration results in
undesired noisy patterns in the reconstructed image.
Transfer learning in deep neural networks has become popular since the success of deep learning in
image classification [13]. The features learned from ImageNet show good generalization
ability [33] and have become a powerful tool for several high-level vision problems, such as Pascal
VOC image classification [18] and object detection [6, 20]. Yosinski et al. [32] have also tried to
quantify the degree to which a particular layer is general or specific. Overall, transfer learning
has been systematically investigated in high-level vision problems, but not in low-level vision
tasks. In this study, we explore several transfer settings for compression artifacts reduction and
show the effectiveness of transfer learning in low-level vision problems.

METHODOLOGY

Our proposed approach is based on the recently successful low-level vision model SRCNN [4].
To have a better understanding of our work, we first give a brief overview of SRCNN. Then we
explain the insights that lead to a deeper network and present our new model. Subsequently, we
explore three types of transfer learning strategies that help in training a deeper and better
network.
REVIEW OF SRCNN
The SRCNN aims at learning an end-to-end mapping, which takes the low-resolution
image Y (after interpolation) as input and directly outputs the high-resolution one F(Y). The
network contains three convolutional layers, each of which is responsible for a specific task.
Specifically, the first layer performs patch extraction and representation, which extracts
overlapping patches from the input image and represents each patch as a high-dimensional
vector. Then the non-linear mapping layer maps each high-dimensional vector of the first layer
to another high-dimensional vector, which is conceptually the representation of a high-resolution
patch. Finally, the reconstruction layer aggregates the patch-wise representations to generate the
final output. The network can be expressed as:

Fi(Y) = max(0, Wi ∗ Fi−1(Y) + Bi),  i ∈ {1, 2},  where F0(Y) = Y;   (1)

F(Y) = W3 ∗ F2(Y) + B3.   (2)
where Wi and Bi represent the filters and biases of the i-th layer respectively, Fi is the output
feature map and '∗' denotes the convolution operation. Wi contains ni filters of
support ni−1 × fi × fi, where fi is the spatial support of a filter, ni is the number of filters, and n0 is
the number of channels in the input image. Note that there are no pooling or fully-connected layers
in SRCNN, so the final output F(Y) is of the same size as the input image. The Rectified Linear Unit
(ReLU, max(0, x)) [17] is applied to the filter responses.

These three steps are analogous to the basic operations in sparse-coding-based super-resolution
methods [29], and this close relationship lays the theoretical foundation for its successful
application in super-resolution. Details can be found in the paper [4].
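
As a sketch of this formulation, the following is our own minimal PyTorch rendering of
Eqns. (1)–(2), with filter sizes f1 = 9, f2 = 1, f3 = 5 and widths n1 = 64, n2 = 32 as used later
in Section 4.1.2; the "same" padding is our convenience choice, since the original network uses
no padding and crops the borders instead:

import torch.nn as nn

class SRCNN(nn.Module):
    """Three layers: patch extraction, non-linear mapping, reconstruction."""
    def __init__(self, f1=9, f2=1, f3=5, n1=64, n2=32, channels=1):
        super().__init__()
        self.extract = nn.Conv2d(channels, n1, f1, padding=f1 // 2)      # Eqn. (1), i = 1
        self.map = nn.Conv2d(n1, n2, f2, padding=f2 // 2)                # Eqn. (1), i = 2
        self.reconstruct = nn.Conv2d(n2, channels, f3, padding=f3 // 2)  # Eqn. (2)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, y):
        f1 = self.relu(self.extract(y))   # F1(Y)
        f2 = self.relu(self.map(f1))      # F2(Y)
        return self.reconstruct(f2)       # F(Y): no ReLU on the output layer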

CONVOLUTIONAL NEURAL NETWORK FOR COMPRESSION ARTIFACTS REDUCTION
Insights: In sparse-coding-based methods and SRCNN, the first step – feature extraction –
determines what should be emphasized and restored in the following stages. However, as various
compression artifacts are coupled together, the extracted features are usually noisy and
ambiguous for accurate mapping. In the experiments on reducing JPEG compression artifacts
(see Section 4.1.2), we find that some quantization noise coupled with high-frequency details is
further enhanced, bringing unexpected noisy patterns around sharp edges. Moreover,
blocking artifacts in flat areas are misrecognized as normal edges, causing abrupt intensity
changes in smooth regions. Inspired by the feature enhancement step in super-resolution [27], we
introduce a feature enhancement layer after the feature extraction layer in SRCNN to form a new
and deeper network – AR-CNN. This layer maps the "noisy" features to a relatively "cleaner"
feature space, which is equivalent to denoising the feature maps.

Formulation: The overview of the new network AR-CNN is shown in Figure 2. The three layers
of SRCNN remain unchanged in the new model. We also use the same annotations as in
Section 3.1. To conduct feature enhancement, we extract new features from the n1 feature maps
of the first layer, and combine them to form another set of feature maps. This operation F1′ can
also be formulated as a convolutional layer:

F1′(Y) = max(0, W1′ ∗ F1(Y) + B1′),   (3)
where W1′ corresponds to n1′ filters with size n1×f1′×f1′. B1′ is an n1′-dimensional bias vector,
and the output F1′(Y) consists of n1′ feature maps. Overall, the AR-CNN consists of four layers,
namely the feature extraction, feature enhancement, mapping and reconstruction layer.

It is worth noting that AR-CNN is not equivalent to a deeper SRCNN containing more than one
non-linear mapping layer. Rather than imposing more non-linearity in the mapping stage, AR-CNN
improves mapping accuracy by enhancing the extracted low-level features.
Experimental results for AR-CNN, SRCNN and the deeper SRCNN are shown in Section 4.1.2.
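
A corresponding sketch of AR-CNN follows (again our own hedged PyTorch rendering, with the
9-7-1-5 settings of Section 4.1.1 and the same padding caveat as for the SRCNN sketch above):

import torch.nn as nn

class ARCNN(nn.Module):
    """Four layers: extraction, feature enhancement, mapping, reconstruction."""
    def __init__(self, channels=1):
        super().__init__()
        self.extract = nn.Conv2d(channels, 64, 9, padding=4)      # n1 = 64, f1 = 9
        self.enhance = nn.Conv2d(64, 32, 7, padding=3)            # Eqn. (3): n1' = 32, f1' = 7
        self.map = nn.Conv2d(32, 16, 1)                           # n2 = 16, f2 = 1
        self.reconstruct = nn.Conv2d(16, channels, 5, padding=2)  # n3 = 1, f3 = 5
        self.relu = nn.ReLU(inplace=True)

    def forward(self, y):
        x = self.relu(self.extract(y))  # noisy low-level features F1(Y)
        x = self.relu(self.enhance(x))  # "cleaner" feature space F1'(Y)
        x = self.relu(self.map(x))
        return self.reconstruct(x)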

MODEL LEARNING
Given a set of ground truth images {Xi} and their corresponding compressed images {Yi}, we
use Mean Squared Error (MSE) as the loss function:

L(Θ) = (1/n) ∑ᵢ₌₁ⁿ ||F(Yi; Θ) − Xi||²,   (4)
where Θ = {W1, W1′, W2, W3, B1, B1′, B2, B3} and n is the number of training samples. The loss is
minimized using stochastic gradient descent with the standard backpropagation. We adopt a
batch-mode learning method with a batch size of 128.
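
A minimal training-loop sketch of this procedure (assuming paired tensors of compressed inputs
and ground-truth sub-images; the epoch count and plain SGD settings here are illustrative):

import torch
from torch.utils.data import DataLoader, TensorDataset

def train(model, compressed, ground_truth, epochs=1, lr=1e-4):
    """Minimize the MSE loss of Eqn. (4) with stochastic gradient descent."""
    loader = DataLoader(TensorDataset(compressed, ground_truth),
                        batch_size=128, shuffle=True)  # batch-mode, batch size 128
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = torch.nn.MSELoss()
    for _ in range(epochs):
        for y, x in loader:
            optimizer.zero_grad()
            loss = criterion(model(y), x)  # (1/n) Σ ||F(Yi; Θ) − Xi||²
            loss.backward()                # standard backpropagation
            optimizer.step()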

EASY-HARD TRANSFER
[Figure 3: Easy-hard transfer settings. First row: the baseline four-layer network trained on
dataA-qA. Second row: the five-layer AR-CNN targeted at dataA-qA. Third row: the AR-CNN
targeted at dataA-qB. Fourth row: the AR-CNN targeted at Twitter data. Green boxes indicate
features transferred from the base network; gray boxes represent random initialization. The
ellipsoidal bars between weight vectors represent the activation functions.]

Transfer learning in deep models provides an effective way of initialization. In fact, conventional
initialization strategies (i.e., weights randomly drawn from Gaussian distributions with fixed
standard deviations [13]) are found to be unsuitable for training a very deep model, as reported
in [9]. To address this issue, He et al. [9] derive a robust initialization method for rectifier
nonlinearities, while Simonyan et al. [22] propose using the pre-trained features of a shallow
network for initialization.

In low-level vision problems (e.g., super-resolution), it is observed that training a network beyond
four layers encounters convergence problems, even when a large number of training
images (e.g., from ImageNet) is provided [4]. We also met this difficulty during the training
of AR-CNN. To this end, we systematically investigate several transfer settings for
training a low-level vision network, following the intuitive idea of "easy-hard transfer".
Specifically, we attempt to reuse the features learned in a relatively easier task to initialize a
deeper or harder network. Interestingly, the concept of "easy-hard transfer" has already been
pointed out in neuro-computational studies [7], where prior training on an easy discrimination
can help in learning a second, harder one.
Formally, we define the base (or source) task as A and the target tasks as Bi, i ∈ {1, 2, 3}. As
shown in Figure 3, the base network baseA is a four-layer AR-CNN trained on a large
dataset dataA, whose images are compressed using a standard compression scheme with
compression quality qA. All layers in baseA are randomly initialized from a Gaussian
distribution. We transfer one or two layers of baseA to different target tasks (see Figure 3).
These transfers can be described as follows.

Transfer shallow to deeper model. As indicated by [3], a five-layer network is sensitive to the
initialization parameters and learning rate. Thus we transfer the first two layers of baseA to a
five-layer network targetB1. We then randomly initialize its remaining layers and train all
layers on the same dataset dataA. This is conceptually similar to the approach applied in image
classification [22], but it has not previously been validated in low-level vision problems.
Transfer high to low quality. Images of low compression quality contain more complex
artifacts. Here we use the features learned from high-compression-quality images as a starting
point to help learn more complicated features in the DCN. Specifically, the first layer
of targetB2 is copied from baseA, and the network is trained on images compressed with a lower
compression quality qB.
Transfer standard to real use case. We then explore whether the features learned under a
standard compression scheme can be generalized to other real use cases, which often contain
more complex artifacts due to different levels of re-scaling and compression. We transfer the
first layer of baseA to the network targetB3, and train all layers on the new dataset.
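
The three settings differ only in which layers are copied. A sketch of the copying step follows
(using the layer names from our ARCNN sketch above; which names exist depends on the target
architecture):

def transfer_layers(base, target, layer_names):
    """Easy-hard transfer: copy the named layers from a trained base network
    into a target network; all other target layers keep their random
    Gaussian initialization, and every layer is then fine-tuned jointly."""
    base_state = base.state_dict()
    target_state = target.state_dict()
    for name in layer_names:
        target_state[name + '.weight'] = base_state[name + '.weight'].clone()
        target_state[name + '.bias'] = base_state[name + '.bias'].clone()
    target.load_state_dict(target_state)

# transfer_layers(baseA, targetB1, ['extract', 'enhance'])  # shallow -> deeper
# transfer_layers(baseA, targetB2, ['extract'])             # high -> low quality
# transfer_layers(baseA, targetB3, ['extract'])             # standard -> Twitter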
Discussion: Why are the features learned from relatively easy tasks helpful? First, the features
from a well-trained network provide a good starting point; the rest of the deeper model
can then be regarded as a shallow one, which is easier to converge. Second, features learned in
different tasks always have a lot in common. For instance, Figure 4(b) shows the features learned
under different JPEG compression qualities. Clearly, filters a, b, c of the high-quality model are
very similar to filters a′, b′, c′ of the low-quality one. Such features can be reused or improved
during fine-tuning, making convergence faster and more stable. Furthermore, a deep network for
a hard problem can be seen as an insufficiently biased learner with an overly large hypothesis
space to search, and is therefore prone to overfitting. The transfer settings we investigate
introduce a good bias that enables the learner to acquire a concept with greater generality.
Experimental results in Section 4.2 validate the above analysis.

EXPERIMENTS
We use the BSDS500 database [1] as our base training set. Specifically, its disjoint training set
(200 images) and test set (200 images) are both used for training, and its validation set (100
images) is used for validation. As in other compression artifacts reduction methods (e.g.,
RTF [11]), we apply the standard JPEG compression scheme and use the JPEG quality
settings q = 20 (mid quality) and q = 10 (low quality) in the MATLAB JPEG encoder. We focus
only on the restoration of the luminance channel (in YCbCr space) in this paper.
The training image pairs {Y, X} are prepared as follows: images in the training set are
decomposed into 32×32 sub-images X = {Xi}, i = 1, …, n. The compressed
samples Y = {Yi} are then generated from the training samples with the MATLAB JPEG
encoder [11]. The sub-images are extracted from the ground truth images with a stride of 10.
Thus the 400 training images provide 537,600 training samples. To avoid the border
effects caused by convolution, AR-CNN produces a 20×20 output given a 32×32 input Yi.
Hence, the loss (Eqn. (4)) is computed by comparing against the central 20×20 pixels of the
ground truth sub-image Xi. In the training phase, we follow [10, 4] and use a smaller learning
rate (10⁻⁵) in the last layer and a comparably larger one (10⁻⁴) in the remaining layers.
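
A sketch of this data preparation and of the per-layer learning rates (again assuming the ARCNN
module sketched earlier; the cropping and optimizer grouping below follow the description above):

import numpy as np
import torch

def extract_subimages(image, size=32, stride=10):
    """Decompose a ground-truth image into overlapping 32x32 sub-images."""
    h, w = image.shape
    return np.stack([image[i:i + size, j:j + size]
                     for i in range(0, h - size + 1, stride)
                     for j in range(0, w - size + 1, stride)])

def center_crop(batch, out=20):
    """Central 20x20 region of an (N, C, 32, 32) batch, used in Eqn. (4)."""
    _, _, h, w = batch.shape
    top, left = (h - out) // 2, (w - out) // 2
    return batch[:, :, top:top + out, left:left + out]

model = ARCNN()  # from the sketch in Section 3.2
optimizer = torch.optim.SGD([
    {'params': model.reconstruct.parameters(), 'lr': 1e-5},  # last layer: smaller rate
    {'params': [p for n, p in model.named_parameters()
                if not n.startswith('reconstruct')], 'lr': 1e-4},
])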

COMPARISON WITH THE STATE OF THE ART


We use the LIVE1 dataset [21] (29 images) as the test set to evaluate both quantitative and
qualitative performance. The LIVE1 dataset contains images with diverse properties. It is widely
used in image quality assessment [25] as well as in super-resolution [28]. To have a
comprehensive quantitative evaluation, we apply PSNR, structural similarity (SSIM) [25],
and PSNR-B [31] for quality assessment. We want to emphasize the use of PSNR-B: it is
designed specifically to assess blocky and deblocked images, and is thus more sensitive to blocking
artifacts than the perceptually-aware SSIM index. The network settings
are f1 = 9, f1′ = 7, f2 = 1, f3 = 5, n1 = 64, n1′ = 32, n2 = 16 and n3 = 1, denoted as AR-CNN (9-7-1-5) or
simply AR-CNN. A specific network is trained for each JPEG quality. Parameters are randomly
initialized from a Gaussian distribution with a standard deviation of 0.001.
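
PSNR and SSIM can be computed with standard tools, as in the sketch below (scikit-image names
assumed; PSNR-B needs a dedicated implementation of its blocking-effect term and is omitted here):

from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(restored, ground_truth):
    """PSNR and SSIM on the luminance channel; both arrays in [0, 255]."""
    psnr = peak_signal_noise_ratio(ground_truth, restored, data_range=255)
    ssim = structural_similarity(ground_truth, restored, data_range=255)
    return psnr, ssim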
Comparison with SA-DCT

We first compare AR-CNN with SA-DCT [5], which is widely regarded as the state-of-the-art
deblocking-oriented method [11, 14]. The quantitative results for PSNR, SSIM and PSNR-B are
shown in Table 1. On the whole, our AR-CNN outperforms SA-DCT on all JPEG qualities
and evaluation metrics by a large margin. Note that the gains on PSNR-B are much larger than
those on PSNR, which indicates that AR-CNN produces images with fewer blocking artifacts.
To compare the visual quality, we present some restored images with q = 10 in Figure 10,
from which we can see that AR-CNN produces much sharper edges with far less blocking and
ringing artifacts than SA-DCT. The visual quality is largely improved in all aspects compared
with the state-of-the-art method. Furthermore, AR-CNN is superior to SA-DCT in
implementation speed: SA-DCT needs 3.4 seconds to process a 256×256 image, while AR-CNN
takes only 0.5 seconds. Both are implemented in C++ on a PC with an Intel i3 CPU (3.1 GHz)
and 16 GB RAM.
Comparison with SRCNN

As discussed in Section 3.2, SRCNN is not suitable for compression artifacts reduction. For
comparison, we train two SRCNN networks with different settings: (i) the original SRCNN
(9-1-5) with f1 = 9, f3 = 5, n1 = 64 and n2 = 32; (ii) a deeper SRCNN (9-1-1-5) with an additional
non-linear mapping layer (f2′ = 1, n2′ = 16). Both use the BSDS500 dataset for training and
validation as in Section 4. The compression quality is q = 10. The AR-CNN is the same as in
Section 4.1.1.
Quantitative results on the LIVE1 dataset are shown in Table 2. The two
SRCNN networks are inferior on all evaluation metrics. From the convergence curves shown in
Figure 5, it is clear that AR-CNN achieves a higher PSNR from the beginning of the learning
stage. Furthermore, from the restored images in Figure 11, we find that both
SRCNN networks produce images with noisy edges and unnaturally smooth regions. These
results confirm our statements in Section 3.2. In short, successfully training a deep model
requires a comprehensive understanding of the problem and a careful design of the model
structure.
Comparison with RTF

RTF [11] is the recent state-of-the-art restoration-oriented method. Without its deblocking
code, we can only compare with the released deblocking results. The RTF model is trained on the
training set (200 images) of the BSDS500 dataset, but all images are down-scaled by a factor of
0.5 [11]. For a fair comparison, we also train new AR-CNN networks on the same half-sized
200 images. Testing is performed on the test set of the BSDS500 dataset (images scaled by a
factor of 0.5), which is also consistent with [11]. We compare with two RTF variants. One is the
plain RTF, which uses the filter bank and is optimized for PSNR. The other is RTF+SA-DCT,
which includes SA-DCT as a base method and is optimized for MAE; the latter achieves the
highest PSNR value among all RTF variants [11].
As shown in Table 3, we obtain performance superior to the plain RTF, and even better
performance than the combination of RTF and SA-DCT, especially under the more
representative PSNR-B metric. Note that training on such a small dataset largely restricts the
ability of AR-CNN; its performance will improve further given more training images.

EXPERIMENTS ON EASY-HARD TRANSFER


We show the experimental results of different "easy-hard transfer" settings, whose details
are given in Table 4. Taking the base network as an example, base-q10 is a four-layer AR-CNN
(9-7-1-5) trained on the BSDS500 [1] dataset (400 images) under compression
quality q = 10. Parameters are initialized by randomly drawing from a Gaussian distribution with
zero mean and standard deviation 0.001. Figures 6–8 show the convergence curves on the
validation set.
Transfer shallow to deeper model

In Table 4, we denote the deeper (five-layer) AR-CNN as "9-7-3-1-5", which contains another
feature enhancement layer (f1′′ = 3 and n1′′ = 16). Results in Figure 6 show that the transferred
features from a four-layer network enable us to train a five-layer network successfully. Note that
directly training a five-layer network with conventional initialization is unreliable.
Specifically, we exhaustively tried different groups of learning rates, but still did not
observe convergence. Furthermore, the "transfer deeper" setting converges faster and achieves
better performance than using He et al.'s method [9], which is also very effective in training deep
models. We also conducted comparative experiments with the structure "9-7-1-1-5" and
observed the same trend.
Transfer high to low quality

Results are shown in Figure 7. Clearly, the two networks with transferred features converge
faster than the one trained from scratch. For example, to reach an average PSNR of 27.77 dB, the
"transfer 1 layer" setting takes only 1.54×10⁸ backpropagations, roughly half of that required for
"base-q10". Moreover, "transfer 1 layer" also outperforms "base-q10" by a slight margin
throughout the training phase. One reason for this is that initializing only the first layer provides
the network with more flexibility in adapting to a new dataset. This also indicates that a good
starting point can help train a better network with a higher convergence speed.
Transfer standard to real use case – Twitter

Online social media like Twitter are popular platforms for message posting.


However, Twitter compresses uploaded images on the server side. For instance, a typical
8-megapixel (MP) image (3264×2448) results in a compressed and re-scaled version with a
fixed resolution of 600×450. Such re-scaling and compression introduce very complex
artifacts, making restoration difficult for existing deblocking algorithms (e.g., SA-DCT).
However, AR-CNN can adapt to the new data easily. Further, we want to show that features
learned under standard compression schemes can also facilitate training on a completely
different dataset. We use 40 photos of resolution 3264×2448 taken by mobile phones (335,209
training sub-images in total) and their Twitter-compressed versions to train three networks with
the initialization settings listed in Table 4.

From Figure 8, we observe that the "transfer q10" and "transfer q20" networks converge much
faster than the "base-Twitter" network trained from scratch. Specifically, "transfer q10"
takes 6×10⁷ backpropagations to achieve 25.1 dB, while "base-Twitter" uses 10×10⁷.
Besides fast convergence, the transferred features also lead to higher PSNR values than
"base-Twitter". This observation suggests that features learned under standard compression
schemes are also transferable to real use cases. Some restoration results are
shown in Figure 12; both networks achieve satisfactory quality improvements over the
compressed version.

APPLICATION
In real applications, many image processing routines are affected when they take JPEG
images as input: blocking artifacts can be either super-resolved or enhanced, causing a
significant performance decrease. In this section, we show the potential of AR-CNN in
facilitating other low-level vision studies, i.e., super-resolution and contrast enhancement. To
illustrate this, we use SRCNN [4] for super-resolution and tone-curve adjustment [14] for
contrast enhancement [2], and show example results when the input is a JPEG image, an SA-DCT
deblocked image, and an AR-CNN restored image. From the results shown in Figure 9, we can see
that JPEG compression artifacts greatly distort the visual quality in super-resolution and
contrast enhancement. With the help of AR-CNN, however, these effects are largely
eliminated. Moreover, AR-CNN achieves much better results than SA-DCT, and the differences
between them become more evident after this low-level vision processing.
CONCLUSION
Applying deep models to low-level vision problems requires a deep understanding of the problem
itself. In this paper, we carefully study the compression process and propose a four-layer
convolutional network, AR-CNN, which is extremely effective in dealing with various
compression artifacts. We further systematically investigate several easy-to-hard transfer settings
that can facilitate training a deeper or better network, and verify the effectiveness of transfer
learning in low-level vision problems. As discussed in SRCNN [4], we find that larger filter sizes
also help improve performance; we leave this to future work.
References
[1] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image
segmentation. TPAMI, 33(5):898–916, 2011.
[2] M. Bevilacqua, A. Roumy, C. Guillemot, and M.-L. A. Morel. Low-complexity single-image
super-resolution based on nonnegative neighbor embedding. In BMVC, 2012.
[3] P. Dollár and C. L. Zitnick. Structured forests for fast edge detection. In ICCV, pages
1841–1848. IEEE, 2013.
[4] C. Dong, C. C. Loy, K. He, and X. Tang. Image super-resolution using deep convolutional
networks. arXiv:1501.00092, 2014.
[5] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image
super-resolution. In ECCV, pages 184–199, 2014.
[6] A. Foi, V. Katkovnik, and K. Egiazarian. Pointwise shape-adaptive DCT for high-quality
denoising and deblocking of grayscale and color images. TIP, 16(5):1395–1411, 2007.
[7] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object
detection and semantic segmentation. In CVPR, pages 580–587. IEEE, 2014.
[8] M. A. Gluck and C. E. Myers. Hippocampal mediation of stimulus representation: A
computational theory. Hippocampus, 3(4):491–516, 1993.
[9] R. C. Gonzalez and R. E. Woods. Digital Image Processing, 2002.
[10] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level
performance on ImageNet classification. arXiv:1502.01852, 2015.
[11] V. Jain and S. Seung. Natural image denoising with convolutional networks. In NIPS, pages
769–776, 2009.
[12] J. Jancsary, S. Nowozin, and C. Rother. Loss-specific training of non-parametric image
restoration models: A new state of the art. In ECCV, pages 112–125, 2012.
[13] C. Jung, L. Jiao, H. Qi, and T. Sun. Image deblocking via sparse representation. Signal
Processing: Image Communication, 27(6):663–677, 2012.
[14] K. I. Kim and Y. Kwon. Single-image super-resolution using sparse regression and natural
image prior. TPAMI, 32(6):1127–1133, 2010.
[15] A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep
convolutional neural networks. In NIPS, pages 1097–1105, 2012.
[16] Y. Li, F. Guo, R. T. Tan, and M. S. Brown. A contrast enhancement framework with JPEG
artifacts suppression. In ECCV, pages 174–188, 2014.
