CONTRAST ENHANCEMENT FRAMEWORK FOR SUPPRESSING JPEG ARTIFACTS
GUO FANGFANG
(B.Sc., Shandong University, China, 2012)
A THESIS SUBMITTED
FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF SINGAPORE
© 2014, GUO Fangfang
Declaration
I hereby declare that this thesis is my original work and that it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously.
Acknowledgements
First, my sincere gratitude goes to my advisor, Dr. Michael S. Brown, for his patient guidance during my M.Sc. candidature. He pointed me in the right direction, and I am thankful for all of his encouraging advice, shared experience, and technical support, from which I have benefited greatly. To complete this thesis, my colleague in the Computer Vision Lab, Li Yu, has helped me a lot. I truly appreciate his valuable suggestions, patient explanations, and contributions to my research work. It has always been my pleasure to collaborate with these brilliant people and to work on exciting research topics together.
I also feel thankful to all my colleagues in the lab, with whom I have happily spent two years during my M.Sc. candidature.
Finally, I would like to thank my parents and my friends for their support.
Contents

List of Tables
List of Figures

1 Introduction
1.1 Motivation
1.2 Research Problem Statement
1.3 Overview

2 Background and Preliminaries
2.1 Related Concepts
2.2 Image Contrast Enhancement
2.2.1 Global Contrast Enhancement
2.2.2 Local Contrast Enhancement
2.3 JPEG Artifacts Reduction
2.3.1 Deblocking
2.3.2 Deringing
2.4 Summary

3 Proposed Method
3.1 Framework
3.2 Structure-Texture Decomposition
3.3 Reducing Artifacts in the Texture Layer
3.3.1 Scene Detail Extraction
3.3.2 Block Artifacts Reduction
3.4 Layer Recomposition
3.5 Summary

4 Evaluation
4.1 Experiment Design
4.2 Experimental Results
4.2.1 Tone curve adjustment
4.2.2 Dehazing and Underwater Image Enhancement
4.3 Discussion and Analysis

5 Challenges and Future Work
5.1 Challenges
5.2 Future Work
5.2.1 Complete Previous Work
5.2.2 Extension to Generic Compression Scheme
Summary
Contrast enhancement is a common tool in computer vision. It can be applied either globally, such as tone-curve adjustment, or locally, such as dehazing. State-of-the-art algorithms show their limitations on compressed input images, because compression artifacts that are not evident in the original images are enhanced at the same time. This thesis sets out to solve this problem, starting by examining current compression artifacts reduction algorithms on contrast-enhanced images. The initial results show that these algorithms need to be improved for this case. The problem here is that the imperceivable compression artifacts in low-contrast images are unintentionally boosted when we try to enhance only the image's appearance. The challenge is that the two tasks we want to achieve are functionally opposite. On one hand, we aim to enhance the contrast of the image content as much as possible. On the other hand, within the same image, we have to suppress the compression artifacts and noise, whether in the contrast enhancement step or in the artifacts reduction step.
To deal with this problem, we propose an image-decomposition-based framework to suppress the artifacts in JPEG images that become prominently visible when contrast enhancement is applied. While the proposed framework is admittedly engineering in nature, our strategy of using structure and texture layer decomposition enables us to process the two layers independently of each other, and to reduce the compression artifacts in parallel with contrast enhancement.
Experiments show that our integrated framework produces compelling results compared with generic deblocking algorithms applied sequentially with a contrast enhancement procedure.
List of Tables

4.1 Average runtime comparison
4.2 Quantitative comparison
List of Figures

1.1 Traditional contrast enhancement methods on low-quality JPEG images
3.1 Pipeline of the proposed method
3.2 Illustration of image decomposition
3.3 The scene details map generation
3.4 The effect of deblocking on the texture layer
4.1 The comparison of existing algorithms and our method on tone-curve adjustment (case 1)
4.2 The comparison of existing algorithms and our method on tone-curve adjustment (case 2)
4.3 The results of our method on images at different compression levels
4.4 More results of our method
4.5 Comparison of existing dehazing methods
Chapter 1
Introduction
1.1 Motivation
Contrast enhancement is frequently referred to as one of the most important issues in low-level computer vision. The purpose of image enhancement is to improve the interpretation or perception of the information contained in an image for human viewers, or to provide a better input for other automated image processing systems, such as color segmentation, edge detection, image sharpening, etc. Contrast enhancement can be applied either explicitly or implicitly. We can boost an image's global contrast explicitly through histogram equalization, tone-curve adjustment, or gradient-based enhancement. Moreover, improving the visibility of images degraded by the environment, such as haze, fog, rain, and underwater conditions, is in nature a spatially varying, implicit contrast enhancement.
While contrast enhancement successfully boosts the content in the image, it unintentionally boosts unsightly image artifacts as well, especially artifacts due to JPEG compression. In reality, in order to improve the speed of image transmission over the Internet, compression schemes are usually applied to high-quality images to reduce their file size. Moreover, the images and videos from surveillance cameras are often compressed significantly. When we want to enhance the contrast of these low-quality images, the results of current state-of-the-art algorithms are far from visually pleasing due to the obvious boosted artifacts, as can be seen in Figure 1.1.
There are many well-known algorithms dealing with JPEG artifacts, such as the re-application of JPEG [26], Fields of Experts [35], shape-adaptive DCT deblocking and denoising [10], and learning-based image denoising [2]. The intuitive idea is to apply a JPEG artifacts reduction algorithm before or after the contrast enhancement operation. However, after trying these denoising methods as pre- or post-processing, we found that they do not produce visually pleasant results. When applied before the enhancement process, the algorithms may over-smooth the texture of the image. When applied as a post-processing procedure, the algorithms cannot effectively remove the boosted artifacts.
Figure 1.1: The results of traditional contrast enhancement methods on low-quality images (original vs. enhanced, for tone-curve adjustment and dehazing).
1.2 Research Problem Statement
Our research goal is to deal with the JPEG artifacts that are boosted during the contrast enhancement procedure. Given a low-quality image, how can we produce an enhanced result free of noticeable JPEG artifacts? On one hand, we want to enhance the contrast of the content in the image as much as possible. On the other hand, to avoid the block and ringing artifacts being boosted, we have to suppress the artifacts to the maximum extent. Based on this objective, we state several research problems in this thesis.
Firstly, we propose to decompose the image into a structure layer and a texture layer so that the two layers can be processed individually. In our case, we can perform contrast enhancement on the image content to get the expected enhanced results, and perform the reduction of JPEG artifacts on the texture layer to suppress artifacts. Recomposition of the two processed layers can achieve the aforementioned objective.
Secondly, it is very important to combine the two layers properly to get the final result. The naive solution of directly adding the two processed layers together would introduce strong ringing artifacts. Another problem is that the block artifacts in the homogeneous regions and the image details are difficult to differentiate within the same image. If we deblock too much, the block artifacts will be removed as expected, but the details are over-smoothed at the same time. If we deblock insufficiently, the block artifacts will not be removed effectively. To prevent such degradation problems, we need to compute a mask which separates the main objects from the homogeneous regions that should be omitted when added back to the final result.
1.3 Overview
Chapter 2
Background and Preliminaries
2.1 Related Concepts
• Image Contrast Enhancement
Image contrast enhancement is a process that changes the pixel intensities of the input image so that the output image looks subjectively better [12]. This procedure can be performed either explicitly or implicitly. The common explicit methods for improving contrast in digital images are histogram equalization (HE) and tone curves; these kinds of methods are spatially invariant contrast enhancements. Examples of the latter, implicit kind are dehazing and underwater image enhancement. Due to the substantial presence of particles in the atmosphere that absorb and scatter light, the degradation of the contrast of the haze-free image mainly depends on the distance of the objects to the camera, which tends to be smooth. This kind of contrast enhancement is therefore spatially varying. Explicit methods remain widely used because of their simplicity and great performance on high-quality images that are free of obvious JPEG artifacts.
• JPEG Compression Scheme
A brief overview of lossy image compression and the JPEG standard is given here. The purpose of image compression is to represent images with as little data as possible in order to save storage costs or transmission time. Lossy image compression removes the high-frequency (noise-like) details that the human eye typically does not notice. The lossy compression methods commonly used are Fourier-related transform coding, such as the discrete cosine transform (DCT, used in JPEG, MPEG-1, MPEG-2, H.261, H.263 and their descendants), and the wavelet transform (used in JPEG 2000). The degree of compression can be adjusted, allowing a selectable trade-off between storage size and image quality. A detailed discussion of the theory behind quantization and the justification for the use of linear transforms can be found in [6].
The JPEG encoder partitions the image into 8 × 8 blocks of pixels. To each block, it first applies a 2-dimensional DCT individually, followed by element-wise quantization of the DCT coefficients by an 8 × 8 quantization matrix. The high-frequency image content can be quantized more coarsely than the low-frequency content, since a smaller amount of energy is packed into the high-frequency bands of most natural images. The human visual system is also less sensitive to quantization loss in the high-frequency bands.
The quantized DCT coefficients form a matrix that is usually sparse, i.e., there are many zeros in it, especially in the high-frequency bands. The elements of the matrix are first ordered in a zig-zag scan. An entropy coder, combined with run-length coding of the zeros, then generates an efficient representation of the quantized coefficients (in terms of bitrate) to be transmitted or stored.
The quantization matrix determines the quality and the amount of compression of the image. The lower the quality setting, the greater the divisors (the elements of the quantization matrix), and thus the greater the chance of a zero result. Conversely, if we want high-quality images, we need to set all the values in the quantization table close to 1, so that almost all of the original DCT data is preserved (for more details, see [42]).
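To make the encoder's core step concrete, here is a minimal sketch that DCT-transforms and quantizes a single 8 × 8 block using the standard JPEG luminance quantization table; the test block and the SciPy usage are our own illustrative choices, and a real encoder additionally performs chroma subsampling, zig-zag ordering, and entropy coding:

import numpy as np
from scipy.fft import dctn, idctn

# Standard JPEG luminance quantization table (quality 50).
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99]])

def quantize_block(block, q_table=Q50):
    # Level shift, 2-D DCT, then element-wise quantization by the table.
    coeffs = dctn(block.astype(np.float64) - 128.0, norm='ortho')
    return np.round(coeffs / q_table)

def dequantize_block(quantized, q_table=Q50):
    # Invert quantization and the DCT; the rounding loss is irreversible.
    return idctn(quantized * q_table, norm='ortho') + 128.0

block = np.tile(np.linspace(0, 255, 8), (8, 1))   # a smooth test block
q = quantize_block(block)
print(np.count_nonzero(q), 'non-zero coefficients out of 64')

On a smooth block, almost all quantized coefficients are zero, which is exactly the sparsity that the zig-zag scan and entropy coder exploit.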
• JPEG artifacts
The nature of the JPEG compression scheme causes high-frequency information to be discarded. This inevitably introduces artifacts into the compressed image. At low bitrates, decoded images exhibit block artifacts, which are discontinuities at the boundaries between blocks, and ringing artifacts, which appear around sharp edges of the original image [16]. These are referred to as JPEG artifacts in the computer vision area. JPEG artifacts affect many other tasks, such as image segmentation, color segmentation, edge detection, image contrast enhancement, and image sharpening. Therefore, the reduction of JPEG artifacts is a popular and challenging topic in computer vision.
2.2 Image Contrast Enhancement
2.2.1 Global Contrast Enhancement

• Histogram equalization (HE)
The operation of HE remaps the gray levels of the image based on the probability distribution of the input gray levels, aiming for an image with pixels equally distributed over every gray level, i.e., an "equal histogram". It flattens and stretches the dynamic range of the image's histogram, resulting in overall contrast enhancement [14]. Through this adjustment, the intensities are better distributed over the histogram. The mean brightness of the output image is always at or close to the middle of the range, regardless of the mean of the input image. In addition, histogram equalization is a powerful tool for highlighting the borders and edges between different objects, but it may reduce local details within the objects.

However, histogram equalization is usually applied to grayscale images. For color images, applying this operation to the RGB channels separately may yield dramatic changes in the image's color balance. To achieve better results, we can either convert the image to another color space (e.g., HSL/HSV) and apply histogram equalization to the luminance channel, or apply recently proposed histogram equalization methods in 3D space.
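A minimal sketch of the luminance-channel approach just described, using OpenCV (the file paths are placeholders; we assume an 8-bit BGR image as loaded by cv2.imread):

import cv2

img = cv2.imread('input.jpg')                    # 8-bit BGR image (placeholder path)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
hsv[:, :, 2] = cv2.equalizeHist(hsv[:, :, 2])    # equalize only the luminance (V) channel
result = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
cv2.imwrite('equalized.jpg', result)

Equalizing only the V channel leaves hue and saturation untouched, which avoids the color shifts that per-channel RGB equalization can introduce.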
• Explicit tone curve adjustment
Applying a function f explicitly to the original pixel intensity values, i.e., I′ = f(I), is known as tone curve adjustment. The function f can be defined in different ways according to the content of the input image and the effect desired after enhancement; commonly used choices are contrast stretching and gamma curve mapping. Its simplicity and visually pleasing results make tone curve adjustment a fundamental and conventional tool in the low-level computer vision area.
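For instance, a gamma curve is one common choice of f; a minimal sketch on normalized intensities (gamma = 0.5 is illustrative and brightens dark regions):

import numpy as np

def tone_curve(img, gamma=0.5):
    # I' = f(I) with f(t) = t^gamma applied to intensities normalized to [0, 1].
    normalized = img.astype(np.float64) / 255.0
    return np.uint8(np.clip(normalized ** gamma, 0.0, 1.0) * 255.0)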
Although histogram equalization is essentially an implicit tone curve, it is more complicated to apply to color images and is less commonly used in practice than explicit tone curves. We therefore choose to focus on the problems of explicit tone curve operations on low-quality JPEG images in this thesis.
2.2.2 Local Contrast Enhancement

• Dehazing
Poor visibility has always been a major problem in computer vision. For example, outdoor images are usually degraded by the weather due to the substantial presence of particles in the atmosphere, such as haze, fog, rain, etc., which absorb and scatter both the light from the atmosphere and the light reflected from objects. The performance of high-level vision algorithms (e.g., automatic surveillance systems, intelligent vehicles, and object recognition) is inevitably affected by poor visibility in bad weather. In computer vision, the haze imaging process is modelled as a linear combination of the scene's direct attenuation and the global atmospheric light, as follows [8, 23, 36]:
I(x) = J(x)t(x) + A(1 − t(x)), (2.1)
where I is the observed intensity, J is the scene radiance, A is the global atmospheric light, t is the medium transmission, which depends on the depth of the scene object, and x is the 2D spatial location of a pixel in the image.
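A quick numerical sketch of Eqn. 2.1 (our illustration): given known t and A, the model can be applied and inverted directly; the hard part of dehazing, estimating t and A from a single image, is discussed below.

import numpy as np

def apply_haze(J, t, A):
    # I(x) = J(x)t(x) + A(1 - t(x)); t is (H, W), broadcast over color channels.
    return J * t[..., None] + A * (1.0 - t[..., None])

def invert_haze(I, t, A, t_min=0.1):
    # J(x) = (I(x) - A) / max(t(x), t_min) + A; t_min avoids division blow-up.
    return (I - A) / np.maximum(t, t_min)[..., None] + A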
Dehazing (recovering J(x) from I(x)) is a challenging problem, because the haze depends on the unknown depth information, and the problem is ill-posed due to the inherent airlight-albedo ambiguity if the input is only a single image. Therefore, many methods exploiting two or more images or additional information have been proposed. The works in [33, 32] exploit two or more images of the same scene that have different degrees of polarization (DOP). In [23, 24, 25], more constraints are exploited from different weather conditions of the same scene. These methods suffer from the requirement and the availability of multiple input images.
Recently, single-image dehazing algorithms [8, 22, 13, 15, 36, 37] have made significant progress. These methods succeed by adopting strong priors or assumptions about haze-free natural images. Tan [36] observes that haze-free images have higher contrast than the input hazy image. He removes the haze by applying a Markov Random Field (MRF) to maximize the local contrast of each small 5 × 5 or 8 × 8 patch while regulating the smoothness between adjacent pixels. Fattal [8] refines the imaging model by accounting for surface shading and scene transmission. This is followed by the estimation of airlight and scene albedo based on the assumption that the surface shading and medium transmission functions are locally statistically uncorrelated. Nishino et al. [15] model the image with a factorial MRF in which scene albedo and depth are two statistically independent latent layers. They integrate natural image statistical priors into the MRF energy functions and estimate each layer alternately, with the other layer hidden, using the canonical expectation-maximization algorithm. He [13] proposes a novel prior that, in a haze-free image, the lowest intensity over the three color channels tends to be zero in most local patches, referred to as the "dark channel prior". Using the dark channel prior, they can estimate the airlight and the transmission map, and furthermore recover the scene radiance.
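As a sketch of the dark channel computation underlying [13] (the patch size 15 and omega factor are typical choices from that paper; intensities are assumed normalized to [0, 1]):

import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    # dark(x) = min over a local patch of the per-pixel minimum over RGB.
    return minimum_filter(img.min(axis=2), size=patch)

def estimate_transmission(I, A, omega=0.95, patch=15):
    # t(x) = 1 - omega * dark_channel(I / A); omega keeps a trace of haze.
    return 1.0 - omega * dark_channel(I / A, patch)

In [13], A itself is read from the brightest input pixels among the top fraction of the dark channel, and the coarse t is then refined by soft matting.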
In [22], a boundary constraint on the transmission function and a weighted L1-norm based contextual regularization are combined into an optimization problem to estimate the unknown transmission map.
• Underwater image enhancement
The underwater imaging process is similar to imaging in haze, also suffering from poor visibility caused by light that is reflected from surfaces and scattered by the suspended particles in the water. The resultant image looks bluish due to the varying degrees of light attenuation at different wavelengths [5]. To remove the light scattering distortion, [31] exploits polarization effects to compensate for visibility degradation, [1] applies a fusion technique to enhance underwater images, and [4] uses image dehazing algorithms to restore the contrast of underwater images. The simplest way to correct the color deviation in underwater images is white balancing [3, 9, 38]. In this thesis, we white balance the underwater images before processing them.
2.3 JPEG Artifacts Reduction
2.3.1 Deblocking
The JPEG compression standards are widely used because of their excellent energy compaction capability. However, at low bitrates, due to the lack of correlation between blocks in the JPEG encoding scheme, annoying block artifacts appear and have a great impact on image quality. Therefore, deblocking is in great demand in the image processing community. Deblocking methods can be classified into four types: in-loop filtering, pre-processing, overlapped block methods, and post-processing. In-loop filtering adds the deblocking filter into the coding loop, which is adopted in H.264/AVC and H.263+. Pre-processing [39] before coding can also make the reconstructed image closer to the original image. The block artifacts are due to the individual and separate operation on each non-overlapping block; in the Lapped Orthogonal Transform [21], the blocks overlap slightly to avoid the discontinuities appearing at the boundaries between blocks. Post-processing algorithms apply low-pass filters and other image processing algorithms to the decoded images in the spatial, gradient, or DCT domain. This is the most popular approach because it needs neither the original uncompressed images, which are usually not available in reality, nor any modification of the compression standards. Thus, we focus on the post-processing algorithms.
An intuitive and widely used principle is low-pass filtering of images in the spatial domain [46]. To overcome excessive smoothing due to the filter, Foi proposed a local shape-adaptive filter [10] which can remove the block edges while preserving the content edges. In [20], blind measurement and removal of block artifacts in the DCT domain were proposed. They modelled the blocking artifacts as 2-D step functions in a shifted block composed of half of each of the two adjacent blocks. Using the estimate of the block artifacts, they replaced the original 2-D step function with a new linear function; the DCT transforms of two adjacent blocks are then updated using the new DCT of the shifted block. Another category of post-processing deblocking models a prior of natural images in an energy function. The work in [35] incorporated the Fields of Experts prior [29] into a high-order MRF to model the natural image and solved it by the maximum a posteriori criterion. A smoothness constraint [19, 11] is another popular prior used for deblocking.
Moreover, Projection onto Convex Sets (POCS) based methods [46, 45, 28] exploit the convex-set priors of the original image. They project the image onto the convex sets iteratively and converge at the intersection of all the sets, yielding the recovered image. Learning-based deblocking methods [17, 2] are another way to reduce the block defects; they learn a mapping from compressed images to their uncompressed versions. These methods are, however, computationally demanding.
2.3.2 Deringing
For blocks containing both textures and flat regions, the coarse uniform spatial distribution of the quantization noise within the blocks causes ringing noise, mainly around edges, to become visible. This differs from blocking artifacts: whereas blocking artifacts appear at block boundaries, ringing artifacts usually appear within the blocks. Pre-processing methods for avoiding this kind of artifact find optimal DCT coefficients which adapt to the noise variances in different areas according to the image content; the work in [44] introduced a noise shaping algorithm to solve this problem. Post-processing of the decoded images has also attracted many researchers. The SA-DCT filter [10] deals with both block and ringing artifacts, and the techniques in [27] use linear and nonlinear isotropic filters for the reduction of ringing artifacts.
2.4 Summary
In this chapter, we have reviewed contrast enhancement methods and state-of-the-art algorithms for JPEG artifacts reduction, both of which are related to our research. These algorithms and the reality of low-quality images inspired us to deal with the artifacts that arise in the contrast enhancement procedure. Some of these techniques are adopted directly in our work; for example, tone curve adjustment and the dehazing method [22], which works best on most images, form part of our framework. The results in Chapter 4 show that our method produces compelling results, superior to directly applying these state-of-the-art compression artifacts removal algorithms sequentially with contrast enhancement on low-quality images.
Chapter 3
Proposed Method
In this chapter, we present our proposed method. Section 3.1 gives an overall illustration of our framework. In the following Sections 3.2, 3.3, and 3.4, the individual steps of our method are introduced in turn. The results of our framework and comparisons with other methods are shown in the next chapter.
3.1 Framework
In this section, the workflow of our proposed method is introduced. As shown in Figure 3.1, the input image I is first decomposed into two components, the structure component I_S and the texture component I_T, by utilizing TV regularization. The structure component consists of the smooth signal and the main large objects in the image; the small oscillating signals and the JPEG noise are mainly contained in the texture layer. The contrast enhancement procedure, such as tone-curve adjustment or dehazing, is applied directly to the structure layer. As for the texture layer, in order to distinguish the image's fine details from the JPEG artifacts, we first compute a
mask M which separates the main object regions from the homogeneous regions according to the DCT coefficients in each 8 × 8 block. The block-based mask is too coarse to use directly, so we apply a soft matting method to refine it in alignment with the main objects in the structure layer. The main object regions of the texture layer will be added back to the structure layer to obtain the final result, while the homogeneous regions of the texture layer are discarded; thus, the block artifacts in those regions and the ringing artifacts around sharp edges will not be present in the final enhanced image. In order to reduce artifacts in the main object regions, further processing of the texture layer is needed before it is added back to the final result.

Figure 3.1: Pipeline of the proposed method. The input I is decomposed into structure I_S and texture I_T; the structure is enhanced to I_S^e, the texture is masked by M and deblocked to I_T^d, and the layers are recombined into the result I′.
The last step is to reconstruct the final result by combining the enhanced structure layer with the deblocked texture layer multiplied pixel-wise by the mask.
Our method efficiently reduces the block noise in homogeneous regions and the mosquito noise around sharp edges without losing the fine details of the image.
3.2 Structure-Texture Decomposition
We use the TV regularization method proposed in [30] to decompose the image into a structure layer, which holds the geometric information, and a texture layer, which holds the texture information of the image, i.e., the image details. Based on the total variation regularization scheme, the structure layer I_S can be obtained by iteratively minimizing the objective function defined as:

min_{I_S} Σ_i (I_{S_i} − I_i)² + λ |∇I_{S_i}|, (3.1)

where i is the pixel index, λ is the regularization parameter that determines how strongly the structure layer is constrained to the input image and the extent of smoothness of the structure layer, and ∇ is the gradient operator. This equation is called the ROF model after the original TV regularization proposed by Rudin, Osher and Fatemi [30]. The model can be solved efficiently using the half-quadratic splitting scheme proposed in [40]. The texture layer can then be computed by simply subtracting the structure layer from the original image, I_{T_i} = I_i − I_{S_i}.
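A minimal sketch of this decomposition, using Chambolle's TV denoiser from scikit-image as a stand-in for the half-quadratic ROF solver (the weight parameter plays the role of λ; the channel_axis argument requires a recent scikit-image):

import numpy as np
from skimage.restoration import denoise_tv_chambolle

def decompose(img, weight=0.05):
    # Split a float image in [0, 1] into structure and texture layers.
    structure = denoise_tv_chambolle(img, weight=weight, channel_axis=-1)
    texture = img - structure        # I_T = I - I_S
    return structure, texture

A larger weight yields a smoother structure layer, which matches the rule below that higher compression levels call for a larger λ.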
The separation of image content into semantic parts plays a vital role in applications such as compression, enhancement, restoration, and more. Such separation can be based on a variational formulation or on independent component analysis and sparsity. For example, [34] proposed a superior compression scheme which compresses the picture layer and the text layer individually to compress both more effectively. Wedel et al. [43] performed a structure-texture decomposition of the input images to remove violations of the optical flow constraint due to illumination changes, achieving precise and robust estimation of optical flow. The proposed inpainting method in [7] is based on the sparse-representation-based image decomposition method called MCA (morphological component analysis), designed for the separation of linearly combined texture and structure layers in a given image.

Figure 3.2: Two examples of structure-texture decomposition on an uncompressed and compressed (Q40) image pair. The structural similarity (SSIM) [41] values (on a ×100 scale) between each pair are shown in brackets: for example, 96.55 between the compressed and uncompressed images, but 99.18 between their structure layers. Notice that most of the characteristic compression artifacts end up in the texture layer, while the structure layer of the compressed image closely resembles that of the uncompressed image.
In our problem, the image decomposition exploits the fact that the structure layer mostly has larger gradient magnitudes, while the texture layer captures both the fine details and the compression artifacts, which exhibit smaller gradient magnitudes. The parameter λ in Eqn. 3.1 is important for controlling the smoothness of the structure layer. It needs to be adjusted according to the compression level to obtain better results: the higher the compression level, the smoother we expect the structure layer to be, so that the JPEG compression artifacts are almost entirely contained in the texture layer. That is to say, higher compression requires λ to be increased.
The results of the image decomposition are shown in Figure 3.2.¹ For the same image, the two rows show the decomposed results of the uncompressed and compressed versions, respectively. We have also quantitatively calculated the structural similarity (SSIM) [41] values between each pair of original images, structure layers, and texture layers. The remarkable phenomenon is that the structure layers of the uncompressed and compressed images are almost identical, both quantitatively and subjectively. The low SSIM score between the two original images mainly results from the wide difference between the pair of texture layers. It can be safely concluded that the ROF model based on TV regularization produces a structure layer from which most of the compression artifacts are filtered out: almost all the block noise and mosquito noise is separated into the texture layer. Thus, the structure layer can be considered artifact-free and suitable to be boosted directly according to the desired enhancement task, resulting in the enhanced version of the structure layer, I_S^e.
3.3 Reducing Artifacts in the Texture Layer
After separating the structure layer and the texture layer, we can operate on the two layers individually. Unlike the structure layer, to which the enhancement operation is applied directly, the texture layer needs further special processing, since it contains both scene details and compression artifacts. To reduce the compression artifacts without losing fine details, we first compute a mask M that separates the homogeneous regions from the regions where most of the scene details are present. Applying the soft matting method [18] to the coarse mask aligns it with the main objects in the structure layer. The further processing of the texture layer is deblocking, which reduces the potential ringing and block artifacts. At the final step of our whole framework, the masked regions of the processed texture layer are added back to the enhanced structure layer, while the homogeneous regions outside the mask are discarded, since they do not contain the details of the images.

¹JPEG assigns a quality factor, QX, to indicate the subjective quality from 0 to 100 (from low quality to high quality).

Figure 3.3: Two examples of the scene details map generation. The initial results obtained by checking the DCT coefficients are rough estimates; a soft matting technique refines the map using the structure layer as guidance, and the result is well aligned with the objects in the images.
3.3.1 Scene Detail Extraction
The task now is to compute the mask M that separates the homogeneous regions from the scene detail regions. To create the mask, we apply the discrete cosine transform (DCT) to each 8 × 8 patch in the texture layer. We exploit the straightforward fact that the greater the high-frequency coefficients are, the more likely the block is to contain image details. Therefore, we can use the DCT high-frequency coefficients to serve as a likelihood of scene details. Denoting the 8 × 8 DCT of one block as a matrix B, the likelihood of this block being part of the image's scene details can be expressed as
t = ( Σ_{u,v} B_{u,v}² ) − B_{1,1}² − B_{1,2}² − B_{2,1}², (3.2)

where u, v denote the 2D index into the DCT of the block. We take the sum of squares of all DCT coefficients except B_{1,1}, B_{1,2}, and B_{2,1}, which are the low-frequency coefficients, and then apply a threshold to this likelihood to obtain a binary indicator for each block. The threshold is empirically set to 0.1. This initial block-wise estimate of the scene detail regions, denoted M̃, is coarse, as shown in the second column of Figure 3.3.
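A sketch of Eqn. 3.2 evaluated over all blocks (assuming a grayscale texture layer in [0, 1] whose dimensions are multiples of 8; B[0,0], B[0,1], B[1,0] are the zero-based counterparts of B_{1,1}, B_{1,2}, B_{2,1}):

import numpy as np
from scipy.fft import dctn

def coarse_detail_mask(texture, threshold=0.1):
    # Block-wise scene-detail likelihood from high-frequency DCT energy.
    h, w = texture.shape
    mask = np.zeros((h // 8, w // 8), dtype=bool)
    for by in range(h // 8):
        for bx in range(w // 8):
            B = dctn(texture[8*by:8*by+8, 8*bx:8*bx+8], norm='ortho')
            energy = (B**2).sum() - B[0, 0]**2 - B[0, 1]**2 - B[1, 0]**2
            mask[by, bx] = energy > threshold
    return mask   # one boolean per 8x8 block; refined by soft matting next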
This initial block-wise estimation of the scene detail regions is too coarse for practical use and needs to be refined. Therefore, we use a soft matting method to align the mask with the structure layer by minimizing the following function over the refined scene details map M:

min_m (m − m̃)ᵀ(m − m̃) + α mᵀ L_s m, (3.3)

where m and m̃ are the vector forms of the matrices M and M̃, respectively, and L_s is Levin's [18] matting Laplacian matrix generated from the guiding structure layer I_S. The first term in Eqn. 3.3 forces agreement with the initial estimate M̃, while the second term forces the output to be aligned with the structure layer through L_s. The regularization parameter α is fixed in our implementation. We can see the
refined mask in the last column of Figure 3.3. The refined mask is aligned with the scene in the structure layer, as expected.

Figure 3.4: The effect of blocking artifact reduction. The left side shows a texture layer and its corresponding final composition result without the blocking artifact reduction step; the right side shows the same pair with blocking artifact reduction applied. In both the texture layer and the final result, the blocks are less noticeable when block artifact reduction is applied. The SSIM similarity against the ground truth without and with deblocking is shown in brackets (89.46 vs. 90.37).
Since the edges in the mask are sharp enough, multiplying the mask with the texture layer pixel-wise effectively removes the ringing artifacts around the edges.
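Since Eqn. 3.3 is quadratic in m, its minimizer solves the sparse linear system (Id + α L_s) m = m̃. A sketch of this solve, where matting_laplacian is a placeholder for an implementation of Levin's matting Laplacian [18] and coarse_mask is the block mask of Section 3.3.1 upsampled to pixel resolution:

import numpy as np
from scipy.sparse import identity
from scipy.sparse.linalg import spsolve

def refine_mask(coarse_mask, structure, alpha, matting_laplacian):
    # Solve (Id + alpha * L_s) m = m_tilde for the refined scene details map.
    m_tilde = coarse_mask.astype(np.float64).ravel()
    L_s = matting_laplacian(structure)           # hypothetical helper, sparse (N x N)
    A = identity(m_tilde.size, format='csr') + alpha * L_s
    return spsolve(A, m_tilde).reshape(coarse_mask.shape)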
3.3.2 Block Artifacts Reduction
The mask obtained in the previous step only helps to remove the block and ringing artifacts in the homogeneous regions; we still need to reduce the artifacts present in the scene detail regions. As introduced in Chapter 2, there is a range of state-of-the-art deblocking algorithms; however, they are computationally demanding. To improve the efficiency of our algorithm, we reduce the block artifacts in the spatial domain instead of the gradient or DCT domain. We define an objective function which constrains the output to be similar to the input while reducing the gradients at the edges between 8 × 8 blocks. The objective function is defined as follows:

min_{Î_T} Σ_i (Î_{T_i} − I_{T_i})² + β Σ_{i∈n} |∇Î_{T_i}|², (3.4)

where I_T and Î_T are the texture layer and the deblocked texture layer, respectively, n is the set of all pixel locations on the 8 × 8 block borders, and β is the weight controlling the smoothness, which we set to 0.5 in our implementation. The first term forces the output to be similar to the input, and the second term smooths the blocking artifacts appearing at the 8 × 8 block borders. This is effective in reducing the block artifacts in the texture layer.
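Eqn. 3.4 is also quadratic, so it can be minimized by simple gradient descent; a sketch (our illustration, not necessarily the thesis solver) in which the smoothness term acts only on finite differences that straddle an 8 × 8 block border:

import numpy as np

def deblock_texture(I_T, beta=0.5, iters=100, step=0.2):
    # Gradient descent on Eqn. 3.4: data term everywhere, smoothness on borders.
    X = I_T.astype(np.float64).copy()
    h, w = X.shape
    col_border = np.zeros((h, w - 1), dtype=bool)
    col_border[:, 7::8] = True                   # differences across vertical borders
    row_border = np.zeros((h - 1, w), dtype=bool)
    row_border[7::8, :] = True                   # differences across horizontal borders
    for _ in range(iters):
        grad = 2.0 * (X - I_T)                   # gradient of the data term
        dx = np.diff(X, axis=1); dx[~col_border] = 0.0
        dy = np.diff(X, axis=0); dy[~row_border] = 0.0
        grad[:, :-1] -= 2.0 * beta * dx          # gradient of the smoothness term
        grad[:, 1:] += 2.0 * beta * dx
        grad[:-1, :] -= 2.0 * beta * dy
        grad[1:, :] += 2.0 * beta * dy
        X -= step * grad
    return X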
The results of deblocking on the texture layer are shown in Figure 3.4. To show the effect of deblocking clearly, we have zoomed in on some representative patches. Visually, the block artifacts are much less conspicuous in our results. Furthermore, we have calculated the SSIM between the results with and without deblocking and the uncompressed image (ground truth) separately. The results with deblocking achieve a higher SSIM score, as shown in the brackets in the figure.
3.4 Layer Recomposition
Having processed the structure layer and the texture layer separately, we need to combine the two layers properly to obtain the final enhanced result. The naive solution of simply adding up the two layers introduces strong halo artifacts, mostly at locations containing many edges. This is because the structure layer has been enhanced by the tone curve or dehazing according to the task, while the texture layer has not. Another reason is that most contrast functions f are not linear, so f(I_S + I_T) ≠ f(I_S) + f(I_T); we cannot simply apply the same enhancement to each layer and then sum them. To prevent such degradation problems, we need to apply some operation to the texture layer before adding it back to I_S^e.
Denoting the enhanced texture layer as I_T^e, our goal is to find a reasonable I_T^e such that the recomposed result f(I) = I_S^e + I_T^e is visually pleasant and free of noise. That is to say, we need to approximate the enhancement of the original image by finding a scale multiplication factor K, which should ensure the following condition as far as possible:

f(I) = f(I_S) + K I_T, (3.5)

where I is the original input image and I = I_S + I_T. In our case, ∘ is the element-wise multiplication operator and I_T^d is the masked texture layer with artifacts reduced, generated as described in Section 3.3. In this way, we have

I_T^e = K ∘ M ∘ I_T^d. (3.6)
To obtain I_T^e for different enhancement functions f, we should derive the appropriate K. For image tone-curve adjustment, the tone-curve function f is applied to the intensity values of the input image I. A Taylor series represents a function as an infinite sum of terms calculated from the function's derivatives at a single point; to first order, the approximation can be written as f(t + Δt) ≈ f(t) + f′(t)Δt. We can safely apply this to our tone-curve adjustment function:

f(I_{S_i} + I_{T_i}) = f(I_{S_i}) + f′(I_{S_i}) I_{T_i}. (3.7)

Hence, from the last equation, we have the scale factor for tone adjustment, K_i = f′(I_{S_i}).
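As a concrete check with an illustrative gamma curve: take f(t) = t^0.5 on intensities normalized to [0, 1]. Then K_i = f′(I_{S_i}) = 0.5 I_{S_i}^{−0.5}, so a mid-dark structure pixel I_{S_i} = 0.25 gives K_i = 1, while a much darker pixel I_{S_i} = 0.04 gives K_i = 2.5. The texture, and with it any residual artifacts, is amplified most exactly where the brightening curve stretches the tones the most, which is why the texture layer must be cleaned before recomposition.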
For the dehazing or underwater application, the enhancement should consider the physical model of the scattering medium. According to the model exploited in [13], the output of the enhancement is obtained through the following equation:

I_i^e = (I_i − A) / t_i + A, (3.8)

where I_i is the intensity at pixel index i in the input image, A is the atmospheric light, and t_i is the medium transmission at index i, usually assumed identical across the three color channels. Therefore, the scale factor K_i should be approximately equal to 1/t_i, since A is a constant and I_i^e is of the form I_i/t_i plus a constant. Following [13], t is obtained from the dark channel prior, and A is obtained from the patch with the brightest intensity in the dark channel.
Having recovered both the structure and texture layers, the final result is achieved by simply summing the two layers: I′ = I_S^e + I_T^e.
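Putting the pieces together for the tone-curve case, a sketch of the recomposition in Eqns. 3.5-3.7 for a grayscale image (assuming the helpers sketched earlier in this chapter, intensities in [0, 1], and an illustrative gamma curve):

import numpy as np

def recompose(structure, mask, texture_deblocked, gamma=0.5):
    # I' = f(I_S) + K o M o I_T^d, with f(t) = t^gamma and K = f'(I_S).
    I_S = np.clip(structure, 1e-6, 1.0)
    K = gamma * I_S ** (gamma - 1.0)             # scale factor K = f'(I_S)
    return I_S ** gamma + K * mask * texture_deblocked

# structure, texture = decompose(I); mask from the refined scene details map;
# texture_deblocked = deblock_texture(texture) as in Section 3.3.2.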
3.5 Summary
In this chapter, we have demonstrated how to enhance the image contrast while suppressing the compression artifacts that would otherwise become prominently visible when the contrast is enhanced. Combining state-of-the-art contrast enhancement approaches (tone-curve adjustment or dehazing) with the image decomposition algorithm, our method produces enhanced results free of the block and ringing artifacts that are commonly present in the results of current compression artifacts removal algorithms applied as pre- or post-processing of contrast enhancement. Decomposing the image into structure and texture layers enables us to integrate the two opposite tasks, i.e., contrast enhancement and suppression of compression artifacts, into one framework and achieve them in parallel. In the next chapter, we will evaluate and show the effectiveness of our method on different applications and sufficient data.