
ORIGINAL RESEARCH article

Front. Neurorobot., 27 June 2022

Adaptive Bilateral Texture Filter for Image Smoothing

Huiqin Xu1†, Zhongrong Zhang1†, Yin Gao2,3, Haizhong Liu1, Feng Xie4, Jun Li2,3*
  • 1School of Mathematics and Physics, Lanzhou Jiaotong University, Lanzhou, China
  • 2Fujian Science & Technology Innovation Laboratory for Optoelectronic Information of China, Fuzhou, China
  • 3Quanzhou Institute of Equipment Manufacturing, Chinese Academy of Sciences (CAS), Quanzhou, China
  • 4Institute of Automation and Communication Magdeburg, Magdeburg, Germany

The biggest challenge of texture filtering is to smooth strong-gradient textures while maintaining weak structures, which is difficult for current methods. To address this, we propose a scale-adaptive texture filtering algorithm. First, four-directional detection based on gradient information is proposed for structure measurement. Second, the spatial kernel scale for each pixel is obtained from the structure information: a larger spatial kernel is used for pixels in textural regions to enhance smoothness, while a smaller spatial kernel is used for pixels on structures to maintain edges. Finally, we adopt a Fourier approximation of the range kernel, which reduces computational complexity without compromising visual filtering quality. Subjective and objective analyses show that our method outperforms previous methods in eliminating textures while preserving main structures, and it also has advantages in structure similarity and visual perception quality.

Introduction

Natural images usually have complicated textures, which makes it difficult to understand the main information of an image without texture removal. Structure-preserving texture smoothing is an important issue in computer vision and digital image processing for image cognition. It attempts to eliminate meaningless textures while preserving dominant structures as well as possible, and it has a wide range of applications, such as tone mapping (Jia and Zhang, 2019), detail enhancement (Fei et al., 2017), and image abstraction (Winnemöller et al., 2006). For structure-preserving texture filtering, the first step is to detect pixels near structure edges, and the second is to preserve structures while eliminating textures. Therefore, texture filtering plays an essential role in many image preprocessing applications.

Early methods relied on pixel-intensity contrast for texture measurement (Tomasi and Manduchi, 1998; Farbman et al., 2008; Xu et al., 2011). Such methods can remove fine details but perform poorly when directly eliminating high-contrast, complicated textures. Subsequently, more comprehensive texture measures were proposed, such as local extrema (Subr et al., 2009), region covariance (Karacan et al., 2013), and relative total variation (Xu et al., 2012); these methods can smooth out textures but also blur small structure edges. Further, many scholars improved texture measurement to generate the guidance image for a joint bilateral filter. For instance, the guidance image is calculated through patch shift for each pixel (Cho et al., 2014). Similarly, joint bilateral filtering was employed in Jeon et al. (2016), Song et al. (2018), and Xu and Wang (2019), where an adaptive kernel scale generates a smoothed image as guidance. These methods perform better because they use small window sizes near structures and large window sizes in texture regions.

On the other hand, some methods adaptively adjust the spatial or range kernel scale of the bilateral filter. For instance, the size of the range kernel is changed at each pixel (Gavaskar and Chaudhury, 2018), where polynomials are adopted to approximate histograms to accelerate adaptive bilateral filtering. In addition, the width of the spatial kernel can be adapted based on local gradient information (Ghosh et al., 2019), which yields structure-preserving smoothing results. This paper modifies the structure measure used in Ghosh et al. (2019) to achieve superior texture-removal performance.

In recent years, deep learning algorithms have been introduced to edge-preserving texture filtering. Earlier work includes the deep edge-aware filters of Xu et al. (2015), which construct a unified neural network architecture in the gradient domain. Chen et al. (2017) and Lu et al. (2018) both trained fully supervised convolutional neural networks for texture smoothing. Since these methods require large numbers of image pairs that are not readily available for training, semi-supervised (Gao et al., 2020) and unsupervised (Zhu et al., 2017) methods were proposed to avoid collecting annotated training examples.

In this paper, we present a scale-adaptive texture smoothing algorithm based on the traditional bilateral filtering framework, which smooths multi-scale textures by adjusting the scale of the spatial kernel at each pixel. First, we employ gradient information along with four-directional structure detection to identify structures among coarse textures. Second, the spatial kernel size for each pixel is estimated from the structure measure. Finally, we use the Fourier approximation of the Gaussian range kernel to accelerate the bilateral filtering for texture removal, so that the computational complexity does not change with the spatial kernel size. The experimental results show that our method effectively achieves structure-preserving smoothing. The main contributions of this paper are as follows:

• We propose a four-directional structure detection based on gradient information, which uses the gradient information in the pixel neighborhood to more accurately extract structures from images containing complicated textures.

• We propose a mapping rule to determine the spatial kernel scale of each pixel, which can adaptively adjust scale size via structure information. The pixels in the vicinity of the structure edges adopt smaller spatial kernel scales and the pixels in textural regions adopt larger spatial kernel scales.

• The approximation algorithm of adaptive bilateral filtering is presented for texture removal. With this strategy, the complexity of texture filtering does not depend on the scale of the spatial kernel.

The rest of this paper is organized as follows: related work is described in Section Related Work, our proposed method is detailed in Section Our Method, experimental analysis is discussed in Section Experiments and Results, applications of our algorithm are presented in Section Applications, and the conclusion is given in Section Conclusion.

Related Work

Texture filtering has received considerable attention over the past several decades.

Traditional texture filtering algorithms include local weighted averaging and global optimization. Bilateral filtering (BF) (Tomasi and Manduchi, 1998), the guided filter (He et al., 2013), and anisotropic diffusion (Perona and Malik, 1990) are all typical local weighted averaging methods. As one of the classic non-linear filters, BF combines a spatial kernel and a range kernel for noise removal. Algorithms based on global optimization mainly include the total variation (TV) model (Rudin et al., 1992), weighted least squares (WLS) (Farbman et al., 2008), and L0 gradient minimization (Xu et al., 2011). These methods optimize a global framework that relies on gradient information, which can overcome some limitations of local filters such as halo artifacts and gradient reversals; however, they need to solve a complex linear system, which is time-consuming, and they cannot remove high-contrast noise well. Subsequently, some edge-preserving models have been proposed to optimize the global framework; for example, Huang et al. (2018) combined global optimization with local filtering to enhance smoothness. To improve smoothing quality and processing speed based on WLS, Liu et al. (2017) proposed semi-global weighted least squares, which solves a sequence of subsystems iteratively, and Liu W., et al. (2020) achieved high speed through the Fourier transform and its inverse. However, these traditional texture filters cannot effectively distinguish prominent structures from complex details or completely smooth out the textures in images with complex backgrounds.

Some new models have been proposed for extracting the salient structure from input images, making use of texture characteristics instead of gradient information to identify regular or random textures. For example, Subr et al. (2009) decomposed structures and textures through local extrema: they defined textures as oscillations between local minima and maxima and averaged the extremal envelopes to smooth out the textures. Karacan et al. (2013) proposed a patch-based region covariance that uses first- and second-order feature statistics to extract structures from different types of textures; however, structures with statistical properties similar to textures may be incorrectly smoothed out, which tends to overly blur image structures. Lee et al. (2017) proposed an interval gradient operator for structure-preserving image smoothing. On the other hand, Xu et al. (2012) observed that the inherent variation in a window containing structures is generally greater than in a window containing only textures, so they proposed relative total variation (RTV) to capture the structure and texture characteristics of images. Subsequently, Zhao et al. (2019) proposed an activity-driven LAD-RTV for texture removal. However, these methods may incorrectly regard small structures as texture because of the overlap between adjacent windows: complex textures cannot be completely smoothed out when the window is too small, while an excessively smoothed image is produced when it is too large, so it is difficult to find a window size that balances preserving main structures and removing unimportant textures.

To address the limitations of incompletely smoothed textures and blurred structure edges, structure-aware filtering methods have been proposed to achieve high smoothing quality; that is, the smoothing scale is adaptively varied from pixel to pixel. These methods obtain smoothing results through joint bilateral filtering, in which computing the guidance image with an adaptive kernel scale is the key step. Jeon et al. (2016) proposed scale-adaptive texture filtering based on patch statistics, where the optimal smoothing scale of each pixel is estimated according to a directional relative total variation (dRTV) measure. Song and Xiao (2017) used patches of two scales to represent pixels by calculating a directional anisotropic structure measurement (DASM) at each pixel; smaller patches are adopted for pixels at structures and larger patches for pixels in texture regions. Subsequently, Song et al. (2019) replaced dRTV in Jeon et al. (2016) with DASM and evaluated the exact smoothing scale from four-direction statistics of the DASM value. With regard to texture measurement windows, Xu and Wang (2018) adopted long, narrow windows because structure edges are not always parallel to the axes. Furthermore, Liu Y., et al. (2020) proposed texture filtering based on a local histogram operator, which uses differences in color distribution to distinguish structures from textures and then determines the width of the range kernel. The above methods perform well in preserving structure while smoothing out textures; however, multiple iterations of joint bilateral filtering may cause blurred structures and color cast.

Recently, deep learning has made significant progress in image texture smoothing (Chen et al., 2017; Kim et al., 2018; Gao et al., 2020). Kim et al. (2018) designed a new framework for structure-texture decomposition, replacing the total variation prior with a network and plugging deep variational priors into an iterative smoothing process. Gao et al. (2020) presented a semi-supervised algorithm relying on Generative Adversarial Networks (GANs) for structure-preserving smoothing, which designs different loss functions for labeled and unlabeled datasets. However, in neural network training, the target outputs are usually generated by existing smoothing methods.

Our Method

Classic bilateral filtering combines a spatial kernel with a range kernel, taking into account both the distance between pixels and the similarity of their intensities. Based on this, we propose a scale-adaptive bilateral filter that adjusts the scale of the spatial kernel at each pixel. Figure 1 shows the entire texture filtering process of our method.

Figure 1. The process of texture filtering for the input image. (A) Input image; (B) Gradient map; (C) Structure map; (D) Scale map; (E) Smoothing result. Reproduced with permission from Hyunjoon Lee, Junho Jeon, Junho Kim and Seungyong Lee, available at https://sci-hub.wf/10.1111/cgf.12875.

Structure-Preserving Bilateral Filtering

Consider the general form of bilateral filtering (Tomasi and Manduchi, 1998). For the input image f, the output of scale-adaptive bilateral filtering is written as:

u(p) = \frac{\sum_{q \in \Omega_p} w(q-p)\,\varphi(f(q)-f(p))\,f(q)}{\sum_{q \in \Omega_p} w(q-p)\,\varphi(f(q)-f(p))},    (1)

where u(p) is the output value at pixel p, and w(l) and φ(t) represent the spatial kernel and the range kernel, respectively. We use a box function for the spatial kernel in this paper; that is, Ω_p is the window of the spatial kernel centered at pixel p. Letting W_p denote the scale of the spatial kernel at pixel p, the window satisfies |\Omega_p| = (2W_p + 1)^2, and q is a pixel belonging to Ω_p. The Gaussian range kernel φ(t) in Tomasi and Manduchi (1998) is defined as:

\varphi(t) = \exp\left(-\frac{t^2}{2\sigma_r^2}\right),    (2)

where t is the intensity difference between pixels p and q. The parameter σr, the standard deviation of the Gaussian kernel, determines the width of the range kernel (i.e., it is the smoothing parameter) and is fixed at each pixel. A small σr gives superior structure preservation but inferior texture smoothing; conversely, a large σr gives better texture smoothing but undesired blurring of structure edges. Hence, it is important to find an appropriate σr to achieve better structure-preserving texture filtering.
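
For concreteness, below is a minimal brute-force sketch of Equations (1) and (2) with a box spatial kernel and a per-pixel half-window W_p. The paper's implementation is in MATLAB; this and the following sketches use Python/NumPy purely for illustration, and all function and variable names are our own, not the authors' code. The Fourier approximation introduced later replaces this slow direct evaluation.

```python
import numpy as np

def adaptive_bilateral_brute_force(f, W, sigma_r):
    """Direct evaluation of Eq. (1): box spatial kernel, Gaussian range
    kernel (Eq. 2). f: 2-D float image; W: integer per-pixel half-window
    sizes W_p; sigma_r: range kernel width. Costs O(W_p^2) per pixel."""
    H, Wd = f.shape
    out = np.empty_like(f)
    for y in range(H):
        for x in range(Wd):
            w = int(W[y, x])
            y0, y1 = max(0, y - w), min(H, y + w + 1)
            x0, x1 = max(0, x - w), min(Wd, x + w + 1)
            patch = f[y0:y1, x0:x1]
            # Gaussian range weights phi(f(q) - f(p)), Eq. (2)
            phi = np.exp(-(patch - f[y, x]) ** 2 / (2.0 * sigma_r ** 2))
            out[y, x] = (phi * patch).sum() / phi.sum()
    return out
```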

Structure Measurement

In our proposal, we apply large spatial kernels in homogeneous regions for texture elimination and small kernels near structures for edge preservation, and the kernel size at each pixel is adaptively optimized by structure measurement. We calculate the structure information as follows.

First, we blur the input image f using a Gaussian filter to get image fσ. The gradient of the image fσ is calculated by:

G_p = \sqrt{(\partial_x f_\sigma)_p^2 + (\partial_y f_\sigma)_p^2},    (3)

where Gp is the gradient value at the pixel p, and ∂xfσ and ∂yfσ are the partial derivatives of fσ in x and y directions.

The gradient map G is calculated by Equation (3), as shown in Figure 1B. Clearly, textures with strong gradients are also preserved when only gradient information is considered; that is, structures and textures with similar gradients cannot be completely distinguished. Therefore, we further conduct four-directional structure detection relying on gradient information. For each pixel, the detection neighborhood is a (2m + 1) × (2m + 1) window centered at it. To determine a more accurate structure inspection value for each pixel, we examine the (m + 1) × (m + 1) sub-neighborhoods located in four directions. Figure 2 shows the sub-neighborhoods in the four directions for structure detection. More specifically, taking into account the distance from each pixel b to the examined pixel p, we compute the weighted average of the gradient values in each sub-neighborhood to obtain Ap(j), j ∈ {NW, NE, SW, SE}, which is used to evaluate the appearance of structure edges.

Figure 2. Sub-neighborhoods in the four directions for structure detection.

In the four detection neighborhoods of each pixel, a strong structure edge corresponds to a large Gp value, while a weak structure edge corresponds to a small Gp value. For this reason, we adopt a Gaussian function as the weight to calculate Ap(j) in the four directions, and the maximum of Ap(j) is selected as the structure measurement for each pixel.

\begin{cases} A_p(j) = \sum_{b \in \Psi_p(j)} g_m(p, b)\, G_b \\ g_m(p, b) = \frac{1}{2\pi(m-1)^2} \exp\left(-\frac{\|p - b\|^2}{2(m-1)^2}\right) \\ S_p = \max_{j \in \{NW, NE, SW, SE\}} \{A_p(j)\} \end{cases}    (4)

where Ψp(j) represents the jth sub-neighborhood, b is a pixel belonging to Ψp(j), gm(p, b) is the Gaussian function of the distance between pixels p and b, and max{•} is the maximum of the elements in the braces. Ap(j) summarizes the structure edges in the jth sub-neighborhood of pixel p, and a larger maximum Sp indicates a higher likelihood that edges occur. Therefore, a larger Sp implies less smoothing and a smaller Sp implies more smoothing around pixel p.
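
The following is a sketch of Equations (3) and (4) under the stated defaults (Gaussian pre-blur, m = 4); `structure_measure` is our illustrative name, and the Sobel-based derivative is our choice of discrete gradient:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def structure_measure(f, sigma=1.0, m=4):
    """Sketch of Eqs. (3)-(4): gradient magnitude of the pre-blurred
    image, then the maximum Gaussian-weighted gradient average over the
    four (m+1)x(m+1) sub-neighborhoods (NW, NE, SW, SE) of each pixel."""
    f_sigma = gaussian_filter(f, sigma)
    G = np.hypot(sobel(f_sigma, axis=0), sobel(f_sigma, axis=1))  # Eq. (3)

    k = m + 1
    yy, xx = np.mgrid[0:k, 0:k].astype(float)
    H, W = G.shape
    pad = np.pad(G, m, mode='edge')
    S = np.full_like(G, -np.inf)
    # corner offsets of the four sub-neighborhoods relative to p
    for dy, dx in [(-m, -m), (-m, 0), (0, -m), (0, 0)]:   # NW, NE, SW, SE
        # g_m(p, b): Gaussian of the distance from p to each b, Eq. (4)
        d2 = (yy + dy) ** 2 + (xx + dx) ** 2
        g = np.exp(-d2 / (2.0 * (m - 1) ** 2)) / (2 * np.pi * (m - 1) ** 2)
        A = np.zeros_like(G)
        for i in range(k):
            for j in range(k):
                A += g[i, j] * pad[m + dy + i: m + dy + i + H,
                                   m + dx + j: m + dx + j + W]
        S = np.maximum(S, A)                              # S_p = max_j A_p(j)
    return S
```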

Adaptive Spatial Kernel Scale Estimation

From the analysis of structure measurement, a large value of Sp suggests that the pixel is in the vicinity of the structure edges, where the scale of the spatial kernel should be adjusted as small as possible. Conversely, the spatial kernel scale should be adjusted as large as possible in textural regions. To estimate the scale Wp in terms of Sp, we establish an inverse mapping function from Sp to Wp, so that the function satisfies the above conditions. The mapping can be expressed as:

W_p = \max\left\{\eta \left(\tfrac{1}{\lambda}\right)^{S_p^2},\ \delta\right\},    (5)

where S_p^2 is the square of Sp. λ is the denominator of the base of the exponential term; its value must be greater than 1 to ensure that (1/\lambda)^{S_p^2} lies in (0, 1]. η is the upper limit of the filtering window scale, so \eta(1/\lambda)^{S_p^2} is the estimated value of the spatial kernel scale. The term δ keeps the window size from approaching 0, preventing the filtering result from over-sharpening or aliasing (δ = 1 by default). Therefore, Wp ranges in [δ, η].
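
The mapping of Equation (5) is essentially a one-liner; the sketch below additionally normalizes S_p to [0, 1] first, which is our assumption matching the normalizing role the paper assigns to λ:

```python
import numpy as np

def kernel_scale(S, eta=10, lam=10.0, delta=1):
    """Eq. (5): W_p = max(eta * (1/lam)**(S_p**2), delta). S is first
    normalized to [0, 1] (our reading of the role of lambda); larger
    S_p, i.e., pixels near structure edges, get smaller windows."""
    S = S / (S.max() + 1e-12)
    W = eta * (1.0 / lam) ** (S ** 2)
    return np.maximum(np.rint(W), delta).astype(int)
```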

Fourier Approximation

The brute-force computation of Equation (1) requires O(W_p^2) operations per pixel, which is time-consuming in practical applications, especially in textural regions where the scale Wp is usually large. To overcome the computational limitation of traditional bilateral filtering, various acceleration algorithms have been proposed to approximate the bilateral filter (Chaudhury, 2011, 2015; Chaudhury et al., 2011), reducing the computational complexity to O(1); that is, the complexity no longer depends on the scale Wp. However, some of these algorithms cannot guarantee that the approximation error at the discrete points stays within tolerance, and a poor approximation may cause color distortion in the filtered image.

In this paper, we adopt the Fourier expansion of the range kernel in Ghosh and Chaudhury (2016) to approximate the scale adaptive bilateral filter. Specifically, Equation (2) can be approximated in another manner:

\hat{\varphi}(t) = \sum_{n=-N}^{N} c_n \exp(\tau n v t),    (6)

where τ² = −1, v = π/T, \hat{\varphi}(t) is an approximate estimate of φ(t), N denotes the order of the Fourier expansion, cn is the corresponding coefficient, and t is the pixel intensity difference within Ωp, ranging over {−T, ⋯, 0, ⋯, T}, where T can be calculated by:

T = \max_{p \in f} \max_{q \in \Omega_p} |f(q) - f(p)|.    (7)

For all t ∈ [−T, T], the following constraint must be satisfied:

|\varphi(t) - \hat{\varphi}(t)| \le \varepsilon,    (8)

where ε is the tolerance of the approximation for the Gaussian range kernel (ε = 0.01 by default).
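
As an illustration of Equations (6)–(8), the sketch below builds c_n from the closed-form Fourier coefficients of the 2T-periodized Gaussian and increases N until the tolerance holds; this is only one simple construction, and the exact order/coefficient rule of Ghosh and Chaudhury (2016), discussed next, differs in detail:

```python
import numpy as np

def fourier_range_kernel(sigma_r, T, eps=0.01, N_max=64):
    """Approximate the Gaussian range kernel by Eq. (6). Here c_n is the
    closed-form Fourier coefficient of the 2T-periodized Gaussian, and N
    grows until the tolerance of Eq. (8) holds on a unit grid of t."""
    v = np.pi / T
    t = np.arange(-T, T + 1, dtype=float)
    phi = np.exp(-t ** 2 / (2.0 * sigma_r ** 2))               # Eq. (2)
    for N in range(1, N_max + 1):
        n = np.arange(-N, N + 1)
        c = (np.sqrt(2 * np.pi) * sigma_r / (2 * T)) * \
            np.exp(-0.5 * (sigma_r * v * n) ** 2)
        phi_hat = (c * np.exp(1j * v * np.outer(t, n))).sum(axis=1).real
        if np.max(np.abs(phi - phi_hat)) <= eps:               # Eq. (8)
            return N, c
    return N_max, c
```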

For a given range kernel φ(t) and tolerance ε, the specific solution for the approximation order N and the corresponding coefficients cn is provided in Ghosh and Chaudhury (2016). By using Equation (6) to approximate Equation (2), we can reformulate Equation (1) as:

\hat{u}(p) = \frac{E(p)}{H(p)},    (9)

where û(p) is an approximation of u(p), and E(p) and H(p) represent the approximate values of the numerator and denominator of Equation (1), respectively, which can be expressed as:

E(p) = \sum_{q \in \Omega_p} w(q-p)\, \hat{\varphi}(f(q)-f(p))\, f(q),    (10)
H(p) = \sum_{q \in \Omega_p} w(q-p)\, \hat{\varphi}(f(q)-f(p)).    (11)

We can further express Equations (10) and (11) as:

E(p) = \sum_{n=-N}^{N} c_n \exp(-\tau n v f(p))\, e_n(p),    (12)
H(p) = \sum_{n=-N}^{N} c_n \exp(-\tau n v f(p))\, h_n(p),    (13)

where en (p) and hn (p) are expressed as follows:

e_n(p) = \sum_{q \in \Omega_p} w(q-p)\, f(q) \exp(\tau n v f(q)),    (14)
h_n(p) = \sum_{q \in \Omega_p} w(q-p)\, \exp(\tau n v f(q)).    (15)

Since a box function is employed for the spatial kernel, the adaptive bilateral filtering decomposes into a series of box filters. Therefore, Equations (14) and (15) simplify to:

e_n(p) = \sum_{q \in \Omega_p} f(q) \exp(\tau n v f(q)),    (16)
h_n(p) = \sum_{q \in \Omega_p} \exp(\tau n v f(q)).    (17)

Computing en(p) and hn(p) by summing point-by-point over the neighborhood of pixel p is expensive. Hence, in our proposal, we compute Equations (16) and (17) with the recursive algorithm of Crow (1984). Let r(q) denote the integrand of en(p):

r(q) = f(q)\exp(\tau n v f(q)).    (18)

First, we compute the integral image R (p) at the pixel p:

R(p) = R(x, y) = \sum_{k_1=1}^{x} \sum_{k_2=1}^{y} r(k_1, k_2),    (19)

where (x, y) is the coordinate of pixel p and (k1, k2) is the coordinate of the pixel in the integral region.

By using recursive theory, the integral image R (x + 1, y + 1) at the pixel (x + 1, y + 1) can be expressed as:

R(x+1, y+1) = r(x+1, y+1) + R(x+1, y) + R(x, y+1) - R(x, y).    (20)

For any scale Wp, en (p) can be computed as follows:

e_n(p) = R(x+W_p, y+W_p) - R(x-W_p-1, y+W_p) - R(x+W_p, y-W_p-1) + R(x-W_p-1, y-W_p-1).    (21)

Similarly, hn (p) can be obtained.
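
A compact sketch of Equations (19)–(21): one summed-area table answers a box sum of arbitrary, per-pixel size with four lookups, which is what makes the cost independent of W_p (`adaptive_box_sum` is our name; borders are handled by clipping, which shrinks the window near the image edge):

```python
import numpy as np

def adaptive_box_sum(r, W):
    """Eqs. (19)-(21): sum r over the (2W_p+1)^2 window of every pixel
    using one summed-area table (Crow, 1984), i.e., a fixed number of
    operations per pixel regardless of W_p."""
    H, Wd = r.shape
    R = np.zeros((H + 1, Wd + 1), dtype=r.dtype)
    R[1:, 1:] = r.cumsum(axis=0).cumsum(axis=1)        # integral image, Eq. (19)
    y, x = np.mgrid[0:H, 0:Wd]
    y0 = np.clip(y - W, 0, H);  y1 = np.clip(y + W + 1, 0, H)
    x0 = np.clip(x - W, 0, Wd); x1 = np.clip(x + W + 1, 0, Wd)
    # Eq. (21): three additions/subtractions over four corner values
    return R[y1, x1] - R[y0, x1] - R[y1, x0] + R[y0, x0]
```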

In conclusion, we can calculate Equations (10) and (11) from en(p) and hn(p): instead of directly computing the scale-adaptive bilateral filter, we replace each convolution with pointwise operations through the Fourier expansion of the range kernel, as shown in Equations (12) and (13). Furthermore, with the recursive algorithm we can compute en(p) and hn(p) at O(1) complexity; that is, en(p) and hn(p) require a fixed number of operations for any scale Wp.

To be specific, Equation (21) requires only three additions, so it takes three additions to compute each of Equations (16) and (17) per pixel. We then compute Equations (12) and (13) from Equations (16) and (17) by pointwise operations, and the computation of Equation (9) follows from Equations (12) and (13), which shows that the scale-adaptive bilateral filter can be computed at O(1) complexity.

We compute an approximation of the output value in this paper; in particular, we define the error as:

\|u - \hat{u}\|_\infty = \max\{|u(p) - \hat{u}(p)| : p \in f\},    (22)

which is the largest pixelwise difference between the exact and approximate scale-adaptive bilateral filtering.

Table. Necessary symbols.

According to Equation (6), the error comes from the approximation of the range kernel; meanwhile, for all t ∈ [−T, T], |\varphi(t) - \hat{\varphi}(t)| \le \varepsilon. From the conclusion of Ghosh and Chaudhury (2016), we can ensure that Equation (22) stays within a tolerance:

\|u - \hat{u}\|_\infty \le \frac{2T\varepsilon}{w(0) - \varepsilon}.    (23)

Since there are complex and irregular textures in many natural images, generally, single iterative filtering cannot completely smooth out the textures. Considering this limitation, we adopt the multiple iteration operation of adaptive bilateral filtering in this paper. Algorithm 1 summarizes the overall process of our method.

Algorithm 1. Structure-preserving bilateral texture filtering.
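
Putting the pieces together, here is a pipeline sketch of Algorithm 1 assembled from the helper sketches above (their names are ours, not the authors' code; the global intensity range serves as an upper bound for T in Equation 7):

```python
import numpy as np

def smooth(f, sigma_r=35.0, eta=10, I=2, m=4, lam=10.0, eps=0.01):
    """Pipeline sketch of Algorithm 1 for a 2-D image in [0, 255],
    using structure_measure, kernel_scale, fourier_range_kernel, and
    adaptive_box_sum defined in the sketches above."""
    u = f.astype(float)
    for _ in range(I):
        S = structure_measure(u, m=m)                    # Eqs. (3)-(4)
        Wp = kernel_scale(S, eta=eta, lam=lam)           # Eq. (5)
        T = max(float(u.max() - u.min()), 1.0)           # bound on Eq. (7)
        N, c = fourier_range_kernel(sigma_r, T, eps)     # Eqs. (6)-(8)
        v = np.pi / T
        E = np.zeros(u.shape, dtype=complex)
        D = np.zeros(u.shape, dtype=complex)
        for k, n in enumerate(range(-N, N + 1)):
            ker = np.exp(1j * n * v * u)                 # exp(tau n v f(q))
            e_n = adaptive_box_sum(u * ker, Wp)          # Eq. (16)
            h_n = adaptive_box_sum(ker, Wp)              # Eq. (17)
            E += c[k] * np.conj(ker) * e_n               # Eq. (12)
            D += c[k] * np.conj(ker) * h_n               # Eq. (13)
        u = (E / D).real                                 # Eq. (9)
    return u
```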

Experiments and Results

Parameters Setting

Our method is implemented in MATLAB. The relevant parameters of our algorithm are σr, m, λ, η, and I. σr determines the scale of the Gaussian range kernel; we adopt the setting suggested by Ghosh et al. (2019), with σr in [20, 40]. For structure measurement, we manually set the radius of the detection neighborhood to m = 4, which handles the majority of cases. λ is used to normalize the structure measurement values to the interval [0, 1], so we fix λ = 10 throughout.

The relatively important parameters are the upper limit of the spatial kernel scale η and the iteration number I. The value of η depends on the roughness of the textures and the sharpness of the structures; following the parameter recommendation of Ghosh et al. (2019), we set η in [8, 16]. In most situations, I = 2 achieves the desired filtering effect. Figure 3 shows the filtering results for different combinations of η and I.

Figure 3. Filtering results with various parameter combinations. (A) Input image; (B) η=10,I=2; (C) η=10,I=3; (D) η=15,I=1; (E) η=15,I=2; (F) η=15,I=3. Reproduced with permission from Li Xu, Qiong Yan, Yang Xia, Jiaya Jia, available at http://www.cse.cuhk.edu.hk/%7eleojia/projects/texturesep/.

Visual Comparison

For subjective evaluation, we compare our algorithm with state-of-the-art texture smoothing techniques, including relative total variation (RTV) (Xu et al., 2012), structure gradient and texture decorrelation regularization (SGTD) (Liu et al., 2013), the rolling guidance filter (RGF) (Zhang et al., 2014), bilateral texture filtering (BTF) (Cho et al., 2014), scale-aware structure-preserving texture filtering (SATF) (Jeon et al., 2016), and relativity-of-Gaussian (ROG) (Cai et al., 2017). We use the suggested parameters to obtain optimal filtering results for the previous methods. In Figures 4–6, we display the visual comparison for three images containing various textures and structures. We choose these three images because they contain different types of textures and differently shaped structures, which illustrates the superiority of our algorithm from many aspects.

Figure 4. Visual effect comparison of texture filtering results. (A) Input image; (B) RTV (λ=0.015,σ=6); (C) SGTD (mu=0.31); (D) RGF (σs=5,σr=1); (E) BTF (k=9); (F) SATF (sr=0.1,se=0.05); (G) ROG (σ1=1,σ2=3); (H) ours (σr=35,η=10,I=2). Reproduced with permission from Chengfang Song, Chunxia Xiao, Ling Lei, and Haigang Sui, available at https://sci-hub.wf/10.1111/cgf.13005.

Figure 4 shows the filtered results of different methods on the mosaic art “Pompeii fish mosaic,” where the image contains coarse textures and highlighted small-scale structure edges. All methods can eliminate fine details in homogeneous regions; however, SGTD and ROG perform better at removing high-contrast textures. Moreover, regarding the preservation of the small structures highlighted in the image, SGTD, RGF, BTF, and ROG can hardly preserve the fine structures of the fish's eyes, which are overly smoothed because the window size is oversized. In the enlarged box, it is clear that RTV, SGTD, and ROG may cause excessive sharpness near structure edges, which appears as unwanted jaggy artifacts.

Compared with these advanced methods, our algorithm works better at eliminating coarse textures while preserving main structures as much as possible, as shown in Figure 4H. In particular, our method completely preserves the structure of the fish's eyes.

Figure 5 shows the smoothing effect on a face image. In particular, we focus on the region highlighted with the red box, where the meaningful structures and the textures on the left and right sides of the nose bridge are very similar in appearance. Since the previous methods apply texture filtering with a fixed-scale kernel, the visual effect is not always good. The results of RTV, BTF, and ROG exhibit unwanted artifacts at the bridge of the nose. Besides, RGF and SATF perform poorly at removing high-contrast textures.

Figure 5. Texture filtering results comparison. (A) Input image; (B) RTV (λ=0.015,σ=8); (C) SGTD (mu=0.31); (D) RGF (σs=5,σr=0.1); (E) BTF (k=9); (F) SATF (sr=0.1,se=0.05); (G) ROG (σ1=1,σ2=4); (H) ours (σr=35,η=10,I=2). Reproduced with permission from Sanjay Ghosh, Ruturaj G. Gavaskar, Debasisha Panda and Kunal N. Chaudhury, available at https://sci-hub.wf/10.1109/TCSVT.2019.2916589.

Our algorithm handles pixels around structures with a small scale and pixels in textural regions with a large scale. In Figure 5H, we obtain a better filtering result than the state-of-the-art methods, and our method removes coarse textures without creating artifacts.

Figure 6 shows a comparison of small structures in different filtering results on the mosaic art “fish.” All the existing methods blur the fine structures and cause artifacts near edges. The problem is relatively serious in the results of RTV, SGTD, RGF, and ROG, where the whiskers and teeth of the fish even stick together. Meanwhile, in BTF and SATF, the teeth of the fish are barely preserved.

Figure 6. Small structures comparison of different filtering results. (A) Input image; (B) RTV (λ=0.015,σ=6); (C) SGTD (mu=0.31); (D) RGF (σs=4,σr=0.05); (E) BTF (k=9); (F) SATF (sr=0.1,se=0.05); (G) ROG (σ1=1,σ2=3); (H) ours (σr=25,η=10,I=2). Reproduced with permission from Hyunjoon Lee, Junho Jeon, Junho Kim and Seungyong Lee, available at https://sci-hub.wf/10.1111/cgf.12875.

In contrast, our method achieves superior preservation of multi-scale structures, as shown in Figure 6H; the edges and details maintain the original structure as much as possible.

Figure 7 shows the comparison of denoising effects on a gray image. Intuitively, the results of RGF, BTF, and SATF are not ideal for gray-image noise removal, as they cannot completely smooth the noise in the background. RTV and ROG cause edge sharpening in their smoothing results.

Figure 7. Comparison of gray image denoising. (A) Input image; (B) RTV (λ=0.015,σ=6); (C) SGTD (mu=0.31); (D) RGF (σs=4,σr=0.05); (E) BTF (k=9); (F) SATF (sr=0.1,se=0.05); (G) ROG (σ1=1,σ2=3); (H) ours (σr=25,η=10,I=2). The picture can be found in the MATLAB public dataset, available at https://matlab.mathworks.com/.

In comparison, our proposed algorithm removes the noise of the gray image while retaining the edge features of the people in the image, as shown in Figure 7H.

Quantitative Evaluation

Widely used objective quantitative evaluation methods include the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM). PSNR is an image quality measure based on error sensitivity, while SSIM comprehensively measures image similarity in terms of brightness, contrast, and structure. We also adopt the Feature Similarity index (FSIM) (Zhang et al., 2011) and the Blind Image Spatial Quality Evaluator (BRISQUE) (Chen et al., 2018) as evaluation indexes. We selected four ground truth images from Dong et al. (2015), Abiko and Ikehara (2019), and Shen et al. (2015), and added salt-and-pepper noise along with periodic noise to these four images to obtain the texture images, as shown in Figure 8. PSNR, SSIM, and FSIM are calculated with the ground truth images as references, whereas BRISQUE is obtained from the filtered result alone.
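
For reference, PSNR and SSIM can be computed with scikit-image as below (assuming grayscale arrays; FSIM and BRISQUE require third-party implementations and are omitted from this sketch):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(ground_truth, filtered):
    """PSNR and SSIM of a filtered grayscale image against its ground
    truth, using the ground truth's intensity range as data_range."""
    rng = float(ground_truth.max() - ground_truth.min())
    psnr = peak_signal_noise_ratio(ground_truth, filtered, data_range=rng)
    ssim = structural_similarity(ground_truth, filtered, data_range=rng)
    return psnr, ssim
```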

Figure 8. Images used for quantitative evaluation. (A–D) Ground truth images; (E–H) Images with noise. Panel (A) is reproduced with permission from Ryo Abiko, Masaaki Ikehara, available at https://www.jstage.jst.go.jp/article/transinf/E102.D/10/E102.D_2018EDP7437/_pdf. Panel (B) is reproduced with permission from Xiaoyong Shen, Chao Zhou, Li Xu and Jiaya Jia, available at http://www.cse.cuhk.edu.hk/leojia/projects/mutualstructure/. Panels (C,D) are reproduced with permission from Chao Dong, Chen Change Loy, Kaiming He and Xiaoou Tang, available at http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html.

Table 1 reports the mean values of the objective evaluation indexes over the four images in Figure 8. First, on PSNR our method performs best among the seven methods, which suggests that our results have less image distortion; RTV and SATF obtain PSNR results second only to ours. Concerning SSIM, our method also achieves the highest value. In contrast, we obtain only the third-highest FSIM value, behind BTF and SATF. In general, the similarity between our filtered results and the ground truth images is relatively good. Finally, for BRISQUE, where a smaller score implies better perceptual quality, our method obtains the smallest value.

Table 1. Comparison of the mean values of the objective evaluation indexes.

Timing Data

To verify that the complexity of our algorithm does not depend on the size of the spatial kernel, we set different values for the upper limit of the spatial kernel scale; that is, we change the scale width used to smooth a 400 × 324 image and record the timings required for a single iteration.

Table 2 shows the timing statistics for a single iteration on the image in Figure 9A, and Figures 9B–D show the filtering results for different η. The timings required for the filtering process differ little across values of η, which illustrates that the complexity of the adaptive bilateral filtering does not depend on the kernel scale. This verifies that our approximation reduces the complexity of bilateral filtering to O(1).

Table 2. Timing statistics for a single iteration of a color image.

Figure 9. Filtered results using our method. (A) Input image; (B) σr=25, η=8, I=2; (C) σr=25, η=12, I=2; (D) σr=25, η=16, I=2. Reproduced with permission from Li Xu, Qiong Yan, Yang Xia, Jiaya Jia, available at http://www.cse.cuhk.edu.hk/%7eleojia/projects/texturesep/.

Applications

Detail Enhancement

Our approach can be applied to image detail enhancement (Fei et al., 2017), which aims to highlight image details and improve the visual effect of the image. Figure 10 displays the application of our method to detail enhancement: we first subtract the filtered image from the input image to obtain the texture layer, magnify it three times, and superimpose it on the input image to achieve detail enhancement.
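
A minimal sketch of this detail-enhancement step (the 8-bit clipping range is our assumption):

```python
import numpy as np

def enhance_details(f, smoothed, k=3.0):
    """Detail enhancement as described above: the texture layer is the
    difference between input and filtered image, magnified k times
    (k = 3 in Figure 10) and added back to the input."""
    detail = f.astype(float) - smoothed
    return np.clip(f + k * detail, 0, 255)
```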

Figure 10. Detail enhancement. (A,D) Input images; (B,E) Filtered results using our method; (C,F) The detail enhancement results. Reproduced with permission from Wei Liu, Pingping Zhang, Xiaogang Chen, Chunhua Shen and Xiaolin Huang, available at https://arxiv.53yu.com/pdf/1812.07122.

Edge Detection

High-contrast textures retain irrelevant information and produce false edges in edge detection. Because of this severe influence of textures, we apply our method for texture removal before edge detection. As shown in Figure 11, compared to edge detection on the original image, the edge map of the filtered image extracted by the Canny operator (Canny, 1986) is clearer.
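
A small sketch of this pre-filtering step using OpenCV's Canny detector (the thresholds are illustrative, not the paper's settings):

```python
import cv2
import numpy as np

def edges_before_after(f_uint8, smooth_fn, low=50, high=150):
    """Canny edge maps of the original and the texture-filtered image,
    as in Figure 11. smooth_fn is any smoother, e.g., our filter."""
    filtered = np.clip(smooth_fn(f_uint8.astype(float)), 0, 255).astype(np.uint8)
    return cv2.Canny(f_uint8, low, high), cv2.Canny(filtered, low, high)
```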

Figure 11. Edge detection. (A) Input image; (B) Edge detection of input image; (C) Filtered result using our method; (D) Edge detection of filtered result. Reproduced with permission from Li Xu, Qiong Yan, Yang Xia, Jiaya Jia, available at http://www.cse.cuhk.edu.hk/%7eleojia/projects/texturesep/.

Image Abstraction and Pencil Sketching

The texture smoothing method proposed in this paper can also be applied to image abstraction and pencil sketching. Following Winnemöller et al. (2006), our method replaces the bilateral filter to generate abstraction results. Furthermore, we obtain pencil sketching results based on the image abstraction. The results are shown in Figure 12.
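
A hedged sketch of the abstraction step in the spirit of Winnemöller et al. (2006), with our filter substituted for the bilateral filter; the quantization levels and edge-mask parameters are illustrative assumptions, not settings from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def abstract_image(gray, smooth_fn, levels=8):
    """Smooth, quantize luminance into a few bins, then darken along a
    difference-of-Gaussians edge mask. A pencil-sketch variant can be
    built from the inverted edge mask instead."""
    u = smooth_fn(gray.astype(float))
    q = np.round(u / 255.0 * (levels - 1)) / (levels - 1) * 255.0
    dog = gaussian_filter(u, 1.0) - gaussian_filter(u, 1.6)
    edges = (dog > 0.5).astype(float)
    return q * (1.0 - 0.6 * edges)
```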

Figure 12. Image abstraction and pencil sketching. (A,D) Input images; (B,E) Image abstraction results; (C,F) Pencil sketching results. Panel (A) is reproduced with permission from JiaXianYao, available at https://github.com/JiaXianYao/Bilateral-Texture-Filtering. Panel (D) is reproduced with permission from Sylvain Paris, Samuel W. Hasinoff and Jan Kautz, available at https://cacm.acm.org/magazines/2015/3/183587-local-laplacian-filters/abstract.

Conclusion

To preserve multi-scale structures while filtering various textures, we propose an adaptive bilateral texture filter for image smoothing whose spatial kernel scale is adjusted adaptively. To distinguish prominent structures from textures, we combine gradient information with four-directional structure inspection to generate the structure map of the image. Then, the optimal spatial kernel scale for each pixel is estimated via structure measurement, which assigns large smoothing windows in texture regions and small smoothing windows around structures. In addition, the Fourier expansion of the range kernel is used to reduce the computational complexity. Through subjective and objective evaluation of the experimental results, we conclude that our method outperforms existing methods in texture removal and structure preservation.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author/s.

Author Contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62001452), the Fujian Science and Technology Innovation Laboratory for Optoelectronic Information of China (No. 2021ZZ116), and the Science and Technology Program of Quanzhou (Nos. 2020C071 and 2020C049R).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abiko, R., and Ikehara, M. (2019). Fast edge preserving 2D smoothing filter using indicator function. IEICE Trans. Inf. Syst. 102, 2025–2032. doi: 10.1587/transinf.2018EDP7437

Cai, B., Xing, X., and Xu, X. (2017). “Edge/structure preserving smoothing via relativity-of-Gaussian,” in 2017 IEEE International Conference on Image Processing (ICIP). IEEE p. 250–254. doi: 10.1109/ICIP.2017.8296281

Canny, J. (1986). A computational approach to edge detection. IEEE Trans. Patt. Analy. Mach. intel. 8, 679–698. doi: 10.1109/TPAMI.1986.4767851

Chaudhury, K. N. (2011). Constant-time filtering using shiftable kernels. IEEE Signal Process. Lett. 18, 651–654. doi: 10.1109/LSP.2011.2167967

Chaudhury, K. N. (2015). “Fast and accurate bilateral filtering using Gauss-polynomial decomposition,” in 2015 IEEE International Conference on Image Processing (ICIP) (Montreal, QC: IEEE), p. 2005–2009. doi: 10.1109/ICIP.2015.7351152

Chaudhury, K. N., Sage, D., and Unser, M. (2011). Fast O(1) bilateral filtering using trigonometric range kernels. IEEE Trans. Image Proces. 20, 3376–3382. doi: 10.1109/TIP.2011.2159234

Chen, Q., Xu, J., and Koltun, V. (2017). “Fast image processing with fully-convolutional networks,” in Proceedings of the IEEE International Conference on Computer Vision (Venice: IEEE), p. 2497–2506. doi: 10.1109/ICCV.2017.273

Chen, X., Zhang, Q., Lin, M., Yang, G., and He, C. (2018). No-Reference color image quality assessment: from entropy to perceptual quality. arXiv preprint arXiv:1812.10695, 1–12. doi: 10.1186/s13640-019-0479-7

Cho, H., Lee, H., Kang, H., and Lee, S. (2014). Bilateral texture filtering. ACM Trans. Graph. 33, 1–8. doi: 10.1145/2601097.2601188

Crow, F. C. (1984). “Summed-area tables for texture mapping,” in Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques (New York, NY: Association for Computing Machinery), p. 207–212. doi: 10.1145/964965.808600

Dong, C., Loy, C. C., He, K., and Tang, X. (2015). Image super-resolution using deep convolutional networks. IEEE Trans. Patt. Analy. Mach. Intel. 38, 295–307. doi: 10.1109/TPAMI.2015.2439281

Farbman, Z., Fattal, R., Lischinski, D., and Szeliski, R. (2008). Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Trans. Graph. 27, 1–10. doi: 10.1145/1360612.1360666

Fei, K., Zhe, W., Chen, W., Wu, X., and Li, Z. (2017). Intelligent detail enhancement for exposure fusion. IEEE Trans. Multim. 20, 484–95. doi: 10.1109/TMM.2017.2743988

Gao, X., Wu, X., Xu, P., Guo, S., Liao, M., and Wang, W. (2020). Semi-supervised texture filtering with shallow to deep understanding. IEEE Trans. Image Proces. 29, 7537–7548. doi: 10.1109/TIP.2020.3004043

Gavaskar, R. G., and Chaudhury, K. N. (2018). Fast adaptive bilateral filtering. IEEE Trans. Image Proces. 28, 779–790. doi: 10.1109/TIP.2018.2871597

Ghosh, S., and Chaudhury, K. N. (2016). On fast bilateral filtering using Fourier kernels. IEEE Signal Process. Lett. 23, 570–573. doi: 10.1109/LSP.2016.2539982

Ghosh, S., Gavaskar, R. G., Panda, D., and Chaudhury, K. N. (2019). Fast scale-adaptive bilateral texture smoothing. IEEE Trans. Circ. Syst. Video Technol. 30, 2015–26. doi: 10.1109/TCSVT.2019.2916589

He, K., Sun, J., and Tang, X. (2013). Guided image filtering. IEEE Trans. Patt. Analy. Mach. intel. 35, 1397–1409. doi: 10.1109/TPAMI.2012.213

Huang, W., Bi, W., Gao, G., Zhang, Y. P., and Zhu, Z. (2018). Image smoothing via a scale-aware filter and L0 norm. IET Image Process. 12, 1521–1528. doi: 10.1049/iet-ipr.2017.0719

Jeon, J., Lee, H., Kang, H., and Lee, S. (2016). “Scale-aware structure-preserving texture filtering,” in Computer Graphics Forum (Hoboken, NJ: Wiley Online Library), p. 77–86. doi: 10.1111/cgf.13005

Jia, Y., and Zhang, W. (2019). Efficient and adaptive tone mapping algorithm based on guided image filter. Int. J. Patt. Recogn. Artif. Intel. 34, 2054012. doi: 10.1142/S0218001420540129

Karacan, L., Erdem, E., and Erdem, A. (2013). Structure-preserving image smoothing via region covariances. ACM Trans. Graph. 32, 1–11. doi: 10.1145/2508363.2508403

Kim, Y., Ham, B., Do, M. N., and Sohn, K. (2018). Structure-texture image decomposition using deep variational priors. IEEE Trans. Image Proces. 28, 2692–2704. doi: 10.1109/TIP.2018.2889531

Lee, H., Jeon, J., Kim, J., and Lee, S. (2017). “Structure-texture decomposition of images with interval gradient,” in Computer Graphics Forum (Hoboken, NJ: Wiley Online Library), p. 262–274. doi: 10.1111/cgf.12875

Liu, Q., Liu, J., Dong, P., and Liang, D. (2013). “SGTD: Structure gradient and texture decorrelating regularization for image decomposition,” in Proceedings of the IEEE International Conference on Computer Vision (Sydney, NSW: IEEE), p. 1081–1088. doi: 10.1109/ICCV.2013.138

Liu, W., Chen, X., Shen, C., Liu, Z., and Yang, J. (2017). “Semi-global weighted least squares in image filtering,” in Proceedings of the IEEE International Conference on Computer Vision (Cham: IEEE), p. 5861–5869. doi: 10.1109/ICCV.2017.624

Liu, W., Zhang, P., Huang, X., Yang, J., Shen, C., and Reid, I. (2020). Real-time image smoothing via iterative least squares. ACM Trans. Graph. 39, 1–24. doi: 10.1145/3388887

Liu, Y., Liu, G., Liu, H., and Liu, C. (2020). Structure-aware texture filtering based on local histogram operator. IEEE Access. 8, 43838–43849. doi: 10.1109/ACCESS.2020.2977408

Lu, K., You, S., and Barnes, N. (2018). “Deep texture and structure aware filtering network for image smoothing,” in Proceedings of the European Conference on Computer Vision (ECCV). p. 217–233. doi: 10.1007/978-3-030-01225-0_14

Perona, P., and Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Trans. Patt. Analy. Mach. Intel. 12, 629–639. doi: 10.1109/34.56205

Rudin, L. I., Osher, S., and Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D. 60, 259–268. doi: 10.1016/0167-2789(92)90242-F

Shen, X., Zhou, C., Xu, L., and Jia, J. (2015). “Mutual-structure for joint filtering,” in Proceedings of the IEEE International Conference on Computer Vision (Santiago: IEEE), p. 3406–3414. doi: 10.1109/ICCV.2015.389

Song, C., and Xiao, C. (2017). “Structure-preserving bilateral texture filtering,” in 2017 International Conference on Virtual Reality and Visualization (ICVRV) (Zhengzhou: IEEE), p. 191–196. doi: 10.1109/ICVRV.2017.00046

Song, C., Xiao, C., Lei, L., and Sui, H. (2019). “Scale-adaptive structure-preserving texture filtering,” in Computer Graphics Forum (Hoboken, NJ: Wiley Online Library), p. 149–158. doi: 10.1111/cgf.13824

Song, C., Xiao, C., Li, X., Li, J., and Sui, H. (2018). Structure-preserving texture filtering for adaptive image smoothing. J. Visual Langu. Comput. 45, 17–23. doi: 10.1016/j.jvlc.2018.02.002

Subr, K., Soler, C., and Durand, F. (2009). Edge-preserving multiscale image decomposition based on local extrema. ACM Trans. Graph. 28, 1–9. doi: 10.1145/1618452.1618493

Tomasi, C., and Manduchi, R. (1998). “Bilateral filtering for gray and color images,” in Sixth international conference on computer vision (IEEE Cat. No. 98CH36271) (Bombay: IEEE), p. 839–846.

Winnemöller, H., Olsen, S. C., and Gooch, B. (2006). Real-time video abstraction. ACM Trans. Graph. 25, 1221–1226. doi: 10.1145/1141911.1142018

Xu, L., Lu, C., Xu, Y., and Jia, J. (2011). “Image smoothing via L0 gradient minimization,” in Proceedings of the 2011 SIGGRAPH Asia Conference. p. 1–12. doi: 10.1145/2070781.2024208

Xu, L., Ren, J., Yan, Q., Liao, R., and Jia, J. (2015). “Deep edge-aware filters,” in International Conference on Machine Learning (Lille: PMLR), p. 1669–1678.

Xu, L., Yan, Q., Xia, Y., and Jia, J. (2012). Structure extraction from texture via relative total variation. ACM Trans. Graph. 31, 1–10. doi: 10.1145/2366145.2366158

Xu, P., and Wang, W. (2018). Improved bilateral texture filtering with edge-aware measurement. IEEE Trans. Image Proces. 27, 3621–3630. doi: 10.1109/TIP.2018.2820427

Xu, P., and Wang, W. (2019). Structure-aware window optimization for texture filtering. IEEE Trans. Image Proces. 28, 4354–4363. doi: 10.1109/TIP.2019.2904847

Zhang, L., Zhang, L., Mou, X., and Zhang, D. (2011). FSIM: a feature similarity index for image quality assessment. IEEE Trans. Image Proces. 20, 2378–2386. doi: 10.1109/TIP.2011.2109730

Zhang, Q., Shen, X., Xu, L., and Jia, J. (2014). “Rolling guidance filter,” in European Conference on Computer Vision (Cham: Springer), p. 815–830. doi: 10.1007/978-3-319-10578-9_53

Zhao, L., Bai, H., Liang, J., Wang, A., Zeng, B., and Zhao, Y. (2019). Local activity-driven structural-preserving filtering for noise removal and image smoothing. Signal Processing 157, 62–72. doi: 10.1016/j.sigpro.2018.11.012

Zhu, J. Y., Park, T., Isola, P., and Efros, A. A. (2017). “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proceedings of the IEEE International Conference on Computer Vision (Venice: IEEE), p. 2223–2232. doi: 10.1109/ICCV.2017.244

Keywords: image smoothing, bilateral filter, structure measurement, adaptive spatial kernel, Fourier approximation

Citation: Xu H, Zhang Z, Gao Y, Liu H, Xie F and Li J (2022) Adaptive Bilateral Texture Filter for Image Smoothing. Front. Neurorobot. 16:729924. doi: 10.3389/fnbot.2022.729924

Received: 24 June 2021; Accepted: 09 May 2022;
Published: 27 June 2022.

Edited by:

Florian Röhrbein, Technische Universität Chemnitz, Germany

Reviewed by:

Ye Yuan, Chongqing Institute of Green and Intelligent Technology (CAS), China
Yan Wang, Chongqing Normal University, China

Copyright © 2022 Xu, Zhang, Gao, Liu, Xie and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jun Li, junli@fjirsm.ac.cn

†These authors contributed equally to this work and share first authorship.
