Low-rank tensor recovery via non-convex regularization, structured factorization and spatio-temporal characteristics

https://doi.org/10.1016/j.patcog.2023.109343

Highlights

  • For the LRTC problem, we define a novel non-convex tensor pseudo-norm to replace the weighted sum of the tensor nuclear norm (WSTNN) as a tighter rank approximation.

  • For the TRPCA problem, we first introduce the noise analysis and decompose the video sequence into three terms, which are the low-rank static background, the sparse foreground and the dynamic background.

  • Then, we introduce a spatio-temporal matrix to make better use of the inherent spatio-temporal characteristics of the low-rank static background and the sparse foreground.

  • Finally, we introduce an incoherent term to constrain the sparse foreground and the dynamic background to improve the separability.

Abstract

Recently, convex low-rank 3rd-order tensor recovery has attracted considerable attention. However, there are some limitations to the convex relaxation approach, which may yield biased estimators. To overcome this disadvantage, we develop a novel non-convex tensor pseudo-norm to replace the weighted sum of the tensor nuclear norm as a tighter rank approximation. Then, in tensor robust principal component analysis, we introduce a noise analysis to separate the sparse foreground from the dynamic background more accurately. Furthermore, by introducing a spatio-temporal matrix, we make better use of the inherent spatio-temporal characteristics of the low-rank static background and the sparse foreground. Finally, we introduce an incoherent term to constrain the sparse foreground and the dynamic background to improve the separability. Some preliminary numerical examples on color image, video, and face image data sets are presented to illustrate the efficiency of our proposed methods.

Introduction

Recently, a great deal of mathematical effort has been devoted to tensors, the high-order extension of matrices, as an important data format for multi-dimensional data applications, such as traffic data imputation [1], [2], multi-class learning [3], hyperspectral image denoising [4], color image and gray video recovery [5], [6], [7], magnetic resonance imaging (MRI) data recovery [8], [9], submodule clustering [10], anomaly detection [11], high dimensional signal processing [12] and multilinear subspace learning [13]. Due to damage to the collection equipment, interference from noise, and the difficulty of data collection, the collected data are often incomplete or grossly corrupted. Thus, in this paper we are concerned with the 3rd-order tensor recovery problem, drawing upon recent advances in low-rank tensor completion (LRTC) and tensor robust principal component analysis (TRPCA).

The LRTC problem is to find a low-rank tensor from observed incomplete data. Accordingly, its mathematical model is written as
$$\min_{\mathcal{X}} \ \operatorname{rank}(\mathcal{X}), \quad \text{s.t.} \quad P_{\Omega}(\mathcal{X}) = P_{\Omega}(\mathcal{M}), \tag{1.1}$$
where rank(·) is a tensor rank and Ω is an index set locating the observed data. $P_{\Omega}$ is a projection operator that keeps the entries of $\mathcal{X}$ in Ω and sets all others to zero.
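
For concreteness, the following NumPy sketch illustrates the sampling operator $P_{\Omega}$ acting on a 3rd-order tensor; the array sizes and the 40% observation rate are arbitrary choices for illustration, not settings from the paper.

```python
import numpy as np

def P_Omega(X, mask):
    """Projection onto the observed index set Omega: keep the entries of X
    marked True in mask and set all others to zero."""
    return np.where(mask, X, 0.0)

# Toy usage (hypothetical sizes): a random tensor with 40% of entries observed.
M = np.random.randn(30, 30, 10)
mask = np.random.rand(*M.shape) < 0.4
observed = P_Omega(M, mask)  # the data available to a completion algorithm
```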

Different tensor ranks lead to different LRTC models of (1.1) with different solution methods. With an eye towards applications, many researchers have studied the CANDECOMP/PARAFAC (CP) rank and the Tucker rank, which correspond to the CP decomposition [14], block term decomposition [15] and Tucker decomposition [16], respectively. The computation of the CP rank is NP-hard [17], whereas the Tucker rank can be obtained directly by unfolding the tensor into matrices and computing the matrix ranks. Therefore, most LRTC models are based on the Tucker rank. For example, Li et al. [18] developed tensor nuclear norm-based methods to simultaneously recover both low Tucker rank and sparse tensors from various degraded observations. Li et al. [19] applied the alternating direction method of multipliers (ADMM), based on exact and inexact iteratively reweighted algorithms, to solve a non-convex p-norm relaxation model for the low Tucker rank tensor recovery problem. However, Tucker rank-based models require direct unfolding first, which destroys the original internal structure of the 3D-array data and loses some important information [20], [21]. Recently, based on the tensor-tensor product (t-product) and the tensor singular value decomposition (t-SVD) [22], Kilmer et al. [20] proposed the tensor multi-rank and tubal rank. Subsequently, Zhang et al. [23] defined the tensor nuclear norm (TNN) based on the t-SVD and the tensor tubal rank to solve the LRTC problem, which preserves the tensor structure more effectively than direct unfolding.
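
To make the unfolding-based notion concrete, the sketch below (our illustration, not the authors' code) computes the Tucker rank of a 3rd-order tensor as the tuple of matrix ranks of its three mode-k unfoldings.

```python
import numpy as np

def unfold(X, mode):
    """Mode-k unfolding: the mode-k fibers of X become the columns of a matrix."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def tucker_rank(X, tol=1e-10):
    """Tucker rank = the tuple of ranks of the three mode-k unfoldings."""
    return tuple(np.linalg.matrix_rank(unfold(X, k), tol=tol) for k in range(3))

# Toy example: a tensor built from a 2 x 2 x 2 Tucker core, so its Tucker rank is (2, 2, 2).
G = np.random.randn(2, 2, 2)
U1, U2, U3 = (np.random.randn(n, 2) for n in (20, 25, 15))
X = np.einsum('abc,ia,jb,kc->ijk', G, U1, U2, U3)
print(tucker_rank(X))  # expected: (2, 2, 2)
```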

Recently, some non-convex surrogate functions based on the tubal rank have also been used to approximate the rank function. For example, Jiang et al. [24] defined the partial sum of the TNN (PSTNN). Cai et al. [25] developed a new t-Gamma tensor quasi-norm. Wang et al. [26] presented the weighted Schatten function. Chen et al. [27] and Yang et al. [28] proposed the weighted Schatten-p function. However, these only consider the tubal rank of mode-3 and ignore the tubal ranks of mode-1 and mode-2. Zheng et al. [29] generalized the TNN to the weighted sum of the tensor nuclear norm (WSTNN), which considers all modes of the tensor together in a balanced way. However, there is a gap between the rank function and the nuclear norm, especially when the singular values are large (see Fig. 1). Consequently, the WSTNN usually provides an insufficient approximation of the corresponding tensor N-tubal rank.
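
For reference, the TNN of [23] can be evaluated by applying the FFT along the third mode and summing the singular values of the resulting frontal slices; the minimal sketch below assumes the common 1/n3 normalization and is not taken from the paper.

```python
import numpy as np

def tensor_nuclear_norm(X):
    """TNN of a 3rd-order tensor: average, over the Fourier-domain frontal
    slices (FFT along mode 3), of their matrix nuclear norms."""
    n3 = X.shape[2]
    X_hat = np.fft.fft(X, axis=2)
    slice_norms = [np.linalg.svd(X_hat[:, :, k], compute_uv=False).sum()
                   for k in range(n3)]
    return sum(slice_norms) / n3

# Toy usage on a random 30 x 25 x 8 tensor.
print(tensor_nuclear_norm(np.random.randn(30, 25, 8)))
```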

Another typical tensor recovery problem is the TRPCA problem, which aims to recover the low-rank component and the sparse component from observations. More specifically, TRPCA based on t-SVD [30] aims to recover the low tubal rank component $\mathcal{X}$ and to remove the hidden sparse component $\mathcal{E}$ resulting from the noisy observations $\mathcal{O}=\mathcal{X}+\mathcal{E}$ via the following optimization
$$\min_{\mathcal{X},\,\mathcal{E}} \ \|\mathcal{X}\|_{\mathrm{TNN}} + \lambda\|\mathcal{E}\|_{1}, \quad \text{s.t.} \quad \mathcal{O}=\mathcal{X}+\mathcal{E},$$
where λ is a balancing parameter and the sparsity of $\mathcal{E}$ is characterized by the tensor $\ell_1$ norm. The TRPCA model [31], [32], [33] detects moving targets by decomposing the video tensor $\mathcal{O}$ into a low-rank background tensor $\mathcal{X}$ and a sparse moving foreground tensor $\mathcal{E}$. However, in most cases, a video sequence is captured against a complex background into which the foreground objects may blend [34], such as wind-blown leaves, waves, swaying vegetation, fountains, changes in light, ripples on water, or flags flying in the wind. Because the background is not completely static (that is, it also contains dynamic components), the performance of foreground detection is affected by the dynamic pixel components in the background: the dynamic background is easily misjudged as a foreground moving target, resulting in incomplete detections with hollow edges around the moving objects [35].
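
In ADMM-type solvers for models of this form, the sparse component is typically updated by entrywise soft thresholding, the proximal operator of the scaled tensor $\ell_1$ norm; the following generic sketch illustrates that step only and is not the authors' algorithm.

```python
import numpy as np

def soft_threshold(Y, tau):
    """Entrywise soft thresholding, the proximal operator of tau * ||.||_1:
    shrink every entry of Y toward zero by tau."""
    return np.sign(Y) * np.maximum(np.abs(Y) - tau, 0.0)

# Schematic sparse-component update inside one ADMM iteration
# (O: observations, X: current low-rank estimate, Lam: multiplier, mu: penalty):
#   E = soft_threshold(O - X + Lam / mu, lam / mu)
```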

As stated above, we are concerned in this paper with some novel models for the LRTC and TRPCA problems. For the LRTC problem, we extend the WSTNN and define a new tensor $r_p$ pseudo-norm, which better approximates the rank of a 3rd-order tensor, see Fig. 1. For the TRPCA problem, traditional TRPCA is prone to producing voids during background/foreground separation of complex-scene videos and easily misjudges the dynamic background as a moving target, which makes the separation unsatisfactory. In order to address this problem, we introduce a noise analysis and decompose the video sequence into three terms: low-rank static background, sparse foreground, and dynamic background. In order to make better use of the inherent characteristics of the low-rank static background and the sparse foreground (the pixels of two adjacent frontal slices of the low-rank static background are basically the same, and the pixels of two adjacent horizontal slices and lateral slices of the sparse foreground are very close), we introduce temporal and spatial matrices. At the same time, in order to more accurately separate the sparse foreground from the dynamic background and prevent the moving objects from appearing in both the sparse foreground and the dynamic background, we introduce an incoherent term to constrain the sparse foreground and the dynamic background so as to improve the separability. Below is a summary of our main contributions:

  • (1)

    For the LRTC problem, we define a novel non-convex tensor pseudo-norm to replace the WSTNN as a tighter rank approximation; the new pseudo-norm approximates the tensor N-tubal rank more closely than the WSTNN.

  • (2)

    For the TRPCA problem, we first introduce the noise analysis and decompose the video sequence into three terms: low-rank static background, sparse foreground, and dynamic background. This is beneficial for extracting foreground objects in complex scenes with a dynamic background. Then, we introduce the spatio-temporal matrix to make better use of the inherent spatio-temporal characteristics of the low-rank static background and the sparse foreground (a simple illustration of this idea is sketched after this list). Finally, we introduce an incoherent term to constrain the sparse foreground and the dynamic background to improve the separability. This helps us more accurately separate the sparse foreground from the dynamic background and prevents the moving objects from appearing in both the sparse foreground and the dynamic background.
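
One simple way to quantify the adjacent-slice similarity mentioned above is a first-order difference operator applied along the corresponding mode; the sketch below is only our illustration of this idea for the temporal mode, and the paper's exact spatio-temporal matrices are the ones specified in its TRPCA model section.

```python
import numpy as np

def difference_matrix(n):
    """First-order difference operator D of size (n - 1) x n, so that
    (D @ v)[i] = v[i + 1] - v[i]."""
    D = np.zeros((n - 1, n))
    D[np.arange(n - 1), np.arange(n - 1)] = -1.0
    D[np.arange(n - 1), np.arange(1, n)] = 1.0
    return D

def temporal_variation(video):
    """Frame-to-frame variation of a video tensor (height x width x frames).
    A nearly static background yields a small value along the temporal mode."""
    n3 = video.shape[2]
    D = difference_matrix(n3)
    V = video.reshape(-1, n3)          # pixels as rows, frames as columns
    return np.linalg.norm(V @ D.T)     # Frobenius norm of temporal differences
```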

The remainder of this paper is organized as follows. Section 2 reviews notations and basic concepts and introduces the tensor $r_p$ pseudo-norm. In Section 3, based on the new tensor $r_p$ pseudo-norm, a new LRTC model is introduced and an alternating minimization method is proposed, for which any accumulation point of the generated sequence is a Karush-Kuhn-Tucker (KKT) point. In Section 4, we improve the TRPCA model via the $r_p$ pseudo-norm and the other techniques introduced above. Moreover, numerical experiments on color image recovery, gray video recovery, face image shadow removal, and background modeling are reported in Sections 5 and 6, which illustrate the validity of our proposed models; the convergence behavior of our algorithms is also discussed there. Finally, the paper ends with concluding remarks in Section 7.

Section snippets

Preliminaries

In this section, we first summarize some notations and propose a new 3rd-order tensor pseudo-norm.

Enhanced LRTC model via $r_p$ pseudo-norm

In this section, we establish a new tensor completion model based on the tensor $r_p$ pseudo-norm.

Non-convex model of TRPCA

Another typical tensor recovery problem is the TRPCA, which aims to recover the low-rank component $\mathcal{X}$ and the sparse component $\mathcal{E}$ from observations $\mathcal{O}=\mathcal{X}+\mathcal{E}\in\mathbb{R}^{n_1\times n_2\times n_3}$. Adopting the $r_p$ pseudo-norm to characterize the low-rank part, our TRPCA model is formulated as
$$\min_{\mathcal{X},\,\mathcal{E}} \ \sum_{u=1}^{3}\sum_{l=1}^{n_u}\frac{1}{n_u}\left\|\bar{X}_{u}^{(l)}\right\|_{r_p} + \lambda\|\mathcal{E}\|_{1}, \quad \text{s.t.} \quad \mathcal{O}=\mathcal{X}+\mathcal{E}.$$
In order to better extract the foreground objects in complex scenes with dynamic backgrounds, we introduce the noise analysis and decompose the video sequence $\mathcal{O}$ into three terms, i.e., $\mathcal{O}=$

Experimental results for LRTC

In this section, we report some numerical examples to demonstrate the validity of our LPRN-based tensor completion method. We employ the peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM) [43] and the feature similarity index (FSIM) [44] to evaluate the performance of each algorithm. We conduct extensive experiments to evaluate our method and then compare it with some existing methods, including TNN [23], WSTNN [29], PSTNN [24], ADMM-ilR [19], IR-t-TNN [26] and t-$S_{w,p}$
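
For reference, PSNR is the standard peak signal-to-noise ratio; a minimal routine for 8-bit data (a textbook definition, not code from the paper, with the peak value as a parameter) is:

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image (or tensor)
    and its reconstruction; higher means a more faithful recovery."""
    mse = np.mean((np.asarray(reference, dtype=np.float64)
                   - np.asarray(estimate, dtype=np.float64)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```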

Experimental results for TRPCA

In this section, we report some numerical examples to show the validity of our LPRN-based TRPCA method. We conduct extensive experiments to evaluate our method and then compare it with some existing methods, including TRPCA [30], LSD [31], DECOLOR [32], IBTSVT [33], ETRPCA [45] and t-$S_{w,p}$ [28].

Experimental Data Settings: Two data sets are used for TRPCA. The details of these data sets are described as follows:

Yale B face database: It contains 16,128 images of 28 human subjects under 9 poses

Conclusion

For the LRTC problem, we extended the WSTNN to a new tensor $r_p$ pseudo-norm, which better approximates the rank of a 3rd-order tensor. Based on the $r_p$ pseudo-norm, we introduced new non-convex tensor recovery models and proposed an alternating minimization method to solve the corresponding optimization problems. In addition, we introduced the noise analysis and decomposed the video sequence into three terms, low-rank static background, sparse foreground, and dynamic background in

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Quan Yu received the master’s degree in mathematics from Tianjin University in 2022. He is currently pursuing the Ph.D. degree with the School of Mathematics, Hunan University. His research interests include image processing, low rank matrix optimization and low rank tensor optimization.

References (50)

  • B. Garcia-Garcia et al., Background subtraction in real applications: challenges, current models and future directions, Comput. Sci. Rev. (2020)
  • B. Ran et al., Traffic speed data imputation method based on tensor completion, Comput. Intell. Neurosci. (2015)
  • H. Tan et al., Short-term traffic prediction based on dynamic tensor completion, IEEE Trans. Intell. Transp. Syst. (2016)
  • G. Obozinski et al., Joint covariate selection and joint subspace selection for multiple classification problems, Stat. Comput. (2009)
  • C. Peng et al., Hyperspectral image denoising using nonconvex local low-rank and sparse separation with spatial–spectral total variation regularization, IEEE Trans. Geosci. Remote Sens. (2022)
  • Q. Yu et al., Low Tucker rank tensor completion using a symmetric block coordinate descent method, Numer. Linear Algebra Appl. (2022)
  • M. Yang et al., 3-D array image data completion by tensor decomposition and nonconvex regularization approach, IEEE Trans. Signal Process. (2022)
  • Q. Yu et al., T-product factorization based method for matrix and tensor completion problems, Comput. Optim. Appl. (2022)
  • M. Li et al., The nonconvex tensor robust principal component analysis approximation model via the weighted p-norm regularization, J. Sci. Comput. (2021)
  • J. Ying et al., Hankel matrix nuclear norm regularized tensor completion for N-dimensional exponential signals, IEEE Trans. Signal Process. (2017)
  • N.D. Sidiropoulos et al., Tensor decomposition for signal processing and machine learning, IEEE Trans. Signal Process. (2017)
  • M. Yang et al., On identifiability of higher order block term tensor decompositions of rank Lr ⊗ rank-1, Linear Multilinear Algebra (2020)
  • L.R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika (1966)
  • J.M. Landsberg, Tensors: Geometry and Applications, Graduate Studies in Mathematics, vol. 128 (2012)
  • M.E. Kilmer et al., Third-order tensors as operators on matrices: a theoretical and computational framework with applications in imaging, SIAM J. Matrix Anal. Appl. (2013)

Ming Yang received his B.S. in Math from Jilin University, Changchun, China in 2007 and his Ph.D. in Math from Texas A&M University-College Station, USA, in 2012. Currently, he is an assistant professor in the mathematics department of the University of Evansville. His research interests are machine learning, image processing and tensor decomposition. He has published several research papers in top-tier journals, including SIAM Journal on Imaging Sciences, IEEE Signal Processing Letters, IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Image Processing, Journal of Dynamics and Differential Equations, and Linear and Multilinear Algebra.
