Citation: Hengmin Zhang, Jian Yang, Wenli Du, et al., “Enhanced acceleration for generalized nonconvex low-rank matrix learning,” Chinese Journal of Electronics, vol. 34, no. 1, pp. 98–113, 2025. DOI: 10.23919/cje.2023.00.340
Matrix minimization techniques that employ the nuclear norm have gained recognition for their applicability in tasks like image inpainting, clustering, classification, and reconstruction. However, they come with inherent biases and computational burdens, especially when used to relax the rank function, making them less effective and efficient in real-world scenarios. To address these challenges, our research focuses on generalized nonconvex rank regularization problems in robust matrix completion, low-rank representation, and robust matrix regression. We introduce approaches for effective and efficient low-rank matrix learning, grounded in generalized nonconvex rank relaxations derived from various substitutes for the ℓ0-norm. These relaxations allow us to capture low-rank structures more accurately. Our optimization strategy employs a nonconvex, multi-variable alternating direction method of multipliers, backed by rigorous theoretical analysis of complexity and convergence. This algorithm iteratively updates blocks of variables, ensuring efficient convergence. Additionally, we incorporate the randomized singular value decomposition technique and/or other acceleration strategies to enhance the computational efficiency of our approach, particularly for large-scale constrained minimization problems. Experimental results across a variety of image vision-related application tasks demonstrate the superiority of the proposed methodologies in terms of both efficacy and efficiency when compared to most other related learning methods.
In recent years, the recovery of low-rank matrices has found widespread applications in pattern recognition, machine learning, and computer vision [1]–[6]. This field encompasses various well-established methods, including robust principal component analysis (RPCA) [7], matrix completion (MC) [8], robust matrix completion (RMC) [9], low-rank representation (LRR) [10], and robust matrix regression (RMR) [11]. Additionally, several variants have been explored in recent research [12]–[16]. It is noteworthy that both RMC and LRR can be regarded as extensions of MC and RPCA. These fundamental concepts were originally introduced through rank minimization problems, as follows:
RMC: $\min_{Z}\ \lambda\,\mathrm{rank}(Z)+\|P_{\Omega}(D)-P_{\Omega}(Z)\|_{\ell_q}^{q}$ | (1) |
LRR: $\min_{Z}\ \lambda\,\mathrm{rank}(Z)+\|D-DZ\|_{\ell}$ | (2) |
RMR: $\min_{x}\ \lambda\,\mathrm{rank}(D-A(x))+\|x\|_{\ell_q}^{q}$ | (3) |
where λ>0 represents the regularization parameter. In both Problems (1) and (2), the symbol D represents the data matrix. We define PΩ(Zi,j) as Zi,j if (i,j)∈Ω, and as 0 otherwise. The rank of Z corresponds to the number of non-zero singular values of the matrix. We employ norms such as ‖⋅‖qℓq (with q>0) and ‖⋅‖ℓ (e.g., ℓ=ℓ1, ℓ2, or the ℓ21-norm) to describe various residual styles, namely Laplacian, Gaussian, and column-wise residuals, respectively. In Problem (3), D represents the testing sample matrix. The expression A(x)=x1A1+x2A2+⋯+xnAn denotes a linear mapping from Rn to Rp×q, where each Ai belongs to Rp×q. We use A∗(⋅) to represent its adjoint operator, and ‖x‖qℓq denotes the coefficient representation norm, where the ℓ1-norm and ℓ2-norm correspond to setting q to 1 and 2, respectively. The convex norms mentioned above have been thoroughly examined in previous studies [14], [17]–[19].
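For readers who prefer code, the following minimal NumPy sketch (our own illustration, not part of the authors' implementation; the function names are ours) shows the sampling projection PΩ and the linear map A(x) with its adjoint, exactly as defined above.

```python
import numpy as np

def P_Omega(Z, mask):
    """Sampling projection: keep Z[i, j] where (i, j) is observed (mask True), zero elsewhere."""
    return np.where(mask, Z, 0.0)

def A_map(x, A_list):
    """Linear map A(x) = x_1*A_1 + ... + x_n*A_n, each A_i of size p x q."""
    return sum(xi * Ai for xi, Ai in zip(x, A_list))

def A_adjoint(E, A_list):
    """Adjoint A*(E): the i-th entry is the inner product <A_i, E>."""
    return np.array([np.sum(Ai * E) for Ai in A_list])
```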
In general, the problems presented in (1)–(3) are non-deterministic polynomial (NP)-hard and challenging to solve due to the nonconvex and nonsmooth nature of the rank function. Notably, the generalized formulation introduced below, built on a relaxation term fλ(X;p), extends several existing low-rank matrix recovery problems by instantiating fλ(X;p) as various norms, including the nuclear norm [8], the Schatten-p norm (p>0) [9], the weighted Schatten-p norm (p>0) [20], and the weighted/truncated nuclear norm [21], [22]. Additionally, we demonstrate that convex and nonconvex ℓ0-norm relaxations (such as the ℓp-norm (p>0) [9], the exponential-type penalty (ETP) [23], Laplace [24], and Geman [25]), listed in Table 1, can be extended to relax the rank function by acting on the singular values of the matrix Z [20], [26], [27]. These properties naturally motivate us to develop feasible algorithms that obtain low-rank solutions efficiently. Moreover, our primary focus in this work centers on the investigation of problem formulations, optimization algorithms, and performance evaluations related to effective and efficient nonconvex low-rank matrix learning.
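Since Table 1 is not reproduced here, the scalar forms below record the commonly cited versions of these four ℓ0-norm substitutes acting on a singular value σ≥0 (the parameterization with γ>0 is our own notational assumption and may differ slightly from Table 1):

```latex
f_{\lambda}(\sigma;p)=
\begin{cases}
  \lambda\,\sigma^{p}, & \ell_p\text{-norm }(0<p<1),\\[2pt]
  \lambda\,\dfrac{1-e^{-\gamma\sigma}}{1-e^{-\gamma}}, & \text{ETP},\\[2pt]
  \lambda\,\bigl(1-e^{-\sigma/\gamma}\bigr), & \text{Laplace},\\[2pt]
  \lambda\,\dfrac{\sigma}{\sigma+\gamma}, & \text{Geman},
\end{cases}
\qquad
f_{\lambda}(Z;p)=\sum_{i}f_{\lambda}\bigl(\sigma_{i}(Z);p\bigr).
```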
To optimize the previously mentioned minimization problems, it is necessary to introduce auxiliary variables to transform them into constrained problems rather than solving them directly. Here, on the one hand, building upon the principles established in Problems (1) and (2), we introduce the term fλ(X;p) as a substitute for λrank(X), while g(D−A(Z)) quantifies the loss function for a general nonconvex low-rank matrix learning problem, denoted as follows:
$\min\ f_{\lambda}(X;p)+g(E)\quad \mathrm{s.t.}\ X=Z,\ E+A(Z)=D$ | (4) |
where both X and E are auxiliary variables. On the other hand, adopting a complementary approach, we replace the rank function with generalized nonconvex relaxations and employ ‖x‖qℓq with q=1,2 in Problem (3). This leads to the formulation of the following ℓq-norm regularized problems:
$\min\ f_{\lambda}(E;p)+\|z\|_{\ell_q}^{q}\quad \mathrm{s.t.}\ x=z,\ E+A(x)=D$ | (5) |
where fλ(E;p) is also derived from Table 1. In particular, Problem (5) encompasses a broader scope than the models studied in [28] and [29], as the latter focus exclusively on specific instances involving Schatten-p norms with p=1, 2/3, and 1/2, respectively. In some real-world scenarios marked by significant occlusions and intense illuminations, nonconvex formulations consistently outperform their convex counterparts. This superiority is attributed to the extended matrix variate power exponential distribution [29], [30], which offers a more comprehensive depiction of pixel dependencies. Moreover, it features a heavier-tailed region that captures spatial structural information within residual images. Empirical results consistently validate the superior performance of nonconvex methods in recovering low-rank matrices from incomplete or corrupted data matrices.
It is crucial to highlight that the problems addressed in this work were previously explored in the conference papers [31], [32]. However, this work distinguishes itself by expanding upon these problems and developing a unified nonconvex low-rank matrix learning framework with more in-depth analysis and explanations. To visually illustrate the comparisons and differences, we present plotted curves of four nonconvex functions in Figure 1 and various demonstrations related to different types of noise and two low-rank matrix structures in Figure 2. These visualizations provide additional insights into the regularization term and the residual function for various real-world image applications. Meanwhile, it is noteworthy that we have incorporated prior information into the objective terms, encompassing the loss function and the coefficient term. This integration proves advantageous for enhancing the overall performance. Addressing constrained problems presents a substantial challenge in developing optimization algorithms that are both effective and efficient, particularly when dealing with multiple variables. This challenge is notably prominent for the widely utilized alternating direction method of multipliers (ADMM) algorithms [33]–[35].
In comparison with [31] and [32], we have introduced an advanced acceleration technique using randomized singular value decomposition (RSVD) to alleviate the computational complexity associated with low-rank matrix computations. This represents a significant departure from our earlier conference papers, which relied on traditional ADMM algorithms with either full or economical singular value decomposition (SVD) [6], [22], [36]. Moreover, the problems under investigation involve two constrained equations, and the concept of low-rankness holds specific significance within the context of the measurements incorporated into the objective function. Additionally, we have provided comprehensive theoretical analyses from two perspectives, including computational complexity and convergence guarantees. Finally, we also conducted extensive experimental comparisons with several closely related works to validate the superiority of our proposed methodologies across various application tasks in terms of computational efficiency and efficacy. In summary, compared to previously published conference works, we have achieved notable improvements in model explanations, efficient algorithm designs, comparison methods, and application tasks, respectively.
Building upon these insights and the existing body of research, which aims to reduce computational complexity and broaden real-world image applications, we can distill the primary contributions of this work as follows:
Model formulation We have established a unified nonconvex low-rank matrix relaxation framework by extending RMC, LRR, and RMR. This approach yields nearly unbiased estimators compared to nuclear norm-based regularization problems, thereby elevating the overall solution quality. Our rank relaxation functions are derived from nonconvex relaxations of the ℓ0-norm, which we apply to the singular values of the target low-rank matrix. This renders the proposed problem formulation exceptionally robust, versatile, and applicable to a wide range of image processing tasks.
Algorithm design Our algorithm designs involve the development of a nonconvex ADMM method with two dual variables, eliminating the need for linearization strategies. Each subproblem can be solved to obtain a closed-form solution. To further enhance acceleration, we incorporate RSVD in the low-rank matrix updates and also utilize orthogonal factorization techniques and parallel strategies. Additionally, we provide theoretical guarantees, demonstrating that the variable sequence generated by nonconvex ADMM is bounded. Moreover, during local convergence analysis, we show that a subsequence converges to a stationary point. The limiting point satisfies the Karush-Kuhn-Tucker (KKT) conditions under more lenient assumptions.
Experimental validation In our experimental results, we demonstrate the superior recovery performance of our proposed nonconvex methods when compared to convex counterparts and other relevant techniques in the domains of matrix completion, subspace clustering, image classification, and image reconstruction using various real-world datasets. Our evaluations cover a diverse range of low-level and high-level vision tasks within the realms of computer vision and pattern recognition. These experimental tasks serve to reinforce the effectiveness and efficiency of our methods through both quantitative and qualitative comparisons.
The following sections are organized as follows: In Section II, we introduce the generalized nonconvex low-rank matrix learning problems and explore their connections with the related work from various perspectives. Section III delves into the development of the multiple variables ADMM optimization algorithm with some acceleration strategies to enhance the computational efficiency for three distinct low-rank matrix optimization problems. We provide a comprehensive theoretical analysis, covering aspects such as computational complexity and convergence properties. Section IV presents a wide array of experiments designed to validate the performance and computational advantages of our proposed method. Finally, in Section V, we conclude our work, providing a summary and outlining potential future directions.
In the upcoming sections, we will delve into the concrete analysis of the constrained Problems (4) and (5). Our initial focus will be on unveiling the significance of each component within the objective function and investigating their connections to existing low-rank matrix learning problems. Building upon these foundational insights, we will embark on a comprehensive analysis of the generalized nonconvex low-rank matrix learning methods and their various extensions, as represented in Figure 3. This diagram provides a comprehensive overview of the research framework of this work from multiple aspects including problem formulations, algorithm design and analysis, and application tasks.
It can be observed that the objective functions in both Problems (4) and (5) are composed of a regularization term and a residual function, each serving a distinct purpose. To illustrate, fλ(X;p) quantifies the regularization term for the low-rank matrix in Problem (4), whereas fλ(E;p) characterizes the residual function for the low-rank matrix in Problem (5). This formulation’s versatility enables us to tailor the low-rank constraints to various generalized learning problems within the realm of low-level image vision tasks. Furthermore, our research highlights the significance of ‖⋅‖ℓ and ‖⋅‖qℓq norms and their connection to the formulation of both convex and nonconvex norms. This analysis underscores the diverse roles played by different norms in defining residual and coefficient styles, as mentioned earlier.
To the best of our knowledge, the adoption of nonconvex ℓ0-norm substitute functions to relax the rank function is widely recognized as a more effective approach, as supported by prior research. Emphasizing the versatility and broad applicability of Problems (4) and (5), we conduct comparisons and establish connections with various existing works, centering our discussion on three key perspectives, outlined as follows:
• Differences in problem formulations: Existing rank relaxation techniques [9], [26], [37], [38] calculate the sum of all singular values transformed by specific functions to quantify the second-order sparsity of the low-rank matrix. These relaxations have further served as the basis for establishing generalized nonconvex low-rank matrix learning problems. Notably, our work addresses these minimization problems within a unified framework, eliminating the need for individual case-by-case formulations and providing a versatile solution with clear interpretations.
• Optimization using first-order algorithms: To achieve the desired solution, one can employ ADMM [13], [35], [39], [40] for handling constrained minimization problems. Alternatively, the accelerated proximal gradient algorithm (APG) [41]–[44] and the iteratively reweighted algorithm (IRA) [14], [45] are tailored for unconstrained minimization problems. Unlike many existing approaches [26], [46]–[49], we favor ADMM algorithms to tackle the constrained Problems (4) and (5) because each subproblem retains a closed-form solution without relying on linearization techniques, and we additionally incorporate acceleration strategies to improve computational efficiency.
• Applications in low-level and high-level image vision: Depending on the specific application cases [2], [3], Problem (4) can find applications in tasks such as image inpainting (as in matrix completion methods) and image clustering (as in subspace segmentation methods). Additionally, Problem (5) is applicable to image classification and reconstruction (as in matrix regression methods). These applications are underpinned by the interpretations assigned to the objective terms.
In this section, we introduce an efficient algorithm that integrates faster low-rank matrix computation to enhance the efficiency of ADMM. Our primary objective is to provide an analysis of the iteration procedures, computational complexity, and convergence properties. Therefore, we will outline the step-by-step solution for Problems (4) and (5) below.
The primary challenge in solving for low-rank matrices is computing the proximal operators, which often involve convex and nonconvex rank relaxations and typically require the full SVD in most cases. To enhance computational efficiency and reduce the complexity of the optimization algorithm, we introduce a faster proximal operator for low-rank matrices. This approach computes the proximal operator on a smaller matrix, leading to a significant reduction in its size as in [50]–[52]. By offering a closed-form solution for computing the proximal operator related to low-rank matrices, the following proposition greatly improves the algorithm's computational efficiency and reduces the complexity at each iteration.
To enhance the computational efficiency, we propose an acceleration technique known as randomized SVD. This technique reduces computational complexity and enables faster updates, as described below.
Proposition 1 Suppose that rank(X)=r<d≤min(m,n), Q∈Rm×d is an orthogonal matrix satisfying QTQ=I, and Y∈Rd×n is such that X=QY. Then, the proximal operator Proxhλ(QTP) is the solution to
$\mathrm{Prox}_{h_{\lambda}}(Q^{\mathrm{T}}P)=\arg\min_{Y}\ h_{\lambda}(Y)+\frac{1}{2}\|Y-Q^{\mathrm{T}}P\|_{F}^{2}$ | (6) |
where we have $h_{\lambda}(Y)=\sum_{i=1}^{d}h_{\lambda}(\sigma_{i}(Y))$, $\|X-P\|_{F}^{2}=\|Y-Q^{\mathrm{T}}P\|_{F}^{2}$, and the matrices X and Y have the same singular values. Therefore, by finding the optimal $Y^{*}=\mathrm{Prox}_{h_{\lambda}}(Q^{\mathrm{T}}P)$, we obtain the desired matrix $X^{*}=QY^{*}$.
Based on von Neumann’s theorem [53], which presents a series of transformations, we initially perform the SVD of a matrix, denoted as UΣVT without loss of generality. Here, U and V represent the left and right singular vector matrices, and Σ is the diagonal singular value matrix with its i-th element denoted as σi for 1≤i≤l=min(m,n). By leveraging Proposition 1, we achieve a significant reduction in computational complexity, especially when d≤min(m,n). This reduction implies that fewer singular values need to be calculated, resulting in faster execution time. This underscores the importance of the proximal operator extension from (6) in obtaining the optimal solution and approximating the matrix through the accelerated generalized singular value thresholding (GSVT) [2], [47] for solving the subsequent low-rank matrix-related subproblems, which are not considered in the existing works [31], [32].
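The following NumPy sketch illustrates how Proposition 1 can be exploited in practice: a thin orthogonal basis Q is obtained by a randomized range finder, the shrinkage acts on the small d×n matrix QᵀP, and the result is lifted back by X=QY. This is a hedged illustration under our own naming (randomized_range, prox_low_rank, shrink); the exact shrinkage rule depends on the nonconvex penalty chosen from Table 1, and soft-thresholding is shown only as the nuclear-norm special case.

```python
import numpy as np

def randomized_range(P, d, oversample=5, seed=0):
    """Approximate orthonormal basis Q (m x d) for the range of P via random projection + QR."""
    rng = np.random.default_rng(seed)
    sketch = P @ rng.standard_normal((P.shape[1], d + oversample))
    Q, _ = np.linalg.qr(sketch)
    return Q[:, :d]

def prox_low_rank(P, d, shrink):
    """Approximately solve (6): SVD of the small d x n matrix Q^T P, shrink its singular values."""
    Q = randomized_range(P, d)
    U, s, Vt = np.linalg.svd(Q.T @ P, full_matrices=False)  # costs O(n d^2) instead of a full SVD
    Y = (U * shrink(s)) @ Vt                                 # generalized singular value shrinkage
    return Q @ Y                                             # lift back: X = Q Y

# Example shrinkage (nuclear-norm case); nonconvex penalties replace this rule with their own GSVT map.
soft = lambda s, lam=0.1: np.maximum(s - lam, 0.0)
```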
Following the principles of multiple ADMM [33]–[35] for solving constrained optimization problems, we employ an iterative procedure to efficiently address Problems (4) and (5), respectively. This iterative process entails the sequential update of the relevant variables. Our choice is rooted in the constrained nature of the problems in question, where each subproblem can be efficiently computed with a closed-form solution, rendering the ADMM algorithms a fitting choice.
To facilitate the computation of a closed-form solution for each subproblem, we transform Problem (4) into an unconstrained problem by adopting the augmented Lagrange function in the following form:
$L_{\mu}(Z,E,X,\Lambda_1,\Lambda_2)=f_{\lambda}(X)+g(E)+\frac{\mu}{2}\Bigl[\bigl\|X-Z+\frac{\Lambda_1}{\mu}\bigr\|_{F}^{2}+\bigl\|E+A(Z)-D+\frac{\Lambda_2}{\mu}\bigr\|_{F}^{2}\Bigr]-\frac{1}{2\mu}\bigl(\|\Lambda_1\|_{F}^{2}+\|\Lambda_2\|_{F}^{2}\bigr)$ | (7) |
here, μ>0 is the penalty parameter, and both Λ1 and Λ2 represent the dual variables. It is important to highlight that (7) functions as the augmented Lagrange function for (4). The terms involving ‖⋅‖2F represent the differentiable term introduced in [10], [21], and [54]. Note that this iteration process aims to achieve the low-rank matrix solution derived from the RMC and LRR problems.
Building upon the previously introduced augmented Lagrange function and considering the variables at the k-th iteration (Zk,Ek,Xk,Λ1,k,Λ2,k,μk), the iterative process entails sequential updates of the primal variables, dual variables, and the penalty parameter. Consequently, to eliminate the need for linearization techniques, the variables at the (k+1)-th iteration can be updated as follows:
$Z_{k+1}=\arg\min_{Z}\ \frac{\mu_k}{2}\Bigl[\bigl\|Z-\hat{X}_{k}^{\Lambda_1}\bigr\|_{F}^{2}+\bigl\|A(Z)-\hat{E}_{k}^{\Lambda_2}\bigr\|_{F}^{2}\Bigr]=(I+A^{*}A)^{-1}\bigl(\hat{X}_{k}^{\Lambda_1}+A^{*}(\hat{E}_{k}^{\Lambda_2})\bigr)$ | (8) |
$E_{k+1}=\arg\min_{E}\ g(E)+\frac{\mu_k}{2}\bigl\|E-\hat{Z}_{k+1}^{\Lambda_2}\bigr\|_{F}^{2}$ | (9) |
$X_{k+1}=\arg\min_{X}\ f_{\lambda}(X;p)+\frac{\mu_k}{2}\bigl\|X-\hat{Z}_{k+1}^{\Lambda_1}\bigr\|_{F}^{2}$ | (10) |
where $\hat{X}_{k}^{\Lambda_1}=X_k+\Lambda_{1,k}/\mu_k$, $\hat{E}_{k}^{\Lambda_2}=D-E_k-\Lambda_{2,k}/\mu_k$, $\hat{Z}_{k+1}^{\Lambda_2}=D-A(Z_{k+1})-\Lambda_{2,k}/\mu_k$, and $\hat{Z}_{k+1}^{\Lambda_1}=Z_{k+1}-\Lambda_{1,k}/\mu_k$. Meanwhile, we also need to update the dual variables and the penalty parameter, as follows:
$\Lambda_{1,k+1}=\Lambda_{1,k}+\mu_k(X_{k+1}-Z_{k+1})$ | (11) |
$\Lambda_{2,k+1}=\Lambda_{2,k}+\mu_k(E_{k+1}+A(Z_{k+1})-D)$ | (12) |
$\mu_{k+1}=\min(\rho\mu_k,\ \mu_{\max}),\quad \rho>1$ | (13) |
where the number of iterations is significantly influenced by the choice of ρ values. Specifically, a larger value of ρ generally results in faster convergence, whereas a smaller value of ρ tends to yield a more accurate solution.
In the update process, solving subproblem (8) amounts to a quadratic problem, while subproblem (10) requires computing proximal operators, using a fixed-point iteration algorithm for various formulations of fλ(X); as detailed in Proposition 1, this step is accelerated for the RMC method to enhance computational efficiency. For the LRR method, we consider a “dictionary” that linearly spans the data space, which helps reduce computational complexity. Both of the aforementioned acceleration techniques are designed to minimize computational complexity. As demonstrated in [10], the lowest-rank representation can recover the underlying row space, thereby revealing the true data segmentation. Moreover, the solver for subproblem (9) can be adapted based on different choices of g(E), such as the ℓ1, ℓ2, or ℓ21 norms for various noise measurements. The initial values for the variables are set as (Z0,E0,X0,Λ1,0,Λ2,0,μ0). To assess convergence, we have established a stopping criterion based on the following formula:
$\frac{\|E_{k+1}+A(Z_{k+1})-D\|_{F}}{\|D\|_{F}}<\epsilon$ | (14) |
where ϵ=0.00001. In summary, the detailed iteration procedure of the nonconvex ADMM with two dual variables for solving (4) using the augmented Lagrange function (7) is presented in our Algorithm 1.
Algorithm 1 Optimization algorithm for Problem (4) |
Input: data matrix D, projecting matrix A. |
Parameters: (p,λ,ρ,μ), ε, and kmax. |
Output: X∗. |
1: Set k=0 and μ0; |
2: while not converged do |
3: Update Z via (8); |
4: Update E via (9); |
5: Update X via (10); |
6: Update Λ1 via (11); |
7: Update Λ2 via (12); |
8: Update μ via (13); |
9: end while |
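To make the iteration above concrete, here is a minimal NumPy sketch of Algorithm 1 for the RMC instance of Problem (4), where A(Z)=PΩ(Z) and hence (I+A∗A)⁻¹ acts elementwise. It is our own hedged rendering, not the authors' released code: prox_rank and prox_loss stand for the proximal maps of fλ and g (e.g., the accelerated operator of Proposition 1 and, for an ℓ1 loss, prox_loss = lambda M, t: np.sign(M)*np.maximum(np.abs(M)-t, 0.0)), and the parameter defaults are illustrative rather than the paper's settings.

```python
import numpy as np

def admm_rmc(D, mask, prox_rank, prox_loss, mu=1e-3, rho=1.1, mu_max=1e8,
             eps=1e-5, k_max=500):
    """Nonconvex ADMM with two dual variables for robust matrix completion (A(Z) = mask * Z)."""
    Z = np.zeros_like(D); E = np.zeros_like(D); X = np.zeros_like(D)
    L1 = np.zeros_like(D); L2 = np.zeros_like(D)
    for _ in range(k_max):
        # (8): Z-update; elementwise solve because A*A reduces to the sampling mask
        Z = (X + L1 / mu + mask * (D - E - L2 / mu)) / (1.0 + mask)
        # (9): E-update via the proximal map of the loss g
        E = prox_loss(D - mask * Z - L2 / mu, 1.0 / mu)
        # (10): X-update via the (accelerated) proximal map of the rank relaxation f_lambda
        X = prox_rank(Z - L1 / mu, 1.0 / mu)
        # (11)-(13): dual variables and penalty parameter
        L1 += mu * (X - Z)
        L2 += mu * (E + mask * Z - D)
        mu = min(rho * mu, mu_max)
        # (14): relative stopping criterion on the constraint residual
        if np.linalg.norm(E + mask * Z - D) < eps * np.linalg.norm(D):
            break
    return X, E
```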
To derive the vector of representation coefficients, we employ ADMM iteration procedures analogous to those used for Problem (4). These steps entail the sequential updates of the variables and the penalty parameter. Throughout the optimization process, the availability of closed-form solvers is particularly crucial, especially when dealing with ℓq-norms in Problem (5), where we set q to 1, as extensively discussed in [11], [29], and [55]. For the case of q=2, we omit the introduction of the auxiliary variable z and eliminate the constraint x=z owing to the differentiable property of ‖x‖2ℓ2; further details can be found in [11] and [32]. In this work, our focus is specifically on the optimization iterations for the case of q=1, which are elaborated upon in the following two primary steps:
Step 1 By introducing the dual variables Λ1 and Λ2, we formulate the augmented Lagrangian function for Problem (5) with q=1, represented as
$L_{\mu}(E,z,x,\Lambda_1,\Lambda_2)=f_{\lambda}(E;p)+\|z\|_{\ell_1}+\frac{\mu}{2}\bigl\|E+A(x)-D+\frac{\Lambda_1}{\mu}\bigr\|_{F}^{2}-\frac{1}{2\mu}\|\Lambda_1\|_{F}^{2}+\frac{\mu}{2}\bigl\|x-z+\frac{\Lambda_2}{\mu}\bigr\|^{2}-\frac{1}{2\mu}\|\Lambda_2\|^{2}$ | (15) |
where this formulation follows the same processing procedure as the previously described function (7). Then, we proceed to present the variable updates below.
Step 2 By alternately updating each variable in the objective function (15), we also take into account the variables at the k-th iteration and then present the following iteration sequences for the (k+1)-th iteration:
i) In the above-given function, while keeping $\hat{x}_{k}^{\Lambda_1}=D-A(x_k)-\Lambda_{1,k}/\mu_k$ fixed, we can update E through the following minimization subproblem:
$E_{k+1}=\arg\min_{E}\ f_{\lambda}(E;p)+\frac{\mu_k}{2}\bigl\|E-\hat{x}_{k}^{\Lambda_1}\bigr\|_{F}^{2}$ | (16) |
and the closed-form solution can be achieved by using the generalized low-rank matrix computations [2], [47].
ii) Fixing $\hat{E}_{k+1}^{\Lambda_1}=D-E_{k+1}-\Lambda_{1,k}/\mu_k$, we subsequently update (z,x) from (15) by
$z_{k+1}=\arg\min_{z}\ \|z\|_{\ell_1}+\frac{\mu_k}{2}\bigl\|z-\bigl(x_k+\frac{\Lambda_{2,k}}{\mu_k}\bigr)\bigr\|^{2}=\mathrm{sign}\bigl(x_k+\frac{\Lambda_{2,k}}{\mu_k}\bigr)\odot\max\bigl(\bigl|x_k+\frac{\Lambda_{2,k}}{\mu_k}\bigr|-\frac{1}{\mu_k},\,0\bigr)$ | (17) |
$x_{k+1}=\arg\min_{x}\ \bigl\|A(x)-\hat{E}_{k+1}^{\Lambda_1}\bigr\|^{2}+\bigl\|x-z_{k+1}+\frac{\Lambda_{2,k}}{\mu_k}\bigr\|^{2}=(I+A^{*}A)^{-1}\bigl(A^{*}(\hat{E}_{k+1}^{\Lambda_1})+z_{k+1}-\frac{\Lambda_{2,k}}{\mu_k}\bigr)$ | (18) |
iii) Using the updated variables Ek+1, zk+1, and xk+1 obtained from the previous steps, we proceed to update (Λ1,Λ2) and μ according to the following equations:
$\Lambda_{1,k+1}=\Lambda_{1,k}+\mu_k(E_{k+1}+A(x_{k+1})-D)$ | (19) |
$\Lambda_{2,k+1}=\Lambda_{2,k}+\mu_k(x_{k+1}-z_{k+1})$ | (20) |
$\mu_{k+1}=\min(\rho\mu_k,\ \mu_{\max}),\quad \rho>1$ | (21) |
where (19) and (20) are employed for the dual variables, while (21) is utilized for updating the penalty parameter in this process.
In updating the involved variables, analytical solutions for subproblems (16)–(18) can be readily obtained by computing the derivatives of (18) with respect to x and setting them to zero. Furthermore, the subproblem (17), which is based on the ℓ1-norm, can be solved exactly using the well-known soft-thresholding operator [56]. The subproblem (16), which relies on fλ(E;p), also has a closed-form solution [2], [47]. In summary, the iteration procedures for solving Problem (5) using the nonconvex ADMM are outlined in Algorithm 2. The iterations continue until either reaching the maximum iteration limit kmax or satisfying the predefined stopping criterion, which is set to ε=0.001, as specified in (14).
Algorithm 2 Optimization algorithm for Problem (5) |
Input: testing sample matrix D, training sample matrices A1,A2,…,An, and their column-stacked matrix H∈Rm×n with m=p×q. |
Parameters: (p,λ,ρ,μ), ε, and kmax. |
Output: (E∗,x∗). |
1: Set k=0 and μ0; |
2: while not converged do |
3: Update E via (16); |
4: Update z via (17); |
5: Update x via (18); |
6: Update Λ1 via (19); |
7: Update Λ2 via (20); |
8: Update μ via (21); |
9: end while |
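For illustration, the sketch below (our own rendering under stated assumptions, not the authors' code) spells out the two closed-form updates (17) and (18) used inside Algorithm 2, with H denoting the m×n matrix whose i-th column is vec(Ai), so that A(x) corresponds to Hx and A∗(E) to Hᵀvec(E).

```python
import numpy as np

def update_z(x, L2, mu):
    """(17): soft-thresholding (l1 proximal operator) of x + L2/mu at level 1/mu."""
    v = x + L2 / mu
    return np.sign(v) * np.maximum(np.abs(v) - 1.0 / mu, 0.0)

def update_x(H, E_hat, z, L2, mu):
    """(18): solve (I + H^T H) x = H^T vec(E_hat) + z - L2/mu."""
    n = H.shape[1]
    rhs = H.T @ E_hat.ravel(order='F') + z - L2 / mu
    return np.linalg.solve(np.eye(n) + H.T @ H, rhs)
```

Since I+HᵀH is fixed across iterations, it can be factorized once (e.g., by a Cholesky decomposition) and reused, which keeps the per-iteration cost of the x-update low.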
In this subsection, we begin by examining the computational complexity of Algorithms 1 and 2. The computational complexity is influenced by the SVD computations and matrix multiplications carried out during each iteration, as well as the total number of iterations. We then proceed to establish convergence guarantees, relying on the updated rules and the properties of the objective function. The detailed analysis is presented below from two perspectives.
1) Computational complexity of Algorithms 1 and 2: The factorization technique efficiently computes an orthogonal matrix Q∈Rm×d, enabling a small SVD of QTZ∈Rd×n to be performed in O(nd2) time. This method reduces the time complexity of obtaining Z∗ to O(mnd) due to the multiplication of two matrices of sizes m×n and n×d. In contrast, the complexity for computing the matrix inverse is O(n3) for a matrix size of n×n. The overall complexity of Algorithms 1 and 2 is determined by the complexity of each iteration multiplied by the total number of iterations.
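As a rough, illustrative calculation of these savings (constants ignored; m=n=600 and d=80 are borrowed from the inpainting setting of Section IV purely for concreteness):

```latex
\underbrace{O\bigl(mn\min(m,n)\bigr)}_{\text{full SVD}}\approx 600\cdot600\cdot600=2.16\times10^{8},
\qquad
\underbrace{O(mnd)+O(nd^{2})}_{\text{factored update}}\approx 2.88\times10^{7}+3.84\times10^{6},
```

i.e., roughly an order of magnitude fewer operations per iteration once d≪min(m,n).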
2) Convergence guarantees of Algorithm 1: The convergence analysis encompasses two technical results, which are presented respectively as follows:
Lemma 1 Let $\{\Theta_k=(Z_k,E_k,X_k,\Lambda_{1,k},\Lambda_{2,k})\}$ be the sequence generated by the nonconvex ADMM, and assume that $\sum_{i=0}^{k}\frac{1}{\mu_i}<+\infty$ holds. Then $\{\Theta_k\}$ is a bounded variable sequence.
Theorem 1 Let $\{\Theta_k=(Z_k,E_k,X_k,\Lambda_{1,k},\Lambda_{2,k})\}$ be the sequence generated by the nonconvex ADMM, and assume that $\lim_{j\to+\infty}\mu_{k_j}(X_{k_j+1}-X_{k_j})=0$ and $\lim_{j\to+\infty}\mu_{k_j}(E_{k_j}-E_{k_j+1})=0$. Then any accumulation point $\Theta_{*}=(Z_{*},E_{*},X_{*},\Lambda_{1,*},\Lambda_{2,*})$ satisfies
$\begin{cases}0=A^{*}(\Lambda_{2,*})-\Lambda_{1,*}\\ 0\in\partial g(E_{*})+\Lambda_{2,*}\\ 0\in\partial f_{\lambda}(X_{*})+\Lambda_{1,*}\end{cases}$ | (22) |
as well as both X∗=Z∗ and D=E∗+A(Z∗).
Hence, the above results accord with the conclusions provided in [31]. In other words, the limiting point (Z∗,E∗,X∗,Λ1,∗,Λ2,∗) satisfies the KKT conditions of (7), and (Z∗,E∗,X∗) is a stationary point of Problem (4). This establishes the local convergence property of Algorithm 1.
3) Convergence guarantees of Algorithm 2: The convergence analysis yields similar theoretical results to those presented in Algorithm 1, which are given as follows:
Lemma 2 Suppose that $\sum_{i}f(\sigma_i(D-A(x)))+\lambda\|x\|_{\ell_q}^{q}\to+\infty$ if and only if $\|x\|\to+\infty$, and assume that $\mu_{k-1}(x_{k-1}-x_k)\to 0$ as $k\to+\infty$. Then the dual variable sequences $\{\Lambda_{1,k}\}$ and $\{\Lambda_{2,k}\}$ generated by Algorithm 2 are both bounded.
Theorem 2 Let $\{T_k=(E_k,z_k,x_k,\Lambda_{1,k},\Lambda_{2,k})\}$ be the sequence generated by Algorithm 2 for the case q=1, which has at least one accumulation point under the same assumptions as Lemma 2 together with $\sum_{i=0}^{+\infty}\frac{\mu_i+\mu_{i+1}}{2\mu_i^{2}}<+\infty$. Then, for any accumulation point $T_{*}=(E_{*},z_{*},x_{*},\Lambda_{1,*},\Lambda_{2,*})$, the triple $(E_{*},z_{*},x_{*})$ is a stationary point of Problem (5).
Through the proof procedures of Theorems 1 and 2 above, together with the boundedness of the primal variable sequences demonstrated in Lemmas 1 and 2, we confirm that any accumulation point attained from Algorithms 1 and 2 qualifies as a stationary point. It is worth noting that the local convergence proofs for these results are similar under comparable conditions, up to slight revisions. The detailed proofs of these analyses are available in the previously published conference papers [31], [32]; for the sake of brevity and space conservation, we omit them in this work.
In this section, our main objective is to showcase the superior learning efficacy and computational efficiency of nonconvex rank regularization methods in addressing the RMC, LRR, and RMR problems, as formulated in Problems (4) and (5). We aim to provide theoretical insights into their convergence properties and present visual results from various viewpoints, covering a range of vision tasks, including image inpainting, clustering, classification, and reconstruction. This section offers a considerably more comprehensive overview and validation than the previously published conference papers [31], [32], which conducted experimental comparisons solely for the originally proposed methods to validate their efficacy and computational efficiency. Please note that although the proposed methods have not yet achieved the highest level of computational efficiency, we have successfully validated the effectiveness of the enhanced acceleration techniques employed in our approaches.
In the subsequent experiments, our primary focus is on benchmarking our proposed methods against several existing algorithms, utilizing notations commonly found in pertinent references. The comparison works are sourced from the corresponding journal publications and conference proceedings. These experiments are conducted using databases containing raw images and are executed on Matlab 2021b, running on a personal computer with 8.0 GB of RAM and an Intel(R) Core(TM) i7-
• In the first experiment, we employ matrix completion methods to robustly recover a natural image with missing entries, as shown in Figure 4. The initial image has dimensions of 600×600×3, as shown in Figure 4(a). Subsequently, we truncate this image to create an incomplete version, treating it as a strictly low-rank matrix with a rank of 80, as depicted in Figure 4(b). The introduction of missing entries through text generates an image with incomplete sections, demonstrated in Figure 4(c). Considering that the image comprises three channels (red, green, and blue), we independently complete the missing pixel parts for each channel and then merge the results to achieve optimal inpainting performance (a minimal sketch of this per-channel evaluation protocol is given after this list). Our evaluation criteria include metrics such as the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [2], [52]. Additionally, we assess computational efficiency in terms of CPU computation time (TIME) and the number of iterations (ITER). It is noteworthy that higher PSNR and SSIM values, coupled with lower TIME and ITER values, indicate the superiority of the evaluated methods.
• In the second experiment, we apply clustering methods for subspace segmentation to the ExtYaleB
• In the third experiment, we employ regression techniques for image classification using two well-established face databases: AR
• In the fourth experiment, we mainly conduct numerical assessments to validate the reconstruction capabilities of the proposed methods in both clustering and classification tasks. Our analysis begins by presenting the reconstructed and residual images from the illuminated ExtYaleB dataset. Furthermore, we meticulously curate two sets of face images, one from the AR dataset and another from the ExtYaleB dataset. One set comprises clean face images, while the other set features face images with three different occlusions. We apply the provided methods, leveraging four nonconvex functions for both clustering and classification tasks, respectively. Our particular emphasis is on comparing the reconstruction images and the residual images from these two vision tasks. In particular, the reconstruction images are achieved using the coefficient matrix and vector, while the residual images are obtained from the error matrix in the iteration process.
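As referenced in the first bullet above, the following minimal sketch (our own, built on scikit-image rather than the paper's Matlab code; the function and argument names are illustrative) shows the per-channel completion and PSNR/SSIM scoring protocol.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_inpainting(original, channels_completed):
    """original: HxWx3 float image in [0, 1]; channels_completed: three HxW recovered channels."""
    recovered = np.clip(np.stack(channels_completed, axis=2), 0.0, 1.0)
    psnr = peak_signal_noise_ratio(original, recovered, data_range=1.0)
    ssim = structural_similarity(original, recovered, channel_axis=2, data_range=1.0)
    return recovered, psnr, ssim
```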
This experiment aims to assess the performance of Algorithm 1 in optimizing the RMC problem for image inpainting. Subsequently, we evaluated the efficiency and effectiveness of several existing completion methods, resulting in the following observations from two key perspectives:
• Table 2 presents the inpainting results, evaluating our proposed methods across four different nonconvex functions. It includes a comprehensive comparison with various methods, encompassing both convex and nonconvex approaches, some of which incorporate acceleration techniques. The table features diverse methods and their respective outcomes, comparing values of PSNR, SSIM, TIME, and ITER. It is worth noting that for the ℓp-norm and ETP, we chose p=0.5, whereas for the Laplace and Geman functions, we utilized p=0.001. The findings reveal that convex methods such as APG and augmented Lagrange multipliers (ALM) yield lower PSNR and SSIM values. Please note that ALM originates from the work presented in the arXiv preprint arXiv:1009.5055; we substitute this reference with [46] here to ensure that a DOI is available. In contrast, nonconvex methods, including the iteratively reweighted nuclear norm (IRNN), generalized proximal gradient (GPG), partial singular value thresholding (PSVT), weighted nuclear norm minimization (WNNM), accelerated proximal alternating linearized minimization (AccPALM), and LpTNN (the model combining the Lp-norm and the truncated nuclear norm), demonstrate superior performance, with notable effectiveness observed for both WNNM and AccPALM. Moreover, the utilization of acceleration techniques results in more efficient computations, as evidenced by ALM, IRNN, GPG, and AccPALM, which exhibit reduced iteration counts and/or lower computational time. These results underscore the advantages of our proposed methods, optimized by Algorithm 1, in terms of both inpainting efficacy and computational efficiency compared to most of the comparison methods.
Method | PSNR | SSIM | TIME (s) | ITER
APG [42] | 50.697 | – | 78.2 | (354, 365, 338)
ALM [46] | 50.864 | – | 24.0 | (96, 92, 100)
IRNN [26] | 51.905 | – | 21.6 | (68, 68, 68)
GPG [47] | 52.064 | – | 21.7 | (70, 69, 69)
PSVT [59] | 50.921 | – | 46.9 | (160, 163, 165)
WNNM [21] | 55.872 | – | 132.9 | (457, 593, 592)
AccPALM [52] | 54.234 | – | 12.6 | (101, 103, 105)
L2/3TNN [60] | 50.895 | – | 374.1 | (90, 91, 93)
L1/2TNN [60] | 50.895 | – | 350.9 | (88, 89, 91)
ADMMℓp | 52.844 | – | 20.0 | (177, 177, 179)
ADMMETP | 52.144 | – | 20.1 | (180, 180, 186)
ADMMLaplace | 51.910 | – | 19.7 | (178, 178, 181)
ADMMGeman | 52.227 | – | 18.5 | (177, 178, 176)
• To evaluate our proposed methods under four nonconvex functions, we initiated the assessment by plotting convergence curves of the stopping condition for the ℓp-norm and ETP. These plots illustrate the relationship between ITER and TIME, as shown in Figures 7(a) and (b), respectively. Furthermore, we presented the influence of different parameter choices for the p-value in the Laplace function and the initial ˉμ-value, which is used for the initial penalty parameter (i.e., ˉμ‖D‖2) under the case of the Geman function on evaluation performance, as demonstrated in Figures 7(c) and (d). To sum up, these analyses not only validate the convergence properties of our approaches but also provide valuable insights into the multifaceted effects of various parameters.
This subsection is dedicated to conducting a comprehensive comparison of subspace segmentation methods, which includes assessing both convex and nonconvex rank relaxation methods, as well as low-rank matrix factorization-based approaches. Our evaluation focuses on the unsupervised clustering task using the ExtYaleB dataset, considering scenarios with 5 and 10 subjects. Following this, we provide quantitative and qualitative results, along with highlighting significant observations, described as follows:
• Table 3 presents a comprehensive comparison of various clustering methods and highlights the advantages of our methods across four nonconvex functions. In the specific scenarios of our evaluations, it is apparent that LRR, least squares regression (LSR1, LSR2), and fast universal low rank representation (FULRR) exhibit relatively lower ACC and NMI values, along with less time consumption. In contrast, iteratively reweighted least squares (IRLS), smooth representation (SMR), FULRR, Schatten-1/2 norm factorization based subspace clustering (S1/2NFSC), Schatten-2/3 norm factorization based subspace clustering (S2/3NFSC), AccPALM, and the weighted Schatten-p norm based LRR (WsLRR) demonstrate improved ACC and NMI values compared to convex methods. Notably, LSR and SMR stand out for their higher efficiency in terms of computational costs. Furthermore, S1/2NFSC, S2/3NFSC, FULRR, WsLRR, and AccPALM leverage matrix factorization acceleration techniques and iteration reduction strategies, enabling them to efficiently achieve low-rank matrices with results that either outperform or match other methods. This naturally aligns with expectations, since these nonconvex methods have been intentionally designed to strike a balance between accuracy and efficiency by using Algorithm 1 to optimize Problem (4).
Method | 5 subjects | 10 subjects | ||||
ACC (%) | NMI (%) | TIME (s) | ACC (%) | NMI (%) | TIME (s) | |
LRR [10] | 79.375 | 81.636 | 7.7 | 75.313 | 79.285 | 29.3 |
IRLS [61] | 89.688 | 81.656 | 17.5 | 72.969 | 66.017 | 93.4 |
LSR1 [62] | 94.063 | 87.481 | 60.000 | 53.131 | ||
LSR2 [62] | 95.625 | 92.173 | 58.125 | 51.218 | ||
SMR [63] | 94.688 | 86.634 | 69.375 | 67.096 | ||
FULRR [64] | 90.313 | 79.606 | 12.1 | 84.063 | 77.463 | 43.4 |
WsLRR [65] | 96.875 | 91.554 | 8.2 | 76.719 | 77.289 | 33.8 |
S1/2NFSC [57] | 86.563 | 74.753 | 3.1 | 75.000 | 75.207 | 10.0 |
S2/3NFSC [57] | 88.750 | 77.842 | 3.1 | 73.750 | 75.018 | 10.4 |
AccPALM [52] | 89.063 | 76.065 | 70.469 | 62.672 | 3.5 | |
ADMMℓp | 97.500 | 92.900 | 2.1 | 84.063 | 79.192 | 9.8 |
ADMMETP | 97.813 | 93.888 | 1.8 | 85.469 | 80.672 | 9.0 |
ADMMLaplace | 96.563 | 91.300 | 2.4 | 84.844 | 80.065 | 10.4 |
ADMMGeman | 97.188 | 93.059 | 2.3 | 84.688 | 79.967 | 10.3 |
• Figure 8 offers visual comparisons for our proposed methods employing four different nonconvex functions with 5 subjects in Figures 8(a) and (b) and 10 subjects in Figures 8(c) and (d), respectively. In Figure 8(a), we present the block-diagonal structures of the coefficient matrix with normalization, shedding light on their relationship with the number of subjects for the ℓp-norm. Moving to Figure 8(b), we display the statistical distribution of the coefficient matrix, offering insights into the representation of clustering samples related to the ETP function. Meanwhile, Figure 8(c) allows us to analyze the impact of parameter choices through experiments on the Laplace function. Finally, we provide an analysis of the influences of randomly initialized variables on clustering accuracy in Figure 8(d) based on the Geman norm through
In this subsection, we evaluate the performance of various classification methods using two experimental datasets: the AR dataset and the ExtYaleB dataset with three different types of occlusions. Moreover, we utilize the coefficient vectors obtained from (15). Following the previous studies [10], [45], the training samples remain uncorrupted, while the testing samples intentionally incorporate corruption. Then, we provide a comprehensive analysis encompassing both numerical and visual results, as detailed below.
• Table 4 presents a comparative analysis of classification methods based on their experimental results, considering ACC and TIME. We note that the integration of prior information into regression methods, including nuclear norm based matrix regression (NMR), faster NMR (FNMR), the generalized iterated shrinkage algorithm (GISA), and our proposed methods, results in enhanced classification performance at the expense of increased computation time compared to other methods, except for adaptive locality preserving regression (ALPR). This is attributed to the improved evaluation performance achieved by incorporating proper measurements of the noise. In contrast, the collaborative representation classifier (CRC), GISA, and unified linear regression (ULR) demonstrate lower computational costs, as they omit SVD computations in their iterative processes; this suggests that avoiding SVD computations can contribute to improved computational efficiency. In particular, we employ the four nonconvex functions and address each sample in parallel based on Algorithm 2. The results underscore the superiority of the proposed classification methods under various occlusions in both the AR and ExtYaleB datasets, owing to the simultaneous use of proper measurements for the residual error and the coefficient vector.
Method | AR_sunglass | AR_scarf | ExtYaleB_baboon | |||
ACC (%) | TIME (s) | ACC (%) | TIME (s) | ACC (%) | TIME (s) | |
CRC [66] | 53.167 | 3.6 | 67.500 | 3.5 | 55.483 | 14.9 |
NMR [28] | 73.000 | 340.5 | 76.333 | 366.2 | 96.491 | 829.5 |
FNMR [28] | 73.000 | 216.5 | 76.167 | 232.1 | 96.491 | 489.2 |
ALPR [67] | 41.667 | 207.5 | 51.167 | 145.3 | 44.518 | 387.4 |
ULR* [14] | 50.833 | 37.1 | 66.667 | 36.3 | 52.851 | 77.1 |
ULR** [14] | 50.833 | 11.5 | 67.883 | 11.6 | 52.851 | 22.5 |
GISA2/3 [68] | 50.000 | 6.2 | 65.667 | 6.0 | 63.816 | 20.1 |
GISA1/2 [68] | 50.500 | 6.1 | 66.000 | 6.1 | 63.816 | 20.9 |
ADMMℓp | 83.000 | 98.1 | 79.667 | 126.6 | 99.561 | 234.7 |
ADMMETP | 82.833 | 55.9 | 78.500 | 54.7 | 99.561 | 131.5 |
ADMMLaplace | 85.500 | 122.7 | 82.167 | 123.8 | 99.781 | 282.3 |
ADMMGeman | 85.833 | 120.3 | 82.333 | 115.6 | 99.781 | 266.9 |
• Figure 9 offers visual comparisons of our proposed methods using four different nonconvex functions for classification. In Figures 9(a) and (b), we illustrate the statistical distribution of the coefficient vector with normalization, using both the ℓp-norm and the ETP function for the AR datasets. These visuals clearly reveal that the coefficient values for the same subjects have larger magnitudes, while those for different subjects tend to be lower. In contrast, Figures 9(c) and (d) display the reconstruction errors for the ExtYaleB dataset, with a focus on the Laplace and Geman functions. In these scenarios, the reconstruction errors for the same subjects as the testing samples exhibit lower values, whereas errors for different subjects demonstrate higher values. These visual results offer valuable insights into the efficiency of our proposed methods.
This subsection focuses on evaluating the visual comparisons of our proposed clustering and classification methods in terms of their reconstruction capabilities. We utilize two experimental datasets with illumination corruptions and various occlusions. Subsequently, we present the reconstructed and residual images generated under four distinct nonconvex functions. Meanwhile, the visual results will offer valuable insights and observations, as follows:
• For the clustering task, our focus is on presenting visual examples in Figures 10(a)–(d) from the top to the bottom row. These images showcase a sequence of reconstructed images alongside corresponding residual images, obtained from the ExtYaleB dataset. Upon a detailed examination of these samples, it becomes apparent that our methods excel at mitigating illumination effects in the clustered images compared to Figure 5. Importantly, this reinforces the observations made during the validation of their reconstruction ability.
• For the classification task, we present visual results using face images from the AR and ExtYaleB dataset in Figures 11(a)–(d) from left to right, demonstrating the outcomes of the proposed methods employing four distinct nonconvex functions for processing the raw images, as shown in Figure 6. Additionally, we evaluate the reconstruction capabilities of these methods by analyzing the differences and similarities in reducing occlusions to some extent. It is noteworthy that among the relaxation functions used in our proposed methods, those based on the involved nonconvex functions consistently yield superior reconstruction performance in the regression process and analysis.
This work focuses on the development of efficient and effective nonconvex multi-variable ADMM techniques, enhanced with randomized SVD acceleration, factorization, and/or parallel computing approaches. These techniques are designed to address a class of generalized nonconvex rank regularization problems under equality constraints, effectively extending the scope of problems such as RMC, LRR, and RMR. By deriving update rules for the involved subproblems, which benefit from closed-form solutions and milder assumptions, we established the boundedness of one, and in some cases two, dual variables. Additionally, we delved into the computational complexity and provable convergence properties of the variable sequence generated by our approach. Our numerical experiments, conducted on real-world data, consistently confirm the theoretical properties, demonstrating the efficiency of our methodology. Extensive experiments in image inpainting, clustering, face classification, and reconstruction clearly showcase the superiority of our methods over most related ones in terms of both performance and efficiency.
As part of the future work, it will be pivotal to tackle the challenges associated with accelerating generalized proximal operators and conduct comprehensive global convergence analyses. This holds particular significance for advancing nonconvex first-order algorithms, especially in multi-variable low-rank matrix learning scenarios [54]. Additionally, we aim to broaden the applicability of the proposed algorithms by addressing nonconvex tensor recovery problems [16], [69]–[71]. Furthermore, we also plan to explore the integration of specific auxiliary information [15], [32], [67], [72], which could contribute significantly to enhancing the evaluation performance of the proposed methods.
This work was supported by the Ministry of Education, Republic of Singapore, through its Start-Up Grant and Academic Research Fund Tier 1 (Grant No. RG61/22), and in part by the National Natural Science Foundation of China (Grant No. 61906067), and the China Postdoctoral Science Foundation (Grant Nos. 2019M651415 and 2020T13019).
[1] |
N. He, R. L. Wang, J. Lyu, et al., “Low-rank combined adaptive sparsifying transform for blind compressed sensing image recovery,” Chinese Journal of Electronics, vol. 29, no. 4, pp. 678–685, 2020. DOI: 10.1049/cje.2020.05.014
|
[2] |
Z. X. Hu, F. P. Nie, R. Wang, et al., “Low rank regularization: A review,” Neural Networks, vol. 136, pp. 218–232, 2021. DOI: 10.1016/j.neunet.2020.09.021
|
[3] |
K. W. Tang, J. Zhang, C. S. Zhang, et al., “Unsupervised, supervised and semi-supervised dimensionality reduction by low-rank regression analysis,” Chinese Journal of Electronics, vol. 30, no. 4, pp. 603–610, 2021. DOI: 10.1049/cje.2021.05.002
|
[4] |
Y. Q. Cai, X. W. Lu, and N. Jiang, “A survey on quantum image processing,” Chinese Journal of Electronics, vol. 27, no. 4, pp. 718–727, 2018. DOI: 10.1049/cje.2018.02.012
|
[5] |
Z. Zhang, Y. Xu, J. Yang, et al., “A survey of sparse representation: Algorithms and applications,” IEEE Access, vol. 3, pp. 490–530, 2015. DOI: 10.1109/ACCESS.2015.2430359
|
[6] |
Y. Chen, J. Yang, L. Luo, et al., “Adaptive noise dictionary construction via IRRPCA for face recognition,” Pattern Recognition, vol. 59, pp. 26–41, 2016. DOI: 10.1016/j.patcog.2016.02.005
|
[7] |
E. J. Candès, X. D. Li, Y. Ma, et al., “Robust principal component analysis?,” Journal of ACM, vol. 58, no. 3, article no. 11, 2011. DOI: 10.1145/1970392.1970395
|
[8] |
E. J. Candes and T. Tao, “The power of convex relaxation: Near-optimal matrix completion,” IEEE Transactions on Information Theory, vol. 56, no. 5, pp. 2053–2080, 2010. DOI: 10.1109/TIT.2010.2044061
|
[9] |
F. P. Nie, H. Wang, H. Huang, et al., “Joint Schatten-p norm and l p-norm robust matrix completion for missing value recovery,” Knowledge and Information Systems, vol. 42, no. 3, pp. 525–544, 2015. DOI: 10.1007/s10115-013-0713-z
|
[10] |
G. C. Liu, Z. C. Lin, S. C. Yan, et al., “Robust recovery of subspace structures by low-rank representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 171–184, 2013. DOI: 10.1109/TPAMI.2012.88
|
[11] |
J. C. Xie, J. Yang, J. J. Qian, et al., “Robust nuclear norm-based matrix regression with applications to robust face recognition,” IEEE Transactions on Image Processing, vol. 26, no. 5, pp. 2286–2295, 2017. DOI: 10.1109/TIP.2017.2662213
|
[12] |
H. M. Zhang, S. Y. Li, J. Qiu, et al., “Efficient and effective nonconvex low-rank subspace clustering via SVT-free operators,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 12, pp. 7515–7529, 2023. DOI: 10.1109/TCSVT.2023.3275299
|
[13] |
L. Yang, T. K. Pong, and X. J. Chen, “Alternating direction method of multipliers for a class of nonconvex and nonsmooth problems with applications to background/foreground extraction,” SIAM Journal on Imaging Sciences, vol. 10, no. 1, pp. 74–110, 2017. DOI: 10.1137/15M1027528
|
[14] |
H. M. Zhang, F. Qian, B. Zhang, et al., “Incorporating linear regression problems into an adaptive framework with feasible optimizations,” IEEE Transactions on Multimedia, vol. 25, pp. 4041–4051, 2023. DOI: 10.1109/TMM.2022.3171088
|
[15] |
J. M. Liu, Y. J. Chen, J. S. Zhang, et al., “Enhancing low-rank subspace clustering by manifold regularization,” IEEE Transactions on Image Processing, vol. 23, no. 9, pp. 4022–4030, 2014. DOI: 10.1109/TIP.2014.2343458
|
[16] |
Y. Y. Chen, S. Q. Wang, C. Peng, et al., “Generalized nonconvex low-rank tensor approximation for multi-view subspace clustering,” IEEE Transactions on Image Processing, vol. 30, pp. 4022–4035, 2021. DOI: 10.1109/TIP.2021.3068646
|
[17] |
J. J. Qian, W. K. Wong, H. M. Zhang, et al., “Joint optimal transport with convex regularization for robust image classification,” IEEE Transactions on Cybernetics, vol. 52, no. 3, pp. 1553–1564, 2022. DOI: 10.1109/TCYB.2020.2991219
|
[18] |
J. J. Qian, S. M. Zhu, W. K. Wong, et al., “Dual robust regression for pattern classification,” Information Sciences, vol. 546, pp. 1014–1029, 2021. DOI: 10.1016/j.ins.2020.09.062
|
[19] |
C. Gong, H. M. Zhang, J. Yang, et al., “Learning with inadequate and incorrect supervision,” in Proceedings of the 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, pp. 889–894, 2017.
|
[20] |
H. M. Zhang, C. Gong, J. J. Qian, et al., “Efficient recovery of low-rank matrix via double nonconvex nonsmooth rank minimization,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 2916–2925, 2019. DOI: 10.1109/TNNLS.2019.2900572
|
[21] |
S. H. Gu, Q. Xie, D. Y. Meng, et al., “Weighted nuclear norm minimization and its applications to low level vision,” International Journal of Computer Vision, vol. 121, no. 2, pp. 183–208, 2017. DOI: 10.1007/s11263-016-0930-5
|
[22] |
Y. Hu, D. B. Zhang, J. P. Ye, et al., “Fast and accurate matrix completion via truncated nuclear norm regularization,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 9, pp. 2117–2130, 2013. DOI: 10.1109/TPAMI.2012.271
|
[23] |
C. X. Gao, N. Y. Wang, Q. Yu, et al., “A feasible nonconvex relaxation approach to feature selection,” in Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, pp. 356–361, 2011.
|
[24] |
J. Trzasko and A. Manduca, “Highly undersampled magnetic resonance image reconstruction via homotopic l0-minimization,” IEEE Transactions on Medical Imaging, vol. 28, no. 1, pp. 106–121, 2009. DOI: 10.1109/TMI.2008.927346
|
[25] |
D. Geman and C. D. Yang, “Nonlinear image recovery with half-quadratic regularization,” IEEE Transactions on Image Processing, vol. 4, no. 7, pp. 932–946, 1995. DOI: 10.1109/83.392335
|
[26] |
C. Y. Lu, J. H. Tang, S. C. Yan, et al., “Nonconvex nonsmooth low rank minimization via iteratively reweighted nuclear norm,” IEEE Transactions on Image Processing, vol. 25, no. 2, pp. 829–839, 2016. DOI: 10.1109/TIP.2015.2511584
|
[27] |
H. M. Zhang, J. J. Qian, J. B. Gao, et al., “Scalable proximal Jacobian iteration method with global convergence analysis for nonconvex unconstrained composite optimizations,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 9, pp. 2825–2839, 2019. DOI: 10.1109/TNNLS.2018.2885699
|
[28] |
J. Yang, L. Luo, J. J. Qian, et al., “Nuclear norm based matrix regression with applications to face recognition with occlusion and illumination changes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 1, pp. 156–171, 2017. DOI: 10.1109/TPAMI.2016.2535218
[29] L. Luo, J. Yang, J. J. Qian, et al., “Robust image regression based on the extended matrix variate power exponential distribution of dependent noise,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 9, pp. 2168–2182, 2017. DOI: 10.1109/TNNLS.2016.2573644
[30] H. M. Zhang, J. Yang, J. J. Qian, et al., “Efficient image classification via structured low-rank matrix factorization regression,” IEEE Transactions on Information Forensics and Security, vol. 19, pp. 1496–1509, 2024. DOI: 10.1109/TIFS.2023.3337717
[31] H. M. Zhang, W. Luo, W. L. Du, et al., “Robust recovery of low rank matrix by nonconvex rank regularization,” in Proceedings of the 11th International Conference on Image and Graphics, Haikou, China, pp. 106–119, 2021.
[32] H. M. Zhang, W. L. Du, Z. M. Li, et al., “Nonconvex rank relaxations based matrix regression for face reconstruction and recognition,” in Proceedings of the 2020 Chinese Automation Congress (CAC), Shanghai, China, pp. 2335–2340, 2020.
[33] H. M. Zhang, J. B. Gao, J. J. Qian, et al., “Linear regression problem relaxations solved by nonconvex ADMM with convergence analysis,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 2, pp. 828–838, 2024. DOI: 10.1109/TCSVT.2023.3291821
[34] S. Boyd, N. Parikh, E. Chu, et al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
[35] H. M. Zhang, F. Qian, P. Shi, et al., “Generalized nonconvex nonsmooth low-rank matrix recovery framework with feasible algorithm designs and convergence analysis,” IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 9, pp. 5342–5353, 2023. DOI: 10.1109/TNNLS.2022.3183970
[36] H. M. Zhang, J. Yang, J. C. Xie, et al., “Weighted sparse coding regularized nonconvex matrix regression for robust face recognition,” Information Sciences, vol. 394–395, pp. 1–17, 2017. DOI: 10.1016/j.ins.2017.02.020
[37] X. Peng, C. Y. Lu, Z. Yi, et al., “Connections between nuclear-norm and Frobenius-norm-based representations,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 1, pp. 218–224, 2018. DOI: 10.1109/TNNLS.2016.2608834
[38] H. M. Zhang, J. Yang, J. J. Qian, et al., “Nonconvex relaxation based matrix regression for face recognition with structural noise and mixed noise,” Neurocomputing, vol. 269, pp. 188–198, 2017. DOI: 10.1016/j.neucom.2016.12.095
[39] S. Magnusson, P. C. Weeraddana, M. G. Rabbat, et al., “On the convergence of alternating direction Lagrangian methods for nonconvex structured optimization problems,” IEEE Transactions on Control of Network Systems, vol. 3, no. 3, pp. 296–309, 2016. DOI: 10.1109/TCNS.2015.2476198
[40] D. Y. Yin, G. Q. Wang, B. Xu, et al., “Image deblurring with mixed regularization via the alternating direction method of multipliers,” Journal of Electronic Imaging, vol. 24, no. 4, article no. 043020, 2015. DOI: 10.1117/1.JEI.24.4.043020
[41] Q. M. Yao, J. T. Kwok, T. F. Wang, et al., “Large-scale low-rank matrix learning with nonconvex regularizers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 11, pp. 2628–2643, 2019. DOI: 10.1109/TPAMI.2018.2858249
[42] K. C. Toh and S. Yun, “An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems,” Pacific Journal of Optimization, vol. 6, no. 3, pp. 615–640, 2010.
[43] H. M. Zhang, J. J. Qian, B. Zhang, et al., “Low-rank matrix recovery via modified Schatten-p norm minimization with convergence guarantees,” IEEE Transactions on Image Processing, vol. 29, pp. 3132–3142, 2020. DOI: 10.1109/TIP.2019.2957925
[44] H. M. Zhang, F. Qian, F. H. Shang, et al., “Global convergence guarantees of (A)GIST for a family of nonconvex sparse learning problems,” IEEE Transactions on Cybernetics, vol. 52, no. 5, pp. 3276–3288, 2022. DOI: 10.1109/TCYB.2020.3010960
[45] C. Y. Lu, Z. C. Lin, and S. C. Yan, “Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization,” IEEE Transactions on Image Processing, vol. 24, no. 2, pp. 646–654, 2015. DOI: 10.1109/TIP.2014.2380155
[46] Z. C. Lin, R. S. Liu, and Z. X. Su, “Linearized alternating direction method with adaptive penalty for low-rank representation,” in Proceedings of the 24th International Conference on Neural Information Processing Systems, Granada, Spain, pp. 612–620, 2011.
[47] C. Y. Lu, C. B. Zhu, C. Y. Xu, et al., “Generalized singular value thresholding,” in Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, pp. 1805–1811, 2015.
[48] H. Attouch, J. Bolte, and B. F. Svaiter, “Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods,” Mathematical Programming, vol. 137, no. 1–2, pp. 91–129, 2013. DOI: 10.1007/s10107-011-0484-9
[49] J. Bolte, S. Sabach, and M. Teboulle, “Proximal alternating linearized minimization for nonconvex and nonsmooth problems,” Mathematical Programming, vol. 146, no. 1–2, pp. 459–494, 2014. DOI: 10.1007/s10107-013-0701-9
[50] H. M. Zhang, W. L. Du, X. Q. Liu, et al., “Factored trace lasso based linear regression methods: Optimizations and applications,” in Proceedings of the 5th International Conference on Cognitive Systems and Signal Processing, Zhuhai, China, pp. 121–130, 2021.
[51] F. H. Shang, Y. Y. Liu, H. H. Tong, et al., “Robust bilinear factorization with missing and grossly corrupted observations,” Information Sciences, vol. 307, pp. 53–72, 2015. DOI: 10.1016/j.ins.2015.02.026
[52] H. M. Zhang, B. H. Wen, Z. Y. Zha, et al., “Accelerated PALM for nonconvex low-rank matrix recovery with theoretical analysis,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 4, pp. 2304–2317, 2024.
[53] H. Nikaidô, “On von Neumann’s minimax theorem,” Pacific Journal of Mathematics, vol. 4, no. 1, pp. 65–72, 1954. DOI: 10.2140/pjm.1954.4.65
[54] C. H. Chen, B. S. He, Y. Y. Ye, et al., “The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent,” Mathematical Programming, vol. 155, no. 1–2, pp. 57–79, 2016. DOI: 10.1007/s10107-014-0826-5
[55] J. H. Chen and J. Yang, “Robust subspace segmentation via low-rank representation,” IEEE Transactions on Cybernetics, vol. 44, no. 8, pp. 1432–1445, 2014. DOI: 10.1109/TCYB.2013.2286106
[56] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009. DOI: 10.1137/080716542
[57] H. M. Zhang, J. Yang, F. H. Shang, et al., “LRR for subspace segmentation via tractable Schatten-p norm minimization and factorization,” IEEE Transactions on Cybernetics, vol. 49, no. 5, pp. 1722–1734, 2019. DOI: 10.1109/TCYB.2018.2811764
[58] J. Chen, H. Mao, Z. Wang, et al., “Low-rank representation with adaptive dictionary learning for subspace clustering,” Knowledge-Based Systems, vol. 223, article no. 107053, 2021. DOI: 10.1016/j.knosys.2021.107053
[59] T. H. Oh, Y. W. Tai, J. C. Bazin, et al., “Partial sum minimization of singular values in robust PCA: Algorithm and applications,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 4, pp. 744–758, 2016. DOI: 10.1109/TPAMI.2015.2465956
[60] H. Liang, L. Kang, and J. J. Huang, “A robust low-rank matrix completion based on truncated nuclear norm and Lp-norm,” The Journal of Supercomputing, vol. 78, no. 11, pp. 12950–12972, 2022. DOI: 10.1007/s11227-022-04385-8
[61] M. J. Lai, Y. Y. Xu, and W. T. Yin, “Improved iteratively reweighted least squares for unconstrained smoothed ℓq minimization,” SIAM Journal on Numerical Analysis, vol. 51, no. 2, pp. 927–957, 2013. DOI: 10.1137/110840364
[62] C. Y. Lu, H. Min, Z. Q. Zhao, et al., “Robust and efficient subspace segmentation via least squares regression,” in Proceedings of the 12th European Conference on Computer Vision, Florence, Italy, pp. 347–360, 2012.
[63] H. Hu, Z. C. Lin, J. J. Feng, et al., “Smooth representation clustering,” in Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, pp. 3834–3841, 2014.
[64] Q. Q. Shen, Y. S. Liang, S. Y. Yi, et al., “Fast universal low rank representation,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 3, pp. 1262–1272, 2022. DOI: 10.1109/TCSVT.2021.3078327
[65] Q. Q. Shen, Y. Y. Chen, Y. S. Liang, et al., “Weighted Schatten p-norm minimization with logarithmic constraint for subspace clustering,” Signal Processing, vol. 198, article no. 108568, 2022. DOI: 10.1016/j.sigpro.2022.108568
[66] L. Zhang, M. Yang, and X. C. Feng, “Sparse representation or collaborative representation: Which helps face recognition?,” in Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, pp. 471–478, 2011.
[67] J. Wen, Z. F. Zhong, Z. Zhang, et al., “Adaptive locality preserving regression,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 1, pp. 75–88, 2020. DOI: 10.1109/TCSVT.2018.2889727
[68] W. M. Zuo, D. Y. Meng, L. Zhang, et al., “A generalized iterated shrinkage algorithm for non-convex sparse coding,” in Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia, pp. 217–224, 2013.
[69] X. L. Sun, Y. Hai, X. J. Zhang, et al., “Adaptive tensor rank approximation for multi-view subspace clustering,” Chinese Journal of Electronics, vol. 32, no. 4, pp. 840–853, 2023. DOI: 10.23919/cje.2022.00.180
[70] Y. P. Liu, J. N. Liu, and C. Zhu, “Low-rank tensor train coefficient array estimation for tensor-on-tensor regression,” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 12, pp. 5402–5411, 2020. DOI: 10.1109/TNNLS.2020.2967022
[71] Y. P. Liu, Z. Long, H. Y. Huang, et al., “Low CP rank and Tucker rank tensor completion for estimating missing components in image data,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 4, pp. 944–954, 2020. DOI: 10.1109/TCSVT.2019.2901311
[72] H. M. Zhang, J. Y. Zhao, B. Zhang, et al., “Unified framework for faster clustering via joint Schatten p-norm factorization with optimal mean,” IEEE Transactions on Neural Networks and Learning Systems, vol. 35, no. 3, pp. 3012–3026, 2024.

| Method | PSNR | SSIM | TIME (s) | ITER |
|---|---|---|---|---|
| APG [42] | 50.697 | | 78.2 | (354, 365, 338) |
| ALM [46] | 50.864 | | 24.0 | (96, 92, 100) |
| IRNN [26] | 51.905 | | 21.6 | (68, 68, 68) |
| GPG [47] | 52.064 | | 21.7 | (70, 69, 69) |
| PSVT [59] | 50.921 | | 46.9 | (160, 163, 165) |
| WNNM [21] | 55.872 | | 132.9 | (457, 593, 592) |
| AccPALM [52] | 54.234 | | 12.6 | (101, 103, 105) |
| L2/3TNN [60] | 50.895 | | 374.1 | (90, 91, 93) |
| L1/2TNN [60] | 50.895 | | 350.9 | (88, 89, 91) |
| ADMMℓp | 52.844 | | 20.0 | (177, 177, 179) |
| ADMMETP | 52.144 | | 20.1 | (180, 180, 186) |
| ADMMLaplace | 51.910 | | 19.7 | (178, 178, 181) |
| ADMMGeman | 52.227 | | 18.5 | (177, 178, 176) |
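
The table above reports recovery quality as PSNR in dB. As an illustrative aid only, not taken from the paper's experimental code, the following Python sketch computes PSNR in the standard way; the function name `psnr` and the 8-bit peak value of 255 are assumptions made here for the example.

```python
import numpy as np

def psnr(original: np.ndarray, recovered: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between two images of equal shape.

    Assumes an 8-bit intensity range [0, 255]; pass peak=1.0 for images
    scaled to [0, 1].
    """
    mse = np.mean((original.astype(np.float64) - recovered.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy usage: compare a clean 8-bit image with a noisy "recovered" version
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(clean + rng.normal(0.0, 5.0, size=clean.shape), 0, 255)
print(f"PSNR: {psnr(clean, noisy):.3f} dB")
```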

| Method | 5 subjects: ACC (%) | NMI (%) | TIME (s) | 10 subjects: ACC (%) | NMI (%) | TIME (s) |
|---|---|---|---|---|---|---|
| LRR [10] | 79.375 | 81.636 | 7.7 | 75.313 | 79.285 | 29.3 |
| IRLS [61] | 89.688 | 81.656 | 17.5 | 72.969 | 66.017 | 93.4 |
| LSR1 [62] | 94.063 | 87.481 | | 60.000 | 53.131 | |
| LSR2 [62] | 95.625 | 92.173 | | 58.125 | 51.218 | |
| SMR [63] | 94.688 | 86.634 | | 69.375 | 67.096 | |
| FULRR [64] | 90.313 | 79.606 | 12.1 | 84.063 | 77.463 | 43.4 |
| WsLRR [65] | 96.875 | 91.554 | 8.2 | 76.719 | 77.289 | 33.8 |
| S1/2NFSC [57] | 86.563 | 74.753 | 3.1 | 75.000 | 75.207 | 10.0 |
| S2/3NFSC [57] | 88.750 | 77.842 | 3.1 | 73.750 | 75.018 | 10.4 |
| AccPALM [52] | 89.063 | 76.065 | | 70.469 | 62.672 | 3.5 |
| ADMMℓp | 97.500 | 92.900 | 2.1 | 84.063 | 79.192 | 9.8 |
| ADMMETP | 97.813 | 93.888 | 1.8 | 85.469 | 80.672 | 9.0 |
| ADMMLaplace | 96.563 | 91.300 | 2.4 | 84.844 | 80.065 | 10.4 |
| ADMMGeman | 97.188 | 93.059 | 2.3 | 84.688 | 79.967 | 10.3 |
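
The clustering quality reported above is measured by accuracy (ACC), which requires matching predicted cluster labels to ground-truth classes, and by normalized mutual information (NMI). As a hedged illustration rather than the authors' evaluation code, the sketch below computes both metrics with SciPy and scikit-learn; the helper name `clustering_acc` is an assumption introduced here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_acc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Clustering accuracy: best one-to-one match between predicted clusters
    and true classes, found via the Hungarian algorithm on the contingency table."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    # contingency[i, j] = number of samples of class i assigned to cluster j
    contingency = np.zeros((classes.size, clusters.size), dtype=np.int64)
    for i, c in enumerate(classes):
        for j, k in enumerate(clusters):
            contingency[i, j] = np.sum((y_true == c) & (y_pred == k))
    row, col = linear_sum_assignment(-contingency)  # maximize matched counts
    return contingency[row, col].sum() / y_true.size

# Toy usage: clusters are a permutation of the true classes, so both scores are 1.0
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])
print("ACC:", clustering_acc(y_true, y_pred))
print("NMI:", normalized_mutual_info_score(y_true, y_pred))
```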

| Method | AR_sunglass: ACC (%) | TIME (s) | AR_scarf: ACC (%) | TIME (s) | ExtYaleB_baboon: ACC (%) | TIME (s) |
|---|---|---|---|---|---|---|
| CRC [66] | 53.167 | 3.6 | 67.500 | 3.5 | 55.483 | 14.9 |
| NMR [28] | 73.000 | 340.5 | 76.333 | 366.2 | 96.491 | 829.5 |
| FNMR [28] | 73.000 | 216.5 | 76.167 | 232.1 | 96.491 | 489.2 |
| ALPR [67] | 41.667 | 207.5 | 51.167 | 145.3 | 44.518 | 387.4 |
| ULR* [14] | 50.833 | 37.1 | 66.667 | 36.3 | 52.851 | 77.1 |
| ULR** [14] | 50.833 | 11.5 | 67.883 | 11.6 | 52.851 | 22.5 |
| GISA2/3 [68] | 50.000 | 6.2 | 65.667 | 6.0 | 63.816 | 20.1 |
| GISA1/2 [68] | 50.500 | 6.1 | 66.000 | 6.1 | 63.816 | 20.9 |
| ADMMℓp | 83.000 | 98.1 | 79.667 | 126.6 | 99.561 | 234.7 |
| ADMMETP | 82.833 | 55.9 | 78.500 | 54.7 | 99.561 | 131.5 |
| ADMMLaplace | 85.500 | 122.7 | 82.167 | 123.8 | 99.781 | 282.3 |
| ADMMGeman | 85.833 | 120.3 | 82.333 | 115.6 | 99.781 | 266.9 |