Tensor optimization for optical-interferometric imaging

Anna Auría, Rafael Carrillo, Jean-Philippe Thiran, Yves Wiaux

Monthly Notices of the Royal Astronomical Society, Volume 437, Issue 3, 21 January 2014, Pages 2083–2091, https://doi.org/10.1093/mnras/stt1994
Abstract
Image recovery in optical interferometry is an ill-posed non-linear inverse problem arising from incomplete power spectrum and bispectrum measurements. We reformulate this non-linear problem as a linear problem for the supersymmetric rank-1 order-3 tensor formed by the tensor product of the vector representing the image under scrutiny with itself. On the one hand, we propose a linear convex approach for tensor recovery with built-in supersymmetry, regularizing the inverse problem through a nuclear norm relaxation of a low-rank constraint. On the other hand, we also study a non-linear non-convex approach with a built-in rank-1 constraint but where supersymmetry is relaxed, formulating the problem for the tensor product of three vectors. In this second approach, however, only linear convex minimization subproblems are solved, alternately and iteratively for the three vectors. We provide a comparative analysis of these two novel approaches through numerical simulations on small-size images.
INTRODUCTION
Interferometry is a unique tool to image the sky at otherwise inaccessible resolutions. Ideally, an interferometer measures complex visibilities identifying the Fourier coefficients of the intensity image |${\boldsymbol x}$| of interest. In this context, the visibility associated with a given telescope pair at one instant of observation gives the Fourier transform of the image of interest at a spatial frequency identified by the baseline components in the image plane. At radio wavelengths, these visibilities are indeed accessible, thereby setting a linear inverse problem in the perspective of image recovery. The standard clean algorithm operates by local iterative removal of the convolution kernel associated with the partial Fourier coverage (Thompson, Moran & Swenson 2001). Convex optimization methods regularizing the inverse problem through sparsity constraints have recently been proposed in the framework of the theory of compressive sampling (Wiaux et al. 2009a,b; Wenger et al. 2010; Wiaux, Puy & Vandergheynst 2010; Li, Cornwell & de Hoog 2011; McEwen & Wiaux 2011; Carrillo, McEwen & Wiaux 2012).
At optical wavelengths though, atmospheric turbulence induces a random phase delay that implies a systematic cancellation of the visibility values. Power spectrum information can, however, be retrieved, together with partial phase information through phase closure or bispectrum measurements (Baldwin & Haniff 2002; Thorsteinsson, Buscher & Young 2004; Thiébaut & Giovanelli 2010). These considerations apply both to aperture masking interferometry on a single telescope (Baldwin et al. 1986; Haniff et al. 1987; Tuthill et al. 2000) and to optical interferometer arrays such as the Very Large Telescope Interferometer.1 Providing detailed images of complex astrophysical phenomena is an important challenge for optical interferometry today (Baldwin & Haniff 2002). In the perspective of image recovery, prior constraints are also essential to regularize this non-linear ill-posed inverse problem.
The state-of-the-art MiRA method (Thiébaut 2008) takes a maximum a posteriori approach where the image is the solution of an optimization problem with an objective function |$f({\boldsymbol x})=f_{\rm data}({\boldsymbol x})+\ell f_{\rm prior}({\boldsymbol x})$|, for some arbitrary parameter ℓ to be tuned, and with additional positivity and total flux constraints. Sparsity priors have in particular been promoted (Thiébaut 2008; Renard, Thiébaut & Malbet 2011). The data non-linearity induces non-convexity of the objective function. The adopted strategy is to perform only local optimization, in the context of which the solution depends not only on the data and on the priors, but also strongly on the initial image and on the path followed by the local optimization method. The wisard alternative (Meimon, Mugnier & Besnerais 2005) takes a two-step alternate minimization (AM) self-calibration approach. First, the missing Fourier phases are recovered on the basis of a current estimate and phase closure information, enabling pseudo-complex visibilities to be built. Secondly, the image is recovered from the pseudo-complex visibilities as in radio interferometry. While the second step is convex and leads to a unique image independently of the initialization, the first step is not. The overall procedure remains non-convex and the final solution depends on the initial guess. In summary, state-of-the-art methods are non-convex due to the intrinsic data non-linearity (Thiébaut & Giovanelli 2010), and therefore known to suffer from a strong sensitivity to initialization.
The approaches proposed here stem from a different perspective. We first formulate a linear version of the problem for the real and positive supersymmetric rank-1 order-3 tensor |$\mathcal {X}={\boldsymbol x}\circ {\boldsymbol x}\circ {\boldsymbol x}$| formed by the tensor product of the size-N vector |${\boldsymbol x}$| representing the image under scrutiny with itself. This allows us to pose a linear convex problem for recovery of a size-N3 tensor |$\mathcal {X}$| with built-in supersymmetry, and regularizing the inverse problem through a nuclear norm relaxation of a low-rank constraint, also enforcing reality and positivity constraints. We also study a non-linear non-convex approach with a built-in rank-1 constraint but where supersymmetry is relaxed, formulating the problem for the tensor product |${{\boldsymbol u}_1}\circ {{\boldsymbol u}_2}\circ {{\boldsymbol u}_3}$| of three size-N vectors. In contrast with the state of the art though, only linear convex minimization subproblems are solved, alternately and iteratively for the vectors, also enforcing reality and positivity.2 While the former approach is much heavier than the latter in terms of memory requirements and computation complexity due to the drastically increased dimensionality of the unknown, the underlying convexity ensures essential properties of convergence to a global minimum of the objective function and independence of initialization, justifying a comparative analysis. For numerical experiments, we consider a generic discrete measurement setting where measurements identify with triple products of discrete Fourier coefficients of |${\boldsymbol x}$|. These triple products are selected randomly according to a variable-density scheme sampling more densely low spatial frequencies, and are affected by simple additive Gaussian noise.
In Section 2, we review convex optimization and proximal splitting methods. In Section 3, we introduce our generic discrete data model and describe our new linear tensor formulation of the optical-interferometric imaging problem. In Sections 4 and 5, the new AM and nuclear minimization (NM) approaches are discussed. Our simulation setting for comparison of these two methods and corresponding results are presented in Section 6. Finally, we conclude in Section 7.
CONVEX OPTIMIZATION AND PROXIMAL SPLITTING METHODS
DATA MODEL AND TENSOR FORMULATION
For the sake of simplicity, we adopt a discrete setting where the intensity image of interest is represented by the real and positive vector |${\boldsymbol x}\in \mathbb {R}_+^N$| with components xi. Its 2D discrete Fourier transform is denoted by |$\boldsymbol {\hat{x}}\in \mathbb {C}^N$| with components |$\hat{x}_i$|. By abuse of notation, we denote by |$\hat{x}_{-i}$| the component of |$\boldsymbol {\hat{x}}$| at the opposite spatial frequency to that associated with |$\hat{x}_{i}$|. Signal reality implies |$\hat{x}_{-i}=\hat{x}_{i}^*$|, where * stands for complex conjugation.
In what follows, we show how to restore linearity to the measurement scheme by lifting the image model from a vector to a tensor formulation. We start by reviewing some tensor definitions and notations. First, the order (or number of dimensions, ways or modes) of a tensor |$\mathcal {X}\in \mathbb {C}^{N_1 \times \cdots \times N_d}$| with components |$\mathcal {X}_{i_1,\ldots , i_d}$| is the number d of indices characterizing its components. For the sake of simplicity, we will present the formulation only for tensors of order 3. A three-way tensor |$\mathcal {X} \in \mathbb {C}^{N_1 \times N_2 \times N_3}$| is rank-1 if it can be written as the outer product of three vectors, i.e. |$\mathcal {X}=\boldsymbol {a} \circ \boldsymbol {b} \circ \boldsymbol {c}$|, or component-wise |$\mathcal {X}_{ijk}=a_ib_jc_k$|. Secondly, the rank of a tensor, rank|$(\mathcal {X})$|, is defined as the smallest number of rank-1 tensors that generate |$\mathcal {X}$| as their sum. In other words, if |$\mathcal {X}$| can be expressed as |$\mathcal {X}=\sum _{r=1}^R \boldsymbol {a_r} \circ \boldsymbol {b_r} \circ \boldsymbol {c_r}$|, then rank|$(\mathcal {X}) \le R$|. The notion of rank for a tensor is analogous to the matrix rank, though most of the common properties of the latter do not hold for objects of dimension higher than 2. One of the main differences is that there is no algorithm to compute the rank of a given tensor; in fact, the problem is non-deterministic polynomial-time hard (NP-hard) (Hastad 1990). The well-known method of finding a rank-k approximation of a matrix by retaining the k largest values of its singular value decomposition (Stewart 1992) has no equivalent for higher-dimensional tensors. Thirdly, matricization is the process of transforming a tensor into a matrix. The mode-n matricization of a tensor |$\mathcal {X}$| is denoted by |$\boldsymbol{X}_{(n)}$| and results from unfolding all its modes but mode n into the columns of a matrix. The n-rank of a tensor follows as |$n\text{-rank}(\mathcal {X})=({\rm rank}(\boldsymbol{X}_{(1)}),{\rm rank}(\boldsymbol{X}_{(2)}),{\rm rank}(\boldsymbol{X}_{(3)}))$|. In contrast to the rank function, the n-rank is easier to handle, since the problem reduces to calculations with matrices, which are well-understood objects with nice properties. The reader can refer to the review by Kolda & Bader (2009) for a more detailed explanation of the different notions of tensor rank and their associated decomposition methods. Finally, a tensor is called cubical if every mode has the same size, i.e. |$\mathcal {X} \in \mathbb {R}^{N \times N \times N}$|. A cubical tensor |$\mathcal {X}$| is called supersymmetric if its entries are invariant under permutation of their indices: |$\mathcal {X}_{ijk}=\mathcal {X}_{ikj}=\mathcal {X}_{jik}=\mathcal {X}_{jki}=\mathcal {X}_{kij}=\mathcal {X}_{kji}$|.
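These definitions are straightforward to reproduce numerically. The following minimal NumPy sketch (illustrative only; the function and variable names are ours, not those of the released code) builds a rank-1 order-3 tensor, checks that its unfoldings are rank-1 matrices, and verifies supersymmetry of |${\boldsymbol x} \circ {\boldsymbol x} \circ {\boldsymbol x}$|.

```python
import numpy as np

N = 4
a, b, c = np.random.rand(N), np.random.rand(N), np.random.rand(N)

# Rank-1 order-3 tensor: X_ijk = a_i * b_j * c_k.
X = np.einsum('i,j,k->ijk', a, b, c)

def unfold(T, mode):
    # Mode-n matricization: mode n indexes the rows, the remaining modes are
    # unfolded into the columns (column ordering conventions vary in the
    # literature, but the n-rank is unaffected by that choice).
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

# Every unfolding of a rank-1 tensor is a rank-1 matrix.
assert all(np.linalg.matrix_rank(unfold(X, n)) == 1 for n in range(3))

# Supersymmetric rank-1 tensor built from a single vector: x o x o x.
x = np.random.rand(N)
S = np.einsum('i,j,k->ijk', x, x, x)
assert np.allclose(S, S.transpose(0, 2, 1))   # swap j and k
assert np.allclose(S, S.transpose(1, 0, 2))   # swap i and j
```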
We note, however, that the rank-1 and supersymmetry properties are not explicitly built into the tensor formulation (5), which thereby presents a drastically increased dimensionality, |$N^3$|, of the unknown |$\mathcal {X}$| compared to the original |${\boldsymbol x}$| of size N in (4). In the following sections, we discuss our two different regularization schemes for tensor recovery. We first study a non-convex AM approach where the rank-1 constraint is built in, and subsequently move to a convex NM scheme with built-in supersymmetry.
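The linearity gained through the lifting can be checked directly: the 3D discrete Fourier transform of |${\boldsymbol x}\circ {\boldsymbol x}\circ {\boldsymbol x}$| separates as |$\boldsymbol {\hat{x}}\circ \boldsymbol {\hat{x}}\circ \boldsymbol {\hat{x}}$|, so that a triple product of Fourier coefficients, cubic in |${\boldsymbol x}$|, is a single coefficient of a linear transform of the lifted tensor. A minimal sketch, in 1D for brevity (the paper works with 2D images; all names are illustrative):

```python
import numpy as np

N = 8
x = np.random.rand(N)
xhat = np.fft.fft(x)

i, j, k = 1, 2, 5                       # one sampled frequency triple

# Bispectrum-type measurement: non-linear (cubic) in x ...
y_nonlinear = xhat[i] * xhat[j] * xhat[k]

# ... but linear in the lifted tensor X = x o x o x, since the 3D DFT of an
# outer product separates into the outer product of the 1D DFTs.
X = np.einsum('i,j,k->ijk', x, x, x)
y_linear = np.fft.fftn(X)[i, j, k]

assert np.isclose(y_nonlinear, y_linear)
```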
RANK-1 ALTERNATE MINIMIZATION
Algorithm formulation
Algorithm 1 AM algorithm
Initialize k = 1, |${\boldsymbol u}_1^{(0)},{\boldsymbol u}_2^{(0)},{\boldsymbol u}_3^{(0)}\in \mathbb {R}^N$|.
while not converged do
|${\boldsymbol u}_1^{(k)}=\arg \min _{{\boldsymbol u}_1} \Vert {\boldsymbol{T}}_{({\boldsymbol u}_2^{(k-1)},\,{\boldsymbol u}_3^{(k-1)})}\,{\boldsymbol u}_1 - \boldsymbol {y}\Vert _2^2$|.
|${\boldsymbol u}_2^{(k)}=\arg \min _{{\boldsymbol u}_2} \Vert {\boldsymbol{T}}_{({\boldsymbol u}_1^{(k)},\,{\boldsymbol u}_3^{(k-1)})}\,{\boldsymbol u}_2 - \boldsymbol {y}\Vert _2^2$|.
|${\boldsymbol u}_3^{(k)}=\arg \min _{{\boldsymbol u}_3} \Vert {\boldsymbol{T}}_{({\boldsymbol u}_1^{(k)},\,{\boldsymbol u}_2^{(k)})}\,{\boldsymbol u}_3 - \boldsymbol {y}\Vert _2^2$|.
k ← k + 1
end while
|${\boldsymbol x}_{\rm AM}=\frac{1}{3}\left({\boldsymbol u}_1^{(k)}+{\boldsymbol u}_2^{(k)}+{\boldsymbol u}_3^{(k)}\right)$|
return |${\boldsymbol x}_{\rm AM}$|
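As a concrete illustration of the recursion, here is a minimal sketch for a small 1D problem with an explicit DFT matrix (workable for small N only; the paper uses 2D FFTs). The paper's constrained convex subproblems are approximated here by unconstrained least squares followed by projection onto reality and positivity, a simplification of the actual solvers; all names are ours.

```python
import numpy as np

def am_step(y, idx_u, w, F):
    # With the other two vectors fixed, each measurement is linear in u:
    #   y_a ≈ w_a * (F u)[idx_u[a]],  with fixed complex weights w.
    A = w[:, None] * F[idx_u, :]
    A_real = np.vstack([A.real, A.imag])        # enforce a real-valued u
    y_real = np.concatenate([y.real, y.imag])
    u, *_ = np.linalg.lstsq(A_real, y_real, rcond=None)
    return np.maximum(u, 0)                     # crude positivity projection

def am(y, idx, N, n_iter=50):
    i, j, k = idx                               # index triples of the samples
    F = np.fft.fft(np.eye(N))                   # explicit DFT matrix
    u1, u2, u3 = (np.random.rand(N) for _ in range(3))
    for _ in range(n_iter):                     # "while not converged" proxy
        u1 = am_step(y, i, np.fft.fft(u2)[j] * np.fft.fft(u3)[k], F)
        u2 = am_step(y, j, np.fft.fft(u1)[i] * np.fft.fft(u3)[k], F)
        u3 = am_step(y, k, np.fft.fft(u1)[i] * np.fft.fft(u2)[j], F)
    return (u1 + u2 + u3) / 3                   # x_AM
```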
Optimization details
The memory requirement to solve this minimization problem is dominated by the storage of the three vectors, which is of size |$\mathcal {O}(N)$|. In terms of computation time, the algorithm is dominated at each iteration by the application of the operator |$\mathcal {T}$|, which computes three 2D fast Fourier transforms of size N, with an asymptotic complexity of order |$\mathcal {O}(N\log N)$|. This approach is computationally efficient. In contrast with state-of-the-art approaches such as MiRA and wisard, it brings convexity to the subproblems. However, the global problem remains non-convex and the solution may still depend on the initialization. Convergence to a local minimum can easily be identified through large residual values of the objective function. To mitigate the dependence on initialization, and as suggested by Haldar & Hernando (2009), we propose to run the algorithm |$n_{\rm ri}$| times with random initializations, choosing a posteriori the solution with the minimum objective function value.
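A sketch of this restart strategy, building on the hypothetical am() function above: run the algorithm |$n_{\rm ri}$| times from random initializations and keep the candidate with the smallest data residual.

```python
import numpy as np

def am_restarts(y, idx, N, n_ri=5):
    i, j, k = idx
    def residual(u):
        # Data residual of the symmetric model u o u o u against y.
        uhat = np.fft.fft(u)
        return np.linalg.norm(uhat[i] * uhat[j] * uhat[k] - y)
    candidates = [am(y, idx, N) for _ in range(n_ri)]
    return min(candidates, key=residual)        # keep the best restart
```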
SUPERSYMMETRIC NUCLEAR MINIMIZATION
Algorithm formulation
Low-rankness, reality and positivity will be imposed as regularization priors in the convex minimization problem to be defined. As pointed out, the rank of a tensor is difficult to handle since the problem of finding rank(|$\mathcal {X}$|) is NP-hard. Computing the rank of different matricizations of the tensor is an easier task. The unfoldings of a rank-1 tensor are actually rank-1 matrices, so that a low-n-rank constraint can be used as a proxy for low rankness. The rank of a matrix is, however, a non-convex function. The nuclear norm, defined as the ℓ1 norm of its singular values, is a well-known convex relaxation of the rank function that was recently promoted in matrix recovery theory (Candès & Recht 2009). Building on those results, Gandy, Recht & Yamada (2011) tackle the low-n-rank tensor recovery problem through the minimization of the sum of the nuclear norms of the mode-n matricizations |$\boldsymbol{X}_{(n)}$| for all n. In the supersymmetric case, the mode-n matricizations are all identical and denoted by |$\boldsymbol{X}_{(n)}=\mathcal {U}(\mathcal {X})\in \mathbb {C}^{N\times N^2}$|, where |$\mathcal {U}$| stands for the unfolding operator. We propose here to exploit the symmetry of the tensor under scrutiny, together with the signal normalization, to promote a computationally more efficient low-rank prior. Relying on these properties, we note that summation over one index of a tensor of the form |${\boldsymbol x} \circ {\boldsymbol x} \circ {\boldsymbol x}$| with ∑ixi = 1 leads to the order-2 tensor |${\boldsymbol x} \circ {\boldsymbol x}$|, which is real, positive, symmetric, as well as rank-1 and positive-semidefinite. We define |$\mathcal {C}$| as the operator performing the summation over one dimension. Once more supersymmetry ensures that the resulting matrix is independent of the choice of the dimension along which components are summed up: |$\mathcal {C}(\mathcal {X})\in \mathbb {C}^{N\times N}$| with |$[\mathcal {C}(\mathcal {X})]_{ij}= \sum _k \mathcal {X}_{ijk}$|. A low-rank constraint on |$\mathcal {C}(\mathcal {X})$| will be promoted, through a nuclear norm minimization, as a convex proxy for the low rankness of |$\mathcal {X}$|. Positive-semidefiniteness of |$\mathcal {C}(\mathcal {X})$|, i.e. the positivity of the eigenvalues, which are then identical to the singular values, may also be explicitly added as a convex prior, denoted by |$\mathcal {C} (\mathcal {X}) \succeq 0$|, together with the convex reality and positivity constraints of |$\mathcal {X}$|: |$\mathcal {X} \in \mathbb {R}_+^{N\times N \times N}$|. This summation approach is a priori computationally significantly more efficient, given the reduced matrix size of |$\mathcal {C}(\mathcal {X})$| compared to that of the unfolded matrix |$\mathcal {U}(\mathcal {X})$|.
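A minimal sketch of the summation operator |$\mathcal {C}$| and of the corresponding proximity operator of the nuclear norm under the positive-semidefiniteness constraint, which for a symmetric matrix reduces to soft-thresholding and clipping its eigenvalues (names are illustrative, not from the released code):

```python
import numpy as np

def C(X):
    # Sum over one index; for a supersymmetric tensor the choice of the
    # summed dimension is irrelevant.
    return X.sum(axis=2)

def prox_nuclear_psd(M, nu):
    # prox of nu*||.||_* restricted to symmetric PSD matrices: eigenvalues
    # (equal to singular values here) are soft-thresholded and clipped at 0.
    M = 0.5 * (M + M.T)
    lam, V = np.linalg.eigh(M)
    lam = np.maximum(lam - nu, 0)
    return (V * lam) @ V.T

# For X = x o x o x with unit flux, C(X) is the rank-1 PSD matrix x o x.
x = np.random.rand(4); x /= x.sum()
X = np.einsum('i,j,k->ijk', x, x, x)
assert np.allclose(C(X), np.outer(x, x))
```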
The final NM algorithm is shown in Algorithm 2. To solve the complex optimization problem in (11) we use the Douglas–Rachford splitting algorithm, which is tailored to solve problems of the form in (2) with K = 2. The problem in (11) can be reformulated as in (2) by setting |$f_1(\mathcal {X})=\Vert \mathcal {C}(\mathcal {X})\Vert _*+{i}_{S}(\mathcal {X})$| and |$f_2(\mathcal {X})={i}_{C_{\epsilon }}(\mathcal {X})$|, where |$C_{\epsilon }=\lbrace \mathcal {X} \in \mathbb {C}^{N\times N \times N} : \Vert \boldsymbol {y}_{\rm s}-\mathcal {T}_{\rm s}(\mathcal {X})\Vert _2 \le \epsilon \rbrace$|. The main recursion of the Douglas–Rachford algorithm is detailed in steps 3 and 4 of Algorithm 2, where ν > 0 and |$\tau _k \in (0, 2)$| are convergence parameters. The sequence |$\lbrace \mathcal {X}^{(k)} \rbrace$| generated by the recursion in Algorithm 2 converges to a solution of the problem (11) (Combettes & Pesquet 2011). The algorithm is stopped when the relative variation between successive solutions, |$\left\Vert \mathcal {X}^{(k)}-\mathcal {X}^{(k-1)} \right\Vert /\left\Vert \mathcal {X}^{(k-1)} \right\Vert$|, is smaller than some bound ξ ∈ (0, 1), or after the maximum number of iterations allowed, |$T_{\rm max}$|, is reached. In our implementation, we use the values |$\tau _k = 1$|, ∀k, |$\xi = 10^{-3}$| and |$\nu = 10^{-1}$|. In the following subsection, we detail the computation of the proximity operators for f1 and f2.
Algorithm 2 NM algorithm
Initialize k = 1, |$\mathcal {X}^{(1)}\in \mathbb {R}^{N \times N \times N}$|, |$\tau _k \in (0, 2)$| and ν > 0.
while not converged do
|$\mathcal {Z}^{(k)}=\mathrm{prox}_{\nu f_2}\left( \mathcal {X}^{(k)}\right)$|.
|$\mathcal {X}^{(k+1)}=\mathcal {X}^{(k)}+\tau _k\left(\mathrm{prox}_{\nu f_1}\left(2\mathcal {Z}^{(k)}-\mathcal {X}^{(k)}\right)-\mathcal {Z}^{(k)}\right)$|.
k ← k + 1
end while
|${\boldsymbol x}_{\rm NM}= [\mathcal {EP}_1](\mathcal {X}^{(k)})$|.
return |${\boldsymbol x}_{\rm NM}$|
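The recursion in steps 3 and 4 can be written generically for any pair of proximity operators (with the ν scaling absorbed into the callables). A minimal sketch, assuming a fixed |$\tau _k = \tau$| and the relative-variation stopping rule described above; the paper subsequently extracts |${\boldsymbol x}_{\rm NM}$| from the tensor solution, which is omitted here:

```python
import numpy as np

def douglas_rachford(prox_f1, prox_f2, X0, tau=1.0, xi=1e-3, t_max=500):
    # Steps 3-4 of Algorithm 2: Z = prox_f2(X), then a relaxed reflection
    # step through prox_f1. The sequence Z^(k) converges to a minimizer.
    X = X0
    for _ in range(t_max):
        Z = prox_f2(X)
        X_next = X + tau * (prox_f1(2 * Z - X) - Z)
        # Stop on small relative variation between successive iterates.
        if np.linalg.norm(X_next - X) <= xi * np.linalg.norm(X):
            X = X_next
            break
        X = X_next
    return prox_f2(X)   # read the solution off through prox_f2
```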
Optimization details
All the operations involved in the computation of the proximity operators of f1 and f2 preserve tensor symmetry, provided that the symmetrized version |$\mathcal {T}_{\rm s}$| of the measurement operator and a symmetrized data vector are used. Together, these are sufficient conditions to impose supersymmetry at each iteration of Algorithm 2, and consequently on the final tensor solution.
The memory requirement to solve this NM problem is dominated by the storage of the tensor, which is of size |$\mathcal {O}(N^3)$|. In terms of computation time, the algorithm is dominated at each iteration by the application of the operator |$\mathcal {T}_{\rm s}$|, which computes |$N^2$| 2D fast Fourier transforms of size N along each of the three dimensions, with an asymptotic complexity of |$\mathcal {O}(N^3\log N)$|. These orders of magnitude obviously stand in stark contrast with those for the AM approach.
While the NM approach is much heavier than the AM approach in terms of memory requirements and computation complexity due to the drastically increased dimensionality of the unknown, the underlying convexity at the tensor level ensures essential properties of convergence to a global minimum of the objective function and independence of initialization, justifying a comparative analysis.
SIMULATIONS AND RESULTS
In this section, we evaluate the performance of the NM and AM algorithms through numerical simulations. Our optimization code5 was implemented in matlab and run on a standard 2.4 GHz Intel Xeon processor. Given the expected large memory requirements and long reconstruction times for the NM formulation, we consider small-size images with |$N = 16^2 = 256$|, for which the image vector occupies of the order of 4 KB in double precision, while the size-|$N^3$| tensor variable already takes of the order of 100 MB. The memory requirement for the simple tensor variable would already rise to the order of 8 GB for a |$32^2 = 1024$| image size.
As far as the measurement setting is concerned, we assume random variable-density sampling in the 6D Fourier space, where low spatial frequencies are more likely to be sampled than high frequencies. In practice, the sampling pattern is obtained by sampling frequels independently along each of the three tensor dimensions from a bidimensional random Gaussian profile in the corresponding Fourier plane, associating the originally continuous random points with the nearest discrete frequency. The sampling is carried out progressively: if a product is sampled twice, the duplicate is discarded, and the procedure is repeated until M samples are obtained. Again, this consideration only arises from the discrete setting adopted. Fig. 1 presents a typical sampling pattern.
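A sketch of such a variable-density sampler, with scalar frequencies for brevity (the actual setting draws 2D spatial frequencies per tensor dimension and involves conjugate-symmetry considerations ignored here): continuous Gaussian draws are rounded to the nearest discrete frequency, and duplicate triples are discarded until M distinct triples are collected.

```python
import numpy as np

def sample_triples(N, M, sigma, seed=0):
    rng = np.random.default_rng(seed)
    triples = set()
    while len(triples) < M:
        # Low frequencies are favoured by the Gaussian profile; rounding
        # maps continuous draws to the nearest discrete frequency.
        f = np.rint(rng.normal(0.0, sigma, size=3)).astype(int) % N
        triples.add(tuple(f))                   # duplicates are discarded
    idx = np.array(sorted(triples))
    return idx[:, 0], idx[:, 1], idx[:, 2]

i, j, k = sample_triples(N=16, M=64, sigma=3.0)
```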
In all experiments we define the input signal-to-noise ratio as |${\rm ISNR}= - 10 \log (\sigma _n^2/e_y^2)$|, where |$e_y^2=(1/M)\sum _{a=1}^M | y_a|^2$|. The signal-to-noise ratio of a reconstruction |$\boldsymbol {\bar{x}}$| is defined as |${\rm SNR}= - 10 \log (\Vert \boldsymbol {\bar{x}}-{\boldsymbol x} \Vert ^2/\Vert {\boldsymbol x} \Vert ^2)$|. With this definition, the higher the SNR, the closer the recovered signal |$\boldsymbol {\bar{x}}$| is to the original |${\boldsymbol x}$|.
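These two metrics transcribe directly into code (assuming base-10 logarithms, as is usual for dB values):

```python
import numpy as np

def isnr(y, sigma_n):
    e_y2 = np.mean(np.abs(y) ** 2)              # (1/M) * sum_a |y_a|^2
    return -10 * np.log10(sigma_n ** 2 / e_y2)

def snr(x_rec, x_true):
    return -10 * np.log10(np.sum(np.abs(x_rec - x_true) ** 2)
                          / np.sum(np.abs(x_true) ** 2))
```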
As a preliminary experiment, we provide a comparison of the performance of the NM approach defined in (11) with the equivalent minimization problem where the summation operator |$\mathcal {C}$| is replaced by the unfolding operator |$\mathcal {U}$| in the nuclear norm and where the positive-semidefiniteness constraint is discarded, as it does not apply to non-square matrices. Both algorithms were tested on images constructed from 32 random spikes, with ISNR = 30 dB. The positive spike values are taken uniformly at random and normalized to get unit flux, while positions are drawn at random from a Gaussian profile centred on the image. The graphs in Fig. 2 represent the SNR and timing curves as a function of undersampling in the range [0.25, 1]. A total of 10 simulations per point are performed, varying the signal, as well as the sampling and noise realizations. Both approaches provide similar reconstruction qualities, with a smaller variability for the component summation approach, which is also slightly superior at low undersamplings. The component summation approach, running in the order of |$10^3$| s, is, as expected, significantly faster than the unfolding approach, which runs on average more than 10 times more slowly in the range [0.5, 1]. We therefore discard further consideration of the latter.
Having validated our NM approach in comparison with alternative state-of-the-art low-tensor-rank approaches, we compare its performance with that of the AM scheme. First, we evaluate the reconstruction quality on images constructed from 32 and 64 randomly located spikes. The AM approach is also evaluated for varying reinitialization numbers: |$n_{\rm ri} \in \lbrace 1, 5, 10\rbrace$|. The graphs in Fig. 3 represent the SNR curves as a function of undersampling in the range [0.25, 1]. A total of 50 and 10 simulations per point are performed for AM and NM, respectively, varying the signal, as well as the sampling and noise realizations. The results show a clear superiority of AM relative to NM in terms of average reconstruction quality. Both approaches exhibit non-negligible variability. The dependency of the non-convex AM approach on initialization is clearly illustrated by the |$n_{\rm ri} = 1$| and 5 curves, confirming the importance of the multiple reinitializations. We also observe a saturation between |$n_{\rm ri} = 5$| and 10. As expected from asymptotic complexity considerations, AM runs significantly faster than NM, with reconstructions in the order of |$10^2$| s for |$n_{\rm ri} = 5$|, approximately 10 times faster than NM.
Secondly, simulations are performed in an identical setting on realistic images representing low-resolution versions of the Eta Carinae star system, of a simulated rapidly rotating star, and of the M51 galaxy.6 The multiple simulations per point are performed by varying the sampling and noise realizations. The graphs in Figs 4, 5 and 6 present the SNR curves as a function of undersampling in the range [0.25, 1] (AM only reported for |$n_{\rm ri} = 5$|), confirming the previous results on random images. Reconstructed images are also reported, providing visual confirmation of the superiority of AM relative to NM over the full undersampling range. In both approaches, the visual quality difference between the reconstructions with, respectively, best and median SNR values illustrates the variability of the reconstruction quality. The NM approach suffers from a significantly larger visual degradation of the median SNR value at M = 0.25N than AM. This degradation appears at larger sampling ratios for M51.
Let us highlight that, while only five reinitializations are necessary in the AM approach in low dimension to reach saturation, additional experimental tests on random signals of size |$N = 64^2$| show that |$n_{\rm ri} = 20$| or larger is necessary for a meaningful reconstruction, thereby emphasizing the convergence problem due to non-convexity in higher dimension. Also, the computation time scales linearly with |$n_{\rm ri}$| and can rapidly blow up in this context.
CONCLUSION
We have proposed a novel linear formulation of the optical-interferometric imaging problem in terms of the supersymmetric rank-1 order-3 tensor formed by the tensor product of the vector representing the image sought with itself. In this context, we proposed a linear convex approach for tensor recovery with built-in supersymmetry, regularizing the inverse problem through nuclear norm minimization. We have also studied a non-linear, non-convex, AM approach where supersymmetry is relaxed while the rank-1 constraint is built in. While the former approach is associated with a drastically increased dimensionality of the unknown, the underlying convexity ensures essential properties of convergence to a global minimum of the objective function and independence of initialization, justifying its analysis. Simulation results in low dimension show that the AM scheme provides significantly superior imaging quality to the NM approach, in addition to being much lighter in its memory requirements and computation complexity. Another set of results in higher dimension, however, suggests that the number of necessary reinitializations for the non-convex AM scheme rapidly increases with N. This state of affairs clearly calls for further consideration of a purely convex approach.
Future work should address sparsity constraints along the lines of the recent evolutions brought about in radio interferometry (Wiaux et al. 2009a,b; Carrillo et al. 2012) and in optical interferometry (Thiébaut & Giovanelli 2010; Renard et al. 2011). Our approaches should also be studied in a more realistic setting with exact power spectrum and bispectrum measurements in the continuous domain and for different noise statistics, and explicitly compared to existing MiRA and wisard implementations. Software and hardware optimization will also be key to handle large-size images, e.g. using graphics processing units (Baron & Kloppenborg 2010).
ACKNOWLEDGEMENTS

The authors thank G. Puy for insightful discussions on optimization algorithms. AA and REC are supported by the Swiss National Science Foundation (SNSF) under grants 205321_138311 and 200021_140861. YW is supported by the Center for Biomedical Imaging (CIBM) of the Geneva and Lausanne Universities, EPFL and the Leenaards and Louis-Jeantet foundations.
FOOTNOTES

We also attempted an alternative non-convex approach consisting of solving the non-linear problem directly for |${\boldsymbol x}$|, using the non-convex projected gradient method proposed by Attouch, Bolte & Svaiter (2013). Initial simulations did not produce any meaningful reconstruction, and this approach was discarded.
Note that Attouch et al. (2010) prove that this AM approach converges to a critical point of the objective function (7), provided that terms of the form |$\gamma \Vert {u_p}- {\bar{u}_p}\Vert _2^2$|, controlling the distance of the current unknown |$u_p$| from its value |$\bar{u}_p$| at the previous iteration, are added to the objective function in (8), for any γ > 0. Simulations in the context of the setting described in Section 6 show that the algorithm converges to the same solution for γ ≠ 0 and γ = 0. Other simulations also show that starting the minimization of the three variables from the same random initial point leads to very similar solutions for the three vectors, or for their mean, both in terms of signal-to-noise ratio and visual quality.
Note that Kofidis & Regalia (2002) provide a proof of convergence of their algorithm for even-order tensors only. Simulations in the context of the setting described in Section 6 show that this procedure systematically converges for our order-3 tensors, and provides significantly better results than a heuristic procedure based on extracting the first eigenvector of |$\mathcal {C}(\mathcal {X}_{\rm NM})$| or performing a sum over two dimensions |$\sum _{jk}[\mathcal {X}_{\rm NM}]_{ijk}$|.
Code and test data are available at https://github.com/basp-group/co-oi
Images from Renard et al. (2011) downloaded from the JMMC service at apps.jmmc.fr/oidata/shared/srenard/