DETECTION AND CHARACTERIZATION OF EXOPLANETS AND DISKS USING PROJECTIONS ON KARHUNEN–LOÈVE EIGENIMAGES

Rémi Soummer; Laurent Pueyo; James Larkin

doi:10.1088/2041-8205/755/2/L28

1. INTRODUCTION

Imaging faint exoplanets and circumstellar disks is a difficult task limited by the small angular separation and high contrast between the faint astrophysical signal and the residual starlight (Oppenheimer & Hinkley 2009). The first direct images of planets or other interesting localized structures orbiting nearby stars were obtained relatively recently (Marois et al. 2008a, 2010b; Kalas et al. 2008; Lagrange et al. 2009, 2010; Kalas 2011; Janson et al. 2012; Kraus & Ireland 2012). Direct imaging results will significantly expand in the near future with a new generation of ground-based instruments being implemented (Macintosh et al. 2008; Beuzit et al. 2008; Hinkley et al. 2011; Hodapp et al. 2008). Even using advanced starlight suppression systems (e.g., coronagraphs) such as the one designed for the Gemini Planet Imager (Sivaramakrishnan et al. 2001; Soummer 2005; Soummer et al. 2011b), image post-processing remains necessary to achieve the full high-contrast performance (Marois et al. 2008b). Indeed, most recent direct imaging results were enabled by key advances in observational procedures, e.g., angular differential imaging (Marois et al. 2006), simultaneous imaging of contiguous wavelengths (Marois et al. 2000; Sparks & Ford 2002; Pueyo et al. 2012), in conjunction with sophisticated images analysis routines based on the Locally Optimized Combination of Images (LOCI) algorithm (Lafrenière et al. 2007). Although LOCI is extremely powerful at detecting point sources, the very aggressive point-spread function (PSF) subtraction introduces biases on the photometry and astrometry, which have to be calibrated carefully (Marois et al. 2010a). Biases can be reduced significantly using statistical exploration of the algorithm parameter space (Soummer et al. 2011a), a computationally expensive solution, or by modifying the underlying cost function by masking some pixels (Marois et al. 2010a) or using damping penalty terms (Pueyo et al. 2012). LOCI tends to subtract extended structures such as debris disks, and several studies have focused on mitigation strategies, taking advantage of the geometry in edge-on disks or performing an optimization over large zones (Fitzgerald et al. 2007; Thalmann et al. 2010; Buenzli et al. 2010).

In this Letter, we introduce a new reduction algorithm based on projections on a truncated Karhunen–Loève transformation of the reference PSF library. We first introduce the general formalism associated with PSF subtraction and describe the algorithm. We then use Hubble Space Telescope (HST) NICMOS coronagraphic data to illustrate its performances and highlight its advantages for the characterization of point sources and the discovery of faint extended structures around nearby stars.

2. THE KL IMAGE PROJECTION (KLIP) ALGORITHM

2.1. Optimal PSF Subtraction

An astronomical PSF can be seen as an N-dimensional stochastic process indexed over a scalar quantity ψ = ψ(t, λ, θ, ϕ, T_exp, m_Star), which is representative of the state of the instrument–telescope system (for instance, time, wavelength, field rotation angle, wavefront error, exposure time, and star magnitude) at the time of the exposure. With N pixels in the image, the PSF intensity can be written as

$\begin{equation} I_{\psi }(n)=I(p[n],q[n],\psi), \end{equation} \tag{ 1 }$

where we have rearranged the detector coordinates in one dimension using the pixel mapping functions p and q, and the pixel index n (n ∈ [1, n]).

We assume an observation sequence that yields a target image

$\begin{equation} T(n) = I_{\psi _0}(n) + \epsilon \, A(n), \end{equation} \tag{ 2 }$

which might ( epsilon = 1) or might not ( epsilon = 0) contain faint astronomical signal A(n), and a set of reference images

$\begin{equation} \lbrace R_k(n) = I_{\psi _k}(n) \rbrace _{k = 1...K}. \end{equation} \tag{ 3 }$

ψ₀(n) and ψ_k(n) represent the realizations of the state of the telescope-instrument system, respectively, for the target image and the set of references.

Using prior information about the observation strategy, these references have been selected so that they do not contain any astronomical signal in the search region $\mathcal {S}$ (dimension $N_{\mathcal {S}}$ pixels) of the PSF. Moreover, the observations have been carried out at neighboring states of the telescope/instrument system: |ψ_i − ψ_j| < δ for all i, j ∈ [0, K], where δ is a small quantity that can be adjusted in practice using the correlation between images as a proxy.

The goal of PSF subtraction is to reconstruct $I_{\psi _0}(n)$ using the set of reference images and to subtract it from the target image since $\epsilon A(n) = T - I_{\psi _0}(n)$ . Unfortunately, I_ψ(n) is a continuous random process of ψ and thus an infinite amount of references would be needed to reconstruct $I_{\psi _0}(n)$ exactly. In practice, we can only seek to determine the best estimate $\hat{I}_{\psi _0}(n)$ of $I_{\psi _0}(n)$ using the limited information contained in the K references.

Assuming that each reference is unique, the ensemble $\lbrace I_{\psi _k} \rbrace _{k = 1...K}$ spans $\mathcal {I}_{K}$ , a K-dimensional sub-space of $\mathbb {R}^{N_{\mathcal {S}}}$ , which has an infinity of orthonormal bases {Z_k}_{k = 1...K}. This problem of estimating $\hat{I}_{\psi _0}(n)$ can be formulated as follows.

What is the basis set {Z_k(n)}_{k = 1...K} most likely to minimize the distance between a random realization of I_ψ(n) (ψ = ψ₀ in our case) and $\mathcal {I}_{K}$ ? More formally,

$\begin{equation} \min _{\lbrace Z_k\rbrace } \left\lbrace E_{\psi }\left[ \sum _{n = 1}^{N_{\mathcal {S}}} \left(I_{\psi } - \sum _{k = 1}^{K} <I_{\psi },Z_k>_{\mathcal {S}} Z_k (n)\right)^2 \right] \right\rbrace, \end{equation} \tag{ 4 }$

where E_ψ[ · ] stands for the expected value over the telescope realizations and $\langle \cdot,\cdot \rangle _{\mathcal {S}}$ denotes the inner product over $\mathcal {S}$ . This is a well-known problem in signal processing and, assuming that all the images are of zero mean over $\mathcal {S}$ , the optimal basis set {Z^KL_k}_{k = 1...K} is given by the Karhunen–Loève transform of $\lbrace I_{\psi _k} \rbrace _{k = 1...K}$ (Karhunen 1947; Loève 1948). Moreover, if one chooses to only describe $I_{\psi _0}$ over a K_klip < K dimensional sub-space, then the optimal basis set is given by the truncated Karhunen–Loève transform of the references, $\lbrace Z^{{\rm KL}}_k\rbrace _{k = 1...K_{{\rm klip}}}$

2.2. The Algorithm

The KL image projection (KLIP) algorithm consists of the following steps.

1.
Partition the target T(n) and reference R_k(n) images in an ensemble of search areas $\mathcal {S}$ , and subtract their average values so that they have zero mean over each $\mathcal {S}$ .
2.
Compute the Karhunen–Loève transform of the set of reference PSFs {R_k(n)}_{k = 1...K}:
$\begin{equation} Z^{{\rm KL}}_k(n) = \frac{1}{\sqrt{\Lambda _k}}\sum _{p=1}^{K} c_k(\psi _p) R_p(n), \end{equation} \tag{ 5 }$
where the vectors C_k = [c_k(ψ₁)...c_k(ψ_K)] are the eigenvectors of the K-dimensional covariance matrix E_RR of the R_k(n) references over $\mathcal {S}$ , and {Λ}_{k = 1...K} are its eigenvalues.
$\begin{eqnarray} E_{RR}[i,j] &=& \sum _{n = 1}^{N_{\mathcal {S}}} R_i(n)R_j(n) \end{eqnarray} \tag{ 6 }$

$\begin{eqnarray} E_{RR}\cdot C_k &=& \Lambda _k C_k, \; \Lambda _1>\Lambda _2>....>\Lambda _N. \end{eqnarray} \tag{ 7 }$
3.
Choose a number of modes K_klip to keep in the estimate $\hat{I}_{\psi _0}(n)$ .
4.
Compute the best estimate $\hat{I}_{\psi _0}(n)$ of the actual PSF $I_{\psi _0}(n)$ from the projection of the science target T(n) on the {Z^KL_k(n)}_{k = 1...K_klip} Karhunen–Loève basis:
$\begin{equation} \hat{I}_{\psi _0}(n) =\sum _{k = 1}^{K_{{\rm klip}}} <T,Z^{{\rm KL}}_k>_{\mathcal {S}} Z^{{\rm KL}}_k(n). \end{equation} \tag{ 8 }$
5.
Calculate the final image $F(n) = T(n) - \hat{I}_{\psi _0}(n)$ ,
$\begin{eqnarray} F(n) &=& \left(I_{\psi _0}(n) - \sum _{k = 1}^{K_{{\rm klip}}} <I_{\psi _0},Z^{{\rm KL}}_k>_{\mathcal {S}} Z^{{\rm KL}}_k (n)\right) \nonumber \\ &&+ \epsilon \left(A(n) - \sum _{k = 1}^{K_{{\rm klip}}} <A,Z^{{\rm KL}}_k>_{\mathcal {S}} Z^{{\rm KL}}_k (n)\right). \qquad \end{eqnarray} \tag{ 9 }$

2.3. Throughput and Signal-to-noise Ratio

Equation (9) provides fundamental information pertaining to the optimal choice of K_klip for a given search zone $\mathcal {S}$ . The variance in the search zone $\mathcal {S}$ after the PSF subtraction is given by

$\begin{eqnarray} \sigma ^2_\mathcal {S} &=& ||I_{\psi _0}||^2 \left({1 - \sum _{k = 1}^{K_{{\rm klip}}}\left(\frac{ <I_{\psi _0},Z^{{\rm KL}}_k>_{\mathcal {S}}}{||I_{\psi _0}||} \right)^2}\right) \end{eqnarray} \tag{ 10 }$

$\begin{eqnarray} \hspace{-3.7pc}& \simeq & ||I_{\psi _0}||^2 \left({1 - \frac{\sum _{k = 1}^{K_{{\rm klip}}} \Lambda _k}{\sum _{k = 1}^{K} \Lambda _k}}\right), \end{eqnarray} \tag{ 11 }$

which is a decreasing function of K_klip. In Figure 1, we show the impact of K_klip on the residual variance integrated over entire KLIP-reduced images (i.e., $\mathcal {S}$ is the full image), using HST NICMOS coronagraphic data preselected to have no astrophysical signal. Figure 1 shows that the empirical residual variance after PSF subtraction follows the theoretical estimate in Equation (11).

**Figure 1.** Normalized total residual variance over the entire image after PSF subtraction. The black curve corresponds to the theoretical expression $\sigma ^2_S/||I_{\psi _0}||^2$ based on the corollary of the Karhunen–Loève theorem (see Equation (11)). The color curves correspond to the empirical total residual variance after subtraction for several target stars from the *HST* NICMOS archive without any astrophysical signal around them. In the case of a target with an astrophysical signal, the variance saturates to a minimum level and does not follow the theoretical expression. The theoretical expression from Equation (11) can therefore be used to estimate semianalytically the residual noise after subtraction. If KLIP is applied in small regions (e.g., using zones as with LOCI) this expression can be used to estimate the noise locally around a putative source in the target image.
Download figure:
Standard image High-resolution image

Similarly, Equation (9) provides a simple expression for the signal:

$\begin{equation} S = \sum _{ A(n) \ne 0} \left(A(n) - \sum _{k = 1}^{K_{{\rm klip}}} <A,Z^{{\rm KL}}_k>_{\mathcal {S}} Z^{{\rm KL}}_k(n) \right). \end{equation} \tag{ 12 }$

For the dominant modes (i.e., k ≪ K) $< A,Z^{{\rm KL}}_k>_{\mathcal {S}}\,\simeq 0$ since the morphology of the astronomical signal cannot be reproduced by a PSF realization. Indeed the PSF of a faint planet is localized over a few pixels and thus almost orthogonal to the main modes of the telescope's PSF realizations. Naturally, this argument depends on the spatial morphology of the signal: in Figure 2 we show that the algorithm throughput decreases when K_klip increases for several types of faint astrophysical object. Given priors on the shape of the astrophysical signal (i.e., planet position and flux, or a disk model), it is possible to estimate directly the signal-to-noise ratio (S/N) as a function K_klip when $\mathcal {S}$ is small enough so that Equation (11) represents the local noise in the vicinity of the source. It is therefore possible to choose K_klip a priori in order to optimize the S/N for a given putative source in the image.

**Figure 2.** KLIP throughput based on the expression in Equation (9) from the projection of the astrophysical signal A onto the KL basis $\langle A,Z^{{\rm KL}}_k\rangle_{\mathcal {S}}$ . Since a planet's signal is mostly orthogonal to the main KL modes the throughput is high, as shown for a planet located at ∼1.5 arcsec from the star (top curve). For more extended sources the throughput is reduced, as illustrated for a simple geometrical model corresponding to the HD 181327 disk (middle curve). The bottom curve corresponds to a hypothetical ring-like disk about three times wider than HD 181327. Note that here $\mathcal {S}$ corresponds to an entire *HST* NICMOS image, the throughput will be smaller when using smaller partitions of the image. However because it is calculated analytically for a source model independent from its flux, it can be calibrated easily.
Download figure:
Standard image High-resolution image

**Figure 2.** KLIP throughput based on the expression in Equation (9) from the projection of the astrophysical signal A onto the KL basis $\langle A,Z^{{\rm KL}}_k\rangle_{\mathcal {S}}$ . Since a planet's signal is mostly orthogonal to the main KL modes the throughput is high, as shown for a planet located at ∼1.5 arcsec from the star (top curve). For more extended sources the throughput is reduced, as illustrated for a simple geometrical model corresponding to the HD 181327 disk (middle curve). The bottom curve corresponds to a hypothetical ring-like disk about three times wider than HD 181327. Note that here $\mathcal {S}$ corresponds to an entire *HST* NICMOS image, the throughput will be smaller when using smaller partitions of the image. However because it is calculated analytically for a source model independent from its flux, it can be calibrated easily.
Download figure:
Standard image High-resolution image

3. DISCUSSION

3.1. Forward Modeling with KLIP

In practice the optimal search area $\mathcal {S}$ might be too large to use Equations (11) and (12) to pre-select K_klip. Instead K_klip can be tuned by inspecting the S/N of putative sources in reduced images. However, the second term of Equation (9) carries all the information necessary to characterize faint astrophysical signal once it has been detected. Indeed, the impact of the algorithm on the signal only depends on the projection of A(n) on the $\lbrace Z^{{\rm KL}}_k\rbrace _{k=1..K_{{\rm klip}}}$ and is not a function of $I_{\psi _0}$ . As a consequence KLIP enables forward modeling of detected planets and disks by fitting directly an astrophysical model A_ξ(n) to the final reduced images F(n), where ξ represents the model parameters (e.g., astrometry and photometry for a planet, or geometry, grain distribution for a disk). ξ can be obtained by minimizing the following cost function:

$\begin{equation} \min _{\lbrace \xi \rbrace } \left\lbrace \sum _{n = 1}^{N_{\mathcal {S}}} \left(F(n)-A_{\xi }(n) + \sum _{k = 1}^{K_{{\rm klip}}} <A_{\xi },Z^{{\rm KL}}_k>_{\mathcal {S}} Z^{{\rm KL}}_k(n)\right)^2\right\rbrace.\vspace*{3pt} \end{equation} \tag{ 13 }$

A similar approach was proposed by Marois et al. (2010a) using LOCI, but it presents degeneracies because the LOCI coefficients are not guaranteed to be independent of $I_{\psi _0}$ or A. Such degeneracies do not arise in the case of KLIP because the effect of the algorithm on the astrophysical source is completely determined from Equation (13) and because the basis {Z^KL_k(n)}_{k = 1...K_klip} is independent of the science target. Note that Equation (13) assumes a noiseless signal. Optimal observers (Kasdin & Braems 2006; Caucci et al. 2007, 2009) can mitigate the impact of noise in the final estimate of ξ, and their implementation is greatly facilitated by the fact that Equation (13) only consists of propagating A_ξ(n) through a linear filter. Discussion regarding the practical implementation and performances of such observers are however beyond the scope of this study.

The left panel of Figure 3 shows an archival HST NICMOS image of HR 8799 reduced with KLIP using annular search zones. While the three planets b, c, and d are visible they are not detected unambiguously: the result reported by Soummer et al. (2011a) used a combination of thousands of LOCI reductions, and took advantage of two HST roll angles to minimize the false alarm probability. One could envision a similar parameter search with KLIP. The computational cost of such an exercise would however be reduced since KLIP does not require the introduction of "optimization zones" and "subtraction zones" (Lafrenière et al. 2007), which reduces the dimensionality of the algorithm's parameter exploration. More importantly, the astrometry reported in Soummer et al. (2011a) was the result of a lengthy characterization of the biases associated with LOCI, involving thousands of reductions with synthetic companions. On the right panel of Figure 3, we illustrated how one can determine the astrometric location and the photometry of HR8799bcd with forward modeling in a single KLIP-reduced image. In this example, we performed a simple χ² minimization of the position and flux of each planets, but this framework enables easy implementation of more appropriate modeling using Markov Chains Monte Carlo. KLIP thus provides astrophysical estimates which do not contain any bias arising from self-subtraction in the reduction algorithm. The uncertainties on the astrophysical parameters are solely driven by the local noise in the reduced image. This can be addressed by optimizing the KLIP parameters to increase local S/N, using several exposures and taking advantage of observers that are optimal with respect to the residual noise.

3.2. Similarities Between LOCI and KLIP

Principal component analysis has been extensively used in astronomy to reduce the dimensionality of large data sets, e.g., Cowan et al. (2009) and Connolly et al. (1995). Principal components can also be used directly with LOCI or any of its variants to reduce the dimensionality of the reference set, using KL eigenimages instead of reference PSFs.

The cost function associated with KLIP (Equation (4)) is very similar to the cost function associated with LOCI (Lafrenière et al. 2007). However, LOCI minimizes the distance between the science target and the reference PSFs, whereas KLIP minimizes, in the statistical sense, the distance between any PSFs I_ψ in the close vicinity of references library and the K_klip-dimensional orthonormal basis-set {Z^KL_k(n)}_{k = 1...K_klip}. It is in principle possible to include the target to the reference library before decomposition into principal components, but in this case astrophysical sources can be completely subtracted and forward modeling cannot be carried out. While such solutions might yield slightly deeper contrasts, they might yield astrophysical estimates with biases that are difficult to calibrate. In this sense LOCI, with well-chosen parameters, might be optimal for detection while KLIP is more robust for characterization.

Because the set of {R_k}_{k = 1...N} is not an orthonormal basis, LOCI requires the inversion of the covariance matrix E_RR, which is typically ill conditioned. This leads to noise amplification if the inversion is not properly regularized. Several approaches to regularization exist, for example, eigenvalue truncation (Marois et al. 2010a) or Tikhonov regularization (Pueyo et al. 2012; Soummer et al. 2011a). With KLIP the regularization is implicit with the truncation of the KL basis. It is also more easily controlled because the KL set remains optimal for any value of K_klip. In addition, simple regularization schemes do not reduce the dimensionality of the covariance matrix, whereas KLIP reduces the number of references and therefore improves speed. Once the KL transform has been generated for a given search area geometry, the remaining free parameter K_klip can be explored with minimal computation burden (projections). Depending on the implementation of the algorithm and the type of data, KLIP can lead to considerable speed increases, especially when considering KLIP does not require us to process synthetic sources for characterization of detected signal.

As far as detection is concerned the most striking difference between LOCI and KLIP resides in disk imaging. Using HST Cycle 10 NICMOS archival data of the disk around HD 181327, we performed a global-LOCI and global-KLIP reduction over the entire image, using a carefully pre-selected reference library of 232 PSFs. The disk edges imaged with KLIP, Figure 4, are significantly better constrained when compared with state of the art classical PSF subtraction (Schneider et al. 1999). For comparison, we show in Figure 4 a LOCI reduction of the same object without any regularization: the self-subtraction degrades the rich astrophysical information contained in the ring's edge. It is important to note that with appropriate regularization it is possible to obtain with LOCI a very similar result to KLIP. However, with LOCI it is not easy to model the astrophysical signal since there is no guarantee that the algorithm configuration optimal for detection will not bias the astrophysical signal.

**Figure 4.** Left: image of HD 181327 using KLIP from a carefully selected reference library of 232 PSFs, using 35 KLIP coefficients. For comparison, the right panel shows the image obtained with LOCI without regularization using the same data set. We were able to obtain a very similar image using appropriate regularization with LOCI. However, there is no guarantee that the structure detected in a reduced LOCI image is not an artifact of the regularization. KLIP circumvents this degeneracy and enables forward modeling.
Download figure:
Standard image High-resolution image

4. CONCLUSION

In this Letter, we have introduced a new algorithm for high-contrast image post-processing. KLIP is a cousin of existing methods (e.g., LOCI) in the sense that it builds, using a quadratic cost function, a synthetic reference PSF as a linear combination of references from a library. KLIP involves a truncated Karhunen–Loève transform of the reference library in order to produce an optimal set of orthogonal references. The science target is then projected onto a truncated set of these eigenimages to generate the synthetic reference. We used HST NICMOS data to show that KLIP provides detectability performances similar to LOCI. In addition to reducing the dimensionality of the data with the KL truncation, KLIP provides a linear framework that enables forward modeling of astrophysical sources by fitting directly an astrophysical model to the PSF-subtracted data, without introducing degeneracies. For example, the algorithm throughput for a planet candidate can be calculated semianalytically and independently of the planet signal without having to inject synthetic companions in a reference PSF. The properties of the KL transform enable a theoretical expression of the S/N in subtracted images, provided that KLIP is applied in sufficiently small zones to be representative of the local noise around the planet candidate. The algorithm was illustrated using HST data, but it is also applicable to ground-based integral-field spectrographs data, and preliminary tests show that a local application is preferable in this case because of the lesser degree of correlation between PSFs.

The authors also thank J. B. Hagan, M. Perrin, S. Lonsdale, T. Comeau, and M. Russ for work and help on pipeline development; G. Schneider and D. Hines for discussions on disk imaging with NICMOS, and for providing the LAPL reference PSF library. Support for Program No. HST-AR-12652.01-A was provided by NASA through a grant from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Incorporated, under NASA contract NAS5-26555. This work was also performed in part under contract with the Jet Propulsion Laboratory (JPL) funded by NASA through the Sagan Fellowship Program. JPL is managed for NASA by the California Institute of Technology. The authors also thank the participants of the Exoplanet Imaging Workshop for interesting discussions with support from the National Academies Keck Futures Initiative.

DETECTION AND CHARACTERIZATION OF EXOPLANETS AND DISKS USING PROJECTIONS ON KARHUNEN–LOÈVE EIGENIMAGES

Article metrics

Permissions

Author e-mails

Author affiliations

Dates

ABSTRACT

1. INTRODUCTION