
1 Introduction

Central serous chorioretinopathy (CSCR) is a chronic disease and one of the leading causes of vision impairment among middle-aged men; serous retinal detachment, in particular neurosensory retinal detachment (NRD), is a prominent characteristic of CSCR. Spectral-domain optical coherence tomography (SD-OCT) can generate 3D cubes and capture detailed characteristics of disease phenotypes, and has become an important imaging modality for the diagnosis and treatment of CSCR [1]. Because retinal detachment is the separation of the neurosensory retina (NSR) from the underlying retinal pigment epithelium (RPE), state-of-the-art algorithms principally segment NRD regions based on layer segmentation results. For example, Wang et al. [2] used a fuzzy level-set method to identify the boundaries, and Wu et al. [3] presented an OCT enface-fundus-driven subretinal fluid segmentation method. Semi-supervised methods have also been explored: Zheng et al. [4] combined computerized segmentation with minimal expert interaction, and Wang et al. [5] proposed a slice-wise label propagation algorithm. Supervised machine learning methods, including random forests [6] and deep learning models [7], have been introduced by treating fluid segmentation as a binary classification problem. However, most of these algorithms rely on the accuracy of the underlying retinal layer segmentation.

Because the subretinal fluid in SD-OCT images is typically darker than its surroundings, it can be treated as a blob-like structure and handled efficiently by blob detection algorithms based on scale-space theory. One of the most widely used blob detectors is the Laplacian of Gaussian (LoG). The generalized Laplacian of Gaussian (gLoG) [8] was developed to detect asymmetric structures. To further exploit the convexity and elliptic shape of blobs, the Hessian-based LoG (HLoG) [9] and the Hessian-based Difference of Gaussians (HDoG) [10] were proposed. Most of these algorithms detect candidate blobs by searching for local optima over a large multi-scale Gaussian scale-space, which is time consuming and sensitive to noise.

In this paper, an unsupervised blob segmentation algorithm is proposed to segment subretinal fluid in SD-OCT images. First, a Hessian-based aggregate generalized Laplacian of Gaussian (HAgLoG) algorithm is proposed by aggregating the log-scale-normalized convolution responses of the individual gLoG filters. Second, two regional features, i.e. blobness and flatness, are extracted from the aggregated response map. Third, together with the intensity values, these features are fed into a variational Bayesian Gaussian mixture model (VBGMM) [11] to obtain blob candidates, which are voted into superpixels to produce the initial subretinal fluid regions. Finally, an active contour model driven by local likelihood image fitting (LLIF) energy [12] with a narrowband implementation is used to obtain the final segmentations. Without any retinal layer segmentation, the proposed algorithm achieves higher segmentation accuracy than state-of-the-art methods that rely on retinal layer segmentation results.

2 Methodology

Figure 1 illustrates the pipeline of our framework. In the preprocessing phase, the B-scan shown in Fig. 1(a) is first smoothed with a bilateral filter, as shown in Fig. 1(b). Then a mean-value-based threshold is used to extract the region of interest (ROI), as shown in Fig. 1(c). On the denoised result, the proposed HAgLoG is carried out in three phases: (1) aggregate response construction, (2) Hessian analysis, and (3) post-pruning with VBGMM and LLIF.
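As a rough illustration of this preprocessing step, the sketch below applies a bilateral filter and a mean-value threshold with scikit-image and NumPy; the filter parameters (sigma_color, sigma_spatial) are illustrative assumptions, not the settings used in our experiments.

```python
import numpy as np
from skimage.restoration import denoise_bilateral

def preprocess_bscan(bscan):
    """Smooth a B-scan with a bilateral filter and threshold it at its mean
    intensity to obtain a rough ROI mask (Fig. 1(a)-(c)).
    The filter parameters below are illustrative, not the paper's settings."""
    img = bscan.astype(np.float64)
    img = (img - img.min()) / (img.max() - img.min() + 1e-12)  # scale to [0, 1]
    denoised = denoise_bilateral(img, sigma_color=0.1, sigma_spatial=3)
    roi_mask = denoised > denoised.mean()  # mean-value-based threshold
    return denoised, roi_mask
```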

Fig. 1. The pipeline of the proposed automatic subretinal fluid segmentation method.

2.1 Aggregate Response Construction

A blob is a region in an image that is darker (or brighter) than its surroundings and has consistent intensity inside. Because NRD regions in SD-OCT images are typically darker than their surroundings, in this paper we only consider dark blobs with brighter surroundings. We modify the general Gaussian function by setting the kernel angle to 0. Based on the general Gaussian function in Ref. [8], let \(G(x,y,\sigma _x,\sigma _y)\) be the 2-D modified general Gaussian function defined as:

$$\begin{aligned} G(x,y,\sigma _x,\sigma _y) = A\exp \left( -\left( \frac{x^2}{2\sigma _x^2}+\frac{y^2}{2\sigma _y^2}\right) \right) \end{aligned}$$
(1)

where \(A\) is a normalization factor, and \(\sigma _x\) and \(\sigma _y\) are the scales along the x-axis and y-axis, respectively. The coordinates of the kernel center are set to zero without loss of generality. Based on Ref. [8], for a 2-D image f, the modified gLoG-filtered image \(\bigtriangledown ^2I(x,y,\sigma _x,\sigma _y)\) (as shown in Fig. 1(d)) can be represented as

$$\begin{aligned} \bigtriangledown ^2I(x,y,\sigma _x,\sigma _y) = \left[ \bigtriangledown ^2G(x,y,\sigma _x,\sigma _y)\right] *f(x,y) \end{aligned}$$
(2)

where \(\bigtriangledown ^2\) denotes the Laplacian operator and \(*\) denotes the convolution operation. To further improve the estimation accuracy, the convolution response is log-scale normalized [8]:

$$\begin{aligned} l = (1+\log (\sigma _x)^\alpha )(1+\log (\sigma _y)^\alpha )\bigtriangledown ^2I(x,y,\sigma _x,\sigma _y) \end{aligned}$$
(3)

Finally, as shown in Fig. 1(e), the aggregated response map L is obtained by summing up all intermediate response maps:

$$\begin{aligned} L(x,y) = \sum _{\sigma _x,\sigma _y}\left[ (1+\log (\sigma _x)^\alpha )(1+\log (\sigma _y)^\alpha )\bigtriangledown ^2I(x,y,\sigma _x,\sigma _y)\right] \end{aligned}$$
(4)
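Equations (1)-(4) can be sketched in a few lines of Python. Since the kernel angle is fixed at 0, the anisotropic gLoG response equals SciPy's gaussian_laplace applied with per-axis sigmas; the scale set and α below are placeholder choices, not the values used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def aggregate_glog_response(image, sigmas=(2, 4, 6, 8), alpha=1.0):
    """Aggregated log-scale-normalized gLoG response map L (Eqs. 2-4).
    With the kernel angle set to 0, the anisotropic LoG convolution of Eq. (2)
    is computed by gaussian_laplace with a per-axis sigma (rows=y, cols=x)."""
    img = image.astype(np.float64)
    L = np.zeros_like(img)
    for sx in sigmas:
        for sy in sigmas:
            response = gaussian_laplace(img, sigma=(sy, sx))        # Eq. (2)
            # One reading of the normalization in Eq. (3); alpha is a placeholder.
            w = (1 + np.log(sx) ** alpha) * (1 + np.log(sy) ** alpha)
            L += w * response                                       # Eq. (4)
    return L
```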

2.2 Hessian Analysis

For a pixel (x, y) in the aggregated response L, the Hessian matrix is

$$\begin{aligned} H\left( L(x,y)\right) = \begin{bmatrix} \frac{\partial ^2L(x,y)}{\partial x^2} & \frac{\partial ^2L(x,y)}{\partial x\partial y}\\ \frac{\partial ^2L(x,y)}{\partial x\partial y} & \frac{\partial ^2L(x,y)}{\partial y^2} \end{bmatrix} \end{aligned}$$
(5)

Therefore, we can identify the dark blobs using the following proposition.

Proposition. In an aggregated response 2D image, every pixel of a dark blob has a positive definite Hessian matrix.

Proof. Ref. [13] provides detailed relationships between the eigenvalues of the Hessian matrix and local geometric shape. Specifically, if a pixel belongs to a dark blob-like structure, all eigenvalues of the corresponding Hessian matrix are positive, i.e., the Hessian matrix at that pixel is positive definite.

However, the above proposition is a necessary but not sufficient condition for identifying dark blobs. To further reduce false detections, similar to Ref. [9], we give the following definition.

Definition. A blob candidate in the aggregated response image is an 8-connected component of the set \(U=\{(x,y)\mid (x,y)\in L, T(x,y)=1\}\), where T(x, y) is a binary indicator: if the pixel has a positive definite Hessian matrix, then \(T(x,y)=1\); otherwise, \(T(x,y)=0\).

Instead of determining positive definiteness from the eigenvalues of H(L(x, y)), we use Sylvester's criterion, a necessary and sufficient condition for a Hermitian matrix to be positive definite. For our Hessian matrix, the leading principal minors are \(D_1=\frac{\partial ^2L(x,y)}{\partial x^2}\) and \(D_2=\det \left( H(L(x,y))\right) \). Therefore, the Hessian matrix is positive definite if and only if \(D_1>0\) and \(D_2>0\). As a result, based on the Proposition and Definition, we can obtain the dark blob candidates from the two leading principal minors of the Hessian matrix, as shown in Fig. 1(f).
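A minimal sketch of this check, using plain finite differences for the Hessian entries (a simplification of whatever derivative scheme is used in practice):

```python
import numpy as np

def dark_blob_indicator(L):
    """Binary indicator T of the Definition: T(x, y) = 1 where the Hessian of
    the aggregated response L is positive definite, tested with Sylvester's
    criterion (D1 > 0 and D2 > 0). Finite-difference derivatives only."""
    Ly, Lx = np.gradient(L)          # first derivatives along rows (y) and cols (x)
    Lyy, Lyx = np.gradient(Ly)
    Lxy, Lxx = np.gradient(Lx)
    D1 = Lxx                         # first leading principal minor
    D2 = Lxx * Lyy - Lxy * Lyx       # second leading principal minor, det(H(L))
    return (D1 > 0) & (D2 > 0)
```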

Based on the eigenvalues \(\lambda _1\) and \(\lambda _2\), Refs. [9, 13] defined two geometric features for blob detection: the likelihood of blobness \(R_B\) and the second-order structureness (flatness) \(S_B\) as:

$$\begin{aligned} R_B = \frac{\lambda _1}{\lambda _2},\quad |\lambda _1|\le |\lambda _2| \end{aligned}$$
(6)
$$\begin{aligned} S_B = \sqrt{\lambda _1^2+\lambda _2^2} = \sqrt{trace(H(L))^2-2det(H(L))} \end{aligned}$$
(7)

where \(0<R_B\le 1\) and \(R_B=1\) when \(\lambda _1=\lambda _2\), and \(0<S_B\le 1\). The higher \(S_B\) is, the more salient the blob is against the local background [10]. The two features are shown in Fig. 1(g) and (h), respectively.
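Given the same finite-difference Hessian entries as in the previous sketch, the two regional features can be computed per pixel from the trace and determinant; this assumes R_B is the ratio of the smaller to the larger eigenvalue, as in Eq. (6).

```python
import numpy as np

def blob_features(Lxx, Lxy, Lyy, eps=1e-12):
    """Blobness R_B and flatness S_B (Eqs. 6-7) from the Hessian entries of
    the aggregated response; meaningful on candidate pixels where T = 1."""
    trace = Lxx + Lyy
    det = Lxx * Lyy - Lxy ** 2
    disc = np.sqrt(np.maximum(trace ** 2 - 4 * det, 0.0))
    lam_small = (trace - disc) / 2.0
    lam_large = (trace + disc) / 2.0
    R_B = lam_small / (lam_large + eps)                    # 1 when eigenvalues are equal
    S_B = np.sqrt(np.maximum(trace ** 2 - 2 * det, 0.0))   # sqrt(lam1^2 + lam2^2), Eq. (7)
    return R_B, S_B
```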

2.3 Post-pruning

The input feature space is constructed by stacking the blob features \(R_B\) and \(S_B\) with the average local intensity \(A_B\), and is fed into the VBGMM [11] to remove false positive detections and cluster the blob candidates into blob regions and non-blob regions, as shown in Fig. 1(j). The detected blob regions are then voted into superpixels (Fig. 1(k)) obtained by simple linear iterative clustering (SLIC) [14] to produce the initial subretinal fluid regions shown in Fig. 1(l). Since the subretinal fluid can generally be observed across several consecutive B-scans, the segmentation results of three adjacent slices are used to enforce the continuity of the subretinal fluid and remove false detections. Finally, an active contour model driven by LLIF [12] with a narrowband implementation (Fig. 1(m)) is used to obtain the final segmentations in Fig. 1(n).
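The clustering-and-voting part of this step can be sketched with scikit-learn's BayesianGaussianMixture and scikit-image's SLIC. The number of superpixels, the two-component setting, and the majority-voting rule below are illustrative assumptions rather than the exact configuration used in the paper.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture
from skimage.segmentation import slic

def prune_candidates(denoised, R_B, S_B, A_B, candidate_mask, n_segments=400):
    """Cluster blob candidates with a VBGMM and vote them into SLIC superpixels
    to form the initial subretinal fluid regions (Fig. 1(j)-(l))."""
    ys, xs = np.where(candidate_mask)
    X = np.column_stack([R_B[ys, xs], S_B[ys, xs], A_B[ys, xs]])
    vbgmm = BayesianGaussianMixture(n_components=2, random_state=0).fit(X)
    labels = vbgmm.predict(X)
    # Assumption: the cluster with the lower mean intensity A_B is the blob class.
    blob_cluster = np.argmin(vbgmm.means_[:, 2])
    blob_mask = np.zeros_like(candidate_mask, dtype=bool)
    blob_mask[ys[labels == blob_cluster], xs[labels == blob_cluster]] = True

    # Vote blob pixels into superpixels: keep superpixels mostly covered by blob pixels.
    segments = slic(denoised, n_segments=n_segments, channel_axis=None)
    initial = np.zeros_like(blob_mask)
    for s in np.unique(segments):
        region = segments == s
        if blob_mask[region].mean() > 0.5:
            initial[region] = True
    return initial
```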

3 Results

In this paper, 23 longitudinal SD-OCT cube scans with only NRD, from 12 eyes of 12 patients and acquired with a Cirrus OCT device, were used to evaluate the proposed algorithm. All scans covered a \(6\times 6\times 2\,\mathrm{mm}^3\) area centered on the fovea, with volume dimensions \(512\times 128\times 1024\). This study was approved by the Institutional Review Board (IRB) of the First Affiliated Hospital of Nanjing Medical University, and informed consent was obtained. Two independent experts manually outlined the NRD regions on the B-scan images, and these outlines were used to generate the segmentation ground truths.

Fig. 2. Comparison of segmentation results overlaid on original B-scan images for five example cases selected from five patients. In each row, the images from left to right are the segmentation results obtained by LPHC, SS-KNN, RF, FLSCV, CMF, EFD and the proposed algorithm. The last column shows the manual segmentation results of Expert 1.

Fig. 3. Comparison of segmentation results overlaid on three 3D example cases selected from three patients. In each row, the images from left to right are the segmentation surfaces obtained by RF, FLSCV, CMF, EFD and the proposed algorithm. The last column shows the manual segmentation surfaces of Expert 2.

Three criteria were used to evaluate performance: true positive volume fraction (TPVF), positive predictive value (PPV) [3] and the dice similarity coefficient (DSC). Linear regression analysis and the Bland-Altman approach were applied for statistical correlation and reproducibility analyses.
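On binary masks, the three criteria reduce to simple overlap ratios; a minimal sketch (guards against empty masks omitted):

```python
import numpy as np

def tpvf_ppv_dsc(seg, gt):
    """TPVF, PPV and DSC between a binary segmentation and its ground truth."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    tp = np.logical_and(seg, gt).sum()
    tpvf = tp / gt.sum()                    # fraction of the true fluid volume recovered
    ppv = tp / seg.sum()                    # fraction of the segmented volume that is correct
    dsc = 2 * tp / (seg.sum() + gt.sum())   # Dice similarity coefficient
    return tpvf, ppv, dsc
```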

Six state-of-the-art methods were compared: a semi-supervised segmentation algorithm using label propagation and higher-order constraint (LPHC) [5], a stratified sampling k-nearest neighbor classifier based algorithm (SS-KNN) [15], a random forest classifier based method (RF) [6], a fuzzy level set with cross-sectional voting (FLSCV) [2], a continuous max-flow optimization-based method (CMF) [16] and an enface-fundus-driven method (EFD) [3].

In Fig. 2, five example B-scans selected from five patients are used to show the performance of the proposed model. LPHC can hardly obtain satisfactory segmentations without good initialization. SS-KNN may produce obvious false detections and cannot guarantee the smoothness of the contour. FLSCV suffers from under-segmentation, while RF suffers from false positive segmentations. Both CMF and EFD are sensitive to low contrast. By contrast, the proposed algorithm effectively avoids under-segmentation and is more robust to low contrast, producing smooth and accurate segmentation results that are highly consistent with the ground truth.

To further highlight the superior performance of the proposed algorithm, Fig. 3 shows the 3-D segmentation surfaces of three example subjects selected from three different patients. Both RF and FLSCV suffer from under-segmentation, yielding smaller segmentation volumes, and FLSCV may produce more misclassifications. Because of their sensitivity to low contrast, the surfaces obtained by CMF and EFD contain obvious protrusions. By contrast, the proposed algorithm clearly outperforms the other approaches and generates the segmentation results most similar to the ground truths.

Table 1 summarizes the average quantitative results between the segmentations and the manual gold standards. Overall, our model produces higher segmentation accuracy than the other methods. It should be noted that both CMF and EFD rely on layer segmentation results; in contrast, without utilizing any layer information, our algorithm still obtains better segmentations.

Table 1. Summary of the quantitative results (mean ± standard deviation) between the segmentations and the manual gold standards (the two individual experts' segmentations).

Finally, as shown in Fig. 4, a statistical correlation analysis was carried out using linear regression and the Bland-Altman approach to compare the segmentation results of the proposed method with the manual segmentations from the two experts. From Fig. 4(a) and (c), the proposed algorithm shows high correlation with both experts (both \(r^2=0.99\)). The Bland-Altman plots in Fig. 4(b) and (d) also indicate stable agreement between the automated and manual segmentations.
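For reference, the correlation and agreement statistics reported in Fig. 4 can be computed from per-scan volumes as sketched below; this is a generic linear-regression/Bland-Altman computation, not the exact code used to produce the figure.

```python
import numpy as np
from scipy.stats import linregress

def agreement_stats(auto_vols, manual_vols):
    """Linear regression (r^2) and Bland-Altman bias / limits of agreement
    between automated and manual NRD volumes (one value per scan)."""
    a = np.asarray(auto_vols, dtype=float)
    m = np.asarray(manual_vols, dtype=float)
    fit = linregress(a, m)                         # slope, intercept, rvalue, ...
    diff = a - m
    bias = diff.mean()
    loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))
    return {"r2": fit.rvalue ** 2, "slope": fit.slope,
            "intercept": fit.intercept, "bias": bias, "loa": loa}
```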

Fig. 4. Statistical correlation analysis. (a) Linear regression analysis between the proposed method and Expert 1. (b) Bland-Altman plot for the proposed method and Expert 1. (c) Linear regression analysis between the proposed method and Expert 2. (d) Bland-Altman plot for the proposed method and Expert 2.

4 Conclusion

In this paper, we propose an automatic and accurate NRD segmentation method for SD-OCT images. The proposed model overcomes the limitations imposed by retinal layer segmentation, is unsupervised in that it requires no training data, and is automatic in that it requires no manual interaction. Consequently, without retinal layer segmentation, the proposed algorithm produces accurate segmentation results that are highly consistent with the ground truths. Our method may provide reliable NRD segmentations for SD-OCT images and be useful for clinical diagnosis. Our future work will mainly focus on extending the method to 3D segmentation.