
Underwater dynamic polarization imaging without dependence on the background region


Abstract

Active-polarization imaging holds significant promise for achieving clear underwater vision. However, previous studies considered only static targets and required a background region for image restoration. To address these issues, this study proposes an underwater dynamic polarization imaging method based on image pyramid decomposition and reconstruction. During decomposition, the polarized image is downsampled to generate an image pyramid. Subsequently, the spatial distribution of the polarization characteristics of the backscattered light is reconstructed by upsampling, which recovers the clear scene. The proposed method avoids dependence on the background region and is suitable for moving targets with varying polarization properties. The experimental results demonstrate effective elimination of backscattered light while sufficiently preserving target details. In particular, for dynamic targets, the method simultaneously achieves processing times that meet practical requirements and superior recovery quality.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Underwater active polarization imaging has great potential for various practical applications, such as marine environmental monitoring and underwater engineering operations [1]. However, the absorption and scattering of light by water and suspended particles considerably limit the effectiveness of underwater imaging. Absorption rapidly diminishes the target light, while scattering generates veiling light that obscures the target [2]. The challenges posed by absorption can be addressed by increasing the intensity of the incident light; however, eliminating the scattered light is difficult.

Various methods have been proposed to address this fundamental challenge, including range-gated imaging [3], synchronous scanning imaging [4], indirect time-of-flight imaging [5], and ghost imaging [6]. Owing to the partially polarized nature of backscattered light [7], polarization imaging effectively mitigates backscatter and achieves clear underwater vision. Together with the advantages of a compact system and high cost efficiency, this has made it a research focus [8–10]. Among the various polarization-based imaging methods, the degree of polarization (DOP) of the backscattered light (Pscat) is pivotal because the accuracy with which this parameter is estimated directly affects the imaging quality. Schechner et al. obtained Pscat by selecting a background region and averaging the pixel values in that area [11]. Because passive imaging does not permit all-weather operation and may suffer from low brightness in the deep sea, Treibitz introduced active polarization illumination based on Schechner’s work [12], and a series of valuable studies subsequently followed this approach [13–19]. However, each of these methods assumes that Pscat is constant across the scene, rendering them inappropriate when the backscatter exhibits spatially variable polarization [20]. To overcome this limitation, Hu et al. employed an extrapolation method to infer the global distribution of Pscat from a selected background region [21]. Wei et al. utilized a polynomial fitting method to establish an underwater polarimetric imaging model based on the Stokes vector [22]. Although the fitting approach accurately estimated the spatial variations in the backscattered light, selecting a suitable background region is difficult in strongly scattering environments. Therefore, Wei et al. analyzed the distribution of backscattered light and presented a solution using low-rank sparse matrix decomposition [23].

The aforementioned methods require a background region in the image, which implies that they fail when the target occupies the entire field of view. To address this issue, Wang et al. optimized the physically feasible region to achieve automatic underwater image recovery without a background region [24]. However, the DOP of the directly transmitted light was assumed to be low or zero. Zhao et al. presented a new method that could simultaneously obtain the DOP of the target light and the backscattered light [25]; however, both parameters were treated as constants. The Stokes decomposition scheme designed by Li et al. provided a novel means of addressing the assumption that Pscat is single-valued [26], but it required incident light with two orthogonal polarization orientations. Moreover, most studies in this field have focused on static imaging, and only a few methods are available for imaging moving targets [27,28]. Nevertheless, strong scattering can lead to severe aliasing of the target light by the backscattered light, which may compromise the effectiveness of these methods.

In this study, to enable dynamic imaging without a background region, an image pyramid decomposition and reconstruction method is proposed to recover degraded underwater images that are completely occupied by a moving target. Initially, a polarized image is chosen as the bottom image. Subsequently, an image pyramid is created by downsampling the bottom image. The target information is effectively eliminated by the downsampling and by identifying the black pixels. This allows the spatial distribution of the backscattered light to be reconstructed through upsampling, which enables recovery of the clear scene. Experiments with moving targets having different polarization properties are conducted to verify the effectiveness of the proposed method. Qualitative and quantitative results demonstrate exceptional and robust descattering performance. Furthermore, the rapid processing speed of the proposed method indicates its significant potential for practical applications.

2. Methodology based on image pyramid decomposition and reconstruction

In the traditional active polarization imaging model [12], the clear scene can be described as:

$$T = \frac{1}{{{P_{scat}} - {P_{obj}}}}[{I_ \bot }(1 + {P_{scat}}) - {I_\parallel }(1 - {P_{scat}})].$$
where T represents the target light, that is, the clear scene; ${I_ \bot }$ and ${I_\parallel }$ are images of polarization states that are orthogonal and identical to the incident polarized light, respectively. Pscat and Pobj denote the DOP of the backscattered light and target light, respectively.

According to Eq. (1), the clear scene can be recovered from degraded images if the four parameters (${I_\parallel }$, ${I_ \bot }$, Pscat, and Pobj) are obtained. The Stokes vector is selected as the input image for the entire process to provide a complete description of the polarization characteristics of light.

$${S_0} = {I_0} + {I_{90}} = {T_p} + {T_n} + {B_p} + {B_n},$$
$${S_1} = {I_0} - {I_{90}} = {T_p}\cos 2\alpha + {B_p}\cos 2\beta ,$$
$${S_2} = {I_{45}} - {I_{135}} = {T_p}\sin 2\alpha + {B_p}\sin 2\beta .$$
where I0, I45, I90, and I135 are the images corresponding to polarization directions of 0$^{\circ}$, 45$^{\circ}$, 90$^{\circ}$, and 135$^{\circ}$, respectively; B is the backscattered light; and the subscripts p and n denote the completely polarized and non-polarized parts of the light, respectively. $\alpha $ and $\beta $ denote the angle of polarization (AOP) of the target light and the backscattered light, respectively. Because circularly polarized light accounts for only a small fraction of the light in this scenario, only the first three Stokes components are used, and S3 is neglected [22].

According to Eqs. (2)–(3), ${I_0}$ and ${I_{90}}$ can be obtained via the Stokes vector:

$${I_0} = \frac{{{S_0} + {S_1}}}{2} = {T_p}{\cos ^2}\alpha + {B_p}{\cos ^2}\beta + \frac{1}{2}{T_n} + \frac{1}{2}{B_n},$$
$${I_{90}} = \frac{{{S_0} - {S_1}}}{2} = {T_p}{\sin ^2}\alpha + {B_p}{\sin ^2}\beta + \frac{1}{2}{T_n} + \frac{1}{2}{B_n}.$$

The subscripts 0 and 90 indicate polarization states that are the same as and orthogonal to that of the incident light, respectively. Thus, I0 and I90 represent the desired ${I_\parallel }$ and ${I_ \bot }$, respectively. Pscat plays a crucial role in eliminating the backscattered light; hence, estimating its global distribution deserves particular attention.
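For readers who prefer code, the following minimal NumPy sketch (our own illustration; the array names and the assumption that the four polarizer channels are already available as demosaicked floating-point images are ours) implements Eqs. (2)–(6): the linear Stokes components are formed from the four polarizer-channel images, and ${I_\parallel }$ and ${I_ \bot }$ are then derived from S0 and S1.

```python
import numpy as np

def stokes_from_channels(I0, I45, I90, I135):
    """Linear Stokes components from the four polarizer-channel images, Eqs. (2)-(4)."""
    S0 = I0 + I90      # total intensity
    S1 = I0 - I90      # 0/90 degree difference
    S2 = I45 - I135    # 45/135 degree difference
    return S0, S1, S2

def copolar_crosspolar(S0, S1):
    """I_parallel and I_perp recovered from the Stokes vector, Eqs. (5)-(6)."""
    I_par = 0.5 * (S0 + S1)    # same polarization state as the illumination
    I_perp = 0.5 * (S0 - S1)   # orthogonal polarization state
    return I_par, I_perp
```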

According to Eqs. (2)–(6), the target and background are always superimposed in the available images, which hinders the estimation. The differences in the properties of the target light and the backscattered light can be exploited to address this issue. As noted in [28], the texture of the target signal varies far more strongly than that of the backscattered light. Therefore, the target light can be assumed to be more discrete, whereas the backscattered light exhibits continuity across the entire image. Downsampling the image during the formation of the image pyramid causes the target information to be lost, which implies that the hindrance posed by the target is gradually removed. Subsequently, the spatial variation in the backscattered light can be reconstructed by upsampling.

Considering the advantages of polarization difference and polarization filtering, S0 is excluded directly. According to [7], I0 cannot provide a polarization filtering effect. Thus, only S1, S2, and I90 can serve as the bottom image of the pyramid. The decomposition process can be further simplified in many instances. For targets with strong depolarization abilities, the target information is eliminated by the image subtraction, so only I90 needs to be downsampled. For targets with weak depolarization abilities, scanty target information is available in S2 and I90 [29], making S1 a suitable option.

According to [29], for highly polarized targets, the polarized portion of the target light is concentrated along ${0^ \circ }$, that is, $\alpha $ is ${0^ \circ }$. In this case, some components in Eqs. (4) and (6) are 0:

$${T_p}\sin 2\alpha = {T_p}\sin {0^ \circ } = 0,$$
$${T_{90}} = {T_p}{\sin ^2}\alpha = {T_p}{\sin ^2}{0^ \circ } = 0.$$

The unpolarized part of the target light is ignored because of its low intensity. Hence, the processing flow for targets with weak depolarization abilities is:

$${S_1} = {T_p}\cos 2\alpha + {B_p}\cos 2\beta ,$$
$${S_2} = {B_p}\sin 2\beta ,$$
$${I_{90}} = {B_p}{\sin ^2}\beta + \frac{1}{2}{B_n},$$
$$S_1^{j - i - 1} = {[S_1^{j - i}]_{ \downarrow 2}}\textrm{ }i = 0,1,\ldots j - 2.$$
where $S_1^{j - i}$ and $S_1^{j - i - 1}$ are images on different layers of the pyramid; S1 acts directly as the bottom image $S_1^j$; and ${[{} ]_{ \downarrow 2}}$ represents the downsampling of rows and columns with a sampling interval of two. It is important to emphasize that the target information cannot be completely removed by downsampling alone. In other words, the top image retains the main texture features, which may introduce errors into the reconstruction. Inspired by the black pixels mentioned in [30], a portion of the pixels in the top image $S_1^1$ is searched to enhance the estimation accuracy. Certain targets contain rich patterns with different reflectivities. At the positions occupied by the patterns with the lowest reflectance, the target light is almost completely absorbed, so the corresponding pixels record only the backscattered light. Although these black pixels may be few, the entire intensity distribution can be obtained by polynomial fitting because the backscattered light is continuous across the field of view. Owing to the relatively constant polarization properties of the backscattered light in the depth direction [22], a column-by-column search is used to find the black pixels. For simplicity, a black pixel is denoted by BP. The process of searching for BP can be described as follows:
$$\textrm{Arange} = \max (S_1^1(c)) - \min (S_1^1(c)),$$
$$Aupbound = \max (S_1^1(c)) - Arange \cdot m,$$
$$BP = \{{p|p \le Aupbound, {p \in S_1^1(c)} \}} .$$

Here, $S_1^1(c)$ denotes a particular column of $S_1^1$; max() and min() represent finding the maximum and minimum values, respectively; p represents the pixel; and m is a scale factor that controls the magnitude of Aupbound. Possible errors caused by directly using the pixel with the minimum value as BP are avoided by setting m and Aupbound.
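A possible sketch of the decomposition of Eq. (12) and the column-wise black-pixel search of Eqs. (13)–(15) is given below (our own NumPy version; the four-layer default and the value of the scale factor m are illustrative assumptions, since the paper does not prescribe them).

```python
import numpy as np

def build_pyramid(bottom, n_layers=4):
    """Image pyramid by repeated 2x decimation of rows and columns, Eq. (12)."""
    pyramid = [bottom]
    for _ in range(n_layers - 1):
        pyramid.append(pyramid[-1][::2, ::2])  # sampling interval of two
    return pyramid                             # pyramid[-1] is the top image S_1^1

def black_pixels_by_column(top, m=0.9):
    """Column-wise black-pixel search in the top image, Eqs. (13)-(15).

    A larger m keeps only the darkest pixels of each column (the paper does not
    give a specific value).  Returns the (row, col) coordinates and values of BP."""
    rows, cols, vals = [], [], []
    for c in range(top.shape[1]):
        col = top[:, c]
        a_range = col.max() - col.min()          # Eq. (13)
        a_upbound = col.max() - a_range * m      # Eq. (14)
        idx = np.flatnonzero(col <= a_upbound)   # Eq. (15)
        rows.extend(idx)
        cols.extend([c] * idx.size)
        vals.extend(col[idx])
    return np.asarray(rows), np.asarray(cols), np.asarray(vals, dtype=float)
```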

Next, the spatial distribution of backscattered light is estimated:

$$S_1^1(B) = \sum\limits_{{o_1},{o_2}}^o {{p_{{o_1}{o_2}}}{x^{{o_1}}}{y^{{o_2}}}} ,$$
$$S_1^{j - i + 1}(B) = {[S_1^{j - i}(B)]_{ \uparrow 2}}\textrm{ }i = 1,\ldots j - 1.$$
where $S_1^1(B)$ represents the global variation of backscattered light on the top layer; (x,y) are the coordinates of BP; ${p_{{o_1}{o_2}}}$ and o denote the coefficient and order of the polynomial function, respectively. The optimal values are determined using the least-squares method. ${[{} ]_{ \uparrow 2}}$ represents the upsampling of rows and columns, which is used to obtain new pixels by calculating the average values of the neighboring pixels. The reconstructed image $S_1^j(B)$ is assumed to contain only the information of the backscattered light:
$$S_1^j(B) = {B_p}\cos 2\beta .$$
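Equations (16)–(18) amount to a bivariate polynomial least-squares fit over the black-pixel coordinates, followed by upsampling back to the bottom resolution. The sketch below is our own simplification: the cubic order is an assumed default, x is taken as the column index and y as the row index, and instead of the layer-by-layer averaging upsample of Eq. (17), the fitted surface is evaluated directly on the bottom-resolution grid, which yields the same smooth backscatter map for a polynomial surface.

```python
import numpy as np

def fit_backscatter_surface(rows, cols, vals, order=3):
    """Least-squares fit of the bivariate polynomial in Eq. (16) over the black pixels."""
    terms = [(o1, o2) for o1 in range(order + 1) for o2 in range(order + 1)
             if o1 + o2 <= order]
    A = np.stack([cols.astype(float)**o1 * rows.astype(float)**o2
                  for (o1, o2) in terms], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, vals, rcond=None)
    return terms, coeffs

def evaluate_on_bottom_grid(terms, coeffs, bottom_shape, n_layers):
    """Evaluate the fitted surface at full resolution (stands in for Eq. (17)).

    The fit was performed in top-layer coordinates, so bottom-layer coordinates
    are divided by 2**(n_layers - 1) before evaluation."""
    scale = 2.0 ** (n_layers - 1)
    ys, xs = np.mgrid[0:bottom_shape[0], 0:bottom_shape[1]]
    xs, ys = xs / scale, ys / scale
    B = np.zeros(bottom_shape)
    for (o1, o2), p in zip(terms, coeffs):
        B += p * xs**o1 * ys**o2
    return B  # reconstructed S_1^j(B), Eq. (18)
```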

In conjunction with Eqs. (10) and (11), the spatial distribution of backscattered light can be obtained with high accuracy:

$$\beta = \frac{1}{2}\arctan (\frac{{{S_2}}}{{S_1^j(B)}}),$$
$${B_p} = \sqrt {{{[S_1^j(B)]}^2} + {{({S_2})}^2}} ,$$
$${B_n} = 2({I_{90}} - {B_p}{\sin ^2}\beta ).$$
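In code, Eqs. (19)–(21) for a weakly depolarizing target could read as follows (a minimal sketch; np.arctan2 is used instead of the plain arctangent of the ratio so that zero values of the reconstructed S_1^j(B) do not cause a division error).

```python
import numpy as np

def backscatter_weak_depolarization(S1_recon, S2, I90):
    """Backscatter parameters for a weakly depolarizing target, Eqs. (19)-(21).

    S1_recon is the pyramid-reconstructed S_1^j(B); S2 and I90 are the measured
    full-resolution images."""
    beta = 0.5 * np.arctan2(S2, S1_recon)    # AOP of the backscattered light
    Bp = np.sqrt(S1_recon**2 + S2**2)        # completely polarized part
    Bn = 2.0 * (I90 - Bp * np.sin(beta)**2)  # non-polarized part
    return beta, Bp, Bn
```

For a strongly depolarizing target, Eqs. (22)–(24) below take the same form, with the measured S1 in place of the reconstructed S_1^j(B) and the pyramid-reconstructed I_90^j(B) in place of the measured I90.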

For targets with strong depolarization abilities, the diffuse reflection occurring on the target surface results in comparable intensities of the reflected light along each polarization direction. Therefore, the target-related information in S1 and S2 is eliminated in image subtraction. Based on this, appropriate changes in the processing flow are required:

$$\beta = \frac{1}{2}\arctan (\frac{{{S_2}}}{{{S_1}}}),$$
$${B_p} = \sqrt {{{({S_1})}^2} + {{({S_2})}^2}} ,$$
$${B_n} = 2[I_{90}^j(B) - {B_p}{\sin ^2}\beta ].$$
where $I_{90}^j(B)$ represents the reconstructed bottom image.

The global distribution of Pscat can then be obtained from Bp and Bn as:

$${P_{scat}} = \frac{{{B_p}}}{{{B_p} + {B_n}}}.$$

Therefore, only Pobj remains to be determined. This parameter is assigned a constant value because it merely contributes a scale factor to the brightness of the recovery result [10].

Finally, the four parameters obtained (I0, I90, Pscat, and Pobj) are substituted into Eq. (1) to recover the clear scene. The flowchart of the proposed method is shown in Fig. 1. Here, a target with weak depolarization abilities is taken as an example.
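Combining Eq. (25) with Eq. (1), the final recovery step could be sketched as below (our own illustration; the constant value of P_obj and the small eps guarding against division by zero are assumed placeholders, not values specified in the paper).

```python
import numpy as np

def recover_scene(I_par, I_perp, Bp, Bn, P_obj=0.1, eps=1e-6):
    """Recover the clear scene from the estimated backscatter, Eqs. (25) and (1)."""
    P_scat = Bp / (Bp + Bn + eps)  # global DOP of the backscatter, Eq. (25)
    T = (I_perp * (1.0 + P_scat) - I_par * (1.0 - P_scat)) / (P_scat - P_obj + eps)  # Eq. (1)
    return np.clip(T, 0.0, None)   # clip negative intensities for display
```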

Fig. 1. Flowchart of the proposed method.

As shown in Fig. 1, the proposed method is free from the restrictions of the background region. Meanwhile, the spatial distribution of the backscattered light is reconstructed, which enables the method to handle non-uniform variations. The entire process can be automated, making it applicable to dynamic imaging. These advantages suggest that practical applications of the proposed method could be highly successful.

3. Experiments and results

3.1 Experimental setup

To validate the effectiveness of the proposed method, experiments were conducted using different targets. The experimental setup in Fig. 2 includes an LED combined with a polarization state generator (PSG) that emitted a polarized light beam at a central wavelength of 550 nm. A polarimetric camera (FLIR BFS-U3-51S5P-C) was positioned alongside these devices to acquire the Stokes vector. The exposure time was set to 0.005 s to obtain proper brightness.

Fig. 2. Schematic of the underwater polarization imaging experiments.

To simulate a turbid underwater environment, skimmed milk was added to the water in a tank measuring 180 cm${\times} $60 cm${\times} $60 cm. The water level was 35 cm, and 160 and 200 ml of milk were added to obtain the low and high turbidity levels, respectively. Following the formulas presented in [31], the scattering coefficients at the two turbidities were 0.059/cm and 0.074/cm. Black light-absorbing paper was pasted on the inside of the tank to absorb light reflected from the tank surfaces. All targets used in the experiments were placed 80 cm from the front wall of the water tank, which corresponds to 4.74 and 5.92 scattering mean lengths at the two turbidities, respectively.

3.2 Recovery results of different methods in static imaging

First, static imaging experiments were conducted to validate the effectiveness of the proposed method. In the first set of experiments, a model fish was used as the target. Light was primarily diffusely reflected from the surface of the model; therefore, the DOP of the target light was neglected. Because imaging speed was not a requirement, a polarization state analyzer (PSA) with higher extinction ratios was used to acquire the images, resulting in better descattering. For comparison, the recovery results of CLAHE, Treibitz’s method, and Zhao’s method are shown in Fig. 3.

Fig. 3. Imaging results of the model fish in low and high turbidity environments, which correspond to the results of intensity imaging, CLAHE [32], Treibitz’s method [12], Zhao’s method [25], and our method. For a better comparison, one region is cropped out and shown at the bottom.

In Fig. 3, the veiling effect of the backscattered light fogs the entire intensity image, obscuring the target. When the turbidity increases, detailed information about the fish becomes impossible to obtain. CLAHE primarily stretches the gray levels of the image, while the backscattered light is not eliminated. Consequently, the fish remains obscured by the dense haze, resulting in an indistinct visual appearance. Treibitz’s method partly suppresses the backscattered light when the turbidity of the medium is low; the elimination is particularly evident around the head of the fish. However, at high turbidity, the haze that should have been eliminated reappears throughout the entire image. This is because Pscat is taken as the average value of one randomly selected background region, which makes it unsuitable for removing all of the haze in the image. Zhao’s method searches for the optimal value of Pscat using a genetic algorithm. The backscattered light is effectively eliminated, particularly at the low concentration of scattering media, because the method is free of the background-region restriction. However, target information is lost: in the enlarged view, parts of the fins are erroneously removed as background, and the support rod blends completely into the background, which diminishes the performance of the method. Benefiting from the accurate reconstruction of the backscattered light, our approach comes closest to the expected goal. In the last column of Fig. 3, most of the background areas in the images are blackened because a considerable amount of haze is eliminated, and the backscattered light remains strongly suppressed as the scattering intensifies. More importantly, the target information remains intact; as shown in the enlarged view, all textural features of both the fish and the support rod are preserved. These results demonstrate that our method can effectively eliminate the backscattered light while retaining the target information.

The other target consisted of two parts: an aluminum sheet and a sticker with the XJTU logo attached to it. The incident light retains its original polarization characteristics when reflected from the logo surface; hence, the polarization effects of this target must be considered. In addition, to verify the performance of the proposed method without a background region, the original images were cropped to remove the background region. After cropping, the images were directly processed using CLAHE, Zhao’s method, and our method. In Treibitz’s method, the background region is retained during processing and removed only for display.

The recovery results for the logo are shown in Fig. 4. At the low concentration of skimmed milk, the overall appearance of the intensity image and the CLAHE result remains largely unchanged: the images are covered by a layer of haze, and the visual effects are poor. In Treibitz’s method, the haze thickness increases with the turbidity, and the letters are almost unidentifiable in the enlarged views. Compared with the results for the model fish, the descattering effect is significantly reduced because of the random selection of the background region; accordingly, some uncertainty is expected in the performance of this method. Zhao’s method uses a genetic algorithm to estimate the optimal values of Pscat and Pobj; it is therefore independent of the background region and exhibits an excellent ability to remove the haze. Our method, in comparison, excels further at removing the backscattered light, as is evident in both the overall image and the enlarged views, providing a clearer visual experience. As the turbidity of the medium increases, our approach consistently exhibits outstanding descattering performance, whereas in the other methods the fine patterns of the logo become indistinguishable. In the enlarged views of the three intermediate methods, only the slightly larger Chinese characters are vaguely discernible; in our results, even the tiny letters remain distinct, providing compelling evidence of the superior performance and robustness of the proposed approach.

Fig. 4. Recovery results of the XJTU logo under low and high turbidity underwater environments. The background regions in the images are removed before display (Treibitz’s method) or processing (CLAHE, Zhao’s method, and our method).

3.3 Results of dynamic imaging

To assess the dynamic imaging capabilities of our method, two targets were moved, and videos of the movement were captured using the polarimetric camera for subsequent processing. The video sequences documenting the movements of these targets comprised multiple frames, each with a resolution of 1024${\times} $1224. Each frame of the degraded video was recovered using our method on a computer equipped with an AMD R5-3600 CPU. To increase the processing speed, only the first frame was subjected to image pyramid decomposition and reconstruction to obtain the global distribution of Pscat; each subsequent frame was recovered directly using this estimated Pscat. Consequently, the average processing times for the fish and the logo were approximately 0.039 s and 0.036 s, respectively, which satisfies the requirements of practical real-time imaging. If the acquisition time of the camera is included, the upper limit of 0.04 s per frame may be slightly exceeded; fortunately, this can be addressed by increasing the intensity of the incident light. Hence, the proposed method is capable of dynamic imaging. A qualitative comparison of the key frames is shown in Fig. 5.
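The per-frame strategy described above could be organized as in the sketch below. This is our own pseudostructure: `estimate_Pscat_by_pyramid` is a hypothetical helper standing in for the pyramid decomposition and reconstruction of Section 2, and the frames are assumed to arrive as Stokes triplets (S0, S1, S2).

```python
def process_video(frames, P_obj=0.1):
    """Recover every frame while estimating P_scat only once, on the first frame."""
    P_scat = None
    recovered = []
    for k, (S0, S1, S2) in enumerate(frames):
        I_par = 0.5 * (S0 + S1)   # Eq. (5)
        I_perp = 0.5 * (S0 - S1)  # Eq. (6)
        if k == 0:
            # pyramid decomposition and reconstruction only for the first frame
            P_scat = estimate_Pscat_by_pyramid(S1, S2, I_perp)  # hypothetical helper
        T = (I_perp * (1 + P_scat) - I_par * (1 - P_scat)) / (P_scat - P_obj)  # Eq. (1)
        recovered.append(T)
    return recovered
```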

Fig. 5. Key frames of the original and recovery videos. The fish and the XJTU logo are in horizontal motion and rotation, respectively. For the former, the 1st, 60th, 110th, 130th, and 150th frames of the video are selected; for the latter, the frame numbers are 1, 35, 95, 145, and 175. Full videos can be found in Visualization 1 and Visualization 2.

To reduce the bias stemming from subjective evaluation, two parameters, contrast (C) and Enhancement Measure Evaluation (EME), were used to quantitatively compare the imaging quality across different methods. According to [11], contrast (C) is defined as follows:

$$C = \frac{\sigma }{{\bar{I}}} = \frac{{\sqrt {\frac{1}{{M \times N}}\sum\limits_{i = 1}^M {\sum\limits_{j = 1}^N {{{[{I(i,j) - \bar{I}} ]}^2}} } } }}{{\frac{1}{{M \times N}}\sum\limits_{i = 1}^M {\sum\limits_{j = 1}^N {I(i,j)} } }}.$$
where $\sigma $ is the standard deviation of the grayscale values of the pixels in the image; $\bar{I}$ is the average gray value of the image; M and N represent the numbers of rows and columns, respectively; and $I(i,j)$ denotes the gray value of the pixel located in the ith row and jth column. The formula for the EME is presented in [33]:
$$\textrm{EME} = \left|{\frac{1}{{{k_1}{k_2}}}\sum\limits_{{l_1} = 1}^{{k_1}} {\sum\limits_{{l_2} = 1}^{{k_2}} {20\log \frac{{I_{\max ;{l_1},{l_2}}^{}}}{{I_{\min ;{l_1},{l_2}}^{} + q}}} } } \right|.$$
where ${k_1}$ and ${k_2}$ indicate that the image is divided into ${k_1} \times {k_2}$ blocks; ${l_1}$ and ${l_2}$ are the serial numbers of the corresponding horizontal and vertical image blocks, respectively; $I_{\max ;{l_1},{l_2}}^{}$ and $I_{\min ;{l_1},{l_2}}^{}$ denote the maximum and minimum gray values of all the pixels in the image block, respectively; and q is set to 0.0001 to avoid a zero denominator.
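The two metrics can be computed as in the following sketch (our own NumPy implementation; the 8 x 8 block division and the base-10 logarithm are assumed defaults, since Eq. (27) does not specify them).

```python
import numpy as np

def contrast(img):
    """Global contrast C = sigma / mean, Eq. (26)."""
    return img.std() / img.mean()

def eme(img, k1=8, k2=8, q=1e-4):
    """Enhancement measure evaluation over k1 x k2 blocks, Eq. (27)."""
    h, w = img.shape
    bh, bw = h // k1, w // k2
    total = 0.0
    for l1 in range(k1):
        for l2 in range(k2):
            block = img[l1 * bh:(l1 + 1) * bh, l2 * bw:(l2 + 1) * bw]
            total += 20.0 * np.log10(block.max() / (block.min() + q))
    return abs(total / (k1 * k2))
```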

As exhibited in Fig. 5, each frame of the intensity videos is enveloped in a dense layer of haze, leading to extremely poor visualization of the target. Under these circumstances, the proposed method is unaffected by the motion of the target and consistently delivers satisfactory results. In the second row of Fig. 5, along with the elimination of backscattered light, most of the background area in the image is restored to its intended black color. The fish becomes clearly recognizable and the visual effects significantly improve. In the fourth row, the haze in the image is partly suppressed, allowing the details of the target, such as the Chinese characters and letters, to become more discernible. In terms of the quantitative evaluation, the relative positions of the curves in Fig. 6 illustrate that our method considerably increases the values of the evaluation parameters, indicating superior imaging quality.

Fig. 6. Quantitative comparison of the original and recovery videos: parameter curves of the imaging results for the (a) fish and (b) XJTU logo.

An intriguing phenomenon relates to the correlation between the recovery performance and the state of motion of the targets. In particular, the contrast between the two rectangles is striking: although both lie in the background area, the red area is filled with black pixels, indicating that the backscattered light is completely eliminated, whereas an apparent residual haze is observed in the blue region. This outcome can be viewed as a tradeoff made in pursuit of higher processing speed. Because the decomposition and reconstruction are applied only to the first frame, the estimated global distribution better fits the pixels in the red rectangle, as most of the black pixels are located in the right half of the image. As mentioned earlier, the backscattered light exhibits intensity variations in the horizontal direction; therefore, recovering the left half of the image using estimates from the right half produces an error, which manifests in the image as residual haze. Thus, as the fish moves to the right, the background area occupies a larger proportion of the left half, and more pale, haze-induced pixels are generated. This also produces the decreasing trend in the parameter curves. For the XJTU logo, the black pixels originate mainly from the texture details of the target, which precludes the emergence of large black background regions in the image. Because the position of the logo changes little in the horizontal direction, the residual haze in the background area does not deepen progressively as the number of frames increases, and no gradual decline in the evaluation parameters is observed. The slight fluctuations in the curves may be attributed to vibrations during rotation, which shift the target away from the focal position.

Fortunately, residual haze in the background area does not significantly affect imaging, as practical applications place greater emphasis on the visual quality of the target of interest. From this perspective, although the estimation based on the first frame may slightly compromise the fin information, its effect on the integrity of the overall silhouette is negligible, and the characteristic XJTU patterns remain easily distinguishable. This indicates that the proposed approach restores the visual quality of the targets to an acceptable level. In contrast, most previous polarization-based imaging methods are simply inapplicable under these circumstances: their reliance on considerable human-machine interaction makes them time-consuming and impractical for continuous operation.

The above analysis shows that the proposed method is effective for dynamic imaging, marking a significant step toward practical applications.

4. Discussion

4.1 Selection of the bottom image

While forming the image pyramid, S1 or I90 was selected as the bottom image, whereas S0 and I0 were ruled out. The images of the XJTU logo were analyzed to assess the rationale behind this choice.

Because S0 lacks polarization information, only the intensity distribution of the backscattered light can be obtained through pyramid decomposition and reconstruction; in that case, the recovery result is simply the difference between S0 and the reconstructed backscatter intensity. When I0 is used as the bottom image, the processing flow is similar to that of our approach, the key difference being that the decomposition and reconstruction of I0 are conducted to obtain the global distribution of Pscat. The intensity images, the recovery results based on S0 and I0, and our results are shown in Fig. 7. For quantitative comparison, the values of the two evaluation parameters are incorporated directly into the images: the left and right numbers correspond to C and EME, respectively.

Fig. 7. Comparison of the recovery results based on different bottom images. (a) Intensity images. (b) Recovery results based on S0; the bottom image of the image pyramid is S0. (c) Recovery results based on I0; the bottom image is I0. (d) Recovery results obtained by our method; the bottom image is S1.

Figure 7 shows that the recovery results based on S0 and I0 exhibit remarkable descattering performance when the scattering is weak. Although their imaging quality is slightly inferior to that of our results, the visual effects and evaluation parameters of the images are acceptable. As shown in the first rows of Figs. 7(b) and (c), the thin haze is eliminated, and the patterns of the logo become clearer; even the smallest patterns, such as the founding year “1896”, can be discerned clearly. However, this performance is notably diminished under high-turbidity conditions. In particular, the result based on S0 retains a pervasive layer of haze, making it difficult to discern even larger patterns such as the Chinese characters. When the bottom image is switched to I0, the visual effects improve slightly, but the recovery capability remains low compared with that of our method.

The choice of the bottom image in the image pyramid significantly affects the results. In a low-turbidity environment, where the image contrast is not severely reduced, identification of the black pixels in the top image is relatively easy, which facilitates the precise reconstruction of the spatial distribution of backscattered light. However, the imaging quality rapidly deteriorates with increasing turbidity. Moreover, S0 records all the information, including stray light and noise, thereby presenting a considerable obstacle to the search for black pixels. Furthermore, the polarization properties of light cannot be leveraged, which significantly reduces the descattering effectiveness. I0 benefits from the polarization filtering effect owing to the removal of a portion of the noisy light. Nevertheless, as previously discussed, the significant amount of residual backscattered light in the image severely limits the effectiveness of this process and affects imaging quality. In comparison, S1 serves as the bottom image in the proposed method. Owing to polarization differential imaging, it accentuates the target profile while suppressing backscattered light and other stray light. Consequently, S1 has a distinct advantage in identifying black pixels and contributes to excellent recovery outcomes.

4.2 Effect of the number of decomposition layers

In the proposed approach, the bottom image is downsampled several times to remove the target information. Here, the effect of the number of downsampling operations, that is, the number of decomposition layers, on the recovery is discussed. Figure 8 shows the results for different numbers of decomposition layers obtained using the degraded images of the model fish as input.

Fig. 8. Results for the different decomposition layers. Because the size of the image varies with the number of decomposition layers, only C is used as an evaluation parameter. The values of C are placed on both axes for a more pronounced curve trend.

The trend of the two curves shows that there is an optimal number of decomposition layers that yields the best recovery, regardless of the turbidity of the medium. For the images of the model fish, the top image of a four-layer pyramid removes most of the target information. On either side of the optimum, the curves exhibit distinctly different trends, because the two components of the image dominate the two segments of the curve. In the first half, because the target information is discrete, each decomposition removes some target light, so the image quality improves rapidly as the number of decomposition layers increases. Beyond the peak, the removal of the target light is saturated, and the behavior is governed by the backscattered light. Although the background information is continuous, its spatial variation is uneven. In such cases, upsampling by averaging adjacent rows and columns to generate new pixels performs well only where the pixel values vary smoothly; excessive downsampling may exceed this range, so that the reconstructed pixel values differ significantly from their true values. Fortunately, the spatial extent over which the backscattered light is uniform is sufficient for the downsampling used here; hence, no abrupt change in image quality is observed in the second half of the curves. Nevertheless, prolonged downsampling slightly reduces the reconstruction accuracy of the backscattered light, which adversely affects the overall image quality and accounts for the gradual descent of the curves beyond the peak.

Thus, there exists an optimal value for the decomposition layers that maximizes the image quality. This indicates the importance of judiciously selecting the number of decomposition layers when applying the pyramid decomposition to images.

5. Conclusion

In this paper, a method based on image pyramid decomposition and reconstruction is proposed to achieve dynamic and clear underwater imaging. Owing to the accurate estimation of the polarization characteristics of backscattered light, the clear scene can be recovered well. The experimental results indicate the superiority of the proposed method for moving targets with strong or weak depolarization abilities; both the processing speed and recovery effects are satisfactory. Because the DOP of the backscattered light is no longer assumed to be constant, the recovery performance is not susceptible to external factors and remains at a consistently high level. Furthermore, the effects of the two factors influencing the recovery are discussed. The proposed approach avoids the limitation of the background region, thereby broadening the scope of application of underwater image recovery methods, particularly if the target occupies the entire field of view in dynamic imaging.

At present, the imaged scenes contain only targets with a single type of polarization characteristic. Our future studies will examine the recovery of targets with complex polarization characteristics.

Funding

National Natural Science Foundation of China (61890961, 62127813).

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 61890961, 62127813), to which the authors are most grateful.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J. S. Jaffe, “Underwater Optical Imaging: The Past, the Present, and the Prospects,” IEEE J. Oceanic Eng. 40(3), 683–700 (2015). [CrossRef]  

2. K. O. Amer, M. Elbouz, A. Alfalou, et al., “Enhancing underwater optical imaging by using a low-pass polarization filter,” Opt. Express 27(2), 621–643 (2019). [CrossRef]  

3. M. Wang, X. Wang, Y. Zhang, et al., “Range-intensity-profile prior dehazing method for underwater range-gated imaging,” Opt. Express 29(5), 7630–7640 (2021). [CrossRef]  

4. J. S. Jaffe, “Performance bounds on synchronous laser line scan systems,” Opt. Express 13(3), 738–748 (2005). [CrossRef]  

5. D. S. Jeon, A. Meuleman, S. H. Baek, et al., “Polarimetric iToF: Measuring High-Fidelity Depth Through Scattering Media,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2023), pp. 12353–12362.

6. D. Huyan, N. Lagrosas, and T. Shiina, “Optical Properties Analysis of Scattering Media Based on GI-OCT Imaging,” Photonics 10(2), 146 (2023). [CrossRef]  

7. H. Li, J. Zhu, J. Deng, et al., “Influence mechanism of the particle size on underwater active polarization imaging of reflective targets,” Opt. Express 31(5), 7212–7225 (2023). [CrossRef]  

8. T. Liu, Z. Guan, X. Li, et al., “Polarimetric underwater image recovery for color image with crosstalk compensation,” Opt. Lasers Eng. 124, 105833 (2020). [CrossRef]  

9. J. Wang, M. Wan, G. Gu, et al., “Periodic integration-based polarization differential imaging for underwater image restoration,” Opt. Lasers Eng. 149, 106785 (2022). [CrossRef]  

10. Y. Li, R. Ruan, Z. Mi, et al., “An underwater image restoration based on global polarization effects of underwater scene,” Opt. Lasers Eng. 165, 107550 (2023). [CrossRef]  

11. Y. Y. Schechner and N. Karpel, “Recovery of underwater visibility and structure by polarization analysis,” IEEE J. Oceanic Eng. 30(3), 570–587 (2005). [CrossRef]  

12. T. Treibitz and Y. Y. Schechner, “Active Polarization Descattering,” IEEE Trans. Pattern Anal. Mach. Intell. 31(3), 385–399 (2009). [CrossRef]  

13. M. Dubreuil, P. Delrot, I. Leonard, et al., “Exploring underwater target detection by imaging polarimetry and correlation techniques,” Appl. Opt. 52(5), 997–1005 (2013). [CrossRef]  

14. F. Liu, P. Han, Y. Wei, et al., “Deeply seeing through highly turbid water by active polarization imaging,” Opt. Lett. 43(20), 4903–4906 (2018). [CrossRef]  

15. P. Han, F. Liu, Y. Wei, et al., “Optical correlation assists to enhance underwater polarization imaging performance,” Opt. Lasers Eng. 134, 106256 (2020). [CrossRef]  

16. H. Zhang, M. Ren, H. Wang, et al., “Fast processing of underwater polarization imaging based on optical correlation,” Appl. Opt. 60(15), 4462–4468 (2021). [CrossRef]  

17. Q. Song, X. Liu, H. Huang, et al., “Polarization Reconstruction Algorithm of Target Based on the Analysis of Noise in Complex Underwater Environment,” Front. Phys. 10, 813634 (2022). [CrossRef]  

18. Y. Zhang, Q. Cheng, Y. Zhang, et al., “Image-restoration algorithm based on an underwater polarization imaging visualization model,” J. Opt. Soc. Am. A 39(5), 855–865 (2022). [CrossRef]  

19. H. Zhang, J. Gong, M. Ren, et al., “Active Polarization Imaging for Cross-Linear Image Histogram Equalization and Noise Suppression in Highly Turbid Water,” Photonics 10(2), 145 (2023). [CrossRef]  

20. H. Li, J. Zhu, J. Deng, et al., “Underwater active polarization descattering based on a single polarized image,” Opt. Express 31(13), 21988–22000 (2023). [CrossRef]  

21. H. Hu, L. Zhao, X. Li, et al., “Underwater Image Recovery Under the Nonuniform Optical Field Based on Polarimetric Imaging,” IEEE Photonics J. 10(1), 1–9 (2018). [CrossRef]  

22. Y. Wei, P. Han, F. Liu, et al., “Enhancement of underwater vision by fully exploiting the polarization information from the Stokes vector,” Opt. Express 29(14), 22275–22287 (2021). [CrossRef]  

23. Y. Wei, P. Han, F. Liu, et al., “Estimation and removal of backscattered light with nonuniform polarization information in underwater environments,” Opt. Express 30(22), 40208–40220 (2022). [CrossRef]  

24. H. Wang, H. Hu, J. Jiang, et al., “Automatic underwater polarization imaging without background region or any prior,” Opt. Express 29(20), 31283–31295 (2021). [CrossRef]  

25. Y. Zhao, W. He, H. Ren, et al., “Polarization descattering imaging through turbid water without prior knowledge,” Opt. Lasers Eng. 148, 106777 (2022). [CrossRef]  

26. X. Li, J. Xu, L. Zhang, et al., “Underwater image restoration via Stokes decomposition,” Opt. Lett. 47(11), 2854–2857 (2022). [CrossRef]  

27. T. Yu, X. Wang, S. Xi, et al., “Underwater polarization imaging for visibility enhancement of moving targets in turbid environments,” Opt. Express 31(1), 459–468 (2023). [CrossRef]  

28. L. Liu, X. Li, J. Yang, et al., “Fast image visibility enhancement based on active polarization and color constancy for operation in turbid water,” Opt. Express 31(6), 10159–10175 (2023). [CrossRef]  

29. H. Li, J. Zhu, J. Deng, et al., “Visibility enhancement of underwater images based on polarization common-mode rejection of a highly polarized target signal,” Opt. Express 30(24), 43973–43986 (2022). [CrossRef]  

30. J. Zhou, T. Yang, W. Ren, et al., “Underwater image restoration via depth map and illumination estimation based on a single image,” Opt. Express 29(19), 29864–29886 (2021). [CrossRef]  

31. Y. Piederriere, F. Boulvert, J. Cariou, et al., “Backscattered speckle size as a function of polarization: influence of particle-size and -concentration,” Opt. Express 13(13), 5030–5039 (2005). [CrossRef]  

32. A. M. Reza, “Realization of the Contrast Limited Adaptive Histogram Equalization (CLAHE) for real-time image enhancement,” VLSI Signal Process. 38(1), 35–44 (2004). [CrossRef]  

33. S. S. Agaian, K. Panetta, and A. M. Grigoryan, “Transform-based image enhancement algorithms with performance measure,” IEEE Trans. on Image Process. 10(3), 367–382 (2001). [CrossRef]  

Supplementary Material (2)

Visualization 1: Comparison between the intensity imaging results and our results. The target is a fish with strong depolarization abilities, in horizontal motion.
Visualization 2: Comparison between the intensity imaging results and our results. The target is the XJTU logo with weak depolarization abilities, in rotation.


