1 Introduction

Today is a reality that the virus SARS-Cov-2 has become a global pandemic. This novel virus, named coronavirus due to visual appearance—under electron microscopy—similar to a crown [2], is the cause of an infectious disease by severe acute respiratory syndrome (SARS-Cov-2) [3] and named by the World Health Organization (WHO): COVID-19.

Image analysis is a scientific discipline providing theoretical foundations and methods for solving problems appearing in a range of areas as diverse as the pathology, biology, astronomy, medicine, physics, geography, chemistry, robotics and other. Particularly, computer vision is an interdisciplinary field that deals with how computers can develop a high-level understanding by interpreting information present in digital images [4]. It has made substantial progress in the last few years, which might be of interest and importance, not only for image analysis but also, for example, within image segmentation and enhancement, for the development of digital models in the range-space domain [5].

The microscopic study of images of viruses and bacteria is very important and of great need, since it allows us to carry out a more precise and local research on them. However, many times these microscopic images present some blurring problems, which often makes it difficult to analyze correctly to them [1].

Obtaining high-quality microscopic images is not always a trivial process, even in the best microscopes, due to physical phenomena that originate between the glass coverslip, light and microscope optics. Some of these phenomena are: scattering, out-of-focus, imperceptible vibrations, voltage disturbances, among others, which cause certain blurring in microscopic images [1].

In order to improve those problems is necessary to design and develop algorithms that take into account the physics of formation of those microscopic images, as the correlation between pixels, boundary conditions, among others. For this reason, one focuses the greatest attention on the algorithmic part in order to find a solution to the physical phenomena mentioned above. However, many times around these algorithms lies a theoretical basis that is worth analyzing, since it can help to face new problems of similar magnitudes.

The aim of this work is to carry out a theoretical analysis of the proposed algorithms to the enhancement and segmentation of SARS-Cov-2 high-resolution microscopy images. In [1], we addressed the most attention to the application of those algorithms.

The rest of the paper is organized as follows: In Section II, materials and methods and the characteristics of the studied images are given. Section III outlines the Algorithms and related theoretical aspects. Section IV contains the experimental results and discussion. We describe our conclusions in Section V.

2 Materials and methods

As we expressed in [1], we captured microscopic images from nasopharyngeal swabs collected from Cuban individuals with COVID-19 symptomatic and RT-PCR positive for SARS-CoV-2 and processed by scanning electron microscopy, confocal microscopy and atomic force microscopy [6]. We completed a database of more than 1010 microscopic images.

In the case of SARS-CoV-2, the scanning electron microscopy (SEM) images provide fundamental data of the structural aspects of the virus [1, 6], which can be a guiding point to design and development of algorithms and analysis of associated theory.

2.1 Characteristics of the studied images

In this section, we will carry out a more detailed analysis of the characteristics of the studied images with the aim that one can better understand the proposed algorithm in [1], some variants of it and its corresponding theoretical analysis.

In Fig. 1, we show two examples of typical coronavirus images. One can see other examples, captured via SEM, in [1].

Fig. 1
figure 1

Two examples of SARS-Cov-2 coronavirus microscopic images. a and b Original microscopic images

In Fig. 1, one can observe several notable characteristics of these microscopic images of the novel coronavirus. The most notable in these images is its appearance of blurring, which is typical in images captured via SEM. Here, the important issue is to improve and isolate the S-spikes.

It is known that scientists have given the name coronavirus by the crown of S-proteins (also called S-spikes) covering their outer membrane surface. Many works have focused on these S-proteins, because they are the keys that the virus uses to enter host cells, where these S-proteins bind to a receptor called angiotensin-converting enzyme 2 (ACE2) to hack its way into host cells [7].

In addition, studies so far suggest that the S-proteins of the novel coronavirus bind to ACE2 significantly more strongly than those of SARS-CoV-2, and probably, that is one of the reasons it spreads more easily and is more infectious [7].

Everything expressed denotes the importance of enhancing and isolating the S-proteins in the blurring microscopic images of the novel coronavirus, since it helps to better interpret its virulence.

In [1], we presented a set of typical characteristics of microscopic images of the novel coronavirus. One can note in Fig. 1 that the S-proteins of the virus are hyper-dense regions (clearer areas), this being one of the most notable features of these microscopic images. We used this characteristic to design the algorithm we proposed in [1]. In this work, one will see variants of that strategy and a deeper theoretical analysis.

3 Algorithms and related theoretical aspects

We will address in greater depth the theoretical aspects related to the proposed algorithms for enhancement and isolation of the S-proteins.

We are first going to present Table 1 that summarizes some state-of-the-art methods used for processing microscopic images.

Table 1 A state-of-the-art review

In review of Table 1, one can see that the enhancement or segmentation of S-spikes does not appear. Therefore, the novelty of our work is the enhancement of S-proteins. Furthermore, deep learning was fundamentally applied to detect COVID-19 from chest CT radiography digital images [22]. On the other hand, some of the consulted references indicated that the training time of a convolutional network by using deep learning took approximately 4 to 14 h [17]. Our Algorithms do not require any training.

In Fig. (1), one can observe that the S-spikes are clearer small dots than its surrounding background, grouped in areas corresponding to the virus and scattered throughout the image. Then, within a region of microscopic images, changes in intensity levels can effectively distinguish S-spikes, which is indicative that the correlation between pixels, corresponding to the S-proteins, in those areas is stronger than outside them. Therefore, this analysis indicates that one must work in the region of interest (RoI) and using small masks (3 × 3 or 5 × 5 dimension) to carry out the processing [8].

3.1 Some definitions

3.1.1 Filtering process

Most of the time, medical images are corrupted with a lot of noise, due to phenomena that occur in the A/D converter (and often called white noise), which produce attenuation of high spatial frequencies (edges, corners, points, etc.), which makes it difficult to carry out a better interpretation of the scene. This phenomenon also occurs in the microscopic images.

This type of noise is overlapped to important aspects in images, being necessary, for its attenuation, to use alternative strategies. The main problem in the reduction of noise is to get a clean image, that is, without noise, but keeping all attributes of the original scene as could be the shape, size, color, edges, among others.

Many methods have been proposed so far for the elimination of white noise, among them, the “ad hoc” filters, the mean and Gaussian filters [9] and, more recently, the mean-shift iterative algorithm (MSHi) [8], which is more adaptive, but consumes more computational time. For that reason, we used the Gaussian filter with good results, which by the shape of this function attenuates the noise more strongly in the central part and edges are less smoothed.

We carried out several researches with many microscopic images, arriving to the conclusion that the best performance is obtained, according to our application, with σ = 3. We verified that for large values of sigma (σ > 4), the smoothing was poor, because homogeneous areas arose in the filtered image and some of the S-proteins were joined. This was the reason why we decided to carry out our experiments with the parameter equal to 3. The used window size was of 3 × 3 too. We verified that with this window size, we considerably smoothed the noise and did not affect the edges of interest objects (the S-spikes). This is because the S-proteins have a smaller size in relation to its environment (Fig. 1).

3.1.2 Mathematical morphology

Definition 1

(Morphological opening) Morphological opening is the dilation of the erosion of a set I by a structuring element S, that is,

$$ {\text{Opening}} = I \,{^\circ } S = \left( {I { \ominus } S} \right) \oplus S, $$
(1)

where symbols \(\ominus \) and \(\oplus \) denote erosion and dilation, respectively.

Morphological opening removes small objects from the foreground of an image (usually taken as the bright pixels: edges, corners, points, etc.). If one uses a large structuring element, one can obtain the background of image. Many times, opening operates similar to a low-pass filter.

Definition 2

(Decomposition at thresholds) Let Th (I) be the successive thresholds of I, for h = 0 to L-1,

$$ T_{h} (I) = \left\{ {p \in D_{I} / I (p) \le h} \right\}. $$
(2)

Then, they constitute the threshold decomposition of I, and these sets satisfy the following inclusion relationship [5]:

$$ {{T}}_{{{h}}} \left( {{I}} \right) \subseteq {{T}}_{{{{h}} - {1}}} \left( {{I}} \right)\;\;\; \forall {{h}} \in \left[ {0,{{ L}} - {1}} \right] $$
(3)

where I is an image of gray levels, DI is the domain of I and L is the number of gray levels.

Definition 3

(Regional maximum) A regional maximum at altitude h of grayscale image I is a connected component C of Th (I) such that \(C \cap T_{h + 1} \left( f \right) = \varphi\), where symbol φ means the empty set.

One should not confuse a regional maximum with a local maximum. A pixel p of I is a local maximum in an h environment, if and only if the I(p) value is greater than or equal to any of the pixels in your environment. However, a regional maximum is a set of pixels connected at an h height where there is no pixel with a value higher than that height. Therefore, we can consider each of the S-proteins as local and non-regional maxima. One should have this concept clear since, in this application, it is very important for the design of algorithms.

Definition 4

(Grayscale reconstruction) The grayscale reconstruction \(\rho_{I} (J) \, \) of I from J obtained by iterating grayscale dilations of J “under” I until stability is reached, that is,

$$ \rho_{I} \left( J \right) = \mathop \vee \limits_{n \ge 1} \delta_{I}^{(n)} \left( J \right). $$
(4)

Reconstruction is an efficient method for extracting regional maxima and minima in gray-level images.

However, we did not follow this path in the design of our algorithms, because in the initial tests carried out, we did not obtain good results. Reconstruction produced a connection of the S-proteins, which is not convenient in this application. One can find the explanation of this in what we expressed in previous paragraphs. (The S-spikes are local maxima and non-regional.)

Definition 5

The h-dome transformation (D h (I)) of the h-domes of a grayscale image I is given by,

$$ D_{h} \left( I \right) = I - \rho_{I} \left( {I - h} \right). $$
(5)

The h-dome transformation extracts light structures without involving any size or shape criterion. The only parameter (h) is related to the height of these structures. In the case of coronavirus S-spikes isolation, this parameter was of vital importance.

The need of isolating the S-proteins individually has great importance because it will allow establishing a surface density index, where at higher index, the higher the virulence level.

3.1.3 Enhancement and segmentation of images

Definition 6

(Extreme filter). The Extreme filter is defined by the following expression:

$$ g = E_{{{\text{ext}}}} \left( I \right) = \left\{ {\begin{array}{*{20}l} {\max \left( I \right) \;{\text{if}}\;{\text{max}}\left( I \right) - I \le I - {\text{min}}\left( I \right)} \\ {\min \left( I \right) \;\;\;\;\;{\text{otherwise}} } \\ \end{array} } \right. $$
(6)

where max (I) is the maximum value of pixel in the window and min (I) is the minimum value of pixel.

The extreme filter has been widely used to accentuate blurring sides and step sides in biological images. This one is very useful when one combines it with other filters. We used, due to small size of the S-spikes, 3 × 3 window size and utilized the extreme filter in the initial experiments, but we did not obtain good results in the isolation of the S-proteins. (Due to space problems, we did not present the results here.) We concluded that these results (fusion of the S-proteins) were due to that edges of the S-proteins are insignificant (very little thickness) in relation to the morphology of coronavirus.

Definition 7: (High spatial frequencies).

From a digital image processing point of view, the spatial frequency is a measure of how often small details (points, edges, corners, steps, etc.) of a structure repeat per unit of distance (that is, any structure that is periodic or not across position in space).

In this research, we consider the S-proteins as information of high spatial frequencies. Therefore, we focused our goal on these frequencies; that is, we try to enhance their contrast or isolate them.

A typical expression for local contrast enhancement is as follows:

$$ g \left( {x, y} \right) = \mu + k \left( {f\left( {x, y} \right) - \mu } \right) $$
(7)

where symbol “µ” is the mean at a given window and k is a gain factor.

One can consider expression (7) as the equation of a straight line. In addition, the second term is high-frequency information, a measure of local variability in the image, and one can consider “k” as the slope [9].

In fact, if k > 1, the contours of image will be sharpened and its performance is similar to a high-pass filter. If the parameter k is in the closed range, 0 ≤ k ≤ 1, then the image will be smoothed similar to a low-pass filter. In the extreme case, k = 0, the result will be equal to the local average. Many times, the first term in Eq. (7) is replaced by f(x, y), i.e., the original image.

An algorithm by using expression (7) can increase either sharpness or local contrast, because these are both ways of increasing differences between values, since when increasing slope, sharpness referring to very small-scale differences (high frequency) and contrast referring to larger scale (low frequency).

For example, a variant of expression (7) is to replace in the second term to µ by Gaussian function. In this case, one should take both small window size and σ, since one achieves good results when the scale (σ) is of similar dimension to the structures that one wants to enhance (the S-proteins) [10].

What we expressed in the previous paragraph was the fundamental base of the final version of our algorithm, always taking in consideration the spatial relation and implicitly the correlation among the pixels. In this way, the performance of algorithms improves substantially [8].

3.2 Algorithms analysis

In any application of computer vision, the study and characterization of images are of vital importance. The correct design of algorithms and their good performance depend on the quality of this study. The basic idea in this research was to obtain simple hybrid algorithms, without the need to carry out long hours of training, as is the case with deep learning.

In [1], we carried out a characterization of microscopic images captured by scanning electron microscopy, from nasopharyngeal swabs collected from Cuban individuals with COVID-19 symptomatic and RT-PCR positive for SARS-CoV-2.

In this section, we will expose and carry out an analysis of three designed algorithms, taking as base the study of the microscopic images and an appropriate combination of the theories described above.

An aspect to highlight in these strategies is their algorithmic simplicity. Today, many researchers want to apply deep learning to many problems that arise in computer vision, which requires big databases and a computational cost [4, 11, 12]. However, when one carries out a deep study of images, one can obtain good results of fast the way and with good performance of the algorithms, without the need of using complex methods.

Without doubt, deep learning has given very good results, especially in identification of patterns, but in this application one wants to enhance certain spatial frequencies (the S-pikes), which have a random spatial distribution of viruses.

figure a

Observe that we carry out only the subtraction of h parameter in a global way. All next steps, we work locally, and we process similar to the convolutional neural networks.

In effect, in Gaussian and Extreme filters, we compute the spatial size of the output as a function of the input size [11]. In this case, due to the size of the S-spikes, we used a stride equal to one; in this way, we obtained more precision. A bigger stride causes sharpness loss of S-proteins.

figure b

We used Algorithm No. 2 of a similar way that Algorithm No. 1.

In these Algorithms, the value of h parameter determines the increase in local contrast or the isolation of the S-proteins. For a small value of h, the S-spikes are enhanced (more contrasted), while for a bigger value of h, the S-spikes are isolated (practically segmented), where the S-spikes appear clear on a black background.

The importance of enhancing the S-proteins is because those areas that showed a high density of the S-spikes appear to be related to an active zone of viral germination [6]. Therefore, to isolate the S-proteins will allow to establish an “index” that relates the number of the S-proteins per unit area (surface density), where a higher index, a higher degree of virulence.

figure c

See that Algorithm No. 3 is simpler than the previous ones, and we obtained the best results with this strategy.

The computational complexity of our algorithms is of: O(RGl) iterations, where R is the number of partitions and Gl is the average number of different gray levels per partition.

4 Experimental results and discussion

Automated analysis of biomedical images remains a considerably challenging task mainly because biomedical images are complex and variant. Moreover, the SARS-CoV-2 pandemic has demonstrated that the difference between disease and non-disease cases is often subtle.

Therefore, accurate automated analysis of biomedical images requires the development of innovative strategies that are adaptive and scalable. This is possible with a deep study of images that will be processed, which avoids the blind use of complex strategies which consume computational time, and that in the end do not provide the expected results.

In Figs. 2 and 3, we show some examples of the obtained results of applying Algorithm No. 1 to SARS-CoV-2 microscopic images for four values of h parameter.

Fig. 2
figure 2

Application of Algorithm No. 1 for different values of h parameter. a Original image, b h = 30, c h = 60, d h = 80, e h = 100

Fig. 3
figure 3

Application of Algorithm No. 1 for different values of h parameter. a Original image, b h = 30, c h = 60, d h = 80, e h = 100

Note that the processed images of applying Algorithm No. 1 were notably enhanced; that is, contrast increased, and the S-spikes are more evident and show the urchin shape. One can observe that for small values of h, we obtained a better contrasted image (h = 30), while for a higher value (h = 100), the S-spikes have more sharpness, where one can see white areas on a black background. This result permits to carry out an automatic count of white points on a black background area, and one can establish an index.

The advantage of Algorithm No. 1 is that with a single parameter, one can increase either sharpness or local contrast. For an h parameter, a very large practically segmented image can be reached.

Now, we will present the achieved results by using Algorithm No. 2. In Figs. 4 and 5, we show the obtained results, taking, respectively, as original images of Figs. 2a and 3a.

Fig. 4
figure 4

Application of Algorithm No. 2 for a “DiskSE for different radios. Original image from Fig. 2a. a Radio = 15, b Radio = 25, c Radio = 35, d Radio = 45

Fig. 5
figure 5

Application of Algorithm No. 2 for a “DiskSE type for different radios. Original image from Fig. 3a. a Radio = 15, b Radio = 25, c Radio = 35, d Radio = 45

In Algorithm No. 2, the choice of a structuring element type “Disk” was due to the round morphology presented by the SARS-CoV-2 coronavirus. We verified that the radius size has a behavior contrary of the h parameter (Algorithm No. 1), in the sense that for smaller radius size increases the sharpness (Figs. 4a and 5a), but for larger size increases the local contrast (Figs. 4d and 5d).

Visually comparing the performance of Algorithms No. 1 and No. 2, one can note that Algorithm No. 1 accentuates more the high frequencies and produces a greater local contrast and greater sharpness in the S-proteins. With this Algorithm (No. 1), the background looks darker.

In Figs. 6 and 7, one can observe that Algorithm No. 3 produced the bigger local contrast and sharpness, and the S-pikes were notably highlighted. One can note that for an h bigger than 80, the processed image is close to a segmented image (the S-spikes appear as white points), and the background is practically zero. All contrary occurs for small h, where Algorithm No. 3 only enhanced the local contrast.

Fig. 6
figure 6

Application of Algorithm No. 3 for different values of h parameter. Original image from Fig. 2a. a h = 30, b h = 60, c h = 80, d h = 100

Fig. 7
figure 7

Application of Algorithm No. 3 for different values of h parameter. Original image from Fig. 3a. a h = 30, b h = 60, c h = 80, d h = 100

One can see that both algorithms were able to enhance the local contrast and sharpness of the S-proteins.

We point out that the slopes of profile curves, by using Algorithm No. 3, were more abrupt than those obtained by using Algorithms No. 1 and No. 2 (we, due to space problems, did not present the profile curves), which is indicative of higher contrast and sharpness.

It is important to note that the proposed Algorithms are hybrid strategies, obtained by a combination of different methods (mathematical morphology, statistical and other). One can observe the algorithmic simplicity of them, with the use of very few adjustment parameters, which makes them more adaptable to real problems. In these Algorithms, it is not necessary to carry out any training.

4.1 Comparisons with other methods and quantitative evaluation

In this section, we will carry out a comparison of our strategies with two recognized classical techniques. These classical techniques are a high-pass filter (HPF) and a statistical local contrast enhancement (SLCE) [9]. However, such a comparison will always be incomplete given the information volume existing in the literature.

In Figs. 8 and 9, we show the obtained results with the HPF, with the SLCE and with our algorithms. As original images, we used Figs. 2a and 3a.

Fig. 8
figure 8

Comparisons with two classical methods. Original image from Fig. 2a. a Algorithm No. 1 with h = 80, b Algorithm No. 2 with R = 35, c Algorithm No. 3 with h = 80, d HPF with window size equal to 3, e HPF with window size equal to 5, f SLCE with window size equal to 3

Fig. 9
figure 9

Comparisons with two classical methods. Original image from Fig. 3a. a Algorithm No. 1 with h = 80, b Algorithm No. 2 with R = 35, c Algorithm No. 3 with h = 80, d HPF with window size equal to 3, e HPF with window size equal to 5, f SLCE with window size equal to 3

When observing Figs. 8 and 9, one can see that the SLCE did not produce good results, because the S-spikes were not well discriminated, and it is possible to note an over-saturation. The obtained results when working with the HPF were a little better, where the S-spikes were more highlighted, but the background was not completely dark, and a good contrast in relation to the S-proteins was not achieved. However, our Algorithms (especially No. 3) accentuated and isolated the S-spikes, reaching a background practically equal zero, and obtained better contrast.

Qualitative comparison always generates certain subjectivity. For that reason, in order to achieve a more objective criterion of the performance of Algorithms, it is necessary to carry out a quantitative comparison.

4.1.1 Quantitative comparison among the algorithms

Due to the lack of truth images, the quantitative evaluation of enhancement and segmentation techniques is difficult to achieve. According to many authors, this remains an open problem [14]. For that reason, some quantitative comparison with manual segmentation may provide a useful indication. Manual segmentation generally gives the best and most reliable results when identifying structures for a particular clinical task (in this case, the S-spikes).

Thus, for this comparison a virology specialist outlined the S-spikes from images of Figs. 2a and 3a, in which it was an extremely tedious work. The specialist made a bank of 15 images, of which for reasons of space, we only will show two of them. The background of these images is equal to zero. Figure 10 shows the images.

Fig. 10
figure 10

Manually contrasted images from Figs. 2a and 3a that we considered as true

Tables 2 and 3 list the results of comparisons of manually contrasted images with those obtained by our Algorithms and with the HPF and SLCE. We utilized as metrics of evaluation: Accuracy, Sensitivity, Precision, Specificity and F-Measure [14,15,16].

Table 2 Comparison of manually contrasted images with those obtained by our Algorithms, and with the HPF and SLCE. Taking Fig. 10a as True
Table 3 Comparison of manually contrasted images with those obtained by our Algorithms, and with the HPF and SLCE. Taking Fig. 10b as True

Tables 2 and 3 reflect that the SLCE method was the worst of all. In all images an over-saturation is produced an over-saturation and joined the S-spikes, which it is manifested in the low percent obtained in the metrics (see last rows in Tables 2 and 3).

One can see in Tables 2 and 3 the high percent obtained in the used metrics in order to evaluate quantitatively the effectiveness of Algorithm No. 3 (see row 3). In particular, one can note the percent in the metrics of Sensitivity and Specificity, which many authors consider important performance measures in medical images [16]. These scores show that in all cases Algorithm No. 3 had good performance in the enhancement and segmentation of the S-spikes, which is in line with what one can appreciate in Figs. 8c and 9c.

5 Conclusions

We carried out a theoretical analysis of the proposed algorithms to the enhancement and segmentation of the SARS-CoV-2 high-resolution microscopy images. We showed that the Algorithms arose as result of the combination of techniques in several fields of computer vision, originating hybrid strategies with a minimum amount of parameters and elegant algorithmic simplicity. We also verified the importance of the h parameter to obtain a background close to zero and to increase the local contrast and sharpness of the S-proteins. Finally, we carried out a quantitative evaluation to achieve a more objective comparison with two recognized classical algorithms. The proposed algorithms can be applied to other types of bacteria and new viruses of high-resolution microscopy images.