Keywords

1 Introduction

Image segmentation is a process that classifies image constituents into two or more groups that represent some distinct aspects of the data to highlight relevant image features for further analysis. The simplest segmentation task is binary thresholding, which defines imaged object boundaries by dividing constituents into background and foreground depending on whether their grayscale value meets a selected gray-value threshold. There are several more complex thresholding methods that are classifiable by the locality of their thresholds, by the assumptions made about voxel connectivity, and by the extent that they can be automated. Examples of common methods are Otsu’s method [1], region growing methods [2], and shape-based methods like geodesic active contours [3].

Segmentation is one of the most difficult problems in image processing and remains an active area of development [4]. Despite decades of research, there exists no universal segmentation method that would produce best results in all cases. One reason is that there is no absolute ground truth (GT), and therefore, a single right answer. Another reason is that segmentation methods vary in their sensitivity to artifacts and image quality. Most segmentation methods are specialized and perform best on the object and modality they were designed for. In practice, segmentation methods are considered and selected separately for each task. Three-dimensional (3D) segmentation faces additional challenges due to large datasets, and because computationally intensive segmentation algorithms, like shape-based algorithms, require much more memory and calculation power than simple thresholding.

Nowadays, volumetric imaging is often used in the medical field, from devices such as magnetic resonance imaging or computed tomography (CT), and several applications require volumetric segmentation of tissues [5]. One of the biological structures that is relevant to assess in 3D is the metabolically active trabecular bone, this one being at upmost interest in studies of musculoskeletal disorders. Pathological bone conditions such as osteoarthritis and osteoporosis are linked to small changes in trabecular bone microstructure that can be assessed from 3D micro-computed tomography (µCT) images [6].

However, while high-resolution imaging can be used for in vitro CT studies, the clinical resolution is limited due to radiation levels which must be kept low [7]. Low resolution imaging increases the impact of artifacts such as the partial volume effect (PVE), and leads to challenging segmentations. PVE comprises a class of imaging resolution related artifacts that limit how well a reconstructed voxel can represent its object location [8].

The lack of generic automated algorithms increases the effect of human factors like time consumed, interpersonal variance and systematic error. Together these factors have negative effect on 3D scanned image segmentation and subsequent structural analysis as well as clinical diagnosis. Local binary patterns (LBP) based methods represent a promising but still under-investigated alternative solution to conventional manual and semi-automatic 3D segmentation methods [5, 9].

This paper presents a novel automated adaptive multiscale local binary patterns (LBP)-based 3D segmentation method (AMLM), which extends an existing trabecular bone tissue specific segmentation method with customized adaptive local thresholding to improve its robustness. We estimate scanner and resolution specific thresholding parameters using bone phantom scans at three resolutions, and then evaluate the adaptive thresholding by comparing automated and traditional binarization results of two bovine subchondral bone samples scanned with the same equipment. Section 2 introduces the used LBP-based method. Section 3 describes the measurements in detail. Section 4 presents the results and Sect. 5 concludes the paper.

2 Multiscale LBP-Based 3D Segmentation Method Using Adaptive Local Thresholding

2.1 Introduction to LBP

LBP is an image operator that assigns each pixel a descriptor value, which is obtained by thresholding gray-values of a pixel neighborhood point pattern and interpreting the result as a binary number. This LBP code can be used to classify different neighborhood patterns. The original LBP method was introduced for image texture analysis, where it has been shown to be both computationally efficient and insensitive to global variations of grayscale values [10,11,12,13].

The original LBP neighborhood consisted of the eight pixels adjacent to a center pixel, but the model has since been extended to larger and circularly shaped neighborhoods with bilinear interpolation [11]. Figure 1 (left) illustrates the principle of the 8-bit LBP code calculation using a circular neighborhood with eight ordered members.

Fig. 1.
figure 1

Left: 8-bit LBP code calculation of a pixel (solid black) in 2D with eight LBP neighbors (gray). The code value is the sum of the bit values of the neighborhood elements (solid gray) whose interpolated gray-value (number inside the element) equals or exceeds the center pixel gray-value. In this example, the LBP code is 1 + 4 + 8 + 16 + 32 + 64 = 125. Right: Partial multiscale neighborhood of a voxel (solid black, at the origo of this frame) in the default neighborhood configuration. The spherical inner neighborhood at radius R 1 = 1 consists of 26 points (gray). The remaining points (solid black) at radius 2 at the top represent the outer neighborhood cluster of the topmost inner neighborhood point (solid gray).

The circular LBP operator takes or interpolates the P neighboring point gray-values g p at radius R from the center pixel, thresholds them using the center pixel gray-value g c , then encodes non-negative results at (zero-based) neighbor positions p into binary values 2p and calculates their sum [11].

2.2 Multiscale LBP-Based 3D Segmentation

Multiscale LBP-based 3D segmentation method (MLM) is a new automated segmentation method that has been suggested as an alternative to the common binary thresholding for analyzing bone microstructures in CT images [5]. It has recently been validated for the segmentation of µCT scans of osteoarthritic trabecular bone and the subsequent statistical analysis [9].

MLM is based on LBP but differs from it in certain ways. Major difference from traditional LBP is that patterns are evaluated using a global threshold instead of the voxel gray-value or other local threshold. In other words, pattern elements are defined in effect by their tissue membership instead of their relative local activity.

In addition, MLM examines neighborhood patterns on two nested levels in a 3D volume, firstly for the voxel itself and secondly for its inner neighborhood points. Figure 1 (right) illustrates the geometry of the MLM neighborhood. The MLM groups and analyzes the nested patterns to label continuous bone structures and their edges and to set disconnected bone voxels to the background.

In general, the inner neighborhood consists of a spherical set of N 1 vertices at radius R 1, and the full outer neighborhood set comprises N 1 patches of small spherical caps further away at radius R 2 in the direction of each inner neighborhood point. Outer neighborhood patches are based on the polar vertex neighborhood of a vertex sphere with radius R 2 and N 2 uniformly distributed vertices.

2.3 Local Mean Based Thresholding: Bradley’s Method and Adaptive Mean Thresholding

Bradley’s thresholding method was introduced for 2D binarization of digital grayscale documents with varying levels of illumination [14]. The principle of the method is straightforward. First, a local threshold map B(x,y) is created by mean filtering a 2D source image f(x,y) and downscaling the result f µ(x,y) by T percent. The source image is then compared to the threshold map, and each pixel with gray-value equal to or greater than its corresponding local threshold value is assigned to the foreground. The method requires two external parameters, (isotropic) mean filter kernel size W and downscaling adjustment percentage T. The actual threshold map B(x,y) is

$$ B(x,y) = \left( {1 - \frac{T}{100}} \right)f_{\mu } (x,y ). $$
(1)

The calculation of the mean filtered image f µ can be expressed as

$$ f_{\mu } (x,y )= \frac{1}{{W^{2} }}\sum\nolimits_{i = x - r}^{x + r} {\sum\nolimits_{j = y - r}^{y + r} {f(x,y)} }. $$
(2)

where r is the axial extent of the filter mask from its center, and W is short for 2r + 1, the actual width of the mean filter kernel.

The other 2D method, adaptive mean thresholding (also known as mean–C) [15], is very much like Bradley’s method. The difference is that it subtracts an arbitrary constant value C from the mean filtered image instead of downscaling it with a weight factor. The threshold calculation can be done using the following:

$$ B(x,y) = f_{\mu } (x,y) - C. $$
(3)

An important feature of both mean thresholding methods is that the calculation of neighborhood mean values can be conducted very efficiently by using a pre-calculated integral image, which eliminates explicit sum calculation for local means. The revised calculations comprise simple addition and subtraction of indexed values, and the number of operations is fixed regardless of the kernel size W. This makes mean filtering based adaptive thresholding methods fast compared to e.g. median filtering. The performance gain is advantageous in 3D image analysis, where datasets can be very large. Hence, for this application, these computationally simple and relatively fast mean adaptive thresholding algorithms present therefore a promising local alternative to global thresholding.

2.4 MLM with Adaptive Local Thresholding (AMLM)

The AMLM is an extension of the MLM with the same basic principles. It integrates the adaptive local methods in 3D to threshold and evaluate neighborhood patterns. This and other distinguishing features of AMLM are highlighted in Fig. 2.

Fig. 2.
figure 2

Major features that make the AMLM segmentation (right) different from MLM.

The MLM uses a global value for its initial thresholding and LBP calculations. We replaced it with a map of local threshold values for this experiment to make the method more robust, especially if the gray-value intensity varies over the imaged region.

Our thresholding method is a generalized 3D adaptation of two existing 2D binary thresholding methods, the adaptive mean thresholding and Bradley’s method. The volumetric extension of the mean formula can be expressed as

$$ f_{\mu } (x,y,z) = \, \frac{1}{{W^{3} }}\sum\nolimits_{i = x - r}^{x + r} {\sum\nolimits_{j = y - r}^{y + r} {\sum\nolimits_{k = z - r}^{z + r} {f(x,y,z)} } }. $$
(4)

and the corresponding hybrid formula that incorporates both previously introduced 2D methods becomes

$$ B(x,y,z) = \left( {1 - \frac{T}{100}} \right)f_{\mu } (x,y,z) + C. $$
(5)

The hybrid method requires three external parameters: mean filter kernel size W, local mean volume scaling percent adjustment T and constant adjustment C.

Inner border labeling was added to AMLM based on the hypothesis that its inclusion would complement the set of likely partial volume voxels in the tissue border area and give more calculation options to improve the accuracy of subsequent volumetric LBP analysis.

AMLM segmentation is affected by several parameters. One set of parameters configures the initial volumetric thresholding, which in effect determines the segmented tissue volume. Another parameter set defines the neighborhood parameters for the LBP-based multiscale segmentation algorithm, which further segments the inner and outer object borders in the initial binary volume and relabels apparent disconnected voxels to the background. We estimate these two parameter sets separately, firstly because they do not have a large effect on each other, and secondly because coupling would be impractical due to large number of possible parameter combinations.

3 Materials and Methods

Scanned media were cylindrical subchondral trabecular bone samples (diameter 10 mm, length 20 mm) of two bovine lateral proximal tibias and a cylindrical 8 mm diameter thickness calibration phantom (SkyScan SP-4001, Bruker MicroCT) containing four 2 mm wide aluminum foil plates with nominal thicknesses of 20 µm, 50 µm, 125 µm and 250 µm (±10% tolerance). All samples were wrapped in foam and oriented horizontally (in proximal-tibial orientation) for scanning. We selected thresholding/segmentation volume of interest (VOI) from the middle of the scan volumes. The volumes were small enough to facilitate effective processing and large enough to produce meaningful trabecular measurements (6.7 mm × 6.7 mm × 6.7 mm).

We scanned the media using a µCT device (Skyscan 1172, Bruker microCT, Kontich, Belgium) set up with an Al (0.5 mm) filter, complete 360° rotation, a step size of 0.5°, and averaging of 3 frames. We used three camera settings: 4000 × 2672 pixels (1 × 1 binning), 2000 × 1336 pixels (2 × 2 binning), and 1000 × 668 pixels (4 × 4 binning), enabling resolutions of 8.71 μm, 17.42 µm, and 34.84 µm. These resolutions are referred to by the next highest integer sizes (9 µm, 18 µm, and 35 µm). Respective exposure time, acceleration voltage, and current settings were: 1300 ms, 50 kV, 500 µA; 350 ms, 50 kV, 500 µA; 90 ms, 40 kV, 476 µA (phantom) / 500 µA (samples).

We developed the automatic segmentation scripts with MATLAB (version R2016a 64-bit, MathWorks) and performed the automatic segmentation of the VOIs on a laptop (Fujitsu Lifebook NH532, Intel® Core™ i5-3210M CPU @ 2.50 GHz with 2 cores, 4 logical processors, 16 GB RAM) using the previously selected thresholding and neighborhood parameters.

3.1 Estimation of Segmentation Parameters

We calculated the volumetric trabecular bone thickness (Tb.Th) of segmented µCT thickness phantom scans for all three resolutions using a number of different thresholding parameter sets. We ignored the least reliable measurements of plates with nominal thickness less than double the voxel size (20 µm plate at the 18 µm resolution, 20 µm and 50 µm plates at the 35 µm resolution). We calculated parameter combinations that best matched the nominal and measured phantom plate thicknesses for each resolution. We used the mean of relative (percent) plate thickness errors between the nominal and measured values as a goodness indicator. The results yielded resolution specific thresholding parameters (W, T, C) for configuring the automatic segmentation.

We segmented all trabecular bone sample VOIs using AMLM with the default (global) MLM thresholding for different neighborhood parameter sets {R 1 × R 2 × N 1 × N 2}. The set of all test neighborhood parameter sets was R 1 × R 2 × N 1 × N 2, where R 1 ∈ {0.69, 1.00, 1.46}, R 2 ∈ {1.69, 2.00, 2.46}, N 1 ∈ {18, 26, 38}, and N 2 ∈ {54, 78, 114}. Subsequently, we measured the thresholding and processing times and calculated the structural similarity index (SSIM) between reconstructed and segmented CT volumes and its linear correlation with different parameters. After that, we analyzed the results and selected the best neighborhood parameter set for the automatic segmentation.

We measured segmentation times and calculated the bone volume fraction (BV/TV) and average Tb.Th of each segmented volume with the CT Analyzer (CTAn) application (version 1.16.4.1+ 64-bit, Bruker microCT). We configured the 35 µm AMLM thresholding also visually with an experimental segmentation preview tool that we developed with MATLAB for this purpose. With its help, we previewed the sample 1 to select a set of thresholding parameters and used them to segment both samples with AMLM. In addition, we calculated the mean BV/TV and Tb.Th measures from the traditionally segmented VOIs.

3.2 Evaluation of Automated Adaptive Segmentation

The adaptive 3D thresholding method was tested against traditional user-dependent segmentation. The same segmentation scans, of the two trabecular bone samples scanned at three different resolutions with a µCT device, were binarized by two experienced users separately with CTAn (version 1.14.4.1). Both people experimented with their preferred image operations and parameters (filters, morphological, segmentation) until the results were visually acceptable, and measured the time spent on each VOI.

Subsequently, we took the GT reference measures from the highest resolution VOIs, which we thresholded (Otsu’s method) and analyzed with CTAn (version 1.16.4.1+ 64-bit). Finally, we compared the measurements of the traditional computer-assisted segmentation and AMLM segmentation to each other and to the GT.

4 Results

4.1 Estimation of Segmentation Parameters

We selected the thresholding parameter values in Table 1 for each resolution based on the measurements of phantom scans segmented with different parameters. Parameter estimates were expected to be less reliable at lower resolutions due to lower image quality and smaller number of measurable plates. The large mean percent error at the lowest resolution is therefore not surprising.

Table 1. Selected thresholding parameters for each resolution based on segmented phantom scan measurements; W (px) = mean filter kernel size; T (%) = mean image downscaling adjustment; C * 256 = downscaled mean image absolute adjustment as normalized fraction value multiplied by 256; µδ (%) = mean percent error from the nominal phantom plate thickness

We calculated sample correlation coefficients µ r of the values of tested neighborhood parameters with corresponding measurements of the SSIM index and thresholding time and averaged them over the VOIs in Table 2. The results show that both SSIM and segmentation time express the strongest linear dependency on the inner neighborhood radius parameter R 1.

Table 2. Average sample correlation coefficients µ r of the tested neighborhood parameter values with similarity and segmentation time; constant values are R 1 = 1.46 and N 1 = 38; SD = standard deviation

Keeping the inner radius R 1 constant with the parameter value that gives the greatest similarity, the next set of averaged correlations show that N 1 is the next relevant parameter whose value correlates with SSIM. The measurements of the remaining parameters R 2 and N 2 do not significantly correlate with SSIM when the more significant parameters R 1 and N 1 were set constant with values that maximized their contribution to SSIM. Based on these results, we selected the neighborhood parameter values for automatic segmentation as R 1 = 1.46, N 1 = 38, R 2 = 2, and N 2 = 78.

4.2 Evaluation of Automated Adaptive Segmentation

The AMLM segmentation was performed using the previously selected thresholding and neighborhood parameters. We measured the average BV/TV and Tb.Th as well as binary thresholding and complete trabecular sub-label segmentation times as shown in Table 3. The data in the table include measurements using the alternate visual setup for the 35 µm volumes, as well as the GT reference measurements. The 35 µm resolution BV/TV and Tb.Th measurements of the original AMLM configuration are markedly different from the other measurements. Binarization times are very small compared to trabecular segmentation times.

Table 3. Measurements from AMLM segmentation results except where indicated; * = Automatic Otsu’s thresholding of the best resolution (GT); ** = Binarized using the alternative thresholding parameters (W = 3, T = 53, C = 60)

We measured the BV/TV and Tb.Th of the traditionally binarized volumes as reference for comparison with the automatic segmentation as shown in Table 4. Segmentation takes several minutes, and time deviation is large. Measured BV/TV and Tb.Th values tend to be greater at lower resolutions.

Table 4. Average measures from traditional binary segmentation; SD = standard deviation

Comparison of automatic AMLM segmentation results to traditional segmentation in Fig. 3 shows that the AMLM structural measurements are slightly lower but comparable, except for the measures taken from the lowest resolution VOIs, which are significantly lower. Binarization time with the parameterized 3D hybrid method is negligible compared to traditional thresholding even at the highest resolution, where the VOI consists of 2563 voxels.

Fig. 3.
figure 3

Percent ratio of the automatic and traditional segmentation measurements for different VOIs (sample, resolution). * = Segmented using the alternative AMLM thresholding parameters.

Figure 4 compares measured structural parameters of AMLM and traditional segmentation to those of the best resolution VOIs (GT). All measurements from AMLM segmented VOIs are closer to our GT than traditionally measured values. This is true especially for the measurements of Tb.Th, which is not surprising, because AMLM was parameterized with thickness measurements. Both the traditional and automatic AMLM segmentation tend to overestimate the bone volume fraction and trabecular thickness, except for the AMLM measures at the lowest resolution.

Fig. 4.
figure 4

Percent error of the automatic and traditional segmentation measurements compared to the best resolution raw image measurements. * = Segmented using the alternative AMLM thresholding parameters.

5 Conclusion

In this study, we segmented trabecular bone µCT scans with the LBP-based AMLM segmentation. We have demonstrated a successful application of the generalization of the adaptive mean thresholding and Bradley’s method in 3D as part of the segmentation. To our knowledge, neither of these thresholding methods has been adapted into 3D and applied to analysis of medical images before.

We have shown how neighborhood parameter adjustment affects the perceived similarity of the segmentation result to the source volume and suggested more optimal values based on the results.

We compared the automatic method to the traditional segmentation performed by two experienced users. The automatic method outperformed traditional segmentation in binarization speed by about 20000% in the worst case, and its relative BV/TV and Tb.Th measurement errors were respectively 33% and 57% smaller on the average. This indicates that the method was successful for the used equipment and samples. However, the varied Tb.Th measurement results indicate that the automatic segmentation method configuration is not reliable at low resolutions. Corresponding measurements of low resolution VOI segmentation configured visually with adaptive thresholding tool are more in line with the other measurements, and further suggest that thresholding parameter configuration is a problem.

The hybrid adaptive mean thresholding method was selected, because neither Bradley’s method nor mean–C could produce generally satisfactory results by themselves, and mean filter was preferred over median and Gaussian filters because of its superior speed performance in this setting. This was also the reason why more sophisticated segmentation methods were rejected. However, these considerations will change if more efficient methods become available, or if computation power and memory become significantly less of a factor in the future.

A drawback of the new adaptive local thresholding is that it requires three external parameters. Optimal parameters cannot be based on simple rules, and they must be determined experimentally for each tissue, imaging modality, and resolution, for example with a thickness phantom like here. Even then, parameter configuration takes time and the manual process is prone to inaccuracies, especially at low resolutions, which undermines the repeatability of automated segmentation. Configuration can be made easier and more reliable with an interactive thresholding preview tool, which facilitates experimenting with different parameters starting with reasonable default values. We developed such tool and used it with improved results.

As future work, the segmentation method could be tested with more varied data to determine how our results are able to generalize and to which extent they depend on the characteristics of the data. Also, the performance of the AMLM compared to the original method remains to be evaluated in statistical bone microstructure analysis. The 3D thresholding method might find use in other applications where speed or computational simplicity is important.