1 Introduction

Image registration [1, 2] is a cornerstone of medical image analysis: it establishes spatial correspondence among the anatomical structures depicted in different medical images and is a key technology for realizing precision medicine [3]. Consequently, medical image registration plays an important role in clinical applications such as precise disease diagnosis [4], atlas analysis [5], image-guided radiotherapy [6] and surgical navigation [7]. Until recently, image registration was mostly performed manually by clinicians. However, many registration tasks can be quite challenging, and the quality of manual alignments is highly dependent upon the expertise of the user, which can be clinically disadvantageous [8]. These issues make automated registration methods urgently needed for clinical use.

As a preprocessing step for multimodal image fusion [9], multimodal image registration aligns information from different modalities into a common space, making it easier for doctors to accurately observe lesions and anatomical structures with detailed diagnostic information. By registering dynamic images collected at different times, changes in lesions and organs can be quantitatively analyzed, making medical diagnosis, surgical planning, and radiation therapy planning more accurate and reliable. Since different modalities exhibit different characteristics, finding fast and accurate correspondences between images of different modalities remains a challenge.

In multimodal registration, the greatest difficulty comes from the great variability exhibited by organ and tissue appearances when imaged under different physical principles, which results in the lack of a general rule for establishing structure correspondence [10]. In this case, the similarity measure forms the central part of the objective function, making its choice the most important and challenging step. In particular, the similarity landscape between the images to be registered contains numerous local extrema, which makes multimodal image registration a difficult optimization problem. These aspects consequently make optimization algorithms crucial parts of image registration.

Many optimization algorithms have been applied to multimodal registration. Compared to traditional local search methods such as Powell’s method or the Levenberg-Marquardt (LM) method, metaheuristic-based methods [11,12,13,14,15] are excellent in that they rarely fall into local extrema, stably obtain global optima, are robust, and exhibit insensitivity to noise disturbances. Many metaheuristic algorithms have been applied in the past decade as optimization methods for medical image registration [16]. Nevertheless, metaheuristic algorithms are usually employed in situations where suboptimal solutions can be easily found but optimal solutions cannot be obtained in a stable manner [17,18,19]. One reason for this phenomenon is that metaheuristic algorithms are usually sensitive to the initial values of the transformation parameters, which often lead the entire optimization process to fall into local extrema. Another reason is that in the latter stage of the registration process, the optimization procedure requires the algorithm to have good local search capabilities to obtain the optimal transformation matrix. These inner characteristics of intensity-based methods impose high requirements on the utilized optimization algorithm: it must have good global and local search capabilities and a good balance between exploration and exploitation [20].

As some of the most common metaheuristic algorithms, differential evolution (DE) and its variants have been applied to solve various optimization problems [21]. However, the direct application of DE to medical image registration still encounters many problems. For example, the medical image registration task is very demanding in terms of exploitation, which is often used to measure the pros and cons of the employed optimization algorithm. The most important shortcomings faced when applying DE to medical image registration are as follows. (1) DE occasionally falls into local minima, making it unable to stably converge to the optimal region. (2) DE is capable of finding suboptimal solutions but incapable of stably attaining the optimum. In this case, it is necessary to replace DE with a modified algorithm to better complete the registration task.

Civicioglu et al. [22] proposed a modified DE method, named the Bernstein search-based DE (BSD) algorithm, by adopting 2nd-degree Bernstein polynomials that cooperate with a random crossover process. BSD has been shown to outperform approaches such as the artificial bee colony (ABC) algorithm [23], adaptive differential evolution (JADE) algorithm [24], cuckoo search algorithm (CUCKOO) [25], and weighted differential evolution (WDE) algorithm [26] in numerical function optimization and image evolution problems. However, BSD derives its crossover ratio from 2nd-degree Bernstein polynomials, so its search strategy possesses some shortcomings. (1) The 2nd-degree Bernstein polynomial curves are fixed, which is not conducive to the variability of the algorithm. (2) The search pattern of BSD is not suitable for medical image registration, especially in the later exploitation stage of the registration process. Hence, we propose a modified BSD algorithm, a highly efficient metaheuristic named the normal vibration distribution search-based differential evolution algorithm (NVSA), to address these problems.

The NVSA utilizes a new crossover method and search strategy. These improvements enhance both the exploitation and exploration abilities of the proposed method and further balance them in medical image registration scenarios. The method is designed to address the problems that metaheuristic algorithms often consume excessive computational resources and achieve low precision. In addition, two-dimensional linear registration requires optimizing only three to six parameters, which enables the NVSA to complete the registration of rigid and similarity transformations without preregistration.

In the last decade, researchers have tended to use some classic approaches such as ANTS [27], Elastix [28] and FSL [29] to complete simple registration tasks. The advantage of these classic applications is that they do not require pretraining to quickly obtain results. Many deep learning methods also tend to use the above software to perform the registration task as a preprocessing step (linear) for deep learning (nonlinear) [30,31,32]. Convenient as these classic tools are, their accuracy still leaves much to be desired. Conversely, the proposed method can better complete the registration task with acceptable time consumption. Therefore, the NVSA has the potential to replace the commonly used linear registration method as a state-of-the-art (SOTA) approach.

We experimentally compare the proposed method, referred to as the NVSA, with nine benchmark metaheuristic algorithms and three frequently used applications on computed tomography (CT) and magnetic resonance imaging (MRI) brain images. The results show that the NVSA can achieve optimal accuracy at a reasonable computational cost, and more importantly, it outperforms the benchmark algorithms and classic registration methods. We further utilize Friedman’s mean rank test [33] and Bonferroni correction-adjusted significance to statistically prove that the results are not obtained by chance. The experimental results further suggest that different problem instances require different proportions of exploitation and exploration. Both conducted experiments confirm the importance of our new adaptive approach underpinned by the proposed mutation methods in the NVSA algorithm. The main contributions of our work are summarized as follows.

  • We propose a new metaheuristic algorithm called the NVSA, which is modified from BSD by replacing the \(2^{nd}\)-degree Bernstein polynomials with a novel normal vibration distribution and coordinating the approach with a variable search pattern.

  • We compare the NVSA with five other metaheuristic algorithms on 23 classic optimization functions and summarize the results in terms of accuracy, robustness and significance level.

  • We compare the NVSA with five other metaheuristic algorithms on 16 brain images derived from real-world patients across 41 different types of multimodal registration tasks and summarize the results in terms of accuracy, time consumption, robustness and statistics. Moreover, we present supplementary figures including histograms for the test images, visualizations of the registration process, and population quality figures based on the number of iterations.

  • We test the proposed algorithm against three classic medical image registration methods that are frequently used in various medical image registration scenarios. The experimental results prove that the proposed method has the potential to surpass and replace the existing classic registration tools in linear registration tasks.

  • The present study demonstrates that the application of metaheuristic-based methods can effectively improve the accuracy and efficacy of linear registration tasks over that achieved with classic methods. These findings indicate that our work holds great promise for addressing the practical clinical needs of nonrigid registration as a preprocessing procedure.

The remainder of this paper is organized as follows. In Sect. 2, related works are presented. The basic BSD algorithm and the innovations and working principles of the NVSA are described in Sect. 3. Then, Sect. 4 demonstrates experiments involving numerical function-based optimization problems. Section 5 indicates the experimental results obtained with images derived from real-world patients with tables and figures. Finally, we conclude the paper and briefly discuss future work.

2 Related works

2.1 Medical image registration techniques

Classic medical image registration methods can be roughly categorized as either feature-based or intensity-based approaches [34]. Feature-based methods, exemplified by the famous scale-invariant feature transform (SIFT) algorithm [35], obtain a geometric transformation by extracting corresponding features such as points, lines, vectors, surfaces, and volumes. To a certain extent, these features can reduce the complexity of the registration problem. However, the performance of this type of algorithm mainly depends on its feature design, and the effectiveness of the selected features directly affects the subsequent registration results. Usually, features with conspicuous characteristics are required, and the design of image features with strong expressive abilities remains an open problem in registration tasks. On the other hand, for specific image data, finely designed artificial features may provide satisfactory registration results, but there is no guarantee that these features will be well-suited for other data registration tasks.

Regarding intensity-based methods, their process, consisting of similarity measurement and optimization, has become the major approach for medical image registration due to its high registration accuracy and lack of preprocessing requirements [36]. Generally, pairwise intensity-based medical image registration methods focus on optimizing image similarity, a metric that indicates how well image intensities correspond. The goal of pairwise registration is to find transformation parameters that bring two images into correspondence based on their contents. Typically, this process is solved by iteratively optimizing a predefined handcrafted intensity-based dissimilarity metric over the transformation parameters.

Recently, deep learning-based methods have been widely used in medical image registration [37, 38]. These approaches can be simply divided into three categories: deep similarity metrics and supervised and unsupervised transformation estimation methods [39]. Data-driven methods can effectively complete the registration process in a very short time after being trained within a suitable framework. However, this idea often relies on large amounts of data labeling and preprocessing, which imposes high requirements on the training data and causes the training process to take a long time. In addition, each trained model usually cannot be effectively transplanted to other datasets. For deep iterative frameworks, an application-specific similarity metric that is learned by neural network architectures such as convolutional neural networks (CNNs) is needed [40], which indicates that an optimization algorithm is also necessary during that process. Other nonparametric approaches tend to be roughly preregistered, while some utilize a multistep process by refining the affine transformation parameters to predict large, global displacements and rotations [41]. It is worth noting that most deep learning-based methods focus on deformable registration rather than linear registration. Compared to commercial software, researchers tend to use open-source tools such as ANTS [27], Elastix [28] and FSL [29] to accomplish linear as well as coarse registration. In this work, the three commonly used preregistration methods mentioned above are also included in the comparison.

Fig. 1

The flowchart of the registration process. The main registration process involves finding a transformation that maximizes or minimizes the similarity metrics between the warped moving image and the corresponding fixed image

2.2 Intensity-based medical image registration algorithms

Intensity-based image registration is a crucial process that involves aligning two or more images by analyzing the intensity values of their pixels. The process is iterative and consists of four key components, including a similarity metric, a parameterizable transformation, an interpolation approach, and an optimization strategy, as illustrated in Fig. 1.

As medical image registration is an optimization problem, an appropriate optimization algorithm is essential for obtaining superior transformation parameters. However, many metaheuristic algorithms that perform well in unconstrained optimization problems may not be suitable for real problems due to the complexity of such problems. For example, the similarity measure employed for unregistered images is usually nonlinear and has numerous local extrema, making it difficult to find the globally best solution. Therefore, an appropriate optimization algorithm can benefit the registration process by finding the best-fitting transformation parameters between the unregistered images in a fast and reliable manner. To address this issue, high-performance algorithms are needed to satisfy the high requirements of intensity-based image registration.

Table 1 Comparison of multimodal registration methods adopting metaheuristic algorithms

Recently, with the development of multimodal medical image registration, various methods that use metaheuristic algorithms for optimization purposes have been proposed. As shown in Table 1, optimization algorithms including the coral reef optimization algorithm with substrate layers (CRO-SL) [42], the dragonfly algorithm (DA) [43], the gray wolf optimizer (GWO) [44], biogeography-based optimization with elite learning (BBOEL) [45], the hybrid differential search algorithm (HDSA) [46] and the united equilibrium optimizer (UEO) [47] are applied to intensity-based medical image registration. These algorithms have demonstrated their superiority over many of the other algorithms listed above. However, these methods have certain limitations that restrict their applicability to general multimodal brain tasks. For example, some of these algorithms are based on rigid transformations [44,45,46,47] and do not consider affine transformations, which are indispensable in practical applications. In addition, some algorithms are time-consuming [43, 45] and have not been compared with classic registration tools [43,44,45,46,47], which limits their practicality.

As seen in Table 1, the normalized mutual information (NMI) [48] is a commonly used measure for calculating global information intensities; other similarity measures such as the cross-cumulative residual entropy (CCRE) [49] have also been proposed. In general, similarity measures such as the NMI and CCRE provide global evaluations of registration quality, while other measures such as normalized cross-correlation (NCC) and the root mean square error (RMSE) [50] may be more localized. Choosing a similarity measure such as the NMI that tends to be global can improve the stability of the utilized algorithm during its iterative process; however, this comes at the cost of ignoring the relevance of some detail areas. On the contrary, if a similarity measure that tends to be local is used as the optimization function, the resulting registration performance may be overestimated despite poor actual alignment results. For example, a high NCC value may not necessarily indicate superior registration performance.

To evaluate the algorithm proposed in this paper more fairly, we use the NMI both as the objective function to be optimized and as the global similarity measure for reporting results. In addition, the RMSE is chosen as an auxiliary similarity measure for a more comprehensive evaluation of the registration results. This approach ensures a better overall registration effect and provides corroborating evidence of the quality of the obtained registration results. The NMI and RMSE of a moving image M and a fixed image F are defined in Eqs. (1) and (2), respectively:

$$\begin{aligned} \mathrm{NMI(M,F)}=\frac{H(M)+H(F)}{2H(M,F)} \end{aligned}$$
(1)

Here, H indicates the Shannon entropy, and H(M, F) is the joint entropy of M and F.

$$\begin{aligned} \textrm{RMSE} = \sqrt{\frac{1}{n} \sum _{i=1}^n(M_{i}-{F}_{i})^2} \end{aligned}$$
(2)

Here n indicates the number of image pixels.
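As a concrete illustration, both measures can be estimated directly from image arrays, as in the following NumPy sketch; the joint-histogram estimator, the bin count, and the function names are our own assumptions rather than part of the original pipeline.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a probability vector, ignoring zero bins."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def nmi(m, f, bins=32):
    """Eq. (1): NMI(M, F) = (H(M) + H(F)) / (2 * H(M, F)), histogram-estimated."""
    joint, _, _ = np.histogram2d(m.ravel(), f.ravel(), bins=bins)
    pxy = joint / joint.sum()
    h_m, h_f = entropy(pxy.sum(axis=1)), entropy(pxy.sum(axis=0))
    return (h_m + h_f) / (2.0 * entropy(pxy.ravel()))

def rmse(m, f):
    """Eq. (2): root mean square error over all n pixels."""
    return np.sqrt(np.mean((np.asarray(m, float) - np.asarray(f, float)) ** 2))
```

Under this definition, identical images give NMI = 1 (since H(M, F) = H(M) = H(F)) and RMSE = 0.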

3 Proposed method

3.1 BSD

BSD [22] was inspired by Bernstein polynomials; it is considered a new universal differential evolution (uDE) algorithm due to its lack of internal control parameters. BSD is described as an easily controllable, simple-structured, nonrecursive, highly efficient, fast, and practically parameter-free uDE algorithm. In addition, BSD proved its superiority over other algorithms such as the ABC algorithm [23], the JADE algorithm [24], CUCKOO [25], and the WDE algorithm [26] in comparisons concerning numerical function optimization and image evolution problems. The essential parts of the BSD algorithm are summarized as follows, and the relevant nomenclature is summarized in Table 2:

Table 2 Nomenclature

3.1.1 Initialization and function evaluation

Similar to other uDE algorithms, in BSD, N individuals are randomly generated in a D-dimensional search space within a given boundary range (up and low) using Eq. (3):

$$\begin{aligned} {P}_{i,j}^{\rm{Initial}}={\rm{low}}_{j}+\alpha \cdot ({\rm{up}}_{j}-{\rm{low}}_{j}) \ | \ \alpha \sim {\textbf {U}} (0,1) \qquad {i\in [1:N];j\in [1:D]} \end{aligned}$$
(3)

Then, the function evaluation values fitP are calculated based on the objective function f using Eq. (4):

$$\begin{aligned} {fitP}_{i}=f({P}_{i}) \qquad {i\in [1:N]}, \end{aligned}$$
(4)
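Under the notation above, initialization and evaluation can be sketched in NumPy as follows; the function names, the 6-parameter search space, and the sphere objective are illustrative assumptions, not part of BSD itself.

```python
import numpy as np

rng = np.random.default_rng(42)

def initialize(N, D, low, up):
    """Eq. (3): each component drawn uniformly between its lower and upper bound."""
    return low + rng.random((N, D)) * (up - low)

def evaluate(P, f):
    """Eq. (4): objective value fitP_i = f(P_i) for every pattern vector."""
    return np.array([f(p) for p in P])

# illustrative usage: a 6-parameter search space with a sphere objective
low, up = np.full(6, -10.0), np.full(6, 10.0)
P = initialize(50, 6, low, up)
fitP = evaluate(P, lambda x: float(np.sum(x ** 2)))
```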

3.1.2 Setting the best pattern vector

The best pattern vectors obtained thus far \({best}_{PV}\) are defined as Eq. (5), and the objective function value calculated with this set of vectors \({best}_{sol}\) is regarded as the global best vector and produces the best fitness value.

$$\begin{aligned}{}[{best}_{sol},{best}_{PV}] = [{fitP}_{(\gamma )},{P}_{(\gamma )}] \ | \ {fitP}_{(\gamma )}=min(fitP) \qquad {\gamma \in [1:N]} \end{aligned}$$
(5)

3.1.3 Crossover ratio

BSD manipulates the crossover ratio [51, 52] with an activation matrix M. The initial matrix M is a zero matrix. For every individual i, M is determined by using Eq. (6):

$$\begin{aligned} {M}_{i,u(1:[\rho \cdot D])}=1 \ | \ u = Permute([1:D]) \end{aligned}$$
(6)

where \(Permute(\cdot )\) randomly rearranges the order of the elements of its argument, and \(\rho\) is defined using Eq. (7):

$$\begin{aligned} \rho = {\left\{ \begin{array}{ll} (1-\beta )^2 &{} k=1 \\ 2\beta \cdot (1-\beta ) &{} k=2 \\ \beta ^2 &{} k=3 \end{array}\right. } \end{aligned}$$
(7)

where \(\beta \sim {\textbf {U}} (0,1)\) and \(k\in [1:3]\). The value of \(\rho\) is calculated with \(2^{nd}\)-degree Bernstein polynomials: as shown in Fig. 2, a \(\beta\) generated in the range of [0, 1] is assigned to one of three cases that obey the \(2^{nd}\)-degree Bernstein polynomials.
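In NumPy terms, Eqs. (6)-(7) can be sketched as follows; treating \([\rho \cdot D]\) as a ceiling with at least one active dimension is our reading of the notation, and the function names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def bernstein_rho():
    """Eq. (7): evaluate one randomly chosen 2nd-degree Bernstein polynomial at beta."""
    beta = rng.random()
    k = rng.integers(1, 4)                    # k uniformly drawn from {1, 2, 3}
    if k == 1:
        return (1 - beta) ** 2
    if k == 2:
        return 2 * beta * (1 - beta)
    return beta ** 2

def activation_matrix(N, D):
    """Eq. (6): per individual, activate the first [rho*D] entries of a random permutation."""
    M = np.zeros((N, D))
    for i in range(N):
        u = rng.permutation(D)                # Permute([1:D])
        n_active = max(int(np.ceil(bernstein_rho() * D)), 1)
        M[i, u[:n_active]] = 1
    return M
```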

Fig. 2

\(2^{nd}\)-degree Bernstein polynomials

3.1.4 Search range of the population

The BSD algorithm computes the search range of the population using Eq. (8). Here, the parameters \(\eta \sim {\textbf {U}} (0,1)\) and \(\lambda \sim {\textbf {N}} (0,1)\) represent uniform and normal distributions, respectively. u and v are dynamic selectors whose default settings are \(u\sim {\textbf {U}} (0,1)\) and \(v\sim {\textbf {U}} (0,1)\), respectively. Furthermore, a \((\cdot ,\cdot )\) sized all-ones matrix is defined as \(Q(\cdot ,\cdot )\) = 1.

$$\begin{aligned} F= {\left\{ \begin{array}{ll} ([\eta _{(1,1:D)}^3\ \times |\lambda _{(1,1:D)}^3|]' \cdot Q_{(1,1:N)})' &{} u<v \\ \lambda _{(N,1)}^3\ {\times Q}_{(1,D)} &{} u\ge {v} \end{array}\right. } \end{aligned}$$
(8)
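Read element-wise, the two cases of Eq. (8) correspond to a single step-size row shared by all individuals versus a per-individual scalar shared across dimensions; the following NumPy sketch is our literal transcription of that notation.

```python
import numpy as np

rng = np.random.default_rng(1)

def search_range_bsd(N, D):
    """Eq. (8): BSD's search-range matrix F."""
    u, v = rng.random(), rng.random()
    if u < v:
        # one (1, D) row of eta^3 * |lambda^3|, repeated for every individual
        row = rng.random((1, D)) ** 3 * np.abs(rng.standard_normal((1, D)) ** 3)
        return np.repeat(row, N, axis=0)
    # one scalar lambda^3 per individual, repeated across all dimensions
    col = rng.standard_normal((N, 1)) ** 3
    return np.repeat(col, D, axis=1)
```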

3.1.5 Pattern vectors

Based on the best pattern vectors obtained thus far \({best}_{PV}\) and the current pattern vectors P, the trial pattern vectors T are generated to evaluate new objective function values:

$$\begin{aligned} T = P+F\cdot M\cdot E \cdot {(W^*)}^3+\left( 1-{(W^*)}^3\right) \cdot {best}_{PV}-P \ | \ {W^*}_{(1:N,1)} \sim {\textbf {U}} (0,1) \end{aligned}$$
(9)

where \(E=W \cdot {P}_{L1}+(1-W)\cdot {P}_{L2} \ | \ {W}_{1:N,1:D}\sim {\textbf {U}} (0,1)\), and L1 and L2 are defined in Eq. (10):

$$\begin{aligned} L1=\textrm{Permute}\left( 1:N\right) ,\ L2=\textrm{Permute}\left( 1:N\right) \ | \ L1\ne \left[ 1:N\right] ,L1\ne L2 \end{aligned}$$
(10)
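A literal NumPy reading of Eqs. (9)-(10), with all products taken element-wise, might look as follows; enforcing L1 as a derangement (no fixed points) is a slightly stronger condition than the printed \(L1\ne [1:N]\) and is our assumption.

```python
import numpy as np

rng = np.random.default_rng(2)

def trial_vectors(P, best_pv, F, M):
    """Eqs. (9)-(10): generate the trial pattern vectors T."""
    N, D = P.shape
    while True:                               # L1, L2: random index permutations, Eq. (10)
        L1, L2 = rng.permutation(N), rng.permutation(N)
        if not np.any(L1 == np.arange(N)) and not np.array_equal(L1, L2):
            break
    W = rng.random((N, D))
    E = W * P[L1] + (1 - W) * P[L2]
    w_star = rng.random((N, 1))               # W*, one value per individual
    return P + F * M * E * w_star ** 3 + (1 - w_star ** 3) * best_pv - P
```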

3.1.6 Boundary control

If an element of a trial pattern vector exceeds the preset search space, it is randomly regenerated within the search space based on the boundary control strategy defined in Eq. (11).

$$\begin{aligned} {T}_{i,j}=\textrm{low}_j+{\alpha } \cdot (\textrm{up}_j-\textrm{low}_{j})\ |\ T_{i,j} < \textrm{low}_j \ \textrm{or}\ T_{i,j} > \textrm{up}_j\ | \ \alpha \sim {\textbf {U}} (0,1) \end{aligned}$$
(11)
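Eq. (11) amounts to resampling every out-of-range component uniformly within its bounds, e.g.:

```python
import numpy as np

rng = np.random.default_rng(3)

def boundary_control(T, low, up):
    """Eq. (11): regenerate out-of-range entries uniformly inside [low, up]."""
    out_of_range = (T < low) | (T > up)
    resampled = low + rng.random(T.shape) * (up - low)
    return np.where(out_of_range, resampled, T)
```

Components already inside the bounds are left untouched; only the violating entries are redrawn.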

3.2 NVSA

In this section, the motivation and working principles of the NVSA are introduced; then, the efficiency of the NVSA is discussed. We improve upon BSD by replacing the \(2^{nd}\)-degree Bernstein polynomials with the normal vibration distribution. This vibration can be run by default or fine-tuned with some control parameters. Correspondingly, we redesign the update method for the trial pattern vector and adjust it with the search strategy.

3.2.1 Normal vibration distribution

The BSD algorithm computes the crossover ratio with 2nd-degree Bernstein polynomials. To be specific, every individual randomly chooses a value from zero to one, which is computed by one of the three Bernstein polynomial functions in Eq. (7). As Fig. 2 illustrates, most of the B(t) values tend to be located in a small area. For example, given a random t value, there is a 77.8% probability that the number of population activations B(t) is less than or equal to 0.5, which means that most of the best-so-far candidates remain unchanged in each iteration. Notably, this working principle favors exploitation over exploration.

To better balance the exploration and exploitation abilities of the developed algorithm, we initially try to replace the Bernstein polynomial functions with the normal distribution. However, a simple normal distribution does not successfully meet our verification standards due to the fact that a single normal distribution is unable to satisfy the crossover property as well as the mutation needs of the algorithm. Thus, a vibration function is added to the normal distribution to increase its population diversity. Finally, to strengthen the robustness of the proposed method when facing various real-world problems, we also set some control parameters to control the vibration process. Here, the NVSA proposes a new method to obtain the activation matrix M using Eq. (12) and applies Eq. (13) to replace Eq. (7) in the BSD algorithm.

$$\begin{aligned} \rho = \root G \of {exp} + 2k_1 (1-k_1)- H \end{aligned}$$
(12)
$$\begin{aligned} H = \textrm{exp}^{(\frac{1}{G(k_4-k_5)^2+P})} \ | \ G=a+(b-a)k_3;\ P= c+(d-c)k_2 \end{aligned}$$
(13)

where exp indicates the natural constant e, and \(k_{1-5}\sim {\textbf {U}} (0,1)\) are drawn from the uniform distribution. a, b, c, and d are parameters for controlling the vibration, and their default values are set as 1000, 10,000, 3, and 5, respectively. As a result, the values of G and P are decided by the above four parameters, with G and P randomly generated in [1000, 10,000] and [3, 5], respectively. It is worth noting that the four parameters mentioned above can be manually set when facing various optimization problems.
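With the default parameters, the crossover ratio of Eqs. (12)-(13) can be sampled as in the following sketch (the function name is ours):

```python
import numpy as np

rng = np.random.default_rng(4)

def nvsa_rho(a=1000, b=10_000, c=3, d=5):
    """Eqs. (12)-(13): sample rho from the normal vibration distribution."""
    k1, k2, k3, k4, k5 = rng.random(5)
    G = a + (b - a) * k3                  # vibration width control, here in [1000, 10000]
    P = c + (d - c) * k2                  # height control, here in [3, 5]
    H = np.exp(1.0 / (G * (k4 - k5) ** 2 + P))
    return np.exp(1.0 / G) + 2 * k1 * (1 - k1) - H
```

With the default a, b, c and d, the sampled value stays roughly within [-0.4, 0.5]; the vibration term H produces the occasional sharp dips.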

For ease of understanding, Table 3 and Fig. 3 illustrate different kinds of vibration with normal distributions. To be specific, we give each of the parameters G, P, and k5 three different values in Table 3, along with the corresponding graphs in Fig. 3, to help the reader better understand how the vibration function works. For example, V0(t), V1(t), and V2(t) in Fig. 3a utilize three different G values (1000, 5000, 10,000), while \(P=4\) and \(k5=0.5\) remain unchanged.

It can be seen that G controls the vibration width: the larger the value of G is, the narrower the vibration (see Fig. 3a). P controls the maximum value of the entire normal distribution: the smaller the value of P is, the larger the height of the entire distribution (see Fig. 3b). Additionally, k5 decides where the vibration appears; the locations of vibrations tend to move from left to right as the value of k5 increases from zero to one (see Fig. 3c). In short, a vibration can be generated at any point of the underlying normal distribution with a random vibration width (see Fig. 3d).

Compared to the \(\rho\) obtained using 2nd-degree Bernstein polynomials in Eq. (7), this novel normal vibration distribution (1) increases the probability of the overall crossover ratio, which is helpful for exploration in the early stage; (2) replaces the constant 2nd-degree Bernstein polynomials with a dynamic normal distribution that changes over time, which helps promote high algorithmic diversity; and (3) makes the algorithm more variable than the ordinary normal distribution with the addition of vibration.

Table 3 Parameter values corresponding to Fig. 3
Fig. 3

Illustration of four kinds of random parameter settings; the specific parameter settings are given in Table 3

3.2.2 Search strategy

In this section, we propose a new search pattern to cooperate with the normal vibration distribution. This is because the search strategy of the original BSD algorithm may not suit the proposed method.

That is, the NVSA uses a smaller search step size to match the high-frequency crossover probability. Another reason is that the NVSA is presented to solve medical image registration problems, which requires us to make corresponding improvements to fit the task's needs. To be specific, two places are modified. First, we adjust the search range of the population using Eq. (14).

$$\begin{aligned} F= {\left\{ \begin{array}{ll} \eta _{(1:N,1:D)}^3 &{} u^3<v \\ \lambda _{(1:N,1:D)} &{} u^3\ge {v} \end{array}\right. } \end{aligned}$$
(14)

As in Eq. (8), the parameters \(\eta \sim {\textbf {U}} (0,1)\), \(v\sim {\textbf {U}} (0,1)\), and \(u\sim {\textbf {U}} (0,1)\) here are for the uniform distribution, while \(\lambda \sim {\textbf {N}} (0,1)\) is used for the normal distribution.

Compared to Eq. (8), these modifications simplify the generation of F and adjust the search step size of the population to enhance the exploitation ability of the algorithm. This improvement makes the NVSA tend toward active iteration with smaller search steps, so the algorithm is updated steadily during the early stage of the iterative process and finds the global optimum more easily in the later stage. By combining this improvement with the normal vibration distribution, the NVSA better enhances its exploration capability in the early stage and its exploitation ability in the later stage.
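The simplified search range of Eq. (14) can be transcribed directly (our element-wise reading):

```python
import numpy as np

rng = np.random.default_rng(5)

def search_range_nvsa(N, D):
    """Eq. (14): NVSA's search-range matrix F."""
    u, v = rng.random(), rng.random()
    if u ** 3 < v:
        return rng.random((N, D)) ** 3    # eta^3: small positive steps
    return rng.standard_normal((N, D))    # lambda: signed, occasionally larger steps
```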

Second, we enhance the variability of Eq. (9) by replacing the associated matrices with independent versions in Eq. (15).

$$\begin{aligned} T = P+M\cdot F \cdot \left( s \cdot \left( w_1 \cdot P_{L1}\right) +w_2 \cdot P_{L2}\right) +w_3 \cdot {best}_{PV}-P \end{aligned}$$
(15)

where L1 and L2 remain unchanged compared to Eq. (10). \({s}_{(1:N,1:D)} \sim {\textbf {N}} (0,1)\) follows the normal distribution, while \(w_1\), \(w_2\) and \(w_3\) represent three independent uniform distributions; \({w}_{1-3,(1:N,1:D)}\sim {\textbf {U}} (0,1)\).

Two aspects motivate this modification. First, we notice that \(W^*\) and \((1-W^*)\) interfere with each other in Eq. (9), which may decrease the diversity of the population. Second, \({W^*}_{(1:N)}\sim {\textbf {U}} (0,1)\) only provides a single function for restricting the search range. Hence, we make two changes: first, we replace W, \((1-W)\), \(W^*\), and \((1-W^*)\) in Eq. (9) with four independent parameters s, w1, w2 and w3, respectively. These unrelated parameters can enhance the diversity of the population.

The second change is to replace \({W^*}_{(1:N)}\sim {\textbf {U}} (0,1)\) with the new parameter \({s}_{(1:N,1:D)}\sim {\textbf {N}} (0,1)\). In s, D stands for the dimensionality of the optimization problem, which makes s a matrix rather than a vector. As s obeys a normal distribution, it can not only reduce the step size but also enlarge it. In addition, s also affects the search direction of the algorithm, which supplements the search orientation in the first case of Eq. (9).

Finally, compared to F and T in Eqs. (8) and (9), this modified search strategy (1) adjusts the search range of the population so that the search strategy can be more efficiently matched with the normal vibration distribution; (2) changes the interrelated random parameters into unrelated parameters, which improves the diversity of the algorithm; and (3) uses the normal distribution to expand the selectivity of the search direction of the algorithm so that the algorithm does not easily fall into local optima.
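Putting the pieces together, Eq. (15) can be sketched as follows; the grouping of terms follows the printed formula, all products are taken element-wise, and the derangement condition on L1 is our assumption, as before:

```python
import numpy as np

rng = np.random.default_rng(6)

def trial_vectors_nvsa(P, best_pv, F, M):
    """Eq. (15): NVSA trial pattern vectors with independent random matrices."""
    N, D = P.shape
    while True:                               # L1, L2 as in Eq. (10)
        L1, L2 = rng.permutation(N), rng.permutation(N)
        if not np.any(L1 == np.arange(N)) and not np.array_equal(L1, L2):
            break
    s = rng.standard_normal((N, D))           # signed step scaling, N(0, 1)
    w1, w2, w3 = rng.random((3, N, D))        # three independent U(0, 1) matrices
    return P + M * F * (s * (w1 * P[L1]) + w2 * P[L2]) + w3 * best_pv - P
```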

In the end, we propose a brand-new adjustable mutation strategy with a corresponding search structure named the NVSA. These modifications (1) improve the early-stage global convergence ability of the algorithm; (2) increase the diversity of the algorithm through a new mutation strategy; (3) provide more directions for the search process during the optimization procedure by redesigning the search strategy; and (4) improve the adaptability of the algorithm to different optimization problems by adding adjustable control parameters. The pseudocode of the NVSA for medical image registration is shown in Algorithm 1.

Algorithm 1

4 Results obtained on benchmark functions

In this section, the NVSA is compared to two categories of existing optimization methods on optimization test problems: (1) Bernstein search-based algorithms, including BSD [22] and the Bezier search-based DE algorithm (BeSD) [53]; and (2) recently developed high-performance optimizers, including the coronavirus herd immunity optimizer (CHIO) [54], the Archimedes optimization algorithm (archOA) [55], and the Bernstein-Levy DE algorithm (BDE) [56].

Reference [22] demonstrated that BSD can outperform the ABC [23], JADE [24], CUCKOO [57] and WDE [26] algorithms. Furthermore, the BeSD [53] and BDE [56] algorithms are both modified versions of BSD that share similar structures; they have proven their superiority over the covariance matrix learning and searching preference (CRMLSP) method, the mean-variance optimization algorithm (MVO), without approximation optimization (WA), SHADE and LSHADE in solving numerical problems. The most relevant experimental settings are as follows.

  • PlatEMO [58] is adopted as the optimization platform, and the default parameters of each algorithm are used.

  • The algorithms use 50 particles and 1000 iterations with 30 independent runs (50 \(\times\) 1000 = 50,000 cost function evaluations per run).

  • We use 23 well-known classic benchmark functions [58].

  • The mean and standard deviation (STD) values are used as performance indicators.

  • A two-tailed Wilcoxon signed rank test is used for the statistical comparison of the experimental results, with the level of significance set to 0.05.

  • Testing is performed using MATLAB R2022a on a Windows 10 operating system with an Intel (R) Core (TM) i7-8550U CPU @ 1.80–2.00 GHz and 16.00 GB of DDR4 RAM.
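The per-pair decision behind the (+, −, =) notation used in the tables below can be sketched with SciPy's Wilcoxon signed rank test. The run values here are invented for illustration; the helper names are ours:

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical best objective values from paired independent runs (minimization)
nvsa_runs = np.array([0.012, 0.009, 0.015, 0.011, 0.010, 0.013, 0.008, 0.014])
other_runs = nvsa_runs + np.array([0.009, 0.010, 0.011, 0.007,
                                   0.012, 0.008, 0.013, 0.006])

# Two-tailed test at the 0.05 significance level, as in the protocol above
stat, p = wilcoxon(nvsa_runs, other_runs, alternative="two-sided")

if p >= 0.05:
    verdict = "="  # statistically equal
else:
    # "-" means the NVSA is statistically better, "+" means the rival is
    verdict = "-" if nvsa_runs.mean() < other_runs.mean() else "+"
```

With eight pairs all favoring the same algorithm, the exact two-sided p-value falls well below 0.05, so the pair would be counted under (−).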

Table 4 Comparative results obtained by using 50 particles on the classic benchmark functions. The best results are highlighted in bold
Table 5 Comparison of the classic benchmark problem-solving successes of the NVSA and the tested methods using the Wilcoxon signed rank test (p = 0.05)

In Table 4, we report the mean and standard deviation values (in brackets) obtained by the NVSA and the tested methods for the \(SOP_{F1}-SOP_{F23}\) problems. In Table 5, we present the results of a statistical comparison, based on the Wilcoxon signed rank test (p = 0.05), of the numerical problem-solving success of the NVSA and the tested methods for \(SOP_{F1}-SOP_{F23}\).

In the last row of Table 5, the results obtained from the NVSA and the other tested methods are compared in terms of (+, −, =). (+) represents that the tested method obtains a statistically better result than the NVSA, (−) signifies that the NVSA obtains a statistically better result than the tested method, and (=) denotes that the performances of the NVSA and the related tested method are statistically equal. With 50 particles, 1000 iterations and 30 independent runs, the results are as follows: archOA (12, 11, 0), BDE (2, 12, 9), BeSD (3, 12, 8), BSD (3, 11, 9), and CHIO (10, 12, 1).

In summary, the results verify the superior performance of the NVSA over the other five metaheuristic algorithms in solving various benchmark functions. These two tables demonstrate that the NVSA is significantly better than all of the Bernstein search-based algorithms (BSD, BeSD and BDE). Moreover, the NVSA is slightly superior to CHIO and competitive with archOA.

5 Multimodal medical image registration results

5.1 Materials

In the following experiments, the Retrospective Image Registration Evaluation (RIRE) database [59] is chosen as the dataset because it is one of the most frequently used datasets for benchmarking the performance of metaheuristic algorithms in multimodal image registration tasks [60]. We take CT images (512 \(\times\) 512 \(\times\) 28–34 voxels) as moving images and MR images, including T1, T2 and PD images (256 \(\times\) 256 \(\times\) 20–26 voxels), as fixed images. Furthermore, the pixel intensities of these images are normalized to [0, 256] to better measure and accomplish the image registration task. Figure 4 illustrates the original and normalized CT and MR images of patient-001.
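The intensity normalization step can be sketched as a simple min-max rescaling. The target range [0, 256] follows the text, while the function name and implementation details are our assumption:

```python
import numpy as np

def normalize_intensity(img, lo=0.0, hi=256.0):
    """Min-max rescale an image's intensities to the range [lo, hi]."""
    img = img.astype(np.float64)
    span = img.max() - img.min()
    if span == 0.0:
        # Constant image: map every pixel to the lower bound
        return np.full_like(img, lo)
    return lo + (img - img.min()) / span * (hi - lo)
```

Mapping CT and MR intensities to a common range makes the histogram-based similarity measures comparable across modalities, as reflected in the normalized histograms of Fig. 4.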

Fig. 4

Histograms of the patient-001 original images (top) and normalized images (bottom)

Fig. 5

Visualization of the registration process for the multimodal P002 CT-T1 image pair using six different metaheuristic algorithms

5.2 Comparisons with SOTA algorithms

In this section, six metaheuristic algorithms, including the proposed NVSA and the original BSD algorithm [22], the high-performance DA [43] and GWO [44] algorithms, and the SOTA BBOEL [45] and UEO [47] algorithms, are considered as the optimization methods for finding the best similarity metric between the moving and fixed images. To match the previous work, we refer to the parameter settings in [42, 45], which include a population size of 30, a maximum number of generations of 100, a rotation angle in the range of [0, 360] and a translation in the range of [−30,30]. The middle slices of the CT and MR images are taken as the moving and fixed images, respectively.
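The search space described above (rotation in [0, 360], translation in [−30, 30], plus scale) corresponds to a similarity transform. A homogeneous-matrix builder for the four parameters is sketched below; the matrix convention is a standard one and our assumption, not taken from the paper:

```python
import numpy as np

def similarity_matrix(theta_deg, scale, tx, ty):
    """2-D similarity transform (rotation, isotropic scale, translation)
    in homogeneous coordinates."""
    t = np.deg2rad(theta_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([
        [scale * c, -scale * s, tx],
        [scale * s,  scale * c, ty],
        [0.0,        0.0,       1.0],
    ])
```

Each candidate solution maintained by an optimizer is then simply a 4-vector (theta, scale, tx, ty), and evaluating it amounts to warping the moving image with this matrix and computing the similarity measure against the fixed image.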

In addition, we choose a similarity transformation as the experimental model, in which four parameters of the transformation matrix must be optimized, and a total of 41 different multimodal registration scenarios are used to obtain the experimental results. For each registration scenario, we perform 20 independent runs of each algorithm. In particular, we select the mean NMI and RMSE values as the evaluation indices; notably, higher NMI values and lower RMSE values represent better results. Figure 5 illustrates the final multimodal medical image registration results obtained using the six metaheuristic algorithms.
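The two evaluation indices can be computed as follows. The joint-histogram NMI follows the standard Studholme definition \((H(A)+H(B))/H(A,B)\); the bin count is our assumption:

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a probability vector, ignoring zero bins."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def nmi(a, b, bins=32):
    """Normalized mutual information (H(A) + H(B)) / H(A, B); ranges in (1, 2]."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pj = joint / joint.sum()
    pa, pb = pj.sum(axis=1), pj.sum(axis=0)
    return (entropy(pa) + entropy(pb)) / entropy(pj.ravel())

def rmse(a, b):
    """Root-mean-square error between two images of equal shape."""
    return float(np.sqrt(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)))
```

Identical images yield the maximum NMI of 2 and an RMSE of 0, which is why higher NMI and lower RMSE indicate better alignment.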

Table 6 Comparison of the average NMI and RMSE results of the six different algorithms. The best results are highlighted in bold

Table 6 demonstrates the average NMI and RMSE results obtained for “patient-001” to “patient-109” when optimized by the six algorithms.

Concerning the NMI, the NVSA ranks first (as highlighted in bold) among all presented benchmarks and in terms of the average value, which is taken over all 41 scenarios. In addition, the proposed algorithm also ranks first in Friedman's mean rank test [33]. Regarding the RMSE, the NVSA likewise obtains the lowest average value among the six algorithms. These results suggest that the NVSA is the most effective algorithm, since it produces the lowest ranks in the Friedman test.
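Friedman's mean rank test used here can be reproduced with SciPy. The score table below is invented for illustration (rows are scenarios, columns are algorithms, with the best-performing one in the first column):

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Hypothetical mean NMI values: 5 scenarios (rows) x 3 algorithms (columns)
scores = np.array([
    [1.12, 1.08, 1.05],
    [1.20, 1.15, 1.11],
    [1.09, 1.07, 1.02],
    [1.18, 1.13, 1.10],
    [1.15, 1.11, 1.08],
])

# Friedman test over the three paired samples (one column per algorithm)
stat, p = friedmanchisquare(scores[:, 0], scores[:, 1], scores[:, 2])

# Mean rank per algorithm; rank 1 = best when ranking by descending NMI
mean_ranks = np.array([rankdata(-row) for row in scores]).mean(axis=0)
```

A small p-value indicates that the algorithms' performances differ, and the algorithm with the lowest mean rank (here the first column) is the one the test favors.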

Furthermore, the boxplots generated for these six algorithms are shown in Fig. 6. The five-number summaries show that the maximum, first-quartile, median, third-quartile, and minimum NMI values obtained by the NVSA are statistically higher than those of the other five algorithms. These boxplots also indicate that our method significantly outperforms the other five metaheuristic algorithms.

Fig. 6

NMI values obtained for all 41 multimodal registration scenarios optimized by the six different algorithms

Table 7 shows the results of the pairwise significance comparison tests; the p-values of both the asymptotic significance (Sig.) and Bonferroni correction-adjusted significance (Adj. Sig.) metrics are less than 0.05, indicating that the results support the alternative hypothesis over the null hypothesis. The NVSA is thus statistically significantly different from, and more effective than, the other five algorithms in terms of multimodal medical image registration on the RIRE dataset. It is worth mentioning that UEO is statistically significantly different from all of the other algorithms except the NVSA, which shows that UEO is also an effective algorithm, albeit not as effective as the NVSA.

Table 7 Pairwise significance comparison tests

We also report the time consumption of these six algorithms in six scenarios to support our analysis (Table 8). As expected, our method is competitive with the BSD, GWO, and UEO algorithms and clearly faster than DA and BBOEL.

Table 8 Quantitative running times of the six different algorithms

To more clearly present the optimization process, we also show iterative diagrams produced during the registration process, as illustrated in Fig. 7. The two SOTA methods, UEO and BBOEL, rank second and third, respectively, in terms of the mean NMI values obtained over all 41 scenarios. Together with GWO, these three algorithms converge quickly in the early stages of the iterative process compared to the NVSA, but they are slightly inferior during the later local optimization stage, whereas the NVSA continues searching for the globally best result. This demonstrates that the NVSA has a stronger exploitation ability than the aforementioned algorithms.

Fig. 7

Convergence rate and performance comparisons of the BSD, BBOEL, DA, GWO and NVSA algorithms. Two patients with three different multimodal categories, six situations in total, are exhibited

Another point worth mentioning is that the iterative NMI value of the NVSA is superior to those of DA and BSD from beginning to end in all six iteration graphs. It can be claimed that the NVSA attains strong exploration and exploitation abilities by modifying its search strategy and utilizing the mutation pattern derived from BSD.

5.3 Comparisons with classic methods

We notice a lack of comparisons between classic methods and metaheuristic algorithm-based methods. In this part, three classic methods, ANTS [27, 30], FSL [29, 32] and Elastix [28, 31], are benchmarked against the proposed method with and without its initial spatial transformation (the NVSA (IST) and the NVSA, respectively) to compare their two-dimensional affine image registration performance. Notably, metaheuristic algorithms without an initial transformation matrix easily fall into local extrema when using the affine transformation, so we take the final similarity transformation parameters of the NVSA as the initial affine transformation parameters, as well as the initial spatial transformation parameters of the NVSA (IST), to estimate a rough geometric transformation. The rough registration results computed by the NVSA are also summarized.

Several changes are implemented relative to Section 5.2. (1) All CT images are downsampled to half their original size (256 \(\times\) 256 pixels) by applying bicubic interpolation, because images of different sizes may not be handled consistently by the three classic methods. (2) The warped and fixed images are cropped to 208 \(\times\) 208 pixels because the warped images computed by these methods contain a few blank pixels. (3) Experiments are conducted on patients 001–007 because patients 101–109 exhibit different blank areas than those of the former group. Taking the above factors into consideration, we remove these blank margins to make the comparison fair. Finally, we still use the NMI and RMSE values to evaluate the performance of the tested algorithms. The comparison is given in Table 9, and Fig. 8 illustrates the registration results obtained using the different approaches from a visual perspective.
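The resizing and cropping steps above can be sketched as follows. The use of `scipy.ndimage.zoom` and a center crop are our assumptions; the text only states bicubic downsampling to 256 \(\times\) 256 and cutting to 208 \(\times\) 208:

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_ct(ct_slice, crop=208):
    """Downsample a CT slice to half size (bicubic) and center-crop it."""
    # order=3 selects cubic spline interpolation (bicubic for 2-D images)
    small = zoom(ct_slice.astype(np.float64), 0.5, order=3)
    h, w = small.shape
    top, left = (h - crop) // 2, (w - crop) // 2
    return small[top:top + crop, left:left + crop]

warped = preprocess_ct(np.zeros((512, 512)))  # 512x512 -> 256x256 -> 208x208
```

Cropping both the warped and fixed images to the same 208 \(\times\) 208 window removes the blank border pixels so that no method is penalized for artifacts outside the shared field of view.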

Fig. 8

Visualization of the registration results for the multimodal P002 CT-T1 image pair obtained by the three classic methods and the NVSA

Table 9 Comparison of the average NMI and RMSE results of the five different methods. The best results are highlighted in bold

The registration results yielded by the different methods are reported in Table 9. Overall, it can be seen that, compared to commonly used methods such as ANTS, FSL and Elastix, the NVSA (IST) ranks first in terms of Friedman's mean rank test and the average NMI value, while the NVSA without the initial spatial transformation ranks first in terms of Friedman's mean rank test and the average RMSE value.

To be specific, the NVSA outperforms the other methods with respect to the NMI for most multimodal registration scenarios except CT vs. T2 (P003). Regarding the RMSE, the NVSA and the NVSA (IST) together rank first in 15 out of 20 scenarios, followed by FSL, which ranks first in 5 out of 20. These findings indicate that the proposed method generalizes better than the classic methods and that the NVSA is more successful than the NVSA (IST) at measuring the distances between corresponding features on the RIRE dataset. Another advantage of the metaheuristic algorithm-based approach is that different transformation matrices can be obtained in each independent run without specifically setting different initial parameters.

When examining Fig. 9, it can be said that the NVSA (IST) performs best among all five methods. In general, the proposed method is superior to the other methods according to the five-number summaries, which include the maximum, first-quartile, median, third-quartile, and minimum NMI values.

Fig. 9

Mean NMI values of the related methods for P001–P007

Interestingly, the NVSA and FSL do not perform well in terms of the NMI (ranking 4th and 5th, respectively), but they rank in the top three with respect to the RMSE. The reason may be that FSL uses a rigid-body transformation and the NVSA uses a similarity transformation, rather than the affine transformation employed by the other methods. Unlike the rigid transformation, the affine transformation distorts the moving image, which makes it more suitable for the NMI, an index based on information entropy, than for the RMSE, which uses the distance between corresponding features as its evaluation index.

6 Conclusion

This paper presents a modified version of the BSD algorithm, called the NVSA, which incorporates an innovative normal vibration distribution, a novel mutation pattern and a new search strategy to enhance its performance. The study evaluates the proposed method on 23 classic optimization problems and 41 multimodal image registration scenarios. The results show that the NVSA outperforms the SOTA metaheuristic methods (BBOEL and UEO) and achieves better results than its predecessor (BSD) and other high-performance algorithms (such as DA and GWO). The paper presents figures, boxplots, tables and significance tests to support its conclusions, demonstrating the NVSA’s excellent exploration and exploitation abilities and its competitive convergence speed. Comparisons with three classic methods further validate the potential of the proposed algorithm for addressing real-world medical imaging needs. Future research can focus on applying the proposed metaheuristic algorithm to a 3D rigid transformation model and exploring multiobjective algorithms to find a balance between various similarity measures. The combination of metaheuristic algorithms and deep learning for coarse-to-fine registration is another interesting point worth studying.