1 Introduction

Magnetic tracking systems are designed to estimate the position and/or orientation of a specially designed object by means of its interaction with static or low-frequency magnetic fields. Given that the human body is transparent to magnetic fields at these frequencies, magnetic tracking systems are popular within the biomedical engineering community. For example, magnetic tracking has been used for eye tracking to diagnose Ménière’s disease (Plotkin et al. 2010), positioning of wireless capsule endoscopes within the gastro-intestinal tract (Yang et al. 2009), real-time organ-positioning during radiotherapy of cancer tumors (Iustin et al. 2008), catheter tracking (Krueger et al. 2005; Biosense Webster 2011), monitoring of heart valve prostheses (Baldoni and Yellen 2007), tongue movement tracking (Gilbert et al. 2010; Wang et al. 2013), tracking of lung segment movements (Leira et al. 2012), and positioning of bone-embedded implants (Sherman et al. 2007). Examples of non-medical applications of magnetic tracking include head tracking for helmet-mounted sights in military aircraft (Raab et al. 1979), underground drilling guidance (Ripka et al. 2012), augmented and virtual reality (Liu et al. 2004), and tracking of the ball during an American football game (Arumugam et al. 2011).

In general, the performance of a measurement system is improved if the number of measurements increases because more information is collected and noise tends to be averaged out. Nevertheless, this comes at the cost of more expensive hardware, longer measurement times, and lengthier post-processing of the collected data. Therefore, a key issue in the design of measurement systems is how to maximize the information gained per measurement.

This question is fundamental to the theory on the (optimal) design of experiments that has been extensively applied to geo-spatial sciences for problems in agriculture, geology, meteorology etc. The reader is referred to Walter and Pronzato (1997), Uciński (2005), Pukelsheim (2006), Atkinson et al. (2007), and Pronzato and Pázman (2013) for an introduction to the subject. Joshi and Boyd (2009) studied sensor selection by means of convex optimization without a specific application in mind. Examples of electromagnetic applications include optimization of measurement setups for antenna measurements in the near-field (Nordebo and Gustafsson 2006), tracking of human tongue movements (Wang et al. 2013), estimation of current densities in magnetic resonance imaging magnets (Begot et al. 2002), and reconstruction of AC electric currents flowing in massive parallel conductors (Di Rienzo and Zhang 2010).

Within the magnetic tracking community, the impact of the number of measurements has been studied by Schlageter et al. (2001) and Plotkin and Paperno (2003). Schlageter et al. (2001) found that the accuracy of their magnetic tracking system was improved when the number of transmitters, and thus the number of measurements, was doubled. Plotkin and Paperno (2003) found that using more transmitters reduces the number of local minima present in the inverse problem. In contrast, how to obtain as much information as possible from a given number of measurements has received little attention. A rare example of such a study is the work by Shafrir et al. (2010) in which the positions of a fixed number of transmitters are optimized using a two-step evolutionary algorithm. However, their approach is devoted to a specific estimator and it requires that a positioning algorithm is executed a large number of times to build statistics.

In this work, we consider magnetic tracking systems that exploit time-division multiplexing and study how to allocate measurement efforts in an optimal way given a large number of possible measurements. We exploit the theory on the optimal design of experiments and formulate performance metrics based on the Fisher information matrix. The optimization of measurement allocation yields an optimization problem with integer variables. We approximate the integer variables by real variables, which gives us a convex optimization problem. In contrast to the method presented by Shafrir et al. (2010), the proposed method is valid for all unbiased estimators and it does not require a massive amount of computation. Furthermore, the convex nature of the proposed method is very attractive because it removes two difficulties commonly encountered in design-of-experiments optimization problems, namely, high dimensionality and the presence of several local minima that are not globally optimal. The convexity also makes it feasible to treat large-scale problems.

In this work, we optimize for a measurement domain of arbitrary shape by formulating two cost functions that improve (i) the worst-case performance (minimax approach) and (ii) the expected performance for an assumed prior distribution of the position and orientation of the object we wish to track (average approach). The two approaches are compared for several test cases. Furthermore, we investigate optimal measurement allocation for a realistic measurement scenario. Finally, we study the impact of restrictions on the transmitter positions, which are commonly encountered in practice.

The paper is organized as follows. The modeling of a generic magnetic tracking system is presented in Sect. 2. Section 3 presents performance metrics and the proposed solution methods. The results are then presented in Sect. 4 and discussed in Sect. 5. Finally, the work is concluded in Sect. 6.

2 Modeling of the measurement system

Consider a quasi-magnetostatic tracking system operating at a single frequency. The tracking system consists of (i) one receiving coil with unknown position \(\left( \tilde{x}\,^\mathrm {r}, \tilde{y}\,^\mathrm {r}, {\tilde{z}}\,^\mathrm {r}\right) \) and unknown orientation \({\hat{\mathbf {m}}}^\mathrm {r}=(m^\mathrm {r}_x,m^\mathrm {r}_y,m^\mathrm {r}_z)\), and (ii) \(N^\mathrm {t}\) identical transmitting coils (also referred to as transmitters) with known positions \(\left( \tilde{x}_k\,\!\!^\mathrm {t}, \tilde{y}_k\,\!\!^\mathrm {t}, {\tilde{z}}_k\,\!\!^\mathrm {t}\right) \) and known orientations \({\hat{\mathbf {m}}}^\mathrm {t}_k\). Here and in the following, a vector \(\mathbf {a} =a {\hat{\mathbf {a}}}\) is represented by the magnitude a and the unit vector \({\hat{\mathbf {a}}}\). The tracking system exploits time-division multiplexing to separate the signals from the different transmitters, i.e., the transmitters are operated in sequence such that only one transmitter is transmitting at any given time instant.

The aim of the tracking system is to estimate the position and orientation of the receiving coil, i.e. to estimate

$$ \tilde{\mathbf {p}} = \left\{ \left[ \tilde{x}^{\mathrm {r}}, \tilde{y}^{\mathrm {r}}, \tilde{z}^{\mathrm {r}}, \tilde{m}_x^{\mathrm {r}}, \tilde{m}_y^{\mathrm {r}}, \tilde{m}_z^{\mathrm {r}}\right] ^T \in {\mathbb {R}}^6 \mid (\tilde{m}_x^{\mathrm {r}})^2 + (\tilde{m}_y^{\mathrm {r}})^2 + (\tilde{m}_z^{\mathrm {r}})^2 = (\tilde{m}^{\mathrm {r}})^2 \right\} . $$
(1)

We assume that the physical properties of the sensor are known, which is normally the case in practice. Thus, \(\tilde{m}^\mathrm {r}\) is known and we can use the unit vector \({\hat{\mathbf {m}}}^\mathrm {r}= \tilde{\mathbf {m}}^\mathrm {r}/ \tilde{m}^\mathrm {r}\) directly instead of \(\tilde{\mathbf {m}}^\mathrm {r}\), without loss of generality.

To obtain entries with identical units in the vector that we wish to estimate, the spatial coordinates are normalized by a reference distance \(d\), which yields

$$\begin{aligned} \mathbf {r_{}}^\mathrm {r}= & {} \left( x^\mathrm {r}, y^\mathrm {r}, z^\mathrm {r}\right) = \left( \tilde{x}\,^\mathrm {r}/d, \tilde{y}\,^\mathrm {r}/d, {\tilde{z}}\,^\mathrm {r}/d\right) \end{aligned}$$
(2)
$$\begin{aligned} \mathbf {r}_{k}^\mathrm {t}= & {} \left( x_k^\mathrm {t}, y_k^\mathrm {t}, z_k^\mathrm {t}\right) = \left( \tilde{x}_k\,\!\!^\mathrm {t}/d, \tilde{y}_k\,\!\!^\mathrm {t}/d, {\tilde{z}}_k\,\!\!^\mathrm {t}/d\right) . \end{aligned}$$
(3)

Thus, the \(p=5\) degrees of freedom that are to be estimated are described by

$$ {\mathbf {p}} = \left\{ \left[ x^{\mathrm {r}}, y^{\mathrm {r}}, z^{\mathrm {r}}, m_x^{\mathrm {r}}, m_y^{\mathrm {r}}, m_z^{\mathrm {r}}\right] ^T \in {\mathbb {R}}^6 \mid (m_x^{\mathrm {r}}) ^2 +(m_y^{\mathrm {r}}) ^2 + (m_z^{\mathrm {r}}) ^2 = 1 \right\} . $$
(4)

Let \(\mathbf {R}_k =\mathbf {r}^\mathrm {r}- \mathbf {r}_k^\mathrm {t}\) denote the distance vector of length \(R_k\) from the transmitting coil k to the receiving coil. By modeling the transmitting and receiving coils as magnetic dipoles and exploiting Faraday’s law, the scaled induced voltage in the receiving coil generated by transmitting coil k is given by Jackson (1998)

$$\begin{aligned} V_k = -j \omega \frac{\alpha _k}{V_0} \frac{\mu _0}{4\pi } \left( \frac{3({\hat{\mathbf {m}}}^\mathrm {r}\cdot \mathbf {R}_k) ({\hat{\mathbf {m}}}^\mathrm {t}_k \cdot \mathbf {R}_k)}{R^5_k} - \frac{{\hat{\mathbf {m}}}^\mathrm {r}\cdot {\hat{\mathbf {m}}}^\mathrm {t}_k}{R^3_k}\right) \end{aligned}$$
(5)

where \(\omega \) is the angular frequency, \(\mu _0\) is the permeability of free space, and \(V_0\) is a reference voltage that renders \(V_k\) unit-less and thereby independent of the unit of measurement. The parameter \(\alpha _k\) is assumed to be known and it describes the diameter, number of turns, and the excitation current for each transmitting coil k. We use \(\omega \alpha _k / V_0 = \omega \alpha / V_0 = 4.33\times 10^6 \, \text{ Am/Vs }\) for all k throughout this work, which implies that all transmitting coils are identical.
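
As an illustration (not the authors' code), the model in (5) fits in a few lines of Python. Since the factor \(-j\) only sets the phase, the sketch below returns the real amplitude of the scaled voltage, with the numerical value of \(\omega \alpha / V_0\) used in this work:

```python
import numpy as np

MU0 = 4e-7 * np.pi            # permeability of free space [Vs/Am]
OMEGA_ALPHA_OVER_V0 = 4.33e6  # omega*alpha/V0 used throughout this work [Am/Vs]

def scaled_voltage(m_r, m_t, R):
    """Amplitude of the scaled induced voltage of Eq. (5).

    m_r, m_t -- unit dipole moments of the receiver and transmitter
    R        -- normalized distance vector from transmitter to receiver
    """
    m_r, m_t, R = (np.asarray(a, float) for a in (m_r, m_t, R))
    Rn = np.linalg.norm(R)
    C = OMEGA_ALPHA_OVER_V0 * MU0 / (4.0 * np.pi)
    return C * (3.0*(m_r @ R)*(m_t @ R)/Rn**5 - (m_r @ m_t)/Rn**3)

# Coaxial coils give twice the magnitude of the transverse arrangement,
# the classic dipole factor of 2:
assert np.isclose(scaled_voltage([0, 0, 1], [0, 0, 1], [0, 0, 1]), 0.866)
assert np.isclose(scaled_voltage([0, 0, 1], [0, 0, 1], [1, 0, 0]), -0.433)
```

The two assertions reproduce the well-known factor of two between the on-axis and transverse dipole fields at equal distance.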

The gradient of the scaled induced voltage in the receiver generated by transmitting coil k with respect to the position of the receiver \(\mathbf {r}\,^\mathrm {r}\) is given by

$$\begin{aligned} \begin{aligned} \nabla _{(\mathbf {r}\,^\mathrm {r})}V_k&= \left[ \frac{\partial V_k}{\partial x^\mathrm {r}}, \frac{\partial V_k}{\partial y^\mathrm {r}}, \frac{\partial V_k}{\partial z^\mathrm {r}} \right] ^T \\&= -j \omega \frac{\alpha }{V_0} \frac{\mu _0}{4\pi } \Bigg ( 3\frac{\left( {\hat{\mathbf {m}}}^\mathrm {r}\cdot \mathbf {R}_k\right) {\hat{\mathbf {m}}}^\mathrm {t}_k + \left( {\hat{\mathbf {m}}}^\mathrm {t}_k \cdot \mathbf {R}_k\right) {\hat{\mathbf {m}}}^\mathrm {r}+ \left( {\hat{\mathbf {m}}}^\mathrm {r}\cdot {\hat{\mathbf {m}}}^\mathrm {t}_k\right) \mathbf {R}_k}{R^5_k} \\&\quad \quad \quad \quad \quad - 15 \frac{\left( {\hat{\mathbf {m}}}^\mathrm {r}\cdot \mathbf {R}_k\right) \left( {\hat{\mathbf {m}}}^\mathrm {t}_k \cdot \mathbf {R}_k\right) \mathbf {R}_k}{R^7_k} \Bigg ) \end{aligned} \end{aligned}$$
(6)

and the gradient of the scaled voltage with respect to the magnetic dipole moment of the receiver \(\mathbf {m}^\mathrm {r}\) is given by

$$\begin{aligned} \nabla _{(\mathbf {m}^\mathrm {r})}V_k = \left[ \frac{\partial V_k}{\partial m^\mathrm {r}_x}, \frac{\partial V_k}{\partial m^\mathrm {r}_y}, \frac{\partial V_k}{\partial m^\mathrm {r}_z} \right] ^T = -j \omega \frac{\alpha }{V_0} \frac{\mu _0}{4\pi } \left( \frac{3\left( {\hat{\mathbf {m}}}^\mathrm {t}_k\cdot \mathbf {R}_k\right) \mathbf {R}_k}{R^5_k} - \frac{{\hat{\mathbf {m}}}^\mathrm {t}_k}{R^3_k}\right) . \end{aligned}$$
(7)

Notice that the gradient with respect to the position in (6) scales as \(R^{-4}_k\) whereas the gradient with respect to \(\mathbf {m}^\mathrm {r}\) in (7) scales as \(R^{-3}_k\).
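
These scaling laws can be checked numerically with a small sketch (same amplitude convention as above, phase factor \(-j\) dropped; the position gradient is taken by central differences):

```python
import numpy as np

C = 4.33e6 * 1e-7  # (omega*alpha/V0) * mu0/(4*pi) with the value used here

def V(m_r, m_t, R):
    """Amplitude of the scaled voltage, Eq. (5)."""
    Rn = np.linalg.norm(R)
    return C * (3*(m_r @ R)*(m_t @ R)/Rn**5 - (m_r @ m_t)/Rn**3)

def grad_m(m_t, R):
    """Gradient of V with respect to the receiver moment, Eq. (7)."""
    Rn = np.linalg.norm(R)
    return C * (3*(m_t @ R)*R/Rn**5 - m_t/Rn**3)

def grad_r_numeric(m_r, m_t, R, eps=1e-6):
    """Central-difference gradient of V with respect to receiver position."""
    return np.array([(V(m_r, m_t, R + eps*e) - V(m_r, m_t, R - eps*e))/(2*eps)
                     for e in np.eye(3)])

m_r = np.array([0.6, 0.0, 0.8])
m_t = np.array([0.0, 0.0, 1.0])
R = np.array([0.3, -0.2, 1.1])

# Doubling the distance shrinks the moment gradient by 2^3 and the
# position gradient by 2^4, as stated in the text:
assert np.isclose(np.linalg.norm(grad_m(m_t, R)) /
                  np.linalg.norm(grad_m(m_t, 2*R)), 8.0)
assert np.isclose(np.linalg.norm(grad_r_numeric(m_r, m_t, R)) /
                  np.linalg.norm(grad_r_numeric(m_r, m_t, 2*R)), 16.0, rtol=1e-4)
```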

Consider a measurement scenario where the receiver can be assumed to be stationary in both position and orientation during a time \(\Delta T\). The time to perform one measurement is \(\Delta t\), which corresponds to the time required to record and process the signal generated by one of the transmitters. In this article, we focus on the positioning of a mechanical object that is quasi-stationary on time scales many orders of magnitude larger than the time scale associated with the electrical system that performs the measurement. Therefore, we assume that \(\Delta T\) is many orders of magnitude larger than \(\Delta t\), which reflects many real-life situations. Thus, the maximum number of measurements that can be collected during the time \(\Delta T\) under stationary conditions is limited by the large number \(N^\mathrm {meas}= \Delta T / \Delta t\), which follows from the fact that the measurement system is based on time-division multiplexing. For convenience, we assume in the following that \(N^\mathrm {meas}\) is an integer, which reflects the nature of an actual measurement system: measurements are collected and processed as single units by standard off-the-shelf measurement instruments.

3 Optimization problem

In this work, we seek to improve the performance of the tracking system by allocating the \(N^\mathrm {meas}\) measurements among \(N^\mathrm {t}\) candidate transmitters in an optimal way.

Let \(w_k \in \mathbb {N}\) be the number of measurements performed with transmitter k. Clearly, it is advantageous to perform as many measurements as possible during the stationary time-interval \(\Delta T\) and, thus, we have \(\sum _k w_k = N^\mathrm {meas}\). Let \(\mathbf {w} = \left[ w_1, w_2, \ldots , w_{N^\mathrm {t}} \right] ^T\). Thus, we want to

$$\begin{aligned} \begin{aligned} \underset{\mathbf {w}}{\text {minimize}}&\quad J(\mathbf {p}, \mathbf {w})\\ \text {subject\, to}&\quad w_k \in \left\{ 0, 1, 2, \ldots , N^\mathrm {meas}\right\} , \quad k = 1,\ldots ,N^\mathrm {t}\\&\quad \sum _{k=1}^{N^\mathrm {t}} w_k = N^\mathrm {meas}\\&\quad \mathbf {\mathbf {p}} \in \varOmega _\mathbf {p} \end{aligned} \end{aligned}$$
(8)

where \(J\) is a cost function quantifying the system’s performance and \(\varOmega _\mathbf {p}\) is the measurement domain for which we want to optimize the tracking system. This is a combinatorial optimization problem and an exhaustive search requires \((N^\mathrm {t})^{N^\mathrm {meas}}\) cost function evaluations, which is prohibitive from a computational perspective. This is particularly true for more complicated measurement scenarios that may require parameter studies that involve the solution of many optimization problems.

Now, let \(\lambda _k = w_k/N^\mathrm {meas}\) denote the fraction of the total number of measurements that are performed with transmitter k. Thus, \(\lambda _k \in \left\{ 0, 1/N^\mathrm {meas}, 2/N^\mathrm {meas}, \ldots , 1 \right\} \). We use the approximation \(\lambda _k \in \left[ 0,1 \right] \) because \(N^\mathrm {meas}\) is large as discussed in Sect. 2 above. By using the notation \(\varLambda = \left[ \lambda _1, \lambda _2, \ldots , \lambda _{N^\mathrm {t}} \right] ^T\), we obtain the relaxed optimization problem

$$\begin{aligned} \begin{aligned} \underset{\varLambda }{\text {minimize}}&\quad J(\mathbf {p}, \varLambda )\\ \text {subject\, to}&\quad \lambda _k \in \left[ 0, 1 \right] , \quad k = 1,\ldots ,N^\mathrm {t}\\&\quad \sum _{k=1}^{N^\mathrm {t}} \lambda _k = 1\\&\quad \mathbf {\mathbf {p}} \in \varOmega _\mathbf {p}, \end{aligned} \end{aligned}$$
(9)

which is a good approximation to the problem in (8). The feasible domain dictated by the constraints is convex. Thus, if the cost function \(J\) is convex with respect to \(\varLambda \), the entire optimization problem is convex and can be readily solved.

In the following subsections, we introduce a performance metric, present the cost functions that are used, and present the method to solve the corresponding optimization problems.

3.1 Cost functions

3.1.1 Performance metric

Let \(V^\mathrm {meas}_k(\mathbf {p}_0)\) denote the measured signal generated by transmitting coil k for an arbitrary receiver position and orientation \(\mathbf {p}_0 \in \mathbb {R}^{3} \times \mathbb {S}^2\) in the parameter space, where \(\mathbb {R}^3\) describes the position in three-dimensional space and \(\mathbb {S}^2\) is the unit sphere of possible orientation directions in \(\mathbb {R}^3\). Noise, caused for example by thermal noise in amplifiers, can degrade the performance of the positioning system. Therefore, we model the measured signal as the true signal \(V_k(\mathbf {\mathbf {p}}_0)\) corrupted by additive Gaussian noise as

$$\begin{aligned} V^\mathrm {meas}_k(\mathbf {p}_0) = V_k(\mathbf {p}_0) + n_k \end{aligned}$$
(10)

where the noise terms \(n_k \sim \mathcal {N}(0,\sigma ^2)\) are independent and identically distributed and \(\mathcal {N}(\mu ,\sigma ^2)\) denotes the Gaussian distribution with mean \(\mu \) and variance \(\sigma ^2\). Below, we denote the gradient of \(V_k(\mathbf {\mathbf {p}})\) with respect to the parameters in \(\mathbf {p}\) at the point \({\mathbf {p}}_0\) by \(\nabla _\mathbf {p}V_k(\mathbf {\mathbf {p}}_0)\).

A metric for the performance of the parameter estimation is provided by the Fisher information matrix \(\mathbf {M}\in \mathbb {R}^{p\times p}\) (Kay 1993) given by

$$\begin{aligned}{ \mathbf {M}}({\mathbf {p}}_0, \varLambda) = \sum _{k=1}^{N^{\mathrm {t}}} \lambda _k {\mathbf {M}}_k({\mathbf {p}}_0) =\sum _{k=1}^{N^{\mathrm {t}}} \lambda _k \frac{\big[ \nabla _{\mathbf {p}}V_k({\mathbf {p}}_0)\big] \big[ \nabla _{\mathbf {p}}V_k({\mathbf {p}}_0) \big] ^T}{\sigma ^2} \end{aligned}$$
(11)

because the Cramér-Rao inequality (Walter and Pronzato 1997)

$$\begin{aligned} \text{ cov } \, \! {\hat{\mathbf {\mathbf {p}}}} \succeq \mathbf {M}^{-1} \end{aligned}$$
(12)

yields a lower bound for the covariance of the estimate \({\hat{\mathbf {\mathbf {p}}}}\) for all unbiased estimators. Here, \(\mathbf {A} \succeq \mathbf {B}\) signifies that the matrix \(\mathbf {A}-\mathbf {B}\) is positive semi-definite. Furthermore, the bound can be attained, for example, asymptotically by the maximum-likelihood estimator. Therefore, the performance of the measurement system is expected to improve by maximizing \(\mathbf {M}\) (in some sense). However, to find an optimal Fisher information matrix \(\mathbf {M}^*\) that fulfills \(\mathbf {M}^* \succeq \mathbf {M}, \forall \mathbf {M}\ne \mathbf {M}^*\) is, in general, not possible (Uciński 2005) and a real-valued function \(J(\mathbf {M})\) is often optimized instead. Here, we use

$$\begin{aligned} J_{\mathrm {D}}(\mathbf {M}) =-\log \det (\mathbf {M}) \end{aligned}$$
(13)

that yields a so-called D-optimal (Determinant-optimal) solution. If the model \(V_k(\mathbf {p})\) is linear in \(\mathbf {p}\), the D-optimal solution minimizes the volume of the lower bound for the confidence ellipsoid described by \(\mathbf {M}^{-1}\) in (12). The volume of the \(\beta \)-confidence ellipsoid is given by Pronzato and Pázman (2013)

$$\begin{aligned} \text{Volume}(\beta ) = \frac{\left( \pi \right) ^{p/2}}{\varGamma \left( \frac{p}{2}+1\right) }\left( F^{-1}_{\chi ^2_p}(\beta )\right) ^{p/2}\left( \det \mathbf {M}^{-1}\right) ^{1/2} \end{aligned}$$
(14)

where \(F_{\chi ^2_p}\) is the cumulative distribution function for the \(\chi ^2\)-distribution with p degrees of freedom and \(\varGamma \) denotes the Gamma function. The geometric mean of the lengths of the confidence ellipsoid’s semi-axes, which we refer to as the mean confidence-radius, is given by Joshi and Boyd (2009)

$$\begin{aligned} \rho (\beta ) = \sqrt{F^{-1}_{\chi ^2_p}(\beta )} \left( \det \mathbf {M}^{-1}\right) ^{1/2p}. \end{aligned}$$
(15)

An attractive feature of the D-optimality criterion is that it is invariant to scaling of the parameters in \(\mathbf {p}\) (Uciński 2005).
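
To make (11) and (13) concrete, the following sketch (illustrative Python with random stand-in gradients, not the authors' implementation) assembles \(\mathbf{M}\) and numerically checks the midpoint inequality implied by the convexity of \(J_{\mathrm{D}}\) in \(\varLambda\):

```python
import numpy as np

def fisher_matrix(lam, grads, sigma2=1.0):
    """Eq. (11): M(p0, Lambda) = sum_k lambda_k g_k g_k^T / sigma^2,
    where row g_k stands for the gradient of V_k at p0."""
    return np.einsum('k,ki,kj->ij', lam, grads, grads) / sigma2

def J_D(M):
    """Eq. (13): the D-optimality cost -log det M."""
    sign, logdet = np.linalg.slogdet(M)
    return -logdet if sign > 0 else np.inf

rng = np.random.default_rng(1)
p, Nt = 5, 12
grads = rng.normal(size=(Nt, p))     # stand-ins for nabla_p V_k(p0)
lam = np.full(Nt, 1.0/Nt)            # uniform measurement allocation
M = fisher_matrix(lam, grads)
assert M.shape == (p, p) and np.all(np.linalg.eigvalsh(M) > -1e-9)

# Midpoint check of the convexity of J_D in Lambda:
a, b = rng.dirichlet(np.ones(Nt)), rng.dirichlet(np.ones(Nt))
mid = J_D(fisher_matrix((a + b)/2, grads))
assert mid <= (J_D(fisher_matrix(a, grads)) + J_D(fisher_matrix(b, grads)))/2 + 1e-9
```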

By using the cost function from (11) and (13) as well as \(\varOmega _\mathbf {p}= \mathbf {p}_0\) in (9), we obtain the relaxed local design problem

$$\begin{aligned} \begin{aligned} \underset{\varLambda }{\text {minimize}}&\quad -\log \det \left( \sum _{k=1}^{N^\mathrm {t}} \lambda _k \mathbf {M}_k(\mathbf {p}_0)\right) \\ \text {subject\, to}&\quad \lambda _k \in \left[ 0, 1 \right] , \quad k = 1,\ldots ,N^\mathrm {t}\\&\quad \sum _{k=1}^{N^\mathrm {t}} \lambda _k = 1 \end{aligned} \end{aligned}$$
(16)

which is a convex optimization problem as shown by Boyd and Vandenberghe (2004, Section 7.5).
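
The paper solves the relaxed problems with a general-purpose SQP solver (Sect. 3.2.3). For the local problem (16) specifically, a classical alternative is the multiplicative algorithm of optimal experiment design, sketched below with random stand-in gradients; the Kiefer-Wolfowitz equivalence theorem provides a simple optimality check:

```python
import numpy as np

def d_optimal_weights(grads, iters=5000, tol=1e-12):
    """Sketch of the classical multiplicative algorithm for the local
    D-optimal problem (16): lambda_k <- lambda_k * (g_k^T M^-1 g_k) / p."""
    Nt, p = grads.shape
    lam = np.full(Nt, 1.0/Nt)
    for _ in range(iters):
        M = np.einsum('k,ki,kj->ij', lam, grads, grads)
        d = np.einsum('ki,ij,kj->k', grads, np.linalg.inv(M), grads)
        new = lam * d / p
        if np.max(np.abs(new - lam)) < tol:
            return new
        lam = new
    return lam

rng = np.random.default_rng(2)
grads = rng.normal(size=(20, 5))      # 20 transmitter candidates, p = 5
lam = d_optimal_weights(grads)
assert np.isclose(lam.sum(), 1.0) and np.all(lam >= 0)
assert (lam > 1e-6).sum() >= 5        # at least p candidates carry weight

# Kiefer-Wolfowitz equivalence: at the optimum, g_k^T M^-1 g_k <= p for all k
M = np.einsum('k,ki,kj->ij', lam, grads, grads)
d = np.einsum('ki,ij,kj->k', grads, np.linalg.inv(M), grads)
assert d.max() <= 5.0 + 0.05
```

Note that the update preserves the constraint \(\sum_k \lambda_k = 1\) exactly, since \(\sum_k \lambda_k \, \mathbf{g}_k^T \mathbf{M}^{-1} \mathbf{g}_k = \mathrm{tr}(\mathbf{M}^{-1}\mathbf{M}) = p\).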

3.1.2 Local and non-local designs

The Cramér-Rao inequality in (12) yields a lower bound for the covariance of the estimated parameters. Given that the covariance is a measure of the linear relationship between the estimated parameters, it does not capture their true relationship when the functions \(V_k(\mathbf {p})\) are non-linear in \(\mathbf {p}\). In addition, \(\mathbf {M}^{-1}\) is a function of \(\mathbf {p}_0\) because of this non-linearity. This is the reason why an optimal experiment design based on (11) and (12) for \(V_k(\mathbf {p})\) non-linear in \(\mathbf {p}\) is referred to as a local design (Walter and Pronzato 1997). The region of validity of a local design depends on the size of the region where the linearization \(V_k(\mathbf {p}) \cong V_k(\mathbf {p}_0) + \nabla _\mathbf {p}V_k(\mathbf {\mathbf {p}}_0)(\mathbf {p}- \mathbf {p}_0)\) is a good approximation to the true non-linear \(V_k(\mathbf {p})\).

In contrast to local designs, it is often desired to optimize the performance of the measurement system not just for one point in the parameter space \(\mathbb {R}^p\) but rather for a measurement domain \(\varOmega _\mathbf {p}\subseteq \mathbb {R}^p\). In this work, we optimize the measurement performance in \(\varOmega _\mathbf {p}\) for (i) average optimality and (ii) minimax optimality. To this aim, we exploit a discrete set of linearization points \({\varOmega _\mathrm {lin}}= \{\mathbf {p}_i\}_{i=1}^{N_\mathrm {lin}}\subseteq \varOmega _\mathbf {p}\) that constitutes a sufficiently dense discretization of \(\varOmega _\mathbf {p}\).

Average optimality In this case, we assign a prior probability distribution \(\mathbf {\pi }_\mathbf {p}(\mathbf {p})\) for the parameters that are to be estimated. We then find the so-called ELD-optimal (Expectation of Log Determinant-optimal) experiment design (Walter and Pronzato 1997) by minimizing the cost function

$$\begin{aligned} J_{\mathrm {ELD}}(\varLambda ) = -\underset{\mathbf {p}}{\mathbb {E}}\{\log \det \mathbf {M}(\mathbf {p}, \varLambda )\} \end{aligned}$$
(17)

where \(\mathbb {E}\) denotes the expectation with respect to \(\mathbf {\pi }_\mathbf {p}(\mathbf {p})\). In our case, we assume a uniform prior probability density

$$\begin{aligned} \mathbf {\pi }_\mathbf {p}(\mathbf {p}) = \left\{ \begin{array}{ll} \left( \int _{\varOmega _\mathbf {p}}d\mathbf {p}\right) ^{-1}, &{} \mathbf {p}\in \varOmega _\mathbf {p}\\ 0, &{} \mathbf {p}\notin \varOmega _\mathbf {p}\end{array} \right. \end{aligned}$$
(18)

and so the cost function can be written as

$$\begin{aligned} J_{\mathrm {ELD}}(\varLambda )& = -\int _{\mathbb {R}^p} \log \det \mathbf {M}(\mathbf {p}, \varLambda ) \mathbf {\pi }_\mathbf {p}(\mathbf {p}) d\mathbf {p} \nonumber \\ & = \frac{-1}{\int _{\varOmega _\mathbf {p}}d\mathbf {p}}\int _{\varOmega _\mathbf {p}} \log \det \mathbf {M}(\mathbf {p}, \varLambda ) d\mathbf {p}. \end{aligned}$$
(19)

We evaluate this integral by quadrature at the linearization points \(\mathbf {p}_i \in {\varOmega _\mathrm {lin}}\) with weights \(q_i\) as

$$\begin{aligned} J_{\mathrm {ELD}}(\varLambda ) \approx - \sum _{i=1}^{{N_\mathrm {lin}}} q_i \log \det \mathbf {M}(\mathbf {p}_i, \varLambda ), \end{aligned}$$
(20)

with the quadrature scheme described in the Appendix. This quadrature scheme features positive weights \(q_i\), which is important to preserve the convexity of the optimization problem. In addition, it preserves symmetries with respect to the \(m^\mathrm {r}_x m^\mathrm {r}_z\)- and \(m^\mathrm {r}_y m^\mathrm {r}_z\)-planes. Furthermore, the value of \(\det \mathbf {M}\) is unaffected if \({\hat{\mathbf {m}}}^\mathrm {r}\) is multiplied by \(-1\), which is exploited by performing the quadrature over the half sphere \(m^\mathrm {r}_z\ge 0\) only. Notice that minimization of the continuous cost function in (17) and of its discretized version in (20) are equivalent within the accuracy of the quadrature scheme. Also, notice that there are quadrature schemes for evaluating the expectation in (17) that are more efficient than the one presented in the Appendix. An example is the scheme given by Gotwalt et al. (2009) and Gotwalt (2010), which, however, includes negative weights in situations where more than 7 parameters are to be estimated.

Thus, we obtain the relaxed average optimality problem

$$\begin{aligned} \begin{aligned} \underset{\varLambda }{\text {minimize}}&\quad -\sum _{i=1}^{{N_\mathrm {lin}}} q_i \log \det \left( \sum _{k=1}^{N^\mathrm {t}} \lambda _k \mathbf {M}_k(\mathbf {p}_i)\right) \\ \text {subject\, to}&\quad \lambda _k \in \left[ 0, 1 \right] , \quad k = 1,\ldots ,N^\mathrm {t}\\&\quad \sum _{k=1}^{N^\mathrm {t}} \lambda _k = 1\\&\quad \mathbf {p}_i \in {\varOmega _\mathrm {lin}} \end{aligned} \end{aligned}$$
(21)

that is also a convex problem because the cost function is a sum of convex functions with positive weights.
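
Evaluating the quadrature cost (20) is then a short computation. The sketch below (illustrative Python with random stand-in gradients and uniform quadrature weights) also illustrates the identity \(\det(c\,\mathbf{M}) = c^p \det \mathbf{M}\):

```python
import numpy as np

def J_ELD(lam, grads_per_point, q):
    """Eq. (20): J_ELD ~= -sum_i q_i log det( sum_k lambda_k M_k(p_i) )."""
    total = 0.0
    for qi, G in zip(q, grads_per_point):   # G: the N_t gradients at p_i
        sign, logdet = np.linalg.slogdet(np.einsum('k,ki,kj->ij', lam, G, G))
        if sign <= 0:                       # singular information matrix
            return np.inf
        total -= qi * logdet
    return total

rng = np.random.default_rng(3)
N_lin, N_t, p = 4, 10, 5
grads_pts = rng.normal(size=(N_lin, N_t, p))
q = np.full(N_lin, 1.0/N_lin)               # positive quadrature weights
lam = np.full(N_t, 1.0/N_t)
assert np.isfinite(J_ELD(lam, grads_pts, q))

# Scaling every gradient by 2 scales each M by 4 and lowers the cost by
# exactly p*log(4), since det(cM) = c^p det(M) and the q_i sum to one:
assert np.isclose(J_ELD(lam, grads_pts, q) - J_ELD(lam, 2*grads_pts, q),
                  p*np.log(4))
```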

Minimax optimality In many applications, it is desired to guarantee a certain accuracy of the measurements. In these cases, the worst-case performance is optimized instead of the average performance. This leads to a so-called minimax problem, where we seek to minimize the MMLD (MiniMax Log of Determinant) cost function

$$\begin{aligned} J_{\text {MMLD}}(\varLambda ) = \max _{\mathbf {p}\in \varOmega _\mathbf {p}} \left\{ - \log \det \mathbf {M}(\mathbf {p}, \varLambda )\right\} . \end{aligned}$$
(22)

The computation of \(J_{\text {MMLD}}(\varLambda )\) involves solving a separate optimization problem defined by the right-hand side of (22). We solve this inner problem approximately by evaluating \(-\log \det \mathbf {M}(\mathbf {p}_i, \varLambda )\) at the \({N_\mathrm {lin}}\) linearization points \(\mathbf {p}_i \in {\varOmega _\mathrm {lin}}\) and taking the maximum value, i.e.

$$\begin{aligned} J_{\text {MMLD}}(\varLambda ) \approx \max _{\mathbf {p}_i\in {\varOmega _\mathrm {lin}}} \left\{ - \log \det \mathbf {M}(\mathbf {p}_i, \varLambda )\right\} , \end{aligned}$$
(23)

which yields the relaxed minimax optimality problem

$$\begin{aligned} \begin{aligned} \underset{\varLambda }{\text {minimize}}&\quad \max _{\mathbf {p}_i} \left\{ -\log \det \left( \sum _{k=1}^{N^\mathrm {t}} \lambda _k \mathbf {M}_k(\mathbf {p}_i)\right) \right\} \\ \text {subject\, to}&\quad \lambda _k \in \left[ 0, 1 \right] , \quad k = 1,\ldots ,N^\mathrm {t}\\&\quad \sum _{k=1}^{N^\mathrm {t}} \lambda _k = 1\\&\quad \mathbf {p}_i \in {\varOmega _\mathrm {lin}}. \end{aligned} \end{aligned}$$
(24)

This is a convex problem because the pointwise maximum of convex functions is convex (Boyd and Vandenberghe 2004).
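
The discretized minimax cost (23) is equally direct to evaluate. (In a general-purpose solver, the nonsmooth maximum is typically handled through the standard epigraph reformulation: minimize an auxiliary variable \(t\) subject to \(-\log\det \mathbf{M}(\mathbf{p}_i, \varLambda) \le t\) for every linearization point.) A sketch of the cost itself, with random stand-in gradients:

```python
import numpy as np

def neg_logdet_M(lam, G):
    """-log det of the Fisher matrix (11) at one linearization point."""
    sign, logdet = np.linalg.slogdet(np.einsum('k,ki,kj->ij', lam, G, G))
    return -logdet if sign > 0 else np.inf

def J_MMLD(lam, grads_per_point):
    """Eq. (23): worst case of -log det M over the linearization points."""
    return max(neg_logdet_M(lam, G) for G in grads_per_point)

rng = np.random.default_rng(4)
grads_pts = rng.normal(size=(4, 10, 5))     # (N_lin, N_t, p)
lam = np.full(10, 0.1)

per_point = [neg_logdet_M(lam, G) for G in grads_pts]
# The minimax cost upper-bounds the cost at every single point and, in
# particular, any positively weighted average such as the ELD cost:
assert np.isclose(J_MMLD(lam, grads_pts), max(per_point))
assert J_MMLD(lam, grads_pts) >= np.mean(per_point)
```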

3.2 Solution method

In this work, we seek to allocate a limited number of measurements given a large number of possible candidate measurements in an optimal way. Apart from the information on the measurement allocation, the solutions to our optimization problems also inform us of the number of transmitters to use and their positions. A lower bound on the number of transmitters is given by the number of parameters p that are to be estimated. The number of measurements to use is limited by the constraint \( \sum _{k=1}^{N^\mathrm {t}} \lambda _k = 1\) in (21) and (24) and this constraint shows strong similarity with penalty terms encountered in compressed sensing and related problems (Bruckstein et al. 2009). Such a penalty term typically involves the \(L_1\)-norm of the solution vector and it is added with a weight to the cost function that should be minimized. In the context of compressed sensing and related problems, the penalty term favors a sparse solution with only a few non-zero entries, should such a solution be consistent with the rest of the problem statement. Here, we find that the optimized measurement allocation vectors \(\varLambda ^*\) computed from (21) and (24) feature only a few non-zero entries in comparison to the number of transmitter candidates, which is confirmed by the results presented in this article.

3.2.1 Thresholding and clustering of weights

The weights \(\lambda _k\) that are obtained in the solutions of the convex problems (21) and (24) above can, for the examples we have studied in this work, be grouped as follows: (i) a handful of the weights are large (\(> 10^{-3}\)); (ii) many are zero; (iii) several are nearly zero (\(< 10^{-9}\)). That the weights of the last group are not zero is due to the finite precision arithmetic and termination criteria tolerances of the exploited numerical solver. In addition, the weights of the last group are several orders of magnitude smaller than the weights of the first group. Therefore, we use the threshold \(\lambda _{\mathrm {th}}=10^{-6}\) and set all weights \(\lambda _k<\lambda _{\mathrm {th}}\) equal to zero. We refer to weights \(\lambda _k \ge \lambda _{\mathrm {th}}\) as non-zero.

Furthermore, the finite resolution of a Cartesian grid of transmitter candidates may cause several neighboring transmitters to obtain non-zero weights \(\lambda _k\). We replace such a cluster \( \varOmega _{\mathrm {cl}}\) of non-zero weights \(\lambda _k, k \in \varOmega _{\mathrm {cl}}\) with only one weight \(\lambda _{k_{\mathrm {cl}}}\) placed at \(\mathbf {r}_{k_{\mathrm {cl}}}\) according to

$$\begin{aligned} \lambda _{k_{\mathrm {cl}}}= & {} \sum _{k\in \varOmega _{\mathrm {cl}}} \lambda _k \nonumber \\ \mathbf {r}_{k_{\mathrm {cl}}}= & {} \frac{1}{\lambda _{k_{\mathrm {cl}}}} \sum _{k\in \varOmega _{\mathrm {cl}}} \lambda _k \mathbf {r}_{k}\!^\mathrm {t} \end{aligned}$$
(25)

if all the non-zero weights \(\lambda _k\) in the cluster are vertices of the same cell in the Cartesian grid. If this is not the case, e.g. there are five non-zero \(\lambda _k\) in the cluster or the cluster consists of three non-zero \(\lambda _k\) on a straight line, we do not perform the clustering. Instead, the problem should be solved again for a denser grid of transmitter candidates.
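
The merge rule in (25) amounts to summing the cluster's weights and placing the replacement transmitter at their weight-averaged position (a minimal sketch):

```python
import numpy as np

def merge_cluster(lams, positions):
    """Eq. (25): replace a cluster of non-zero weights by a single weight
    at the lambda-weighted centroid of the cluster."""
    lams = np.asarray(lams, float)
    positions = np.asarray(positions, float)
    lam_cl = lams.sum()
    r_cl = (lams[:, None] * positions).sum(axis=0) / lam_cl
    return lam_cl, r_cl

# Two neighboring grid candidates with weights 0.1 and 0.3 collapse to one
# transmitter of weight 0.4, placed three quarters of the way between them:
lam_cl, r_cl = merge_cluster([0.1, 0.3], [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
assert np.isclose(lam_cl, 0.4) and np.allclose(r_cl, [0.75, 0.0, 0.0])
```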

3.2.2 Evaluation of derivatives

To solve the optimization problems (21) and (24), the Fisher information matrix for a given receiver position and orientation \(\mathbf {p}_i \in {\varOmega _\mathrm {lin}}\) must be computed. Thus, the derivatives with respect to the two degrees of freedom given by

$$\begin{aligned} \left\{ \left[ m_x^\mathrm {r}, m_y^\mathrm {r}, m_z^\mathrm {r}\right] ^T \in \mathbb {R}^3 \mid (m_x^\mathrm {r})^2 + (m_y^\mathrm {r})^2 + (m_z^\mathrm {r})^2 = 1 \right\} \end{aligned}$$
(26)

are needed. However, the gradient \(\nabla _{(\mathbf {m}^\mathrm {r})}V\) in (7) includes the derivatives with respect to the Cartesian components of the receiver’s magnetic dipole moment \(\mathbf {m}^\mathrm {r}\). If all three of these components are included in the Fisher information matrix, the constraint in (26) makes the Fisher information matrix rank-deficient. We therefore introduce a local Cartesian coordinate system \(({\hat{\mathbf {u}}}_i,{\hat{\mathbf {v}}}_i,{\hat{\mathbf {w}}}_i)\) with \({\hat{\mathbf {w}}}_i = {\hat{\mathbf {m}}}^\mathrm {r}_i\) to express \(\nabla _{(\mathbf {m}^\mathrm {r}_i)}V\). The \(u_i\)- and \(v_i\)-components of \(\nabla _{(\mathbf {m}^\mathrm {r}_i)}V\) are then used in the computation of the Fisher information matrix. (The \(w_i\)-component of \(\nabla _{(\mathbf {m}^\mathrm {r}_i)}V\) is always zero because of the constraint in (26).) The cost functions that are exploited in this work are unaffected by a rotation of \({\hat{\mathbf {u}}}_i\) and \({\hat{\mathbf {v}}}_i\) around \({\hat{\mathbf {w}}}_i\) because determinants are invariant to rotations.
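
One way to build such a local frame is sketched below (any orthonormal pair \({\hat{\mathbf {u}}}_i, {\hat{\mathbf {v}}}_i\) perpendicular to \({\hat{\mathbf {m}}}^\mathrm {r}_i\) works, since the cost functions are rotation-invariant; the helper-vector choice here is one common construction, not prescribed by the text):

```python
import numpy as np

def tangent_basis(m_hat):
    """Orthonormal frame (u, v, w) with w along the receiver moment m_hat.
    Only the u- and v-components of grad_m V enter the Fisher matrix."""
    w = np.asarray(m_hat, float)
    w = w / np.linalg.norm(w)
    helper = (np.array([1.0, 0.0, 0.0]) if abs(w[0]) < 0.9
              else np.array([0.0, 1.0, 0.0]))
    u = np.cross(w, helper)
    u = u / np.linalg.norm(u)
    v = np.cross(w, u)
    return u, v, w

u, v, w = tangent_basis([0.0, 0.0, 1.0])
B = np.stack([u, v, w])
assert np.allclose(B @ B.T, np.eye(3))   # orthonormal
assert np.allclose(np.cross(u, v), w)    # right-handed
```

The Fisher matrix then uses the projections \({\hat{\mathbf {u}}}_i \cdot \nabla _{(\mathbf {m}^\mathrm {r}_i)}V\) and \({\hat{\mathbf {v}}}_i \cdot \nabla _{(\mathbf {m}^\mathrm {r}_i)}V\) in place of the three Cartesian moment derivatives.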

3.2.3 Solver

The relaxed average optimality problem in (21) and the relaxed minimax optimality problem in (24) are solved directly with the routine SNOPT (Gill et al. 2005) provided in the TOMLAB (Tomlab Optimization AB 2012) package of optimization algorithms. The SNOPT-routine is an implementation of the sequential quadratic programming algorithm. All gradients that are needed are computed analytically by SNOPT.

4 Results

Planar transmitter constellations have become increasingly popular, see for example (Iustin et al. 2008; Plotkin et al. 2010). In this work, we therefore consider only planar constellations of transmitters. More specifically, we consider constellations where all transmitters lie in the plane \(z=0\) with dipole moments oriented along the z-axis, i.e. \(z_k^\mathrm {t}= 0\) and \({\hat{\mathbf {m}}}_k^\mathrm {t}= {\hat{\mathbf {z}}}\) for all k. (It should be noted that the proposed method can handle any geometry of the transmitter constellation. Furthermore, transmitters with different orientations can also be considered with the method, should this be desired.) In particular, we consider two types of planar transmitter constellations based on (i) a Cartesian grid of transmitter candidates and (ii) a polar grid of transmitter candidates. These transmitter constellations are referred to as Cartesian arrays and polar arrays, respectively, in the following.

Examples of the two transmitter array types are shown in Fig. 1. The transmitters in a Cartesian array are placed on a Cartesian grid with \(|x_k^\mathrm {t}| \le x_\mathrm {max}\), \(|y_k^\mathrm {t}| \le y_\mathrm {max}\) and an inter-transmitter distance h in both the x- and y-directions. The transmitters in a polar array are placed on a polar grid with \(N_\mathrm {arms}\) transmitters per circle, \(0\le (x_k^\mathrm {t})^2 + (y_k^\mathrm {t})^2 \le r_\mathrm {max}^2\) and a radial distance h between neighboring circles. On each circle, the transmitters are placed at the polar angles \(\psi _l = l\frac{2\pi }{N_\mathrm {arms}}\) where \(l = 1,\ldots ,{N_\mathrm {arms}}\).
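A sketch of how such candidate grids could be generated is given below. The placement of a single candidate at the origin of the polar array is an assumption, chosen so that the candidate counts match those reported in the following subsections (e.g. 1321 polar candidates for \(r_\mathrm {max}=1.1\), \(h=0.0025\), \(N_\mathrm {arms}=3\)).

```python
import numpy as np

def cartesian_array(x_max, y_max, h):
    """Transmitter candidates on a Cartesian grid in the plane z = 0."""
    xs = np.arange(-x_max, x_max + h / 2, h)
    ys = np.arange(-y_max, y_max + h / 2, h)
    X, Y = np.meshgrid(xs, ys)
    return np.column_stack([X.ravel(), Y.ravel()])

def polar_array(r_max, h, n_arms):
    """Candidates on a polar grid: n_arms per circle, radial step h."""
    pts = [(0.0, 0.0)]  # assumed single candidate at the origin
    for r in np.arange(h, r_max + h / 2, h):
        for l in range(1, n_arms + 1):
            psi = l * 2 * np.pi / n_arms
            pts.append((r * np.cos(psi), r * np.sin(psi)))
    return np.array(pts)
```

With these definitions, `cartesian_array(1.44, 1.44, 0.09)` yields the 1089 candidates of Sect. 4.2 and `polar_array(1.1, 0.0025, 3)` the 1321 candidates of Sect. 4.1.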

Fig. 1

Cartesian transmitter array defined by \(x_\mathrm {max}\), \(y_\mathrm {max}\), and h (left). Polar transmitter array defined by \(r_\mathrm {max}\), h, and \(N_\mathrm {arms}\) (right). Transmitters are represented by circular markers and transmitter candidate array boundaries are indicated with dashed lines

Below, we compare the average and minimax cost functions in Sect. 4.1. Then, we study a realistic measurement scenario in Sect. 4.2. Finally, we investigate the impact of restrictions on the permissible size and position of the transmitter array (Sect. 4.3). Note that we use \(\sigma ^2 =1\) in the following tests without loss of generality.

All the results in this section are presented in terms of the continuous weights \(\varLambda \), which we find useful and informative in an engineering setting. The continuous weight \(\lambda _k\) can be interpreted directly as the fraction of measurements that are to be collected with transmitter candidate k, which is useful since \(\lambda _k\) does not depend explicitly on \(N^\mathrm {meas}\). In other words, the solution \(\lambda _k\) describes a variety of measurement systems that feature different values of \(N^\mathrm {meas}\) and possibly widely different hardware implementations. In addition, the continuous weights can serve as a good starting point for the combinatorial optimization problem (8), which may be approached in a number of ways depending on the application at hand and the computational resources available. For sufficiently large values of \(N^\mathrm {meas}\), the weights \(\lambda _k\) can be rounded to integer multiples of \(1/N^\mathrm {meas}\) without any significant change in the performance of the measurement system, i.e. the objective function in (8) is essentially unaltered in the real-world measurement situation. Should \(N^\mathrm {meas}\) not be sufficiently large for the application at hand, the approach presented by Joshi and Boyd (2009) can be used to pursue the solution of the combinatorial optimization problem in (8), with the relaxed solution as a starting guess. In this work, however, we focus on positioning of a mechanical system that is quasi-stationary on time scales many orders of magnitude larger than the time scale of the electrical system that performs the measurement, which implies that \(N^\mathrm {meas}\) is very large for all practical purposes.

4.1 Cost function comparison

In order to illustrate the differences between average and minimax optimality, we consider a simple problem with cylindrical symmetry. The measurement domain is given by

$$\begin{aligned} \varOmega _\mathbf {p}=\left\{ [x^\mathrm {r},y^\mathrm {r},z^\mathrm {r},{\hat{\mathbf {m}}}^\mathrm {r}]^{T} \mid x^\mathrm {r}=0, y^\mathrm {r}=0, z^\mathrm {r}\in [0.1, 1], {\hat{\mathbf {m}}}^\mathrm {r}= {\hat{\mathbf {z}}}\right\} \end{aligned}$$
(27)

where \({N_\mathrm {lin}}=1000\) linearization points for \(z^\mathrm {r}\in [0.1, 1]\) are exploited by the quadrature scheme described in the Quadrature appendix. We consider a polar array defined by \(r_\mathrm {max}=1.1\), \(h=0.0025\) and \(N_\mathrm {arms}=3\) with \(N^\mathrm {t}= 1321\) candidate transmitters. Next, motivated by the symmetry of the problem, we constrain the transmitters on each circle of constant radius in the array to have equal weights. (It should be noted that the symmetry is broken by the transmitter array itself. However, we obtain identical results for \(N_\mathrm {arms}=3,4,\ldots ,7\). Furthermore, the transmitters with non-zero weights form circles centered at the origin when the problem is solved with a Cartesian array of transmitter candidates and no additional constraints on the weights.)

Figure 2 shows the non-zero weights of the solutions to the relaxed average optimality problem (21) and the relaxed minimax optimality problem (24). For this measurement scenario, the clustering procedure described in Sect. 3.2.1 is modified such that all radially adjacent weights \(\lambda _k \ge \lambda _{\mathrm {th}}\) are clustered into one single weight \(\lambda _{k_{\mathrm {cl}}}\) placed at \(\mathbf {r}_{k_{\mathrm {cl}}}\) according to (25). The corresponding radii and total weights of the circles with non-zero weights are given in Table 1. The non-zero weights for minimax optimality are fewer and form a larger constellation than the non-zero weights for average optimality. Furthermore, optimizing for minimax optimality yields the same result as optimizing only for the sensor position furthest away from the transmitter plane, i.e. \(z^\mathrm {r}=1\), cf. Talcoth and Rylander (2013).

Fig. 2

Measurement allocations for average optimality (left) and minimax optimality (right). Clustered non-zero weights are represented by circular markers whose size is proportional to the weight \(\lambda _{k_{\mathrm {cl}}}\). Transmitter candidate array boundaries are indicated with dashed lines

Table 1 Radii and weights for clustered non-zero weights

Figure 3 shows the pointwise cost \(J_\mathrm {D}(\mathbf {p}_i, \varLambda ^*)\) as a function of \(z^\mathrm {r}\) by dashed and solid curves for the solutions to the average and minimax optimality problems, respectively. As can be seen in Fig. 3, the performance of the system degrades as the distance between the sensor and the transmitter plane increases. This can also be seen by combining (6), (7) and (11) with Sect. 3.2.2 to obtain

$$\begin{aligned} J_{\mathrm {D}} \propto 36\log R \end{aligned}$$
(28)

where the distances R to the sensor are assumed to scale in the same way for all contributing transmitters. This also explains why the optimum of the minimax optimality problem is identical to the optimum for pointwise D-optimality at \(z^\mathrm {r}=1\) (Talcoth and Rylander 2013), and why this constellation is larger than the clustered non-zero weights that correspond to average optimality.
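The scaling in (28) can be motivated by a heuristic count. This is a sketch, assuming that the measured signal scales as the dipole-dipole coupling and that the pointwise cost is the log-determinant of the inverse of the \(5\times 5\) Fisher information matrix (three position and two orientation parameters):

```latex
% Dipole-dipole coupling and its derivatives:
V \propto R^{-3}, \qquad
\partial V / \partial r_\alpha \propto R^{-4}, \qquad
\partial V / \partial m_\beta^{\mathrm{r}} \propto R^{-3}.
% Each position row and column of the 5x5 matrix M contributes R^{-4},
% each orientation row and column R^{-3}:
\det \mathbf{M} \propto R^{-2(3 \cdot 4 + 2 \cdot 3)} = R^{-36}
\quad \Longrightarrow \quad
J_{\mathrm{D}} = \log \det \mathbf{M}^{-1} \propto 36 \log R .
```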

Fig. 3

Pointwise cost \(J_\mathrm {D}(\mathbf {p}_i, \varLambda ^*)\) as a function of \(z^\mathrm {r}\) for the solutions to the average and minimax optimality problems

As can be seen from the curves in Fig. 3, optimizing for minimax optimality gives a slight improvement in worst-case performance as compared to optimizing for average optimality: \(\rho (\varLambda _{\text {Minimax}})/\rho (\varLambda _{\text {Average}}) \approx 0.81\) at \(z^\mathrm {r}= 1\). Here, \(\rho \) is the mean confidence-radius from (15). Further, \(\varLambda _{\text {Minimax}}\) and \(\varLambda _{\text {Average}}\) denote the measurement allocations optimized for minimax and average optimality, respectively, and their clustered non-zero weights are shown in Fig. 2. However, the improvement in worst-case performance comes at the expense of a large degradation in performance close to the transmitter plane, e.g. \(\rho (\varLambda _{\text {Minimax}})/\rho (\varLambda _{\text {Average}}) \approx 125\) at \(z^\mathrm {r}= 0.1\).

We also examine the impact of linearization point density by varying the number of linearization points in the measurement domain. For average optimality, at least 30 linearization points are needed to obtain the same constellation of clustered non-zero weights as described above. In contrast, only one linearization point at \(z^\mathrm {r}= 1\) is needed for minimax optimality because the worst performance is governed by the point furthest away from the transmitter plane.

4.2 A realistic measurement scenario

We investigate a realistic measurement scenario and quantify the improvement that measurement allocation optimization offers over an ad-hoc measurement allocation procedure.

The measurement domain is given by

$$\begin{aligned} \varOmega _\mathbf {p}= & {} \left\{ [x^\mathrm {r},y^\mathrm {r},z^\mathrm {r},{\hat{\mathbf {m}}}^\mathrm {r}]^{T} \mid x^\mathrm {r}\in [-0.25,0.25], y^\mathrm {r}\in [-0.25, 0.25],\right. \nonumber \\&\left. z^\mathrm {r}\in [0.5,1], {\hat{\mathbf {m}}}^\mathrm {r}\in \mathbb {S}^3\right\} . \end{aligned}$$
(29)

The quadrature scheme from the Quadrature appendix is exploited with [5, 5, 4] points in the x-, y-, and z-directions, respectively, and 77 points on half the unit sphere, which gives \({N_\mathrm {lin}}= 5 \cdot 5 \cdot 4 \cdot 77 = 7700\).

We solve the relaxed average optimality problem in (21) and the relaxed minimax optimality problem in (24) with a transmitter candidate array of Cartesian type defined by \(x_\mathrm {max}=1.44\), \(y_\mathrm {max}=1.44\), and \(h=0.09\) with \(N^\mathrm {t}=1089\) candidate transmitters. Thresholding and clustering are applied as described in Sect. 3.2.1. Furthermore, we introduce an ad-hoc measurement allocation procedure consisting of a Cartesian array defined by \(x_\mathrm {max}=1\), \(y_\mathrm {max}=1\), and \(h=0.5\) with 25 equally weighted transmitters, i.e. \(\lambda _k =1/25\) for all k. This ad-hoc measurement allocation constitutes a natural choice for collecting measurements, should an optimization algorithm for measurement allocation not be available.

Figure 4 shows the ad-hoc measurement allocation as well as the measurement allocations optimized for average and minimax optimality. The optimized measurement allocations are symmetric with respect to the x- and y-axes. Notice that these symmetries are not imposed during the solution of the optimization problems but follow from the symmetries of the problems themselves. Also notice that all three measurement allocations require 25 transmitters. Similar to the results in Sect. 4.1, minimax optimality requires measurements over a larger area than average optimality. This is a consequence of the scaling of the derivatives (6) and (7), and hence of the cost function (28), with respect to the distance R.

Fig. 4

Ad-hoc measurement allocation (left) and measurement allocations optimized for average optimality (middle) and minimax optimality (right). The clustered non-zero weights are represented by circular markers whose size is proportional to the weight \(\lambda _{k_{\mathrm {cl}}}\). Transmitter candidate array boundaries are indicated with dashed lines

Figure 5 shows \(J_\mathrm {D}(\mathbf {M}(\mathbf {p}_i),\varLambda ^*)\) for all linearization points \(\mathbf {p}_i \in {\varOmega _\mathrm {lin}}\) as a function of the linearization point index. Notice that these indices have been sorted individually for each case in non-decreasing order of the cost. All curves in Fig. 5 show four different levels that correspond to the different \(z^\mathrm {r}\)-values of the linearization points. Larger cost and, thus, worse performance is obtained for the most distant linearization points.

Fig. 5

Pointwise cost \(J_\mathrm {D}(\mathbf {M}(\mathbf {p}_i),\varLambda ^*)\) as a function of linearization point index i for the ad-hoc measurement allocation as well as for the measurement allocations optimized for average and minimax optimality. Note that the indices i are sorted individually for each case to yield non-decreasing curves

The cost function values for average optimality \(J_\mathrm {ELD}\) and minimax optimality \(J_\mathrm {MMLD}\) are given in Table 2 for the different measurement allocations. The best performance in terms of average optimality is shown by the measurement allocation optimized for average optimality, as expected. The mean confidence-radius of the minimax-optimal measurement allocation is 15% larger than the mean confidence-radius of the average-optimal measurement allocation. Similarly, the mean confidence-radius of the ad-hoc measurement allocation is 73% larger than the mean confidence-radius of the average-optimal measurement allocation. For minimax optimality, the increase in mean confidence-radius is 23% for the ad-hoc measurement allocation and 35% for the average-optimal measurement allocation as compared to the minimax-optimal measurement allocation. Thus, we have shown an example where measurement allocation optimization provides substantial improvement of a measurement system as compared to an ad-hoc measurement allocation procedure. In this example, the improvement is especially large when average optimality is considered.

Table 2 Cost function values for the ad-hoc measurement allocation and the measurement allocations optimized for average or minimax optimality

Next, we consider the possible effect of rounding the elements in the optimized measurement allocation vector \(\varLambda ^*\) to multiples of \(1/N^\mathrm {meas}\), using a simplified analysis. Given an optimized measurement allocation vector \(\varLambda ^*\), we consider the corresponding perturbed vector \(\tilde{\varLambda } = \xi \varLambda ^*\). Here, all weights \(\lambda _k\) are scaled by the multiplicative factor \(\xi = 1 + \delta \xi \), where \(\delta \xi \) is small in comparison to unity. For a perturbation \(\delta \xi \), the relative perturbation in the mean confidence-radius (15) is \(\tilde{\rho }(\beta ) / \rho (\beta ) = (1 + \delta \xi )^{-1/2} \approx 1 - \delta \xi / 2\). If \(\lambda _k N^\mathrm {meas}> 20\) for all non-zero weights \(\lambda _k\), a pessimistic estimate of the relative change in the mean confidence-radius is obtained by rounding all weights downwards to integer multiples of \(1/N^\mathrm {meas}\). If we assume that this worst-case rounding corresponds to roughly \(\delta \xi = -0.05\), the relative perturbation in the mean confidence-radius is \(\tilde{\rho }(\beta ) / \rho (\beta ) = (0.95)^{-1/2} \approx 1.026\), i.e. a degradation of about 2.5%. This degradation is small in relation to the improvement that the relaxed solution achieves over the ad-hoc measurement allocation. Should the combinatorial problem be solved, the degradation in mean confidence-radius is likely to be much smaller than 2.5%. Given the vast difference in time scales between the quasi-static object and the electrical system that performs the measurement, we find that such improvements are in many cases of minor importance, but they could be pursued by, e.g., the technique presented by Joshi and Boyd (2009).
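A short numerical sketch of this rounding analysis is given below; the weight vector and the value of \(N^\mathrm {meas}\) are hypothetical.

```python
import numpy as np

def round_down(lam, n_meas):
    """Pessimistic rounding: weights down to integer multiples of 1/n_meas."""
    return np.floor(lam * n_meas) / n_meas

lam = np.array([0.313, 0.268, 0.221, 0.118, 0.08])  # hypothetical weights, sum 1
rounded = round_down(lam, 200)       # N_meas = 200 measurements
xi = rounded.sum()                   # xi = 1 + d_xi; here 0.99, i.e. d_xi = -0.01
ratio = xi ** -0.5                   # relative change in mean confidence-radius
# ratio is approximately 1.005, i.e. about a 0.5% degradation
```

Rounding downwards discards at most one measurement per non-zero weight, so the degradation shrinks as \(\lambda _k N^\mathrm {meas}\) grows, consistent with the estimate above.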

4.3 Impact of restrictions on transmitter candidate array size and position

In some measurement situations, it may be impossible to perform measurements underneath the sensor, i.e. the receiver is located such that its orthogonal projection onto the transmitter plane is located outside the region occupied by transmitter candidates. Also, only a part of the transmitter plane may be available for measurements. We study this scenario by considering the measurement domain

$$\begin{aligned} \varOmega _\mathbf {p}= & {} \left\{ [x^\mathrm {r},y^\mathrm {r},z^\mathrm {r},{\hat{\mathbf {m}}}^\mathrm {r}]^{T} \mid x^\mathrm {r}=0, y^\mathrm {r}=dv_y, z^\mathrm {r}=1+dv_z,\right. \nonumber \\&\left. {\hat{\mathbf {m}}}^\mathrm {r}\in \mathbb {S}^3\right\} \end{aligned}$$
(30)

where d corresponds to the distance to the point \(\mathbf {r_\mathrm {center}^\mathrm {r}}=(0,0,1)\) above the center of the transmitter candidate array and \([v_y,v_z] =[3,1]/\sqrt{10}\). The relaxed minimax optimality problem in (24) is solved for different values of the distance d with \({N_\mathrm {lin}}=605\) linearization points on half of the unit sphere as described in the Quadrature appendix, and a Cartesian transmitter candidate array defined by \(x_\mathrm {max}=1.2\), \(y_\mathrm {max}=1.2\), and \(h=0.04\) with \(N^\mathrm {t}= 3721\) candidate transmitters. The thresholding and clustering procedure from Sect. 3.2.1 is exploited. The receiver is above the edge of the transmitter candidate array for \(d = d_\mathrm {edge} = y_\mathrm {max}/v_y \approx 1.26\). Furthermore, the receiver is above the transmitter candidate array for \(0 \le d < d_\mathrm {edge}\) and outside the transmitter candidate array for \(d > d_\mathrm {edge}\).

Figure 6 shows the cost function and some optimized measurement allocations for different values of d. For small d, the receiver is above the transmitter candidate array and close to \(\mathbf {r_\mathrm {center}^\mathrm {r}}\). For these receiver positions, non-zero weights are found in all parts of the transmitter candidate array without any effect of its limited size, and the performance of the measurement system is almost constant. For increased values of d, the receiver is located further away from \(\mathbf {r_\mathrm {center}^\mathrm {r}}\), either above or outside the transmitter candidate array. For receiver positions in this region, the limited size of the transmitter candidate array strongly influences the measurement allocation, and non-zero weights are primarily obtained for positive \(y^\mathrm {t}\)-coordinates. Thus, transmitter positions with a shorter distance to the receiver are preferred to those with a longer distance. The measurement system performance decreases moderately with increasing d above the transmitter candidate array and substantially outside it. In contrast, for receiver positions far outside the transmitter candidate array, corresponding to large values of d, measurements are also allocated to weights with a negative \(y^\mathrm {t}\)-coordinate far away from the receiver. This suggests that more information is gained by diversifying the measurements than is lost through the increased distance to the receiver.

For \(d>4\), the cost function scales approximately as \(49 \log d\) instead of the \(36 \log d\) indicated by (28). The increase in the distance scaling factor is likely due to the fact that measurements can only be allocated within a region that does not scale with d.

Fig. 6

Cost function for optimized measurement allocations as a function of d. Examples of optimized measurement allocations are shown as inlaid plots with clustered non-zero weights (circles) and transmitter candidate array boundaries (dashed rectangle). The size of the circular markers is proportional to the corresponding weights \(\lambda _{k_{\mathrm {cl}}}\)

5 Discussion

Many of the difficulties and approximations in this work are related to the model being non-linear in the parameters that we wish to estimate. For example, the Fisher information matrix approximates the confidence volume with an ellipsoid. If the model is linear in the parameters, the confidence volume is indeed an ellipsoid. However, our model is non-linear in the parameters, in which case the confidence volume can take other shapes and need not even form a connected set. Therefore, the mean confidence-radius, which is based on this approximation, should only be considered a qualitative metric.

Local designs are based on the assumption that the parameter values that we wish to estimate are known. However, if the parameters are known, there is no need to estimate them. In this work, we have addressed this issue by optimizing for a range of possible sensor positions and orientations, where we have considered minimax and average optimality. An alternative approach is to exploit so-called sequential designs, which update the measurement procedure based on data already collected. Plotkin and Paperno (2003) constructed a magnetic tracking system based on this idea (without using the design-of-experiments terminology), where a subset of the transmitters in an 8 by 8 transmitter array is activated as a function of the most recently estimated sensor position.

To solve the average and minimax optimality problems, we perform quadrature in the measurement domain at a finite set of linearization points. As shown by the results in Sect. 4.1, few linearization points are needed when optimizing for minimax optimality if they are positioned at the most distant part of the measurement domain, i.e. where the worst performance is obtained due to the considerable distance scaling of the cost function. In contrast, more linearization points are needed for average optimality.

The optimization method for measurement allocation presented in this work can also be useful in other situations. For example, the measurement allocation result could be exploited as a starting guess for an optimization method that considers integer variables, a more elaborate physical model, or the impact of non-linearities and the choice of estimation procedure. Moreover, the convex nature of the method is advantageous. In particular, it permits large scale problems to be addressed and extensive parameter studies to be performed.

We have limited this study to planar transmitter constellations with known transmitter orientations due to their importance in practice. However, the proposed method can handle any type of transmitter constellation geometry. That is, transmitter candidates can take any position and orientation. Thus, transmitter constellations that occupy curved surfaces, several disjoint surfaces, volumes, etc., can be handled.

We have studied the situation where \(N^\mathrm {meas}\) is large. Should a situation where \(N^\mathrm {meas}\) is not large be encountered, the proposed method could be exploited as a first step. The obtained weights would then have to be rounded to multiples of \(1/N^\mathrm {meas}\), which could for example be achieved by a local optimization procedure similar to the ones proposed by Joshi and Boyd (2009).

In this work, we have optimized measurement allocations to yield large changes in the measured signals for a change in the parameters that are to be estimated. During the optimization procedure, we have not considered in full the characteristics of the estimation problem that is obtained with the optimized measurement allocation; for example, whether the parameters can be uniquely determined everywhere in the measurement domain and whether the estimation problem has local minima. These questions are related to the concepts of identifiability and estimability, and the reader is referred to Pronzato and Pázman (2013) for further information.

6 Conclusion

Magnetic tracking is a popular technique that exploits static and low-frequency magnetic fields for positioning of quasi-stationary objects. In this work, we have proposed a method for optimizing the allocation of measurements given a large number of candidate transmitters of a generic magnetic tracking system that exploits time-division multiplexing. The sensor and the transmitters are modeled as magnetic dipoles in free space. Performance metrics based on the Fisher information matrix are exploited to quantify the worst-case performance (minimax optimality) and the expected performance with respect to a prior distribution of the sensor’s position and orientation (average optimality). Optimization problems with integer variables are formulated. By means of a convex relaxation, the integer variables are approximated with real variables and convex optimization problems are obtained. The proposed method is valid for all unbiased estimators and it avoids two commonly encountered problems, namely, high dimensionality and the presence of local minima that are not globally optimal.

The two performance metrics are compared for several realistic measurement scenarios where planar transmitter constellations are considered. Given the strong distance dependence of the measured signal, the worst-case performance is obtained in the most distant regions of the measurement domain. Consequently, measurement allocations optimized for minimax optimality require measurements over a larger area than measurement allocations optimized for average optimality.

The optimized measurement allocations that are the result of solving the convex optimization problems can be used directly or as a starting guess for the solution of more detailed optimization problems.

In conclusion, the proposed method works well for optimization of measurement allocation for magnetic tracking systems that exploit time-division multiplexing and it provides useful and informative designs of such experiments.