1 Introduction

Random samples from a normal distribution are widely used in scientific computing. In numerical simulations of statistical physics, such as plasma particle-in-cell simulations (Hockney and Eastwood 1981; Birdsall and Langdon 1985) and Monte Carlo simulations (Hammersley and Handscomb 1964), initial distributions of particles in both configuration and velocity spaces strongly affect final results. A set of samples with an exactly requested distribution has been commonly used for initial loading of particles (Byers and Grewal 1970), whose numerical method is known as the “quiet start” (Morse and Nielson 1971).

A distribution of particles in the velocity space should be as close as possible to a requested distribution, such as the Maxwell–Boltzmann distribution (i.e., the normal distribution or the Gaussian distribution). Script language environments, such as Python™, MATLAB® and its clones, have their own built-in functions that generate pseudo-random samples from a normal distribution. On the other hand, high-performance programming languages such as C and Fortran do not have their own intrinsic functions/subroutines that generate pseudo-random samples from a normal distribution, while C++ has a class “normal_distribution” since C++11. There is a numerical method that generates pseudo-random samples with a standard normal distribution based on the central limit theorem (e.g., Lindeberg 1922), which requires twelve source samples from a uniform distribution to generate one sample from a normal distribution. Pseudo-random samples from a normal distribution are also generated from pseudo-random samples from a uniform distribution by using numerical methods such as the rejection sampling (e.g., Kinderman and Monahan 1977; Marsaglia and Tsang 1984; Leva 1992; Marsaglia and Tsang 1998, 2000) and the inverse transform sampling (e.g., Box and Muller 1958). The rejection sampling generally needs a larger number of source samples than the desired number of output samples; therefore, it is not easy to parallelize.

The purpose of the present study is to examine numerical methods for generating pseudo-random samples from a normal distribution in multiple dimensions for high-performance computing languages such as C and Fortran. The present study focuses on the inverse transform sampling, which, unlike the rejection sampling, is a one-to-one conversion from a uniform distribution into an arbitrary desired distribution. A numerical method for generating samples from an exact uniform distribution is given in Sect. 2. New approximations of the inverse cumulative distribution function for a standard normal distribution in one dimension (i.e., the inverse error function) and in three dimensions are given in Sect. 3. Numerical methods for generating samples from standard normal distributions with these approximations are compared against the classic method with the Box–Muller transform (Box and Muller 1958) in Sect. 4. Finally, the conclusion is given in Sect. 5.

2 Source samples from uniform distributions

Unlike script language environments, high-performance programming languages such as C and Fortran do not have their own intrinsic functions or subroutines that generate pseudo-random samples from a normal distribution. Hence, samples from a normal distribution are generated with source samples from a uniform distribution. In this section, source samples from a uniform distribution are discussed firstly.

C has an intrinsic function “rand,” which returns a pseudo-random integer from a uniform distribution in the range between 0 and “RAND_MAX” (which depends on compilers). The period of the “rand” function is \(2^{32}-1\). Fortran 90 has an intrinsic subroutine “random_number,” which returns an array of pseudo-random floating-point values “r” (in either single- or double-precision) from a uniform distribution in the range of \(0 \le r < 1\). However, the period of the random number generator depends on compilers. For example, the period of random_number of Intel® oneAPI ver. 2023 (ifort) is approximately \(10^{18}(\approx 2^{60})\) (Intel 2022a). On the other hand, the period of random_number of the latest GNU Fortran (gfortran) is \(2^{256}-1\) (GCC 2022). Instead of these intrinsic functions/subroutines, the Mersenne Twister (MT19937) (Matsumoto and Nishimura 1998) has been commonly used in scientific computing, since the Mersenne Twister has a very long period of \(2^{19937}-1\). C++ has a class “mt19937” since C++11.

In the present study, another type of pseudo-random samples from an exact uniform distribution is discussed. Firstly, a set of “sequential” samples in the range of \(0 \le s < 1\) is generated from an exact uniform distribution by the following simple procedure,

$$\begin{aligned} s_{i} = \textrm{mod}\left( \frac{i}{N} + r, 1\right) , \end{aligned}$$
(1)

where N is the number of samples, \(0 \le r < 1\) is a random floating-point value as a seed, and \(i = 1,\cdots , N\) is an integer sequence. The modulo function “\(\textrm{mod}(a,m)\)” returns the remainder after division of a by m. Secondly, by shuffling the samples generated from Eq. (1), a set of pseudo-random samples from an exact uniform distribution is generated. The shuffling of a list is performed by using a set of random integers without duplication, which is known as a random permutation.

$$\begin{aligned} {\varvec{u}} \Leftarrow {\varvec{s}} \left\{ \textrm{randperm}\left( N \right) \right\} . \end{aligned}$$
(2)

In C and Fortran, the shuffling of a list is performed by using uniformly distributed pseudo-random numbers described above. Note that script language environments have their own built-in functions for random permutations. For example, MATLAB® has a built-in function “randperm(N),” which returns a random permutation of the integers between 1 and N (Mathworks 2023).
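The procedure of Eqs. (1)–(2) can be sketched in a few lines; Python is used here for brevity, although the target languages of the present study are C and Fortran. The function name and the use of Python's `random` module are illustrative choices, not part of the original method.

```python
import random

def exact_uniform_samples(n, seed_value=None):
    """Eqs. (1)-(2): n "sequential" samples i/N + r (mod 1) from an exact
    uniform distribution on [0, 1), followed by a random permutation."""
    rng = random.Random(seed_value)
    r = rng.random()                                    # seed, 0 <= r < 1
    s = [((i / n) + r) % 1.0 for i in range(1, n + 1)]  # Eq. (1)
    rng.shuffle(s)                                      # Eq. (2): shuffle
    return s

samples = exact_uniform_samples(1000, seed_value=42)
# Sorted, the samples return to an evenly spaced grid of spacing 1/N,
# which is what makes the distribution "exactly" uniform.
```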

In the uniformly distributed samples generated from Eq. (2), a specific value reappears only after a period longer than that of the seed pseudo-random samples. Note, however, that when the number of samples is \(N = 2^n\), the same value can appear in different streams.

3 Generation of normal distributions

3.1 Inverse transform sampling in one dimension

The inverse transform sampling method uses the inverse function of the cumulation (or prefix sum) of a desired distribution to transform a uniform distribution to the desired distribution. For the one-dimensional standard normal (Gaussian or Maxwell–Boltzmann) distribution, the cumulative distribution is given as

$$\begin{aligned} F(x) = \int _{-\infty }^x \frac{1}{\sqrt{2\pi }}\exp \left( -\frac{t^2}{2}\right) \textrm{d}t = \frac{1}{2}\textrm{erf}\left( \frac{x}{\sqrt{2}}\right) +\frac{1}{2}, \end{aligned}$$
(3)

where \(\textrm{erf}(x)\) is the error function, which is intrinsic (“erf”) in C/C++ and Fortran (since Fortran 2008). The inverse cumulative distribution is given as

$$\begin{aligned} F^{-1}(u) = x = \sqrt{2} \textrm{erf}^{-1} \left( 2u-1\right) . \end{aligned}$$
(4)

This equation performs the one-to-one transform of a set of samples \(0\le u<1\) from a uniform distribution to a set of samples x from a standard normal distribution.
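In Python, the transform of Eq. (4) is available directly through the standard library: `statistics.NormalDist.inv_cdf` (Python 3.8+) evaluates the standard normal quantile \(F^{-1}(u)\), which equals \(\sqrt{2}\,\textrm{erf}^{-1}(2u-1)\). A minimal sketch:

```python
from statistics import NormalDist

def sample_normal_1d(uniform_samples):
    """Eq. (4): one-to-one transform of uniform samples 0 < u < 1 into
    standard normal samples, F^{-1}(u) = sqrt(2) * erfinv(2u - 1).
    NormalDist().inv_cdf evaluates exactly this quantile function."""
    quantile = NormalDist().inv_cdf
    return [quantile(u) for u in uniform_samples]

x = sample_normal_1d([0.025, 0.5, 0.975])
# The median (u = 0.5) maps to 0; u = 0.025 and 0.975 map to about -+1.96.
```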

The inverse transform sampling in one dimension in Eq. (4), however, is not commonly used, because there has been no fast and freely available numerical library for the inverse error function. Numerical libraries of recent commercial compilers, e.g., Intel® oneAPI Math Kernel Library (oneMKL) (Intel 2022b) and NVIDIA® CUDA® API (NVIDIA 2023), have a built-in function “erfinv,” while the GNU compiler does not. Note that MATLAB® also has a built-in function “erfinv” (Mathworks 2023).

Instead of the numerical libraries of commercial compilers, an approximation of the inverse error function is used in the present study. The inverse error function has been approximated by several methods. Classic approximations are obtained by the Taylor series expansion (Philip 1960; Carlitz 1963) and the Chebyshev series expansion (Blair et al. 1976). However, the numerical accuracy of these approximations depends on the number of terms retained.

In the present study, an invertible approximation of the error function (e.g., Winitzki 2008; Soranzo and Epure 2012a, b, 2014) is examined. Note that there are a variety of approximations for the error function \(\textrm{erf}(x)\), as listed in the following references (Soranzo and Epure 2014; Yerukala and Boiroju 2015; Matić et al. 2018; Eidous and Abu-Shareefa 2020, and the references therein), but most of them are not invertible. In the present study, an accurate and invertible approximation of the error function obtained by Soranzo and Epure (2012a) is used,

$$\begin{aligned} \textrm{erf}(x) \approx \sqrt{1-\exp \left( -\frac{ax^2+bx^4}{1+cx^2+dx^4}\right) } \end{aligned}$$
(5)

the inverse of which results in

$$\begin{aligned}{} & {} \textrm{erf}^{-1}(x) \approx \sqrt{\frac{ \sqrt{\left\{ c\log (1-x^2)+a\right\} ^2-4\left\{ d\log (1-x^2)+b\right\} \log (1-x^2)} -\left\{ c\log (1-x^2)+a\right\} }{2\left\{ d\log (1-x^2)+b\right\} }} \end{aligned}$$
(6)

for \(x\ge 0\). In the previous studies, the parameters a, b, c and d are assumed to be constant so that the approximation remains invertible (Winitzki 2008; Soranzo and Epure 2012a, b).

In the present study, the parameters \(a=4/\pi\) and \(b=0.14\) are assumed to be constant, identical to Winitzki (2008). On the other hand, the parameters c and d are assumed to be functions of \(x(\ge 0)\) in order to reduce numerical errors relative to the built-in inverse error function “erfinv” in MATLAB® for \(0.7<x<1\). The argument x is separated into three intervals, \(0 \le x < 0.72\), \(0.72 \le x < 0.94\), and \(0.94 \le x < 1\). A brute-force search is performed to find the pair of parameters c and d that minimizes the root mean square of the numerical errors in each interval. The parameters c and d are then given as follows,

$$\begin{aligned} c(x)= & {} c_1+0.5(c_2-c_1)\left\{ 1+\tanh \left( \frac{x-0.72}{0.01}\right) \right\} +0.5(c_3-c_2)\left\{ 1+\tanh \left( \frac{x-0.94}{0.01}\right) \right\} \\ d(x)= & {} d_1+0.5(d_2-d_1)\left\{ 1+\tanh \left( \frac{x-0.72}{0.01}\right) \right\} +0.5(d_3-d_2)\left\{ 1+\tanh \left( \frac{x-0.94}{0.01}\right) \right\} \end{aligned}$$

with \(c_1=0.14\), \(c_2=0.1404\), \(c_3=0.1415\), \(d_1=0.00145\), \(d_2=0.0008\), and \(d_3=0.0002\).
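A sketch of the resulting approximation of \(\textrm{erf}^{-1}\) in Eq. (6) is given below in Python. The tanh blending of c and d is written as a smoothed step, \(0.5\{1+\tanh (\cdot )\}\), so that c and d reduce to \((c_1,d_1)\), \((c_2,d_2)\), and \((c_3,d_3)\) on the three intervals; this reading of the blending formula, and all function names, are our own.

```python
import math

# Constants: a = 4/pi and b = 0.14 (Winitzki 2008); c and d switch between
# (c1, d1), (c2, d2), (c3, d3) on the three intervals of the argument.
A, B = 4.0 / math.pi, 0.14
C1, C2, C3 = 0.14, 0.1404, 0.1415
D1, D2, D3 = 0.00145, 0.0008, 0.0002

def _blend(x, p1, p2, p3):
    """Smoothed-step blending between the three interval values."""
    step1 = 0.5 * (1.0 + math.tanh((x - 0.72) / 0.01))
    step2 = 0.5 * (1.0 + math.tanh((x - 0.94) / 0.01))
    return p1 + (p2 - p1) * step1 + (p3 - p2) * step2

def erfinv_approx(x):
    """Eq. (6): approximate erfinv for 0 <= x < 1 by inverting Eq. (5)."""
    if x == 0.0:
        return 0.0
    c = _blend(x, C1, C2, C3)
    d = _blend(x, D1, D2, D3)
    L = math.log(1.0 - x * x)
    p, q = c * L + A, d * L + B
    t = (math.sqrt(p * p - 4.0 * q * L) - p) / (2.0 * q)  # t = erfinv(x)^2
    return math.sqrt(t)
```

Applying the built-in `math.erf` to the result provides a simple round-trip check of the accuracy.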

Fig. 1
figure 1

Comparison between the built-in inverse error function “erfinv” in MATLAB® and the approximation from Eq. (6). The thick line shows “erfinv” and the solid line shows the approximation from Eq. (6). The two dashed lines show the Taylor series expansion from Eq. (7) with \(M=20\) and \(M=50\), where M denotes the number of Taylor series terms. b The profiles of the function/approximations for \(0 \le x \le 1\), a an enlarged view for \(0.9 \le x \le 1\), and c the relative error \(\eta\)

Figure 1 shows a comparison between erfinv in MATLAB® and the approximation from Eq. (6). As a reference, the Taylor series expansion of the inverse error function with the number of Taylor series terms \(M=20\) and \(M=50\) is superimposed as well, where

$$\begin{aligned} \textrm{erf}^{-1}(x) \approx \sum _{m=0}^M \frac{h_m}{2m+1} \left( \frac{\sqrt{\pi }}{2}x\right) ^{2m+1} \end{aligned}$$
(7)

with (Carlitz 1963)

$$\begin{aligned} h_m = \sum _{k=0}^{m-1}\frac{h_k h_{m-1-k}}{(k+1)(2k+1)}, \ \ \ h_0=1. \end{aligned}$$

Panel (b) shows the profiles of the function/approximations for \(0 \le x \le 1\), and panel (a) shows an enlarged view for \(0.9 \le x \le 1\). Panel (c) shows the relative error, \(\eta = |(f-f_\textrm{approx}) /{f} |\). The relative error of the Taylor series expansion is around the machine epsilon of double-precision floating point for small x. However, the relative error exceeds \(10^{-2}\) at \(x\approx 0.96\) with \(M=20\) and at \(x\approx 0.98\) with \(M=50\). This result suggests that much higher-degree terms are necessary to approximate the inverse error function near \(x=1\) by the Taylor series expansion. The relative error of the new approximation is less than \(10^{-4}\) for \(x<0.9937\). It should be noted, however, that the new approximation is not invertible, since c and d are functions of x.
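For reference, the Taylor series of Eq. (7) with the recurrence for \(h_m\) can be sketched as follows; the series argument \((\sqrt{\pi }/2)x\) follows the conventional Maclaurin expansion of the inverse error function.

```python
import math

def erfinv_taylor(x, M=20):
    """Eq. (7): Maclaurin series of the inverse error function, with the
    coefficients h_m built from the recurrence of Carlitz (1963)."""
    h = [1.0]                                   # h_0 = 1
    for m in range(1, M + 1):
        h.append(sum(h[k] * h[m - 1 - k] / ((k + 1) * (2 * k + 1))
                     for k in range(m)))
    z = 0.5 * math.sqrt(math.pi) * x            # series argument
    return sum(h[m] / (2 * m + 1) * z ** (2 * m + 1) for m in range(M + 1))
```

As in Fig. 1, the truncated series is accurate for small x but requires many more terms as x approaches 1.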

3.2 Inverse transform sampling in two dimensions

For the two-dimensional standard normal distribution, the cumulative distribution is given by using the transform from the rectangular coordinate to the polar coordinate as

$$\begin{aligned} F(x,y)= & {} \int _{-\infty }^x \int _{-\infty }^y \frac{1}{2\pi }\exp \left( -\frac{t_1^2+t_2^2}{2}\right) \textrm{d}t_1\textrm{d}t_2 \nonumber \\= & {} \int _{0}^\chi \int _{0}^{\zeta } \frac{r}{2\pi }\exp \left( -\frac{r^2}{2}\right) \textrm{d}r\textrm{d}\phi \nonumber \\= & {} \frac{\zeta }{2\pi } \left\{ 1-\exp \left( -\frac{\chi ^2}{2}\right) \right\} \end{aligned}$$
(8)

with \(\chi ^2 = x^2+y^2\) and \(\zeta = \tan ^{-1}\left( y/x\right)\). Hence, the inverse cumulative distribution is given as

$$\begin{aligned} F^{-1}(u_1,u_2) = (x,y) = \left( \sqrt{-2\log (u_1)}\cos (2\pi u_2), \sqrt{-2\log (u_1)}\sin (2\pi u_2)\right) . \end{aligned}$$
(9)

This procedure is well known as the Box–Muller transform (Box and Muller 1958), which transforms two samples \(0\le (u_1,u_2)<1\) from a two-dimensional uniform distribution into two samples (x, y) from a two-dimensional standard normal distribution in a one-to-one manner.
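A minimal sketch of the Box–Muller transform of Eq. (9) follows; mapping the uniform sample into \((0,1]\) via \(1-u\) to avoid \(\log (0)\) is an implementation detail, not part of the original formula.

```python
import math
import random

def box_muller(u1, u2):
    """Eq. (9): transform one pair of uniform samples into one pair of
    standard normal samples."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return (radius * math.cos(angle), radius * math.sin(angle))

rng = random.Random(1)
# 1 - rng.random() lies in (0, 1], which keeps log(u1) finite.
pairs = [box_muller(1.0 - rng.random(), rng.random()) for _ in range(50000)]
xs = [v for pair in pairs for v in pair]
```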

Note that samples from a two-dimensional standard normal distribution are generated as well by replacing \((u_1,u_2)\) with another set of samples from a two-dimensional uniform distribution \(-1\le (v_1,v_2)<1\), where

$$\begin{aligned} u_1=v_1^2+v_2^2, \ \ \ \cos (2\pi u_2) = \frac{v_1}{\sqrt{v_1^2+v_2^2}}, \ \ \ \sin (2\pi u_2) = \frac{v_2}{\sqrt{v_1^2+v_2^2}} \end{aligned}$$

with samples satisfying \(v_1^2+v_2^2 > 1\) rejected, which is known as the Marsaglia polar method (Marsaglia and Bray 1964). However, rejection-based methods are not the focus of the present study.
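Although rejection-based methods are outside the scope of the present study, the Marsaglia polar method described above can be sketched as follows for comparison (the loop structure and function name are illustrative choices):

```python
import math
import random

def marsaglia_polar(rng):
    """Marsaglia polar method: draw (v1, v2) uniformly on [-1, 1)^2,
    reject points outside the unit disk, and return one pair of
    standard normal samples."""
    while True:
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        s = v1 * v1 + v2 * v2
        if 0.0 < s < 1.0:                     # rejection step
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return (v1 * factor, v2 * factor)

rng = random.Random(3)
xs = [v for _ in range(50000) for v in marsaglia_polar(rng)]
```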

The Box–Muller transform (Box and Muller 1958) has been widely used, because almost all programming languages have intrinsic (or built-in) functions “sqrt,” “cos,” “sin,” and “log,” unlike “erfinv.” As a drawback, however, four sets of uniformly distributed samples are needed for generating three sets of normally distributed samples in three dimensions, because the Box–Muller transform is not able to generate one set of normally distributed samples from one set of uniformly distributed samples.

3.3 Inverse transform sampling in three dimensions

For the three-dimensional standard normal distribution, the cumulative distribution is given by using the transform from the rectangular coordinate to the spherical coordinate as

$$\begin{aligned} F(x,y,z)= & {} \int _{-\infty }^x \int _{-\infty }^y \int _{-\infty }^z \frac{1}{\sqrt{2\pi }^3}\exp \left( -\frac{t_1^2+t_2^2+t_3^2}{2}\right) \textrm{d}t_1\textrm{d}t_2 \textrm{d}t_3 \nonumber \\= & {} \int _{0}^\chi \int _{0}^{\upsilon } \int _{0}^{\zeta } \frac{r^2}{\sqrt{2\pi }^3}\exp \left( -\frac{r^2}{2}\right) \sin \theta \textrm{d}r\textrm{d}\theta \textrm{d}\phi \nonumber \\= & {} \frac{\zeta }{\sqrt{2\pi }^3} \left\{ \sqrt{\frac{\pi }{2}} \textrm{erf} \left( \frac{\chi }{\sqrt{2}}\right) - \chi \exp \left( -\frac{\chi ^2}{2}\right) \right\} (1-\cos \upsilon ) \end{aligned}$$
(10)

with \(\chi ^2 = x^2+y^2+z^2\), \(\upsilon = \cos ^{-1}\left( z/\sqrt{x^2+y^2+z^2} \right)\), and \(\zeta = \cos ^{-1}\left( x/\sqrt{x^2+y^2}\right)\). There is no analytic inverse function of the \(\chi\) component of the cumulative distribution g(x),

$$\begin{aligned} g(x) \equiv \textrm{erf}\left( \frac{x}{\sqrt{2}}\right) -\sqrt{\frac{2}{\pi }}x\exp \left( -\frac{x^2}{2}\right) . \end{aligned}$$
(11)

In the present study, the cumulative distribution g(x) is approximated by the following function that is explicitly invertible,

$$\begin{aligned} g(x) \approx \left\{ 1-\exp \left( -\frac{ax^2+bx^4}{1+cx^2+dx^4}\right) \right\} ^{\frac{3}{2}}. \end{aligned}$$
(12)
Fig. 2
figure 2

Comparison between g(x) and the approximation from Eq (12). The thick line shows g(x) and the solid line shows the approximation from Eq. (12). a The profiles of the function/approximations for \(0 \le x \le 6\), b the relative error \(\eta\), and c the absolute error \(\varepsilon\)

Figure 2 shows a comparison between g(x) and the approximation from Eq. (12) with \(a=0.4129\), \(b=0.0823\), \(c=0.1906\) and \(d=-0.000925\), which are obtained by a brute-force search to minimize the root mean square of numerical errors from Eq. (11). Panel (a) shows the profiles of the function/approximation for \(0 \le x \le 6\). Panel (b) shows the corresponding relative error. Panel (c) shows the corresponding absolute error. The relative error of the approximation in Eq. (12) is less than \(10^{-2}\), and the absolute error of the approximation in Eq. (12) is less than \(10^{-4}\).

With the constants a, b, c and d, the inverse function of g(x) is then given as follows,

$$\begin{aligned} g^{-1}(x) \approx \sqrt{\frac{ \sqrt{\left\{ c\log (1-x^\frac{2}{3})+a\right\} ^2-4\left\{ d\log (1-x^\frac{2}{3})+b\right\} \log (1-x^\frac{2}{3})} -\left\{ c\log (1-x^\frac{2}{3})+a\right\} }{2\left\{ d\log (1-x^\frac{2}{3})+b\right\} }}. \end{aligned}$$
(13)

Hence, the inverse cumulative distribution of the three-dimensional standard normal distribution is given as

$$\begin{aligned} F^{-1}(u_1,u_2,u_3) = (x,y,z) = \left( \begin{array}{l} g^{-1}(u_1) \sin \left\{ \cos ^{-1}(2u_2-1) \right\} \cos (2\pi u_3) \\ g^{-1}(u_1) \sin \left\{ \cos ^{-1}(2u_2-1) \right\} \sin (2\pi u_3) \\ g^{-1}(u_1) \cos \left\{ \cos ^{-1}(2u_2-1) \right\} \\ \end{array} \right) ^T. \end{aligned}$$
(14)

Note that \(\cos \left\{ \cos ^{-1}(2u_2-1) \right\} = 2u_2-1\) and \(\sin \left\{ \cos ^{-1}(2u_2-1) \right\} = \sqrt{1-(2u_2-1)^2} = 2\sqrt{u_2(1-u_2)}\).
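The three-dimensional method of Eqs. (12)–(14) can be sketched as follows; the constant names and the function decomposition are illustrative. Because Eq. (13) is the exact analytic inverse of Eq. (12), the pair round-trips to floating-point precision, while the approximation itself matches g(x) of Eq. (11) to within the stated absolute error.

```python
import math

# Constants from the brute-force fit below Eq. (12).
GA, GB, GC, GD = 0.4129, 0.0823, 0.1906, -0.000925

def g_approx(x):
    """Eq. (12): invertible approximation of the radial cumulative
    distribution g(x) of the 3D standard normal distribution."""
    t = x * x
    return (1.0 - math.exp(-(GA * t + GB * t * t) /
                           (1.0 + GC * t + GD * t * t))) ** 1.5

def g_inverse(u):
    """Eq. (13): analytic inverse of the approximation g_approx."""
    if u == 0.0:
        return 0.0
    L = math.log(1.0 - u ** (2.0 / 3.0))
    p, q = GC * L + GA, GD * L + GB
    t = (math.sqrt(p * p - 4.0 * q * L) - p) / (2.0 * q)
    return math.sqrt(t)

def sample_normal_3d(u1, u2, u3):
    """Eq. (14): transform three uniform samples into one sample
    (x, y, z) from the 3D standard normal distribution."""
    radius = g_inverse(u1)
    cos_theta = 2.0 * u2 - 1.0
    sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
    return (radius * sin_theta * math.cos(2.0 * math.pi * u3),
            radius * sin_theta * math.sin(2.0 * math.pi * u3),
            radius * cos_theta)
```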

4 Numerical tests

In the present numerical tests, four sets of source samples are generated with two methods. In the first method, four uniform distributions, \(({\varvec{u}}_1,{\varvec{u}}_2,{\varvec{u}}_3,{\varvec{u}}_4)\), are generated from Eq. (1) and then shuffled by Eq. (2). In the second method, pseudo-random numbers, \(({\varvec{r}}_1,{\varvec{r}}_2,{\varvec{r}}_3,{\varvec{r}}_4)\), are generated with MT19937 (Matsumoto and Nishimura 1998). Then, by using these source samples, three sets of random samples from standard normal distributions are generated with the one-, two-, and three-dimensional inverse transform sampling methods shown in the previous section. Note that \({\varvec{u}}_4\) and \({\varvec{r}}_4\) are used for the two-dimensional inverse transform sampling (i.e., the Box–Muller transform) only.

Fig. 3
figure 3

Histograms of standard normal distributions with \(N_{sample}=10,000\) samples and \(N_{bin}=100\) bins with different source distributions and different inverse transform samplings. The solid curves correspond to the 1D standard normal distribution in Eq. (16). The samples “\({\varvec{h}}^{(1)}_{u,1}\)” are generated by the 1D inverse transform sampling in Eq. (4) with the source samples \({\varvec{u}}_1\). The samples “\({\varvec{h}}^{(2)}_{u,1}\)” and “\({\varvec{h}}^{(2)}_{u,2}\)” are generated by the 2D inverse transform sampling (i.e., the Box–Muller transform) in Eq. (9) with the source samples \({\varvec{u}}_1\) and \({\varvec{u}}_2\). The samples “\({\varvec{h}}^{(3)}_{u,1}\),” “\({\varvec{h}}^{(3)}_{u,2}\)” and “\({\varvec{h}}^{(3)}_{u,3}\)” are generated by the 3D inverse transform sampling in Eq. (14) with the source samples \({\varvec{u}}_1\), \({\varvec{u}}_2\) and \({\varvec{u}}_3\). The samples “\({\varvec{h}}_{r}\)” are generated with the source samples \({\varvec{r}}\)

Figure 3 shows an example of 1D standard normal distributions with \(N_{sample}=10,000\) samples. The bars show the histogram of the samples with 100 bins. The samples “\({\varvec{h}}^{(1)}_{u,1}\)” and “\({\varvec{h}}^{(1)}_{r,1}\)” are generated by the 1D inverse transform sampling in Eq. (4) with the source samples \({\varvec{u}}_1\) and \({\varvec{r}}_1\), respectively. The samples “\({\varvec{h}}^{(2)}_{u,1}\)” and “\({\varvec{h}}^{(2)}_{u,2}\)” are generated by the 2D inverse transform sampling (i.e., the Box–Muller transform) in Eq. (9) with the source samples \({\varvec{u}}_1\) and \({\varvec{u}}_2\). The samples “\({\varvec{h}}^{(2)}_{r,1}\)” and “\({\varvec{h}}^{(2)}_{r,2}\)” are generated by the 2D inverse transform sampling with the source samples \({\varvec{r}}_1\) and \({\varvec{r}}_2\). The samples “\({\varvec{h}}^{(3)}_{u,1}\),” “\({\varvec{h}}^{(3)}_{u,2}\)” and “\({\varvec{h}}^{(3)}_{u,3}\)” are generated by the 3D inverse transform sampling in Eq. (14) with the source samples \({\varvec{u}}_1\), \({\varvec{u}}_2\) and \({\varvec{u}}_3\). The samples “\({\varvec{h}}^{(3)}_{r,1}\),” “\({\varvec{h}}^{(3)}_{r,2}\)”and “\({\varvec{h}}^{(3)}_{r,3}\)” are generated by the 3D inverse transform sampling with the source samples \({\varvec{r}}_1\), \({\varvec{r}}_2\) and \({\varvec{r}}_3\).

The 1D histograms (\(h_i\)) of these distributions are evaluated by the standard deviation from the 1D standard normal distribution f(x),

$$\begin{aligned} \sigma = \sqrt{ \frac{1}{N_{bin}} \sum _{i=1}^{N_{bin}} \left\{ h_i - f(x_i) \right\} ^2 }, \end{aligned}$$
(15)

where \(N_{bin}\) denotes the number of bins for the histogram with

$$\begin{aligned} f(x) = \frac{N_{sample}}{\sqrt{2\pi }} \exp \left( -\frac{x^2}{2}\right) . \end{aligned}$$
(16)
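The evaluation of Eqs. (15)–(16) can be sketched as follows. The histogram counts are divided by the bin width so that \(h_i\) is directly comparable with \(f(x_i)\); this normalization, together with the histogram range and the use of Python's `random.gauss` as a stand-in sample generator, are our assumptions.

```python
import math
import random

def normal_deviation(samples, n_bin=100, x_range=(-5.0, 5.0)):
    """Eqs. (15)-(16): standard deviation of a width-normalized histogram
    from the reference profile f(x) = N/sqrt(2*pi) * exp(-x^2/2)."""
    n_sample = len(samples)
    lo, hi = x_range
    dx = (hi - lo) / n_bin
    counts = [0] * n_bin
    for v in samples:
        i = math.floor((v - lo) / dx)       # bin index; out-of-range dropped
        if 0 <= i < n_bin:
            counts[i] += 1
    total = 0.0
    for i in range(n_bin):
        x_mid = lo + (i + 0.5) * dx
        f = n_sample / math.sqrt(2.0 * math.pi) * math.exp(-0.5 * x_mid * x_mid)
        h = counts[i] / dx                  # count normalized by bin width
        total += (h - f) ** 2
    return math.sqrt(total / n_bin)         # Eq. (15)

rng = random.Random(7)
samples = [rng.gauss(0.0, 1.0) for _ in range(100000)]
sigma = normal_deviation(samples)
```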
Fig. 4
figure 4

Normalized deviation from the 1D standard normal distribution \(\sigma /N_{spb}\) as a function of the number of samples per bin \(N_{spb}\). Samples are generated with the source samples a \({\varvec{r}}\) and b \({\varvec{u}}\), respectively. The circle marks show the deviation of samples generated by the 1D inverse transform sampling in Eq. (4). The triangle marks show the deviation of samples generated by the 2D inverse transform sampling in Eq. (9). The square marks show the deviation of samples generated by the 3D inverse transform sampling in Eq. (14). The error bars show the maximum and minimum deviations

Figure 4 shows the ratio of deviation (\(\sigma /N_{spb}\)) of each sample as a function of \(N_{spb} \equiv N_{sample}/N_{bin}\). In panel (a), samples are generated with the source samples \({\varvec{r}}\), i.e., MT19937 (Matsumoto and Nishimura 1998). In panel (b), samples are generated with the source samples \({\varvec{u}}\), i.e., random permutations from uniform distributions. The circle marks show the deviation of samples generated by the 1D inverse transform sampling in Eq. (4). The triangle marks show the deviation of samples generated by the 2D inverse transform sampling (i.e., the Box–Muller transform) in Eq. (9). The square marks show the deviation of samples generated by the 3D inverse transform sampling in Eq. (14). The error bars show the maximum and minimum deviations.

It is clearly shown that the 1D inverse transform sampling with source samples of random permutations provides samples from the 1D standard normal distribution with a lower deviation. There is no difference in the deviation between the 2D and 3D inverse transform samplings with source samples of random permutations. The deviations of samples generated by the 1D/2D/3D inverse transform samplings with source samples of MT19937 are the same as those by the 2D/3D inverse transform samplings with source samples of random permutations. The deviation of samples from the 1D standard normal distribution decreases with the 0.5th power of the number of samples per bin.

The 2D histograms (\(h_{i,j}\)) of these distributions are then evaluated by the standard deviation from the 2D standard normal distribution f(xy) as well,

$$\begin{aligned} \sigma = \sqrt{ \frac{1}{{N_{bin}^2}} \sum _{i=1}^{N_{bin}} {\sum _{j=1}^{N_{bin}}} \left\{ h_{i,j} - f(x_i,y_j) \right\} ^2 }, \end{aligned}$$
(17)

with

$$\begin{aligned} f(x,y) = \frac{N_{sample}}{2\pi } \exp \left( -\frac{x^2+y^2}{2}\right) . \end{aligned}$$
(18)
Fig. 5
figure 5

Normalized deviation from the 2D standard normal distribution \(\sigma /N_{spb}\) as a function of the number of samples per bin \(N_{spb}\). Samples are generated with the source samples a \({\varvec{r}}\) and b \({\varvec{u}}\), respectively. The circle marks show the deviation of samples generated by the 1D inverse transform sampling in Eq. (4). The triangle marks show the deviation of samples generated by the 2D inverse transform sampling in Eq. (9). The square marks show the deviation of samples generated by the 3D inverse transform sampling in Eq. (14). The error bars show the maximum and minimum deviations

Figure 5 shows the ratio of deviation (\(\sigma /N_{spb}\)) of each sample as a function of \(N_{spb} \equiv N_{sample}/N_{bin}^2\) with the same format as in Fig. 4. The deviation of samples from the 2D standard normal distribution decreases with the 0.5th power of the number of samples per bin. The deviations of samples generated by the 1D/2D/3D inverse transform samplings with source samples of MT19937 are the same as those by the 1D/2D/3D inverse transform samplings with source samples of random permutations.

The 3D histograms (\(h_{i,j,k}\)) of these distributions are also evaluated by the standard deviation from the 3D standard normal distribution f(xyz),

$$\begin{aligned} \sigma = \sqrt{ \frac{1}{N_{bin}^3} \sum _{i=1}^{N_{bin}} \sum _{j=1}^{N_{bin}} \sum _{k=1}^{N_{bin}} \left\{ h_{i,j,k} - f(x_i,y_j,z_k) \right\} ^2 }, \end{aligned}$$
(19)

with

$$\begin{aligned} f(x,y,z) = \frac{N_{sample}}{\sqrt{2\pi }^3} \exp \left( -\frac{x^2+y^2+z^2}{2}\right) . \end{aligned}$$
(20)

Figure 6 shows the ratio of deviation (\(\sigma /N_{spb}\)) of each sample as a function of \(N_{spb} \equiv N_{sample}/N_{bin}^3\) with the same format as in Fig. 4. The results of the 3D histograms are comparable to those of the 2D histograms.

Fig. 6
figure 6

Normalized deviation from the 3D standard normal distribution \(\sigma /N_{spb}\) as a function of the number of samples per bin \(N_{spb}\). Samples are generated with the source samples a \({\varvec{r}}\) and b \({\varvec{u}}\), respectively. The circle marks show the deviation of samples generated by the 1D inverse transform sampling in Eq. (4). The triangle marks show the deviation of samples generated by the 2D inverse transform sampling in Eq. (9). The square marks show the deviation of samples generated by the 3D inverse transform sampling in Eq. (14). The error bars show the maximum and minimum deviations

The computing costs of the random number generators are measured on a single compute core of an Intel® Xeon® Gold 6342 processor with Intel® oneAPI ver. 2021.5.0. It takes 1.84 s to generate \(N=10^8\) samples from the 1D standard normal distribution with Eq. (4), 1.32 s to generate \(N=10^8\times 2\) samples from the 2D standard normal distribution with Eq. (9), and 2.33 s to generate \(N=10^8\times 3\) samples from the 3D standard normal distribution with Eq. (14). It also takes 0.18 s to generate \(N=10^8\) samples from an exact uniform distribution with Eq. (1), and 0.90 s to permute the \(N=10^8\) samples with Eq. (2). As a reference, it takes 4.28 s to generate \(N=10^8\) samples with the “random_number” subroutine of Fortran.

5 Conclusion

The inverse transform sampling methods are revisited for generating samples from standard normal distributions with source samples from uniform distributions. In the present study, the inverse cumulative distribution functions for standard normal distributions in one and three dimensions are approximated by new functions that are analytically invertible. The source samples are generated from uniform distributions by two methods. One is the Mersenne Twister (MT19937) (Matsumoto and Nishimura 1998) and the other is an exact uniform distribution with random permutations (shuffling).

It is shown that the deviations of samples from the 1D, 2D, and 3D standard normal distributions decrease with the 0.5th power of the number of samples per bin. The numerical tests have shown that there is no substantial difference among the deviations of samples from the standard normal distributions generated by 1D, 2D, and 3D inverse transform samplings with MT19937 and random permutations, suggesting that the new approximations of the 1D and 3D inverse cumulative distribution functions work well.

The conventional Box–Muller transform (Box and Muller 1958) (i.e., the 2D inverse transform sampling) uses four sets of samples from uniform distributions to generate three sets of samples from standard normal distributions. On the other hand, the present 1D and 3D inverse transform sampling methods use three sets of samples from uniform distributions to generate three sets of samples from standard normal distributions, which is computationally cheaper. The 1D and 3D inverse cumulative distribution functions proposed in the present study consist of “sqrt” and “log” functions, which are straightforward to code in C/C++ and Fortran.