Abstract

The use of floating-point numbers inevitably leads to inaccurate results and, in certain cases, significant program failures. Detecting floating-point errors is critical to ensuring that the outputs of floating-point programs are correct. However, floating-point errors are sparse: only a limited number of inputs cause significant errors, so finding those inputs and choosing an appropriate search technique are critical to detecting significant errors. This paper proposes a characteristic particle swarm optimization (CPSO) algorithm based on the particle swarm optimization (PSO) algorithm and implements PSOED, a floating-point expression error detection tool that can detect significant errors in floating-point arithmetic expressions and provide the corresponding error-triggering inputs. The method presented in this paper is based on two insights: (1) treating floating-point error detection as a search problem and selecting reliable heuristic search strategies to solve it; (2) fully utilizing the error distribution laws of expressions and the distribution characteristics of floating-point numbers to guide search space generation and improve search efficiency. This paper selects 28 expressions from the FPBench standard benchmark suite as test cases, uses PSOED to detect their maximum errors, and compares the results to the current dynamic error detection tools S3FP and Herbie. PSOED detects larger maximum errors than S3FP on 100% of the cases, detects larger errors than Herbie on 68% of the cases, and matches Herbie on a further 14%. The experimental results indicate that PSOED can detect significant floating-point expression errors.

1. Introduction

A large number of floating-point programs are used in fields such as aerospace, defense, and the military, where high reliability is required, so ensuring the accuracy of floating-point program results is critical. However, rounding errors exist between the numerical values used in computer floating-point operations and the corresponding real values, and the accumulation of rounding errors can affect the accuracy of floating-point programs. Currently, most floating-point computing programs use the floating-point precision defined by the IEEE 754 [1] arithmetic standard. When the precision employed cannot represent a value exactly, the computer rounds the value to obtain an approximation, and it is this treatment that introduces rounding errors. These rounding errors accrue as the computation progresses and may accumulate to an extent that is difficult to ignore, ultimately affecting the correctness of the program's outputs. Therefore, error detection techniques are required to detect significant errors in programs and provide the error-triggering inputs, which serve as the foundation for further program optimization. However, analyzing the error of a floating-point program directly is difficult and inefficient. The core numeric operations of a floating-point program can often be abstracted as floating-point expressions, so most methods and tools perform error detection on floating-point expressions.

Much research has been done on error detection in floating-point expressions, but two aspects can still be improved.

First, prior work does not fully utilize the features of the expression error distribution. Consider NMSE problem 3.3.6 from FPBench [2], the standard floating-point benchmark suite, on the definition domain (0.01, 1,000). There are 998,768,527 64-bit floating-point numbers in this interval. This is a huge search space, and testing all of its points takes time and effort. If we sample the expression uniformly on a small scale and calculate the error at each sampling point, we find that the inputs causing large errors are mainly concentrated on the right side of the interval, for example in (900, 1,000), as shown in Figure 1. By focusing attention on the right side of the interval during the search, the effectiveness of detecting the largest error can be considerably enhanced. With the same number of sample points, the largest errors found using random search on the intervals (0.01, 1,000) and (900, 1,000) were 3.28E-13 and 4.41E-13, respectively: using the error distribution features to guide the search improved the detection effect by 34.5%. This example shows that it is not necessary to perform a global search over the domain of a floating-point expression every time; searching within specific intervals can shrink the search space and improve search efficiency. However, current tools fail to take full advantage of this information to aid the search.

Second, existing work treats floating-point expression error detection as a search problem, using strategies such as binary-guided search or Markov chain Monte Carlo to find the maximum error. However, instead of fully exploiting the characteristics of floating-point numbers and floating-point errors, the majority of these efforts concentrate on improving the search strategy itself. To produce satisfactory detection results, the algorithm should be paired with the properties of the problem. For example, one characteristic of floating-point numbers is that they are not evenly distributed along the number line: the closer they are to the origin, the more densely they are packed. For 64-bit floating-point numbers, the numbers in (−1, 1) account for 49.95% of all 64-bit floating-point numbers. Suppose we use the particle swarm optimization (PSO) algorithm to search for the maximum error. Following the distribution characteristics of floating-point numbers, when the PSO algorithm initializes the population, 50% of the individuals can be initialized outside (−1, 1). Therefore, when improving an existing algorithm or designing a new one, it is necessary to consider how to combine the characteristics of floating-point numbers and floating-point errors with the search algorithm used for error detection.
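This roughly-half figure follows directly from the IEEE 754 encoding and can be checked with a few lines of C++ (our illustration, not part of any tool discussed here):

#include <cstdio>

int main() {
    // Finite doubles have biased exponents 0..2046; each exponent value
    // carries the same number (2^52) of mantissa patterns, and the sign
    // bit scales both sides equally. |x| < 1 corresponds to biased
    // exponents 0..1022 (subnormals plus exponents -1022..-1).
    double share = 1023.0 / 2047.0;
    std::printf("share of doubles in (-1, 1): %.4f\n", share); // ~0.4998
    return 0;
}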

To address the aforementioned issues, this paper proposes an error detection method based on the PSO algorithm [3] and implements the error detection tool PSOED. First, the expression is sampled, the error at each sampling point is calculated, and an image is returned to the user. A peak extraction algorithm is then used to extract the peak points of the image, which indicate potentially large errors. The extracted peak points are then screened for anomalies using the boxplot evaluation index. If outliers can be filtered out, large error search intervals are generated from the outliers. Finally, the characteristic particle swarm optimization (CPSO) algorithm is applied to the generated search intervals to perform the maximum error search. This paper makes the following contributions:

(i) A method of constructing detection intervals based on the characteristics of the expression error distribution and the distribution of floating-point numbers is proposed, which can effectively assist the search algorithm in identifying significant errors.
(ii) A CPSO algorithm based on the PSO algorithm is proposed, which employs a number of strategies to help the algorithm break out of local optimal solutions and can pick parameters dynamically according to the expression.
(iii) PSOED, a floating-point error detection tool, is designed and built.
(iv) The 28 expressions in FPBench are selected as test cases to validate the effectiveness of the error detection tool.

The rest of the paper is organized as follows. We first introduce the basics of floating-point representation, the definition of floating-point error, the PSO algorithm, and the boxplot algorithm in Section 2. We then give an overview of our approach in Section 3 and detail it in Section 4. We provide the experimental results in Section 5. In Section 6, we summarize related work. We end the paper with the conclusion, the limitations of our work, and future work in Section 7.

2. Preliminaries

The basic concepts used in this paper are floating-point numbers, floating-point error, the PSO algorithm, and the boxplot. They are presented below.

2.1. Floating-Point Representation

The IEEE 754 standard defines the structure of floating-point numbers and the number of bits in each of their components. Figure 2 shows the number of bits in each component of single- and double-precision floating-point numbers.

2.2. Floating-Point Error

When a real number x is represented as a floating-point number, the error caused by the rounding operation can be defined as the difference between the real value x and the floating-point value fl(x), where δ = fl(x) − x is the error of representing the real number as a floating-point number.

For a floating-point program, errors are introduced and accumulated in each floating-point operation, and this accumulation leads to errors in the final result. Relative error (Error_rel) and absolute error (Error_abs) are two widely used metrics for measuring the magnitude of errors. For the ideal exact result f(x) (i.e., the result of the real-number computation) and the floating-point result f̂(x), the relative and absolute errors are calculated as follows:

$$\mathrm{Error}_{rel} = \frac{|\hat{f}(x) - f(x)|}{|f(x)|}, \qquad \mathrm{Error}_{abs} = |\hat{f}(x) - f(x)|.$$

Unit in the last place (ULP) is a unit for measuring relative error. For a floating-point value x̂ and its corresponding real value x, ULP and the ULP error (Error_ULP) can be expressed as follows:

$$\mathrm{Error}_{ULP} = \frac{|\hat{x} - x|}{\mathrm{ULP}(x)},$$

where ULP(x) is the gap between the two floating-point numbers nearest to x.
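As a concrete illustration of these definitions, the following C++ sketch computes ULP(x) via std::nextafter and the resulting ULP error; the helper names are ours, not part of PSOED:

#include <cmath>
#include <limits>

// ULP(x): the gap between |x| and the next representable double above it.
double ulp(double x) {
    double ax = std::fabs(x);
    return std::nextafter(ax, std::numeric_limits<double>::infinity()) - ax;
}

// ULP error of a computed value xhat against the true (reference) value x.
double ulp_error(double xhat, double x) {
    return std::fabs(xhat - x) / ulp(x);
}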

2.3. Particle Swarm Optimization Algorithm

PSO is an intelligent search algorithm based on group cooperation. It is suitable for nonlinear and multimodal problems, its code is relatively easy to implement, and it has few adjustable parameters, so it is widely used for parameter optimization. The algorithm works as follows:

In the M-dimensional search space, N random particles form a particle swarm; the position and velocity vectors of particle i are X_i = (x_i1, ..., x_iM) and V_i = (v_i1, ..., v_iM). The position vector is fed into the fitness function to calculate the fitness value of the individual particle. The global optimal solution is obtained by comparing fitness values, which measure the quality of solutions, over successive iterations. Let pBest_i be the best position found by particle i so far, and let gBest be the best position found by the swarm as a whole. During each iteration, the current velocity and position are updated by the following equations:

$$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 (pBest_{id} - x_{id}^{t}) + c_2 r_2 (gBest_{d} - x_{id}^{t}), \qquad x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1},$$

where c1 and c2 are the learning factors, indicating how strongly the particle learns from its own best solution and from the population's best solution; r1 and r2 are random numbers uniformly distributed on (0, 1); and ω is the inertia weight, indicating how much of the particle's original motion trend is maintained.
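A minimal C++ sketch of one update step for a one-dimensional particle, following the equations above (the names and structure are our illustration):

#include <random>

struct Particle {
    double x;      // position (one-dimensional case)
    double v;      // velocity
    double pBest;  // best position this particle has visited so far
};

// One velocity/position update: w is the inertia weight, c1 and c2 the
// learning factors, gBest the swarm's best position so far.
void pso_step(Particle& p, double gBest, double w, double c1, double c2,
              std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double r1 = u(rng), r2 = u(rng);
    p.v = w * p.v + c1 * r1 * (p.pBest - p.x) + c2 * r2 * (gBest - p.x);
    p.x += p.v;
}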

2.4. Boxplot

A boxplot is a statistical method that measures the dispersion of a data set and is also used to reflect the characteristics of the data distribution. The advantage of the boxplot algorithm is that it relies only on the actual data and does not require the assumption that the sample follows a particular distribution. Moreover, its criterion for judging outliers is based on quartiles and the interquartile range. Quartiles are resistant to perturbation: up to 25% of the data can be moved arbitrarily far without changing the quartiles significantly, so outliers do not affect the criterion itself, which makes the identification of anomalous data relatively objective. The principle of the boxplot outlier detection algorithm is shown in Figure 3.

Q1 is the first quartile of a data set, M is the median, Q3 is the third quartile, and the interquartile range IQR is the difference between the two numbers of Q3 and Q1. If the detected value is less than Q1 – 3IQR or greater than Q3 + 3IQR, the value is considered an extreme outlier; if the detected value is less than Q1 – 1.5IQR or greater than Q3 + 1.5IQR and does not reach the extreme outlier threshold, the value is considered a mild outlier.
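The fences can be computed in a few lines. The sketch below is our illustration; the simple rank-based quartile convention is an assumption, and other conventions shift the fences slightly:

#include <algorithm>
#include <vector>

// Mild and extreme outlier fences of a boxplot. Quartiles are taken as
// the values at the 25% and 75% ranks of the sorted sample.
struct Fences { double mild_lo, mild_hi, extreme_lo, extreme_hi; };

Fences boxplot_fences(std::vector<double> data) {
    std::sort(data.begin(), data.end());
    double q1 = data[data.size() / 4];
    double q3 = data[(3 * data.size()) / 4];
    double iqr = q3 - q1;
    return { q1 - 1.5 * iqr, q3 + 1.5 * iqr,   // mild outlier fences
             q1 - 3.0 * iqr, q3 + 3.0 * iqr }; // extreme outlier fences
}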

3. Overview

The workflow of PSOED, an error detection tool for floating-point expressions, is shown in Figure 4. It consists of an error distribution generation module, an interval generation module, and a search module.

3.1. Error Distribution Generation Module

The error distribution generation module performs small-scale uniform sampling of the expression on the interval provided by the user. We convert the expression into a low-precision (64-bit) version of the function and a high-precision (128-bit) version based on the MPFR library [4], calculate the high-precision and low-precision function values at each sampling point, and take the difference between the two as the error result, which is presented to the user in the form of an image.
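The per-point error computation can be sketched as follows, using exp(x) as a stand-in for the expression under test; PSOED generates both function versions automatically, so this hand-written form is only an illustration:

#include <mpfr.h>
#include <cmath>

// Error of the double-precision exp(x) against a 128-bit MPFR reference.
double sample_error(double x) {
    mpfr_t hx, hy;
    mpfr_init2(hx, 128);
    mpfr_init2(hy, 128);
    mpfr_set_d(hx, x, MPFR_RNDN);
    mpfr_exp(hy, hx, MPFR_RNDN);        // high-precision reference value
    double lo = std::exp(x);            // low-precision (64-bit) value
    mpfr_sub_d(hy, hy, lo, MPFR_RNDN);  // reference minus computed
    double err = std::fabs(mpfr_get_d(hy, MPFR_RNDN));
    mpfr_clear(hx);
    mpfr_clear(hy);
    return err;
}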

3.2. Interval Generation Module

Based on the error information generated by the error distribution module, the interval generation module uses the AMPD algorithm [5] (see Section 4.2 for details) to extract the error peaks (large error points) among the sampling points and then measures the dispersion of these peak points using the boxplot indicator [6] to screen out the outlier peaks. For any expression, we first generate an interval according to the law of floating-point distribution and then examine whether outlier peaks can be extracted from the peak points of the sampled expression. If extreme outliers can be extracted, we apply the bitwise increment/decrement procedure to obtain the left and right neighborhoods of each extreme outlier and produce the large error intervals.

3.3. Search Module

The search module searches for the maximum error of the floating-point expression in the specified interval using the CPSO algorithm (see Section 4.3 for details of the algorithm flow), with the number of particles and the number of iterations specified by the user. Finally, the user is given the maximum error as well as the input that triggers it.

The tool supports error detection for 28 single-parameter floating-point expressions in FPBench; see Section 5.1 for details of the expressions.

4. Approach

4.1. Obtaining the Error Distribution and Error Peak Points

The error distribution module allows us to obtain the error distribution and peak points of a user-supplied expression over a particular interval. To better describe how this module works, we illustrate its steps below.

After obtaining a floating-point expression and the corresponding interval, we first perform small-scale uniform sampling of the expression to obtain its initial error distribution. Taking the Bsplines3 expression as an example, Figures 5–10 depict the error plots for 10, 100, 1,000, 10,000, 100,000, and 1,000,000 sampling points on the interval (0, 1,000), respectively.

As shown in Figures 9 and 10, the error distribution of the Bsplines3 expression with 100,000 sampling points is basically the same as with 1,000,000. Although more sampling points yield a more accurate error distribution, we set the default number of sampling points to 100,000 for performance and accuracy reasons, and all expressions adhere to this standard. Of course, the number of sampling points can be adjusted by the user.

After obtaining the error distribution of the expression, we focus on the image's local maxima, which are referred to as error peak points. To obtain the peak points more accurately and quickly, we adopt the automatic multiscale-based peak detection (AMPD) algorithm from the field of signal processing. This algorithm, proposed by Scholkmann et al. [5], detects noise and signal peaks based on local maxima. In this paper, we treat the error image as a signal and apply the AMPD algorithm to obtain its peak points, i.e., the desired error peak points; we use the algorithm directly without going into its details. The error image peak point extraction based on the AMPD algorithm is described in Algorithm 1.

Require: expression f, interval (a, b)
1: Us ← Uniform_sample(a, b, 100,000)
2: for x of Us do
3:  temp_error ← |F_high(x) − F_low(x)|
4:  error_list.append(temp_error)
5: end for
6: peak_list ← AMPD(error_list)
7: return peak_list, error_list

As shown in Figure 11, after the AMPD algorithm extracts the peak points, we label the peak points on the error image to reflect the detection effect: the black points on the image are the error peak points found by the error distribution module.

4.2. Generating Large Error Intervals Based on Error Peaks

After the error distribution module has run, we have the peak points; the interval generation module then processes them to generate intervals. Its workflow is divided into two steps. First, the peak points are filtered and sorted to obtain the outlier peak points. Then, large error intervals are generated depending on the presence or absence of outlier peaks.

The peak points selected by the AMPD algorithm are local maxima, but local maxima can be large or small, and the errors of two peak points can differ by several orders of magnitude. Therefore, the first step in this module is to eliminate the peak points with small errors and keep only those with relatively large errors. To eliminate the smaller peaks, we need an evaluation index that measures the degree of anomaly of each peak within the set of peaks.

The error data samples are discrete, highly random, and do not obey a specific distribution law. Among the mature algorithms commonly used for outlier detection, the 3σ rule and the z-score method presuppose that the data follow a normal distribution, which the error samples do not. Moreover, their criteria for determining outliers are based on the mean and standard deviation of the full sample set, which have very little resistance to interference, so the outliers themselves strongly affect the performance of these methods. Taking the 3σ rule as an example, with σ the standard deviation and μ the mean, the normal range of values is considered to be (μ − 3σ, μ + 3σ), and the probability that a value falls in this interval is 99.7%; that is, the number of outliers reported by this rule is at most 0.3% of the sample set. This rule is therefore clearly not the optimal solution. The boxplot algorithm solves this problem well.

The criterion given by the boxplot is used to evaluate the degree of dispersion of each peak point; extreme and mild outliers are considered outlier peak points. Two results are possible: (1) there is no outlier peak (all values are normal), and (2) there are outlier peaks (extreme outliers). These two results correspond to two characteristics of the error distribution: uniform and nonuniform. In both cases, we must first generate the interval Sf according to the floating-point distribution law. The existence of outlier peaks reflects whether the error distribution is uniform. In case (1), the absence of outlier peaks indicates that the differences between the peaks are small, which means that the error of the expression is uniformly distributed over the given interval. The error distribution of the expression verhulst on the interval (0, 1,000) is shown in Figure 12.

In this case the peak information is not very meaningful, because the error is uniformly distributed and there are many peaks. For such expressions, no special interval generation is done, and the interval Sf is used as the search interval.

Floating-point numbers are not evenly distributed along the number line: the closer to the origin, the denser the distribution. For floating-point numbers of type double, the numbers in (−1, 1) make up 49.95% of all floating-point numbers, about half. To exploit this property, two cases are considered when initializing the particle population: inside (−1, 1) and outside (−1, 1).
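A minimal sketch of such an initialization, assuming the search interval [a, b] straddles (−1, 1); the even split and the helper names are our illustration:

#include <random>
#include <vector>

// Initialize roughly half of the particles inside (-1, 1) and half
// outside it, mirroring the distribution of doubles (assumes a < -1 < 1 < b).
std::vector<double> init_positions(int n, double a, double b,
                                   std::mt19937& rng) {
    std::uniform_real_distribution<double> inner(-1.0, 1.0);
    std::uniform_real_distribution<double> left(a, -1.0);
    std::uniform_real_distribution<double> right(1.0, b);
    std::bernoulli_distribution side(0.5);
    std::vector<double> xs(n);
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0) xs[i] = inner(rng);                         // inside (-1, 1)
        else            xs[i] = side(rng) ? left(rng) : right(rng); // outside
    }
    return xs;
}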

For case (2), the presence of outlier peaks indicates that some errors are prominent. For example, the errors of the expressions logexp, NMSE problem 3.4.1, and exp1x_log are shown in Figures 13–15, respectively; their errors are not uniformly distributed.

For expressions with a nonuniform error distribution, a series of outlier peaks can be obtained by the boxplot calculation, and sorting yields the peak set Sp and the corresponding error set Se. For each point p_i in the outlier peak set Sp, the size of its left and right neighborhoods is determined by the bitwise increment/decrement search. Taking p_i as an example, to determine the size of the left neighborhood, p_i is decremented bit by bit, and the error of the input is calculated after each decrement; if it drops below the error of the next peak in the sorted set, the iteration stops. The right neighborhood is determined in the same way. Note that the last peak point is discarded; one reason, among others, is that there is no reference threshold when searching bit by bit from the last peak point. This search produces the left and right neighborhood intervals Sleft and Sright, and the final large error interval for p_i is their union Sleft ∪ Sright. It is worth noting that if there are two intervals S1 = [a1, b1] and S2 = [a2, b2] with a2 ≤ b1, the two need to be merged into S = [a1, b2]. Here is an example:

Through calculation, a peak point set Sp and the corresponding error set Se can be obtained as follows:

Using the bit search for each point in Sp, we can obtain the left neighborhood intervals Sleft and the right neighborhood intervals Sright.

Combining Sleft and Sright yields the final large error interval set Sl.
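The two operations used above, growing a neighborhood one representable double at a time and merging overlapping intervals, can be sketched in C++ as follows (err() stands in for the expression-error oracle, and using the next peak's error as the threshold follows the description in this section):

#include <algorithm>
#include <cmath>
#include <limits>
#include <utility>
#include <vector>

// Grow the neighborhood of an outlier peak p one representable double
// at a time (std::nextafter is the "bit" step) until the error falls
// below the reference threshold.
std::pair<double, double> grow_neighborhood(double p, double threshold,
                                            double (*err)(double)) {
    const double inf = std::numeric_limits<double>::infinity();
    double left = p, right = p;
    while (err(left) >= threshold)  left  = std::nextafter(left, -inf);
    while (err(right) >= threshold) right = std::nextafter(right, inf);
    return {left, right};
}

// Merge overlapping intervals: [a1, b1] and [a2, b2] with a2 <= b1
// become [a1, max(b1, b2)].
std::vector<std::pair<double, double>>
merge_intervals(std::vector<std::pair<double, double>> iv) {
    std::sort(iv.begin(), iv.end());
    std::vector<std::pair<double, double>> out;
    for (const auto& in : iv) {
        if (!out.empty() && in.first <= out.back().second)
            out.back().second = std::max(out.back().second, in.second);
        else
            out.push_back(in);
    }
    return out;
}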

These intervals divide the particles equally: assuming the resulting set of large error intervals contains n intervals and the total particle count is N, the number of CPSO initialization particles on each interval is computed by Equation (6), N_i = ⌊N/n⌋.

Finally, Sl and Sf together form S, the search interval for the CPSO algorithm. The overall process is shown in Algorithm 2.

Require: peak_list, error_list
Ensure: S
1: if peak_list contains outlier peaks then
2:  for i = 0 to peak_list.size() − 1 do
3:   if i is the index of the last peak then
4:    break
5:   else
6:    left, right ← peak_list[i]
7:    while error(left) ≥ error_list[i + 1] do
8:     left ← left − 0x1
9:    end while
10:    while error(right) ≥ error_list[i + 1] do
11:     right ← right + 0x1
12:    end while
13:    Sl[index] ← [left, right]
14:   end if
15:   index ← index + 1
16:  end for
17:  Sl ← merge_intervals(Sl)
18: else
19:  Sf ← float_distribution(a, b)
20: end if
21: if Sl ≠ ∅ then
22:  S ← Sl ∪ Sf
23: else
24:  S ← Sf
25: end if
26: return S

4.3. Searching Using CPSO Algorithm

The main disadvantages of the standard PSO algorithm are: (1) it easily falls into local optimal solutions; (2) it is highly problem-dependent.

In this paper, the standard PSO algorithm is improved; the new algorithm is called the CPSO algorithm. The specific improvement strategies and the reasons for them are as follows:

4.3.1. The Perturbation Parameter Is Added

In the standard PSO algorithm, a particle only moves to a position with a better fitness value than its own, which makes it easy to fall into a local optimal solution. To help the particles jump out of local optima, we borrow an idea from simulated annealing [7] and set a perturbation parameter p, which represents the probability that a particle moves to a position with a fitness value lower than its own. In order not to affect the convergence of the algorithm, this parameter decreases linearly as the number of iterations increases, as shown in Equation (10): in early iterations it increases the global search ability of the particles, while in late iterations it does not interfere with normal convergence.
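A sketch of this strategy in C++; the linear decay below is our assumed form of Equation (10), with p0 the initial perturbation probability and max_iter the iteration budget:

#include <random>

// Linearly decaying perturbation probability (assumed form of Eq. (10)).
double perturb_prob(double p0, int iter, int max_iter) {
    return p0 * (1.0 - static_cast<double>(iter) / max_iter);
}

// With probability p, a particle accepts a move to a worse position,
// in the spirit of simulated annealing.
bool accept_worse_move(double p, std::mt19937& rng) {
    return std::uniform_real_distribution<double>(0.0, 1.0)(rng) < p;
}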

4.3.2. Each Particle Is Followed by Associated Particles

Given the sparsity of errors in floating-point expressions, even tiny input changes can produce large error results, so we introduce the concept of associated particles, as shown in Figure 16. The associated particles are points in the neighborhood of radius r around a particle, which we call the central particle. After the associated particles are generated, we compare the fitness value of the central particle with that of each associated particle one by one; if an associated particle has a greater fitness value than the central particle, it replaces the central particle and inherits a number of its attributes. Each time the central particle moves, it creates a new batch of associated particles. Of course, we do not enumerate all of a particle's associated particles: the default number sampled is 1,000, which can be adjusted by the user.
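A sketch of associated-particle generation; the uniform sampling within the radius is our assumption, and k defaults to 1,000 as described above:

#include <random>
#include <vector>

// Sample k associated particles uniformly from the radius-r neighborhood
// of a central particle at position x.
std::vector<double> associated_particles(double x, double r, int k,
                                         std::mt19937& rng) {
    std::uniform_real_distribution<double> u(x - r, x + r);
    std::vector<double> out(k);
    for (double& a : out) a = u(rng);
    return out;
}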

4.3.3. When the Velocity or Position of the Particle Crosses the Boundary, the “Collision-Bounce” Strategy Is Used

Due to the interval restriction on the expression, position and velocity overshoot must also be considered during particle motion. In the case of overshoot, this paper adopts the collision-bounce strategy: if the size of the position overshoot is within the interval length, the particle is bounced back into the interval; otherwise, the particle position is regenerated and its velocity is directed toward the global optimum. The calculation is shown in Equations (11) and (12).
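A sketch of a reflective boundary handler in the spirit of the collision-bounce strategy; the exact regeneration rule of Equations (11) and (12) may differ:

// Reflect a particle that has left [a, b] back inside and reverse its
// velocity ("collision-bounce"); assumes the overshoot fits the interval.
void bounce(double& x, double& v, double a, double b) {
    if (x < a)      { x = a + (a - x); v = -v; } // bounce off left edge
    else if (x > b) { x = b - (x - b); v = -v; } // bounce off right edge
}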

4.3.4. Parameters Vary with the Expression

In the standard PSO algorithm, a value must be set for each parameter. The position of this paper is that the parameters should depend on the problem while the idea of the algorithm stays unchanged: for different expressions, different parameters should be selected by the algorithm. The specific parameter values best suited to each expression are given in the experimental section (Section 5.2) and are not repeated here.

4.3.5. Fitness Function

To allow the tool proposed in this paper to be compared with other tools, which use different types of error, this paper provides three fitness functions: the relative error, the ULP error, and the bit error of the expression at the particle's position. The three fitness functions are given by the following equations:

$$\mathrm{Error}_{rel} = \frac{|\hat{f}(x) - f(x)|}{|f(x)|}, \quad \mathrm{Error}_{ULP} = \frac{|\hat{f}(x) - f(x)|}{\mathrm{ULP}(f(x))}, \quad \mathrm{Error}_{bits} = \log_2 \mathrm{Error}_{ULP}.$$

After the interval generation module in the second step of the tool has run, we have a specific search interval. The next task is to use the CPSO algorithm to find the maximum error on this interval.

As shown in Algorithm 3, the values of the parameters are first determined by the expression. Then the particle population is initialized on the search interval generated in the second step, the fitness value of each particle is calculated, and the global optimum is obtained; meanwhile, the velocity direction of all particles is set toward the global optimum. Lines 6–16 of the algorithm represent an iterative process. In each iteration, the velocity and position of every particle in the swarm are updated; during the update, it is necessary to check whether they are out of bounds. After updating a particle's velocity and position, its associated particles are generated, and according to their fitness it is decided whether the central particle and the global optimum need to be updated. This completes one iteration. After the maximum number of iterations, the maximum error found by the algorithm and the input that causes it are stored in the global optimum. Finally, the maximum error of the input expression and the error-triggering input are returned to the user.

Require: expression f, search interval S
1: particles ← population_initialization(S)
2: gBest ← getGlobalBest(particles)
3: iter ← 0
4: while iter < maxIter do
5:  for each particle of particles do
6:   vUpdate(particle)
7:   pUpdate(particle)
8:   updateLocalBest(particle)
9:   UpdateGlobalBest(particle)
10:   associatedParticles ← geneAssociatedParticles(particle)
11:  end for
12:  for each aParticle of associatedParticles do
13:   if fitness(aParticle) > fitness(gBest) then
14:    gBest ← aParticle
15:   end if
16:  end for
17:  iter ← iter + 1
18: end while
19: MaxError ← gBest.error
20: x ← gBest.x
21: return MaxError, x

5. Evaluation

First, the values of the parameters in the CPSO algorithm were determined experimentally. Second, random search, PSO, and CPSO were each used to detect errors in the FPBench test case expressions, and their results were compared to verify the effectiveness of the CPSO algorithm. Finally, PSOED was compared with S3FP and Herbie to verify the effectiveness of PSOED.

5.1. Test Cases and Experimental Environment

The body code of PSOED was implemented in C++ and comprises approximately 2,000 lines. We selected 28 single-parameter expressions from the FPBench benchmarks that contain no loops or conditionals. Instead of using the default ranges provided by FPBench, we used large custom ranges. Information about the expressions is shown in Table 1.

The hardware and software environment required for the experiment is shown in Table 2.

5.2. CPSO Parameter Determination

In this paper, the parameters of the CPSO algorithm are determined experimentally. As an example, we use the expression exp1x with the interval (0, 708).

With the other parameters fixed, the tests start with the learning factors c1 and c2. Figure 17 shows the average error found by the CPSO algorithm as c1 and c2 vary. This paper performs 12 sets of tests for each expression in Table 3, removes the maximum and minimum values of the results, and takes the average of the remaining ten values as the reported result. In Figure 17, the coordinates of the bars are the (c1, c2) pairs and the height represents the average maximum error; bars with different (c1, c2) values use different colors. As Figure 17 shows, the CPSO algorithm achieves the best effect on the expression exp1x at the (c1, c2) combination with the tallest bar. Figure 18 shows the average error obtained by CPSO when the inertia weight ω takes different values; the value of ω with the largest average error is selected.

The values of c1, c2, and ω corresponding to the different expressions are shown in Table 4.

5.3. Comparison of Random, PSO, and CPSO Search Algorithm

In this section, we compare the runtime performance of the PSO and CPSO algorithms, as well as the search effect of random search, PSO, and CPSO.

In terms of performance, this paper compares the PSO algorithm with the CPSO algorithm. Random search is not included in this comparison because PSO and CPSO both involve a series of judgment and iteration strategies, while random search has no judgment process at all, so its runtime performance is necessarily better than either. In terms of error, we use the relative error to describe the search effect of the different algorithms. To reflect the intrinsic search ability of PSO and CPSO, their particle populations were generated randomly. Because the maximum errors detected for different expressions are not necessarily of the same order of magnitude, we plot the error results on a base-10 logarithmic scale, as shown in the following formula:

$$E' = \log_{10} \mathrm{Error}_{rel}.$$

As shown in Figure 19, the runtime performance of the CPSO algorithm is better than that of standard PSO for 18 expressions (64%). Compared with PSO, CPSO adds a series of judgment and processing strategies, such as regenerating a particle's velocity and position after a boundary violation; given these additions, such performance results are acceptable.

As shown in Figure 20, the search effect of the CPSO algorithm is better than that of the random search algorithm on all 28 (100%) expressions and better than that of the PSO algorithm on 17 (61%) expressions.

5.4. Comparison of PSOED, S3FP, and Herbie

For the 28 expressions selected for the experiment, the PSOED test results are shown in Table 5.

Since PSOED is a dynamic detection tool, it should be compared with existing dynamic detection tools; we selected S3FP and Herbie.

S3FP is an error detection tool based on binary guided random testing (BGRT), and it reports relative error. Since the test time of S3FP can be configured, we set the same run time for S3FP and PSOED to ensure a fair comparison and compared the maximum errors found by the two tools. S3FP does not support 14 of the expressions; these are marked no answer (NA) in this paper. The test results are shown in Figure 21.

As shown in Figure 21, for the same run time, the maximum error obtained by PSOED is greater than that obtained by S3FP on every expression in the experimental set that S3FP supports.

Herbie is an error detection and precision adjustment tool that uses random sampling and heuristic search to estimate and locate expression errors. Herbie reports error in bits. Since the two tools use different error measures, Equation (16) was used to convert the ULP error measured by PSOED into bits of error, $\mathrm{Error}_{bits} = \log_2 \mathrm{Error}_{ULP}$, for ease of comparison. The number of sampling points of Herbie was set to 8,000, and the number of particles of PSOED was likewise set to 8,000.

In terms of detection effect, as shown in Figure 22, among the 28 expressions used in the experiment, PSOED found a larger maximum error than Herbie on 19 (68%) expressions, the same maximum error on 4 (14%), and a smaller maximum error on 5 (18%).

In terms of tool performance, as shown in Figure 23, only 11 (40%) of the expressions analyzed by PSOED had a lower detection time than under Herbie. For the expressions that cost more time than Herbie, the reason is that the second step of the tool generates many large error intervals, and PSOED performs a search in each of them, resulting in a high time cost. As shown in Figure 23, the expression NMSE example 3.8 has a much larger time overhead than under Herbie; analyzing its error distribution, we find that its error image contains many peaks, so PSOED generates 174 large error intervals for it, which means that PSOED has to search at least 175 times, and the time cost is evident.

6. Related Work

6.1. Static Tools

Static analysis tools are based on source code analysis and verify the accuracy of numerical operations by collecting key information from the program. There are many mature static analysis tools, such as Gappa [8], Daisy [9], FPTaylor [10], Fluctuat [11], and Rosa [12]. Static analysis tools are suitable for specific areas (such as embedded systems) whose programs do not have complex, dynamic data structures. However, they rely on the program source code, and the error bounds they report are larger than the actual errors, so false positives occur.

6.2. Dynamic Tools

Among dynamic analysis tools and methods, Chiang et al. [13] first evaluated the accuracy of static analysis and SMT-based methods and laid out their advantages and disadvantages. They designed a binary guided random testing (BGRT) algorithm and implemented the S3FP tool to assist parallel developers in practical error analysis. Zou et al. [14] conducted an empirical analysis and found that the exponent bits and the composition of the mantissa bits can have a significant impact on accuracy: the floating-point numbers within one cell share a single exponent, and it is largely the exponent that determines whether a cell can lead to significant inaccuracy. They designed a new genetic algorithm, the locally sensitive genetic algorithm (LSGA) [14], which improves the classic genetic algorithm [16] by combining it with the composition of floating-point numbers. The basic idea is to evolve the exponent bits to hit the cells with significant errors while generating the mantissa bits randomly; the fitness value of the algorithm is computed by the FPDebug [17] tool. Zou et al. [15] later proposed detecting floating-point errors based on atomic conditions. Their view is that the atomic condition is the dominant factor in the floating-point error of an atomic operation: the error of an atomic operation can be expressed as the product of the atomic condition and the rounding error introduced by the atomic operation itself. The IEEE 754 standard and the GNU C [18] reference manual guarantee that the latter is very small, because atomic operations are carefully implemented and maintained by experts, so only atomic conditions need to be considered. Atomic conditions guide the search for significant errors, and the tool Atomu [15] is implemented on this basis. Panchekha et al. [19] proposed Herbie, an error detection and accuracy adjustment tool that uses a random search method to locate areas with large errors and then improves accuracy by rewriting the expressions in those areas. Guo and Rubio-González [20] argue that precision loss and cancellation are the key factors leading to large errors in a program and define both conditions by formulas. Using the symbolic execution technology from software testing, the two conditions are injected into the program, and inputs that trigger them, namely high-error-inducing inputs, are then sought; the FPGen tool implements this approach.

7. Conclusion and Future Work

Because floating-point errors significantly affect floating-point programs, floating-point error detection is fundamental to precision optimization and mixed-precision tuning, and many researchers have studied it. Various search-based techniques and tools for detecting the maximum error of floating-point programs or expressions have achieved good results, but the problem remains very challenging. In this paper, the error distribution law of floating-point expressions and the distribution characteristics of floating-point numbers are fully considered, and the CPSO algorithm is used to detect the maximum error of floating-point expressions. These components are organically combined into the PSOED tool. According to the experimental results, the CPSO algorithm shows superiority over the random search algorithm and the standard PSO algorithm, and the comparison with the S3FP and Herbie tools demonstrates the effectiveness and practicality of the tool.

It must be admitted that PSOED still has considerable potential for optimization. For one thing, we are working to support error detection for multiparameter expressions with loops and conditionals. For another, we need to optimize the performance of PSOED: too many large error intervals lead to excessive processing time. In the future, we plan to use strategies that reduce the number of large error intervals while parallelizing the search process.

Our research aims to enhance the PSO algorithm and apply it to floating-point computing in order to identify significant errors in floating-point expressions. Many excellent optimization algorithms are used in other areas, for example, the butterfly optimization algorithm (BOA) [21] for setting the design parameters of an ADRC controller, the gray-wolf optimization (GWO) [22] algorithm for improving the performance of SMC and STSMC [23], and the social spider optimization (SSO) algorithm for tuning the parameters of proportional–derivative (PD) versions of both IT2FLC and T1FLC [24]. Our work cannot be compared directly with these efforts because of the different research areas and methodologies, but they are highly relevant references, and we will continue to study such techniques in the future with the goal of finding a more effective solution to floating-point expression error detection. In addition, to verify the practicality of PSOED, we will test it on a large number of examples from engineering and computer science.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Hongru Yang and Jinchen Xu contributed equally to this work.