1 Introduction

Ikura and Gimple [5] first studied deterministic batch scheduling problems. A batch processing machine is one that can process several jobs simultaneously as a batch. The processing time of a batch is equal to the largest processing time of the jobs in the batch. Once processing is begun on a batch, no job can be removed from or added to the batch. Such machines arise, for example, from the burn-in ovens used in the final testing stage of semiconductor manufacturing. Since the burn-in operations take very long compared with the other steps of the final testing stage, their efficient scheduling is critical to inventory management, productivity improvement, and on-time delivery. More details about the background of this model are given in [5].

In this paper we study the problem of scheduling jobs with release dates on parallel batch processing machines to minimize the makespan. Specifically, there are n jobs to be processed on m parallel batch processing machines. Each job j (\(j=1,\ldots ,n\)) is associated with a release date \(r_{j}\), at which the job arrives and becomes available, and a processing time \(p_{j}\), which is the time needed to process the job. A batch processing machine can process up to B jobs at the same time, where \(B<n\). The processing time of a batch is equal to the largest processing time of the jobs in the batch. If a batch contains exactly B jobs, we call it a full batch; otherwise, we call it a partial batch. This type of batch processing is called parallel batch scheduling (p-batch). There is also serial batch scheduling (s-batch), where the processing time of a batch is equal to the sum of the processing times of the jobs in the batch. For both types of batch scheduling, the completion time of a job is defined as the completion time of the batch it belongs to. Extending the traditional three-field notation for scheduling problems, Brucker et al. [2] introduced p-batch in the second field to denote parallel batch scheduling problems. Following Brucker et al. [2], we denote the problem under study as \(P|r_{j},p\text{-}batch|C_{max}\), where \(P\) and \(C_{max}\) indicate identical parallel machines and the makespan, respectively.

For the single-machine case, considerable research has been carried out. Brucker et al. [2] proved that the problem \(1|r_{j},p\text{-}batch|C_{max}\) is strongly NP-hard even if \(B=2\). Lee and Uzsoy [6] developed polynomial-time algorithms for several special cases of the problem and provided several heuristics for the general case. Liu and Yu [10] showed that the problem is binary NP-hard even if the jobs are subject to two distinct release dates. They presented a pseudopolynomial-time algorithm for the case where there are a fixed number of distinct release dates. Independently, Deng et al. [4] presented a pseudopolynomial-time algorithm for the same special case of the problem. Based on this pseudopolynomial-time algorithm, Deng et al. [4] further developed a polynomial-time approximation scheme for the problem. Yuan et al. [12] showed that the scheduling problem with family jobs and release dates on a single batch processing machine to minimize the makespan is strongly NP-hard, and they developed two dynamic programming algorithms and a heuristic with a performance ratio of 2. For scheduling problems with release dates and other objective functions on a single batch processing machine, Cheng et al. [3] proved that the scheduling problem with release dates and deadlines on an unbounded batch processing machine is NP-hard and provided polynomial-time algorithms for several special cases. Liu et al. [11] showed that the problem of minimizing the total tardiness on a single unbounded batch processing machine is NP-hard and identified several special cases of the problem on an unbounded batch processing machine that are solvable in pseudopolynomial time. Liu et al. [9] presented a polynomial-time approximation scheme for the problem of scheduling jobs with release dates on a single batch processing machine to minimize the total completion time and developed a fully polynomial-time approximation scheme for the case with an unbounded batch processing machine.

To the best of our knowledge, little research has been done on the problem of minimizing the makespan on parallel batch processing machines with job release dates except Li et al. [8], who presented a polynomial-time approximation scheme for this problem.

We organize the rest of the paper as follows: In Sect. 2 we present two heuristics, analyze their performance ratios, and propose a polynomial-time algorithm for the special case where the jobs have equal processing times. In Sect. 3 we report the results of computational experiments that evaluate the heuristics. Finally, in Sect. 4 we conclude the paper and suggest directions for future research.

2 Heuristics

We first develop an integer programming formulation for problem \(P|r_{j},p\text{-}batch|C_{max}\). Given a feasible solution, we denote the processing time, release date, and completion time of the \(j\)-th batch on machine \(k\) as \(p_{jk}\), \(r_{jk}\), and \(y_{jk}\), respectively, for \(j=1,\ldots ,n\) and \(k=1,\ldots ,m\). Then, we can formulate the problem considered in this paper as the following integer programming problem, where \(x_{ijk}\) takes the value 1 if job \(i\) is assigned to the \(j\)-th batch on machine \(k\) and the value 0 otherwise for \(i=1,\ldots ,n, j=1,\ldots ,n\), and \(k=1,\ldots ,m\), and \(Z\) denotes the makespan:

$$\begin{aligned}&\min \; Z\\&\text {s.t. } \left\{ \begin{array}{ll} \sum \limits ^{n}_{j=1}\sum \limits _{k=1}^{m}x_{ijk}=1, \quad i=1,\ldots ,n;\\ \sum \limits _{i=1}^{n}x_{ijk}\le B, \quad j=1,\ldots ,n;k=1,\ldots ,m;\\ p_{i}x_{ijk}\le p_{jk}, \quad i=1,\ldots ,n;j=1,\ldots ,n;k=1,\ldots ,m;\\ r_{i}x_{ijk}\le r_{jk}, \quad i=1,\ldots ,n;j=1,\ldots ,n;k=1,\ldots ,m;\\ y_{(j-1)k}+p_{jk}\le y_{jk}, \quad j=1,\ldots ,n;k=1,\ldots ,m;\\ r_{jk}+p_{jk}\le y_{jk}, \quad j=1,\ldots ,n;k=1,\ldots ,m;\\ y_{nk}\le Z, \quad k=1,\ldots ,m;\\ x_{ijk}\in \{0,1\}, \quad i=1,\ldots ,n;j=1,\ldots ,n;k=1,\ldots ,m; \end{array} \right. \end{aligned}$$

where \(y_{0k}=0,p_{jk},r_{jk},y_{jk},Z\ge 0, j=1,\ldots ,n\), and \(k=1,\ldots ,m\).
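
For illustration, the following is a minimal sketch of this integer program in Python using the open-source PuLP modelling library. This is our own illustrative code, not part of the original study: the function name solve_ip and the choice of PuLP are assumptions, and the model is only practical for small instances.

import pulp

def solve_ip(p, r, m, B):
    """p, r: lists of processing times and release dates; m machines; capacity B.
    Returns the optimal makespan of the integer program above (sketch only)."""
    n = len(p)
    J, K = range(n), range(m)          # i, j run over jobs/batch positions; k over machines
    prob = pulp.LpProblem("P_rj_pbatch_Cmax", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", (J, J, K), cat="Binary")
    pb = pulp.LpVariable.dicts("pb", (J, K), lowBound=0)   # batch processing times p_jk
    rb = pulp.LpVariable.dicts("rb", (J, K), lowBound=0)   # batch release dates r_jk
    y = pulp.LpVariable.dicts("y", (J, K), lowBound=0)     # batch completion times y_jk
    Z = pulp.LpVariable("Z", lowBound=0)
    prob += Z                                               # minimise the makespan
    for i in J:                                             # each job in exactly one batch
        prob += pulp.lpSum(x[i][j][k] for j in J for k in K) == 1
    for j in J:
        for k in K:
            prob += pulp.lpSum(x[i][j][k] for i in J) <= B  # batch capacity
            for i in J:
                prob += p[i] * x[i][j][k] <= pb[j][k]       # p_jk >= longest job in the batch
                prob += r[i] * x[i][j][k] <= rb[j][k]       # r_jk >= latest release in the batch
            prev = y[j - 1][k] if j >= 1 else 0             # y_0k = 0
            prob += prev + pb[j][k] <= y[j][k]              # batches on a machine are sequenced
            prob += rb[j][k] + pb[j][k] <= y[j][k]          # no batch completes before release + processing
            prob += y[j][k] <= Z                            # the makespan dominates every batch completion
    prob.solve()
    return pulp.value(Z)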

We develop two heuristics for problem \(P|r_{j},p\text{-}batch|C_{max}\), which are based on Algorithm FBLPT (Full Batch Largest Processing Time) [1] for problem \(1|p\text{-}batch|C_{max}\). Algorithm FBLPT first orders the jobs in non-increasing order of their processing times, then, starting from the beginning of the list, groups every \(B\) adjacent jobs into a batch until all the jobs have been assigned, and finally sequences the batches in an arbitrary order. We define available jobs as the jobs that have arrived but have not yet been scheduled.
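
As a small illustration, the following is a minimal sketch of Algorithm FBLPT in Python (our own illustrative code; the helper name fblpt is an assumption, not part of [1]):

def fblpt(p, B):
    """p: list of processing times; B: batch capacity.
    Returns a list of batches (lists of job indices) in FBLPT order."""
    # Sort jobs by non-increasing processing time, then cut the sorted list
    # into consecutive groups of at most B jobs.
    order = sorted(range(len(p)), key=lambda i: -p[i])
    batches = [order[s:s + B] for s in range(0, len(order), B)]
    # The processing time of each batch is the largest processing time in it,
    # i.e. p[batch[0]] because of the sort; the batches may be sequenced in any order.
    return batches

# Example: p = [7, 3, 9, 4, 2] and B = 2 gives batches [[2, 0], [3, 1], [4]]
# with batch processing times 9, 4 and 2, hence makespan 15 on a single machine.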

The first heuristic, GP (Greedy Processing), is based on the intuition that whenever a machine becomes free and there are available jobs, we should fill a batch for this machine with as many of the available jobs with the largest processing times as possible. It generalizes the following two algorithms: Algorithm GRLPT (Greedy Longest Processing Time), presented by Lee and Uzsoy [6] for problem \(1|r_{j},p\text{-}batch|C_{max}\), and Algorithm BLPT, presented by Lee et al. [7] for problem \(P|p\text{-}batch|C_{max}\).

Algorithm GRLPT

Whenever the machine becomes free, put up to \(B\) of the available unscheduled jobs with the largest processing times into one batch and assign this batch to the machine.

Liu and Yu [10] proved that the performance ratio of Algorithm GRLPT is 2 and that this bound is tight. We give an alternative proof of this result, which is simpler and more straightforward.

Theorem 1

The performance ratio of Algorithm GRLPT for problem \(1|r_{j},p\text{-}batch|C_{max}\) is 2.

Proof

Let \(C_{max}^{*}\) and \(C_{max}\) be the optimal makespan and the makespan obtained by Algorithm GRLPT, respectively; let \(r_{max}\) be the last release date; and let \(B_{l}\) be the batch that starts strictly before time \(r_{max}\) and completes at or after time \(r_{max}\). The processing time and start time of batch \(B_{l}\) are denoted by \(p^{l}\) and \(s_{l}\), respectively. The set of jobs processed after batch \(B_{l}\) is denoted by \(S\); since batch \(B_{l}\) completes at or after time \(r_{max}\), all the jobs in \(S\) have arrived by the time they are scheduled, so Algorithm GRLPT arranges them exactly as Algorithm FBLPT would. The makespan obtained by applying Algorithm FBLPT to \(S\) is denoted by \(C_{max}(S)\). We consider two cases.

  • Case 1 Batch \(B_{l}\) is a partial batch. Then all the jobs in \(S\) must be released after time \(s_{l}\), since otherwise Algorithm GRLPT would have added them to the partial batch \(B_{l}\). Because the jobs in \(S\) cannot start before time \(s_{l}\) in any schedule and Algorithm FBLPT is optimal when release dates are ignored, we have \(s_{l}+C_{max}(S)\le C_{max}^{*}\). Since \(p^{l}\le C_{max}^{*}\), it follows that (see Fig. 1)

    $$\begin{aligned} C_{max}=s_{l}+p^{l}+C_{max}(S)\le 2C_{max}^{*}. \end{aligned}$$
  • Case 2 Batch \(B_{l}\) is a full batch. If the largest processing time of the jobs in \(S\) is no larger than the smallest processing time of the jobs in batch \(B_{l}\), then batch \(B_{l}\) followed by the FBLPT schedule of \(S\) is exactly the schedule obtained by applying Algorithm FBLPT to the job set \(B_{l}\cup S\), so \(p^{l}+C_{max}(S)=C_{max}(B_{l}\cup S)\le C_{max}^{*}\). Since \(s_{l}\le r_{max}\le C_{max}^{*}\), we have \(C_{max}=s_{l}+p^{l}+C_{max}(S)\le 2C_{max}^{*}\). Otherwise, the largest processing time in \(S\) is larger than the smallest processing time in batch \(B_{l}\), and the jobs in \(S\) whose processing times exceed the smallest processing time in batch \(B_{l}\) must be released after time \(s_{l}\); otherwise Algorithm GRLPT would have put them into batch \(B_{l}\). Denote this set of jobs by \(S_{1}\) and let \(S_{2}=S\setminus S_{1}\). Then we have \(C_{max}(S)\le C_{max}(S_{1})+C_{max}(S_{2})\), \(s_{l}+C_{max}(S_{1})\le C_{max}^{*}\), and \(p^{l}+C_{max}(S_{2})=C_{max}(B_{l}\cup S_{2})\le C_{max}^{*}\). Thus, we have

    $$\begin{aligned} C_{max}=s_{l}+p^{l}+C_{max}(S)\le s_{l}+p^{l}+C_{max}(S_{1})+C_{max}(S_{2})\le 2C_{max}^{*}. \end{aligned}$$

\(\square \)

Fig. 1 Schedule obtained by Algorithm GRLPT

Algorithm BLPT

Whenever a machine becomes free, put up to \(B\) of the unscheduled jobs with the largest processing times into one batch and assign this batch to the machine. The performance ratio of Algorithm BLPT for problem \(P|p\text{-}batch|C_{max}\) is \(4/3-1/3m\) [7].
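
A minimal sketch of Algorithm BLPT is given below (our own illustrative code; the helper name blpt_makespan is an assumption). Since all jobs are available at time zero, the batch formed at each machine-free event simply consists of the \(B\) longest unscheduled jobs.

import heapq

def blpt_makespan(p, m, B):
    """Sketch of Algorithm BLPT for P|p-batch|Cmax; returns the makespan."""
    order = sorted(range(len(p)), key=lambda i: -p[i])  # longest processing time first
    free = [0.0] * m                                    # machine free times
    heapq.heapify(free)
    cmax = 0.0
    for s in range(0, len(order), B):
        batch = order[s:s + B]                          # next B longest unscheduled jobs
        t = heapq.heappop(free) + p[batch[0]]           # earliest free machine; batch time = longest job
        cmax = max(cmax, t)
        heapq.heappush(free, t)
    return cmax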

Inspired by Algorithm GRLPT and Algorithm BLPT, we develop the following Algorithm GP for problem \(P|r_{j},p\text{-}batch|C_{max}\).

Algorithm GP

Whenever a machine becomes free, put up to \(B\) of the available unscheduled jobs with the largest processing times into one batch and assign this batch to the machine.
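
A minimal sketch of Algorithm GP follows (our own illustrative code; the helper name gp_makespan and the rule that a machine with no available job simply waits for the next release date are assumptions). With \(m=1\) the procedure reduces to Algorithm GRLPT, and with all release dates equal to zero it reduces to Algorithm BLPT.

import heapq

def gp_makespan(p, r, m, B):
    """Sketch of Algorithm GP for P|r_j, p-batch|Cmax; returns the makespan."""
    unscheduled = set(range(len(p)))
    free = [0.0] * m                                    # machine free times
    heapq.heapify(free)
    cmax = 0.0
    while unscheduled:
        t = heapq.heappop(free)                         # earliest free machine
        avail = [i for i in unscheduled if r[i] <= t]
        if not avail:                                   # nothing has arrived yet: wait
            t = min(r[i] for i in unscheduled)
            avail = [i for i in unscheduled if r[i] <= t]
        batch = sorted(avail, key=lambda i: -p[i])[:B]  # longest available jobs, at most B of them
        t += p[batch[0]]                                # batch processing time = longest job in it
        unscheduled.difference_update(batch)
        cmax = max(cmax, t)
        heapq.heappush(free, t)
    return cmax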

Theorem 2

The performance ratio of Algorithm GP for problem \(P|r_{j},p\text{-}batch|C_{max}\) is at most \(10/3-1/3m\).

Proof

For \(i=1,\ldots ,m\), let \(B_{i}\) be the batch on machine \(i\) that starts strictly before time \(r_{max}\) and completes at or after time \(r_{max}\), and let \(s_{i}\) and \(p^{i}\) denote its start time and processing time, respectively. Let \(l=\arg \max _{1\le i\le m}\{s_{i}+p^{i}\}\). Let \(S\) denote the set of jobs that start at or after time \(r_{max}\) and let \(C_{max}(S)\) be the makespan obtained by applying Algorithm GP to \(S\). Since all the jobs in \(S\) are available from time \(r_{max}\) onwards, Algorithm GP schedules them exactly as Algorithm BLPT would if they were all released at time zero. Then we have \(s_{l}\le r_{max}\le C_{max}^{*}\), \(p^{l}\le C_{max}^{*}\), and \(C_{max}(S)\le (4/3-1/3m)C_{max}^{*}\). It follows that

$$\begin{aligned} C_{max}\le s_{l}+p^{l}+C_{max}(S)\le \left(\frac{10}{3}-\frac{1}{3m}\right)C_{max}^{*}. \end{aligned}$$

\(\square \)

Algorithm GP is clearly an on-line algorithm: it requires no information about a job before the job arrives. We believe that the bound \(10/3-1/3m\) is not tight. The following example shows that the worst-case performance ratio of Algorithm GP is at least \(7/3-1/3(m-1)\) for any given number of machines \(m\).

Consider an instance with \(B=m+1\) and \(n=2mB=2m(m+1)\). Without loss of generality, assume that \(m\) is an odd number and let \(\delta \) be an arbitrarily small positive number. The release dates are \(r_{1}=0, r_{2}=\delta , r_{3}=2\delta ,\,\ldots , r_{m}=(m-1)\delta \), and \(r_{m+1}=r_{m+2}=\cdots =r_{n}=m\delta \). The processing times are \(p_{1}=p_{2}=\cdots =p_{m+1}=3(m+1)\), \(p_{m+2}=p_{m+3}=\cdots =p_{3(m+1)}=2m-3\), \(p_{3(m+1)+1}=\cdots =p_{5(m+1)} =2m-4\), \(\ldots \), \(p_{(2m-5)(m+1)+1}=\cdots =p_{(2m-3)(m+1)}=m\), and \(p_{(2m-3)(m+1)+1}=\cdots =p_{2m(m+1)}=m-1\). The makespan obtained by Algorithm GP is \(7m-8+\delta \) (see Table 1, in which the number in brackets denotes the processing time of a batch, the superscript denotes the number of jobs in the batch, and different batches are separated by commas). The optimal makespan of the instance is \(3(m-1)+m\delta \) (see Table 2). Hence, the performance ratio of Algorithm GP is at least \(\frac{7m-8+\delta }{3(m-1)+m\delta }\rightarrow \frac{7}{3}-\frac{1}{3(m-1)}\) as \(\delta \rightarrow 0\).
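
The limiting value follows from the identity

$$\begin{aligned} \lim _{\delta \rightarrow 0}\frac{7m-8+\delta }{3(m-1)+m\delta }=\frac{7m-8}{3(m-1)}=\frac{7(m-1)-1}{3(m-1)}=\frac{7}{3}-\frac{1}{3(m-1)}. \end{aligned}$$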

Table 1 Schedule obtained by Algorithm GP
Table 2 The optimal schedule

Considering the last release date \(r_{max}\), we next present another algorithm for problem \(P|r_{j},p\text{-}batch|C_{max}\).

Algorithm RP (Processing based on the largest Release date)

Apply Algorithm GP until there is a batch \(B_{l}\) that would start at or before time \(r_{max}\) and complete strictly after time \(r_{max}\); instead of starting batch \(B_{l}\), wait until time \(r_{max}\) and then apply Algorithm BLPT to all the remaining jobs, including the jobs that would have been placed in batch \(B_{l}\).

Theorem 3

The performance ratio of Algorithm RP is \(7/3-1/3m\) and the bound is tight.

Proof

Let \(C_{max}\) and \(C_{max}^{*}\) denote the makespan obtained by Algorithm RP and the optimal makespan, respectively. The job set processed after time \(r_{max}\) is denoted by \(S\) and the makespan obtained by applying Algorithm BLPT to job set \(S\) is denoted by \(C_{max}(S)\). Since \(r_{max}\le C_{max}^{*}\) and \(C_{max}(S)\le (4/3-1/3m)C_{max}^{*}\),

$$\begin{aligned} C_{max}=r_{max}+C_{max}(S)\le \left(\frac{7}{3}-\frac{1}{3m}\right)C_{max}^{*}. \end{aligned}$$

The following instance shows that the bound is tight. Consider an instance of problem \(P|r_{j},p\text{-}batch|C_{max}\) with \(B=m+1\) and \(n=(2m+1)(m+1)+1\). Without loss of generality, assume that \(m\) is an even number and let \(\delta \) be an arbitrarily small positive number. The release dates are \(r_{1}=0, r_{2}=\delta , r_{3}=2\delta , \ldots , r_{m}=(m-1)\delta \), \(r_{m+1}=r_{m+2}=\cdots =r_{(2m+1)(m+1)}=m\delta \), and \(r_{(2m+1)(m+1)+1}=3m\). The processing times are \(p_{1}=p_{2}=\cdots =p_{2(m+1)}=2m-1\), \(p_{2(m+1)+1}=\cdots =p_{4(m+1)}=2m-2\), \(p_{4(m+1)+1}=\cdots =p_{6(m+1)}=2m-3\), \(\ldots \), \(p_{(2m-4)(m+1)+1}=\cdots =p_{(2m-2)(m+1)}=m+1\), \(p_{(2m-2)(m+1)+1}=\cdots =p_{(2m+1)(m+1)}=m\), and \(p_{(2m+1)(m+1)+1}=\delta \). Algorithm RP generates a schedule with makespan \(7m-1\) (see Table 3), whereas the optimal makespan is \(3m+(m+1)\delta \) (see Table 4). Thus, the worst-case performance ratio of Algorithm RP is at least \(\frac{7m-1}{3m+(m+1)\delta }\rightarrow \frac{7}{3}-\frac{1}{3m}\) as \(\delta \rightarrow 0\). \(\square \)

Table 3 Schedule obtained by Algorithm RP
Table 4 The optimal schedule

We denote the special case where all the jobs have the same processing time as \(P|r_{j},p\text{-}batch,p_{j}=p|C_{max}\) and provide the following Algorithm FBERD (Full Batch Earliest Release Dates) to solve it; a small sketch of the algorithm follows its description.

Algorithm FBERD

  • Step 1. Order the jobs in non-decreasing order of their release dates and re-index them so that \(r_{1}\le r_{2}\le \cdots \le r_{n}\).

  • Step 2. Put the first \(n-\lfloor n/B\rfloor B\) jobs into one batch and assign it to one of the machines at time \(r_{j}\), where \(j=n-\lfloor n/B\rfloor B\) (if \(n\) is a multiple of \(B\), this batch is empty and this step is skipped). For the remaining jobs, whenever there are \(B\) jobs available, put them into one batch and assign it to the first available machine; otherwise, wait until there are \(B\) jobs available.
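
A minimal sketch of Algorithm FBERD is given below (our own illustrative code; the helper name fberd_makespan is an assumption, and the common processing time p is passed explicitly).

import heapq

def fberd_makespan(r, p, m, B):
    """Sketch of Algorithm FBERD for P|r_j, p-batch, p_j=p|Cmax; returns the makespan."""
    r = sorted(r)                      # Step 1: non-decreasing release dates
    n = len(r)
    first = n - (n // B) * B           # size of the (possibly empty) partial first batch
    free = [0.0] * m                   # machine free times
    heapq.heapify(free)
    cmax = 0.0
    if first:                          # Step 2: the partial batch starts when its last job arrives
        t = max(heapq.heappop(free), r[first - 1]) + p
        cmax = t
        heapq.heappush(free, t)
    for s in range(first, n, B):       # the remaining jobs form full batches of B
        t = max(heapq.heappop(free), r[s + B - 1]) + p   # wait until all B jobs have arrived
        cmax = max(cmax, t)
        heapq.heappush(free, t)
    return cmax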

Theorem 4

Algorithm FBERD yields an optimal solution to problem \(P|r_{j},p\text{-}batch,p_{j}=p|C_{max}\).

Proof

Let \(S\) be the schedule generated by Algorithm FBERD and \(S^{*}\) be an optimal schedule. Since all the jobs have the same processing time, we can adjust schedule \(S^{*}\), without increasing its makespan, so that the jobs are batched in non-decreasing order of their release dates, the first batch contains \(n-\lfloor n/B\rfloor B\) jobs, and every other batch contains exactly \(B\) jobs. Denote the resulting schedule by \(S^{\prime }\). Schedule \(S\) batches the jobs in exactly the same way and starts each batch at the earliest possible time, so its makespan does not exceed that of \(S^{\prime }\). Hence \(S\) is also optimal. \(\square \)

3 Computational experiments

We conducted a series of computational experiments on randomly generated instances to evaluate the performance of the heuristic algorithms. Since the problem under study is strongly NP-hard, it is difficult to obtain the optimal values for instances with a large number of jobs within a reasonable time. Therefore, for such instances, we compare the makespan obtained by the heuristics with the largest of the following lower bounds on the optimal value; a small computational sketch of these bounds is given after the list.

  • \(LB1\): In any feasible schedule, the earliest possible completion time of job \(j\) is \(r_{j}+p_{j}\). Hence we have \(LB1=\max \nolimits _{1\le j\le n}\{r_{j}+p_{j}\}\).

  • \(LB2\): Order the jobs in increasing order of their release dates \(r_{j}\). Let \(C_{max}^{FBLPT}(j,n)\) be the makespan obtained by applying Algorithm FBLPT to jobs \(j\) through \(n\), assuming that they are released at time zero. Algorithm FBLPT is optimal for problem \(1|p\text{-}batch|C_{max}\) [2]. Then we have

    $$\begin{aligned} LB2=\max \limits _{1\le j\le n} \left\{ r_{j}+\frac{C_{max}^{FBLPT}(j,n)}{m}\right\} . \end{aligned}$$
  • \(LB3\): Order the jobs in increasing order of their release dates \(r_{j}\). Let \(C_{max}^{BLPT}(j,n)\) be the makespan obtained by applying Algorithm BLPT to jobs \(j\) through \(n\), assuming that they are released at time zero. Algorithm BLPT is a \(4/3-1/3m\) approximation algorithm for problem \(P|p\text{-}batch|C_{max}\) [7]. Thus we have

    $$\begin{aligned} LB3=\max \limits _{1\le j\le n} \left\{ r_{j}+\frac{C_{max}^{BLPT}(j,n)}{4/3-1/3m}\right\} . \end{aligned}$$
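
The three bounds above can be computed as sketched below (our own illustrative code; the helper names are assumptions; fblpt_cmax and blpt_cmax give the single-machine FBLPT makespan and the parallel-machine BLPT makespan, respectively, of a job set treated as released at time zero).

import heapq

def best_lower_bound(p, r, m, B):
    """Sketch of LB1-LB3; returns the largest of the three bounds."""
    def fblpt_cmax(ps):                       # single-machine FBLPT makespan
        ps = sorted(ps, reverse=True)
        return sum(ps[s] for s in range(0, len(ps), B))

    def blpt_cmax(ps):                        # parallel-machine BLPT makespan
        ps = sorted(ps, reverse=True)
        free = [0.0] * m
        heapq.heapify(free)
        cmax = 0.0
        for s in range(0, len(ps), B):
            t = heapq.heappop(free) + ps[s]
            cmax = max(cmax, t)
            heapq.heappush(free, t)
        return cmax

    jobs = sorted(range(len(p)), key=lambda i: r[i])   # non-decreasing release dates
    lb1 = max(r[i] + p[i] for i in range(len(p)))
    lb2 = max(r[jobs[j]] + fblpt_cmax([p[i] for i in jobs[j:]]) / m
              for j in range(len(jobs)))
    lb3 = max(r[jobs[j]] + blpt_cmax([p[i] for i in jobs[j:]]) / (4 / 3 - 1 / (3 * m))
              for j in range(len(jobs)))
    return max(lb1, lb2, lb3)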

There are five parameters that may influence the performance of the heuristics: the number of machines (\(m\)), the machine capacity (\(B\)), the number of jobs (\(n\)), the processing times (\(p_{j}, 1\le j\le n\)), and the release dates (\(r_{j}, 1\le j\le n\)). We generated two levels of processing times from the discrete uniform distributions U[1,100] and U[1,20], respectively. To generate the release dates, we first computed the makespan \(C_{max}^{BLPT}\) obtained by applying Algorithm BLPT to all the jobs under the assumption that they are released at time zero, and then generated the release dates from a discrete uniform distribution ranging from 0 to \(FR\cdot C_{max}^{BLPT}\), where \(FR\) controls the spread of the release dates; we call it the release date ratio. We generated the instances by assigning different values to the five parameters: two values of \(m\) (3 and 5), two values of \(B\) (3 and 5), eight values of \(n\) (20, 40, 60, 80, 100, 150, 200, and 300), two processing time distributions (U[1,100] and U[1,20]), and two values of the release date ratio (0.5 and 1). In total there are 128 combinations of these five parameters. For each combination, we applied the two heuristics to ten randomly generated instances, which gives a total of 1,280 instances. We coded all the algorithms and the lower bounds in Matlab 6.0 and ran the computational experiments on a Pentium 4/2.0 GHz personal computer.

We first compare the makespan obtained by the heuristics with the optimal values obtained by solving the integer program and with the lower bound, respectively, for instances with eight jobs, where \(B=2\), \(m=2\), the processing times follow U[1,10], and the release date ratio is 0.5. Table 5 reports the computational results, where GP/LB (RP/LB) and GP/OP (RP/OP) denote the ratios of the makespan generated by Algorithm GP (RP) to the lower bound and to the optimal value, respectively. The results indicate that the ratios to the lower bound exceed the ratios to the optimal value by only about 0.2, revealing the effectiveness of the lower bounds.

Table 5 Performance ratios of the heuristics to lower bounds and optimal values

For instances with a large number of jobs, we compare the makespan obtained by the heuristics with the lower bounds. Tables 6, 7, 8 and 9 report the computational results for these instances. For the ten instances of each combination, they show the average and maximum ratios of the makespan obtained by the heuristics to the largest of the lower bounds on the optimal value. In general, both algorithms perform well since almost all the ratios are between 1 and 1.5, and Algorithm GP performs much better than Algorithm RP with respect to both the average ratio and the maximum ratio because the latter delays some jobs until the largest release date.

Table 6 Performance ratios of the heuristics to lower bounds for instances with processing times U[1,100] and release date ratio 0.5
Table 7 Performance ratios of the heuristics to lower bounds for instances with processing times U[1,100] and release date ratio 1
Table 8 Performance ratios of the heuristics to lower bounds for instances with processing times U[1,20] and release date ratio 0.5
Table 9 Performance ratios of the heuristics to lower bounds for instances with processing times U[1,20] and release date ratio 1

Next we examine the results with respect to the five parameters. From the four tables we notice that the performance patterns of the heuristics are very different between the cases of a smaller release date ratio and a larger release date ratio, so we discuss the computational results according to these two cases.

If the jobs are released frequently, i.e., the release date ratio is small, then Tables 6 and 8 show that the performance of both algorithms improves as the number of jobs increases, the number of machines decreases, the machine capacity decreases, or the processing time variance decreases. The reason is that if the number of jobs increases, the processing time variance decreases, or the machine capacity decreases, more jobs with close processing times are available at any given time, and both algorithms then tend to assign jobs with close processing times to the same batch. If the number of machines decreases, the workload can be balanced across the machines more easily than in the case with a large number of machines.

If the jobs are released less frequently, the above-mentioned pattern no longer exists, and there are no significant differences in the performance ratios of Algorithms GP and RP as the number of jobs, the number of machines, the machine capacity, or the processing time variance changes.

With respect to the release dates, both algorithms perform better when the jobs are released less frequently than when the release date ratio is small. However, this performance gap becomes smaller as the number of jobs increases.

4 Conclusions

In this paper we consider the problem of scheduling jobs with release dates on parallel batch processing machines to minimize the makespan. Since this problem is strongly NP-hard, we provide two efficient heuristic algorithms and evaluate their performance by establishing worst-case performance ratios and conducting computational experiments. We also identify a special case of the problem that can be solved in polynomial time.

A challenging topic for future research is to establish a tighter performance ratio for the on-line Algorithm GP. Developing more efficient heuristics than those presented in this paper is also worth further study.