Abstract
This paper presents an adapted trust-region method for computationally expensive black-box optimization problems with mixed binary variables that exhibit a cyclic symmetry property. Such mixed-binary problems arise in several practical optimal design applications, e.g., aircraft engine turbines, mooring lines of offshore wind turbines, and electric engine stators and rotors. The motivating application for this study is the optimal design of helicopter bladed-disk turbomachines. The necklace concept is introduced to deal with the cyclic symmetry property and to avoid costly black-box objective-function evaluations at equivalent solutions. An adapted distance is proposed for the discrete-space exploration step of the optimization method. A convergence analysis is presented for the trust-region derivative-free algorithm DFOb-\(d_H\), extended to the mixed-binary case and based on the Hamming distance. The convergence proof is then extended to the new algorithm, DFOb-\(d_{neck}\), based on the necklace distance. Computational comparisons with state-of-the-art black-box optimization methods are performed on a set of analytical problems and on a simplified industrial application.
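For intuition, the cyclic-symmetry issue can be illustrated with binary strings: two blade patterns that differ only by a rotation describe the same physical design, so an expensive black-box simulator should be evaluated only once per equivalence class (necklace). The paper's necklace distance \(d_{neck}\) is defined in the main text; the sketch below uses a generic rotation-invariant surrogate — the minimum Hamming distance over cyclic rotations — purely as an illustration, not as the paper's exact definition:

```python
def hamming(a, b):
    """Plain Hamming distance between two equal-length binary tuples."""
    return sum(ai != bi for ai, bi in zip(a, b))

def rotations(a):
    """All cyclic rotations of tuple a."""
    return [a[k:] + a[:k] for k in range(len(a))]

def cyclic_hamming(a, b):
    """Rotation-invariant surrogate distance: minimum Hamming distance
    over all cyclic rotations of b (illustrative, not the paper's d_neck)."""
    return min(hamming(a, r) for r in rotations(b))

def canonical(a):
    """Canonical necklace representative: lexicographically smallest rotation.
    Patterns that are rotations of each other share one representative, so a
    costly black-box need only be evaluated once per necklace."""
    return min(rotations(a))

p1 = (1, 0, 0, 1, 0)
p2 = (0, 1, 0, 0, 1)   # p1 rotated by one position: same physical design
assert canonical(p1) == canonical(p2)   # same equivalence class
assert cyclic_hamming(p1, p2) == 0      # rotation-invariant distance is zero
assert hamming(p1, p2) == 4             # plain Hamming distance is not
```

The contrast between the last two assertions is the point: the plain Hamming distance treats equivalent designs as far apart, which is what motivates a rotation-aware distance for the discrete exploration step.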
Notes
The number of positive integers between 1 and n that are relatively prime to n.
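The quantity in this footnote is Euler's totient \(\varphi(n)\), the classical ingredient in Burnside-type counting of binary necklaces (binary strings up to rotation). The snippet below is a generic illustration of that count, not code from the paper:

```python
from math import gcd

def phi(n):
    """Euler's totient: the number of positive integers between 1 and n
    that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def binary_necklaces(n):
    """Number of binary necklaces of length n (strings up to rotation),
    by Burnside's lemma: (1/n) * sum over d | n of phi(d) * 2^(n/d)."""
    return sum(phi(d) * 2 ** (n // d) for d in range(1, n + 1) if n % d == 0) // n

assert phi(12) == 4               # 1, 5, 7, 11
assert binary_necklaces(4) == 6   # 0000 0001 0011 0101 0111 1111
```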
Acknowledgements
We thank the anonymous referees for their valuable remarks, and Safran Tech and IFP Energies Nouvelles for funding the Ph.D. position of the first author.
Appendices
Appendix A Proof of Lemma 2
Lemma 2
Let \((x_0,y_0)\) be the initial iterate. Under Assumptions 1 and 2, the model \({\widetilde{m}}(\cdot ,y_0)\), constructed from \({\widetilde{m}}(x,y)\) by fixing \(y = y_0\), is fully linear in \(B_{y_0}(x_0,\varDelta _x)\). In other words, there exist \(\kappa _f^*, \kappa _g^* > 0\) such that, for all \(x \in B_{y_0}(x_0,\varDelta _x)\):
$$\begin{aligned} |f(x,y_0) - {\widetilde{m}}(x,y_0)| \le \kappa _f^* \varDelta _x^2, \end{aligned}$$
and
$$\begin{aligned} \Vert \triangledown _x f(x,y_0) - \triangledown _x {\widetilde{m}}(x,y_0)\Vert \le \kappa _g^* \varDelta _x. \end{aligned}$$
Proof
The model constructed in the mixed space is given as:
$$\begin{aligned} {\widetilde{m}}(z) = c + g^T z + \dfrac{1}{2} z^T H z, \end{aligned}$$
where \(z = (x,y), \quad g = (g_x,g_y), \quad H = \begin{pmatrix} H_{xx} &{} H_{xy}\\ H_{yx}&{}H_{yy} \end{pmatrix}\), \(H_{yx} = H_{xy}^T\), and where \(H_{xx}, H_{yy}\) are symmetric matrices.
Thus, the model with \(y\) fixed to \(y_0\) is defined as follows:
$$\begin{aligned} {\widetilde{m}}(x,y_0) = \bar{c}_x + \bar{g}_x^T x + \dfrac{1}{2} x^T \bar{H}_x x, \end{aligned}$$
with \(\bar{c}_x =\Big ( c + g_y^T y_0 + \dfrac{1}{2} y_{0}^T H_{yy} y_0 \Big )\), \(\bar{g}_x = \Big ( g_x + H_{xy} y_0 \Big )\) and \(\bar{H}_{x} = H_{xx}\).
The gradient of \({\widetilde{m}}(x,y_0)\) with respect to \(x\) is therefore:
$$\begin{aligned} \triangledown _x {\widetilde{m}}(x,y_0) = \bar{g}_x + \bar{H}_x x. \end{aligned}$$
For convenience, let us introduce the following notation: \(f_0(x)=f(x,y_0)\), \({\widetilde{m}}_0(x)= {\widetilde{m}}(x,y_0)\), \(\triangledown f_0(x) = \triangledown _x f(x,y_0)\), \(\triangledown {\widetilde{m}}_0(x) =\triangledown _x{\widetilde{m}}(x,y_0)\) and \(B_0(\varDelta _x)=B_{y_0}(x_0, \varDelta _x)\).
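As a sanity check, the reduction above can be verified numerically: fixing \(y = y_0\) in the mixed quadratic model must reproduce the stated coefficients \(\bar{c}_x\), \(\bar{g}_x\), \(\bar{H}_x\). The coefficient values in this sketch are illustrative toy numbers, not data from the paper:

```python
# Check the reduction used in the proof of Lemma 2: fixing y = y0 in the
# mixed quadratic model  m(z) = c + g^T z + (1/2) z^T H z,  z = (x, y),
# yields  m0(x) = c_bar + g_bar^T x + (1/2) x^T Hxx x  with
#   c_bar = c + gy^T y0 + (1/2) y0^T Hyy y0,   g_bar = gx + Hxy y0.
# All coefficient values below are illustrative toy choices.

def quad(c, g, H, z):
    """Evaluate c + g^T z + (1/2) z^T H z for plain-list vectors/matrices."""
    m = len(z)
    lin = sum(g[i] * z[i] for i in range(m))
    qd = sum(z[i] * H[i][j] * z[j] for i in range(m) for j in range(m))
    return c + lin + 0.5 * qd

nx, ny = 2, 1                               # dimensions of x and y
c = 1.0
g = [0.5, -1.0, 2.0]                        # (g_x | g_y)
H = [[2.0, 0.0, 1.0],                       # symmetric: [Hxx Hxy; Hyx Hyy]
     [0.0, 3.0, -1.0],
     [1.0, -1.0, 4.0]]
y0 = [1.0]

# Extract blocks and form the reduced coefficients of m(., y0).
gx, gy = g[:nx], g[nx:]
Hxx = [row[:nx] for row in H[:nx]]
Hxy = [row[nx:] for row in H[:nx]]
Hyy = [row[nx:] for row in H[nx:]]
c_bar = c + sum(gy[k] * y0[k] for k in range(ny)) \
          + 0.5 * sum(y0[i] * Hyy[i][j] * y0[j]
                      for i in range(ny) for j in range(ny))
g_bar = [gx[i] + sum(Hxy[i][k] * y0[k] for k in range(ny)) for i in range(nx)]

x = [0.3, -0.7]
full = quad(c, g, H, x + y0)                # mixed model evaluated at (x, y0)
reduced = quad(c_bar, g_bar, Hxx, x)        # reduced model m0(x)
assert abs(full - reduced) < 1e-12          # the two agree at every x
```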
We define
$$\begin{aligned} err _0^f(x) = f_0(x) - {\widetilde{m}}_0(x). \end{aligned}$$
For all \(x^i \in B_0(\varDelta _x)\), we develop
Since \(f_0\) is continuously differentiable, we have:
which implies that
Using (40) with \(x = x_0\), we obtain:
First, note that \(err_0^f(x_0) = 0\). Then, for each term of (41), we obtain the following upper bounds:
-
From the Lipschitz property of \(f_0(x)\) (Assumption 1), one has:
$$\begin{aligned} \big |\int _0^1 (x^i - x)^T ( \triangledown f(x + t(x^i - x)) -\triangledown f(x))dt \big |\le & {} \dfrac{1}{2}\nu \Vert x^i-x\Vert ^2\nonumber \\\le & {} \frac{1}{2} \nu (2\varDelta _x)^2\nonumber \\\le & {} 2 \nu \varDelta _x^2. \end{aligned}$$(42) -
In the same way, we have:
$$\begin{aligned} \big |\int _0^1 (x_0 - x)^T ( \triangledown f(x + t(x_0 - x)) -\triangledown f(x))dt\big |&\le \dfrac{1}{2}\nu \Vert x_0-x\Vert ^2 \le \dfrac{1}{2} \nu \varDelta _x^2. \end{aligned}$$(43) -
In the following two inequalities, note that \(\Vert \bar{H}_x\Vert _F\) is bounded from Assumption 2:
$$\begin{aligned} |\dfrac{1}{2}(x^i - x)^T \bar{H}_x (x^i - x)|\le & {} \dfrac{1}{2} \Vert \bar{H}_x\Vert _F \Vert x^i-x\Vert ^2 \nonumber \\\le & {} \dfrac{1}{2} \Vert \bar{H}_x\Vert _F(2 \varDelta _x)^2 \le 2 \Vert \bar{H}_x\Vert _F \varDelta _x^2. \end{aligned}$$(44)$$\begin{aligned} |\dfrac{1}{2}(x_0 - x)^T \bar{H}_x (x_0 - x)|\le & {} \dfrac{1}{2} \Vert \bar{H}_x\Vert _F \Vert x_0-x\Vert ^2 \le \dfrac{1}{2} \Vert \bar{H}_x\Vert _F \varDelta _x^2. \end{aligned}$$(45) -
There exists \(\epsilon ' > 0\) such that
$$\begin{aligned} |err_0^f(x^i)| \le \epsilon ' \varDelta _x^2, \end{aligned}$$(46)which can be shown by contradiction. Indeed, suppose that we have
$$\begin{aligned} |err_0^f(x^i)|> \epsilon ' \varDelta _x^2 \quad \forall \epsilon ' > 0. \end{aligned}$$(47)By definition of \(err_0^f\) and from the continuity assumption on \(f_0\) and \({\widetilde{m}}\) on \(B_0(\varDelta _x)\), there exist \(\epsilon _1, \epsilon _2 >0\) such that:
$$\begin{aligned} |err_0^f(x^i)|= & {} |f_0(x^i) - {\widetilde{m}}_0(y_0)| \nonumber \\= & {} |f_0(x^i) - f_0(x_0)+{\widetilde{m}}_0(x_0)- {\widetilde{m}}_0(x^i)| \nonumber \\\le & {} |f_0(x^i) - f_0(x_0)| + |{\widetilde{m}}_0(x_0)- {\widetilde{m}}_0(x^i)| \nonumber \\\le & {} (\epsilon _1 + \epsilon _2)\varDelta _x. \end{aligned}$$(48)Thus, setting \(\epsilon ' = \dfrac{\epsilon _1 + \epsilon _2}{\varDelta _{x,min}} \ge \dfrac{\epsilon _1 + \epsilon _2}{\varDelta _x}\) in (47) contradicts (48).
Thus, we find from (41) and the inequalities [(42)–(46)]:
with \(\epsilon = \dfrac{2}{5} \epsilon '\).
Using now Cauchy–Schwarz inequality, we obtain:
Consider now the matrix \(X = \dfrac{1}{\varDelta _x} \big [ x^1 - x_0, x^2 - x_0, \ldots , x^p -x_0 \big ]\).
We recall that the interpolation set Z is defined as
Since \(Z\) is poised, \(Z\) has full rank, i.e., \({{\,\mathrm{rank}\,}}(Z) = \min (p,m+n) = m+n\), based on the fact that \(p>m+n\) (see Sect. 2.1), and the column vectors of \(X\) are linearly independent. Therefore, \(X^T\) is a non-singular matrix.
We have
Then, we obtain from inequality (49):
Moreover, we have:
Thus, we obtain:
Recovering \(err_0^f(x)\) by Eq. (40), we have:
We complete the proof by defining the two required constants:
and
\(\square\)
Appendix B Proof of Proposition 1
Proposition 1
Let \(\mu >0\) be a given constant, N, n be positive integers, and let \(f:\varOmega \subseteq {\mathbb {R}}^N \rightarrow {\mathbb {R}}\) be a quadratic function, \(g_i: \varOmega \rightarrow {\mathbb {R}}\), \(i = 1, 2, \ldots ,n\), be real-valued functions satisfying \(0 \le g_i (z) \le M\), for all \(z\in \varOmega ,\) for some \(M>0\). Then, the following two optimization problems are equivalent:
$$\begin{aligned} (P_1)\quad \min _{z \in \varOmega ,\, t} \; f(z) + \mu t \quad \text {s.t.} \quad t = \min _{ i= 1,2, \ldots , n} \{g_i (z)\}, \end{aligned}$$
and
$$\begin{aligned} (P_2)\quad \min _{z \in \varOmega ,\, y,\, t} \; f(z) + \mu t \quad \text {s.t.} \quad t \ge g_i (z) - M y_i, \; i = 1,\ldots ,n, \quad \sum _{i=1}^n y_i = n-1, \quad y \in \{0,1\}^n. \end{aligned}$$
Proof
We prove the proposition in two steps:
-
firstly, we show that (\(P_2\)) is a relaxation of (\(P_1\)), in the sense that if \((\bar{z},\bar{t})\) is a feasible solution of (\(P_1\)), then there exists \(\bar{y}\) such that \((\bar{z},\bar{y},\bar{t})\) is a feasible solution of (\(P_2\));
-
secondly, we prove that for any optimal solution \((z^*,y^*,t^*)\) of (\(P_2\)), the pair \((z^*,t^*)\) is feasible for (\(P_1\)).
Let us consider the first assertion: (\(P_2\)) is a relaxation of (\(P_1\)).
Let \((\bar{z},\bar{t})\) be a feasible solution of (\(P_1\)).
Consider now the point \((\bar{z},\bar{y},\bar{t})\) where, for \(i = 1,2,\ldots ,n\):
$$\begin{aligned} \bar{y}_i = {\left\{ \begin{array}{ll} 0 &{} \text {if } i = I,\\ 1 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$
with \(I\) the smallest index such that \(g_I(\bar{z}) = \bar{t}\), so that \(I\) is the unique index \(i\) such that \(\bar{y}_i = 0\).
From the definition of \(\bar{y}\) , we note that:
-
\(\bar{t} = g_I(\bar{z})\),
-
\(\bar{y}_I = 0\),
-
\(\bar{y}_i = 1\) for all \(i \ne I\).
Then, for \(i \ne I\), the constraint \(\bar{t} \ge g_i (\bar{z}) - M\) holds since
$$\begin{aligned} \bar{t} = \min _{ j= 1,2, \ldots , n} \{g_j (\bar{z})\} \ge 0 \ge g_i (\bar{z}) - M. \end{aligned}$$
And for \(i = I\), since \(\bar{y}_I = 0\), the corresponding constraint reads \(\bar{t} \ge g_I (\bar{z})\), which holds with equality by definition of \(\bar{t}\).
Then, \((\bar{z}, \bar{y}, \bar{t})\) is feasible for (\(P_2\)).
For the second step, let us now show that: if \((z^*, y^*, t^*)\) is an optimal solution of (\(P_2\)) then \((z^*, t^*)\) is feasible for (\(P_1\)), i.e., we want to prove that \(t^* = \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\).
By contradiction, suppose that this optimal solution of (\(P_2\)) is such that \(t^* \ne \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\).
Let us consider two cases:
-
either \(t^* < \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\),
-
or \(t^* > \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\).
Let \(I_{y^*}\) denote the unique index i such that \(y^*_i = 0\).
Then, \(y^*_{I_{y^*}} = 0\) and \(y^*_i = 1\), for all \(i \ne I_{y^*}\).
Using the fact that \((z^*, y^*, t^*)\) is a feasible solution for (\(P_2\)), the \(I_{y^*}\)th constraint gives
$$\begin{aligned} t^* \ge g_{I_{y^*}} (z^*) - M y^*_{I_{y^*}} = g_{I_{y^*}} (z^*) \ge \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}. \end{aligned}$$(60)
In the first case, \(t^* < \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\) directly contradicts (60).
Therefore, the second case necessarily holds, i.e., \(t^* > \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\). Consider now the solution \((\bar{z}, \bar{y}, \bar{t})\) defined by \(\bar{z} = z^*\), \(\bar{t} = \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\), and
$$\begin{aligned} \bar{y}_i = {\left\{ \begin{array}{ll} 0 &{} \text {if } i = I^*,\\ 1 &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$
where \(I^*\) is an index attaining the minimum, i.e., \(g_{I^*} (z^*) = \min \limits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\). We have:
-
This new solution \((\bar{z}, \bar{y}, \bar{t})\) is feasible for (\(P_2\)). Indeed, for \(i \ne I^*,\) the \(i^{th}\) constraint, \(t \ge g_i(z) - M y_i\), is satisfied for \((\bar{z},\bar{y},\bar{t})\), since M is an upper bound for the function \(g_i(z)\) and \(\bar{y}_i = 1\).
If \(i = I^*\), then on the one hand \(\bar{t} = \min \nolimits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\), and on the other hand \(g_{I^*}(\bar{z}) = g_{I^*}(z^*) = \min \nolimits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\) by definition of \(I^*\). Since \(\bar{y}_{I^*} = 0\), the \(I^{*}\)th constraint of (\(P_2\)) is therefore satisfied for \((\bar{z}, \bar{y}, \bar{t})\).
-
In terms of objective-function values, it is clear that
$$\begin{aligned} f(z^*) + \mu t^* > f(\bar{z}) + \mu \bar{t}, \end{aligned}$$since by hypothesis \(t^* > \min \nolimits _{ i= 1,2, \ldots , n} \{g_i (z^*)\}\), while \(\bar{t} = \min \nolimits _{ i= 1,2, \ldots , n}\{g_i (z^*)\}\). This contradicts the optimality of \((z^*, y^*, t^*)\).
\(\square\)
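Proposition 1's big-M reformulation can be sanity-checked numerically: minimizing \(f(z) + \mu t\) with \(t = \min _i g_i(z)\) agrees with the form that introduces binaries \(y\) with exactly one \(y_i = 0\) and constraints \(t \ge g_i(z) - M y_i\). A brute-force sketch on a toy discretized instance follows; all functions and constants are illustrative choices, not taken from the paper:

```python
from itertools import product

# Toy instance: z ranges over a small grid, f quadratic, 0 <= g_i(z) <= M.
Z = [k / 10 for k in range(-20, 21)]       # discretized feasible set for z
M, mu, n = 4.0, 0.5, 2                     # bound on g_i, penalty weight, #functions
f = lambda z: (z - 0.3) ** 2               # quadratic objective
gs = [lambda z: (z + 1.0) ** 2 % 4.0,      # values stay in [0, M) on Z
      lambda z: abs(z) % 4.0]

# (P1): minimize f(z) + mu * t subject to t = min_i g_i(z).
best_p1 = min(f(z) + mu * min(g(z) for g in gs) for z in Z)

# (P2): binaries y with sum(y) = n - 1 (exactly one y_i = 0) and
# constraints t >= g_i(z) - M * y_i; minimize f(z) + mu * t.
best_p2 = float("inf")
for z in Z:
    for y in product((0, 1), repeat=n):
        if sum(y) != n - 1:
            continue
        # For fixed (z, y), the smallest feasible t is the largest
        # right-hand side among the n constraints.
        t = max(gs[i](z) - M * y[i] for i in range(n))
        best_p2 = min(best_p2, f(z) + mu * t)

assert abs(best_p1 - best_p2) < 1e-12      # the two optimal values agree
```

The check mirrors the proof: for fixed \(z\), choosing the zero at \(y_{I}\) forces \(t \ge g_I(z)\), while every other constraint is slack because \(g_i(z) - M \le 0\), so minimizing over the \(n\) choices of \(I\) recovers \(\min_i g_i(z)\) exactly.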
Tran, T.T., Sinoquet, D., Da Veiga, S. et al. Derivative-free mixed binary necklace optimization for cyclic-symmetry optimal design problems. Optim Eng 24, 353–394 (2023). https://doi.org/10.1007/s11081-021-09685-1