Introduction

Computational HT screening is nowadays a consolidated approach to materials discovery1,2,3, as a complementary and accelerated tool with respect to experimental efforts. In the last decade, seminal works in this field have addressed, among many other topics, the discovery of novel 2D materials4,5,6,7,8,9,10, the identification of optimal new lithium-ion battery anodes11,12, thermoelectric13,14, photocatalysts15, and photovoltaic light harvesting16,17,18 materials. The success of these studies relies on the development of different software and implementations19 that were able to encode complex domain-specific knowledge into automated and robust workflows – enforcing rigorous computational protocols19,20 and managing all the steps concerning a simulation – which thus require the least possible human intervention21,22,23,24,25,26,27,28,29.

Concerning the electronic structure field, most of these works and implementations are based on DFT, which allows one to compute total energies, optimized geometries, and other ground state properties of materials with predictive accuracy. However, different approaches are required for the accurate prediction of excited-state properties of materials, such as quasiparticle (QP) band structures and absorption spectra, which are typically crucial for the description of active processes in modern optoelectronic technologies, like photovoltaics, photocatalysis, light-emitting diodes (LEDs), photodetectors, and solar cells30,31,32,33,34,35,36. In this context, MBPT and Green’s function methods represent the state-of-the-art tools, where charged (electronic quasi-particle levels) and neutral excitations (optical properties, electron energy loss spectra) can be obtained by means of the GW approximation and the Bethe-Salpeter equation (BSE), respectively37.

To date, a limited number of attempts have been made toward automation38 and HT screening9,39 based on these MBPT approaches, mainly because of their conceptual and computational complexity. Indeed, depending on the specific physical problem, different levels of theory might be adopted40, further branching off depending on the chosen approximations and implementations41,42. Furthermore, even for the simplest approximations, these calculations require the control over a much larger parameter space with respect to DFT, with parameters that are often interdependent and might change depending on the specific implementations adopted, but the choice of which is always crucial to obtain reliable results. From a computational point of view, this type of simulations are constituted by a chain of distinct steps, often relying on the usage of different software tools, each of them with their own specificity, e.g., in terms of memory and parallelization requirements. Memory requirements are much heavier than in standard DFT simulations even for moderate system size, and often require massive usage of parallel computing resources. Calculations often fail due to memory overflow and have to be restarted with careful choice of parameters. All of these problems make the application of MBPT-based approaches a complex and difficult task per se, and its automation still an open challenge.

Building on pioneering works in the field38,39, we here focus on the development of an improved algorithm aiming at an efficient and computationally cost-effective management of the choice of converged parameters for accurate GW-BSE calculations. The algorithm is implemented in the AiiDA framework28,29, a platform that is routinely used for HT studies6,43,44,45 and that incorporates the ADES model for Automation, Data, Environment, and Sharing24. As detailed below, this implementation allows us to encode an efficient error handling, memory and parallelization management, and logic computational flows within automated python workflows. Moreover, it guarantees a seamless interoperability of different software codes that tackle the different steps usually involved in MBPT simulations, i.e., the preliminary DFT part (here using Quantum ESPRESSO46,47), the GW-BSE calculations (Yambo48,49), and any required post-processing. In particular, for the latter, we here introduce a scheme based on maximally localized Wannier functions50 for the automatic interpolation of GW band structures, which interfaces the Wannier9051 and Yambo projects. All of these developments point to a drastic reduction of both human and computational efforts, key issues for enabling HT studies. In addition, by incorporating the different domain-specific scientific and computational competences into robust and reliable workflows, we aim at making accurate GW-BSE calculations available for the materials science community at large, including non-experts in the field (e.g., via graphical user interfaces52), similarly to what has already happened with DFT.

The manuscript is organized as follows. In the Results and Discussion Section, we first introduce a model for the convergence surface in the N-dimensional space of the GW (BSE) variables, and present our improved algorithm for efficiently retrieving converged values for the main (interdependent) parameters. The implementation of this algorithm within the aiida-yambo plugin is then described, followed by the presentation of the aiida-yambo-wannier90 plugin that encodes the GW band interpolation based on Wannierisation. Both these implementations are validated for selected prototypical systems. Additional details on simulations are provided in the Methods section. In the remainder of this section, we instead introduce the main concepts and quantities related to the GW and BSE schemes, which will be useful to properly understand the subsequent Sections.

Accurate electronic band structures of materials can be computed within the MBPT framework by correcting the Kohn–Sham (KS) DFT eigenvalues with a self-energy term Σ by means of the GW approximation, i.e. Σ is approximated with the first term of the perturbation series expansion in terms of the screened Coulomb interaction W53. We hereafter consider the simplest and most widespread implementation of GW, i.e. the so-called G0W0 approach, where G0 is the KS independent-particle one-body Green’s function and W is computed within the random-phase approximation (RPA)54. Nevertheless, the convergence algorithm presented here can be used also for more sophisticated flavors of the theory, e.g. the self-consistent GW37. Under these assumptions and by considering a plane-wave expansion, the self-energy term Σnk for a band n at a given k-point – written as the sum of the Fock exchange (Σx) and the frequency-dependent correlation (Σc) terms – is given by:

$$\Sigma_{n{{{\bf{k}}}}}^{x}=-\mathop{\sum }\limits_{m}^{{{{\rm{occ}}}}}\int\frac{d{{{\bf{q}}}}}{{(2\pi )}^{3}}\mathop{\sum }\limits_{{{{\bf{G}}}}}^{{G}_{{{{\rm{cut}}}}}^{x}}{v}_{{{{\bf{G}}}}}({{{\bf{q}}}}){\left\vert {\rho }_{nm}({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}})\right\vert }^{2}{f}_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}$$
(1)

and

$$\begin{array}{ll}\Sigma_{n{{{\bf{k}}}}}^{c}(\omega )=\displaystyle\,{{{\rm{i}}}}\mathop{\sum }\limits_{m}^{{N}_{b}}\int\frac{d{{{\bf{q}}}}}{{(2\pi )}^{3}}\mathop{\sum }\limits_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}^{{G}_{{{{\rm{cut}}}}}}{\rho }_{nm}({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}}){\rho }_{nm}^{* }({{{\bf{k}}}},{{{\bf{q}}}},{{{{\bf{G}}}}}^{{\prime} })\\\qquad\qquad \displaystyle\times\,\int\,\frac{d{\omega }^{{\prime}}}{2\pi}{W}_{{{{{\bf{GG}}}}}^{{\prime} }}({{{\bf{q}}}},{\omega }^{{\prime} })\\\qquad\qquad \displaystyle \times\,\left[\frac{{f}_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}}{\omega -{\omega }^{{\prime} }-{\epsilon }_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}-{{{\rm{i}}}}\eta }+\frac{1-{f}_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}}{\omega -{\omega }^{{\prime} }-{\epsilon }_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}+{{{\rm{i}}}}\eta }\right],\end{array}$$
(2)

where vG(q) and \({W}_{{{{{\bf{GG}}}}}^{{\prime} }}({{{\bf{q}}}},{\omega }^{{\prime} })\) are the bare and screened Coulomb interaction, respectively. The generalized dipole matrix elements ρnm(k, q, G) are defined as:

$${\rho }_{nm}({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}})=\left\langle n{{{\bf{k}}}}\right\vert {{{{\rm{e}}}}}^{{{{\rm{i}}}}({{{\bf{q}}}}+{{{\bf{G}}}})\cdot {{{\bf{r}}}}}\left\vert m{{{\bf{k}}}}-{{{\bf{q}}}}\right\rangle ,$$
(3)

where \({\epsilon }_{n{{{\bf{k}}}}}^{{{{\rm{KS}}}}}\) and \(\left\vert n{{{\bf{k}}}}\right\rangle\) are the corresponding KS eigenvalues and eigenvectors. Here, W can be written in terms of the reducible polarizability χ (W = v + vχv), which is in turn computed by solving a Dyson equation for the RPA irreducible polarizability χ0:

$${\chi }_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}({{{\bf{q}}}},\omega )=\mathop{\sum }\limits_{{{{{\bf{G}}}}}^{{\prime\prime} }}^{{G}_{{{{\rm{cut}}}}}}{\left[I-v({{{\bf{q}}}}){\chi }^{0}({{{\bf{q}}}},\omega )\right]}_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime\prime} }}^{-1}{\chi }_{{{{{\bf{G}}}}}^{{\prime\prime} }{{{{\bf{G}}}}}^{{\prime} }}^{0}({{{\bf{q}}}},\omega ),$$
(4)

where

$$\begin{array}{l}{\chi }_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}^{0}({{{\bf{q}}}},\omega )=\displaystyle2\mathop{\sum }\limits_{nm}^{{N}_{b}}{\int}\frac{d{{{\bf{k}}}}}{{(2\pi )}^{3}}{\rho }_{mn}^{* }({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}}){\rho }_{mn}({{{\bf{k}}}},{{{\bf{q}}}},{{{{\bf{G}}}}}^{{\prime} })\\\qquad\qquad\qquad\displaystyle\times\left[\frac{{f}_{n{{{\bf{k}}}}-{{{\bf{q}}}}}(1-{f}_{m{{{\bf{k}}}}})}{\omega +{\epsilon }_{n{{{\bf{k}}}}-{{{\bf{q}}}}}-{\epsilon }_{m{{{\bf{k}}}}}+i\eta}-\frac{{f}_{n{{{\bf{k}}}}-{{{\bf{q}}}}}(1-{f}_{m{{{\bf{k}}}}})}{\omega +{\epsilon }_{m{{{\bf{k}}}}}-{\epsilon }_{n{{{\bf{k}}}}-{{{\bf{q}}}}}-i\eta}\right].\end{array}$$
(5)

In practice, the above quantities are computed by introducing specific approximations that crucially impact both the accuracy and the computational and memory costs of the calculations. In particular, Eq. (4) as well as Eq. (2) involve a discrete, finite number of G vectors, set by a cutoff parameter Gcut, which defines the size of the response matrix. Moreover, with the parameter Nb, we introduce a truncation over the KS states summation for both response \({\chi }_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}^{0}\) (Eq. (5)) and self-energy Σc (Eq. (2)), which should in principle include all occupied and an infinite number of empty states. Finally, all the integrals in reciprocal space are computed on a discrete k-points (q-points) grid whose size, Nk, defines the accuracy of the sampling of the Brillouin zone (BZ). All of these three parameters, Gcut, Nb, and Nk, need to be increased till the desired convergence is reached.

One of the major obstacles to automate the convergence procedure lies in the interdependence of the first two parameters, Gcut and Nb, such that their convergence has to be performed jointly, as thoroughly discussed elsewhere38,42,55. Indeed, Eqs. (4) and (5) contains a summation over both empty states and reciprocal lattice vectors (G), and the expression of the generalized dipole terms in Eq. (3) is such that matrix elements with large G are governed by high-energy KS states38. Furthermore, the interdependence is non-trivial, given the presence of an inversion in Eq. (4) that further enters in the evaluation of the correlation self-energy, Eq. (2). Given the lack of an efficient “recipe” to carry out this non-trivial, coupled convergence, one has to resort to an iterative convergence of each of the two parameters by fixing the other, in an alternating way. This procedure, in addition of being tedious, is computationally very expensive, sometimes representing the most cumbersome part of a GW calculation.

Starting from the GW quasi-particles, the solution of the BSE54,56 can give access to optical properties of materials via the macroscopic dielectric function:

$$\begin{array}{l}{\epsilon }_{M}({{{\bf{q}}}},\omega)=\displaystyle1-\frac{2}{V{N}_{q}}v({{{\bf{q}}}})\mathop{\sum}\limits_{\lambda }{\Phi }_{\lambda }({{{\bf{q}}}})\\ \qquad\qquad\qquad\displaystyle\times \left[\frac{1}{{E}^{\lambda }({{{\bf{q}}}})-(\omega +{{{\rm{i}}}}\eta )}+\frac{1}{{E}^{\lambda }({{{\bf{q}}}})+(\omega +{{{\rm{i}}}}\eta )}\right],\end{array}$$
(6)

where V is the volume of the unit cell; Nq is the number of q-points sampling the Brillouin zone (BZ); Eλ(q) is the eigenvalue of the exciton λ at momentum \({{{\bf{q}}}}={{{\bf{k}}}}-{{{{\bf{k}}}}}^{{\prime} }\), and the corresponding exciton oscillator strength Φλ(q) is defined as:

$${\Phi }_{\lambda }({{{\bf{q}}}})={\left\vert \mathop{\sum}\limits_{vc,{{{\bf{k}}}}}\left\langle v{{{\bf{k}}}}-{{{\bf{q}}}}\right\vert {{{{\rm{e}}}}}^{-{{{\rm{i}}}}{{{\bf{q}}}}\cdot {{{\bf{r}}}}}\left\vert c{{{\bf{k}}}}\right\rangle {A}_{vc{{{\bf{k}}}}}^{\lambda }({{{\bf{q}}}})\right\vert }^{2}.$$
(7)

Equation (7) contains a summation over valence and conduction bands (v, c), and k-point mesh, which are the main parameters to converge for a BSE calculation. The terms \({A}_{vc{{{\bf{k}}}}}^{\lambda }({{{\bf{q}}}})\) represent the weight of each electron-hole transition contributing the exciton λ, as resulting from the solution of the BSE. The summation over bands mainly defines the range of energies under investigation. The k-point mesh is connected to the accuracy with which we describe the exciton composition in terms of single-particle v → c transitions over the entire BZ, and it is usually significantly larger than the one needed to converge the corresponding GW band structure. Additional convergence parameters are the number of G vectors used for expanding the KS wave-functions in the transition matrix elements, as defined in Eq. (3), and for Fast-Fourier-Transform (FFT) operations, as well as the plane-wave expansion and empty bands summation in the BSE kernel (both direct and exchange terms). These last three parameters are usually inherited from GW convergences. The convergence procedure proposed in this work, described in detail in the next Section, consists in the same algorithm for both GW and BSE, i.e. is agnostic on the observable (gaps, excitonic eigenvalues) that we are considering. For this reason and to avoid repetitions, we will focus herein on GW convergences only for the validation of the workflow discussed below. An example of BSE convergence is provided in the Supporting Information (Supplementary Fig. 12).

Results and discussion

Description of the convergence surface

The above-described coupled convergence of the parameters, combined with a much worse scaling than DFT and more computationally and memory demanding calculations, call for efficient procedures to describe and explore the convergence space in GW-BSE simulations. A possible strategy is to describe the convergence space in terms of an analytic function of the parameters38,39,41,57,58,59. For a general (N + 1) dimensional space, a model convergence surface f(x) that represents the value of a given observable E(x) (e.g., quasiparticle energies or excitonic eigenvalues) as a function of the N parameters x = [x1, . . . , xN] can be defined as:

$$f({{{\bf{x}}}})=\mathop{\prod }\limits_{i}^{N}\left(\frac{{A}_{i}}{{x}_{i}^{{\alpha }_{i}}}+{b}_{i}\right),$$
(8)

where Ai, bi, and αi are free fitting parameters, and \(B=\mathop{\prod }\nolimits_{i}^{N}{b}_{i}\) is the extrapolated converged value. Of course, the accuracy of the latter depends on the actual region of the parameter space explored for the evaluation of the convergence behavior, i.e., how far is the extrapolated value from the exact one. As such, B might not always be a good choice for guiding the search of the convergence parameters.

In the following, we introduce conditions on the mixed partial derivatives of Eq. (8) as a way to address explicitly the parameter interdependence. Indeed, these terms coincide with the off-diagonal elements of the related Hessian matrix, and correspond to the interaction terms between the parameters. The adoption of an analytical form for the description of the convergence space has the clear advantage of enabling the calculation of all-order derivatives, once the fitting parameters are known. On the other hand, not taking directly into account this interdependence can result in a very tedious convergence procedure, as it would require the cyclic repetition of multiple univariate convergences, as mentioned in the Introduction (further details are provided in Ref. 49).

For the expression in Eq. (8), the gradient components are:

$${f}_{{x}_{i}}^{{\prime} }({{{\bf{x}}}})=-{\alpha }_{i}\frac{{A}_{i}}{{x}_{i}^{{\alpha }_{i}+1}}\mathop{\prod }\limits_{j\ne i}^{N}\left(\frac{{A}_{j}}{{x}_{j}^{{\alpha }_{j}}}+{b}_{j}\right),$$
(9)

while second derivatives are:

$${f}_{{x}_{i}}^{{\prime\prime} }({{{\bf{x}}}})={\alpha }_{i}({\alpha }_{i}+1)\frac{{A}_{i}}{{x}_{i}^{{\alpha }_{i}+2}}\mathop{\prod }\limits_{j\ne i}^{N}\left(\frac{{A}_{j}}{{x}_{j}^{{\alpha }_{j}}}+{b}_{j}\right),$$
(10)
$${f}_{{x}_{i},{x}_{j}}^{{\prime\prime} }({{{\bf{x}}}})={\alpha }_{i}{\alpha }_{j}\frac{{A}_{i}{A}_{j}}{{x}_{i}^{{\alpha }_{i}+1}{x}_{j}^{{\alpha }_{j}+1}}\mathop{\prod }\limits_{k\ne i,j}^{N}\left(\frac{{A}_{k}}{{x}_{k}^{{\alpha }_{k}}}+{b}_{k}\right).$$
(11)

The asymptotic region of the convergence surface can be determined by imposing, for each parameter xi, xj with i, j = 1,..., N, two conditions:

$$\left\{\begin{array}{l}| {f}_{{x}_{i}}^{{\prime} }({{{\bf{x}}}})| \,<\, {\Delta }_{i}\quad \\ | {f}_{{x}_{i},{x}_{j}}^{{\prime\prime} }({{{\bf{x}}}})| \,<\, {\Delta }_{ij}\quad \end{array}\right.$$
(12)

The first condition determines the region in which the convergence surface becomes flat (thus approaching convergence), whereas the condition on second partial derivatives ensures that the N parameters are no longer interdependent. The threshold values Δi and Δij can be tuned according to the desired accuracy. According to our experience, the asymptotic region can be safely identified by choosing Δi = 5  10−5 and Δij = 1  10−8. Once this asymptotic region has been determined, a guess for the converged value of f(x), Eguess, is made, and its accuracy is then checked and validated according to the automated algorithm described below.

Convergence algorithm

Figure 1 schematically depicts our convergence algorithm, which is purposed to obtain the desired accuracy on GW-BSE results with the least possible number of calculations. In the following, we consider Nb and Gcut as the two interdependent parameters to be converged simultaneously. We remark that, while the algorithm is specifically designed to handle coupled convergences, it can be successfully used to accelerate convergence tests with respect to any other parameter, such as the BZ k-point mesh or the FFT grids.

Fig. 1: Flowchart of the convergence algorithm.
figure 1

After generating the grid for the N-dimensional parameter space, a subset of M simulations is performed. The results are then fitted to predict the converged parameters. Finally, the accuracy (Eq. (13)) and the convergence (Eq. (14)) of the prediction are verified, and the procedure iterated, if needed.

The first step (i) consists in the construction of the N-dimensional space of parameters as a grid of equally spaced points, with spacing and ranges provided from input. It is worth noting that, to fit the functional form of Eq. (8), one needs to generate a grid with minimum 3N points, since f(x) contains three free fitting parameters for each of the N dimensions of the parameter space. However, a further reduction on the minimal grid size (that is, on the minimum number of calculations to perform) can be obtained by fixing the power-law dependence, αi, to a given value, as suggested in Ref. 38, thus resulting in a minimal grid size of 2N. Usually, much denser grids are generated, where (ii) M0 ≥ 2N calculations are performed on a subset of the grid points, chosen such as to efficiently sample the parameter space.

Next, (iii) the results of the calculations are fitted by using the expression Eq. (8). As mentioned above, the power laws are fixed to given values: αi {1, 2} i = 1, ...N, and the one resulting in the lowest mean squared error is chosen. The asymptotic region is then identified by computing the first and second order derivatives (Eqs. (9)–(11)), and imposing the conditions in Eq. (12). Given the asymptotic region, a guess converged value is selected Eguess= \(f({N}_{b}^{0},{G}_{cut}^{0})\), where (\({N}_{b}^{0},{G}_{cut}^{0}\)) are the lowest values that can be chosen for the parameters such that Eguess is within a desired convergence threshold Δ with respect to the asymptotic region.

To establish the accuracy and convergence of the fitting procedure, (iv) \(E({N}_{b}^{0},{G}_{cut}^{0})\) is computed and compared to the outcome of the fit Eguess, by considering the chosen threshold Δ (see Eq. (13) below). If the accuracy condition is satisfied, we need to check the convergence of the fitting procedure, that is, a new pair of parameters \(({N}_{b}^{1},{G}_{cut}^{1})\) is obtained from the fit by adding the \(({N}_{b}^{0},{G}_{cut}^{0})\) point to the initial M0 grid, and compared to the previous one (see Eq. (14) below). The last step is repeated until convergence is reached. If the accuracy condition is not satisfied, the grid is instead shifted toward higher values of the parameters, and the steps (ii)–(iv) are repeated until the two conditions:

$$| E({N}_{b}^{j},{G}_{cut}^{j})-{E}^{fit}({N}_{b}^{j},{G}_{cut}^{j})| \,<\, \Delta ,$$
(13)
$$({N}_{b}^{j+1},{G}_{cut}^{j+1})=({N}_{b}^{j},{G}_{cut}^{j})$$
(14)

are simultaneously satisfied for the j-th iteration.

The aiida-yambo plugin and automated workflows

The above convergence algorithm has been implemented in the new version of the aiida-yambo plugin (see Code Availability Section), which is meant to fully automate GW-BSE calculations by interfacing the Yambo project48,49 and the AiiDA informatics infrastructure and workflow management system28,29. The automation concerns input generation, scheduler submission, and output parsing phases. The output parsing of the aiida-yambo plugin is partially done by using yambopy functions49. Thanks to the AiiDA infrastructure, links between single calculations are managed on the fly by ad-hoc, dynamic workflows (the so-called workchains in the AiiDA jargon), i.e. their execution path is not fixed, but can depend on the results of completed calculations. This allows for the implementation of complex logics, such as those characterizing the convergence algorithm and GW band interpolation that we propose in this work. Moreover, each calculation, together with inputs and outputs, is stored in the AiiDA relational database, thus ensuring data provenance and full reproducibility of results.

Currently, the aiida-yambo plugin supports quasiparticle (G0W0 and COHSEX53 level) and optical properties (IP-RPA and BSE) simulations, as well as interfaces with different codes (e.g., Quantum ESPRESSO and Wannier90). These options are implemented in the YamboCalculation and YppCalculation classes, which manage individual simulations (including data interfacing) that can be performed by using the Yambo code. On top of them, task-specific workflows are implemented, and organized in a modular way, in order to automate tasks of increasing complexity. In particular, the aiida-yambo plugin contains three main workflows, each of them targeting a precise task:

  • YamboRestart: automation of error handling and restart for each YamboCalculation;

  • YamboWorkflow: automation of the single GW or BSE flow (composed of several interlinked steps, explained in the following);

  • YamboConvergence: automation of the convergence (composed of multiple YamboWorkflow runs).

Fig. 2: Hierarchical structure of the aiida-yambo workchains.
figure 2

The highest level workflow is YamboConvergence, which calls multiple YamboWorkflow workchains. YamboWorkflow comprises all the steps needed to perform individual GW-BSE calculations from scratch. In case of failures, it calls the YamboRestart workchain for automatic error handling. The outputs are stored in the AiiDA database in a human readable fashion, and are easily accessible and shareable by the user.

Their nested organization is shown in Fig. 2. The highest level workflow is represented by the YamboConvergence workchain, which implements the full automation of the convergence algorithm described above, thus allowing for all Yambo simulations to be organized on the fly, without any external user intervention. The user is only requested to provide a python list containing the information on the parameter space to be explored. An example of such input reads:

[

{

‘var’: [

‘BndsRnXp’,

‘GbndRnge’,

‘NGsBlkXp’

],

‘start’: [50, 50, 2],

‘stop’: [400, 400, 10],

‘delta’: [50, 50, 2],

‘max’: [1000, 1000, 36],

‘what’: [‘gap_GG’],

‘conv_thr’: 0.1,

‘conv_thr_units’: ‘eV’,

},

]where the Yambo variables “BndsRnXp” and “GbndRnge” govern the convergence over empty states, Nb, to be carried out jointly with that on the size of the response matrix, Gcut (“NGsBlkXp” variable). The edges of the grid (“start” and “stop”) and its spacing (“delta”), together with an upper bound of the parameter space (“max”), limiting the search to computationally accessible calculations, are also set by the user. The key “what” indicates the quantity to be converged – in our example, the direct band gap of the material at Γ point – up to a given convergence threshold Δ (“conv_thr” key). Notably, it is possible to select a convergence threshold Δ for each set of GW (BSE) parameters that have to be converged simultaneously, thereby avoiding any risk of over-convergence.

The output summarizes the convergence history and allows the user to easily parse the converged simulation. YamboConvergence allows one to converge several many-body quantities, like quasiparticle levels, band-gaps, as well as optical excitation energies. Notably, the convergence block in Fig. 2 can be skipped if converged parameters are already known.

Each single GW (BSE) calculation is instead automated within YamboWorkflow, which is the core workchain of the plugin that takes care of performing all the steps needed in a typical Yambo simulation – from preliminary self-consistent (SCF) and non-self-consistent (NSCF) DFT calculations to the actual GW (BSE) calculations, and the related post-processing. The workflow ensures a robust interoperability between DFT and MBPT codes (Quantum ESPRESSO and Yambo, respectively), and links subsequent calculations, interfacing the data automatically. In practice, YamboWorkflow encodes the specific flowchart underlying each requested calculation, and allows for its dynamic execution according to the instructions provided in input. This implies performing all the intermediate steps needed for a specific calculations without the need of instructing them explicitly, or, on the contrary, to skip some of the intermediate steps for which parent calculations are available, fully exploiting the YamboWorkflow provenance information.

To support a restart mechanism in case of code failures, YamboWorkflow takes advantage of the YamboRestart workchain, a sub-level workflow that encodes an automatic error handler (inherited from the AiiDA BaseRestartWorkchain class) which, depending on the encountered failure, automatically instructs a restart run. For out-of-memory errors or failures connected with insufficient wall-time requests, YamboRestart automatically resubmits the calculation by appropriately changing the requested resources (e.g., the maximum wall-time and the MPI/OpenMP balance); parallelization errors are managed by overwriting the parallelism variables set in input by the user with the default parallelism decided on the fly by yambo. In all these cases, an efficient, CPU-time-saving restart mechanism is implemented, which avoids to restart unfinished runs from scratch by automatically retrieving and enabling the reuse of stored data files.

As a final issue, we would like to discuss the possibility to develop protocols for MBPT calculations. Indeed, most of the DFT-based AiiDA plugins enable the use of protocols60, that is, the possibility of creating inputs with pre-populated default values for several parameters. Such protocols are usually code-agnostic and robust, given the high level of reproducibility of DFT with different quantum engines19. Moreover, their reliability is guaranteed by means of large scale studies spanning systems with a wide variety of characteristics (e.g., metals, semiconductors, systems with different dimensionality)44.

Concerning MBPT calculations, the possibility to define protocols is still an open issue38. First of all, code-agnostic parameters are not at all easy to be determined as it is for DFT-based codes, because the MBPT implementations and the subsequent definition of parameters can differ in very many aspects, as highlighted in the Introduction section. Secondly, the high computational cost of these simulations has limited so far the number of systems to be studied extensively, which is crucial to define a reliable statistics on convergence parameters. Last but not least, DFT-based protocols usually result in safe but overconverged parameters, an approach that might lead to unfeasible calculations when moving to the GW-BSE framework.

For all these reasons, we believe that an efficient, fully automated convergence tool, as the one presented here, is currently the most valuable solution. Nonetheless, also in view of possible future developments, the aiida-yambo plugin provides an implementation for a protocol framework for both GW and BSE simulations, which is currently pre-populated on the basis of previous experience on a limited subset of systems. Such protocols concern several parameters connected to the main yambo input variables, such as the FFT grids (“FFTGvecs”), the summation over empty states (“BndsRnXp” and “GbndRnge”), the plane-wave expansion for the polarizability (“NGsBlkXp”) and the BZ k-point sampling. A detailed documentation on these and other aspects concerning the aiida-yambo plugin is provided at https://aiida-yambo.readthedocs.io/en/master/.

Automatic GW bands interpolation: the aiida-yambo-wannier90 plugin

GW band interpolation from Wannierisation is a crucial task in order to obtain the most accurate quasiparticle band structure with the lowest computational cost. This task is encoded in the aiida-yambo-wannier90 plugin (see Code Availability Section). Essentially, the plugin provides a meta-workflow, called YamboWannier90WorkChain, which utilizes the automation and error handling of the underlying aiida-yambo and aiida-wannier90-workflows plugins for GW convergence and Wannierisation, respectively. The flowchart of the workflow is summarized in Fig. 3. Starting from a given crystal structure, the workflow first launches a YamboConvergence workflow for automatic convergence. Then, it finds the minimal commensurate mesh with the wannier90.x ones that satisfies the GW convergence conditions (see below). Thirdly, it runs YamboWorkflow to compute all the quasiparticle corrections required for the Wannierisation on the commensurate mesh, and a subsequent ypp calculation (by means of YppRestart) to extract the GW corrections in a Wannier90eig file format. Fourthly, the workflow Wannierizes the KS wavefunctions, saving the unitary transformation matrices of maximal localization, and interpolates the band structure. Finally, the workflow performs the Wannierisation procedure at G0W0 level, which consists in incorporating the GW corrections into the DFT eigenvalues, and interpolating the band structure by using the DFT Wannierisation outcomes.

Fig. 3: Flowchart of the YamboWannier90WorkChain for the Wannier interpolation of GW band structures.
figure 3

After performing the GW convergence, the workflow searches for a commensurate k-point meshes for yambo and wannier90.x, and carries out the corresponding GW QPs calculations. Given the QP corrections, the workflow proceeds with the Wannierisation and the band interpolation at DFT and GW level.

A crucial step of the workflow is finding a commensurate mesh for both GW QP calculations and Wannierisation. Indeed, the GW mesh resulting from automated convergence might not always be compatible with the mesh required by wannier90.x to ensure interpolation accuracy. Notably, considering a Monkhorst-Pack (MP) grid for the Wannierisation, the corresponding GW mesh must be an integer multiple of the MP grid. We here propose a recipe to find the minimal commensurate meshes for yambo and wannier90.x calculations, as depicted in Fig. 4. Considering nd as the number of k-points chosen by the YamboConvergence workflow, and nc the number of k-points chosen by the Wannierisation protocol (typically based on a k-point spacing, 0.2 Å−1)45, the target is to find a new (\({n}_{d}^{{\prime} }\), \({n}_{c}^{{\prime} }\)) such that the dense mesh \({n}_{d}^{{\prime} }=k\cdot {n}_{c}^{{\prime} }\), where \(k\in {\mathbb{N}}\), i.e., natural number. The given input (nd, nc) restricts the search space to a sector bounded by klow and khigh (see Fig. 4), where klow = 1 (\({n}_{d}^{{\prime} }={n}_{c}^{{\prime} }=\max ({n}_{d},{n}_{c})\) is always a good solution), and \({k}_{{{{\rm{high}}}}}=\lceil \frac{{n}_{d}}{{n}_{c}}\rceil\), where indicates the ceiling integer. The search always succeeds since \({s}_{{{{\rm{low}}}}}=(\max ({n}_{d},{n}_{c}),\max ({n}_{d},{n}_{c}))\) and shigh = (khighnc, nc) are already two good solutions. In fact, the optimal solution is often inside the triangular region determined by the input (nd, nc), slow, and shigh. The final solution is chosen according to the 1 distance to the input, such to ensure the minimal increase in computational cost. It is also possible to change the metric, e.g., pushing the solution towards increasing the Wannierisation mesh or GW mesh, depending on which calculation is less cumbersome. The aforementioned recipe is repeated for each of the three dimensions of the MP grid.

Fig. 4: Recipe to find commensurate meshes for GW and Wannierisation calculations.
figure 4

Using input meshes (11, 5) as an example, final commensurate meshes (12,6) are found. The khigh and klow lines, that intersect in the origin, are respectively the imposed upper and lower bound for searching the commensurate meshes; the orange dots are possible solutions; the red dot is the chosen solution, which is the closest to the input in the metric of 1 norm.

Validation of the workflows

The proposed convergence algorithm, as implemented in the YamboConvergence workchain, has been validated by performing convergence studies for the quasiparticle G0W0 gap of a set of prototypical semiconductors: silicon, diamond, ZnO, rutile TiO2, monolayer MoS2, bulk and monolayer hBN. The convergence addresses the direct band gap at the Γ point with respect to the two coupled parameters, Nb and Gcut, and the k-point grid (in terms of k-points density ρk). Once convergence is achieved, the minimum band gap \({E}_{gap}^{{{\rm{G}}}_{0}{{\rm{W}}}_{0}}\) is also computed. The results are summarized in Table 1, where, together with \({E}_{gap}^{{{\rm{G}}}_{0}{{\rm{W}}}_{0}}\), we report the final parameters resulting from the convergence procedure (Nb, Gcut, and ρk) as well as the convergence threshold (absolute, ΔΓ, and relative, \({\Delta }_{ \% }^{\Gamma }\)) adopted in each case. Our results are found in good agreement with previous findings, which are also reported in the Table 1. Deviations with respect to reference results can be ascribed to different GW implementations and/or different DFT starting points used in the related works41. This is the case, for instance, of the band gap of ZnO: our result (2.36 eV, Table 1) is in excellent agreement with the result reported in ref. 42, where the same plasmon pole model is applied, but differs from more recent benchmark calculations41 (2.8–3.1 eV) that exploit other approaches. Further details on the convergence tests are contained in the Supplemental Material (including convergence plots obtained within this work and additional information in the Wannierisation procedure).

Table 1 G0W0 convergence tests on prototypical semiconductors.

Figure 5 shows the convergence procedure for monolayer hBN, considering both the joint convergence with respect to Nb and Gcut (panel a), and the single-parameter convergence with respect to the k-mesh (panel b). Starting from an input grid with Nb [200,800] and Gcut [4,16] Ry (Fig. 5a, blue shaded area), a subset of 6 calculations is performed (black squares). Since the converged guess is above the upper bound of the parameter space (see “max” variable in the plugin description), a new shifted grid (orange shaded area) is considered, and a first guess for the converged parameters is found from the fitting procedure (blue square), now satisfying the accuracy condition (Eq. (13)). A new fit is performed including this additional point, which results in a new converged guess (red square). The procedure is repeated till the converged result is verified to be consistent with the prediction within the given threshold (Eq. (13)) and to be the true converged point (Eq. (14)).

Fig. 5: Convergence algorithm applied to monolayer hBN.
figure 5

a Convergence of the direct quasiparticle band gap at the Γ point with respect to the coupled parameters Nb and Gcut. The blue shaded area represents the starting grid, while the red shaded area the shifted grid obtained after the first iteration, according to the flowchart in Fig. 1. Black squares represent the actual calculations performed by the workflow; the blue square is the first guess for the converged parameters; the red square indicates the final, converged point. The colormap specifies the absolute relative error Δ% with respect to the converged point. A maximum absolute error of Δ = 42 meV is achieved for (Nb, Gcut) = (1200,28 Ry), corresponding to a maximum relative error of Δ% = 0.5%. b Convergence of the direct quasiparticle band gap at the Γ point with respect to the k-mesh. The black points represent the actual calculations performed by the workflow, whereas the blue points are the ones obtained within the fitting procedure and used to predict the convergence. The final, converged mesh (red square) is achieved with five simulations.

A similar path is followed for the k-mesh convergence (Fig. 5b): a limited number of calculations is initially performed (black squares), from which the fitting is evaluated (blue curve), and the smallest grid compatible with the given threshold is finally selected (here 8 × 8 × 1). We note that, despite the results of the simulations seem to have an oscillating behavior with respect to the fitted curve, the error bar considered here (from the 14 × 14 × 1 mesh) is ~ 0.13% of the band gap at Γ, i.e. ~10 meV, in line with the accuracy of state-of-the-art GW calculations. Each of the simulations shown in Fig. 5 was performed on a 40-core (230 GB RAM) Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz machine, for a computational time of 6 hours. Considering that the workflow submits several jobs simultaneously, the total wallclock time needed to obtain the final converged result is further reduced to less than 3 hours.

The dependence of the outcome of the algorithm on the input settings (e.g., the initial value of the parameters and the boundaries of the convergence space) is a key issue for evaluating the robustness of the algorithm itself. Indeed, the convergence procedure has been tested on the starting grid for monolayer MoS2. By using two different grids, Nb [200,800], Gcut [4,20] Ry and Nb [200,1200], Gcut [8,24] Ry, the convergence point obtained from the workflow remains the same, i.e., (Nb,Gcut) = (400,8). Another important issue to evaluate is the efficiency of the algorithm. We notice that, in the case of the 2D-hBN convergence shown in the left panel of Fig. 5, only 14 calculations where required to reach convergence. Older implementations49 would have required ≥25 simulations to achieve a final result, with over 40% reduction.

Next, the YamboWannier90WorkChain has been tested on bulk Si and Cu, in order to validate the automatic Wannier interpolation of GW band structures for both semiconductors and metals. Results are plotted in Fig. 6, where we compare the computed DFT bands (black dots), the Wannier interpolated DFT bands (red dashed lines), and the Wannier interpolated GW bands (green solid lines). For Si (Fig. 6a), the comparison between computed and interpolated DFT bands shows that the results are almost identical, indicating the accuracy of the Wannierisation of the KS wavefunctions. Moreover, the typical band gap opening upon inclusion of GW corrections is found when comparing KS and QP band structures.

Fig. 6: Wannier interpolation of GW band structures.
figure 6

Band structure of Si (a) and Copper (b). Interpolated bands are plotted for both DFT (red dashed line) and GW (green solid line) eigenvalues, as compared to the DFT computed bands (black dots).

For Cu (Fig. 6b), we obtain a discrepancy of ~10 meV around the Fermi energy (here set to zero) between computed and interpolated bands at the DFT level. Better accuracy can be achieved imposing more stringent values of the involved parameters. At QP level, the GW correction is very small around the Fermi level ( ~ 37 meV), but still not negligible. Here, the GW convergence is more stringent than for Si, especially concerning the k-point mesh. Indeed, denser grids are needed to account for the contribution of intra-band transitions in the q → 0 limit, which is crucial for metallic systems but not explicitly included within the plasmon pole approximation61,62. Considering the converged parameters, (Nb, Gcut, ρk) = (400, 18 Ry, 0.2 Å−1), the quasiparticle evaluations required to interpolate the bands for the minimum converged wannier90.xk-point mesh (16 × 16 × 16) become 2900. This quite large number of QP corrections can be easily computed using the YamboWorkflow workchain thanks to the possibility to split the QPs calculation in several runs, each of them computing only a fraction of the GW corrections, and then collecting all the data in a final database well-suited for the YamboWannier90WorkChain. Since the number of QP corrections to compute can be quite high, in the Supporting Information (Supplementary Note IIB) we suggest an effective way to reduce the number of the required calculations for cases when accurate QP corrections are needed only in a limited energy region, e.g. around the Fermi energy, and energies outside of the chosen region can be approximated e.g. through a scissor and stretching correction.

In this work, we have presented the successful design and implementation of advanced algorithms in state-of-the-art GW (BSE) calculations, that is, convergence between interdependent parameters, error handling and automatic band interpolation by means of Wannierisation. We validated the tools on selected cases among semiconductors and metallic systems. The results contained in this work clearly show the power of these developed workflows for the automated study of excited states properties of materials, paving the way for achieving high-throughput MBPT studies. Thanks to these developments and within the next-generation of pre-exascale and exascale supercomputers, these simulations may become extensively and routinely performed by the materials-science community in the near future.

Methods

For all the systems studied here, we used symmetrized geometries in such a way to reduce the computational cost of simulations. We do not expect relevant differences in the results obtained with fully-relaxed structures. DFT simulations were carried out by using the Quantum ESPRESSO simulation package, which implements plane-wave basis set and pseudopotential approach. The KS-DFT exchange-correlation functional was approximated using GGA-PBE63, through the optimized norm-conserving Vanderbilt (ONCV) SG15 pseudopotentials64,65. In the case of ZnO, we adopted Local Density Approximation (LDA), to compare the results with the existing literature41,42, and PseudoDojo pseudopotentials66. GW and BSE results were obtained by means of the Yambo code. The frequency dependence of the screened interaction potential was approximated by using the Godby-Needs plasmon pole approximation62 (GNPPA), and the quasiparticle energies were calculated according to the G0W0 approximation53,54. The Bruneval-Gonze technique67 was used to reduce the number of empty states Nb needed for the construction of the correlation Self-Energy Σc (Eq. (2)). For low-dimensional systems, spurious interactions between supercell replica were avoided using a slab truncation of the Coulomb potential68 along the non-periodic direction; its divergences are cured by means of the Random Integration Method69 (RIM), which also accelerates convergence with respect to the BZ sampling. For 2D systems, specifically, we adopted a recently developed accelerating technique based on stochastic integration of the screened potential70, which allows to have GW-converged results using reduced Monkhorst-Pack k-points grids, just slightly denser than the DFT one. Finally, Wannierisation and band interpolations are performed by means of the Wannier90 code. All the simulations are performed using the automated workflows implemented in the aiida-yambo and aiida-yambo-wannier90 plugins, developed for the AiiDA platform and presented here as part of the results achieved in this work. Input parameters are generated using the protocols procedure, as implemented in the corresponding plugins.