Towards high-throughput many-body perturbation theory: efficient algorithms and automated workflows

Bonacci, Miki; Qiao, Junfeng; Spallanzani, Nicola; Marrazzo, Antimo; Pizzi, Giovanni; Molinari, Elisa; Varsano, Daniele; Ferretti, Andrea; Prezzi, Deborah

doi:10.1038/s41524-023-01027-2

Download PDF

Article
Open access
Published: 18 May 2023

Towards high-throughput many-body perturbation theory: efficient algorithms and automated workflows

npj Computational Materials volume 9, Article number: 74 (2023) Cite this article

1709 Accesses
3 Citations
3 Altmetric
Metrics details

Subjects

Abstract

The automation of ab initio simulations is essential in view of performing high-throughput (HT) computational screenings oriented to the discovery of novel materials with desired physical properties. In this work, we propose algorithms and implementations that are relevant to extend this approach beyond density functional theory (DFT), in order to automate many-body perturbation theory (MBPT) calculations. Notably, an algorithm pursuing the goal of an efficient and robust convergence procedure for GW and BSE simulations is provided, together with its implementation in a fully automated framework. This is accompanied by an automatic GW band interpolation scheme based on maximally localized Wannier functions, aiming at a reduction of the computational burden of quasiparticle band structures while preserving high accuracy. The proposed developments are validated on a set of representative semiconductor and metallic systems.

Database of Wannier tight-binding Hamiltonians using high-throughput density functional theory

Article Open access 13 April 2021

Towards fully automated GW band structure calculations: What we can learn from 60.000 self-energy evaluations

Article Open access 29 January 2021

pyGWBSE: a high throughput workflow package for GW-BSE calculations

Article Open access 13 February 2023

Introduction

Computational HT screening is nowadays a consolidated approach to materials discovery^1,2,3, as a complementary and accelerated tool with respect to experimental efforts. In the last decade, seminal works in this field have addressed, among many other topics, the discovery of novel 2D materials^{4,5,6,7,8,9,10}, the identification of optimal new lithium-ion battery anodes^11,12, thermoelectric^13,14, photocatalysts¹⁵, and photovoltaic light harvesting^16,17,18 materials. The success of these studies relies on the development of different software and implementations¹⁹ that were able to encode complex domain-specific knowledge into automated and robust workflows – enforcing rigorous computational protocols^19,20 and managing all the steps concerning a simulation – which thus require the least possible human intervention^{21,22,23,24,25,26,27,28,29}.

Concerning the electronic structure field, most of these works and implementations are based on DFT, which allows one to compute total energies, optimized geometries, and other ground state properties of materials with predictive accuracy. However, different approaches are required for the accurate prediction of excited-state properties of materials, such as quasiparticle (QP) band structures and absorption spectra, which are typically crucial for the description of active processes in modern optoelectronic technologies, like photovoltaics, photocatalysis, light-emitting diodes (LEDs), photodetectors, and solar cells^{30,31,32,33,34,35,36}. In this context, MBPT and Green’s function methods represent the state-of-the-art tools, where charged (electronic quasi-particle levels) and neutral excitations (optical properties, electron energy loss spectra) can be obtained by means of the GW approximation and the Bethe-Salpeter equation (BSE), respectively³⁷.

To date, a limited number of attempts have been made toward automation³⁸ and HT screening^9,39 based on these MBPT approaches, mainly because of their conceptual and computational complexity. Indeed, depending on the specific physical problem, different levels of theory might be adopted⁴⁰, further branching off depending on the chosen approximations and implementations^41,42. Furthermore, even for the simplest approximations, these calculations require the control over a much larger parameter space with respect to DFT, with parameters that are often interdependent and might change depending on the specific implementations adopted, but the choice of which is always crucial to obtain reliable results. From a computational point of view, this type of simulations are constituted by a chain of distinct steps, often relying on the usage of different software tools, each of them with their own specificity, e.g., in terms of memory and parallelization requirements. Memory requirements are much heavier than in standard DFT simulations even for moderate system size, and often require massive usage of parallel computing resources. Calculations often fail due to memory overflow and have to be restarted with careful choice of parameters. All of these problems make the application of MBPT-based approaches a complex and difficult task per se, and its automation still an open challenge.

Building on pioneering works in the field^38,39, we here focus on the development of an improved algorithm aiming at an efficient and computationally cost-effective management of the choice of converged parameters for accurate GW-BSE calculations. The algorithm is implemented in the AiiDA framework^28,29, a platform that is routinely used for HT studies^6,43,44,45 and that incorporates the ADES model for Automation, Data, Environment, and Sharing²⁴. As detailed below, this implementation allows us to encode an efficient error handling, memory and parallelization management, and logic computational flows within automated python workflows. Moreover, it guarantees a seamless interoperability of different software codes that tackle the different steps usually involved in MBPT simulations, i.e., the preliminary DFT part (here using Quantum ESPRESSO^46,47), the GW-BSE calculations (Yambo^48,49), and any required post-processing. In particular, for the latter, we here introduce a scheme based on maximally localized Wannier functions⁵⁰ for the automatic interpolation of GW band structures, which interfaces the Wannier90⁵¹ and Yambo projects. All of these developments point to a drastic reduction of both human and computational efforts, key issues for enabling HT studies. In addition, by incorporating the different domain-specific scientific and computational competences into robust and reliable workflows, we aim at making accurate GW-BSE calculations available for the materials science community at large, including non-experts in the field (e.g., via graphical user interfaces⁵²), similarly to what has already happened with DFT.

The manuscript is organized as follows. In the Results and Discussion Section, we first introduce a model for the convergence surface in the N-dimensional space of the GW (BSE) variables, and present our improved algorithm for efficiently retrieving converged values for the main (interdependent) parameters. The implementation of this algorithm within the aiida-yambo plugin is then described, followed by the presentation of the aiida-yambo-wannier90 plugin that encodes the GW band interpolation based on Wannierisation. Both these implementations are validated for selected prototypical systems. Additional details on simulations are provided in the Methods section. In the remainder of this section, we instead introduce the main concepts and quantities related to the GW and BSE schemes, which will be useful to properly understand the subsequent Sections.

Accurate electronic band structures of materials can be computed within the MBPT framework by correcting the Kohn–Sham (KS) DFT eigenvalues with a self-energy term Σ by means of the GW approximation, i.e. Σ is approximated with the first term of the perturbation series expansion in terms of the screened Coulomb interaction W⁵³. We hereafter consider the simplest and most widespread implementation of GW, i.e. the so-called G₀W₀ approach, where G₀ is the KS independent-particle one-body Green’s function and W is computed within the random-phase approximation (RPA)⁵⁴. Nevertheless, the convergence algorithm presented here can be used also for more sophisticated flavors of the theory, e.g. the self-consistent GW³⁷. Under these assumptions and by considering a plane-wave expansion, the self-energy term Σ_nk for a band n at a given k-point – written as the sum of the Fock exchange (Σ^x) and the frequency-dependent correlation (Σ^c) terms – is given by:

$$\Sigma_{n{{{\bf{k}}}}}^{x}=-\mathop{\sum }\limits_{m}^{{{{\rm{occ}}}}}\int\frac{d{{{\bf{q}}}}}{{(2\pi )}^{3}}\mathop{\sum }\limits_{{{{\bf{G}}}}}^{{G}_{{{{\rm{cut}}}}}^{x}}{v}_{{{{\bf{G}}}}}({{{\bf{q}}}}){\left\vert {\rho }_{nm}({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}})\right\vert }^{2}{f}_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}$$

(1)

and

$$\begin{array}{ll}\Sigma_{n{{{\bf{k}}}}}^{c}(\omega )=\displaystyle\,{{{\rm{i}}}}\mathop{\sum }\limits_{m}^{{N}_{b}}\int\frac{d{{{\bf{q}}}}}{{(2\pi )}^{3}}\mathop{\sum }\limits_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}^{{G}_{{{{\rm{cut}}}}}}{\rho }_{nm}({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}}){\rho }_{nm}^{* }({{{\bf{k}}}},{{{\bf{q}}}},{{{{\bf{G}}}}}^{{\prime} })\\\qquad\qquad \displaystyle\times\,\int\,\frac{d{\omega }^{{\prime}}}{2\pi}{W}_{{{{{\bf{GG}}}}}^{{\prime} }}({{{\bf{q}}}},{\omega }^{{\prime} })\\\qquad\qquad \displaystyle \times\,\left[\frac{{f}_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}}{\omega -{\omega }^{{\prime} }-{\epsilon }_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}-{{{\rm{i}}}}\eta }+\frac{1-{f}_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}}{\omega -{\omega }^{{\prime} }-{\epsilon }_{m,{{{\bf{k}}}}-{{{\bf{q}}}}}+{{{\rm{i}}}}\eta }\right],\end{array}$$

(2)

where v_G(q) and ${W}_{{{{{\bf{GG}}}}}^{{\prime} }}({{{\bf{q}}}},{\omega }^{{\prime} })$ are the bare and screened Coulomb interaction, respectively. The generalized dipole matrix elements ρ_nm(k, q, G) are defined as:

$${\rho }_{nm}({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}})=\left\langle n{{{\bf{k}}}}\right\vert {{{{\rm{e}}}}}^{{{{\rm{i}}}}({{{\bf{q}}}}+{{{\bf{G}}}})\cdot {{{\bf{r}}}}}\left\vert m{{{\bf{k}}}}-{{{\bf{q}}}}\right\rangle ,$$

(3)

where ${\epsilon }_{n{{{\bf{k}}}}}^{{{{\rm{KS}}}}}$ and $\left\vert n{{{\bf{k}}}}\right\rangle$ are the corresponding KS eigenvalues and eigenvectors. Here, W can be written in terms of the reducible polarizability χ (W = v + vχv), which is in turn computed by solving a Dyson equation for the RPA irreducible polarizability χ⁰:

$${\chi }_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}({{{\bf{q}}}},\omega )=\mathop{\sum }\limits_{{{{{\bf{G}}}}}^{{\prime\prime} }}^{{G}_{{{{\rm{cut}}}}}}{\left[I-v({{{\bf{q}}}}){\chi }^{0}({{{\bf{q}}}},\omega )\right]}_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime\prime} }}^{-1}{\chi }_{{{{{\bf{G}}}}}^{{\prime\prime} }{{{{\bf{G}}}}}^{{\prime} }}^{0}({{{\bf{q}}}},\omega ),$$

(4)

where

$$\begin{array}{l}{\chi }_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}^{0}({{{\bf{q}}}},\omega )=\displaystyle2\mathop{\sum }\limits_{nm}^{{N}_{b}}{\int}\frac{d{{{\bf{k}}}}}{{(2\pi )}^{3}}{\rho }_{mn}^{* }({{{\bf{k}}}},{{{\bf{q}}}},{{{\bf{G}}}}){\rho }_{mn}({{{\bf{k}}}},{{{\bf{q}}}},{{{{\bf{G}}}}}^{{\prime} })\\\qquad\qquad\qquad\displaystyle\times\left[\frac{{f}_{n{{{\bf{k}}}}-{{{\bf{q}}}}}(1-{f}_{m{{{\bf{k}}}}})}{\omega +{\epsilon }_{n{{{\bf{k}}}}-{{{\bf{q}}}}}-{\epsilon }_{m{{{\bf{k}}}}}+i\eta}-\frac{{f}_{n{{{\bf{k}}}}-{{{\bf{q}}}}}(1-{f}_{m{{{\bf{k}}}}})}{\omega +{\epsilon }_{m{{{\bf{k}}}}}-{\epsilon }_{n{{{\bf{k}}}}-{{{\bf{q}}}}}-i\eta}\right].\end{array}$$

(5)

In practice, the above quantities are computed by introducing specific approximations that crucially impact both the accuracy and the computational and memory costs of the calculations. In particular, Eq. (4) as well as Eq. (2) involve a discrete, finite number of G vectors, set by a cutoff parameter G_cut, which defines the size of the response matrix. Moreover, with the parameter N_b, we introduce a truncation over the KS states summation for both response ${\chi }_{{{{\bf{G}}}}{{{{\bf{G}}}}}^{{\prime} }}^{0}$ (Eq. (5)) and self-energy Σ^c (Eq. (2)), which should in principle include all occupied and an infinite number of empty states. Finally, all the integrals in reciprocal space are computed on a discrete k-points (q-points) grid whose size, N_k, defines the accuracy of the sampling of the Brillouin zone (BZ). All of these three parameters, G_cut, N_b, and N_k, need to be increased till the desired convergence is reached.

One of the major obstacles to automate the convergence procedure lies in the interdependence of the first two parameters, G_cut and N_b, such that their convergence has to be performed jointly, as thoroughly discussed elsewhere^38,42,55. Indeed, Eqs. (4) and (5) contains a summation over both empty states and reciprocal lattice vectors (G), and the expression of the generalized dipole terms in Eq. (3) is such that matrix elements with large G are governed by high-energy KS states³⁸. Furthermore, the interdependence is non-trivial, given the presence of an inversion in Eq. (4) that further enters in the evaluation of the correlation self-energy, Eq. (2). Given the lack of an efficient “recipe” to carry out this non-trivial, coupled convergence, one has to resort to an iterative convergence of each of the two parameters by fixing the other, in an alternating way. This procedure, in addition of being tedious, is computationally very expensive, sometimes representing the most cumbersome part of a GW calculation.

Starting from the GW quasi-particles, the solution of the BSE^54,56 can give access to optical properties of materials via the macroscopic dielectric function:

$$\begin{array}{l}{\epsilon }_{M}({{{\bf{q}}}},\omega)=\displaystyle1-\frac{2}{V{N}_{q}}v({{{\bf{q}}}})\mathop{\sum}\limits_{\lambda }{\Phi }_{\lambda }({{{\bf{q}}}})\\ \qquad\qquad\qquad\displaystyle\times \left[\frac{1}{{E}^{\lambda }({{{\bf{q}}}})-(\omega +{{{\rm{i}}}}\eta )}+\frac{1}{{E}^{\lambda }({{{\bf{q}}}})+(\omega +{{{\rm{i}}}}\eta )}\right],\end{array}$$

(6)

where V is the volume of the unit cell; N_q is the number of q-points sampling the Brillouin zone (BZ); E^λ(q) is the eigenvalue of the exciton λ at momentum ${{{\bf{q}}}}={{{\bf{k}}}}-{{{{\bf{k}}}}}^{{\prime} }$, and the corresponding exciton oscillator strength Φ_λ(q) is defined as:

$${\Phi }_{\lambda }({{{\bf{q}}}})={\left\vert \mathop{\sum}\limits_{vc,{{{\bf{k}}}}}\left\langle v{{{\bf{k}}}}-{{{\bf{q}}}}\right\vert {{{{\rm{e}}}}}^{-{{{\rm{i}}}}{{{\bf{q}}}}\cdot {{{\bf{r}}}}}\left\vert c{{{\bf{k}}}}\right\rangle {A}_{vc{{{\bf{k}}}}}^{\lambda }({{{\bf{q}}}})\right\vert }^{2}.$$

(7)

Equation (7) contains a summation over valence and conduction bands (v, c), and k-point mesh, which are the main parameters to converge for a BSE calculation. The terms ${A}_{vc{{{\bf{k}}}}}^{\lambda }({{{\bf{q}}}})$ represent the weight of each electron-hole transition contributing the exciton λ, as resulting from the solution of the BSE. The summation over bands mainly defines the range of energies under investigation. The k-point mesh is connected to the accuracy with which we describe the exciton composition in terms of single-particle v → c transitions over the entire BZ, and it is usually significantly larger than the one needed to converge the corresponding GW band structure. Additional convergence parameters are the number of G vectors used for expanding the KS wave-functions in the transition matrix elements, as defined in Eq. (3), and for Fast-Fourier-Transform (FFT) operations, as well as the plane-wave expansion and empty bands summation in the BSE kernel (both direct and exchange terms). These last three parameters are usually inherited from GW convergences. The convergence procedure proposed in this work, described in detail in the next Section, consists in the same algorithm for both GW and BSE, i.e. is agnostic on the observable (gaps, excitonic eigenvalues) that we are considering. For this reason and to avoid repetitions, we will focus herein on GW convergences only for the validation of the workflow discussed below. An example of BSE convergence is provided in the Supporting Information (Supplementary Fig. 12).

Results and discussion

Description of the convergence surface

The above-described coupled convergence of the parameters, combined with a much worse scaling than DFT and more computationally and memory demanding calculations, call for efficient procedures to describe and explore the convergence space in GW-BSE simulations. A possible strategy is to describe the convergence space in terms of an analytic function of the parameters^{38,39,41,57,58,59}. For a general (N + 1) dimensional space, a model convergence surface f(x) that represents the value of a given observable E(x) (e.g., quasiparticle energies or excitonic eigenvalues) as a function of the N parameters x = [x₁, . . . , x_N] can be defined as:

$$f({{{\bf{x}}}})=\mathop{\prod }\limits_{i}^{N}\left(\frac{{A}_{i}}{{x}_{i}^{{\alpha }_{i}}}+{b}_{i}\right),$$

(8)

where A_i, b_i, and α_i are free fitting parameters, and $B=\mathop{\prod }\nolimits_{i}^{N}{b}_{i}$ is the extrapolated converged value. Of course, the accuracy of the latter depends on the actual region of the parameter space explored for the evaluation of the convergence behavior, i.e., how far is the extrapolated value from the exact one. As such, B might not always be a good choice for guiding the search of the convergence parameters.

In the following, we introduce conditions on the mixed partial derivatives of Eq. (8) as a way to address explicitly the parameter interdependence. Indeed, these terms coincide with the off-diagonal elements of the related Hessian matrix, and correspond to the interaction terms between the parameters. The adoption of an analytical form for the description of the convergence space has the clear advantage of enabling the calculation of all-order derivatives, once the fitting parameters are known. On the other hand, not taking directly into account this interdependence can result in a very tedious convergence procedure, as it would require the cyclic repetition of multiple univariate convergences, as mentioned in the Introduction (further details are provided in Ref. ⁴⁹).

For the expression in Eq. (8), the gradient components are:

$${f}_{{x}_{i}}^{{\prime} }({{{\bf{x}}}})=-{\alpha }_{i}\frac{{A}_{i}}{{x}_{i}^{{\alpha }_{i}+1}}\mathop{\prod }\limits_{j\ne i}^{N}\left(\frac{{A}_{j}}{{x}_{j}^{{\alpha }_{j}}}+{b}_{j}\right),$$

(9)

while second derivatives are:

$${f}_{{x}_{i}}^{{\prime\prime} }({{{\bf{x}}}})={\alpha }_{i}({\alpha }_{i}+1)\frac{{A}_{i}}{{x}_{i}^{{\alpha }_{i}+2}}\mathop{\prod }\limits_{j\ne i}^{N}\left(\frac{{A}_{j}}{{x}_{j}^{{\alpha }_{j}}}+{b}_{j}\right),$$

(10)

$${f}_{{x}_{i},{x}_{j}}^{{\prime\prime} }({{{\bf{x}}}})={\alpha }_{i}{\alpha }_{j}\frac{{A}_{i}{A}_{j}}{{x}_{i}^{{\alpha }_{i}+1}{x}_{j}^{{\alpha }_{j}+1}}\mathop{\prod }\limits_{k\ne i,j}^{N}\left(\frac{{A}_{k}}{{x}_{k}^{{\alpha }_{k}}}+{b}_{k}\right).$$

(11)

The asymptotic region of the convergence surface can be determined by imposing, for each parameter x_i, x_j with i, j = 1,..., N, two conditions:

$$\left\{\begin{array}{l}| {f}_{{x}_{i}}^{{\prime} }({{{\bf{x}}}})| \,<\, {\Delta }_{i}\quad \\ | {f}_{{x}_{i},{x}_{j}}^{{\prime\prime} }({{{\bf{x}}}})| \,<\, {\Delta }_{ij}\quad \end{array}\right.$$

(12)

The first condition determines the region in which the convergence surface becomes flat (thus approaching convergence), whereas the condition on second partial derivatives ensures that the N parameters are no longer interdependent. The threshold values Δ_i and Δ_ij can be tuned according to the desired accuracy. According to our experience, the asymptotic region can be safely identified by choosing Δ_i = 5 ⋅ 10⁻⁵ and Δ_ij = 1 ⋅ 10⁻⁸. Once this asymptotic region has been determined, a guess for the converged value of f(x), E_guess, is made, and its accuracy is then checked and validated according to the automated algorithm described below.

Convergence algorithm

Figure 1 schematically depicts our convergence algorithm, which is purposed to obtain the desired accuracy on GW-BSE results with the least possible number of calculations. In the following, we consider N_b and G_cut as the two interdependent parameters to be converged simultaneously. We remark that, while the algorithm is specifically designed to handle coupled convergences, it can be successfully used to accelerate convergence tests with respect to any other parameter, such as the BZ k-point mesh or the FFT grids.

The first step (i) consists in the construction of the N-dimensional space of parameters as a grid of equally spaced points, with spacing and ranges provided from input. It is worth noting that, to fit the functional form of Eq. (8), one needs to generate a grid with minimum 3N points, since f(x) contains three free fitting parameters for each of the N dimensions of the parameter space. However, a further reduction on the minimal grid size (that is, on the minimum number of calculations to perform) can be obtained by fixing the power-law dependence, α_i, to a given value, as suggested in Ref. ³⁸, thus resulting in a minimal grid size of 2N. Usually, much denser grids are generated, where (ii) M₀ ≥ 2N calculations are performed on a subset of the grid points, chosen such as to efficiently sample the parameter space.

Next, (iii) the results of the calculations are fitted by using the expression Eq. (8). As mentioned above, the power laws are fixed to given values: α_i ∈ {1, 2} ∀ i = 1, ...N, and the one resulting in the lowest mean squared error is chosen. The asymptotic region is then identified by computing the first and second order derivatives (Eqs. (9)–(11)), and imposing the conditions in Eq. (12). Given the asymptotic region, a guess converged value is selected E_guess= $f({N}_{b}^{0},{G}_{cut}^{0})$, where (${N}_{b}^{0},{G}_{cut}^{0}$) are the lowest values that can be chosen for the parameters such that E_guess is within a desired convergence threshold Δ with respect to the asymptotic region.

To establish the accuracy and convergence of the fitting procedure, (iv) $E({N}_{b}^{0},{G}_{cut}^{0})$ is computed and compared to the outcome of the fit E_guess, by considering the chosen threshold Δ (see Eq. (13) below). If the accuracy condition is satisfied, we need to check the convergence of the fitting procedure, that is, a new pair of parameters $({N}_{b}^{1},{G}_{cut}^{1})$ is obtained from the fit by adding the $({N}_{b}^{0},{G}_{cut}^{0})$ point to the initial M₀ grid, and compared to the previous one (see Eq. (14) below). The last step is repeated until convergence is reached. If the accuracy condition is not satisfied, the grid is instead shifted toward higher values of the parameters, and the steps (ii)–(iv) are repeated until the two conditions:

$$| E({N}_{b}^{j},{G}_{cut}^{j})-{E}^{fit}({N}_{b}^{j},{G}_{cut}^{j})| \,<\, \Delta ,$$

(13)

$$({N}_{b}^{j+1},{G}_{cut}^{j+1})=({N}_{b}^{j},{G}_{cut}^{j})$$

(14)

are simultaneously satisfied for the j-th iteration.

The aiida-yambo plugin and automated workflows

The above convergence algorithm has been implemented in the new version of the aiida-yambo plugin (see Code Availability Section), which is meant to fully automate GW-BSE calculations by interfacing the Yambo project^48,49 and the AiiDA informatics infrastructure and workflow management system^28,29. The automation concerns input generation, scheduler submission, and output parsing phases. The output parsing of the aiida-yambo plugin is partially done by using yambopy functions⁴⁹. Thanks to the AiiDA infrastructure, links between single calculations are managed on the fly by ad-hoc, dynamic workflows (the so-called workchains in the AiiDA jargon), i.e. their execution path is not fixed, but can depend on the results of completed calculations. This allows for the implementation of complex logics, such as those characterizing the convergence algorithm and GW band interpolation that we propose in this work. Moreover, each calculation, together with inputs and outputs, is stored in the AiiDA relational database, thus ensuring data provenance and full reproducibility of results.

Currently, the aiida-yambo plugin supports quasiparticle (G₀W₀ and COHSEX⁵³ level) and optical properties (IP-RPA and BSE) simulations, as well as interfaces with different codes (e.g., Quantum ESPRESSO and Wannier90). These options are implemented in the YamboCalculation and YppCalculation classes, which manage individual simulations (including data interfacing) that can be performed by using the Yambo code. On top of them, task-specific workflows are implemented, and organized in a modular way, in order to automate tasks of increasing complexity. In particular, the aiida-yambo plugin contains three main workflows, each of them targeting a precise task:

YamboRestart: automation of error handling and restart for each YamboCalculation;
YamboWorkflow: automation of the single GW or BSE flow (composed of several interlinked steps, explained in the following);
YamboConvergence: automation of the convergence (composed of multiple YamboWorkflow runs).

**Fig. 2: Hierarchical structure of the aiida-yambo workchains.**

Their nested organization is shown in Fig. 2. The highest level workflow is represented by the YamboConvergence workchain, which implements the full automation of the convergence algorithm described above, thus allowing for all Yambo simulations to be organized on the fly, without any external user intervention. The user is only requested to provide a python list containing the information on the parameter space to be explored. An example of such input reads:

[

{

‘var’: [

‘BndsRnXp’,

‘GbndRnge’,

‘NGsBlkXp’

],

‘start’: [50, 50, 2],

‘stop’: [400, 400, 10],

‘delta’: [50, 50, 2],

‘max’: [1000, 1000, 36],

‘what’: [‘gap_GG’],

‘conv_thr’: 0.1,

‘conv_thr_units’: ‘eV’,

},

]where the Yambo variables “BndsRnXp” and “GbndRnge” govern the convergence over empty states, N_b, to be carried out jointly with that on the size of the response matrix, G_cut (“NGsBlkXp” variable). The edges of the grid (“start” and “stop”) and its spacing (“delta”), together with an upper bound of the parameter space (“max”), limiting the search to computationally accessible calculations, are also set by the user. The key “what” indicates the quantity to be converged – in our example, the direct band gap of the material at Γ point – up to a given convergence threshold Δ (“conv_thr” key). Notably, it is possible to select a convergence threshold Δ for each set of GW (BSE) parameters that have to be converged simultaneously, thereby avoiding any risk of over-convergence.

The output summarizes the convergence history and allows the user to easily parse the converged simulation. YamboConvergence allows one to converge several many-body quantities, like quasiparticle levels, band-gaps, as well as optical excitation energies. Notably, the convergence block in Fig. 2 can be skipped if converged parameters are already known.

Each single GW (BSE) calculation is instead automated within YamboWorkflow, which is the core workchain of the plugin that takes care of performing all the steps needed in a typical Yambo simulation – from preliminary self-consistent (SCF) and non-self-consistent (NSCF) DFT calculations to the actual GW (BSE) calculations, and the related post-processing. The workflow ensures a robust interoperability between DFT and MBPT codes (Quantum ESPRESSO and Yambo, respectively), and links subsequent calculations, interfacing the data automatically. In practice, YamboWorkflow encodes the specific flowchart underlying each requested calculation, and allows for its dynamic execution according to the instructions provided in input. This implies performing all the intermediate steps needed for a specific calculations without the need of instructing them explicitly, or, on the contrary, to skip some of the intermediate steps for which parent calculations are available, fully exploiting the YamboWorkflow provenance information.

To support a restart mechanism in case of code failures, YamboWorkflow takes advantage of the YamboRestart workchain, a sub-level workflow that encodes an automatic error handler (inherited from the AiiDA BaseRestartWorkchain class) which, depending on the encountered failure, automatically instructs a restart run. For out-of-memory errors or failures connected with insufficient wall-time requests, YamboRestart automatically resubmits the calculation by appropriately changing the requested resources (e.g., the maximum wall-time and the MPI/OpenMP balance); parallelization errors are managed by overwriting the parallelism variables set in input by the user with the default parallelism decided on the fly by yambo. In all these cases, an efficient, CPU-time-saving restart mechanism is implemented, which avoids to restart unfinished runs from scratch by automatically retrieving and enabling the reuse of stored data files.

As a final issue, we would like to discuss the possibility to develop protocols for MBPT calculations. Indeed, most of the DFT-based AiiDA plugins enable the use of protocols⁶⁰, that is, the possibility of creating inputs with pre-populated default values for several parameters. Such protocols are usually code-agnostic and robust, given the high level of reproducibility of DFT with different quantum engines¹⁹. Moreover, their reliability is guaranteed by means of large scale studies spanning systems with a wide variety of characteristics (e.g., metals, semiconductors, systems with different dimensionality)⁴⁴.

Concerning MBPT calculations, the possibility to define protocols is still an open issue³⁸. First of all, code-agnostic parameters are not at all easy to be determined as it is for DFT-based codes, because the MBPT implementations and the subsequent definition of parameters can differ in very many aspects, as highlighted in the Introduction section. Secondly, the high computational cost of these simulations has limited so far the number of systems to be studied extensively, which is crucial to define a reliable statistics on convergence parameters. Last but not least, DFT-based protocols usually result in safe but overconverged parameters, an approach that might lead to unfeasible calculations when moving to the GW-BSE framework.

For all these reasons, we believe that an efficient, fully automated convergence tool, as the one presented here, is currently the most valuable solution. Nonetheless, also in view of possible future developments, the aiida-yambo plugin provides an implementation for a protocol framework for both GW and BSE simulations, which is currently pre-populated on the basis of previous experience on a limited subset of systems. Such protocols concern several parameters connected to the main yambo input variables, such as the FFT grids (“FFTGvecs”), the summation over empty states (“BndsRnXp” and “GbndRnge”), the plane-wave expansion for the polarizability (“NGsBlkXp”) and the BZ k-point sampling. A detailed documentation on these and other aspects concerning the aiida-yambo plugin is provided at https://aiida-yambo.readthedocs.io/en/master/.

Automatic GW bands interpolation: the aiida-yambo-wannier90 plugin

GW band interpolation from Wannierisation is a crucial task in order to obtain the most accurate quasiparticle band structure with the lowest computational cost. This task is encoded in the aiida-yambo-wannier90 plugin (see Code Availability Section). Essentially, the plugin provides a meta-workflow, called YamboWannier90WorkChain, which utilizes the automation and error handling of the underlying aiida-yambo and aiida-wannier90-workflows plugins for GW convergence and Wannierisation, respectively. The flowchart of the workflow is summarized in Fig. 3. Starting from a given crystal structure, the workflow first launches a YamboConvergence workflow for automatic convergence. Then, it finds the minimal commensurate mesh with the wannier90.x ones that satisfies the GW convergence conditions (see below). Thirdly, it runs YamboWorkflow to compute all the quasiparticle corrections required for the Wannierisation on the commensurate mesh, and a subsequent ypp calculation (by means of YppRestart) to extract the GW corrections in a Wannier90eig file format. Fourthly, the workflow Wannierizes the KS wavefunctions, saving the unitary transformation matrices of maximal localization, and interpolates the band structure. Finally, the workflow performs the Wannierisation procedure at G₀W₀ level, which consists in incorporating the GW corrections into the DFT eigenvalues, and interpolating the band structure by using the DFT Wannierisation outcomes.

**Fig. 3: Flowchart of the YamboWannier90WorkChain for the Wannier interpolation of GW band structures.**

A crucial step of the workflow is finding a commensurate mesh for both GW QP calculations and Wannierisation. Indeed, the GW mesh resulting from automated convergence might not always be compatible with the mesh required by wannier90.x to ensure interpolation accuracy. Notably, considering a Monkhorst-Pack (MP) grid for the Wannierisation, the corresponding GW mesh must be an integer multiple of the MP grid. We here propose a recipe to find the minimal commensurate meshes for yambo and wannier90.x calculations, as depicted in Fig. 4. Considering n_d as the number of k-points chosen by the YamboConvergence workflow, and n_c the number of k-points chosen by the Wannierisation protocol (typically based on a k-point spacing, 0.2 Å⁻¹)⁴⁵, the target is to find a new (${n}_{d}^{{\prime} }$, ${n}_{c}^{{\prime} }$) such that the dense mesh ${n}_{d}^{{\prime} }=k\cdot {n}_{c}^{{\prime} }$, where $k\in {\mathbb{N}}$, i.e., natural number. The given input (n_d, n_c) restricts the search space to a sector bounded by k_low and k_high (see Fig. 4), where k_low = 1 (${n}_{d}^{{\prime} }={n}_{c}^{{\prime} }=\max ({n}_{d},{n}_{c})$ is always a good solution), and ${k}_{{{{\rm{high}}}}}=\lceil \frac{{n}_{d}}{{n}_{c}}\rceil$, where ⌈⋅⌉ indicates the ceiling integer. The search always succeeds since ${s}_{{{{\rm{low}}}}}=(\max ({n}_{d},{n}_{c}),\max ({n}_{d},{n}_{c}))$ and s_high = (k_high ⋅ n_c, n_c) are already two good solutions. In fact, the optimal solution is often inside the triangular region determined by the input (n_d, n_c), s_low, and s_high. The final solution is chosen according to the ℓ¹ distance to the input, such to ensure the minimal increase in computational cost. It is also possible to change the metric, e.g., pushing the solution towards increasing the Wannierisation mesh or GW mesh, depending on which calculation is less cumbersome. The aforementioned recipe is repeated for each of the three dimensions of the MP grid.

**Fig. 4: Recipe to find commensurate meshes for GW and Wannierisation calculations.**

Validation of the workflows

The proposed convergence algorithm, as implemented in the YamboConvergence workchain, has been validated by performing convergence studies for the quasiparticle G₀W₀ gap of a set of prototypical semiconductors: silicon, diamond, ZnO, rutile TiO₂, monolayer MoS₂, bulk and monolayer hBN. The convergence addresses the direct band gap at the Γ point with respect to the two coupled parameters, N_b and G_cut, and the k-point grid (in terms of k-points density ρ_k). Once convergence is achieved, the minimum band gap ${E}_{gap}^{{{\rm{G}}}_{0}{{\rm{W}}}_{0}}$ is also computed. The results are summarized in Table 1, where, together with ${E}_{gap}^{{{\rm{G}}}_{0}{{\rm{W}}}_{0}}$, we report the final parameters resulting from the convergence procedure (N_b, G_cut, and ρ_k) as well as the convergence threshold (absolute, Δ^Γ, and relative, ${\Delta }_{ \% }^{\Gamma }$) adopted in each case. Our results are found in good agreement with previous findings, which are also reported in the Table 1. Deviations with respect to reference results can be ascribed to different GW implementations and/or different DFT starting points used in the related works⁴¹. This is the case, for instance, of the band gap of ZnO: our result (2.36 eV, Table 1) is in excellent agreement with the result reported in ref. ⁴², where the same plasmon pole model is applied, but differs from more recent benchmark calculations⁴¹ (2.8–3.1 eV) that exploit other approaches. Further details on the convergence tests are contained in the Supplemental Material (including convergence plots obtained within this work and additional information in the Wannierisation procedure).

Table 1 G₀W₀ convergence tests on prototypical semiconductors.

Full size table

Figure 5 shows the convergence procedure for monolayer hBN, considering both the joint convergence with respect to N_b and G_cut (panel a), and the single-parameter convergence with respect to the k-mesh (panel b). Starting from an input grid with N_b ∈ [200,800] and G_cut ∈ [4,16] Ry (Fig. 5a, blue shaded area), a subset of 6 calculations is performed (black squares). Since the converged guess is above the upper bound of the parameter space (see “max” variable in the plugin description), a new shifted grid (orange shaded area) is considered, and a first guess for the converged parameters is found from the fitting procedure (blue square), now satisfying the accuracy condition (Eq. (13)). A new fit is performed including this additional point, which results in a new converged guess (red square). The procedure is repeated till the converged result is verified to be consistent with the prediction within the given threshold (Eq. (13)) and to be the true converged point (Eq. (14)).

**Fig. 5: Convergence algorithm applied to monolayer hBN.**

A similar path is followed for the k-mesh convergence (Fig. 5b): a limited number of calculations is initially performed (black squares), from which the fitting is evaluated (blue curve), and the smallest grid compatible with the given threshold is finally selected (here 8 × 8 × 1). We note that, despite the results of the simulations seem to have an oscillating behavior with respect to the fitted curve, the error bar considered here (from the 14 × 14 × 1 mesh) is ~ 0.13% of the band gap at Γ, i.e. ~10 meV, in line with the accuracy of state-of-the-art GW calculations. Each of the simulations shown in Fig. 5 was performed on a 40-core (230 GB RAM) Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz machine, for a computational time of 6 hours. Considering that the workflow submits several jobs simultaneously, the total wallclock time needed to obtain the final converged result is further reduced to less than 3 hours.

The dependence of the outcome of the algorithm on the input settings (e.g., the initial value of the parameters and the boundaries of the convergence space) is a key issue for evaluating the robustness of the algorithm itself. Indeed, the convergence procedure has been tested on the starting grid for monolayer MoS₂. By using two different grids, N_b ∈ [200,800], G_cut ∈ [4,20] Ry and N_b ∈ [200,1200], G_cut ∈ [8,24] Ry, the convergence point obtained from the workflow remains the same, i.e., (N_b,G_cut) = (400,8). Another important issue to evaluate is the efficiency of the algorithm. We notice that, in the case of the 2D-hBN convergence shown in the left panel of Fig. 5, only 14 calculations where required to reach convergence. Older implementations⁴⁹ would have required ≥25 simulations to achieve a final result, with over 40% reduction.

Next, the YamboWannier90WorkChain has been tested on bulk Si and Cu, in order to validate the automatic Wannier interpolation of GW band structures for both semiconductors and metals. Results are plotted in Fig. 6, where we compare the computed DFT bands (black dots), the Wannier interpolated DFT bands (red dashed lines), and the Wannier interpolated GW bands (green solid lines). For Si (Fig. 6a), the comparison between computed and interpolated DFT bands shows that the results are almost identical, indicating the accuracy of the Wannierisation of the KS wavefunctions. Moreover, the typical band gap opening upon inclusion of GW corrections is found when comparing KS and QP band structures.

**Fig. 6: Wannier interpolation of GW band structures.**

For Cu (Fig. 6b), we obtain a discrepancy of ~10 meV around the Fermi energy (here set to zero) between computed and interpolated bands at the DFT level. Better accuracy can be achieved imposing more stringent values of the involved parameters. At QP level, the GW correction is very small around the Fermi level ( ~ 37 meV), but still not negligible. Here, the GW convergence is more stringent than for Si, especially concerning the k-point mesh. Indeed, denser grids are needed to account for the contribution of intra-band transitions in the q → 0 limit, which is crucial for metallic systems but not explicitly included within the plasmon pole approximation^61,62. Considering the converged parameters, (N_b, G_cut, ρ_k) = (400, 18 Ry, 0.2 Å⁻¹), the quasiparticle evaluations required to interpolate the bands for the minimum converged wannier90.x k-point mesh (16 × 16 × 16) become 2900. This quite large number of QP corrections can be easily computed using the YamboWorkflow workchain thanks to the possibility to split the QPs calculation in several runs, each of them computing only a fraction of the GW corrections, and then collecting all the data in a final database well-suited for the YamboWannier90WorkChain. Since the number of QP corrections to compute can be quite high, in the Supporting Information (Supplementary Note IIB) we suggest an effective way to reduce the number of the required calculations for cases when accurate QP corrections are needed only in a limited energy region, e.g. around the Fermi energy, and energies outside of the chosen region can be approximated e.g. through a scissor and stretching correction.

In this work, we have presented the successful design and implementation of advanced algorithms in state-of-the-art GW (BSE) calculations, that is, convergence between interdependent parameters, error handling and automatic band interpolation by means of Wannierisation. We validated the tools on selected cases among semiconductors and metallic systems. The results contained in this work clearly show the power of these developed workflows for the automated study of excited states properties of materials, paving the way for achieving high-throughput MBPT studies. Thanks to these developments and within the next-generation of pre-exascale and exascale supercomputers, these simulations may become extensively and routinely performed by the materials-science community in the near future.

Methods

For all the systems studied here, we used symmetrized geometries in such a way to reduce the computational cost of simulations. We do not expect relevant differences in the results obtained with fully-relaxed structures. DFT simulations were carried out by using the Quantum ESPRESSO simulation package, which implements plane-wave basis set and pseudopotential approach. The KS-DFT exchange-correlation functional was approximated using GGA-PBE⁶³, through the optimized norm-conserving Vanderbilt (ONCV) SG15 pseudopotentials^64,65. In the case of ZnO, we adopted Local Density Approximation (LDA), to compare the results with the existing literature^41,42, and PseudoDojo pseudopotentials⁶⁶. GW and BSE results were obtained by means of the Yambo code. The frequency dependence of the screened interaction potential was approximated by using the Godby-Needs plasmon pole approximation⁶² (GNPPA), and the quasiparticle energies were calculated according to the G₀W₀ approximation^53,54. The Bruneval-Gonze technique⁶⁷ was used to reduce the number of empty states N_b needed for the construction of the correlation Self-Energy Σ_c (Eq. (2)). For low-dimensional systems, spurious interactions between supercell replica were avoided using a slab truncation of the Coulomb potential⁶⁸ along the non-periodic direction; its divergences are cured by means of the Random Integration Method⁶⁹ (RIM), which also accelerates convergence with respect to the BZ sampling. For 2D systems, specifically, we adopted a recently developed accelerating technique based on stochastic integration of the screened potential⁷⁰, which allows to have GW-converged results using reduced Monkhorst-Pack k-points grids, just slightly denser than the DFT one. Finally, Wannierisation and band interpolations are performed by means of the Wannier90 code. All the simulations are performed using the automated workflows implemented in the aiida-yambo and aiida-yambo-wannier90 plugins, developed for the AiiDA platform and presented here as part of the results achieved in this work. Input parameters are generated using the protocols procedure, as implemented in the corresponding plugins.

Data availability

The data supporting the findings of this paper are available on the Materials Cloud⁷¹ at https://doi.org/10.24435/materialscloud:6w-qh⁷². Results obtained in this work can be reproduced by means of the example scripts delivered within the aiida-yambo and aiida-yambo-wannier90 plugins.

Code availability

All the codes used in this work are fully available to the community by means of their repositories, and supported by appropriate documentations and tutorials. The Yambo code is accessible at https://www.yambo-code.eu/download/. The Quantum ESPRESSO and Wannier90 codes can be found, respectively, at https://www.quantum-espresso.org/download and http://www.wannier.org/download. The AiiDA infrastructure is available at http://www.aiida.net/download. AiiDA plugins can be downloaded from the corresponding GitHub repositories. Specifically, the aiida-yambo code is available at https://github.com/yambo-code/aiida-yambo and The aiida-yambo-wannier90 code is available at https://github.com/aiidaplugins/aiida-yambo-wannier90. The plugin documentations are available at https://aiida-yambo.readthedocs.io/en/master/ and https://aiida-yambo-wannier90.readthedocs.io/en/latest/, respectively.

References

Curtarolo, S. et al. The high-throughput highway to computational materials design. Nat. Mater. 12, 191–201 (2013).
Article CAS Google Scholar
Vecchio, K. S., Dippo, O. F., Kaufmann, K. R. & Liu, X. High-throughput rapid experimental alloy development (HT-READ). Acta Mater. 221, 117352 (2021).
Article CAS Google Scholar
Luo, S., Li, T., Wang, X., Faizan, M. & Zhang, L. High-throughput computational materials screening and discovery of optoelectronic semiconductors. WIREs Comput. Mol. Sci. https://onlinelibrary.wiley.com/doi/10.1002/wcms.1489 (2021).
Ashton, M., Paul, J., Sinnott, S. B. & Hennig, R. G. Topology-scaling identification of layered solids and stable exfoliated 2D materials. Phys. Rev. Lett. 118, 106101 (2017).
Article Google Scholar
Cheon, G. et al. Data mining for new two- and one-dimensional weakly bonded solids and lattice-commensurate heterostructures. Nano Lett. 17, 1915–1923 (2017).
Article CAS Google Scholar
Mounet, N. et al. Two-dimensional materials from high-throughput computational exfoliation of experimentally known compounds. Nat. Nanotechnol. 13, 246–252 (2018).
Article CAS Google Scholar
Choudhary, K., Kalish, I., Beams, R. & Tavazza, F. High-throughput identification and characterization of two-dimensional materials using density functional theory. Sci. Rep. 7, 5179 (2017).
Article Google Scholar
Mounet, N. et al. Two-dimensional materials from high-throughput computational exfoliation of experimentally known compounds. Mater. Cloud Archive 2020.158 https://archive.materialscloud.org/record/2020.158 (2020).
Haastrup, S. et al. The Computational 2D Materials Database: high-throughput modeling and discovery of atomically thin crystals. 2D Mater. 5, 042002 (2018).
Article CAS Google Scholar
Marrazzo, A., Gibertini, M., Campi, D., Mounet, N. & Marzari, N. Relative abundance of Z2 topological order in exfoliable two-dimensional insulators. Nano Lett. 19, 8431–8440 (2019).
Article CAS Google Scholar
Kirklin, S., Meredig, B. & Wolverton, C. High-Throughput computational screening of new li-ion battery anode materials. Adv. Energy Mater. 3, 252–262 (2013).
Article CAS Google Scholar
Zhang, Z. et al. Computational screening of layered materials for multivalent ion batteries. ACS Omega 4, 7822–7828 (2019).
Article CAS Google Scholar
Chen, W. et al. Understanding thermoelectric properties from high-throughput calculations: trends, insights, and comparisons with experiment. J. Mater. Chem. C 4, 4414–4426 (2016).
Article CAS Google Scholar
Bhattacharya, S. & Madsen, G. K. H. High-throughput exploration of alloying as design strategy for thermoelectrics. Phys. Rev. B 92, 085205 (2015).
Article Google Scholar
Castelli, I. E. et al. Computational screening of perovskite metal oxides for optimal solar light capture. Energy Environ. Sci. 5, 5814–5819 (2012).
Article CAS Google Scholar
Yu, L. & Zunger, A. Identification of potential photovoltaic absorbers based on first-principles spectroscopic screening of materials. Phys. Rev. Lett. 108, 068701 (2012).
Article Google Scholar
Yan, Q. et al. Solar fuels photoanode materials discovery by integrating high-throughput theory and experiment. Proc. Natl Acad. Sci. USA 114, 3040–3043 (2017).
Article CAS Google Scholar
Kuhar, K., Pandey, M., Thygesen, K. S. & Jacobsen, K. W. High-throughput computational assessment of previously synthesized semiconductors for photovoltaic and photoelectrochemical devices. ACS Energy Lett. 3, 436–446 (2018).
Article CAS Google Scholar
Lejaeghere, K. et al. Reproducibility in density functional theory calculations of solids. Science 351, aad3000 (2016).
Article Google Scholar
Prandini, G., Marrazzo, A., Castelli, I. E., Mounet, N. & Marzari, N. Precision and efficiency in solid-state pseudopotential calculations. Npj Comput. Mater. 4, 72 (2018).
Article Google Scholar
Maffioletti, S. & Murri, R. GC3Pie: A Python Framework For High-throughput Computing. (Sissa Medialab, Munich, Germany, 2012).
Google Scholar
Curtarolo, S. et al. AFLOW: an automatic framework for high-throughput materials discovery. Comput. Mater. Sci. 58, 218–226 (2012).
Article CAS Google Scholar
Jain, A. et al. FireWorks: a dynamic workflow system designed for high-throughput applications. Concurr. Comput. Pract. Exp. 27, 5037–5059 (2015).
Article Google Scholar
Pizzi, G., Cepellotti, A., Sabatini, R., Marzari, N. & Kozinsky, B. Aiida: automated interactive infrastructure and database for computational science. Comput. Mater. Sci. 111, 218–230 (2016).
Article Google Scholar
Hjorth Larsen, A. et al. The atomic simulation environment—a Python library for working with atoms. J. Phys. Condens. Matter 29, 273002 (2017).
Article Google Scholar
Mathew, K. et al. Atomate: a high-level interface to generate, execute, and analyze computational materials science workflows. Comput. Mater. Sci. 139, 140–152 (2017).
Article Google Scholar
Mortensen, J., Gjerding, M. & Thygesen, K. MyQueue: task and workflow scheduling system. J. Open Source Softw. 5, 1844 (2020).
Article Google Scholar
Huber, S. P. et al. AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance. Sci. Data 7, 300 (2020).
Article Google Scholar
Uhrin, M., Huber, S. P., Yu, J., Marzari, N. & Pizzi, G. Workflows in AiiDA: engineering a high-throughput, event-based engine for robust and modular computational workflows. Comp. Mater. Sci. 187, 110086 (2021).
Article Google Scholar
Bablich, A., Kataria, S. & Lemme, M. C. Graphene and two-dimensional materials for optoelectronic applications. Electronics https://www.mdpi.com/2079-9292/5/1/13 (2016).
Zhang, X. et al. A review on optoelectronic device applications of 2d transition metal carbides and nitrides. Mater. Des. 200, 109452 (2021).
Article CAS Google Scholar
Jean, J., Brown, P. R., Jaffe, R. L., Buonassisi, T. & Bulović, V. Pathways for solar photovoltaics. Energy Environ. Sci. 8, 1200–1219 (2015).
Article CAS Google Scholar
Zhu, S. & Wang, D. Photocatalysis: basic principles, diverse forms of implementations and emerging scientific opportunities. Adv. Energy Mater. 7, 1700841 (2017).
Article Google Scholar
Jin, W. & Hu, L. Review on quasi one-dimensional cdse nanomaterials: synthesis and application in photodetectors. Nanomaterials https://www.mdpi.com/2079-4991/9/10/1359 (2019).
Xia, F., Mueller, T., Lin, Y.-m., Valdes-Garcia, A. & Avouris, P. Ultrafast graphene photodetector. Nat. Nanotechnol. 4, 839–843 (2009).
Article CAS Google Scholar
Lee, W. et al. High-detectivity flexible near-infrared photodetector based on chalcogenide Ag ₂ Se nanoparticles. Adv. Opt. Mater. 7, 1900812 (2019).
Article CAS Google Scholar
Martin, R. M., Reining, L. & Ceperley, D. M. Interacting Electrons: Theory and Computational Approaches (Cambridge University Press, 2016).
van Setten, M. J., Giantomassi, M., Gonze, X., Rignanese, G.-M. & Hautier, G. Automation methodologies and large-scale validation for G W : towards high-throughput G W calculations. Physical Review B 96, 155207 (2017).
Article Google Scholar
Rasmussen, A., Deilmann, T. & Thygesen, K. S. Towards fully automated GW band structure calculations: What we can learn from 60.000 self-energy evaluations. Npj Comput. Mater. 7, 22 (2021).
Article Google Scholar
Hüser, F., Olsen, T. & Thygesen, K. S. Quasiparticle GW calculations for solids, molecules, and two-dimensional materials. Phys. Rev. B 87, 235132 (2013).
Article Google Scholar
Rangel, T. et al. Reproducibility in G 0 W 0 calculations for solids. Comput. Phys. Commun. 255, 107242 (2020).
Article CAS Google Scholar
Stankovski, M. et al. G 0 W 0 band gap of ZnO: effects of plasmon-pole models. Phys. Rev. B 84, 241201 (2011).
Article Google Scholar
Mercado, R. et al. In silico design of 2D and 3D covalent organic frameworks for methane storage applications. Chem. Mater. 30, 5069–5086 (2018).
Article CAS Google Scholar
Prandini, G., Marrazzo, A., Castelli, I. E., Mounet, N. & Marzari, N. Precision and efficiency in solid-state pseudopotential calculations. Npj Comput. Mater. 4, 72 (2018).
Article Google Scholar
Vitale, V. et al. Automated high-throughput Wannierisation. Npj Comput. Mater. 6, 66 (2020).
Article Google Scholar
Giannozzi, P. et al. QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys. Condens. Matter 21, 395502 (2009).
Article Google Scholar
Giannozzi, P. et al. Advanced capabilities for materials modelling with Quantum ESPRESSO. J. Phys. Condens. Matter 29, 465901 (2017).
Article CAS Google Scholar
Marini, A., Hogan, C., Grüning, M. & Varsano, D. yambo: an ab initio tool for excited state calculations. Comput. Phys. Commun. 180, 1392–1403 (2009).
Article CAS Google Scholar
Sangalli, D. et al. Many-body perturbation theory calculations using the yambo code. J. Phys. Condens. Matter 31, 325902 (2019).
Article CAS Google Scholar
Marzari, N., Mostofi, A. A., Yates, J. R., Souza, I. & Vanderbilt, D. Maximally localized Wannier functions: theory and applications. Rev. Mod. Phys. 84, 1419–1475 (2012).
Article CAS Google Scholar
Pizzi, G. et al. Wannier90 as a community code: new features and applications. J. Phys. Condens. Matter 32, 165902 (2020).
Article CAS Google Scholar
Yakutovich, A. V. et al. Aiidalab – an ecosystem for developing, executing, and sharing scientific workflows. Comput. Mater. Sci. 188, 110165 (2021).
Article CAS Google Scholar
Hedin, L. New method for calculating the one-particle green’s function with application to the electron-gas problem. Phys. Rev. 139, 796–823 (1965).
Article CAS Google Scholar
Onida, G., Reining, L. & Rubio, A. Electronic excitations: density-functional versus many-body Green’s-function approaches. Rev. Mod. Phys. 74, 601–659 (2002).
Article CAS Google Scholar
Gao, W., Xia, W., Gao, X. & Zhang, P. Speeding up GW calculations to meet the challenge of large scale quasiparticle predictions. Sci. Rep. 6, 36849 (2016).
Article CAS Google Scholar
Strinati, G. Application of the green’s functions method to the study of the optical properties of semiconductors. Riv. del Nuovo Cim. (1978-1999) 11, 1–86 (1988).
Article CAS Google Scholar
Schindlmayr, A. Analytic evaluation of the electronic self-energy in the G W approximation for two electrons on a sphere. Phys. Rev. B 87, 075104 (2013).
Article Google Scholar
Klimeš, J., Kaltak, M. & Kresse, G. Predictive G W calculations using plane waves and pseudopotentials. Phys. Rev. B 90, 075125 (2014).
Article Google Scholar
Maggio, E., Liu, P., van Setten, M. J. & Kresse, G. Gw100: a plane wave perspective for small molecules. J. Chem. Theory Comput. 13, 635–648 (2017).
Article CAS Google Scholar
Huber, S. P. et al. Common workflows for computing material properties using different quantum engines. Npj Comput. Mater. 7, 136 (2021).
Article Google Scholar
Hybertsen, M. S. & Louie, S. G. Electron correlation in semiconductors and insulators: band gaps and quasiparticle energies. Phys. Rev. B 34, 5390–5413 (1986).
Article CAS Google Scholar
Godby, R. W. & Needs, R. J. Metal-insulator transition in Kohn-Sham theory and quasiparticle theory. Phys. Rev. Lett. 62, 1169–1172 (1989).
Article CAS Google Scholar
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868 (1996).
Article CAS Google Scholar
Hamann, D. R. Optimized norm-conserving vanderbilt pseudopotentials. Phys. Rev. B 88, 085117 (2013).
Article Google Scholar
Schlipf, M. & Gygi, F. Optimization algorithm for the generation of oncv pseudopotentials. Comput. Phys. Commun. 196, 36–44 (2015).
Article Google Scholar
van Setten, M. et al. The PseudoDojo: training and grading a 85 element optimized norm-conserving pseudopotential table. Comput. Phys. Commun. 226, 39–54 (2018).
Article Google Scholar
Bruneval, F. & Gonze, X. Accurate G W self-energies in a plane-wave basis using only a few empty states: towards large systems. Phys. Rev. B 78, 085125 (2008).
Article Google Scholar
Rozzi, C. A., Varsano, D., Marini, A., Gross, E. K. U. & Rubio, A. Exact Coulomb cutoff technique for supercell calculations. Phys. Rev. B 73, 205119 (2006).
Article Google Scholar
Pulci, O., Onida, G., Del Sole, R. & Reining, L. Ab initio calculation of self-energy effects on optical properties of GaAs(110). Phys. Rev. Lett. 81, 5374–5377 (1998).
Article CAS Google Scholar
Guandalini, A., D'Amico, P., Ferretti, A. & Varsano, D. Efficient gw calculations in two dimensional materials through a stochastic integration of the screened potential. Npj Comput. Mater. 9, 44 (2023).
Article CAS Google Scholar
Talirz, L. et al. Materials Cloud, a platform for open computational science. Sci. Data 7, 299 (2020).
Article Google Scholar
Bonacci, M. et al. Towards high-throughput many-body perturbation theory: efficient algorithms and automated workflows. Mater. Cloud Archive 2022.161 https://archive.materialscloud.org/record/2022.161 (2022).
Gao, S.-P. Band gaps and dielectric functions of cubic and hexagonal diamond polytypes calculated by many-body perturbation theory. Phys. Stat. Solidi (b) 252, 235–242 (2015).
Article CAS Google Scholar
Rasmussen, F. A., Schmidt, P. S., Winther, K. T. & Thygesen, K. S. Efficient many-body calculations for two-dimensional materials using exact limits for the screened potential: Band gaps of mos₂, h-bn, and phosphorene. Phys. Rev. B 94, 155406 (2016).
Article Google Scholar

Download references

Acknowledgements

We acknowledge stimulating discussions with Nicola Marzari, Michael O. Atambo, Gianluca Prandini, Dario A. L. Valido, Alberto Guandalini, and Fulvio Paleari. This work was supported by: the Centre of Excellence “MaX - Materials Design at the Exascale” funded by European Union (H2020-EINFRA-2015-1, Grant No. 676598; H2020-INFRAEDI-2018-1, Grant No. 824143; HORIZON-EUROHPC-JU-2021-COE-1, Grant No. 101093324); the European Union’s Horizon 2020 research and innovation program (BIG-MAP, Grant No. 957189, also part of the BATTERY 2030+ initiative, Grant No. 957213); SUPER (Supercomputing Unified Platform - Emilia-Romagna) from Emilia-Romagna PORFESR 2014-2020 regional funds; the Italian national program PRIN2017 2017BZPKSZ “Excitonic insulator in two-dimensional long-range interacting systems”; the ICSC - Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing, funded by European Union - NextGenerationEU - PNRR, Missione 4 Componente 2 Investimento 1.4; the Swiss National Science Foundation (SNSF) Project Funding (Grant No. 200021E_206190 “FISH4DIET”); NCCR MARVEL, a National Centre of Competence in Research, funded by the Swiss National Science Foundation (Grant No. 205602). Computational time on the Marconi100 and Galileo100 machines at CINECA was provided by the Italian ISCRA program.

Author information

Authors and Affiliations

FIM Department, University of Modena and Reggio Emilia, Via Campi 213/a, Modena, Italy
Miki Bonacci & Elisa Molinari
S3 Center, Istituto Nanoscienze, CNR, Via Campi 213/a, Modena, Italy
Miki Bonacci, Nicola Spallanzani, Elisa Molinari, Daniele Varsano, Andrea Ferretti & Deborah Prezzi
Theory and Simulation of Materials (THEOS) and National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, CH-1015, Lausanne, Switzerland
Junfeng Qiao & Giovanni Pizzi
Dipartimento di Fisica, Università di Trieste, I-34151, Trieste, Italy
Antimo Marrazzo
Laboratory for Materials Simulations (LMS), Paul Scherrer Institut (PSI), CH-5232, Villigen PSI, Switzerland
Giovanni Pizzi

Authors

Miki Bonacci
View author publications
You can also search for this author in PubMed Google Scholar
Junfeng Qiao
View author publications
You can also search for this author in PubMed Google Scholar
Nicola Spallanzani
View author publications
You can also search for this author in PubMed Google Scholar
Antimo Marrazzo
View author publications
You can also search for this author in PubMed Google Scholar
Giovanni Pizzi
View author publications
You can also search for this author in PubMed Google Scholar
Elisa Molinari
View author publications
You can also search for this author in PubMed Google Scholar
Daniele Varsano
View author publications
You can also search for this author in PubMed Google Scholar
Andrea Ferretti
View author publications
You can also search for this author in PubMed Google Scholar
Deborah Prezzi
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.B., J.Q., A.M., and N.S. contributed to the development of the aiida-yambo plugin; M.B. designed, implemented and tested the automatic convergence algorithm, and the other workflows belonging to the aiida-yambo plugin. J.Q. and M.B. implemented the aiida-yambo-wannier90 plugin. E.M., A.F., D.V., D.P., and G.P. were responsible for the project supervision and coordination. M.B., J.Q., and D.P. wrote the manuscript with contributions from all authors.

Corresponding author

Correspondence to Miki Bonacci.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

41524_2023_1027_MOESM1_ESM.pdf

Supplementary Material for: Towards high-throughput many-body perturbation theory: efficient algorithms and automated workflows

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Bonacci, M., Qiao, J., Spallanzani, N. et al. Towards high-throughput many-body perturbation theory: efficient algorithms and automated workflows. npj Comput Mater 9, 74 (2023). https://doi.org/10.1038/s41524-023-01027-2

Download citation

Received: 07 December 2022
Accepted: 15 April 2023
Published: 18 May 2023
DOI: https://doi.org/10.1038/s41524-023-01027-2