Introduction

Energy and environmental issues raise imperative demands on the sustainable development of energy. At present, a promising strategy is to produce renewable energy through electrochemically catalytic reactions, which convert some common materials, such as water, carbon dioxide, and nitrogen, into energy carriers. The electrocatalytic hydrogen evolution is one of the most important ways to achieve this goal, which is a critical reaction for a variety of electrochemical processes and has been applied in diverse applications such as hydrogen fuel cells, electrodeposition, corrosion of metals in acids, and energy storage using H2 medium. For the hydrogen evolution reaction (HER), choosing an appropriate electrocatalyst is most crucial to achieve high efficiency. Precious metals such as platinum are recognized as the best catalysts for electrochemical HER with intrinsically good activity but high cost1,2,3,4. Compared to conventional bulk electrocatalysts, nanomaterials could exhibit even more superior performance as benefited from the rapid development of nanoscience and nanotechnology5,6,7.

Nanoclusters, consisting of a few tens to a few thousands of atoms, exhibit the distinctive optical, electronic, magnetic, and chemical properties definitely different from the bulk materials8,9. Metal clusters have received long-lasting attention owing to their potential applications in catalysis10,11,12. As known, the catalytic reactions always occur on the surface of catalysts, thus a significant proportion of atoms in the bulk materials cannot be effectively utilized in the catalytic process. The nanoclusters possess compelling advantages like high surface-to-volume ratio and more surface-exposed atoms. In addition, it is possible to increase the activity and selectivity in a catalytic process by carefully controlling the size and structure of metal nanoclusters8. As an alternative to the precious metals like Pt and Pd with high expenses and low resources, the alloy nanoclusters prepared by mixing two or more metal components have been successfully synthesized and display better catalytic activity due to the synergistic effects9,13,14,15, which could replace or reduce the loading of the precious metals as the efficient electrocatalysts. The formation of core–shell or intermixed nanostructures results in superior catalytic properties with regard to one of the components, such as NiPt, CoPt, CuPd nanoparticles.

For the equilibrium structures of bimetallic alloy nanoclusters, there are four types of geometric patterns9,16. The first one is core–shell structure, where a core of one type of metal atoms is surrounded by a shell of the other kind of metal atoms. The second one is a segregated structure, where two types of metal atoms are separated by distinct boundary. The third one is homogeneously mixed structure with the two metal atoms are distributed in an atomically ordered or a statistically random manner. The fourth one is “onion-like” structure with more than one concentric shell covering the metal core. Among them, the core–shell alloy clusters have received increasing attention since their catalytic activities could be efficiently tuned by controlling the compositions of core and shell atoms.

In order to develop low-cost and environmentally benign HER catalysts to substitute the precious metals, here we performed high-throughput density functional theory (DFT) computations of Cu-based alloy nanoclusters and screened for the promising HER electrocatalysts. Among a total number of 7924 candidate structures, the most stable structures for Cu55-nMn (M = Co, Ni, Ru and Rh, and n ≤ 22) nanoclusters adopt the core–shell configurations with Cu as the shell atoms. To avoid the heavy computations, a descriptor has been established to evaluate the HER activity on the nanoclusters, and by training a machine-learning neutral network with a large DFT database, we have achieved a quick and accurate prediction of the hydrogen adsorption free energies on the nanoclusters surface.

Results and discussion

As an earth-abundant and inexpensive metal, copper has been widely used in some electrocatalytic reactions. Previously, Nørskov and co-workers17 have calculated the free energies of hydrogen adsorption on a variety of metal alloy surfaces and evaluated their HER activities. Taking Cu as the host metal, four transition metal dopants (i.e., Co, Ni, Ru, and Rh) in the Cu surface are shown to exhibit superior hydrogen evolution activities. Thus, we choose a medium-sized Cu55 nanocluster (with a diameter of about 1 nm) as the initial structure, and randomly substitute the transition metals for Cu atoms to form binary alloy nanoclusters and screening for the high-efficient HER catalysts. The high-throughput computational workflow is shown in Fig. 1. First, given an initial cluster structure and the doping concentration, the possible doped cluster structures are generated and optimized by DFT calculations. After the stable cluster structures are obtained, the surface-active sites are enumerated and then considered for the H adsorption. Together with the contributions of zero-point energy and entropy, the H adsorption free energies are calculated to identify the optimal active sites and search for the promising HER catalysts.

Fig. 1: Workflow of high-throughput screening.
figure 1

The workflow of computational high-throughput screening of HER electrocatalysts on metal alloy nanoclusters.

Geometric structures

For pure Cu55 cluster, the lowest-energy structure is found to be a three-layer icosahedron with Ih symmetry18. To achieve the low-cost catalysts, it is preferred to contain less precious metals in the catalysts. As the host metal of Cu is remarkably less expensive, the dopant noble metal is the less the better. Here we have considered the dopant concentration is less than half. Firstly, three ratios of the Cu-based alloy nanoclusters of Cu55-nMn are considered to explore their structures and growth patterns, where M represents Co, Ni, Ru, and Rh, and n = 7, 13, and 16. By eliminating duplicate individuals using the nearest neighbor matrix method19, we have obtained totally 6020 configurations for these 12 alloy clusters for further geometric optimization.

As mentioned earlier, there are four kinds of structures for the bimetallic alloy clusters. In order to describe the geometric distributions of Cu and dopant metal atoms, we define the average radial distance (Dradial, M) between the transition metal atoms (M = Co, Ni, Ru, and Rh) and the center atom by the following formula:

$$\begin{array}{*{20}{c}} {D_{{\mathrm{radial}},{\mathrm{M}}} = \mathop {\sum }\limits_{i = 1}^n \frac{{\left| {{\boldsymbol{r}}_i - {\boldsymbol{r}}_{\mathrm{c}}} \right|}}{n}} \end{array}$$
(1)

where |rirc| indicates the radial distance from the ith metal atom to the center atom, and n is the number of transition metal dopant atoms. In the case of the icosahedral Cu55-nMn clusters, as the Dradial, M value becomes smaller, the dopant atoms distribute closer to the core region, and vice versa. The excess energies (\(E_{{\mathrm{exc}}}^ \ast\)) for Cu55-nMn alloy clusters as a function of the average radial distance of the dopant metal atoms to the core atom are plotted in Fig. 2. These four plots for different transition metal dopants display nearly identical trends. By definition, the lower excess energy indicates a more stable cluster structure. The structures in the left lower quadrant possess relatively lower excess energies, which are the core–shell structures with the core of dopant metal atoms surrounded by the shell of Cu atoms, independent of the doping concentration.

Fig. 2: Energies and structural patterns for alloy clusters.
figure 2

The excess energies (\(E_{{\mathrm{exc}}}^ \ast\)) for (a) Cu55-nCon, (b) Cu55-nNin, (c) Cu55-nRun, and (d) Cu55-nRhn alloy clusters with n = 7, 13, and 16 as a function of the average radial distance (Dradial, M) between the dopant metal atoms and the center atom. The insets show the lowest-energy configurations.

We take Cu42M13 as the representative to illustrate the detailed structural patterns. For the icosahedral structure, the central atom could be either Cu or M. We have systematically compared these two types of core–shell structures together with the other possible structural patterns (homogenous and segregated structures). The excess energies as a function of the average radial distance of the M atoms to the central atom are summarized in Fig. 3. A linear relationship exists between excess energies and Dradial, M for Cu42M13 clusters, and the difference of the central atom (Cu or M) causes a slightly different slope. Among four types of Cu-based alloy clusters, Cu43Ni13 clusters exhibit the smallest difference between the Cu-center and Ni-center structural patterns, as the two fitting lines almost coincide with each other. This indicates the energy differences between Ni-center and Cu-center structures are very small. In contrast, the two fitting lines for Cu-center and Rh-center structures show relatively big difference. According to previous discussions on alloy nanoclusters9, as the alloy bond strength is greater, intimate mixing is favored. If the bond strength is lower, segregation of the two metal components is favored; the metal atoms with lower surface energy tend to move to the surface of alloy cluster to form a shell, like Cu.

Fig. 3: Energies and structural patterns for Cu42M13 clusters.
figure 3

The excess energies (\(E_{{\mathrm{exc}}}^ \ast\)) for (a) Cu42Co13, (b) Cu42Ni13, (c) Cu42Ru13, and (d) Cu42Rh13 alloy clusters as a function of the average radial distance (Dradial, M) between the dopant metal atoms and the central atom. The core–shell structure and the segregated structure are illustrated.

Before discussing the catalysts for HER, we examine the structural stability of these 55-atom alloy clusters. To confirm the stability of the core–shell structures, we have constructed totally 7924 structures for the Cu55-nMn clusters with n from 1 to 22, corresponding to the dopant concentrations ranging from 1.8% to 40.0%, and the most stable cluster structures are shown in Supplementary Figs 14. The excess energies of structures for each dopant concentration are plotted in Fig. 4. As shown, the CuNi alloy clusters are energetically favorable than the other alloy clusters, that is to say, Ni doping into Cu clusters is most thermodynamically favorable compared to the other metal doping. Doping Co into Cu clusters leads to the second low-energy alloy clusters. The Ru and Rh doped Cu clusters are not as stable as Co and Ni, which may be attributed to the relatively larger volume expansion of the alloy clusters (the covalent radii of Ru and Rh atoms are 1.46 Å and 1.42 Å, respectively, larger than that of Cu atom with 1.32 Å). The diameter of the pure Cu55 clusters is 9.606 Å. Taking Cu33M22 cluster as an example, the equilibrium diameters for Cu33Co22, Cu33Ni22, Cu33Ru22, and Cu33Rh22 clusters are 9.489 Å, 9.548 Å, 9.939 Å, and 10.058 Å, respectively. The Ru and Rh doping into the Cu clusters result in the largest volume expansion to increase energy. Therefore, doping Ni into Cu clusters is most feasible to form the CuNi alloy clusters in experiments, compared to the other alloy clusters considered here.

Fig. 4: Energetic stability for core–shell clusters.
figure 4

The excess energies (\(E_{{\mathrm{exc}}}^ \ast\)) for core–shell structures of (a) Cu55-nCon, (b) Cu55-nNin, (c) Cu55-nRun, and (d) Cu55-nRhn alloy clusters as a function of dopant concentration, where nM is the number of dopant metal atoms in the 55-atom cluster. The lowest excess energies for each dopant concentration are connected by a solid line.

For the icosahedral core–shell structures of the Cu55-nMn clusters, the outer shell consists of 42 atoms distributed in two different sites, i.e., 30 atoms at the edge sites and 12 atoms at the vertex sites. Meanwhile, there are 13 atoms in the core region. When the number of dopant metal atoms is less than 13, the dopants prefer to occupy the core positions, and the Cu atoms tend to stay on the outer shell. As the dopant concentration increases (n > 13), Co, Ni, and Rh atoms start to occupy the edge sites on the shell, while Ru atoms seem to prefer both sites. For the Cu55-nNin clusters with dopant concentration of 13 < nNi ≤ 19, Ni atoms are distributed dispersedly and symmetrically on edge sites; but when nNi > 19, Ni atoms prefer to form small island on the cluster surface. Our results have demonstrated that, the most stable structural pattern of the Cu55-nMn clusters (M = Co, Ni, Ru, and Rh) is the core–shell structure, in which the dopant metal atoms occupy the core and the Cu atoms are distributed as the shell. It is really important to consider the structural evolution of the 55-atom clusters with high concentration doping. To this point, we have compared our icosahedron structures to another typical cuboctahedron structures. Because the structural evolution is more possible to occur in the high doping concentration, here we have investigated the Cu33M22 (M = Co, Ni, Ru, and Rh) clusters, which is the largest doping concentration considered in our work. Upon structural optimization (40 structures for each dopant), most of the cuboctahedron structures are reconstructed and transformed into icosahedron structures. The total energies of the low-energy cuboctahedron structures with different mixing patterns are summarized in Supplementary Table 1. The core–shell structural pattern in cuboctahedron configuration also possesses the lowest energy, but still higher by more than 2 eV in total energy than the icosahedron configuration. Therefore, the Cu55 clusters with high doping concentration (less than half) will remain the same structure as Cu55 cluster.

Hydrogen adsorption free energy

Based on the most stable structures of the Cu55-nMn clusters, we further investigate the HER activities on the surface of these alloy clusters. To evaluate the HER activity, the H adsorption free energy is commonly used as a reasonable descriptor for a wide range of metals and alloys20,21,22. The HER process includes two steps. The first step involves bonding of H to the catalyst H+ + e + * → H*, where * denotes a site on the surface able to bind to H. The second step is the release of molecular H2 through one of the two processes: 2H* → H2 + * or H+ + e + H* → H2 + *. If H binds too weakly on the catalyst surface, it is difficult to activate in the first step; but when the H binds too strongly on the catalyst, the active sites will be occupied and suppress the second step of releasing H2. Thus, the optimal H adsorption free energy should be close to zero. The H adsorption free energy is calculated at a potential U = 0 relative to the standard hydrogen electrode at pH = 0. The free energy of H+ + e is by definition the same as that of 1/2 H2 at standard conditions. Therefore, herein we have evaluated the adsorption free energy for hydrogen (ΔGH) on all the possible adsorption sites for the stable structures of Cu55-nMn nanoclusters. Totally, we have 242 H adsorption sites on each cluster structure (more details can be found in Supplementary Fig. 5, and those with H adsorption free energy (|ΔGH| ) less than 0.1 eV are defined as the optimal active sites.

Here we utilize the number of the optimal active sites with |ΔGH| < 0.1 eV as the descriptor to evaluate the HER performance of nanoalloy catalysts. The total number of active sites counted in different ranges of |ΔGH| for pure Cu55, M55, and Cu55-nMn alloy clusters with various compositions are summarized and plotted in Fig. 5 and Supplementary Fig. 6. Compared to pure metal clusters, the amount of active sites with |ΔGH| < 0.1 eV on the Cu55-nMn alloy clusters is greatly increased, indicating the superior electrocatalytic HER performance. Among the four kinds of alloy clusters, the CuCo and CuNi alloy clusters are predicted to have more active sites with |ΔGH| < 0.1 eV, in contrast to CuRu and CuRh clusters. It is worthy to note that pure Ni clusters exhibit very poor HER activity, but the CuNi alloy clusters display the best catalytic efficiency. As we have discussed above, Ni-doped Cu55 clusters are the most stable alloy clusters; thus the CuNi alloy clusters not only possess the stable structures but also exhibit the best HER activity. By increasing the dopant concentrations, the HER performance of the Cu55-nMn alloy clusters varies. But it is interesting to note that, the HER catalytic efficiency has a maximum value for each type of Cu55-nMn alloy clusters, such as Cu46Co9 for CuCo alloy clusters and Cu49Ni6 for CuNi alloy clusters, respectively. With further increasing the dopant concentration, the amount of the active sites with |ΔGH| < 0.1 eV decreases, indicating a remarkable drop in HER performance.

Fig. 5: HER activities of alloy catalysts.
figure 5

Total number of active sites counted in different ranges of |ΔGH| values as a function of various compositions for (a) Cu55-nCon, (b) Cu55-nNin, (c) Cu55-nRun, and (d) Cu55-nRhn alloy clusters. The dark green, green, and light green colors indicate the total number of active sites with the |ΔGH| values in the range of [0.0, 0.1), [0.1, 0.2), [0.2, 0.3), respectively.

We now check the optimal active site on the surface of these alloy clusters. For the icosahedral clusters, we have totally 242 adsorption sites; among them, we have 42 on-top sites on V (vertex) or E (edge), 120 bridge sites on VE or EE, 80 three-fold sites on VEE or EEE. The H adsorption free energies between –0.1 eV and 0.1 eV are defined as the optimal values for HER. After examining carefully, it is found that the majority of the optimum sites with the H free energies between –0.1 eV and 0.1 eV are bridge sites of VE and three-fold sites of VEE on Cu atoms. The adsorbed H atom prefers to form a ring structure around the shell Cu atoms (see Fig. 6), which is remarkably different from the pure Cu clusters. The binding of H atoms on top of V or E sites are too weak with more positive ΔGH values. As an H atom is adsorbed on the bridge sites of EE, it will spontaneously move to the EEE or VEE sites. In addition, most of the optimal active sites are Cu sites instead of the dopant metal sites, which may account for why the HER activity starts to decrease as the dopants concentration further increasing. As the dopants concentration increases, the dopant metal atoms are gradually exposed on the cluster surface, and the H adsorption free energy on these dopant metal sites are mostly strong with the ΔGH values more negative. Among the alloy clusters that we have considered here, the core–shell Cu49Ni6 alloy cluster is recognized as the promising electrocatalyst for HER with high stability and the most abundant active sites. Some related experiments on CuNi nanoparticle as the high-efficient HER catalyst have been reported recently and proved our theoretical prediction14,15.

Fig. 6: Active sites for HER.
figure 6

The optimal adsorption sites for H on the surface of (a) pure Cu55, (b) Cu49Ni6, (c) Cu34Ni21, (d) Cu34Co21, (e) Cu34Ru21, and (f) Cu34Rh21 clusters with |ΔGH| < 0.1 eV.

Descriptor for the structure-activity relationship

To deeply understand the modulation mechanism of alloy metal clusters, Bader charge distributions of the Cu55-nMn clusters have been further calculated and discussed. Since the HER occurs on the cluster surface, we focus on the charge distributions of the shell metal atoms. The metal atoms in the pure Cu55 cluster with icosahedral structure possess two inequivalent sites on the surface, i.e., vertex site and edge site. For the pure Cu55 cluster, the charge of Cu atoms on the vertex site (VCu) is –0.05 e, and that of Cu atoms on the edge site (ECu) is +0.002 e; thus the outer shell of Cu55 are negatively charged. By doping transition metal atoms into Cu clusters, the charge distributions on the surface of alloy clusters have been changed accordingly. In the alloy clusters, the edge-Cu atoms become positively charged, and finally enlarging the charge difference between the adjacent ECu and VCu. Taking Cu49Ni6 cluster as an example, it is found that the charge values on ECu reach +0.03 e, while the on-site charges of VCu are around –0.07 e, inducing a substantial charge difference of 0.1 e between the edge-Cu and the adjacent vertical-Cu atoms. As discussed above, most of the optimal sites with high HER activity are the bridge sites of VE on the shell Cu atoms, so the H adsorption free energies should be associated with the charge values of two adjacent Cu atoms locating on the vertex site and edge site. Hence, we suggest the average charge difference between two adjacent ECu and VCu as a descriptor to probe the structure-activity relationship and screening for the HER electrocatalyst, which is defined by:

$$\begin{array}{*{20}{c}} {{{\Delta }}Q_{{\mathrm{Cu}} - {\mathrm{Cu}}} = \frac{{\mathop {\sum }\nolimits_i^n \left( {\mathop {\sum }\nolimits_j^m \left| {q_{{\mathrm{Cu}}_{{\mathrm{V}},i}} - q_{{\mathrm{Cu}}_{{\mathrm{E}},j}}} \right|} \right)}}{{N_{{\mathrm{VE}}}}}} \end{array}$$
(2)

where \(q_{{\mathrm{Cu}}_{{\mathrm{V}},i}}\) is the charge of ith Cu atom on vertex site, and \(q_{{\mathrm{Cu}}_{{\mathrm{E}},j}}\) is the charge of jth Cu atom located on shell next to the ith Cu atom. n is the number of Cu atoms on vertex site, and m is the number of the ith Cu atom’s neighboring Cu atoms on cluster shell. NVE is chosen as 60, which is calculated from the total number of Cu-Cu bonds between Cu atoms on vertex and edge sites (m·n) in Cu55.

The proposed descriptor is controlled by two factors: the first one is the amount of active metal atoms on the cluster surface (i.e., the surface Cu atoms in CuM clusters), and the second one is the charge difference between two adjacent Cu atoms on vertex and edge sites. We have calculated the ΔQCu-Cu values for the Cu55-nMn alloy clusters. The ΔQCu-Cu values on the Cu55-nMn alloy clusters vary with the increasing dopant concentration, and there is a maximum value for each type of Cu55-nMn alloy clusters, as shown in Fig. 7a. After reaching the maximum value, the ΔQCu-Cu values start decreasing, because more dopant metal atoms are distributed on the shell and less Cu–Cu bonds are preserved on the cluster surface. The variations of the charge difference ΔQCu-Cu values exhibit almost identical trends to the total number of active sites on the alloy clusters with |ΔGH| < 0.1 eV. For example, the Cu46Co9 and Cu49Ni6 clusters exhibit superior HER activities with relatively more active sites of |ΔGH| < 0.1 eV, and they also possess the larger ΔQCu-Cu values. To make it clear, we have calculated the relationship between the ΔQCu-Cu values and the number of active sites on the Cu55-nMn alloy clusters and found a nearly linear dependency between them, as displayed in Fig. 7b. As the charge differences between two adjacent surface Cu atoms become larger, the HER activity described by the number of optimal active sites almost linearly increases. Therefore, ΔQCu-Cu value could be identified as a descriptor to evaluate the HER activity. Compared to the traditional descriptor (e.g., H adsorption free energy) for evaluating the HER performance and confirming the optimal active sites, our proposed descriptor of charge difference is remarkably easier to compute, which greatly enhances the computational screening efficiency and facilitates the HER evaluations on the more complicated alloy nanoclusters.

Fig. 7: Descriptor to evaluate HER performance.
figure 7

a The average charge difference ΔQCu-Cu between two adjacent Cu atoms on the edge site and vertical site for Cu55-nMn (M = Co, Ni, Ru, and Rh) alloy clusters, b linear relationship between the number of active sites (|ΔGH| < 0.15 eV) and the average charge difference ΔQCu-Cu.

Machine-learning prediction

Machine-learning technique combined with the DFT calculations holds an immense potential for the acceleration of catalyst discovery23. As we discussed above, the charge distribution around the active site is closely related to the strength of the hydrogen adsorption, so here we introduce the machine-learning techniques based on our large data set of high-throughput DFT calculations, The neural network utilizes the fingerprints of the H adsorption site, and the features consist of the coordination number and the mean Bader charge of the first-neighbor and the second-neighbor atoms of the active site on the outer atomic shell. The H adsorption free energies for different sites on the Cu-based alloy nanoclusters are predicted with a subsequently increasing training set. A learning curve in Fig. 8a shows that about 700 adsorption free energies from DFT calculations (including 420 training data) is enough to achieve an ideal mean absolute error (MAE) of 0.10 eV from our features. We obtain a similar machine-learning precision to a recent machine-learning work with an accuracy of 0.11 eV MAE by using the other features24. Based on this, we use 3388 adsorption free energies to predict the H adsorption free energy on nanoclusters with a MAE error of 0.07 eV and a root-mean-square error (RMSE) of 0.11 eV on the test set, respectively, as shown in Fig. 8b. Our input properties for machine-learning models are ideally available without heavy computation from the H adsorption relaxation. Therefore, the charge of atoms around the adsorption site is closely correlated with the H adsorption free energies, which could be a useful descriptor to estimate the HER activity and screening for the high-efficient HER electrocatalysts.

Fig. 8: Machine-learning prediction.
figure 8

a Learning curve of the neural network. The errors are averaged over 25 randomized runs and the error bars indicate the 95% confidence interval; b parity plots between predicted and DFT-calculated ΔGH values.

In summary, we have performed high-throughput DFT calculations to screening for the efficient HER electrocatalysts on the alloy nanoclusters. The following important conclusions have been addressed. First, considering different structural patterns, the core–shell structures with M-core and Cu-shell are energetically preferred for Cu-based alloy clusters of Cu55-nMn (M = Co, Ni, Ru, and Rh), and CuNi is the most stable alloy clusters among these four kinds of alloy clusters. Second, the total number of the active sites with |ΔGH| < 0.1 eV is utilized as a descriptor to evaluate the HER performance of the alloy catalysts, which could be efficiently modulated by the dopant concentration. Among all the explored alloy clusters, Cu49Ni6 exhibits the best HER activity, which is significantly improved in comparison with the pure Cu55. Third, we propose a descriptor to screen for the HER electrocatalysts by combining DFT calculations and machine-learning techniques. Finally, our high-throughput screening strategy could be widely applicable to the other alloy nanoclusters, and also extend to the other electrocatalytic reactions.

Methods

DFT calculations

The high-throughput computations were performed using DFT as implemented in the Vienna Ab-initio Simulation Package (VASP)25,26. The projector augmented wave (PAW) method was used to describe the ion-electron interactions27,28. The valence states of the transition metals are considered as 3d104p1 for Cu atom, 3d84s1 for Co atom, 3d94s1 for Ni, 4d75s1 for Ru, and 4d85s1 for Rh, respectively. The Perdew–Burke–Ernzerh (PBE) functional within the generalized gradient approximation (GGA) was used to describe the exchange-correlation interactions29,30, and the D3 Grimme’s scheme was adopted to account for London dispersion interactions31. Each cluster was placed in a cubic box of 20 Å × 20 Å × 20 Å to minimize the interaction from periodical images. The Γ-point was adopted for sampling the Brillouin zone of simulation supercell throughout the calculations. The kinetic energy cutoff for plane wave was set to 350 eV. All the geometries were fully optimized until the maximal components of force on each atom converged to less than 5 × 10−2 eV Å−1.

To compare the relative energies among different compositions of clusters, the excess energy is defined by9:

$$\begin{array}{*{20}{c}} {E_{{\mathrm{exc}}}^ \ast = \frac{{E_{{\mathrm{Cu}}_{55 - n}{\mathrm{M}}_n} - \left( {55 - n} \right)E_{{\mathrm{Cu}}} - nE_{\mathrm{M}}}}{{55}}} \end{array}$$
(3)

where \(E_{{\mathrm{Cu}}_{55 - n}{\mathrm{M}}_n}\) is the total energy of Cu55-nMn cluster, ECu and EM are the total energies of the Cu55 cluster and M55 cluster (M = Co, Ni, Ru, and Rh), respectively.

The adsorption free energy for hydrogen is defined as:

$$\begin{array}{*{20}{c}} {{{\Delta }}G_{\mathrm{H}} = {{\Delta }}E_{\mathrm{H}} + {{\Delta }}E_{{\mathrm{ZPE}}} - T\Delta S} \end{array}$$
(4)

where the ΔEH is the adsorption energy of H species on the cluster surface from DFT calculations. The ΔEZPE and ΔS are the difference for the zero-point energy and entropy between the adsorbed H and half of H2 molecule in the gas phase, respectively. The zero-point energy and entropy are calculated at the standard conditions corresponding to the pressure of 101325 Pa (~1 bar) of H2 at the temperature of 298.15 K. The coverage of H adsorbate is difficult to consider for the cluster structures due to more expensive calculations; but for the similar systems (the Cu-based alloy clusters in our work possess the same structure), the H adsorption free energy could provide a reasonable trend under the assumption of same coverage.

The machine-learning protocol

The neural network consists of one input layer, three hidden layers and one output layer. The number of hidden layer neurons are 32, 32, and 64. The Rectified Linear Unit activation function32 was chosen for each hidden layer. The overfitting issue was prevented using the L2 regularization33. The neural network is subjected to a supervised training scheme by using a backpropagation algorithm34 implemented in the TensorFlow framework35. All the input features are normalized to avoid the use of raw variables with different range of values. The data is transformed with \(x^{\prime} = \frac{{x_i - \mu }}{\sigma }\), where x′, xi, μ, and σ represent the normalized data, input data, mean and sample standard deviation values of the input data, respectively. The Adam optimizer36 is selected with the learning rate of 0.001. Prediction errors are evaluated by mean absolute errors, and the cross-validation technique37 is applied to estimate the accuracy and robustness of the neural network training. The DFT results of hydrogen adsorption are divided into 60%, 20%, and 20% for training, validation, and test sets, respectively.