Introduction

Partial differential equations (PDEs) can model complex physical processes, ranging from fluid dynamics, and aerospace engineering to material science1,2,3. Nevertheless, only specific types of PDEs can be analytically solved, while most require numerical approximations such as finite difference or finite element methods. Numerical solvers, as the aforementioned ones, have many qualities such as error estimation for the solution, stability for long-time simulations, and generalization across different boundary/ initial conditions or geometries4,5. Nevertheless, these numerical approaches are extremely costly regarding computational resources, requiring significant time to obtain simulation results. This is even worse in parametric contexts, where a new simulation must be performed for each set of new parameters, making unfeasible real-time computations. To overcome this barrier, reduced order models (ROMs) have become an emerging field in computational sciences, fulfilling the demand for efficient computational tools for performing real-time simulations6,7,8,9,10. In particular, data-driven ROMs have grasped increasing attention from the scientific community due to their ability to generalize across multiple system configurations using only a finite set of data. This modeling approach is “agnostic” in the sense that the equations describing the phenomenon are not required, rendering the methodology versatile even when data originate from real-world observations.

ROMs for parametric PDEs are usually composed of two steps: dimensionality reduction, and manifold interpolation. In the dimensionality reduction phase, a low-dimensional space is employed to optimally represent the problem at hand; while the manifold interpolation procedure is used for mapping the parameter space to the mode of variation coefficients space. One of the most used techniques for the ROM’s (linear) dimensionality reduction phase is the proper orthogonal decomposition (POD)11, while radial basis functions (RBFs)12 are usually used as a standard interpolation procedure. Data-driven ROM using linear dimensionality reduction like POD has the major drawback that exhibits limitations for non-linear dynamics13.

Deep learning (DL) algorithms have recently emerged as new approaches for performing non-linear ROM using autoencoders14,15,16,17. These architectures consist of two neural networks, namely encoder and decoder. The encoder network is used for compressing the data onto a lower-dimensional manifold, while the decoder performs the opposite transformation. The final objective is to minimize the reconstruction error between the encoded-decoded data and the original one, quantified by a specific norm. For interpolating the reduced manifold, the RBF procedure can be used, or neural networks, such as long short-term memory (LSTM) networks for temporal predictions18,19, or feed-forward neural networks in parametric problems20 can be employed.

Regardless of the great advancement achieved in generalizability by neural networks21,22,23,24,25, only discriminative models have been extensively investigated. Discriminative models, such as regression techniques, are machine learning models optimized for learning the conditional probability of obtaining an output given a specific input. In other words, they focus on learning decision boundaries that separate (discriminative part) the data during training. During inference, the new data point is mapped to the respective boundary, depending on a specific distance metric26. Nevertheless, these models lack a semantic understanding of the phenomenology characterizing the data27, and quantifying the uncertainty may be challenging. On the other hand, generative modeling is used to model the dataset probability distribution using a probabilistic approach. Indeed, generative models aim to learn how to generate new data with the same statistics as the underlying dataset distribution, which forces the network to learn specific patterns in the data. Latent variable models28 are an example of generative models that have been successfully applied in many fields, see Refs.29,30,31,32 as far from an exhaustive list of possible applications.

Recently, latent variable model approaches, specifically variational autoencoder (VAE) models33, have been applied for reduced order modeling. In particular, in Ref.34 a ROM for fluid flow predictions is presented. The model uses a VAE architecture for dimensionality reduction to produce near-orthogonal representations as in Ref.35, while the interpolation is done using a transformer networks36. VAEs represent a powerful subclass of latent variable models, namely prescribed models. The (parametric) distributions defining the probabilistic model must be chosen upfront in prescribed models. Nevertheless, the quality of the model can be affected if too simplistic distributions are used27,37.

The implicit models — different from the above-mentioned ones — distribution. In particular in GANs, the objective is achieved by implicitly modelling the true unknown probability distribution through a generator network trained by adversarial learning.

The usage of GANs for solving PDEs is an emerging research line, and several works have already addressed the problem. Specifically, in Ref.38 a GAN is used to learn solutions across multiple initial conditions for a specific PDE, by combining the GAN loss with a physics-informed39 one. Other examples of physics-informed GAN are the works in Ref.40,41,42. The first one40, trained by minimizing the classical GAN loss, injects physical information during training by augmenting the generator and discriminator with a physics consistency score, assessing how physical the produced generator field is. The second one41, on the other hand, uses the GAN architecture to learn the loss for a physics-informed neural network39, i.e. the training is unsupervised. Finally, the third one42 uses a variation of the methodology in Ref.40 to learn stochastic PDEs. In contrast from the models above where the methodology is limited to learning solutions for a fixed PDE, the works in Refs.43,44 introduce a way to condition the generative process for parametric and time-dependent PDEs. A common point of the mentioned methodologies is the fact that vanilla GANs, as introduced in Ref.45, are used in training with the addition of physical information. However, in Ref.46 the authors show that vanilla GANs, i.e. without any physical information, obtain in general higher errors compared to discriminative models. Since the addition of physical loss to the GAN loss can always be done, we seek in the article to introduce a methodology for solving parametric PDEs that can compete against discriminative models by leveraging only the GAN loss. In addition, differently from Refs.43,44 we want the generator to not be deterministic, i.e. we want to introduce stochasticity in the network input in order to perform uncertainty quantification as done in Refs.47,48.

In this work, we present GAROM, a generative adversarial reduce order model. GAROM is a novel reduced order model framework, based on a variation of conditional boundary equilibrium GAN49. In particular, GAROM uses adversarial learning to learn the distribution over high-fidelity data, conditioned on domain-specific conditioning. At convergence, the GAROM network is able to generate high-fidelity data given domain-specific conditioning, e.g. PDE parameters, or temporal time steps. We also present a regularized version of the methodology, r-GAROM, providing extra information on the generative process as explained in Sect. Methods. The frameworks are tested in a variety of problems, comparing the prediction ability to state-of-the-art ROM methods, and analyzing the convergence robustness. Empirical results on the network’s ability to generalize to unseen data are provided, and different uncertainty quantification strategies based on statistics are presented. Finally, we provide further possible research directions to investigate this novel methodology.

Results

In this section, the performances of GAROM and its improved variant r-GAROM are presented. To estimate the accuracy of the presented methodology, we use the mean \(l_2\) relative error \(\epsilon\) between the high-fidelity solution \({\textbf{u}}({\textbf{c}}) \in {\mathbb {R}}^{N_u}\), and the generated one \(\hat{{\textbf{u}}}({\textbf{c}}) \in {\mathbb {R}}^{N_u}\), for the parameter \({\textbf{c}} \in {\mathbb {R}}^{N_c}\). It is formally defined as:

$$\begin{aligned} \varepsilon = \frac{1}{N}\sum _{i=1}^{N}\frac{\Vert {\textbf{u}}({\textbf{c}}_i) - \hat{{\textbf{u}}}({\textbf{c}}_i)\Vert ^2_2}{\Vert \textbf{u}({\textbf{c}}_i)\Vert ^2_2}, \end{aligned}$$
(1)

with N the number of testing parameters used for the metric evaluation.

All of our experiments are performed on three different test cases, representing benchmark problems in parametric contexts. Aiming at increasing complexity, we take into account a Gaussian signal moving inside a square (algebraic), the Graetz-Poiseuille problem (linear PDE), and the Lid Cavity problem (nonlinear PDE). The datasets for the training are therefore composed of the (parameter-dependent) high-fidelity solutions, which are obtained by solving such problems with the consolidated numerical solver, as exhaustively explained in section "Datasets description". Moreover, for all the test cases, different latent dimensions for the discriminator network are considered. The other architectural details for the models at hand are fully described in section "Training setup and model architecture", together with the optimizer settings.

Model inference

In the first experiment, we test the GAROM capability to predict the solution for a new parameter, aka conditioning variable. A visualization of the r-GAROM prediction for a testing parameter, and its \(l_2\) point-wise error is reported in Fig. 1. The spatial distribution of the prediction for all the test cases is almost identical to the original one, as proved also by the low error. The variance reaches low values in all the problems, demonstrating that the model has learned to properly generate the solution field. The spatial distributions of variance and error do not show similarities, but the former identifies the spatial region where the network is more uncertain (we highlight that the plots present the variance for a single test parameter), which can be potentially used to adapt the training procedure. A similar analysis like the one in Fig. 1 has been replicated also for the GAROM model. The unregularized model results are barely indistinguishable to the eye with respect to the regularized counterpart, so we have not reported them.

Figure 1
figure 1

r-GAROM inference results. The images show the generated snapshot representing the magnitude of the unknown field for a testing parameter using a latent dimension of 64, compared to the corresponding high-fidelity solution. Top: Gaussian dataset. Center: Graetz dataset. Bottom: Lid cavity dataset.

Looking at a more precise measure for the GAROM accuracy, we calculate the mean \(l_2\) relative error over a set of training and testing parameters, comparing it with the error obtained using the ROM baseline models: POD–RBF, POD–NN, AE–RBF, and AE–NN; the generative modelling baselines: cGAN, and r-cGAN; and the Operator Learning baselines: DeepONet, and NOMAD. As we can note from Table 1, for the majority of the experiments POD–RBF is the best across all models on the training dataset (gray rows), but it fails to properly generalize to test data. Indeed, the error computed on the test data exhibits a completely different trend with respect to the one computed over the training snapshots, especially in the Graetz case, showing a poor ability to correctly generalize. On the contrary, ML-based methodologies exhibit comparable train and test errors, due to their ability to learn complex nonlinear relationships by the data. For the test error, which is used to assess the model prediction ability, the r-GAROM model obtains very promising results for almost half of the test cases. Across the different deep learning models, r-GAROM obtains lower error for six out of nine tests, highlighting its encouraging precision even in this preliminary contribution. It is also important to highlight that the (r-)GAROM architecture is fairly simple in all the numerical tests here pursued: both generator and discriminator are composed only of a few dense layers (details in section "Training setup and model architecture"). We suppose the accuracy can be further improved by using more powerful deep learning architectures, as in Refs.15,50,51, at the cost of longer and more expensive training.

Table 1 Accuracy comparison for different test cases, methods, and latent dimensions.

Compared to the vanilla (r-)cGAN, the (r-)GAROM methodology obtains lower error in both training and testing, with the difference in accuracy especially evidenced in the most challenging tests, i.e. the Graetz and Lid Cavity. Furthermore, the predictive standard deviation for the (r-)GAROM model is much lower than the (r-)cGAN model, empirically showing that the GAROM model is less uncertain (see uq for more details). Finally, the r-GAROM error seems to be almost independent of the latent dimension, since only a small variation of the errors is observed in training and testing for the various dimensions. In such a way, we can empirically prove the better capability to reduce the original dimension of the snapshots, since adding new reduced dimensions does not affect the final accuracy.

Model variance

Figure 1 reports the (r-)GAROM variance obtained by a Monte Carlo approximation with different random inputs for the generator. Contrary to the ROM baseline models, GAROM can provide also an estimate of the model variance, obtained by a probabilistic ensemble (see section "GAROM predictive distribution"). Since the solution is unique by assumption, we expect a zero variance for the generated samples given the conditioning variables. Hence, a high variance in the GAROM estimate gives an uncertain prediction of the model. This is confirmed by observing the variances for different models in Table 1, and the point-wise variance in Fig. 1. Both GAROM and r-GAROM models report a small variance, with training variance always one or two orders of magnitude smaller than the testing one. This indicates that GAROM models are more uncertain in predicting testing data than training ones. This is expected since the GAROM model builds a distribution on the training data. However, it is worth noticing that the \(l_2\) errors between training and testing in Table 1 are almost always identical across different dimensions, showing good generalizability of the model, as also reported in section "Generalization and robustness". It is evident from Table 1 that r-GAROM has a lower variance than GAROM for testing and training. This is mainly due to the regularizer term, which forces the network to converge faster, as explained in the section "Methods".

The variance estimate, although very important for the aforementioned reasons, does not give a quantification of the predictive uncertainty, but only statistical information of the learned distribution. Nevertheless, due to the probabilistic framework adopted, we can obtain bounds in probability for the reconstruction error by using the variance estimate. Indeed, as deeply explained in section "Uncertainty quantification strategies", with the variance information provided by the GAROM framework, it is possible to obtain prediction uncertainty using the Markov inequality, or the confidence interval theory. While extremely important, in this work, we just show how to obtain these näive bounds. Exploring further to obtain tighter bounds, or employ the uncertainty quantification strategies for noisy data are possible new interesting research topics, which we will explore in the future.

Generalization and robustness

One fundamental aspect, especially in the context of data-driven modelling, is the ability to correctly generalize across new instances of the conditioning domain, which of course were not inside the training dataset. To empirically prove the GAROM ability to generalize over the testing parameters, we define the difference \(\delta ({\textbf{c}})\) as the mean difference between the truth parametric solution and the corresponding GAROM prediction, such that:

$$\begin{aligned} \delta ({\textbf{c}}) = \frac{1}{N_u}\sum _{i=1}^{N_u}(u_i({\textbf{c}}) - {\hat{u}}_i({\textbf{c}})), \end{aligned}$$
(2)

where \({\textbf{u}}({\textbf{c}}) = \begin{bmatrix} u_1({\textbf{c}})&\dots&u_{N_u}({\textbf{c}}) \end{bmatrix}\) and \(\hat{{\textbf{u}}}({\textbf{c}}) = \begin{bmatrix} {\hat{u}}_1({\textbf{c}})&\dots&{\hat{u}}_{N_u}({\textbf{c}}) \end{bmatrix}\). We highlight that we do not consider the absolute value here in order to detect possible systematic over- and under-estimations of the proposed model.

Figure 2
figure 2

Distribution of the \(\delta\) difference. The graph depicts, for each latent dimension, train (red) and test (blue) distribution of the \(\delta\). Top: r-GAROM model. Bottom: GAROM model.

Figure 2 shows the distribution of the \(\delta\) different for all latent dimensions and for all test cases. We can note that the distribution of the error computed over the training parameters is quite similar to the error distribution for the testing parameters in all the numerical investigations. Such behaviour indicates that both models can correctly generalize outside the training dataset, for the proposed problems. Furthermore, the distributions are centered around zero for both models, indicating a good high-fidelity solution reconstruction, as also shown in the previous Section. Finally, it is worth noticing that r-GAROM distributions are more close to zero than the unregularized counterpart, as a consequence of the regularizer term addition.

Figure 3
figure 3

r-GAROM convergence graph in \(l_2\) relative error for multiple training. The solid line indicates the average across all simulations. The shaded area represents the interval obtained by taking the maximum and minimum error across all simulations. The total number of training is 5.

The last investigation regards the robustness of the training. It is well established in the deep learning community that generative adversarial networks tend to be unstable during training45. Nevertheless, such behaviour is drastically reduced with BEGAN49,52. In order to assess the convergence of the optimization procedure, we perform a statistical analysis studying the error trend during the training, using different random initialization for the weights of the network. Practically, 5 training loops are carried out changing the seed for the random number generator, and using Xavier weight initialization. Figure 3 depicts the mean error and the interval between the minimum and the maximum error across the 5 simulations. The plot indicates a stable convergence of r-GAROM, with sporadic perturbations in the statistical campaign. The only test case” that is showing not a stable trend is the Gaussian one, but only when the latent dimension is 4: we suppose the latent dimension is not sufficiently big to represent the original problem, leading to an aleatory optimization. However, the \(l_2\) relative error for the maximum discrepancy is in the same order of magnitude as the average result, showing that the model has still reached a good performance.

Discussion

In this study, we proposed GAROM, a generative adversarial reduced order modeling approach to tackle parametric PDE problems. The presented methodology is also extended to a regularized version (r-GAROM) in case the solution for the PDE is unique. The methodology is general, and applicable to a variety of ROM problems. The GAROM and r-GAROM methods are tested on three parametric benchmarks for different latent dimensions, showing very promising results in data-driven modeling. Among the deep learning approaches for ROM, r-GAROM outperforms in terms of accuracy in model prediction all of them in seven out of nine tests. Moreover, this approach seems to be more resilient to overfitting. With respect to POD, it emerges in these examples the POD models have a high difference between train and test errors when the POD is not able to efficiently capture the underlying behavior represented by the snapshot — e.g. due to nonlinearity. Differently, GAROM and its variation do not suffer from poor generalization, resulting in a more reliable model, despite the lack of precision in some tests. Surprisingly, it also shows any dependence between the \(l_2\) relative mean error and the latent dimension we use in the discriminator, making it less sensitive to this parameter. It must be repeated that, with the GAROM approach, we are not projecting data on a reduced space, but the inner dimension is only used in the discriminator autoencoder. The adversarial approach combined with the proposed regularisation seems also helpful to stabilize the training in terms of loss trend, making such an approach practically less difficult to tune the hyper-parameters. Finally, the statistical approach allows the computation of the variance learned by the generator, making it possible to quantify the uncertainty of predictions applying different statistical strategies.

We identify multiple exciting research directions for the methodology. Firstly, improving the quality of the neural networks for the generator, discriminator, and conditioning mechanisms, and applying the methodology to more complex problems. As already stated, in the work we have used very simple architectures, which limits the generation capability. As an example, the framework could be easily extended to time-dependent problems, employing LSTM53 or transformer36 networks for the conditioning mechanism. Alternatively, as also suggested in Ref.52, employing different architecture for the discriminator, e.g. U-net54, could lead to accuracy improvement. Another possibility is to enhance the conditioning set with more information, e.g. by passing POD modes, or mesh information. Another very interesting extension of this work would be the continuous extension of the entire architecture. With the proposed architecture, both the networks’ dimension scales with the number of components in the snapshots vector. To apply such an approach to very fine meshes, the computational requirement to train and infer the model would be huge and sometimes unfeasible. In these cases, a spatial continuous extension can lead to a wider diffusion, especially in more complex problems. Another possibility to mitigate the difficult application to finer meshes (and so snapshots with more degrees of freedom) is the employment of the architecture presented in55 where the authors employ continuous convolutional neural networks over unstructured meshes. Finally, extending to a continuous setting might allow the use of our methodology to learn mappings between functional spaces in a discretization invariant manner, as done with Neural Operators56,57.

In summary, the presented new methodology, combined with the different new research directions, paves the way to the application of GAROM in very complex settings, exhibiting great potential for future ROM developments in computational science.

Methods

Conditional boundary equilibrium GAN

Boundary equilibrium generative adversarial network (BEGAN)52 is an autoencoder based generative adversarial network. The BEGAN model consists of a generator network \(G: {\mathbb {R}}^{N_z} \rightarrow {\mathbb {R}}^{N_u}\), generating a domain-specific sample in \({\mathbb {R}}^{N_u}\) by passing random noise \({\textbf{z}}\in {\mathbb {R}}^{N_z}\) following \(p({\textbf{z}})\) distribution; and a discriminator network \(D: {\mathbb {R}}^{N_u} \rightarrow {\mathbb {R}}^{N_u}\), which encodes and decodes real and generated samples. The core idea of BEGAN is utilizing the autoencoder reconstruction loss distribution derived from the Wasserstein distance58 to approximate the data distribution. Specifically, the optimization is done with respect to the Wasserstein distance between the autoencoder reconstruction losses of real and generated samples. In addition, to prevent the imbalance of G and D, an equilibrium term is added in the objective function.

Formally, let the autoencoder reconstruction loss \({\mathcal {L}}: {\mathbb {R}}^{N_u} \rightarrow {\mathbb {R}}^{+}\) defined as:

$$\begin{aligned} {\mathcal {L}}({\textbf{u}}) = | {\textbf{u}} - D({\textbf{u}}) | \quad \text {where } {\textbf{u}} \in {\mathbb {R}}^{N_u}, \end{aligned}$$
(3)

with \({\mathbb {R}}^{N_u}\) the space of real data samples. Thus, the BEGAN objective is:

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathcal {L}}_D = {\mathcal {L}}({\textbf{u}}) - k_t\, {\mathcal {L}}(G({\textbf{z}})) &{}\quad \text {minimize for }\theta _D \\ {\mathcal {L}}_G = {\mathcal {L}}(G({\textbf{z}})) &{}\quad \text {minimize for }\theta _G \\ k_{t+1} = k_t + \lambda \, (\gamma {\mathcal {L}}({\textbf{u}}) - {\mathcal {L}}(G({\textbf{z}}))) &{}\quad \text {for each training step { t}}, \end{array}\right. } \end{aligned}$$
(4)

with \(\theta _G\) and \(\theta _D\) the network parameters for the generator and discriminator, \(k_t\) a control term allowing the losses balance at each step t, and \(\lambda\) the learning rate for \(k_t\). Finally, \(\gamma \in [0, 1]\) is the diversity ratio defined as the ratio between the generated and real data reconstruction loss expected value:

$$\begin{aligned} \gamma = \frac{{\mathbb {E}}\left[ {\mathcal {L}}(G({\textbf{z}}))\right] }{{\mathbb {E}}\left[ {\mathcal {L}}({\textbf{u}})\right] }. \end{aligned}$$
(5)

Therefore, the discriminator has two main goals: encoding and decoding real data, and discriminating real from generated data. The \(\gamma\) ratio is used to balance the two objectives. In fact, for lower values of \(\gamma\), the discriminator focuses more heavily on autoencoding real data, leading to smaller data diversity. BEGAN has been proved to be effective for generating high resolution images, but it lacks to specifically condition the network.

Conditional BEGAN (cBEGAN)49 has been introduced to overcome this issue. The cBEGAN objective is very similar to the one presented in Eq. (4), but the generator and discriminator are conditioned on conditioning variables \({\textbf{c}}\in {\mathbb {R}}^{N_c}\). Hence, the cBEGAN objective becomes:

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathcal {L}}_D = {\mathcal {L}}({\textbf{u}} \mid {\textbf{c}}) - k_t\, {\mathcal {L}}(G({\textbf{z}} \mid {\textbf{c}})) &{}\quad \text {minimize for }\theta _D, \\ {\mathcal {L}}_G = {\mathcal {L}}(G({\textbf{z}} \mid {\textbf{c}})) &{}\quad \text {minimize for }\theta _G, \\ k_{t+1} = k_t + \lambda \, (\gamma {\mathcal {L}}({\textbf{u}} \mid {\textbf{c}}) - {\mathcal {L}}(G({\textbf{z}}\mid {\textbf{c}}))) &{}\quad \text {for each training step} t, \end{array}\right. } \end{aligned}$$
(6)

with the autoencoder reconstruction loss:

$$\begin{aligned} {\mathcal {L}}({\textbf{u}} \mid {\textbf{c}}) = | {\textbf{u}} - D({\textbf{u}} \mid {\textbf{c}}) |. \end{aligned}$$
(7)

In practice, the generator is conditioned by concatenating the random noise vector \({\textbf{z}}\) with the conditioning variable \({\textbf{c}}\). The concatenation of the two represents the input for G. On the other hand, the discriminator is conditioned by concatenating the encoder output with the conditioning variable, before passing the concatenation to the decoder.

GAROM implementation details

The final objective of GAROM is to obtain a unique neural network which can perform ROM, while maintaining high reconstruction accuracy and generalization. We follow a data-driven approach for ROM, where a sample of high fidelity results from a domain-specific simulation is collected and used for training. Let \({\textbf{u}} \in {\mathbb {R}}^{N_u}\) indicating the output of the real simulation, and \({\textbf{c}} \in {\mathbb {R}}^{N_c}\) the variable affecting the simulation, e.g. parameters, time steps or boundary conditions. We name the variables \({\textbf{c}}\) as the conditioning variables, represented in this work by the free simulation parameters, as explained in section "Datasets description".

GAROM framework is based on an implicit generative modeling approach, specifically using cBEGAN. Indeed, cBEGAN allows the discriminator to learn a latent space for the real data manifold by autoencoding, forcing the generator to learn fundamental patterns of the data. Furthermore, having defined the discriminator as an autoencoder allows to impose constrains on the latent space, e.g. orthogonality, or to change the architecture to make it more robust, e.g. utilizing denoising59 or contractive60 regularization. These adjustments would not be possible with a vanilla GAN45, where the discriminator outputs a real number in [0, 1] only.

Formally, GAROM aims to approximate the density of the high fidelity solution data given the conditioning variables. In the present work the uniqueness of the high fidelity solutions is assumed, thus for a given parameter \({\textbf{c}}\) a unique solution \({\textbf{u}}\) is obtained. Following the cBEGAN framework, the GAROM objective is defined as:

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathcal {L}}_D = {\mathcal {L}}({\textbf{u}} \mid {\textbf{c}}) - k_t\, {\mathcal {L}}(G({\textbf{z}} \mid {\textbf{c}})) &{}\quad \text {minimize for }\theta _D \\ {\mathcal {L}}_G = {\mathcal {L}}(G({\textbf{z}} \mid {\textbf{c}})) + \eta \, |{\textbf{u}} - G({\textbf{z}} \mid {\textbf{c}})| &{}\quad \text {minimize for }\theta _G \\ k_{t+1} = k_t + \lambda \, (\gamma {\mathcal {L}}({\textbf{u}} \mid {\textbf{c}}) - {\mathcal {L}}(G({\textbf{z}}\mid {\textbf{c}}))) &{}\quad \text {for each training ste0} t, \end{array}\right. } \end{aligned}$$
(8)

where \(\eta \in \{ 0,1 \}\) ensure the absence or presence of the regularization parameter. Indeed, when \(\eta =0\) the GAROM objective is the same as the cBEGAN objective, which forces the generator to learn a distribution without any information of the uniqueness of the solution. On the contrary, setting \(\eta =1\) forces the generator to learn a unique distribution, which results in a shrinkage of the generator’s variance, as well as better accuracy and generalization (see Section "Model variance"). We refer to this model as the regularized generative adversarial reduced order model, r-GAROM. We want to highlight that, on the contrary to many (classical or deep learning based) ROM frameworks, we do not project onto a lower dimensional space and, once the latent representation is learnt, train an interpolator. Instead, no interpolation is done since the conditioning is passed directly to the generator and discriminator. Hence, we perform only one neural network training with our methodology.

Figure 4
figure 4

A schematic representation for GAROM generator and discriminator. The Generator input is the concatenation of random noise \({\textbf{z}}\), and the conditioning representation \(f_\tau ({\textbf{c}})\). The Discriminator encodes the input obtaining a latent vector, which is concatenated with the conditioning representation \(g_\psi ({\textbf{c}})\) before it is passed to the decoder.

The (r-)GAROM architecture is depicted schematically in Fig. 4. Notice that the conditioning variables (the PDE parameters) are not concatenated directly, as done in cBEGAN, but passed to a domain-specific decoder for the generator \(f_\tau\), and discriminator \(g_\psi\). This is done in order to pass more information to the generator similarly to61, and to avoid a high mismatch in the dimension of \({\textbf{z}}\) and \({\textbf{c}}\). Finally, it is worth noticing that in general different conditioning variables can be applied, e.g. parameters, POD modes or coordinates information. As a consequence, a specific network is needed to encode the information into one high-dimensional vector. In the present work, only parametric problems are considered, and we leave for future works the investigation of highly effective conditioning mechanisms.

GAROM predictive distribution

Once the (r-)GAROM model is trained, a probabilistic ensemble for the output variable \({\textbf{u}}({\textbf{c}})\) can be constructed leveraging \(G({\textbf{z}}\mid {\textbf{c}})\). Following a similar approach to62, we can predict the mean prediction \(\hat{{\textbf{u}}}\) and its variance \(\hat{\varvec{\sigma }}^2\), for a new conditioning variable \({\textbf{c}}^*\), by Monte Carlo sampling:

$$\begin{aligned} \begin{aligned} \hat{{\textbf{u}}}&= {\mathbb {E}}_{p_G}[{\textbf{u}} \mid {\textbf{z}}, {\textbf{c}}^*] \approx \frac{1}{K} \sum _{k=1}^K G({\textbf{z}}_k \mid {\textbf{c}}^*), \\ \hat{\varvec{\sigma }}^2&= {\mathbb {V}}\text {ar}_{p_G}[{\textbf{u}} \mid {\textbf{z}}, {\textbf{c}}^*] \approx \frac{1}{K-1} \sum _{k=1}^K [\hat{{\textbf{u}}} - G({\textbf{z}}_k \mid {\textbf{c}}^*)]^2. \end{aligned} \end{aligned}$$
(9)

In particular, \(p_G\) is the implicit distribution learned by the generator via adversarial optimization, K is the number of Monte Carlo samples used for approximating the distribution moments, and \({\textbf{z}}_k\) are the samples from the latent distribution \(p({\textbf{z}})\).

Uncertainty quantification strategies

Quantifying the uncertainty of a prediction is a common problem for data-driven reduced order models63. Exploiting the predictive distribution of (r-)GAROM one can obtain moment estimates by Monte-Carlo sampling, as discussed in section "GAROM predictive distribution". These estimates can be used during inference to compute bounds in probability of the prediction error, i.e. the difference between real and average (r-)GAROM solution. As an example, employing the Markov inequality we can find the following bound for any threshold \(a>0\):

$$\begin{aligned} \text {Prob}[({\textbf{u}}_{\text{true}}({\textbf{c}})-\hat{{\textbf{u}}}({\textbf{c}}))^2 \ge a] \le \frac{1}{a} \left\{ {\mathbb {E}}_{\text{Prob}} \left[ {\mathbb {E}}_{p_G}[({\textbf{u}}_{\text{true}}({\textbf{c}})-{\textbf{u}}({\textbf{c}}))^2] - \hat{\varvec{\sigma }}^2({\textbf{c}}) \right] \right\} , \end{aligned}$$
(10)

where Prob is the probability distribution for \({\textbf{c}}\), e.g. random uniform. In practice, the right hand side of Eq. (10) can be estimated, with sufficient statistics, after training by Monte Carlo approximation on the training dataset, resulting in an estimate in probability for the average prediction.

A more tight bound for each parameter \({\textbf{c}}\), can be obtained assuming an approximation of the distribution \(p_G\), and applying the theory of confidence interval. As an example, assuming \(p_G\) is well approximated using a normal distribution \({\mathcal {N}}(\hat{{\textbf{u}}}, \hat{\varvec{\sigma }}^2)\), one could estimate the prediction uncertainty in a similar approach to the one adopted by64. While the first bound based on Markov inequality is certain, the latter one uses an approximation of the distribution. Nonetheless, as also evidenced in64, these kinds of approximations (problem-specific) can obtain very good estimates.

Datasets description

This section is dedicated to introduce the numerical benchmarks applied for testing GAROM and comparing it to the baseline models presented in section "Baseline models". For notation consistency, \({\textbf{c}}\) indicates the vector of free parameters while \({\textbf{u}}\) the simulation solution.

Parametric Gaussian

The first test case is a simple problem representing a Gaussian function moving in a domain \(\Omega = [-1, 1] \times [-1, 1]\). The high fidelity function \(u: \Omega \rightarrow {\mathbb {R}}\) is defined as:

$$\begin{aligned} u(x, y) = e^{-[(x - c_1)^2 + (y - c_2)^2]}, \end{aligned}$$
(11)

with \({\textbf{c}} = [c_1, c_2]\) the centers of the Gaussian. Practically, u is evaluated on 900 points uniformly randomly chosen in \(\Omega\) for a fixed \({\textbf{c}}\). The evaluation results are flattened (row-major order) in a one-dimensional vector of size 900, namely \({\textbf{u}}\). The dataset is composed by \(N=400\) instances \({\textbf{c}}_i\) uniformly randomly chosen in \(\Omega\), resulting in N high fidelity solutions \({\textbf{u}}_i\).

Graetz problem

The second test case deals with the Graetz-Poiseuille problem, which models forced heat convection in a channel. The problem is characterized by two parameters: \(c_1\) controlling the length of the domain, and \(c_2\) the Péclet number, which takes into account the heat transfer in the domain. The full domain is \(\Omega (c_1) = [0, 1 + c_1]\times [0, 1]\), with \({\textbf{c}}=[c_1, c_2] \in [0.1, 10] \times [0.01, 10]\). The high fidelity solution \(u : \Omega (c_1) \rightarrow {\mathbb {R}}\) is obtained by solving:

$$\begin{aligned} {\left\{ \begin{array}{ll} -\Delta u(x, y; {\textbf{c}}) + c_2 y(1-y)\frac{\partial }{\partial x} u(x, y; {\textbf{c}})=0 &{}\quad (x, y) \in \mathring{\Omega }(c_1)\\ u(x=0, y; {\textbf{c}})=0 &{}\quad y \in [0, 1]\\ u(x, y=0; {\textbf{c}})=0 &{}\quad x \in [0, 1]\\ u(x, y=1; {\textbf{c}})=0 &{}\quad x \in [0, 1]\\ u(x, y=0; {\textbf{c}})=1 &{}\quad x \in [1, 1+c_1]\\ u(x, y=1; {\textbf{c}})=1 &{}\quad x \in [1, 1+c_1]\\ \partial _{{\textbf{n}}}u(x=1+c_1, y)=0 &{}\quad y \in [0, 1]\\ \end{array}\right. } \end{aligned}$$
(12)

where \(\Delta\) indicates the Laplacian operator, and \(\partial _{{\textbf{n}}}\) the normal derivative. The high-fidelity solutions are numerically computed by finite elements following the work6, using a mesh of 5160 points. Specifically, \(N=200\) instances of \({\textbf{c}}_i\) are uniformly randomly chosen in \(\Omega (c_1)\), resulting in N high fidelity solutions \({\textbf{u}}_i\) arranged as a vector (row-major order).

Lid cavity

The last test case is the famous Lid Cavity problem, modeling isothermal, incompressible flow in a two-dimensional square domain. The domain is composed of a top wall moving along the horizontal axis, while the other three walls are stationary. At low Reynolds number (Re) the flow is laminar, but it becomes turbulent when increasing the Re. A complete mathematical formulation of the problem is found in65. The full domain is \(\Omega = [d, d]\times [d, d]\) with \(d=0.1 m\) the domain size. The free parameter c is the magnitude of the velocity of the top wall, measured in \(ms^{-1}\). The Reynolds number is given by:

$$\begin{aligned} \text {Re} = \frac{d c}{\nu }, \end{aligned}$$
(13)

with the kinematic viscosity \(\nu =10^{-5}m^2s^{-1}\). To compute the high fidelity solution, we simulate for 5s using a time step of 0.0001 keeping only the last time-step simulation. The mesh is composed of hexahedral cells of \(70 \times 70 \times 1\) number of cell points in each direction. We use a cell-center finite volume scheme. The dataset is composed by \(N=300\) instances of \(c \in [0.01, 1]\) evenly spaced, resulting in N high fidelity solutions arranged as a vector (row-major order).

Baseline models

We compared the performances of the (r-)GAROM approach against some of the more consolidated data-driven frameworks in ROM community63. Such models rely on machine learning and/or linear algebra techniques, resulting on different approaches from the algorithmic perspectives. In such way, we aim for the most possible fair comparison, distinguishing the GAROM advantages and disadvantages with respect to the state-of-the-art in different possible operating contexts.

The baseline comparison models are:

  • Proper Orthogonal Decomposition with Radial Basis Function interpolation (POD-RBF)66,

  • Proper Orthogonal Decomposition with Artificial Neural Network interpolation (POD-NN)20,

  • Autoencoders with Radial Basis Function interpolation (AE-RBF)51,

  • Autoencoders with Artificial Neural Network interpolation (AE-NN)67,68,69.

  • Conditional GAN (cGAN)70

  • Regularized vanilla conditional GAN (r-cGAN)46

  • Deep Operator Networks (DeepONet)24

  • Deep Operator Networks with Non-Linear Manifol Decoder (NOMAD)25

In a few words, POD-RBF, POD-NN, AE-RBF, and AE-NN use proper orthogonal decomposition and an autoencoder for dimensionality reduction, while radial basis function and artificial neural network for approximating the reduced parametric solution manifold. The POD is performed by truncated singular value decomposition with a variable rank (latent dimension in the manuscript), while for radial basis function interpolation, we employed a Gaussian Kernel with length scale equal to 1. Differently, DeepONet and NOMAD are neural operator networks, mapping directly the parameters and field coordinates to the field solution on the specified coordinates. Finally, we also present two vanilla GAN approaches: cGAN and r-cGAN, where at the latter we add an L1 penalty loss to the generator network46.

For the neural networks-based approaches we performed hyperparameter optimization to find the architecture with lower training mean square error. The optimization was carried out with RayTune71. For the specifics on network parameters and training procedures see the Supplementary Information.

Software details

The PyTorch software72 is used to implement the GAROM model due to its versatility and its wide used in the deep learning community. The EZyRB software73 and PINA software74 are used for performing the comparison between GAROM and state-of-the-art techniques in Section numerics. Openfoam75 and FeniCSx76 are used for the numerical simulations. An implementation of GAROM is available in PINA74, while the datasets used are available in the Smithers package on GitHub at https://github.com/mathLab/Smithers.

Training setup and model architecture

An NVIDIA Quadro RTX 4000 GPU was used to train the GAROM models, while the GAROM comparison methods are trained on Intel CPUs. In the Graetz and Lid Cavity experiments, the GAROM and baseline models are all trained on a dataset containing \(80\%\) of the total number of high fidelity simulation instances chosen evenly spaced, and tested on the remaining \(20\%\). On the contrary, for the Gaussian test case, \(60\%\) of the high fidelity instances chosen evenly spaced are used for training, and the remaining \(40\%\) for testing. This choice is due to the higher number of instances with respect to Lid and Graetz test cases.

The GAROM models are all trained for 20000 epochs, with Adam optimizer using a learning rate of 0.001 for both the discriminator and generator. The GAROM training times for the benchmarks are approximate of 45 minutes for the parametric Gaussian, 1 hours and 30 minutes for Graetz, and 2 hours for the Lid Cavity, independently of the discriminator latent dimension. In order to fit all data in GPU we used a batch size of 4 for all tests, where the value was obtained by hyper-parameter optimization using RayTune71 choosing from \(\{4, 8, 16, 24, 64\}\). The noise vector \({\textbf{z}}\) are uniform random samples in \([- 1, 1]\) of dimension 13 for the Gaussian test, 8 for the Graetz test, and 12 for the Lid Cavity test. Also, the noise vector dimension was tuned with RayTune, choosing a random integer in the range [8, 16]. Furthermore, the \(\lambda\) and \(\gamma\) parameters of Eq. (8) are set to 0.001 and 0.3 respectively, as suggested in52. The generator, discriminator, generator conditional encoder, and discriminator conditional encoder networks are reported in Supplementary information.