Data-Driven and Physics-Informed Deep Learning Operators For Solution of Heat Conduction Equation

International Journal of Heat and Mass Transfer 203 (2023) 123809
Contents lists available at ScienceDirect
International Journal of Heat and Mass Transfer

journal homepage: www.elsevier.com/locate/hmt
Data-driven and physics-informed deep learning operators for solution

of heat conduction equation with parametric heat source
Seid Koric a,b,∗, Diab W. Abueidda a
a
National Center for Supercomputing Applications, University of Illinois, Urbana, IL 61801, USA
b
Mechanical Science and Engineering, University of Illinois, Urbana, IL 61801, USA
a r t i c l e i n f o a b s t r a c t
Article history: Deep neural networks as universal approximators of partial differential equations (PDEs) have attracted
Received 11 October 2022 attention in numerous scientific and technical circles with the introduction of Physics-informed Neural
Revised 12 December 2022
Networks (PINNs). However, in most existing approaches, PINN can only provide solutions for defined
Accepted 23 December 2022
input parameters, such as source terms, loads, boundaries, and initial conditions. Any modification in
Available online 31 December 2022
such parameters necessitates retraining or transfer learning. Classical numerical techniques are no excep-
Keywords: tion, as each new input parameter value necessitates a new independent simulation. Unlike PINNs, which
DeepONet approximate solution functions, DeepONet approximates linear and nonlinear PDE solution operators by
Heat (Poisson’s) equation using parametric functions (infinite-dimensional objects) as inputs and mapping them to different PDE
Multi-dimensional parameter solution function output spaces. We devise, apply, and compare data-driven and physics-informed Deep-
Deep learning ONet models to solve the heat conduction (Poisson’s) equation, one of the most common PDEs in science
and engineering, using the variable and spatially multi-dimensional source term as its parameter. We
provide novel computational insights into the DeepONet learning process of PDE solution with spatially
multi-dimensional parametric input functions. We also show that, after being adequately trained, the pro-
posed frameworks can reliably and almost instantly predict the parametric solution while being orders
of magnitude faster than classical numerical solvers and without any additional training.
© 2022 Elsevier Ltd. All rights reserved.
1. Introduction produce the data needed to train such deep-learning-based surro-

gate models, a traditional discretization strategy, such as the finite
Machine learning (ML) has grown significantly lately, resulting element method (FEM), is typically required, and the whole pro-
in a wide range of applications in areas such as medical diagno- cess is usually time-consuming.
sis, speech and image recognition, document classification, bioin- Deep neural networks have been also presented as a way to
formatics, and autonomous driving, to name a few. Deep Learn- solve PDEs governing many physical phenomena with conservation
ing, a subset of machine learning inspired by the biological struc- and constitutive rules [7–11]. These physics-informed neural net-
ture and function of the brain, has sparked a lot of interest in the works (PINNs) are designed to handle problems with little or no
fields of computational engineering and physics. Numerous data- training data. The PINN method is augmented with physical princi-
driven surrogate deep learning models have been trained to learn ples regulating the situation at hand to compensate for the absence
and quickly inference models of heat conduction [1], temperature of data. Physics principles and data are also used combined to pro-
in data centers [2], topologically optimized materials and struc- vide solutions to problems that are incomplete or ill-posed and
tures [3], nonlinear material responses such as plasticity, thermo- cannot be solved using traditional numerical approaches [12]. Al-
plasticity, thermo-viscoplasticity [4–6], and a variety of other ap- though trained PINNs are orders of magnitude quicker than tradi-
plications. The complexity of the problem dictates the size of tional numerical PDE solvers, they have limited flexibility in terms
the dataset needed to train a deep learning model; for example, of parameters such as source terms, material properties, loads, and
complicated problems demand large datasets to develop surrogate boundary and initial conditions. Any change in such parameters
models capable of correctly predicting the solution. As a result, to necessitates retraining or transfer learning. The traditional numer-
ical approaches are similar, and each new input parameter value
necessitates a new simulation.
∗
Corresponding author at: National Center for Supercomputing Applications, To solve this challenge, Lu et al. [13] recently developed
University of Illinois at Urbana-Champaign, 1205 W. Clark St, Urbana, IL 61801, USA. DeepONet, a novel operator learning architecture motivated by the
E-mail address: koric@illinois.com (S. Koric).
https://doi.org/10.1016/j.ijheatmasstransfer.2022.123809
0017-9310/© 2022 Elsevier Ltd. All rights reserved.
S. Koric and D.W. Abueidda International Journal of Heat and Mass Transfer 203 (2023) 123809
universal approximation theorem for operators [14]. DeepONet ef- 2D conditions and regular rectangular geometries. With further
fectively mapped between unseen parametric functions and solu- code modification beyond the scope of the current work, they can
tion spaces for a few linear and nonlinear PDEs in that seminal be implemented to various loading, boundary conditions, material
work, in addition to learning explicit operators such as integrals. properties, and even irregular 2 and 3D geometries.
This provided an effective new technique to solve parametric and
stochastic PDEs. Wang et al. [15] have enhanced the DeepONet 2.2. Data-driven DeepONet
formulation by the information from governing PDE in so-called
physics-informed DeepONet and reported increased prediction ac- DeepONet was proposed in [13] as a method for learning non-
curacy and data handling efficiency but under the higher compu- linear operators by mapping input functions into matching output
tational cost for training. Both of these works have solved mostly functions. We illustrate how DeepONet can be used to tackle the
spatially one-dimensional PDEs. We extended and compared these challenge of learning the parametric PDE solution operator for the
DeepONet formulations to solve the heat conduction equation with heat conduction Eq. (1), with the source term u being a paramet-
spatially 2D parametric source term, one of the most solved PDEs ric function that can take on a wide range of values. In an infinite
using traditional numerical approaches in science and engineering functional U space, u ∈ U represents the source term parameters
research. (i.e., input functions) and s ∈ S refers to the PDE’s unknown solu-
tions in the functional space S. We assume that there is a single
solution s = s(u) in S to the Eq. (1) for every u in U, which is also
2. Formulations subject to the boundary conditions BC. As a result, the mapping
solution operator G : U → S may be defined as:
2.1. Heat conduction (Poisson’s) equation G (u ) = s (u ) (2)
The Poisson’s equation governs physical phenomena such as Instead of only a collection of points ȳ on a domain, the Deep-
heat conduction with a moving heat source (laser head) in addi- ONet considers both u and ȳ, and predicts G(u) by combining them,
tive manufacturing, potential flow and pressure solvers in Compu- as shown in Fig. 1.
tational Fluid Dynamics (CFD), electrostatics, gravity in astronomy, Because it can accept a source term u function as an input vari-
and molecular dynamics, to name a few. The Poisson’s equation able, this network is significantly more capable than other PINN
is an inhomogeneous elliptic PDE, with the inhomogeneous part networks. Every G(u) computation at a point ȳ generates a new
u(x,y) representing the source of the field. In many engineering and solution point, which can be written as G(u)(ȳ). Please keep in
scientific applications, a Poisson’s equation with the same bound- mind that, in this case, ȳ represents coordinate points in the 2D
ary conditions but different source terms are frequently solved, domain (a unit square) where the network predicts the solution of
which may consume a significant fraction of the entire appli- the parametric heat conduction equation.
cation solution time, even with state-of-the-art numerical meth- Every input function u is specified on m discrete points in the
ods and computer technology. Moreover, many iterative compu- domain known as input sensors, and the output solution can be
tationally processes used in thermal parametric studies, design, assessed on P output sensor locations. The unstacked DeepONet
sensitivity analysis, uncertainty quantification, and optimization is employed, which is made up of two independent fully con-
of classical and advanced manufacturing processes require a vast nected neural networks termed branch and trunk. According to Lu
number of forward functional evaluations by sampling the para- et al. [13], the unstacked DeepONet achieves better outcomes than
metric space and calculating temperature solutions to obtain con- the stacked DeepONet in balancing training between u and ȳ in-
verging statistics. These evaluations are traditionally performed by puts while consuming fewer computing resources. The branch and
classical numerical methods such as finite elements. For high- trunk networks employ the hyperbolic tangent activation function
fidelity thermo-mechanical models, these simulations are often (Tanh) to model intricate functional linkages between inputs and
prohibitively computationally expensive in the design and op- outputs. The branch network receives u at each of its m locations
timization loops, particularly with traditional sampling methods and outputs q intermediate outputs bk . The trunk network receives
such as Monte Carlo or Latin Hypercube. Instead of numerical ȳ as an input and outputs q intermediate outputs to tk . In Eq. (3), a
forward evolutions, the Deep Operator Neural Network, which in dot product is used to combine the intermediate outputs, resulting
addition to the solution, can also learn its parameters, is a nat- in a DeepONet solution operator prediction Gˆ (u )(ȳ ).
ural choice to be used as a surrogate model for instant func-
q
tional evaluation. In the rest of this work, we will investigate Gˆ (u )(ȳ ) = bi ti (3)
how the data-driven and physics-informed DeepONet formulations i=1
can learn the solution of a parametric heat conduction equation. Eq. (4) gives the loss function L for a data-driven DeepONet
We also validate the DeepONet predictions and compare both ap- based on mean squared error:
proaches’ computational performance and accuracy on the latest 2
1
N P
high-performance computing resources.
L= Gˆ (ui )(ȳuj i ) − s(ui )(ȳuj i ) (4)
We employ the heat conduction (Poisson’s) equation, Eq. (1), on N∗P
i=1 j=1
a unit square domain with zero Dirichlet boundary conditions as a
basic reference equation to solve with DeepONet throughout this N is a number of sample functions u. In a training step, the net-
paper. work predicts Gˆ (u )(ȳ ) for each sample function ui , which is as-
sessed at P output sensor locations ȳ j and compared to the associ-
kx ∂∂ x2s + ky ∂∂y2s + u(x, y ) = 0(x, y ) ∈ [0, 1]x[0, 1]
2 2
ated target solutions s(u )(ȳ ) calculated by a classical second-order
(1)
BC : s(x, 0 ) = s(x, 1 ) = s(0, y ) = s(1, y ) = 0 finite difference solution in an offline (data-generation) stage. The
gradient of the loss function with respect to the weights in both
Where s(x,y) is an unknown function of two independent vari- networks is then calculated as a part of a backpropagation pro-
ables, x and y, u(x,y) is a source term function, and kx = ky = 0.01 cess, and the Adam optimizer minimizes the loss value by mod-
is the diffusion coefficient. It is important to note that the data- ifying the weights. DeepONet learns the solution operator of the
driven and physics-informed DeepONet methodologies developed heat conduction equation after a sufficient number of feedforward
in this work for parametric heat conduction are not limited to and backpropagation iterations and can inference a discrete point
2
solution for an unknown source term parametric function almost which satisfy the governing heat conduction PDE and thus provides
instantly. a physics-based regularization contribution to the overall loss that
constrains the space of admissible deep learning solutions.
2.3. Physics-informed DeepONet
1
N Q
L phys =
The original, or purely data-driven, DeepONet architecture from N∗Q
i=1 j=1
Fig. 1 requires a large number of outputs, also known as labels 2
∂ 2 G(ui )(xi , yi ) ∂ 2
( i
)( i
, i
)
or targets, s(u )(ȳ ), which are used to calculate the loss function j j
G u x j
y j i
×kx + k y + u (
i i
x , y ) (5)
in Eq. (4) and thus properly train the network. As stated before, ∂ (xij )
2
∂ (yij )
2 j j

the generation of such data often requires repeated evaluation
with classical numerical methods such as higher-order finite dif- N is again the number of sample functions u. Since u is orig-
ference, finite volume, or finite element methods. This can be par- inally defined on m points, whose coordinates are not necessarily
ticularly time and computationally expensive with the governing coincided with the collocation points, the 2D interpolation is used
PDEs defined on large multi-dimensional domains, even on high- to provide discrete u values at the collocation point coordinates.
performance computing platforms. It is even more difficult to ob-
tain a sufficient number of labels from the experimental approach. 2.4. Data sampling and computing environment
We have devised a DeepONet model that can be trained without
any generated or observable data at all, given only knowledge of We use a 2D correlated and scale-invariant Gaussian random
the heat conduction PDE and its corresponding BCs. The so-called field to generate random input functions u(x,y). We utilized a
physics-informed DeepONet model architecture is given in Fig. 2. python implementation described in [16], in which the correla-
The major difference is that there are two contributions to the loss tions are explained by a scale-free spectrum P(k) ∼ 1/|k|α /2 (with
function. One is the operator loss similar to the data-driven Deep- α = 4). The smoothness of the sampled function is determined by
ONet loss in Eq. (4) but applied only to the points on the bound- the length-scale coefficient, and a larger value means a smoother
ary conditions where the targets (solutions) are already defined, u. Because the grid where the finite difference solution for s tar-
zero-valued in the case of our Dirichlet BCs. The other loss is the gets is calculated is generally not the same as the P grid for out-
physics loss calculated at the Q collocation points in the domain’s put sensors, and similarly, the grid where u is generated is gener-
interior where the estimated solution operator G is differentiated ally different than the input sensor grid m, bilinear 2D interpola-
with respect to input coordinates by means of automatic differen- tion is used to provide discrete values for s and u across the dif-
tiation. For each collocation point, residual in Eq. (5) is calculated, ferent grids. Bilinear 2D interpolation is used to provide discrete
Fig. 1. Data-driven DeepONet.
Fig. 2. Physics-informed deep operator network.
3
values for s and u across the different grids since the grid where Table 1
Solution times for classical non-optimized and highly opti-
the finite difference solution for s targets is calculated is gener-
mized Poisson’s solvers and DeepONet inference.
ally not the same as the P grid for output sensors, and similarly,
the grid where u is generated is generally different than the in- FD Iterative (Jacobi) FD Implicit DeepONet Inference
put sensor grid m. In this novel work, we have provided an exten- 2.1 sec. 0.05 sec. 9 x 10-4 sec.
sion of the data-driven and physics-informed DeepONet formula-
tions to a 2D heat conduction domain under the variable spatial
distribution of the heat source. The code is written in JAX [17], spreading at random in the 2-dimensional domain where the pre-
a relatively new Python-based toolkit built for high-performance dicted solution is compared to targets, P has less impact on the
machine learning research and created by Google. Many sophisti- testing error than the size of the training data set #u. Nevertheless,
cated capabilities of JAX include advanced automated differentia- the error was as low as 3% for #u=80 0 0 and P=600. The diffusion-
tion (grad), just-in-time compilation (JIT), and cross-device com- reaction PDE across its space-time grid has a smaller L2 test error,
pute replication (pmap). JIT was utilized to offload some of the according to the supplement of [15]. Whilst its u parametric func-
computationally heavy kernels to the GPU, and pmap was very tion is defined on a one-dimensional spatial space (line), our u in-
helpful in speeding up target creation. The Adam optimizer, also put function, on the other hand, is defined on a two-dimensional
available in JAX as a high-level function, employed automatic dif- spatial domain. To have the same closeness of the discrete points
ferentiation. Computing was done on a computing node of the defining u in [14], it would need a square of m (number of input
HPC cluster Delta [18], housed at the National Center for Super- sensors), which would surpass the device memory of our present
computing Applications (NCSA), and has four A100 Nvidia GPU GPU hardware.
cards. If the branch and trunk network sizes are varied, i.e., the num-
ber of neurons per hidden layer (network width) in Fig. 5 and the
3. Results number of hidden layers (network depth) in Fig. 6, a similar pat-
tern emerges. Increasing the network’s width or depth, in particu-
3.1. Data-driven DeepONet lar, tends to enhance prediction accuracy. Surprisingly, as network
sizes get larger by increasing their width and depth, the compu-
The data-driven DeepONet network was trained during 80,0 0 0 tational training time rises very little (2 min maximum), owing
epochs with 1,0 0 0-8,0 0 0 training u samples (m=121). For each u to JAX’s excellent handling of deep learning training kernels on
data sample, target solutions were supplied for 20 0-60 0 output GPUs. Finally, if we had naively utilized a single fully connected
sensor points (P=20 0-60 0), whose coordinates were picked at ran- feedforward neural network instead of DeepONet, we would have
dom in the 2D domain. During the offline data-generation stage, needed 14,641 outputs to provide predictions on the 121x121 grid
the second-order finite-difference (FED) solution on the 121x121 used for DeepONet data generation and prediction validations. This
grid is used to derive the target solutions on those points. We would call for a significantly larger neural network. It is debat-
tested both simple explicit Jacobi iterative and implicit FED solver able whether it would be feasible to correctly train such a network
schemes, Özişik et al [19]. A vector-style mapping of computing even on the most recent and powerful high-performance comput-
across devices (pmap) in JAX is used to significantly accelerate ing platforms.
data sample generation. It took the Jacobi FED iterative solver 4– Table 1 shows the computational cost of inferencing using a
6 min to generate all training data samples with pmap, while using trained data-driven DeepONet model, as well as the solution times
highly optimized python libraries, the implicit solver needed only of a heat conduction PDE using second-order finite difference it-
30-35 sec. erative (Jacobi) and highly optimized implicit solution schemes on
Fig. 3 depicts three randomly picked test u samples, which the GPU with the numpy and scipy implementations in JAX [17]. Be-
network has never seen before, as well as the matching classical cause JAX employs asynchronous dispatch to disguise python over-
numerical heat conduction equation solutions (targets) and data- heads, we properly waited for the JAX calculation to finish be-
driven DeepONet forecasts. The network was trained using 5,0 0 0 u fore providing accurate measurements. Particularly, once training
samples and 400 output sensors (P=400) in this scenario. For both data is generated, and a data-driven DeepONet model is adequately
the branch and trunk networks, we start with the nominal setup trained (which combined takes about 22 min on A100 GPU), the
by Wang et al. [15], which had five hidden layers, each with 50 DeepONet can predict the solution of a heat conduction PDE in a
neurons, trained on 80,0 0 0 epochs. Even though the peaks and val- fraction of a second on modern GPU, which is two to three orders
leys in the parametric source distributions u differ significantly be- of magnitude faster than traditional PDE solvers. The inferencing
tween the test samples, data-driven DeepONet correctly predicted involves a single forward pass consisting of dense matrix-vector
their 2D diffusive nature solution in the interior governed by the multiplication kernels, which are exceedingly well optimized not
heat conduction equation and on the zero-valued Dirichlet bound- only on GPUs but also on CPUs, which allows inferencing on low-
aries. Visual assessment of the numerous additional test samples end computers, too, with the trained parameters transferred from
corroborated the data-driven DeepONet predictions’ quantitative high-end computers with GPUs. This is also comparable to tradi-
correctness. tional PINNs, but with the significant difference that trained Deep-
The mean of the relative L2 prediction error over NP number of ONets can produce solutions relatively instantaneously when a
examples in the test data-set, each defined on ȳ with P training new and unknown spatially distributed parametric input is sup-
point coordinates in the domain (output sensors), provided a more plied, whereas PINNs must be retrained. Since DeepONets can gen-
qualitative test error analysis of a trained data-driven DeepONet in erally be trained to solve any other PDE with its parameters, the
Eq. (6). surrogate DeepONet-trained models might possibly replace the tra-
NP ˆ
ditional and often computationally expensive PDE solver kernels,
1 G(ui )(ȳ ) − s(ui )(ȳ ) requiring just a forward pass of the network (inferencing) for each
2
L̄ = 2
(6)
NP Gˆ (ui )(ȳ ) new input such as source term, material properties, boundary con-
i=1 2 ditions, loads, and other parameters. This can considerably speed
For a variable number of input parametric functions #u (or N) up high-fidelity scientific and engineering applications governed
and output sensors P utilized for training, the test error is shown by parametric PDEs, particularly those that solve large problems
in Fig. 4 for Np=100. Because the output sensors are sparsely [20,21] or repeatedly a large number of PDEs with parameters.
4
Fig. 3. Target (numerical) solution versus the prediction of a trained data-driven DeepONet for 3 random u data samples, represented in 3 rows, from the test dataset.
Fig. 5. Effect of number of neurons on test error and corresponding training time
with 5-hidden-layer branch and trunk networks (#u=50 0 0, P=40 0).
Fig. 4. Effect of training data set size #u and number of output sensors P on test
error.
number of input sensors m and collocation points Q. The num-
ber of output sensors P where data–driven DeepONet is evalu-
3.2. Comparison of data-driven and physics-informed DeepONets ated matches the number of random points enforcing zero-valued
Dirichlet boundary conditions in the physics-informed DeepONet.
Solution predictions from Data-driven and Physics-informed The test error analysis in Fig. 8 compares prediction errors across
networks are compared in Fig. 7 for three randomly chosen source 100 test samples from the data-driven and physics-informed nom-
distributions from the test datasets. The nominal neural networks inal size DeepONets. While the number of input sensors m is set
are used consisting of 5 hidden layers with 50 neurons each, 5,0 0 0 to be equal to the number of collocation points Q in both physics-
u training samples trained with 80,0 0 0 epochs, and a variable informed cases, in the first case, their spatial coordinates do not
5
Fig. 8. Test Error comparison between data-driven and physics-informed Deep-

Fig. 6. Effect of number of hidden layers, with 50 neurons each, on test error and
ONets.
corresponding training time of the branch and trunk networks (#u=50 0 0, P=40 0).
polation that interpolates u values between input sensor and col-

match (collocation points are randomly chosen, Q not on m), while location coordinates, as stated before. Wang et al. [14], however,
in the second case, they are matched precisely (collocation points reported in his work with the Diffusion-Reaction equation sam-
are exactly at input sensor point locations, Q on m). pling one-dimensional u space that the physics-informed Deep-
The data-driven prediction is slightly more accurate than ONet provided more accurate test predictions than the data-driven
physics-informed prediction with the collocation points coincid- DeepONet. Therefore, it seems that physics-informed DeepONet
ing with the input sensor locations for u. The error is, however, is more sensitive to the dimensionally cursed phenomena of the
larger for the physics-informed case with the randomly spaced col- source term than the data-driven one, i.e., when the dimension-
location points. This is a consequence of the 2D bi-linear inter- ality of u sampling space increases, the surface of the sampling
Fig. 7. Data-driven and Physics-informed DeepONet predictions for 3 random u data samples, represented in 3 rows, from the test dataset.
6
rameters, such as anisotropic material properties with kx and ky

thermal conductivities, can be easily added as additional two pa-
rameters to u. Similarly, a spatial variation of boundary conditions
can be represented by additional parameters in u representing dis-
crete values on the boundaries. It is merely a function of ade-
quately training a DeepONet with the value ranges of additional
parameters, which is an increased computational task, particularly
with the data-driven approach that also requires additional data
generation. Nevertheless, this is readily feasible on modern com-
puters with GPUs, particularly since the DeepONet training has to
be usually done only once. Similar DeepONet-based surrogate deep
learning models are expected to aid or even replace traditional nu-
merical PDE solver kernels in the future, allowing for considerably
faster high-fidelity simulations, optimizations, designs, and online
Fig. 9. Training time comparison between data-driven and physics-informed Deep- controls in many scientific and engineering applications and pro-
ONets. cesses governed by parametric PDEs.
Declaration of Competing Interest

space increases much faster so that the available sampling data be-
come sparse. Similar results have been reported in other works for
The authors declare that they have no known competing finan-
convolutional physics-informed neural networks compared to the
cial interests or personal relationships that could have appeared to
data-driven equivalents, such as by Zhu et al. [22] and, lately, Fuhg
influence the work reported in this paper.
et al. [23].
Finally, the comparison between training time for data-driven
CRediT authorship contribution statement
and physics-informed DeepONet networks on an A100 GPU node
is given in Fig. 9. Similarly to what Wang et al. [14] have ob-
Seid Koric: Conceptualization, Methodology, Supervision, For-
served, training physics-informed DeepONets is computationally
mal analysis, Writing – review & editing. Diab W. Abueidda: In-
more expensive than the data-driven equivalent mainly due to a
vestigation, Validation, Writing – review & editing.
physics-informed network needing to compute the PDE residual
by means of automatic differentiation producing a larger compu- Data availability
tational graph then the data-driven network. However, since the
physics-informed network’s offline (non-training) step just pro- Data will be made available on request.
duces u samples, it takes only 1-2 min. for the Jacobi and 10-15
sec. for the implicit FED schemes. This is 2-3x quicker than in data- Acknowledgements
driven DeepONets, which must also compute matching solutions
(targets) at collocation locations. The authors would like to thank the National Center for Su-
percomputing Applications (NCSA) at the University of Illinois, and
4. Conclusions particularly its Industry Program and the Center for Artificial Intel-
ligence Innovation (CAII) for their support and hardware resources.
After presenting the data-driven and physics-informed Deep- This research is also a part of the Delta research computing project,
ONet frameworks, which employ the dual branch-trunk neural net- which is supported by the National Science Foundation (award OCI
work to learn parametric PDE solution operators, we concentrated 2005572) and the State of Illinois.
on solving the heat conduction equation with spatially 2D para-
metric source term inputs in this paper. This is best to our knowl- References
edge one of the first uses of DeepONets to a spatially multi-
[1] J. Zhao, W. Zhao, Z. Ma, W. Yong, B. Dong, Finding models of heat conduction
dimensional application governed by an extremely important para- via machine learning, Int. J. Heat Mass Transf. 185 (2022) 122396, doi:10.1016/
metric PDE in science and engineering. We have found that data- j.ijheatmasstransfer.2021.122396.
driven DeepONets can reasonably quickly learn to solve the heat [2] J. Athavale, M. Yoda, Y. Joshi, Comparison of data driven modeling approaches
for temperature prediction in data centers, Int. J. Heat Mass Transf. 135 (2019)
conduction equation and inference reasonably accurate results on 1039–1052, doi:10.1016/j.ijheatmasstransfer.2019.02.041.
contemporary GPUs when a new random multi-dimensional source [3] H.T. Kollmann, D.W. Abueidda, S. Koric, E. Guleryuz, N.A. Sobh, Deep learning
term input is presented from the test dataset previously unseen by for topology optimization of 2D metamaterials, Mater. Des. 196 (2020) 109098,
doi:10.1016/j.matdes.2020.109098.
the network. We also looked at the impact of several factors on [4] M. Mozaffar, R. Bostanabad, W. Chen, K. Ehmann, J. Cao, M. Bessa, Deep learn-
test error, such as training size, the number of output sensors, net- ing predicts path-dependent plasticity, Proc. Natl. Acad. Sci. 116 (52) (2019)
work depth and width, and their effect in spatially 2D dimensional 26414–26420, doi:10.1073/pnas.1911815116.
[5] D.W. Abueidda, S. Koric, N.A Sobh, H. Sehitoglu, Deep learning for plasticity
input and output regions. Most importantly, we have found that and thermo-viscoplasticity, Int. J. Plast. 136 (2021) 102852, doi:10.1016/j.ijplas.
the computational cost of running a previously trained DeepONet 2020.102852.
model to solve spatially multi-dimensional parametric PDEs can be [6] B.P. Savali, A. Mielke, T. Ricken, Data-driven stress prediction for thermoplastic
materials, Proc. Appl. Math. Mech. (PAMM) 21 (1) (2021) e202100225, doi:10.
magnitudes of order lower than for the classical numerical solvers.
1002/pamm.202100225.
Finally, we have compared test accuracy and training time between [7] Y. Feng, W. Gao, D. Wu, F. Tin-Loi, Machine learning aided stochastic elasto-
data-driven and physics-informed DeepONets. We have found that plastic analysis, Comput. Methods Appl. Mech. Eng. 357 (2019) 112576, doi:10.
1016/j.cma.2019.112576.
the data-driven DeepONet provides a more accurate prediction for
[8] M. Raissi, P. Perdikaris, G.E. Karniadakis, Physics-informed neural networks: a
test samples with less training time. On the other hand, physics- deep learning framework for solving forward and inverse problems involving
informed DeepONet takes significantly less time in the offline (data nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686–707,
generation) stage and is able to learn the solution operator of the doi:10.1016/j.jcp.2018.10.045.
[9] K.M. Kim, P. Hurley, J.P. Duarte, Physics-informed machine learning-aided
parametric heat conduction equation without any generated or ob- framework for prediction of minimum film boiling temperature, Int. J. Heat
served solution targets. It is worth mentioning here that other pa- Mass Transf. 191 (2022) 122839, doi:10.1016/j.ijheatmasstransfer.2022.122839.
7
[10] D.W. Abueidda, Q. Lu, S. Koric, Meshless physics-informed deep learning [17] Bradbury J., Frostig R., Hawkins P., Johnson M.J., Leary C., Maclaurin D., Necula
method for three-dimensional solid mechanics, Int. J. Numer. Methods Eng. G., Paszke, VanderPlas A.J., Wanderman-Milne S., Zhang Q.: JAX: composable
122 (23) (2021) 7182–7201, doi:10.1002/nme.6828. transformations of Python+NumPy programs (2018).
[11] J.N. Fuhg, N. Bouklas, The mixed deep energy method for resolving concen- [18] Delta HPC system at NCSA, https://www.ncsa.illinois.edu/research/project-
tration features in finite strain hyperelasticity, J. Comput. Phys. 451 (2022) highlights/delta/.
110839, doi:10.1016/j.jcp.2021.110839. [19] M.N. Özişik, H.R.B. Orlande, M.J. Colaço, R.M. Cotta, Finite Difference Methods
[12] S. Cai, Z. Wang, F. Fuest, Y.J. Jeon, C. Gray, G.E. Karniadakis, Flow over an in Heat Transfer, 2nd ed., CRC Press, 2017, doi:10.1201/9781315121475.
espresso cup: inferring 3-D velocity and pressure fields from tomographic [20] S. Koric, A. Gupta, Sparse matrix factorization in the implicit finite element
background oriented Schlieren via physics-informed neural networks, J. Fluid method on petascale architecture, Comput. Methods Appl. Mech. Eng. 32
Mech. 915 (2021) A102, doi:10.1017/jfm.2021.135. (2016) 281–292, doi:10.1016/j.cma.2016.01.011.
[13] L. Lu, P. Jin, G. Pang, Z. Zhang, G.E. Karniadakis, Learning nonlinear operators [21] M. Vázquez, G. Houzeaux, S. Koric, et al., Alya: Multiphysics engineering simu-
via DeepONet based on the universal approximation theorem of operators, Nat. lation toward exascale, J. Comput. Sci. 14 (2016) 15–27, doi:10.1016/j.jocs.2015.
Mach. Intell. 3 (2021) 218–229, doi:10.1038/s42256- 021- 00302- 5. 12.007.
[14] T. Chen, H. Chen, Universal approximation to nonlinear operators by neural [22] Y. Zhu, N. Zabaras, P.S. Koutsourelakis, P. Perdikaris, Physics-constrained deep
networks with arbitrary activation functions and its application to dynamical learning for high-dimensional surrogate modeling and uncertainty quantifica-
systems, IEEE Trans. Neural Netw. 6 (1995) 911–917, doi:10.1109/72.392253. tion without labeled data, J. Comput. Phys. 394 (2019) 56–61, doi:10.1016/j.jcp.
[15] S. Wang, H. Wang, P. Perdikaris, Learning the solution operator of paramet- 2019.05.024.
ric partial differential equations with physics-informed DeepONets, Sci. Adv. 7 [23] J.N. Fuhg, A. Karmarkar, T. Kadeethum, H. Yoon, N. Bouklas, Deep convolutional
(40) (2021) 1–9, doi:10.1126/sciadv.abi8605. ritz method: parametric PDE surrogates without labeled data, arXiv:2206.
[16] Sciolla B.:, Generator of 2D gaussian random fields, https://github.com/bsciolla/ 04675v1 [cs.CE], Jun 2022.
gaussian- random- fields.

Data-Driven and Physics-Informed Deep Learning Operators For Solution of Heat Conduction Equation

Uploaded by

Data-Driven and Physics-Informed Deep Learning Operators For Solution of Heat Conduction Equation

Uploaded by

International Journal of Heat and Mass Transfer 203 (2023) 123809

Contents lists available at ScienceDirect

International Journal of Heat and Mass Transfer

Data-driven and physics-informed deep learning operators for solution

1. Introduction produce the data needed to train such deep-learning-based surro-

Fig. 1. Data-driven DeepONet.

Fig. 2. Physics-informed deep operator network.

Fig. 8. Test Error comparison between data-driven and physics-informed Deep-

polation that interpolates u values between input sensor and col-

rameters, such as anisotropic material properties with kx and ky

Declaration of Competing Interest

You might also like