Abstract

Physics-constrained machine learning is emerging as an important topic in the field of machine learning for physics. One of the most significant advantages of incorporating physics constraints into machine learning methods is that the resulting model requires significantly less data to train. By incorporating physical rules into the machine learning formulation itself, the predictions are expected to be physically plausible. The Gaussian process (GP) is perhaps one of the most common machine learning methods for small datasets. In this paper, we investigate the possibility of constraining a GP formulation with monotonicity on three different material datasets, using one experimental and two computational datasets. The monotonic GP is compared against the regular GP, where a significant reduction in the posterior variance is observed. The monotonic GP is strictly monotonic in the interpolation regime, but in the extrapolation regime, the monotonic effect starts fading away as one goes beyond the training dataset. Imposing monotonicity on the GP comes at a small accuracy cost, compared to the regular GP. The monotonic GP is perhaps most useful in applications where data are scarce and noisy, and monotonicity is supported by strong physical evidence.

1 Introduction

Physics-constrained machine learning is an important toolbox in the machine learning era, with many possible applications, including those in biology, materials science, astrophysics, and finance. Breakthroughs in deep learning have been recognized as some of the most revolutionary contributions to science, but their Achilles' heel is the size of the dataset required to train huge deep learning architectures.

However, this requirement cannot always be satisfied, especially in fields involving experimentation, where data can be resource-intensive, in terms of both time and labor. In this context, physics-constrained machine learning arises as an alternative solution, with equivalent or better accuracy, while requiring significantly less data to train. By incorporating some fundamental physics rules into the machine learning framework, the behavior of the machine learning predictions could be made more physically plausible. One common constraint is monotonicity, where the predictions are expected to behave monotonically with respect to one or several input variables.

Materials science is a field where theory has dominated for the last several decades, if not centuries. The process–structure–property relationship is perhaps one of the most well-studied topics in terms of theoretical analysis. Well before the computer era, materials scientists tended to agree that mathematics and physics are the right tools to bridge the gap between inputs and quantities of interest. With the emergence of computers, supercomputers, and subsequently deep learning, the main tool has changed significantly, starting with the Materials Genome Initiative [1]. However, brute-force applications of deep learning have suffered from the lack of data in materials science, where physics-constrained machine learning seems to be a more promising candidate. Furthermore, materials science theory suggests that there are many simple physical rules that one can take advantage of when constructing a machine learning framework, for example, the Hall–Petch relationship [2]. It should be noted that many of these rules or simple equations involve monotonicity. If these rules are successfully exploited, then physics-constrained machine learning could be achieved with much less data, while retaining the same accuracy.

The Gaussian process (GP) stands out among other machine learning approaches and has been one of the most popular methods for small datasets across multiple disciplines. In particular, it is well known that GPs are often used as underlying surrogate models for Bayesian optimization, which is an efficient active machine learning method and very popular among materials scientists. For example, Tallman et al. [3,4] used a GP as a surrogate model to bridge microstructure crystallography and homogenized materials properties, where crystal plasticity finite element models are used to construct the database, and Yabansu et al. [5] applied GPs to capture the evolution of microstructure statistics. Tran and Wildey [6,7] also used a GP as a surrogate model for the structure–property relationship and solved a stochastic inverse problem using the acceptance–rejection sampling method. More examples can be found in Refs. [8,9].

One of the main advantages of using a GP is its flexibility and, therefore, its ability to accommodate extensions. Fernández-Godino et al. [10] proposed a novel approach to constrain GPs with symmetry. Linear operator constraints can be enforced through a novel covariance function, as demonstrated by Jidling et al. [11]. Agrell [12] presented an approach to constrain GPs such that a set of linear transformations of the process is bounded. Furthermore, Lange-Hegermann [13] constrained multi-output GP priors to the solution set of a system of linear differential equations subject to boundary conditions. A comprehensive survey of constrained GPs, including, but not limited to, bound constraints, non-negativity, monotonicity, linear partial differential operator constraints, and boundary conditions of a partial differential equation, can be found in Swiler et al. [14].

The present work focuses on bound-type constraints. Riihimäki and Vehtari [15] proposed enforcing monotonicity to constrain GPs using binary classification, where the derivative is classified as either monotonically increasing or decreasing with respect to a set of specific input variables. Golchi et al. [16] extended this approach in a deterministic computer experiments setting and sampled from the exact joint posterior distribution rather than an approximation. More recently, Ustyuzhaninov et al. [17] presented a monotonicity constraint in a Bayesian nonparametric setting based on numerical solutions of stochastic differential equations, and Pensoneault et al. [18] presented a novel approach to construct a non-negative GP. Tan [19] proposed a B-spline model that linearly constrains the model coefficients to achieve monotonicity, which converges to the projection of the GP in L2. Chen et al. [20] proposed an optimal recovery method that is equivalent to maximum a posteriori estimation for a GP constrained by a partial differential equation.

In this paper, we adopt the monotonic GP formulation from Riihimäki and Vehtari [15] and apply it to three different datasets, in both interpolation and extrapolation regimes, to survey the advantages and disadvantages of the monotonic GP relative to the regular GP. We focus on experimental and computational materials science applications where monotonicity is physically expected. Materials science, specifically experimental materials science, is well known for data scarcity due to its resource-intensive experiments. An overview of Gaussian processes and the incorporation of monotonic constraints therein is given in Sec. 2, followed by two illustrative numerical examples in Sec. 3. Three case studies comparing the performance of classical and monotonic GPs are given in Secs. 4–6. An overall discussion and conclusion are given in Sec. 7.

2 Overview of Gaussian Processes

2.1 Classical Gaussian Processes.

A thorough mathematical explanation and derivation of Gaussian processes can be found in the canonical text by Rasmussen and Williams [21], but we provide a brief overview here.

Let us assume that we can model the true process, y, with a zero-mean GP,

$\mathbf{f} \sim \mathcal{N}(\mathbf{0}, K_{f,f})$  (1)

where the entries in the covariance matrix $K_{f,f}$ are defined by a covariance function. Although there are many covariance functions to choose from, in this paper, we focus on the squared exponential covariance function

$k(\mathbf{x}, \mathbf{x}') = \eta^2 \exp\left( -\frac{1}{2} \sum_{d=1}^{D} \frac{(x_d - x_d')^2}{\rho_d^2} \right)$  (2)

where $\eta$ and $\boldsymbol{\rho} = \{\rho_1, \ldots, \rho_D\}$ are hyper-parameters, representing the signal variance and length-scale (also known as correlation length) parameters, respectively. The observations y are then given by

$y = f(\mathbf{x}) + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2)$  (3)

where $\sigma^2$ is the intrinsic noise variance.
At a desired input prediction location, $\mathbf{x}_*$, the posterior prediction distribution, also called the testing distribution, is a Gaussian distribution, where the posterior mean and posterior variance are, respectively,

$\mu(\mathbf{x}_*) = K_{*,f} \left( K_{f,f} + \sigma^2 I \right)^{-1} \mathbf{y}$  (4)

and

$\sigma^2(\mathbf{x}_*) = K_{*,*} - K_{*,f} \left( K_{f,f} + \sigma^2 I \right)^{-1} K_{f,*}$  (5)

Here, the hyper-parameters are collected into $\theta = \{\eta, \boldsymbol{\rho}, \sigma\}$.
The GP is trained by solving for the hyper-parameters $\theta$ that maximize the log-marginal likelihood

$\log p(\mathbf{y} \,|\, X, \theta) = -\frac{1}{2} \mathbf{y}^\top \left( K_{f,f} + \sigma^2 I \right)^{-1} \mathbf{y} - \frac{1}{2} \log \left| K_{f,f} + \sigma^2 I \right| - \frac{N}{2} \log 2\pi$  (6)

In Eq. (6), the first term, the so-called "data fit" term, quantifies how well the model fits the data, as measured by the Mahalanobis distance. The second term, called the "complexity" term, quantifies the model complexity, where a smoother covariance matrix with a smaller determinant is encouraged [21]. The last term indicates that the likelihood of the data tends to decrease for larger datasets [22].
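To make the preceding formulation concrete, the following is a minimal NumPy sketch of Eqs. (2) and (4)–(6). It is illustrative only, not the GPstuff implementation used later in this paper; the array shapes and variable names (eta, rho, sigma) are our own.

```python
import numpy as np

def sq_exp_kernel(X1, X2, eta, rho):
    # Squared exponential covariance, Eq. (2); X1 is (n1, D), X2 is (n2, D),
    # and rho is a length-D vector of length-scales
    diff = (X1[:, None, :] - X2[None, :, :]) / rho
    return eta**2 * np.exp(-0.5 * np.sum(diff**2, axis=-1))

def gp_posterior(X, y, Xs, eta, rho, sigma):
    # Posterior mean and variance at the test inputs Xs, Eqs. (4) and (5)
    Kff = sq_exp_kernel(X, X, eta, rho) + sigma**2 * np.eye(len(X))
    Ksf = sq_exp_kernel(Xs, X, eta, rho)
    Kss = sq_exp_kernel(Xs, Xs, eta, rho)
    L = np.linalg.cholesky(Kff)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ksf.T)
    return Ksf @ alpha, np.diag(Kss) - np.sum(v**2, axis=0)

def log_marginal_likelihood(X, y, eta, rho, sigma):
    # Eq. (6): "data fit" + "complexity" + normalization terms
    N = len(X)
    Kff = sq_exp_kernel(X, X, eta, rho) + sigma**2 * np.eye(N)
    L = np.linalg.cholesky(Kff)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * N * np.log(2.0 * np.pi))
```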
It is noteworthy that the derivative operator is a linear operator; therefore,

$\mathbb{E}\left[ \frac{\partial f^{(i)}}{\partial x_g^{(i)}} \right] = \frac{\partial}{\partial x_g^{(i)}} \mathbb{E}\left[ f^{(i)} \right]$  (7)

$\mathrm{Cov}\left[ \frac{\partial f^{(i)}}{\partial x_g^{(i)}}, f^{(j)} \right] = \frac{\partial}{\partial x_g^{(i)}} \mathrm{Cov}\left[ f^{(i)}, f^{(j)} \right]$  (8)

and

$\mathrm{Cov}\left[ \frac{\partial f^{(i)}}{\partial x_g^{(i)}}, \frac{\partial f^{(j)}}{\partial x_h^{(j)}} \right] = \frac{\partial^2}{\partial x_g^{(i)} \partial x_h^{(j)}} \mathrm{Cov}\left[ f^{(i)}, f^{(j)} \right]$  (9)

For the squared exponential kernel described in Eq. (2), Eqs. (8) and (9) become

$\mathrm{Cov}\left[ \frac{\partial f^{(i)}}{\partial x_g^{(i)}}, f^{(j)} \right] = -\frac{x_g^{(i)} - x_g^{(j)}}{\rho_g^2}\, k(\mathbf{x}^{(i)}, \mathbf{x}^{(j)})$  (10)

and

$\mathrm{Cov}\left[ \frac{\partial f^{(i)}}{\partial x_g^{(i)}}, \frac{\partial f^{(j)}}{\partial x_h^{(j)}} \right] = \left( \frac{\delta_{gh}}{\rho_g^2} - \frac{(x_g^{(i)} - x_g^{(j)})(x_h^{(i)} - x_h^{(j)})}{\rho_g^2 \rho_h^2} \right) k(\mathbf{x}^{(i)}, \mathbf{x}^{(j)})$  (11)

respectively, where $\delta_{gh} = 1$ if g = h and 0 otherwise.
For an arbitrary testing point $\mathbf{x}_*$, the derivatives of the posterior mean and posterior variance with respect to input dimension d are, respectively,

$\frac{\partial \mu(\mathbf{x}_*)}{\partial x_{*,d}} = \frac{\partial K_{*,f}}{\partial x_{*,d}} \left( K_{f,f} + \sigma^2 I \right)^{-1} \mathbf{y}$  (12)

and

$\frac{\partial \sigma^2(\mathbf{x}_*)}{\partial x_{*,d}} = \frac{\partial K_{*,*}}{\partial x_{*,d}} - 2\, \frac{\partial K_{*,f}}{\partial x_{*,d}} \left( K_{f,f} + \sigma^2 I \right)^{-1} K_{f,*}$  (13)
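The derivative cross-covariances of Eqs. (10) and (11) translate directly into code; the sketch below assumes the sq_exp_kernel helper defined above.

```python
import numpy as np

def dk_dxg(X1, X2, eta, rho, g):
    # Eq. (10): derivative of the kernel w.r.t. dimension g of the first argument
    K = sq_exp_kernel(X1, X2, eta, rho)
    return -(X1[:, None, g] - X2[None, :, g]) / rho[g]**2 * K

def d2k_dxg_dxh(X1, X2, eta, rho, g, h):
    # Eq. (11): mixed second derivative w.r.t. dimension g of the first
    # argument and dimension h of the second argument
    K = sq_exp_kernel(X1, X2, eta, rho)
    diff_g = X1[:, None, g] - X2[None, :, g]
    diff_h = X1[:, None, h] - X2[None, :, h]
    delta_gh = 1.0 if g == h else 0.0
    return (delta_gh / rho[g]**2 - diff_g * diff_h / (rho[g]**2 * rho[h]**2)) * K
```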

2.2 Monotonic Gaussian Processes.

In this work, we adopt the monotonic GP formulation from Riihimäki and Vehtari [15], which has been implemented in the GPstuff toolbox [23,24]. To constrain the classical GP to be monotonic, Riihimäki and Vehtari proposed a block covariance matrix structure similar to that of many multifidelity approaches, e.g., Refs. [25–27], and approximated the posterior using expectation propagation [28], which is arguably more efficient than Laplace's method. For the sake of completeness, we assume a zero-mean GP and briefly summarize the formulation. Readers are referred to the original work of Riihimäki and Vehtari [15] for further details. Roughly speaking, the crux of the approach is to augment the covariance matrix K using a block structure, as is done in multifidelity and gradient-enhanced GPs. The difference is that the augmented block encodes the binary classification of whether the GP is supposed to be monotonically increasing or decreasing at some locations. Naturally, the GP formulation lends itself to a binary classification problem via either linear logistic regression or probit regression, the latter derived from the cumulative distribution function of a standard normal distribution, $\Phi(z) = \int_{-\infty}^{z} \mathcal{N}(t \,|\, 0, 1)\, dt$.

Monotonicity is imposed at M inducing locations $X_m \in \mathbb{R}^{M \times D}$. At the location $\mathbf{x}^{(i)} \in X_m$, the derivative of f is non-negative with respect to the input dimension $d_i$. The key idea is to assume a probit likelihood at the location $\mathbf{x}^{(i)}$ as

$p\left( m_{d_i}^{(i)} \,\middle|\, \frac{\partial f^{(i)}}{\partial x_{d_i}^{(i)}} \right) = \Phi\left( \frac{1}{\nu} \frac{\partial f^{(i)}}{\partial x_{d_i}^{(i)}} \right)$  (14)

where

$\Phi(z) = \int_{-\infty}^{z} \mathcal{N}(t \,|\, 0, 1)\, dt$  (15)

is the cumulative distribution function of the standard normal distribution. Conceptually, imposing the probit likelihood is intrinsically similar to classification using a GP (cf. Section 3.6 of Ref. [21] and Kuss et al. [29] for binary classification {−1,+1} using GPs). Put simply, Eq. (14) models the probability of "penalized" monotonicity with the parameter ν, in the same way one would in binary classification [29], where ν = 10⁻⁶ is recommended [15].
Assume that at $X_m$, the function f is known to be monotonic. The joint prior for f and its derivatives f′ is given by

$p(\mathbf{f}, \mathbf{f}') = \mathcal{N}\left( \mathbf{f}_{\mathrm{joint}} \,\middle|\, \mathbf{0}, K_{\mathrm{joint}} \right)$  (16)

where

$K_{\mathrm{joint}} = \begin{bmatrix} K_{f,f} & K_{f,f'} \\ K_{f',f} & K_{f',f'} \end{bmatrix}$  (17)

Using the Bayes rule, the joint posterior is

$p(\mathbf{f}, \mathbf{f}' \,|\, \mathbf{y}, \mathbf{m}) = \frac{1}{Z}\, p(\mathbf{f}, \mathbf{f}')\, p(\mathbf{y} \,|\, \mathbf{f})\, p(\mathbf{m} \,|\, \mathbf{f}')$  (18)

where

$Z = \int p(\mathbf{f}, \mathbf{f}')\, p(\mathbf{y} \,|\, \mathbf{f})\, p(\mathbf{m} \,|\, \mathbf{f}')\, d\mathbf{f}\, d\mathbf{f}'$  (19)

Since the posterior is analytically intractable, local likelihood approximations are given by the expectation propagation (EP) algorithm, allowing the approximation of the posterior distribution in Eq. (18) as

$q(\mathbf{f}, \mathbf{f}' \,|\, \mathbf{y}, \mathbf{m}) = \frac{1}{Z_{\mathrm{EP}}}\, p(\mathbf{f}, \mathbf{f}')\, p(\mathbf{y} \,|\, \mathbf{f}) \prod_{i=1}^{M} t_i(\tilde{Z}_i, \tilde{\mu}_i, \tilde{\sigma}_i^2)$  (20)

where $t_i(\tilde{Z}_i, \tilde{\mu}_i, \tilde{\sigma}_i^2) = \tilde{Z}_i\, \mathcal{N}(f_i' \,|\, \tilde{\mu}_i, \tilde{\sigma}_i^2)$ are local likelihood approximations with site parameters $\tilde{Z}_i$, $\tilde{\mu}_i$, and $\tilde{\sigma}_i^2$ from the EP algorithm.
The approximate posterior is analytically tractable as a product of Gaussian distributions and can be simplified to

$q(\mathbf{f}, \mathbf{f}' \,|\, \mathbf{y}, \mathbf{m}) = \mathcal{N}\left( \mathbf{f}_{\mathrm{joint}} \,\middle|\, \boldsymbol{\mu}, \Sigma \right)$  (21)

where

$\boldsymbol{\mu} = \Sigma\, \tilde{\Sigma}_{\mathrm{joint}}^{-1} \tilde{\boldsymbol{\mu}}_{\mathrm{joint}}$  (22)

and

$\Sigma = \left( K_{\mathrm{joint}}^{-1} + \tilde{\Sigma}_{\mathrm{joint}}^{-1} \right)^{-1}$  (23)

where $\tilde{\boldsymbol{\mu}}_{\mathrm{joint}} = [\mathbf{y}, \tilde{\boldsymbol{\mu}}]$, $\tilde{\Sigma}_{\mathrm{joint}} = \mathrm{Diag}[\sigma^2 I, \tilde{\Sigma}]$, $\tilde{\boldsymbol{\mu}}$ is the vector of site means $\tilde{\mu}_i$, and $\tilde{\Sigma} = \mathrm{Diag}[\tilde{\sigma}_i^2]_{i=1}^{M}$. The approximation for the logarithm of the marginal likelihood is computed as
(24)
where $\mu_{-i}$ and $\sigma_{-i}^2$ are the parameters of the cavity distribution in EP. By introducing M monotonicity-inducing points, the cost complexity for optimizing the log-marginal likelihood increases from $\mathcal{O}(N^3)$ to $\mathcal{O}((N+M)^3)$. The posterior mean and posterior variance of the testing distribution are, respectively, given by

$\mu(\mathbf{x}_*) = K_{*,\mathrm{joint}} \left( K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}} \right)^{-1} \tilde{\boldsymbol{\mu}}_{\mathrm{joint}}$  (25)

and

$\sigma^2(\mathbf{x}_*) = K_{*,*} - K_{*,\mathrm{joint}} \left( K_{\mathrm{joint}} + \tilde{\Sigma}_{\mathrm{joint}} \right)^{-1} K_{\mathrm{joint},*}$  (26)
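The block structure of Eq. (17) and the probit likelihood of Eq. (14) are the two ingredients needed to reproduce the approach; a minimal sketch follows, reusing the kernel helpers from Sec. 2.1 and assuming monotonicity in a single input dimension d. The EP loop itself is omitted here; GPstuff [23,24] provides it in practice.

```python
import numpy as np
from scipy.stats import norm

def joint_covariance(X, Xm, eta, rho, d):
    # Assemble K_joint of Eq. (17): function values at the N training inputs X
    # and derivatives w.r.t. dimension d at the M inducing inputs Xm
    Kff = sq_exp_kernel(X, X, eta, rho)
    Kdf = dk_dxg(Xm, X, eta, rho, d)           # Cov[f'(Xm), f(X)], Eq. (10)
    Kdd = d2k_dxg_dxh(Xm, Xm, eta, rho, d, d)  # Cov[f'(Xm), f'(Xm)], Eq. (11)
    return np.block([[Kff, Kdf.T], [Kdf, Kdd]])

def monotonicity_likelihood(f_prime, nu=1e-6):
    # Eq. (14): probit probability that the derivative is non-negative;
    # smaller nu gives a sharper, more step-like penalty
    return norm.cdf(f_prime / nu)
```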

3 Numerical Examples

3.1 Example 1: Noisy Logistic Regression.

In this example, we consider the 1D monotonic logistic function

$y = \frac{1}{1 + e^{-x}} + \varepsilon$  (27)

where $\varepsilon \sim \mathcal{N}(0, 0.1^2)$, on [−3, 3] with ten samples, as shown in Fig. 1. Figure 1 compares the regular GP (dashed) and the monotonic GP (solid), where the training dataset is plotted as black dots and the testing dataset is plotted as magenta squares. The noise ɛ is substantial enough to corrupt the monotonicity of the underlying function. Even though the noise is homoscedastic, i.e., the variance of the noise is constant throughout the bounded domain, the regular GP does not capture the function correctly. The monotonic GP, on the other hand, enforces monotonicity despite the noise, which results in a better approximation of the true function.
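For reference, the training data for this example can be generated along the following lines; the logistic form of Eq. (27) and the random seed are our reading of the setup, so treat this as a sketch rather than the exact script behind Fig. 1.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed
x = np.sort(rng.uniform(-3.0, 3.0, 10))   # ten samples on [-3, 3]
y = 1.0 / (1.0 + np.exp(-x)) + rng.normal(0.0, 0.1, 10)  # eps ~ N(0, 0.1^2)
```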
Fig. 1
Comparison between the regular and monotonic GPs in the homoscedastic noisy logistic regression example. Training and testing data points are shown as black dots and magenta squares, respectively. By enforcing the monotonicity, the monotonic GP has a better approximation compared to the regular GP.

3.2 Example 2: Heteroscedastic Hall–Petch Relationship.

Following Counts et al. [30], we adopt the Hall–Petch relationship in polycrystalline materials from Fernandes and Vieira [31] with a heteroscedastic noise term whose variance grows with d,

$\sigma_Y = \sigma_0 + k_y d^{-1/2} + \varepsilon(d)$  (28)

where $\varepsilon(d) \sim \mathcal{N}(0,\, 2.2 \cdot 10^{-10} d^3)$, to model the finite-size effect of the representative volume element [6]. The grain size d is considered on the interval [15 μm, 350 μm]. For a fixed-size representative volume element, when d is large, the number of grains decreases, and therefore, the noise of σY increases. The same argument applies when d is small. The variance of the Monte Carlo estimator roughly decays at the rate of N⁻¹, where N is the number of grains or samples; in this hypothetical example, to examine the behavior of the monotonic GP, we simply model the volume of the grain as proportional to d³ and, therefore, consider a noise term with zero mean and $\mathcal{O}(d^3)$ variance. The variance of the noise term thus increases as d increases, which is consistent with what has been observed in the literature [6].
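A data-generating sketch of Eq. (28) follows; the Hall–Petch coefficients sigma_0 and k_y are hypothetical placeholders (the values from Fernandes and Vieira [31] should be substituted), while the noise variance follows the stated 2.2 · 10⁻¹⁰ d³.

```python
import numpy as np

rng = np.random.default_rng(0)
d = np.sort(rng.uniform(15.0, 350.0, 20))  # grain size d in micrometers
sigma_0, k_y = 70.0, 740.0                 # hypothetical coefficients
noise_std = np.sqrt(2.2e-10 * d**3)        # heteroscedastic: Var[eps] = O(d^3)
sigma_Y = sigma_0 + k_y / np.sqrt(d) + rng.normal(0.0, noise_std)
```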

Figure 2 shows the comparison between the regular and monotonic GPs with 20 training data points, plotted as black dots. The regular GP overfits the training dataset, resulting in an unphysical oscillatory behavior in the large-d region. The monotonic GP, on the other hand, correctly identifies the trend and monotonically decreases as d increases, which results in a better approximation compared to the regular GP.

Fig. 2
Comparison between the regular and monotonic GPs in the heteroscedastic setting. The noise ɛ(d) increases as d increases. It is observed that the regular GP is strongly influenced by the noise.

4 Case Study: Fatigue Life Prediction Under Multiaxial Loading

4.1 Dataset.

We adopt a subset from Karolczuk and Słoński [32,33], where the training and testing datasets are listed in Table 1. The training dataset is taken to be 12 data points with high σa, whereas 13 data points with low σa are used as the testing dataset. We are interested in investigating the extrapolation capability of GPs beyond their training regime, as well as their robustness against noise, which is fairly common in experimental materials science, particularly in fatigue, where stochasticity plays a nontrivial role. Much of the stochasticity can be traced back to the physics of fracture and fatigue, where void nucleation, void growth, and void coalescence all play important roles in the origin, development, and failure of a material. Due to this underlying stochasticity, the data sometimes show substantially noisy behavior despite a clear physical monotonic trend: the higher the stress amplitude is, the lower the fatigue life is. This behavior is typically modeled by the S–N equation

$\log N_f = A + B \log \sigma_a$  (29)

where A and B are material-dependent coefficients, σa is the stress amplitude, and Nf is the experimental fatigue life. Experimental materials science is resource-intensive [34]; thus, practically speaking, fatigue experimental data are scarce and may not be strictly monotonic, as shown in Karolczuk and Słoński [32]. In the present work, the input of the GPs is σa, and the output of the GPs is log Nexp.
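As an illustration of Eq. (29) (a sketch, not the procedure of Ref. [32]), the coefficients A and B can be estimated from the training half of Table 1 by ordinary least squares in log–log space:

```python
import numpy as np

# Training rows of Table 1 (S355N steel)
sigma_a = np.array([674., 558., 556., 507., 483., 505.,
                    498., 490., 484., 474., 469., 475.])
N_exp = np.array([2908., 8115., 10035., 17012., 19955., 20595.,
                  23780., 25913., 28045., 51430., 52000., 66200.])

# Fit log N = A + B log sigma_a (Eq. (29)); B is expected to be negative
B, A = np.polyfit(np.log(sigma_a), np.log(N_exp), deg=1)
print(f"A = {A:.2f}, B = {B:.2f}")
```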
Table 1

Training and testing fatigue datasets for S355N steel, adopted from Karolczuk and Słoński [32]

Training                      Testing
σa (MPa)    Nexp              σa (MPa)    Nexp
674         2908              427         77,730
558         8115              400         113,900
556         10,035            411         117,275
507         17,012            403         144,264
483         19,955            390         192,920
505         20,595            391         198,992
498         23,780            379         243,816
490         25,913            366         376,815
484         28,045            369         396,987
474         51,430            345         406,800
469         52,000            342         1,252,208
475         66,200            335         1,444,998
                              335         1,528,487

4.2 Results.

We implement and compare two GP variants: the regular GP and a monotonically constrained GP. Both are trained using a squared exponential kernel, and the hyper-parameters are optimized using the L-BFGS method [35] with a maximum of 6000 iterations. Figure 3 compares the accuracy of the two GPs against the testing dataset. The root-mean-square errors for the regular GP and the monotonic GP are 183.3026 and 2.0441, respectively. While the regular GP tends to underestimate the experimental data, the monotonic GP provides more accurate predictions. As shown in Fig. 4, outside the training domain, the regular GP is unable to capture the general trend and behaves wildly, even when interpolating the data points. On the contrary, the monotonic GP is able to capture the monotonic trend and extrapolate to a significant distance outside the training interval. It is important to note that the monotonic GP is only strictly monotonic within the training domain and may fail to maintain monotonicity far beyond the training regime. Indeed, in Fig. 4, we see that the monotonic GP fails to maintain monotonicity when σa ≤ 350 MPa. Figure 4 also shows a nonmonotonic behavior for the regular GP.
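The monotonicity violations reported here can be detected with a simple finite-difference scan of the posterior mean; the helper below is our sketch (predict_mean stands for any trained GP's mean function and is not part of the original study).

```python
import numpy as np

def monotonicity_violations(predict_mean, lo=300.0, hi=700.0, n=400):
    # Count grid cells where the predicted mean fatigue life *increases*
    # with stress amplitude; predict_mean maps an (n, 1) array of stress
    # amplitudes (MPa) to an (n,) array of predicted log-lives
    s = np.linspace(lo, hi, n)[:, None]
    mean = predict_mean(s)
    return int(np.sum(np.diff(mean) > 0.0))
```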

Fig. 3
Accuracy comparison of testing dataset predictions between the regular and monotonic GPs in the S355N fatigue dataset, where the monotonic GP outperforms the regular GP
Fig. 4
Comparison between the regular and monotonic GPs in the S355N fatigue dataset. μ±1σ confidence intervals are highlighted. It is observed that the monotonicity starts to disappear as the extrapolation range goes further to σa ≤ 375 MPa.

It is also observed that the posterior variance of the monotonic GP is significantly less than the posterior variance of the regular GP, using the same training dataset. This holds true for both interpolation and extrapolation cases. The main reason is that the monotonic GP uses an extra "pseudo" training dataset by imposing M inducing locations in the input domain, as described in Sec. 2.2, and a training dataset with more data points reduces the posterior variance.

5 Case Study: Potts Kinetic Monte Carlo for Grain Growth

5.1 Model Description.

The details of the Potts kinetic Monte Carlo (KMC) simulation for grain growth and its implementation in SPPARKS are described in Refs. [36,37] and are summarized here for the sake of completeness. In the grain growth simulation, the Potts model [38] is used to simulate curvature-driven grain growth. Grain microstructures are represented by an integer value stored at each pixel. During a time-step (referred to here as a Monte Carlo or MC step), pixels in the simulation are visited and probabilistically change their grain membership to that of a neighboring grain based on the Metropolis algorithm. The probability P of a successful change in grain site orientation is calculated as

$P = \begin{cases} \exp\left( \dfrac{-\Delta E}{k_B T_s} \right), & \Delta E > 0 \\ 1, & \Delta E \le 0 \end{cases}$  (30)

where E is the total grain boundary energy calculated by summing all the neighbor interaction energies, ΔE can be regarded as the activation energy, kB is the Boltzmann constant, and Ts is the simulation temperature. In the basic Potts model, the interaction energy between two pixels belonging to the same grain is zero, and E is incremented by one for each dissimilar neighbor. From Eq. (30), changes that decrease the system energy are preferred, and the total system energy monotonically decreases through grain coarsening. It is worth noting that the simulation temperature Ts is not the real system temperature: kBTs is an energy that defines the thermal fluctuation, i.e., noise, present in the KMC simulation [36]. Increasing the value of kBTs results in microstructures with increasing grain boundary roughness.
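The Metropolis rule of Eq. (30) reduces to a few lines; the sketch below is ours, not the SPPARKS source.

```python
import numpy as np

def accept_flip(delta_E, kB_Ts, rng):
    # Eq. (30): energy-decreasing flips are always accepted; energy-increasing
    # flips are accepted with probability exp(-dE / (kB * Ts))
    if delta_E <= 0.0:
        return True
    return rng.random() < np.exp(-delta_E / kB_Ts)
```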

5.2 Results.

Using SPPARKS as the forward model, we build a dataset with kBTs ∈ {0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85} at multiple snapshots in time t ∈ {0.000, 56.879, 106.899, 152.482, 206.331, 251.886, 306.530, 353.403, 406.482, 450.629, 502.591, 554.926, 603.345, 653.470, 706.620, 756.473, 800.112, 855.231, 901.232, 950.060, 1005.970, 1054.070, 1102.890, 1155.320, 1200.360, 1251.300, 1302.020, 1353.410, 1400.000} Monte Carlo steps. In this case study, the average grain size, measured in pixel², is used as the quantity of interest. Similar to Tran et al. [39], we apply a filter of 50 pixel² to eliminate small grains before computing the average grain size. The training and testing datasets are divided as follows: if kBTs ≤ 0.75 and t < 1000, the data point belongs to the training dataset; otherwise, it belongs to the testing dataset. Under this condition, a training dataset of 140 data points (the 7 values of kBTs ≤ 0.75 times the 20 snapshots with t < 1000) and a testing dataset of 92 data points are obtained. The inputs of the GP are (kBTs, t), and the output of the GP is the average grain size.
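The split can be expressed as follows (a sketch; rows is assumed to hold tuples collected from the SPPARKS runs):

```python
def split_spparks(rows):
    # rows: iterable of (kBTs, t, avg_grain_size) tuples
    train = [r for r in rows if r[0] <= 0.75 and r[1] < 1000.0]
    test = [r for r in rows if not (r[0] <= 0.75 and r[1] < 1000.0)]
    return train, test
```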

It is noted that the data are naturally monotonic, as grains grow in time, which is shown in Fig. 5. At the boundary between the training and testing datasets, at t = 950.060 Monte Carlo steps, for kBTs = 0.65, the average grain size is 1203.6605 pixel², whereas at t = 1005.970 Monte Carlo steps, for kBTs = 0.65, the average grain size is 1279.0099 pixel². This is clearly shown in Fig. 7, where the testing predictions are mostly above 1250 pixel².

Fig. 5
Grain growth simulation via kinetic Monte Carlo (SPPARKS): (a) SPPARKS at 50 MC step, (b) SPPARKS at 100 MC step, (c) SPPARKS at 200 MC step, and (d) SPPARKS at 350 MC step

Figure 6 compares the regular GP and the monotonic GP in terms of extrapolating beyond the training regime for kBTs = 0.65, showing a consistent behavior between the two. The posterior means of the regular GP and the monotonic GP are almost exactly the same. However, the posterior variance is significantly reduced in the monotonic GP, which can be explained by the M extra inducing points for monotonicity classification, as described in Sec. 2.2.

Fig. 6
Comparison between regular GP and monotonic GP in extrapolating the QoI in the SPPARKS grain growth simulation for kBTs = 0.65. Shaded regions correspond to 90% confidence intervals μ±1.645σ.

Figure 7 compares the regular GP and the monotonic GP in terms of accuracy, with error bars of μ±1.645σ, showing that the regular GP (R² = 0.9730) is slightly better than the monotonic GP (R² = 0.9661). This can be explained by the fact that imposing monotonicity on the GP formulation comes at a small accuracy cost.

Fig. 7
Comparison of testing dataset predictions between the regular and monotonic GPs in the kinetic Monte Carlo SPPARKS grain growth simulation. The output is the average grain size, whereas the inputs are kBTs and the Monte Carlo step t. An error bar of μ±1.645σ, which corresponds to a 90% confidence interval, is shown. The dashed line y = x is presented as a reference.

6 Case Study: Strain-Rate-Dependent Stress–Strain With Crystal Plasticity Finite Element

6.1 Model Description.

We adopt and extend the dataset from one of our previous studies [6], generated from Fe–22Mn–0.6C TWIP steel with the dislocation-density-based constitutive model. The material parameters are as described in Steinmetz et al. [40] and summarized in Section 6.2.3 and Tables 8 and 9 in [41]. The constitutive model was validated experimentally by Wong et al. [42], and for the sake of completeness, we briefly summarize the constitutive model here. The TWIP/TRIP steel constitutive model is parameterized in terms of dislocation density, ϱ, the dipole dislocation density, ϱdi, the twin volume fraction, ftw, and the ɛ-martensite volume fraction, ftr. A model for the plastic velocity gradient with contribution of mechanical twinning and phase transformation was developed in Kalidindi [43] and is given by

(31)
where χ = 1, …, Ntr indexes the Ntr transformation systems of ɛ-martensite with volume fraction ftr, $s_s^\alpha$ and $n_s^\alpha$ are unit vectors along the shear direction and shear plane normal of the Ns slip systems α, $s_{tw}^\beta$ and $n_{tw}^\beta$ are those of the Ntw twinning systems β, and $s_{tr}^\chi$ and $n_{tr}^\chi$ are those of the Ntr transformation systems χ. The Orowan equation models the shear rate on the slip system α as
(32)
where bs is the length of the slip Burgers vector, ν0 is a reference velocity, Qs is the activation energy for slip, kB is the Boltzmann constant, T is the temperature, τeff is the effective resolved shear stress, τsol is the solid solution strength, and 0 < ps ≤ 1 and 1 ≤ qs ≤ 2 are fitting parameters controlling the glide resistance profile. Blum and Eisenlohr [44] model the evolution of dislocation densities, particularly the generation of unipolar dislocation density and the formation of dislocation dipoles, respectively, as
(33)
(34)
where the glide distance below which two dislocations form a stable dipole is
(35)
d=Dabs is the distance below which two dislocations annihilate, and the dislocation climb velocity is
(36)
Here, D0 is the prefactor of self-diffusion coefficient, Vcl is the activation volume for climb, and Qcl is the activation energy for climb. Strain hardening is described in terms of a dislocation mean free path, where the mean free path is denoted by Γ. The mean free path for slip is modeled as
(37)
where D is the average grain size and
(38)
(39)
Here, $i_s$ is a fitting parameter, ttw is the average twin thickness, and ttr is the average ɛ-martensite thickness. The mean free path for twinning and that for transformation are computed, respectively, as
(40)
(41)
with $i_{tw}$ and $i_{tr}$ being fitting parameters. The nucleation rates for twins and ɛ-martensite are $\dot{N} = \dot{N}_0\, P_{ncs}\, P$, where the probability P that a nucleus bows out to form a twin or ɛ-martensite is
(42)
where $p_{tw}$ and $p_{tr}$ are fitting parameters. For more details, readers are referred to Roters et al. [41] (cf. Section 6.2.3) and Wong et al. [42].

Figure 8 shows the equiaxed microstructure used in this case study, which has been generated using DREAM.3D [45] with random crystallographic textures. DAMASK [41,46] is used as the crystal plasticity finite element framework, with PETSc [47,48] as the underlying library for the spectral solver.

Fig. 8
A crystal plasticity finite element example by DAMASK: (a) the microstructure statistical volume element with equiaxed grains investigated in this case study and (b) a snapshot of von Mises stress for D = 10⁻⁶ m, ε̇ = 10⁻⁴ s⁻¹ before homogenization

6.2 Results.

In this study, we focus on the effect of strain-rate on the stress–strain curve for various average grain sizes. The effect of strain-rate has been experimentally validated in Benzing et al. [49], and some numerical studies using crystal plasticity finite element models have been done as well, such as those found in Singh et al. [50]. The inputs of the GP are the average grain size, the strain-rate, and the strain, i.e., $(D, \dot{\varepsilon}, \varepsilon)$, and the output of the GP is the stress σ. We independently vary the average grain size D ∈ {1.0 · 10⁻⁶, 2.5 · 10⁻⁶, 5.0 · 10⁻⁶, 1.0 · 10⁻⁵, 2.5 · 10⁻⁵} m, as well as the strain-rate $\dot{\varepsilon} \in \{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}, 10^{2}, 10^{3}, 10^{4}\}$ s⁻¹. Numerically unstable results are removed in post-processing. Some examples of homogenized stress–strain responses are shown in Fig. 9 for visualization purposes.

Fig. 9
Homogenized stress–strain response with different $(D, \dot{\varepsilon}, \varepsilon)$ for the fixed microstructure shown in Fig. 8. σ is monotonically increasing when either D decreases, or ε̇ increases, or ɛ increases (within the specified ɛ domain)

The training and testing datasets are divided as follows: if the average grain size D is less than 5 · 10⁻⁶ m or if the strain-rate $\dot{\varepsilon} \le 10^{1}$ s⁻¹, the data point belongs to the training dataset; otherwise, it belongs to the testing dataset. Under this splitting condition, a training dataset of 340 data points and a testing dataset of 120 data points are obtained.

Figure 10 shows the accuracy comparison between the regular GP and the monotonic GP. The R² coefficient for the regular GP is 0.9895, whereas the R² coefficient for the monotonic GP is slightly lower, at 0.9860. Both GPs perform very well, but the monotonic GP is slightly worse. The regular GP predictions are also more consistent in the large-stress regime.

Fig. 10
Comparison of testing dataset predictions between the regular and monotonic GPs in the crystal plasticity finite element code DAMASK. The output is the homogenized stress σ, whereas the inputs are the average grain size, strain-rate, and strain, respectively, $(D, \dot{\varepsilon}, \varepsilon)$.

7 Discussion and Conclusion

In this paper, we compare the monotonic GP against the regular GP for various materials datasets, namely fatigue, grain growth, and homogenized stress–strain behaviors. The fatigue dataset is the only experimental dataset with noise, whereas the other two datasets, i.e., grain growth and homogenized stress–strain, are computationally generated by SPPARKS and DAMASK, respectively. We test out their predictive capabilities in both interpolation and extrapolation, but are mainly concerned with extrapolation.

For the fatigue example featured in Sec. 4, the experimental data do not adhere strictly to theoretical derivations. This is where the monotonic GP outperforms the regular GP significantly. In the other two examples, which feature datasets generated by computational models, both GPs perform very closely to each other, even though the regular GP performs slightly better than the monotonic GP. This can be explained by the fact that imposing monotonicity on the regular GP comes at a small cost in accuracy and numerical performance. It is also noted that the monotonic behavior of the monotonic GP is only guaranteed within the training domain. Beyond the training regime, the monotonicity effect imposed by introducing M inducing points fades away: the further the monotonic GP goes beyond the training regime, the more likely the GP is to violate the monotonicity constraints. Last but not least, the posterior variances of the monotonic GP are significantly smaller than the posterior variances of the regular GP, while the posterior means are roughly the same.

Constructing an accurate and physically faithful surrogate model is important for uncertainty quantification [51]. Computationally efficient surrogate models are commonly used, for example, in sampling [3,4] or in solving stochastic inverse problems in the structure–property relationship [6,7]. Monotonicity is common in materials science, as in the famous Hall–Petch relationship [2], i.e., $\sigma_y = \sigma_0 + k / \sqrt{D}$. Exploiting such physical constraints and insights may improve the efficiency of machine learning in the future.

The objective of this work is to examine the performance of the monotonic GP compared to the regular GP in different settings, with an emphasis on materials science. Our conclusion is that the monotonic GP is best suited for sparse and noisy settings, which are fairly common in experimental materials science, where the data acquisition process is expensive (in terms of both time and labor) and the data are often corrupted by noise induced by the finite-size effect of microstructures.

It is worth noting that the monotonic GP indeed performs slightly worse in the second and third case studies. In these case studies, both the training and testing datasets are already monotonic. With the use of a linear basis function for the global trend, the regular GP correctly picks up the monotonic behavior and, therefore, is able to make accurate predictions. While the monotonic GP shares the same settings as the regular GP, there are some approximations involved in constructing the joint covariance matrix. In ideally monotonic cases such as the second and third examples, the covariance matrix should asymptotically reduce to the block-diagonal matrix as ν → 0,

$K_{\mathrm{joint}} \to \begin{bmatrix} K_{f,f} & 0 \\ 0 & K_{f',f'} \end{bmatrix}$  (43)
which points to how the step function is approximated. In practice, a small numerical degradation is expected as ν cannot be smaller than machine epsilon. In the monotonic GP approach adopted in this paper, the step function is approximated by a probit function described in Eq. (14). It might be possible to increase the performance of the monotonic GP by even further reducing the ν parameter. This may lead to an improvement in performance for the monotonic GP with monotonic datasets, but might also lead to other numerical conditioning problems with respect to the inverse of the joint covariance matrix Kjoint. We conclude that for datasets that are already monotonic, applying a monotonic GP formulation may be unnecessary and may lead to a slight degradation in performance. For datasets that are sparse and noisy, imposing monotonicity when it is appropriate may result in a substantial improvement in performance.
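The trade-off can be seen numerically; the snippet below (our illustration, not part of the original study) evaluates the probit of Eq. (14) for several values of ν and shows it sharpening toward a step function as ν shrinks.

```python
import numpy as np
from scipy.stats import norm

f_prime = np.array([-1e-3, 0.0, 1e-3])  # sample derivative values
for nu in (1e-2, 1e-4, 1e-6):
    # Phi(f'/nu) approaches a step function as nu -> 0; in EP this
    # sharpening can degrade the conditioning of the joint covariance
    print(nu, norm.cdf(f_prime / nu))
```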

Acknowledgment

This work was supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy’s National Nuclear Security Administration under Contract DE-NA0003525.

Conflict of Interest

There are no conflicts of interest.

Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.

Nomenclature

x* = testing input
f* = testing output
i, j = dummy data point indices
d, g, h = dummy dimensionality indices
f = noiseless output f(x)
f′ = first derivative of f
x = [x1, …, xD]: training input of dimensionality D
y = noisy observation y(x) = f(x) + ɛ, ε ∼ N(0, σ²)
D = input dimensionality
M = number of inducing data points for derivatives
N = number of data points
X = {x(i)}, i = 1, …, N: training dataset of size N, X ∈ ℝ^(N×D)
Xm = derivative inducing points, Xm ∈ ℝ^(M×D)

References

1. National Science and Technology Council (US), 2011, Materials Genome Initiative for Global Competitiveness, Executive Office of the President, National Science and Technology Council, Washington, DC.
2. Cordero, Z. C., Knight, B. E., and Schuh, C. A., 2016, "Six Decades of the Hall–Petch Effect—A Survey of Grain-Size Strengthening Studies on Pure Metals," Int. Mater. Rev., 61(8), pp. 495–512.
3. Tallman, A. E., Stopka, K. S., Swiler, L. P., Wang, Y., Kalidindi, S. R., and McDowell, D. L., 2019, "Gaussian-Process-Driven Adaptive Sampling for Reduced-Order Modeling of Texture Effects in Polycrystalline Alpha-Ti," JOM, 71(8), pp. 2646–2656.
4. Tallman, A. E., Swiler, L. P., Wang, Y., and McDowell, D. L., 2020, "Uncertainty Propagation in Reduced Order Models Based on Crystal Plasticity," Comput. Methods Appl. Mech. Eng., 365, p. 113009.
5. Yabansu, Y. C., Iskakov, A., Kapustina, A., Rajagopalan, S., and Kalidindi, S. R., 2019, "Application of Gaussian Process Regression Models for Capturing the Evolution of Microstructure Statistics in Aging of Nickel-Based Superalloys," Acta Mater., 178, pp. 45–58.
6. Tran, A., and Wildey, T., 2020, "Solving Stochastic Inverse Problems for Property–Structure Linkages Using Data-Consistent Inversion and Machine Learning," JOM, 73(1), pp. 72–89.
7. Tran, A., and Wildey, T., 2021, "Solving Stochastic Inverse Problems for Property–Structure Linkages Using Data-Consistent Inversion," TMS 2021 150th Annual Meeting & Exhibition, Virtual, Springer, pp. 1–8.
8. Tran, A., Tranchida, J., Wildey, T., and Thompson, A. P., 2020, "Multi-Fidelity Machine-Learning With Uncertainty Quantification and Bayesian Optimization for Materials Design: Application to Ternary Random Alloys," J. Chem. Phys., 153(7), p. 074705.
9. Khatamsaz, D., Molkeri, A., Couperthwaite, R., James, J., Arróyave, R., Allaire, D., and Srivastava, A., 2021, "Efficiently Exploiting Process–Structure–Property Relationships in Material Design by Multi-Information Source Fusion," Acta Mater., 206, p. 116619.
10. Fernández-Godino, M. G., Balachandar, S., and Haftka, R., 2018, "On the Use of Symmetries in Building Surrogate Models," ASME J. Mech. Des., 141(6), p. 061402.
11. Jidling, C., Wahlström, N., Wills, A., and Schön, T. B., 2017, "Linearly Constrained Gaussian Processes," arXiv preprint arXiv:1703.00787.
12. Agrell, C., 2019, "Gaussian Processes With Linear Operator Inequality Constraints," arXiv preprint arXiv:1901.03134.
13. Lange-Hegermann, M., 2021, "Linearly Constrained Gaussian Processes With Boundary Conditions," International Conference on Artificial Intelligence and Statistics, Virtual, PMLR, pp. 1090–1098.
14. Swiler, L. P., Gulian, M., Frankel, A. L., Safta, C., and Jakeman, J. D., 2020, "A Survey of Constrained Gaussian Process Regression: Approaches and Implementation Challenges," J. Mach. Learn. Model. Comput., 1(2), pp. 119–156.
15. Riihimäki, J., and Vehtari, A., 2010, "Gaussian Processes With Monotonicity Information," Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, Sardinia, Italy, May 13–15, pp. 645–652.
16. Golchi, S., Bingham, D. R., Chipman, H., and Campbell, D. A., 2015, "Monotone Emulation of Computer Experiments," SIAM/ASA J. Uncertainty Quantif., 3(1), pp. 370–392.
17. Ustyuzhaninov, I., Kazlauskaite, I., Ek, C. H., and Campbell, N., 2020, "Monotonic Gaussian Process Flows," International Conference on Artificial Intelligence and Statistics, Virtual, PMLR, pp. 3057–3067.
18. Pensoneault, A., Yang, X., and Zhu, X., 2020, "Nonnegativity-Enforced Gaussian Process Regression," Theor. Appl. Mech. Lett., 10(3), pp. 182–187.
19. Tan, M. H. Y., 2017, "Monotonic Metamodels for Deterministic Computer Experiments," Technometrics, 59(1), pp. 1–10.
20. Chen, Y., Hosseini, B., Owhadi, H., and Stuart, A. M., 2021, "Solving and Learning Nonlinear PDEs With Gaussian Processes," J. Comput. Phys., 447, p. 110668.
21. Rasmussen, C. E., and Williams, C. K. I., 2006, Gaussian Processes for Machine Learning, MIT Press, Cambridge, MA.
22. Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., and de Freitas, N., 2016, "Taking the Human Out of the Loop: A Review of Bayesian Optimization," Proc. IEEE, 104(1), pp. 148–175.
23. Vanhatalo, J., Riihimäki, J., Hartikainen, J., Jylänki, P., Tolvanen, V., and Vehtari, A., 2012, "Bayesian Modeling With Gaussian Processes Using the GPstuff Toolbox," arXiv preprint arXiv:1206.5754.
24. Vanhatalo, J., Riihimäki, J., Hartikainen, J., Jylänki, P., Tolvanen, V., and Vehtari, A., 2013, "GPstuff: Bayesian Modeling With Gaussian Processes," J. Mach. Learn. Res., 14, pp. 1175–1179.
25. Tran, A., Wildey, T., and McCann, S., 2020, "sMF-BO-2CoGP: A Sequential Multi-Fidelity Constrained Bayesian Optimization for Design Applications," ASME J. Comput. Inf. Sci. Eng., 20(3), p. 031007.
26. Yang, X., Zhu, X., and Li, J., 2020, "When Bifidelity Meets CoKriging: An Efficient Physics-Informed Multifidelity Method," SIAM J. Sci. Comput., 42(1), pp. A220–A249.
27. Xiao, M., Zhang, G., Breitkopf, P., Villon, P., and Zhang, W., 2018, "Extended Co-Kriging Interpolation Method Based on Multi-Fidelity Data," Appl. Math. Comput., 323, pp. 120–131.
28. Minka, T. P., 2001, "Expectation Propagation for Approximate Bayesian Inference," Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, Seattle, WA, Aug. 2–5, AUAI Press, pp. 362–369.
29. Kuss, M., Rasmussen, C. E., and Herbrich, R., 2005, "Assessing Approximate Inference for Binary Gaussian Process Classification," J. Mach. Learn. Res., 6(10), pp. 1679–1704.
30. Counts, W. A., Braginsky, M. V., Battaile, C. C., and Holm, E. A., 2008, "Predicting the Hall–Petch Effect in Fcc Metals Using Non-Local Crystal Plasticity," Int. J. Plast., 24(7), pp. 1243–1263.
31. Fernandes, J., and Vieira, M., 2000, "Further Development of the Hybrid Model for Polycrystal Deformation," Acta Mater., 48(8), pp. 1919–1930.
32. Karolczuk, A., and Słoński, M., 2022, "Application of the Gaussian Process for Fatigue Life Prediction Under Multiaxial Loading," Mech. Syst. Signal Process., 167, p. 108599.
33. Karolczuk, A., and Kluger, K., 2020, "Application of Life-Dependent Material Parameters to Lifetime Calculation Under Multiaxial Constant- and Variable-Amplitude Loading," Int. J. Fatigue, 136, p. 105625.
34. Arróyave, R., and McDowell, D. L., 2019, "Systems Approaches to Materials Design: Past, Present, and Future," Annu. Rev. Mater. Res., 49(1), pp. 103–126.
35. Zhu, C., Byrd, R. H., Lu, P., and Nocedal, J., 1997, "Algorithm 778: L-BFGS-B: Fortran Subroutines for Large-Scale Bound-Constrained Optimization," ACM Trans. Math. Softw. (TOMS), 23(4), pp. 550–560.
36. Garcia, A. L., Tikare, V., and Holm, E. A., 2008, "Three-Dimensional Simulation of Grain Growth in a Thermal Gradient With Non-Uniform Grain Boundary Mobility," Scr. Mater., 59(6), pp. 661–664.
37. Plimpton, S., Battaile, C., Chandross, M., Holm, L., Thompson, A., Tikare, V., Wagner, G., et al., 2009, "Crossing the Mesoscale No-Man's Land Via Parallel Kinetic Monte Carlo," Sandia Report SAND2009-6226.
38. Anderson, M., Grest, G., and Srolovitz, D., 1989, "Computer Simulation of Normal Grain Growth in Three Dimensions," Philos. Mag. B, 59(3), pp. 293–329.
39. Tran, A., Mitchell, J. A., Swiler, L. P., and Wildey, T., 2020, "An Active-Learning High-Throughput Microstructure Calibration Framework for Process–Structure Linkage in Materials Informatics," Acta Mater., 194, pp. 80–92.
40. Steinmetz, D. R., Jäpel, T., Wietbrock, B., Eisenlohr, P., Gutierrez-Urrutia, I., Saeed-Akbari, A., Hickel, T., Roters, F., and Raabe, D., 2013, "Revealing the Strain-Hardening Behavior of Twinning-Induced Plasticity Steels: Theory, Simulations, Experiments," Acta Mater., 61(2), pp. 494–510.
41. Roters, F., Diehl, M., Shanthraj, P., Eisenlohr, P., Reuber, C., Wong, S. L., Maiti, T., et al., 2019, "DAMASK—The Düsseldorf Advanced Material Simulation Kit for Modeling Multi-Physics Crystal Plasticity, Thermal, and Damage Phenomena From the Single Crystal Up to the Component Scale," Comput. Mater. Sci., 158, pp. 420–478.
42. Wong, S. L., Madivala, M., Prahl, U., Roters, F., and Raabe, D., 2016, "A Crystal Plasticity Model for Twinning- and Transformation-Induced Plasticity," Acta Mater., 118, pp. 140–151.
43. Kalidindi, S. R., 1998, "Incorporation of Deformation Twinning in Crystal Plasticity Models," J. Mech. Phys. Solids, 46(2), pp. 267–290.
44. Blum, W., and Eisenlohr, P., 2009, "Dislocation Mechanics of Creep," Mater. Sci. Eng. A, 510, pp. 7–13.
45. Groeber, M. A., and Jackson, M. A., 2014, "DREAM.3D: A Digital Representation Environment for the Analysis of Microstructure in 3D," Integr. Mater. Manuf. Innovation, 3(1), p. 5.
46. Diehl, M., Groeber, M., Haase, C., Molodov, D. A., Roters, F., and Raabe, D., 2017, "Identifying Structure–Property Relationships Through DREAM.3D Representative Volume Elements and DAMASK Crystal Plasticity Simulations: An Integrated Computational Materials Engineering Approach," JOM, 69(5), pp. 848–855.
47. Abhyankar, S., Brown, J., Constantinescu, E. M., Ghosh, D., Smith, B. F., and Zhang, H., 2018, "PETSc/TS: A Modern Scalable ODE/DAE Solver Library," arXiv preprint arXiv:1806.01437.
48. Balay, S., Abhyankar, S., Adams, M., Brown, J., Brune, P., Buschelman, K., Dalcin, L., et al., 2019, "PETSc Users Manual."
49. Benzing, J. T., Poling, W. A., Pierce, D. T., Bentley, J., Findley, K. O., Raabe, D., and Wittig, J. E., 2018, "Effects of Strain Rate on Mechanical Properties and Deformation Behavior of an Austenitic Fe–25Mn–3Al–3Si TWIP-TRIP Steel," Mater. Sci. Eng. A, 711, pp. 78–92.
50. Singh, L., Vohra, S., and Sharma, M., 2021, "Investigation of Strain Rate Behavior of Aluminium and AA2024 Using Crystal Plasticity," Mater. Today: Proc., 50, pp. 2345–2350.
51. Acar, P., 2021, "Recent Progress of Uncertainty Quantification in Small-Scale Materials Science," Prog. Mater. Sci., 117, p. 100723.