Physics-constrained non-Gaussian probabilistic learning on manifolds

Significance 

Machine learning revolves around empirical models, such as kernels or neural networks (NN), that require big data and efficient algorithms for their identification and training. Constrained learning is a technique for improving efficiency in such models, and the treatment of constraints in learning algorithms remains an important and active research topic. Noteworthy research is available: approaches such as Bayesian updating have provided rational frameworks for integrating data into predictive models and have been successfully adapted to situations where likelihoods are not readily available, either because of expense or because of the nature of the available information. In general, the relevant information is available mainly in the form of sample statistics rather than raw data or sample-wise constraints. To circumvent this shortfall of the Bayesian framework, alternative approaches based on the Kullback-Leibler divergence have been reported and extensively used, particularly to impose constraints in the framework of learning with statistical models. In most instances, however, large data sets are not available, and neural networks cannot be trained as desired. Researchers have therefore turned to methods designed for small data sets; nonetheless, small-data problems have been seen to share the conceptual and computational challenges of "big data", with further complications arising from the scarcity of evidence and the necessity to extract the most knowledge, with quantifiable confidence, from scarce data.
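The Kullback-Leibler idea mentioned above can be made concrete with a small sketch: given a prior distribution, find the closest distribution (in KL divergence) that satisfies a moment constraint. The toy discrete support, the uniform prior, and the target mean of 0.7 below are illustrative assumptions, not values from the paper; the minimum cross-entropy solution is an exponential tilting of the prior, with the Lagrange multiplier found by a one-dimensional root solve.

```python
import numpy as np
from scipy.optimize import brentq

# Discrete toy support and a uniform prior p0 over it (assumed for illustration).
x = np.linspace(0.0, 1.0, 11)
p0 = np.full_like(x, 1.0 / x.size)

def tilted(lam):
    """Minimum cross-entropy solution: p_i ∝ p0_i * exp(lam * x_i)."""
    w = p0 * np.exp(lam * x)
    return w / w.sum()

def moment_gap(lam, target):
    """Residual of the moment constraint E[X] = target."""
    return tilted(lam) @ x - target

target_mean = 0.7  # hypothetical constraint (prior mean is 0.5)
lam = brentq(moment_gap, -50.0, 50.0, args=(target_mean,))
p = tilted(lam)
print(p @ x)  # ≈ 0.7: the constraint is enforced exactly
```

The same dual structure, with one multiplier per constraint, is what scales this principle to the high-dimensional setting treated in the paper.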

Therefore, a need arises to learn from a high-dimensional small data set without invoking the Gaussian assumption. Recently, probabilistic learning on manifolds (PLoM) was reported, and complementary developments and applications with validations have also been demonstrated. This technique can address the problem at hand by improving the knowledge available from only a small number of expensive evaluations of a computational model, so as to solve problems, such as nonconvex optimization under uncertainties with nonlinear constraints, that would otherwise require a large number of expensive evaluations, which, in general, is not feasible. In this view, Professor Christian Soize from Université Gustave Eiffel, France, in collaboration with Professor Roger Ghanem at the University of Southern California, Los Angeles, proposed an extension of PLoM in which not only is an initial data set given, but, in addition, constraints are specified in the form of statistics synthesized from experimental data, from theoretical considerations, or from numerical simulations. Their work is currently published in the International Journal for Numerical Methods in Engineering.

The two researchers considered a non-Gaussian random vector whose unknown probability distribution had to satisfy constraints. Their technique constructed a generator by combining PLoM with the classical Kullback-Leibler minimum cross-entropy principle. The resulting optimization problem was reformulated using Lagrange multipliers associated with the constraints, and the researchers computed the optimal values of the multipliers with an efficient iterative algorithm. At each iteration, the Markov chain Monte Carlo (MCMC) algorithm developed for PLoM was used, which consists in solving an Itô stochastic differential equation projected on a diffusion-maps basis.
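The structure of that iteration can be sketched in one dimension. The sketch below is a simplified stand-in, not the authors' algorithm: it uses an unadjusted Langevin sampler in place of the diffusion-maps-projected Itô SDE, a standard Gaussian reference density, and a single hypothetical mean constraint E[X] = 1, for which the exact Lagrange multiplier is -1.

```python
import numpy as np

rng = np.random.default_rng(0)

def langevin_samples(lam, n=20000, dt=0.1):
    """Unadjusted Langevin sampler for p(x) ∝ exp(-x**2/2 - lam*x).
    A simplified 1-D stand-in for PLoM's projected Ito SDE / MCMC step."""
    x, out = 0.0, np.empty(n)
    for i in range(n):
        # Euler-Maruyama step: drift -grad V(x) with V = x**2/2 + lam*x.
        x += -dt * (x + lam) + np.sqrt(2.0 * dt) * rng.normal()
        out[i] = x
    return out

target_mean = 1.0          # hypothetical constraint E[X] = 1
lam, step = 0.0, 0.5
for _ in range(30):        # iterative update of the Lagrange multiplier
    gap = langevin_samples(lam).mean() - target_mean
    lam += step * gap      # ascent step on the dual problem
print(lam)                 # close to -1, the exact multiplier here
```

Each outer iteration re-samples the tilted density with the current multiplier and nudges the multiplier by the constraint residual, mirroring the paper's alternation between MCMC sampling and multiplier updates.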

The authors reported that the method and the algorithm were efficient and allowed probabilistic models to be constructed for high-dimensional problems from small initial data sets, with an arbitrary number of specified constraints. Of the two sample applications built, the first was sufficiently simple and easy to reproduce, whereas the second concerned a stochastic elliptic boundary value problem in high dimension.

In summary, the study introduced a methodology that extends probabilistic learning on manifolds from a small data set to the case in which constraints are imposed, during the learning process, on a subset of quantities of interest (QoI). Interestingly, the researchers noted that the methodology can handle more general constraints than second-order statistical moments. Of much significance, the proposed approach allows for the analysis of non-Gaussian cases in high dimension involving functional inputs and outputs. In a statement to Advances in Engineering, Professor Christian Soize further pointed out that their iterative algorithm was very robust and appeared to converge exponentially with respect to the number of iterations.

Probabilistic machine learning for the small-data challenge in computational sciences

Illustration of the loss of concentration with a classical MCMC generator, and of the efficiency of probabilistic learning on manifolds, which preserves the concentration and avoids the scattering. Figure 1 (left) displays the N = 400 given points of the initial dataset, for which the realizations of the random variable X = (X1, X2, X3) are concentrated around a helix. Figure 1 (center) shows M = 8,000 additional realizations of X generated with a classical MCMC, for which the concentration is lost. Figure 1 (right) shows M = 8,000 additional realizations of X generated with probabilistic learning on manifolds, for which the concentration is preserved.

Figure 1 (from doi:10.1016/j.jcp.2016.05.044). Left: N = 400 points (blue symbols) of the initial dataset. Center: M = 8,000 additional realizations (red symbols) generated with a classical MCMC algorithm. Right: M = 8,000 additional realizations (red symbols) generated with probabilistic learning on manifolds.
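A dataset like the one in Figure 1, together with the diffusion-maps basis on which PLoM confines its sampler, can be sketched as follows. The helix parametrization, noise level, and kernel bandwidth are assumptions for illustration only, and the normalization here is the plain row-stochastic diffusion-maps construction, which differs in detail from the paper's scaled formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Initial dataset: N = 400 points concentrated around a helix, as in
# Figure 1 (parametrization and noise level are assumed, not from the paper).
N = 400
t = rng.uniform(0.0, 4.0 * np.pi, N)
X = np.column_stack([np.cos(t), np.sin(t), t / (4.0 * np.pi)])
X += 0.05 * rng.normal(size=X.shape)

# Diffusion-maps basis: eigenvectors of a row-stochastic kernel matrix,
# the reduced basis that keeps the MCMC near the data manifold.
eps = 0.5                                   # kernel bandwidth (assumed)
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-d2 / (4.0 * eps))               # Gaussian kernel matrix
P = K / K.sum(axis=1, keepdims=True)        # transition matrix, rows sum to 1
eigvals, eigvecs = np.linalg.eig(P)
order = np.argsort(-eigvals.real)
g = eigvecs[:, order].real                  # diffusion-maps basis vectors
print(np.allclose(g[:, 0] / g[0, 0], 1.0))  # True: leading vector is constant
```

Retaining only the first few columns of `g` gives the low-dimensional basis onto which the Itô SDE is projected, which is why the generated samples in Figure 1 (right) stay concentrated around the helix.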

About the author

Professor Christian Soize

Professional history. PhD from the Univ. Pierre et Marie Curie (1979), researcher at ONERA (French Aerospace lab) from 1981 to 2001, professor in Mechanics at University Paris-Est Marne-la-Vallée from 2001 to 2016, presently Professor Emeritus at University Gustave Eiffel (new name of Paris-Est Marne-la-Vallée).

Scientific Awards. Noury Prize from the French Academy of Sciences (1985), Research Award in Stochastic Dynamics from IASSAR (2001), ASA Fellow (2001), Senior Research Prize awarded by EASD-European Association of Structural Dynamics (2011), IACM Computational Mechanics Award, delivered at WCCM New York (2018).

Distinctions. Chevalier dans l’Ordre National du Mérite (2015), Officier dans l’Ordre des Palmes Académiques (2016).

Publications.
Scientific papers in refereed international journals: 238
Communications in conferences: 417
Book chapters: 15
Books: 9, among which
– Mathematics of Random Phenomena (in collaboration), Reidel, 1986.
– The Fokker-Planck Equation for Stochastic Dynamical Systems and its Explicit Steady State Solutions, World Scientific, 1994.
– Structural Acoustics and Vibration (in collaboration), Academic Press, 1998.
– Stochastic Models of Uncertainties in Computational Mechanics, ASCE, 2012.
– Advanced Computational Vibroacoustics (in collaboration), Cambridge Univ. Press, 2014.
– Uncertainty Quantification. Springer, 2017.

Main fields of research.
– Uncertainty quantification and scientific machine learning using probability theory and mathematical statistics.
– Modeling and numerical simulation in dynamics and vibration of complex mechanical systems, in elastoacoustics, vibroacoustics, and coupled systems using deterministic and probabilistic approaches.
– Stochastic approach in micromechanics and multiscale mechanics of heterogeneous materials.

Major accomplishments.
– Theory for the medium-frequency range in computational dynamics.
– Fuzzy Structure Theory based on probabilistic modeling in computational dynamics for complex mechanical systems.
– Nonparametric probabilistic approach of model uncertainties based on Random Matrix Theory and Information Theory in computational mechanics for Uncertainty Quantification.
– Probabilistic learning on manifolds in machine learning for nonconvex stochastic optimization and statistical inverse problems in high dimension.

Email:  [email protected]
Google Scholar

Reference

Christian Soize, Roger Ghanem. Physics-constrained non-Gaussian probabilistic learning on manifolds. International Journal for Numerical Methods in Engineering 2020;121:110–145.

