September 27, 2022

Global Sensitivity Analysis Methods for Identifying Key Parameters

In many real-world applications involving production data, simulated data, or data from experiments, one is interested in identifying the key input parameters, i.e., those (groups of) parameters that have a high influence on the output. 

Sensitivity analysis methods address this task by quantifying how the uncertainty in the output can be attributed to variation in the inputs. Global sensitivity analysis (GSA) methods consider variation over the entire input space, leading to an overall analysis of the importance of each parameter. They can account for interactions between parameters and, unlike local methods, do not depend on the choice of a nominal parameter configuration around which sensitivity is measured.

A concise comparison of GSA methods was recently accepted for publication in IEEE Access (preprint: “A Comparison of Global Sensitivity Analysis Methods for Explainable AI with an Application in Genomic Prediction” by Bas van Stein, Elena Raponi, Zahra Sadeghi, Niek Bouman, Roeland van Ham, and Thomas Bäck). The paper provides an introduction to these techniques and an overview of their different groups, including

  1. Variance-based methods: Sobol, Fourier Amplitude Sensitivity Test (FAST), Random Balance Design FAST (RBD-FAST).
  2. Derivative-based methods: Morris, derivative-based global sensitivity measures (DGSM).
  3. Density-based methods: DELTA, PAWN.
  4. Model-based methods: linear models, random forest, Shapley, SHAP, and TreeSHAP.
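As an illustration of the variance-based group, the following is a minimal numpy sketch of first-order Sobol indices using the common pick-freeze (Saltelli) estimator; the toy linear test function and sample size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sobol_first_order(f, d, n, rng):
    """Estimate first-order Sobol indices S_i = V_i / Var(Y) with the
    pick-freeze (Saltelli) scheme on uniform [0, 1] inputs."""
    A = rng.random((n, d))            # two independent input sample sets
    B = rng.random((n, d))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    S = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]           # A with column i taken from B
        # Saltelli-style estimator of the partial variance V_i
        S[i] = np.mean(fB * (f(ABi) - fA)) / var
    return S

# Toy additive function: x1 dominates, x3 barely matters.
f = lambda X: 4 * X[:, 0] + 2 * X[:, 1] + 1 * X[:, 2]

rng = np.random.default_rng(0)
S = sobol_first_order(f, d=3, n=20000, rng=rng)
print(S)  # roughly a_i^2 / sum(a_j^2) = [0.76, 0.19, 0.05]
```

For this additive function the first-order indices sum to (approximately) one; interactions would show up as a gap between first-order and total-order indices.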

The methods are compared on artificial test problems with respect to robustness and accuracy. Concerning robustness, it turns out that only a few methods can deal with small sample sizes (number of data points) as the dimensionality (number of parameters) grows. The Morris method is one of them, showing quite stable performance. Concerning accuracy, an experiment is performed in which a number of dummy parameters (having no influence on the output) must be correctly identified as irrelevant. Again, the Morris method performs well, even when the number of dimensions is high and the sample size is low.
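The dummy-parameter idea can be sketched with a bare-bones Morris screening in numpy; the test function, step size, and trajectory count below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def morris_mu_star(f, d, r, delta, rng):
    """Morris screening: over r random trajectories, change one input
    at a time by delta and record the elementary effect; mu* is the
    mean absolute effect per parameter (large mu* = influential)."""
    ee = np.zeros((r, d))
    for t in range(r):
        x = rng.random(d) * (1 - delta)   # keep x + delta inside [0, 1]
        for i in rng.permutation(d):      # one-at-a-time moves
            x_new = x.copy()
            x_new[i] += delta
            ee[t, i] = (f(x_new) - f(x)) / delta
            x = x_new
    return np.mean(np.abs(ee), axis=0)

# x3 is a dummy parameter with no influence on the output.
f = lambda x: 3 * np.sin(x[0]) + x[1] ** 2 + 0.0 * x[2]

rng = np.random.default_rng(1)
mu_star = morris_mu_star(f, d=3, r=50, delta=0.5, rng=rng)
print(mu_star)  # the dummy parameter's mu* is (near) zero
```

Because each trajectory reuses function evaluations across the one-at-a-time moves, Morris screening needs only r * (d + 1) evaluations, which is why it remains usable at small sample sizes.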

A qualitative comparison in terms of desirable characteristics of such methods is also performed. Among those characteristics are the following:

  1. Computation of first-, second-, and total-order sensitivities.
  2. Estimation of the direction of the effect.
  3. Providing a confidence indication.
  4. Being able to treat grouped factors.
  5. Being model-independent.
  6. Being independent of the sampling scheme.
  7. Including multidimensional averaging.

The qualitative comparison also identifies advantages for the Morris method, which is overall an interesting conclusion, pointing towards further research investigating this method on simulation-based data. Response surface models created by machine learning methods on such data sets would open the possibility of computing such global sensitivity measures for the simulation model’s parameters. This approach is highly relevant for many real-world applications and is already incorporated in divis’ ClearVu Analytics software tool.
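A minimal numpy sketch of this surrogate-based idea, assuming a hypothetical data-generating "simulation" and using a quadratic least-squares response surface with permutation importance as a simple model-based sensitivity measure (this is not ClearVu Analytics' actual implementation):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical "simulation" data: 200 runs of a 3-parameter model,
# y = 5*x1 + 2*x2^2 + noise; x3 has no influence.
X = rng.random((200, 3))
y = 5 * X[:, 0] + 2 * X[:, 1] ** 2 + rng.normal(0, 0.1, 200)

# Response surface model: quadratic polynomial fitted by least squares.
def features(X):
    return np.column_stack([np.ones(len(X)), X, X ** 2])

coef, *_ = np.linalg.lstsq(features(X), y, rcond=None)
surrogate = lambda X: features(X) @ coef

def permutation_importance(model, X, y, rng, repeats=10):
    """Increase in the surrogate's mean squared error when one input
    column is shuffled, breaking its relation to the output."""
    base = np.mean((model(X) - y) ** 2)
    imp = np.zeros(X.shape[1])
    for i in range(X.shape[1]):
        for _ in range(repeats):
            Xp = X.copy()
            Xp[:, i] = rng.permutation(Xp[:, i])
            imp[i] += np.mean((model(Xp) - y) ** 2) - base
    return imp / repeats

imp = permutation_importance(surrogate, X, y, rng)
print(imp)  # x1 dominates, x2 matters less, x3 is near zero
```

Once the surrogate is cheap to evaluate, the same fitted model could also be fed to sampling-hungrier GSA methods such as Sobol or Morris in place of the original simulation.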