Assessing the Accuracy of the Surrogate Predictions

Before you define and run the reliability study, you verify the accuracy of the surrogate models that the Adaptive Sampling study generated. This verification is achieved through cross validation.

The Cross V Residual and PRESS Residual values are estimates of the error that you can expect in predicted values. These residuals are in the same units as the responses.

To estimate the accuracy of the surrogate predictions:
  1. Select the Surrogates Generation > Surrogates > Pressure Drop MA_Kriging node and make sure that the following cross validation properties are specified:
    Property                         Setting
    Cross Validation Scheme          K-Fold
    Cross Validation K-Fold Value    10
    Cross Validation Seed            0.0
    When performing cross validation using the K-Fold cross validation scheme, Design Manager breaks up the 90 designs for this Surrogates Generation study into 10 unique groups of 9 designs. Design Manager then sets aside one group of 9 designs (test data) and uses the remaining 81 designs (training data) to calculate a surrogate model. This surrogate model is then used to predict the responses for the 9 designs that form the test data. Since the test data is independent of the training data, the deviation between the predicted responses and the test data provides an error estimate, that is, a cross validation residual. Repeating this process with each of the 10 groups in turn as test data yields a cross validation residual for every design. Cross validation is therefore a way to estimate the size of that error and to judge the accuracy of a surrogate model fit. Design Manager already performed this cross validation at the end of the Surrogates Generation study run.

    There is a certain randomness associated with the cross validation process: there are multiple possibilities for Design Manager to perform the K-Fold grouping. The way that the grouping is performed is determined by the Cross Validation Seed. A value of zero means that Design Manager auto-generates the seed.

  2. To view the results of the computation for the pressure drop surrogate, right-click the Surrogates > Pressure Drop MA_Kriging node and select Open Residual Table.
    For each design, the residual table lists the actual pressure drop resulting from the Adaptive Sampling study, the predicted pressure drop using the surrogate model, the corresponding residual between the two, the cross validation residual, and the PRESS residual.

    The Residual values are almost zero, which is to be expected. To create the surrogate, the Kriging method interpolates the actual pressure drop values from the Adaptive Sampling study, making them part of the resulting response surface of the surrogate. Therefore, the actual and the predicted pressure drop values are nearly identical.
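The K-Fold grouping described in step 1 can be illustrated with a short Python sketch. This is not Design Manager's implementation: the helper name `k_fold_groups`, the shuffle-based grouping, and the seed value are assumptions made purely for illustration. The sketch only shows how 90 designs split into 10 groups of 9, and how a seed fixes which designs end up in the same group.

```python
import random

def k_fold_groups(n_designs, k, seed):
    """Shuffle the design indices and split them into k equally sized groups.

    Hypothetical helper; assumes n_designs is divisible by k.
    """
    indices = list(range(n_designs))
    random.Random(seed).shuffle(indices)  # the seed determines the grouping
    size = n_designs // k
    return [indices[i * size:(i + 1) * size] for i in range(k)]

groups = k_fold_groups(n_designs=90, k=10, seed=1)

for fold, test_designs in enumerate(groups):
    # Each group serves once as test data; all other groups form the training data.
    training_designs = [d for g in groups if g is not test_designs for d in g]
    print(fold, len(test_designs), len(training_designs))
```

Every fold reports 9 test designs and 81 training designs. Calling the helper with a different seed produces a different grouping, which is the source of the randomness mentioned above.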

Rather than assessing accuracy based on the residual, it is more useful to analyze the Cross V Residual, which is derived from the K-Fold cross validation.

  1. To start the cross validation, right-click the Surrogates > Pressure Drop MA_Kriging node and select Cross Validate.
  2. In the Residual Table - Surrogates Generation - Pressure Drop MA_Kriging tab, in the header of the table, click Cross V Residual.
    This action sorts the Cross V residuals from lowest value to highest value.
  3. Take note of the minimum and maximum cross validation residual values in the column. These values bound the cross validation error over all designs.
    The cross validation residuals have the same units as the pressure drop, Pa, thus allowing you to directly assess the order of magnitude of the error. You can further compute the relative error for the minimum and maximum values.
  4. To visualize the cross validation residual for each design, right-click the Surrogates > Pressure Drop MA_Kriging node and select Create Plot > Actual vs Residual.
  5. In the Actual vs Residual Plot Setup dialog, click OK.
  6. Select the Actual vs Residual Plot tab above the graphics window and drag the plot downwards until you see a red line framing the lower half of the graphics window, then release the tab.
    Now you see the surrogate residual table displayed above the residual plot. The results that you see in your table and plot can be slightly different from the screenshot below due to the randomness of the cross validation process.

    In the cross validation residual plot, the residuals are more or less tightly scattered around the zero line.

    In the top left corner of the plot, an annotation shows a single cross validation residual value, which is the root mean square (RMS) of all cross validation residuals in the Cross V Residual column. This RMS value is used to judge the accuracy of the surrogate model: the lower the value, the better the fit. When you assess the RMS value of the cross validation residuals, you interpret it relative to the absolute value of the given response and its range, which in this case is the pressure drop.
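The quantities discussed in the steps above (minimum and maximum residual, relative error, and RMS residual) can be computed by hand as in the following Python sketch. All numbers are hypothetical and are not results from this tutorial. The definition of relative error used here (residual divided by the actual response of the same design) and the ratio of RMS to response range are assumptions for illustration, since the exact formulas used by Design Manager are not stated here.

```python
import math

def rms(values):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical values in Pa, chosen for illustration (not tutorial results).
actual = [110.0, 120.0, 130.0, 140.0, 150.0]
cross_v_residual = [-2.0, -1.0, 0.0, 1.0, 2.0]

rms_residual = rms(cross_v_residual)

# Assumed definition: residual divided by the actual response of that design.
relative_error = [r / a for r, a in zip(cross_v_residual, actual)]

# RMS residual expressed relative to the spread of the response.
response_range = max(actual) - min(actual)
rms_over_range = rms_residual / response_range

print(f"min/max residual: {min(cross_v_residual)} / {max(cross_v_residual)} Pa")
print(f"RMS residual: {rms_residual:.3f} Pa")
print(f"RMS residual / response range: {rms_over_range:.2%}")
```

Comparing the RMS residual with both the magnitude and the range of the response helps you decide whether an error of a few pascals is negligible or significant for the response in question.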

Due to the randomness of the grouping process of the cross validation scheme, you are advised to perform cross validation multiple times. Each cross validation yields different results.
  1. Right-click the Surrogates > Pressure Drop MA_Kriging node and select Cross Validate.
    The values in the Cross V Residual column and the cross validation residual plot change.

  2. Sort the Cross V Residual column by clicking the header as before and observe the minimum and maximum residual values. These values are of the same order of magnitude as for the first cross validation.
  3. To perform yet another cross validation, repeat the previous two steps one more time.
    (For this tutorial, you perform only two additional cross validations. For an industrial simulation, you typically perform seven to eight additional cross validations.)


    As a result of the cross validation, the table values and the plot change accordingly. The minimum and maximum residual bounds remain of a similar order of magnitude. The same can be observed in the cross validation residual plot with the RMS cross validation value.
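The effect of repeating the K-Fold cross validation with different groupings can be reproduced outside Design Manager with a small Python sketch. Everything in it is an assumption for illustration: the data is synthetic, inverse-distance weighting (`idw_predict`) stands in for the Kriging surrogate, the residual is taken as predicted minus actual, and the seeds 1 to 3 are arbitrary.

```python
import math
import random

def idw_predict(train_x, train_y, x):
    """Inverse-distance-weighted interpolation; a simple stand-in for Kriging."""
    num = den = 0.0
    for xi, yi in zip(train_x, train_y):
        d = abs(x - xi)
        if d == 0.0:
            return yi  # interpolating: training points are reproduced exactly
        w = 1.0 / d ** 2
        num += w * yi
        den += w
    return num / den

def k_fold_residuals(x, y, k, seed):
    """Return one cross validation residual (predicted - actual) per design."""
    n = len(x)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    size = n // k  # assumes n is divisible by k
    residuals = [0.0] * n
    for f in range(k):
        test = order[f * size:(f + 1) * size]
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        tx = [x[i] for i in train]
        ty = [y[i] for i in train]
        for i in test:
            residuals[i] = idw_predict(tx, ty, x[i]) - y[i]
    return residuals

# Synthetic 1-D data: 90 designs with a made-up response in Pa.
x = [i / 89 for i in range(90)]
y = [100.0 + 50.0 * xi ** 2 for xi in x]

summary = {}
for seed in (1, 2, 3):
    res = k_fold_residuals(x, y, k=10, seed=seed)
    rms = math.sqrt(sum(r * r for r in res) / len(res))
    summary[seed] = (min(res), max(res), rms)
    print(seed, summary[seed])
```

Each seed gives its own minimum, maximum, and RMS residual. A surrogate that fits well shows values of a similar order of magnitude across the repetitions, which is the behavior you look for in the residual table and plot.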

The predicted residual error sum of squares (PRESS) residual is another means by which to assess the accuracy of the surrogate model predictions. PRESS is a form of cross validation where only one design is set aside and the remaining designs (here, 89 designs) are used to compute the surrogate model. The PRESS residual is equivalent to leave-one-out cross validation, which corresponds to the K-Fold cross validation scheme with as many groups as there are designs (here, 90), so that each group contains a single design. For PRESS, you perform only one cross validation, as there are no random groupings and the PRESS residuals remain fixed. Generally, you are advised to look at both types of cross validation residuals, as each of them has its advantages and disadvantages.
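As a rough illustration of the leave-one-out idea behind PRESS, the following Python sketch withholds each design once and predicts it from the remaining 89. It is a sketch under stated assumptions, not the Design Manager algorithm: the data is synthetic, inverse-distance weighting (`idw_predict`) stands in for the Kriging surrogate, and the residual is taken as predicted minus actual.

```python
import math

def idw_predict(train_x, train_y, x):
    """Inverse-distance-weighted interpolation; a simple stand-in for Kriging."""
    num = den = 0.0
    for xi, yi in zip(train_x, train_y):
        d = abs(x - xi)
        if d == 0.0:
            return yi
        w = 1.0 / d ** 2
        num += w * yi
        den += w
    return num / den

def press_residuals(x, y):
    """Leave-one-out residuals (predicted - actual): each design is withheld once."""
    residuals = []
    for i in range(len(x)):
        tx = x[:i] + x[i + 1:]  # all designs except design i
        ty = y[:i] + y[i + 1:]
        residuals.append(idw_predict(tx, ty, x[i]) - y[i])
    return residuals

# Synthetic 1-D data: 90 designs with a made-up response in Pa.
x = [i / 89 for i in range(90)]
y = [100.0 + 50.0 * xi ** 2 for xi in x]

res = press_residuals(x, y)
press = sum(r * r for r in res)          # sum of squared leave-one-out residuals
rms_press = math.sqrt(press / len(res))  # RMS of the leave-one-out residuals
print(min(res), max(res), rms_press)
```

No grouping and no seed are involved, so running the computation again returns exactly the same residuals.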
  1. Analyze the PRESS residual:
    1. In the Residual Table - Surrogates Generation - Pressure Drop MA_Kriging tab, in the header, click PRESS Residual to sort the residual values from lowest to highest.
    2. Note that the minimum value is around -3 Pa and the maximum value is around 3.7 Pa.
    3. Select the Plots > Actual vs Residual Plot > Data Series > Surrogates Generation > Left Axis Data > Surrogate node and set Values Type to PRESS Residuals.


      The Actual vs Residual plot now displays the PRESS residual of each design. In the top left corner, you see the RMS PRESS residual.
    The fact that the relative error of both Cross V Residual and PRESS Residual is small and consistent over several cross validations gives you confidence that the surrogate model for pressure drop is acceptable.

    For an industrial simulation, you typically perform the cross validation for each response of interest. However, in this tutorial, you analyze cross validation for pressure drop only. The cross validation procedure is the same for every response.