Uncertainty plays a critical role in AI model risk management because it affects a model's performance, reliability, and ability to make accurate predictions. Depending on their role within an organization, practitioners who work with these models may not fully understand the concepts behind model uncertainty. As a result, issues related to uncertainty can go unaddressed entirely, particularly when resources are limited.
This series of articles provides model risk and ML/AI practitioners in financial institutions insights on how to think about model uncertainty in the context of model risk management (MRM). In this first article, we focus on defining model uncertainty and understanding how it impacts the model risk management process.
Model outputs are increasingly used to make automated decisions. Yet end users do not always understand what can go wrong, much less what consequences might follow from these hidden risks. When it goes unrecognized, uncertainty reduces the utility of model outputs because users cannot tell whether a model is being truthful.
From a regulatory perspective, reliable uncertainty quantification is a critical step toward building explainable, transparent, and accountable models. According to the model risk management framework established by SR 11-7 (Fed, 2011), model uncertainty should be treated as one type of model risk. The recent publication from the National Institute of Standards and Technology (NIST) also sets out a risk management framework for AI systems (NIST, 2023) and raises expectations on explainability, transparency, and accountability.
Many institutions are already exploring the use of large language models (LLMs) such as ChatGPT in applications that may significantly impact customers’ lives (Heaven, 2023). While the hope is that these models bring benefits to customers and efficiency improvements to the institutions using them, questions have arisen about the trustworthiness of outputs from LLMs and AI tools in general. Indeed, the convenient user experience of applications such as ChatGPT, where outcomes are perceived as ‘truth’, can easily result in over-reliance on the outputs and a lack of effective challenge against their use (Walker, 2023).
Users of LLMs are given neither a distribution of plausible outcomes given the data nor any information about whether bias in the training dataset conditions all the outcomes. As a result, it is increasingly important to encourage all model risk participants to exercise prudential judgment when assessing models, or, as SR 11-7 puts it:
It can be prudent for banks to account for model uncertainty
Users should be incentivized to view these models with a skeptical lens and demand transparency before relying too much on abstract AI systems over their own successful experiences (King, 2020).
The aim of this article is to urge model risk and AI practitioners to maintain a critical mindset toward their models, which can help enhance trustworthiness and ensure compliance with current and future regulatory requirements.
Defining Model Uncertainty
To understand model uncertainty in a regulatory environment, we first need to clarify what the term means, and that in turn requires understanding what a model is.
Models may be perceived as mere calculation engines, where a simple calculation implies a simple model. Regulatory guidelines offer a more detailed definition for regulated models (Fed, 2011):
A model consists of three components: an information input component, which delivers assumptions and data to the model; a processing component, which transforms inputs into estimates; and a reporting component, which translates the estimates into useful business information.
This definition expands the academic definition of a model beyond the mathematical calculations and puts emphasis on how quantitative estimates are exposed to the end user.
In this framework, the word “estimates” signals that model outcomes are uncertain by nature. The implications of this are critical for model risk management. The US Federal Reserve's supervisory letter SR 11-7, Guidance on Model Risk Management (Fed, 2011), places model uncertainty at the heart of effective model risk management practices:
An understanding of model uncertainty and inaccuracy and a demonstration that the bank is accounting for them appropriately are important outcomes of effective model development, implementation, and use.
Similarly, the 2022 consultation paper issued by the Bank of England, CP6/22 Model Risk Management Principles for Banks (Bank of England, 2022), elevates model uncertainty compared to its predecessor (Bank of England, 2018) and provides a helpful definition:
Model uncertainty should be understood as the inherent uncertainty in the parameter estimates and results of statistical models, including the uncertainty in the results due to model choices or model misuse.
This regulatory definition is apt and connects with the statistical concepts of parameter uncertainty and model choice uncertainty. The former is known in statistics as aleatoric uncertainty and is caused by random variability each time we run an experiment. This type of uncertainty includes measurement errors and captures variations in the parameters of the model. The latter relates to epistemic uncertainty and accounts for the lack of knowledge about the process that generates the data we try to predict. This type of uncertainty can potentially be reduced by exploring alternative modeling choices (Gelman, 2007).
Uncertainty in Model Risk Management
Model Development and Validation
Transparent model development practices encourage the use of appropriate quantitative tools to quantify the uncertainty in model outcomes. Computing confidence intervals, rather than just point estimates, is one such tool: it helps avoid spurious predictions and reduces bias during technical challenge sessions between the first and second lines of defense. Building models that can produce a distribution of plausible outcomes can help current and future developers critically examine their own choices more effectively.
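As a minimal sketch of the difference between a point estimate and an interval, the snippet below computes a percentile bootstrap confidence interval for a portfolio default rate. The bootstrap is one common choice and is an assumption here, not a method prescribed by this article. The function name `bootstrap_interval`, the sample of default flags, and the parameter values are all hypothetical.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def bootstrap_interval(sample, stat=mean, n_resamples=2000, level=0.95, seed=0):
    """Return (point estimate, lower bound, upper bound) for `stat`.

    Uses the percentile bootstrap: resample with replacement, recompute the
    statistic each time, and read the bounds off the sorted results.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return stat(sample), lower, upper


# Hypothetical portfolio: 200 loans, 12 of which defaulted (1 = default).
default_flags = [1] * 12 + [0] * 188
point, lower, upper = bootstrap_interval(default_flags)
print(f"default rate: {point:.3f}, 95% interval: [{lower:.3f}, {upper:.3f}]")
```

Reporting the interval alongside the 6% point estimate gives reviewers something concrete to challenge: a wide interval suggests the sample alone may not support a precise default rate.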
During independent model validation exercises, uncertainty measures help model validators challenge the inherent model risk and elevate its criticality if the observed data falls outside the prediction intervals too often.
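One way a validator might formalize "too often" is to count how many observations breach their prediction intervals and compare that count with what the nominal coverage implies. The sketch below does this with an exact binomial tail probability. It assumes breaches are independent across observations, which rarely holds exactly for time series, and all function names and figures are hypothetical.

```python
from math import comb


def breach_count(observed, lower, upper):
    """Number of observations falling outside their prediction interval."""
    return sum(1 for y, lo, hi in zip(observed, lower, upper) if y < lo or y > hi)


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical backtest: ten outcomes against 90% prediction intervals,
# so each observation is expected to breach with probability 0.10.
observed = [102, 98, 120, 95, 101, 87, 110, 99, 131, 100]
lower = [90] * 10
upper = [115] * 10

breaches = breach_count(observed, lower, upper)
p_value = binomial_tail(len(observed), breaches, 0.10)
print(f"{breaches} breaches out of {len(observed)}, tail probability {p_value:.3f}")
```

A small tail probability indicates that the observed number of breaches would be unusual if the intervals were well calibrated, which could support escalating the finding.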
Model Tiering and Performance Monitoring
The process of setting model tiering, or inherent risk, is straightforward in some cases: for example, models that require supervisory approval are assigned the highest risk. For non-regulatory models, however, the task can become more subjective. Model uncertainty can be included alongside other risk measures, such as materiality, complexity, and dependency, to increase transparency in setting model tiering (MRMIA, 2021).
Performance metrics should capture the prediction uncertainty of the model. Such metrics can be used during both developmental testing and live ongoing monitoring, for example by placing probability intervals around the model projections. Each institution can then set thresholds for acceptable performance based on its risk appetite and level of error tolerance for specific use cases.
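As an illustration of an uncertainty-aware performance metric, the sketch below uses the interval score (also called the Winkler score), which rewards narrow intervals and penalizes observations that fall outside them. This metric is one possible choice and is not named in this article. The thresholds, the traffic-light labels, and the sample figures are hypothetical and would in practice be set from each institution's risk appetite.

```python
def interval_score(y, lower, upper, alpha):
    """Interval score for a central (1 - alpha) prediction interval.

    Equals the interval width, plus a penalty of (2 / alpha) times the
    distance by which the observation misses the interval. Lower is better.
    """
    width = upper - lower
    if y < lower:
        return width + (2.0 / alpha) * (lower - y)
    if y > upper:
        return width + (2.0 / alpha) * (y - upper)
    return width


def mean_interval_score(observed, lowers, uppers, alpha):
    """Average interval score over a monitoring window."""
    scores = [interval_score(y, lo, hi, alpha)
              for y, lo, hi in zip(observed, lowers, uppers)]
    return sum(scores) / len(scores)


def monitoring_status(score, amber, red):
    """Map a score to a status using institution-defined thresholds."""
    if score >= red:
        return "red"
    if score >= amber:
        return "amber"
    return "green"


# Hypothetical monitoring window with 90% intervals (alpha = 0.10).
score = mean_interval_score(
    observed=[10.0, 12.5, 9.0],
    lowers=[8.0, 8.0, 10.0],
    uppers=[12.0, 12.0, 14.0],
    alpha=0.10,
)
print(score, monitoring_status(score, amber=10.0, red=20.0))
```

Because the same score can be computed on development data and on live data, it offers one way to compare the two stages on a common scale.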
According to the Capital Requirements Regulation (CRR), firms must add a Margin of Conservatism (MoC) as part of their internal ratings-based systems to mitigate any deficiencies of quantitative estimates of risk parameters in their Expected Credit Loss calculations (EBA, 2017). The primary objective of the MoC framework is to manage two fundamental sources of model uncertainty: 1) the model estimation error (epistemic uncertainty), and 2) the variation of the model parameters and uncertainty about the model estimation error (aleatoric uncertainty) that is due to data quality or operational issues, for example.
Assessing model uncertainties is crucial for appropriately classifying model deficiencies and avoiding double-counting model errors, for example by including the same source of uncertainty in more than one category. Although an overestimation of conservatism may be seen as cautious, an excessive add-on to model estimates may result in users and regulators questioning the value of the model (Fed, 2011).
Uncertainty plays a central role in model and AI risk management, and its impact on business decisions is frequently overlooked. The use of models is often accompanied by unexpected behavior, which all stakeholders from model developers to senior management must be aware of.
In this article, we sought to provide model risk practitioners with insights to better understand model uncertainty. We highlighted some of the benefits of developing a prudential modeling culture that accounts for uncertainty in model risk practices, such as:
- More transparent and accountable models
- Robust testing during model development
- Effective challenge between the first and second lines of defense
- Quantifiable metrics for risk appetite
- Early warning indicators for ongoing monitoring
- Challengeable metrics for model tiering
- Effective model conservatism
Despite these benefits, it may seem impractical or challenging to systematically compute and report all confidence intervals and plausible model outcomes to quantify model uncertainty. Practitioners must remember that quantitative models are not designed to provide a definitive, certain answer; they are meant to present the most plausible options. Quantifying model uncertainty is one way of extracting and making the most of these options.
In the next article in the series, we will explore quantitative methods to assess model uncertainty with a concrete model case study.
Author: Juan Martinez, PhD, © Copyright 2023 ValidMind Inc.
This post was previously published on LinkedIn.
- Walker (2023). What ChatGPT and generative AI mean for science. Nature.
- Heaven (2023). ChatGPT is everywhere. Here’s where it came from. MIT Technology Review.
- NIST (2023). Artificial Intelligence Risk Management Framework (AI RMF 1.0).
- Bank of England (2022). CP6/22 Model Risk Management Principles for Banks.
- MRMIA (2021). MRMIA Best Practices: Volume 1.
- King and Kay (2020). Radical Uncertainty: Decision-making for an unknowable future. The Bridge Street Press.
- Bank of England (2018). SS3/18 Model Risk Management Principles for Stress Testing.
- EBA (2017). Guidelines on PD estimation, LGD estimation and the treatment of defaulted exposures.
- Fed (2011). SR 11-7 Letter: Supervisory Guidance on Model Risk Management.
- Gelman (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.