AI Atlas: Incorporating Uncertainty with Bayesian Neural Networks

AI breakthroughs, concepts, and techniques that are tangibly valuable, specific, and actionable. Written by Glasswing Founder and Managing Partner, Rudina Seseri

At its core, the goal of machine learning is to learn from data to find an optimized solution to a given problem. However, in real life there is not always a single path forward, and careful decision-making often requires acknowledging a risk of uncertainty or random behavior.

Bayesian Neural Networks (BNNs) offer a compelling approach to addressing this gap by not just making predictions on data, but also estimating the probability of each result. This unique capability enables BNNs to make more informed decisions, particularly in scenarios where uncertainty plays a crucial role.

BNNs have deep roots in the history of AI, stretching back decades. Research on them continues actively to this day, and they remain a popular and powerful tool for data scientists and machine learning engineers. In today’s AI Atlas, I will explore how these models work and what has kept them useful for so long.

🗺️ What are Bayesian Neural Networks?

Bayesian Neural Networks (BNNs) are a class of neural networks that incorporate Bayesian principles into their architecture, giving them the ability to quantify uncertainty. In any neural network, weights and biases determine how information flows through the network and how the network learns from data: weights control the strength of connections between neurons, while biases shift each neuron's output and give the model added flexibility. In a traditional neural network, these weights and biases are fixed values learned through optimization techniques; they are adjusted during the training process as the model converges on the function that best describes a given problem. A Bayesian Neural Network, however, treats its weights and biases as probability distributions rather than fixed values. Instead of holding a single value for each weight and bias, it holds a distribution of possible values.
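
To make the contrast concrete, here is a minimal sketch of a single connection in each kind of network. The Gaussian form of the distribution and every name and number in it (`fixed_weight`, `weight_mean`, `weight_std`, `sample_weight`) are assumptions chosen for illustration, not part of any particular library.

```python
import random

# Traditional network: the weight is one learned number.
fixed_weight = 0.8

# Bayesian network: the weight is a distribution. A Gaussian with a learned
# mean and standard deviation is assumed here purely for simplicity.
weight_mean, weight_std = 0.8, 0.1


def sample_weight(rng: random.Random) -> float:
    """Draw one plausible value for the weight from its distribution."""
    return rng.gauss(weight_mean, weight_std)


rng = random.Random(0)  # seeded so the sketch is repeatable
x = 2.0                 # an example input

# The fixed weight always produces the same output for the same input.
deterministic_output = fixed_weight * x

# The distribution produces a slightly different output on every draw.
sampled_outputs = [sample_weight(rng) * x for _ in range(5)]
```

The spread of `sampled_outputs` is what a BNN later turns into an uncertainty estimate.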

In practice, a BNN behaves like many models at once: each draw of values from its weight distributions yields a slightly different network, and the range of their outputs is analyzed to determine probability. Agreement between the models implies high certainty, whereas disagreement implies uncertainty. Thus, BNNs are able to consider the entire distribution of outputs and communicate the probabilistic reasoning behind each inference. This characteristic also makes it easier to detect hallucinations, or the tendency of an AI model to confidently output incorrect results, because each output is directly associated with a confidence level.
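
The idea of comparing many sampled models can be sketched in a few lines. This is an illustrative toy with one neuron, assuming Gaussian weight and bias distributions and a tanh activation; the function name `predict_with_uncertainty` and its parameters are hypothetical.

```python
import math
import random
import statistics


def predict_with_uncertainty(x, weight_mean, weight_std, bias_mean, bias_std,
                             n_samples=200, seed=0):
    """Sample many one-neuron models and summarize how much they agree."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Each draw of (w, b) is one "model" from the distribution.
        w = rng.gauss(weight_mean, weight_std)
        b = rng.gauss(bias_mean, bias_std)
        outputs.append(math.tanh(w * x + b))
    # Mean = the prediction; standard deviation = disagreement between models.
    return statistics.mean(outputs), statistics.stdev(outputs)


# Narrow distributions: the sampled models agree, so the spread is small.
confident_mean, confident_spread = predict_with_uncertainty(1.0, 2.0, 0.01, 0.0, 0.01)

# Wide distributions: the sampled models disagree, so the spread is large.
uncertain_mean, uncertain_spread = predict_with_uncertainty(1.0, 2.0, 1.5, 0.0, 1.5)
```

Real BNNs apply the same sampling to every weight in a full network, which is also why inference costs more than a single forward pass.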

🤔 What is the significance of BNNs and what are their limitations?

Interpretability, or the ability to understand the reasoning behind predictions and decisions made by a neural network, is a significant topic of interest in the AI ecosystem. Traditional neural networks are typically viewed as “black boxes,” with logic occurring within opaque hidden layers, ultimately making it difficult to trust these models with high-impact use cases such as autonomous behavior. I have previously written about many targeted attempts to improve AI interpretability, including MILAN and selective feature engineering.

With BNNs, the incorporation of probability greatly improves transparency, as decision-making is explained by statistical values. For example, consider a BNN model designed for computer vision. If a traditional neural network were trained only on images of humans and dogs, it would have no way to recognize a new subject such as a cat and would probably misclassify it with apparent confidence. With a BNN, the uncertainty of this result can be flagged, and the steps taken can be analyzed to determine key characteristics used in decision-making, such as ears and whiskers.
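
One common way to flag such a result is to average the class probabilities from several sampled models and measure the entropy of that average. The sketch below assumes this approach; the probabilities, the `ENTROPY_THRESHOLD` cutoff, and the function names are made-up values for illustration, not outputs of a trained model.

```python
import math


def predictive_entropy(sampled_probs):
    """Entropy (in bits) of the class probabilities averaged over sampled models."""
    n_models = len(sampled_probs)
    n_classes = len(sampled_probs[0])
    mean_probs = [sum(p[c] for p in sampled_probs) / n_models
                  for c in range(n_classes)]
    return -sum(p * math.log2(p) for p in mean_probs if p > 0)


# Hypothetical [human, dog] probabilities from four sampled models.
dog_photo = [[0.02, 0.98], [0.03, 0.97], [0.01, 0.99], [0.04, 0.96]]  # models agree
cat_photo = [[0.90, 0.10], [0.15, 0.85], [0.60, 0.40], [0.20, 0.80]]  # models disagree

ENTROPY_THRESHOLD = 0.5  # assumed cutoff; in practice tuned on validation data


def needs_review(sampled_probs, threshold=ENTROPY_THRESHOLD):
    """Flag a prediction whose averaged probabilities are too uncertain."""
    return predictive_entropy(sampled_probs) > threshold
```

Here the dog photo passes, while the cat photo, which the model was never trained to recognize, is routed for review rather than silently labeled.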

  • Explainability: Bayesian statistics express conditional relationships such as “if X occurs, then Y has a 10% chance of occurring.” Because BNNs are built on these relationships, they are useful for inferring the root causes behind observed events.
  • Visibility on uncertainty: Real-world data often exhibits noise and incomplete information. By modeling this variability and providing associated risk estimates, BNNs are more robust to noisy or sparse data than standard neural networks.
  • Protection against overfitting: In machine learning, “overfitting” refers to an issue that can arise when a model trains “too well” on its data, becoming unable to generalize past the data it has already been trained on. By considering the entire distribution of outputs, BNNs are less likely to experience overfitting.
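
The conditional reasoning in the Explainability point rests on Bayes' rule, which turns "how likely is the event given the cause" into "how likely is the cause given the event." The sketch below is a worked example under assumed numbers; the 10% figure echoes the example above, while the prior and the false alarm rate are invented for illustration.

```python
def posterior(prior, likelihood, false_alarm_rate):
    """P(cause | event) via Bayes' rule for a cause that is either present or absent."""
    # Total probability of seeing the event at all.
    evidence = likelihood * prior + false_alarm_rate * (1 - prior)
    return likelihood * prior / evidence


p_cause = 0.05                 # assumed prior: the cause is present 5% of the time
p_event_given_cause = 0.10     # "if X occurs, then Y has a 10% chance of occurring"
p_event_given_no_cause = 0.01  # assumed: Y occurs 1% of the time without X

p_cause_given_event = posterior(p_cause, p_event_given_cause, p_event_given_no_cause)
# Roughly 0.34: observing Y raises the probability of X from 5% to about 34%.
```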

Alongside these benefits, BNNs have several key limitations that enterprises looking to leverage the technology will need to keep in mind:

  • Resource intensiveness: BNNs require substantially more computational power both to train and to generate inferences because they simulate multiple models simultaneously.
  • Complexity: Building a Bayesian model requires domain experts to understand and define prior assumptions about probability. For example, in pharmaceutical manufacturing, creating a model would require defining which chemicals can possibly interact with each other and how.
  • Performance on high-dimensional data: BNNs struggle to scale to large deployments. As data has more characteristics to track and more interactions are possible between inputs, it becomes increasingly difficult to model the probability of specific outcomes.

🛠️ Applications of Bayesian Neural Networks

As BNNs are designed to model uncertainty, their most promising applications are those in which it is important to have visibility on the risk associated with predictions. This is especially true in industries where even small failure rates can be costly, such as:

  • Manufacturing: BNNs can analyze sensor data to estimate the likelihood of equipment failures or maintenance needs. By proactively scheduling maintenance, manufacturers can minimize downtime and avoid unexpected breakdowns.
  • Financial models: By incorporating uncertainty estimates, BNNs can provide more accurate assessments of creditworthiness and help financial institutions manage risk more effectively.
  • Cybersecurity: BNNs can detect anomalous behavior or unauthorized access attempts more accurately by modeling uncertainty in user authentication.
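
In each of these settings, the uncertainty estimate can feed directly into a decision rule. As a rough sketch of the manufacturing case, the function below schedules maintenance when a cautious upper estimate of failure probability exceeds a tolerance. The function name, the tolerance, the two-standard-deviation margin, and the sample values are all assumptions for illustration.

```python
import statistics


def maintenance_decision(sampled_failure_probs, tolerance=0.05, z=2.0):
    """Decide on maintenance from failure probabilities given by sampled models."""
    mean = statistics.mean(sampled_failure_probs)
    spread = statistics.stdev(sampled_failure_probs)
    # Cautious estimate: the mean plus a margin that grows with disagreement.
    upper = min(1.0, mean + z * spread)
    return {"mean": mean, "upper": upper, "schedule": upper > tolerance}


# Hypothetical failure probabilities from five sampled models per machine.
stable_pump = [0.010, 0.012, 0.011, 0.009, 0.010]  # models agree: low risk
noisy_pump = [0.010, 0.080, 0.020, 0.001, 0.060]   # models disagree

stable_result = maintenance_decision(stable_pump)
noisy_result = maintenance_decision(noisy_pump)
```

Note that the average prediction for the second pump is below the tolerance; it is the disagreement between models that triggers the maintenance flag.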
