This is a summary of Chapter 16.1 and part of 16.2 in "Deep Learning" by Ian Goodfellow, Yoshua Bengio, Aaron Courvilley.
What is a structured probabilistic model
A structured probabilistic model is a way of describing a probability distribution with graphs. The graph describes which random variables in the probability distribution interact with each other directly. Because the structure of the model is defined by a graph, so it can also be referred to as graphical models.
The challenge of Unstructured Modelling
In the field of Deep leaning, we run into situations that we need to understand high-dimensional data with rich structure. For example, we need AI to understand natural images, audio waveforms representing speech and documents containing multiple words and punctuation characters. It is possible to ask probabilistic models to do these tasks. In general, if we wish to model a distribution over a random vector x containing n discrete variables capable of taking on k values each, a naive approach of representing P(x) is to store a lookup table with one probability value per possible outcome, which is kⁿ parameters.
It is not feasible for several reasons:
Memory: The cost of string the representation.
Statistical efficiency: The number of training data needed increases as the number of the parameters in a model.
Run time: The cost of inference. Computing distributions will require summing across the entire probability distribution table.
Run time: The cost of sampling: It also requires reading through the whole table in the worst case.
Structured probabilistic models provide a formal framework for modelling only direct interactions between random variables. This allows the models to have significantly fewer parameters and therefore be estimated reliably from less data and reduced computational cost.
The model can then been extended to tasks such as Density estimation, denoising, missing value imputation and sampling.
How a graphical model work
In a graphical model, each node represents a random variable and each edge represents a direct interaction. The models can be divided into two categories based on the way to describe the interaction: models based on directed acrylic graphs (discussed below), and models based on undirected graphs (discussed in the next post).
Directed models
These model is also known as belief network or Bayesian network. It is directed because the edges are directed from one node to another. It can also be represented in the drawing with and error.
In the graph above, A structured probabilistic model is described as
In this case, using a directed graphical model reduced our number of parameters by a factor of more than 50.
What's the difference between Structured Probabilistic Models and Graph Neural Networks
Structured Probabilistic Models (SPMs) and Graph Neural Networks (GNNs) are related concepts, but they are not exactly the same thing.
Structured Probabilistic Models refer to a class of models in machine learning that explicitly represent the dependencies between variables in a structured way, often using a graph or a similar structure. Examples of SPMs include Bayesian networks, Markov random fields, and conditional random fields. These models use probabilistic graphical models to capture and represent the relationships between different variables.
On the other hand, Graph Neural Networks (GNNs) are a class of neural networks designed to operate on graph-structured data. GNNs learn to process and extract information from graph-structured data, such as social networks, molecular structures, or citation networks. GNNs can be applied to problems where the input data can be naturally represented as a graph, and the neural network operates on the nodes, edges, or subgraphs of that graph.
While both SPMs and GNNs involve the concept of graphs, they have different focuses and use cases. SPMs are more rooted in probabilistic modelling and inference, capturing dependencies between variables through a graphical representation. GNNs, on the other hand, leverage neural networks to learn representations of nodes and edges in a graph, typically for tasks such as node classification, link prediction, or graph classification.
In summary, while there is a connection between the concepts of structured probabilistic models and graphs, SPMs are more general in representing dependencies between variables, while GNNs specifically focus on learning representations from graph-structured data using neural networks.
Comments