One of the biggest mental sinkholes into which AI students can fall is not quite understanding the fundamental difference between how our two basic “building block” networks operate: the Multilayer Perceptron (MLP), trained with backpropagation (or any form of gradient descent learning), and the (restricted) Boltzmann machine (RBM), trained with contrastive divergence.
It’s easy to see why this happens. First, a lot of students are being run at a rapid pace through an AI program that is often more applications-driven than theory-centric. This isn’t bad – it’s the fastest way to a new job and a new career – but it leaves students with holes in their “AI understanding” armor.
Second, most students move (as rapidly as they can) into some deep learning application; one that uses layers of neural network nodes. When they start learning neural networks, they get well-grounded in the MLP basics. Then, there’s a brief mention of the problems inherent in adding layers to an MLP (vanishing and exploding gradients, etc.), and the need for a different method. At some point, the name “Hinton” might be briefly mentioned. There may be a faint nod in the direction of (restricted) Boltzmann machines. But the students are so busy getting their application together, using the twin miracles of TensorFlow and Keras, that they skip that step of understanding (restricted) Boltzmann machines.
And who can blame them?
The third big factor is: all too often, the professors don’t understand Boltzmann machines. The textbook authors don’t understand Boltzmann machines.
Ergo, the simple and expedient thing is to focus on building the application and to gloss over that theoretical armor-hole.
Thus – we have a situation where hundreds, thousands, and possibly tens of thousands of people label themselves as “AI professionals,” yet have serious knowledge-gap weaknesses.
It’s like bleeding out from multiple puncture wounds, each one getting through one of those gaping armor-holes.
Let’s see if we can address this, ok?
It’s not going to be that hard – and it won’t take long – and if you’re in the situation that I’ve described, you’ll emerge from this series of posts feeling – and actually being – much more confident in your understanding.
First Contrast-and-Compare Step: Neural Network Architectures
Perhaps the easiest way to get insight is to start by visualizing two network architectures – a classic, simple Multilayer Perceptron (MLP) and a classic, simple Boltzmann machine – in both its original and restricted forms. These are shown in Figure 1.
Here are the key points:
- An MLP MUST be arranged in layers – at least three layers. The “hidden nodes” (latent variables, in machine-learning-speak) comprise the middle layer.
- A Boltzmann machine DOES NOT have to be arranged in layers. However, a basic (non-restricted) Boltzmann machine has every node connecting to every other node – a spaghetti of node connections. (See the middle image in Figure 1.)
- A restricted Boltzmann machine has many of those connections removed – and it can be REDRAWN so that it LOOKS LIKE an MLP – but it is not, and never will be, an MLP. (Wolf in sheep’s clothing and all that. See the code sketch just after this list.)
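To make that contrast concrete, here’s a minimal sketch of the two connectivity patterns. (This is my own illustration, not code from the original post; the layer sizes, NumPy, and toy numbers are assumptions chosen purely for clarity.) The MLP gets a separate weight matrix at each layer boundary, and activity flows one way only; the RBM has a single weight matrix that is shared, in both directions, between its visible and hidden units, with no connections inside either group.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# --- MLP: strictly layered, feed-forward (input -> hidden -> output) ---
# Each layer boundary gets its OWN weight matrix; activity flows one way only.
W1 = rng.normal(size=(4, 3))   # input (4 nodes)  -> hidden (3 nodes)
W2 = rng.normal(size=(3, 2))   # hidden (3 nodes) -> output (2 nodes)

def mlp_forward(x):
    """A single forward pass through the three-layer MLP."""
    h = np.tanh(x @ W1)        # hidden-layer activations
    return np.tanh(h @ W2)     # output-layer activations

# --- RBM: bipartite and symmetric; no output layer at all ---
# ONE weight matrix W links visible and hidden units in BOTH directions.
# The "restriction" is the absence of visible-visible and hidden-hidden links.
W   = rng.normal(size=(4, 3))  # visible <-> hidden weights (shared both ways)
b_v = np.zeros(4)              # visible biases
b_h = np.zeros(3)              # hidden biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_given_visible(v):
    """P(h_j = 1 | v): W used going 'up' from visible to hidden."""
    return sigmoid(v @ W + b_h)

def visible_given_hidden(h):
    """P(v_i = 1 | h): the SAME W used going 'down'; an MLP never reuses weights this way."""
    return sigmoid(h @ W.T + b_v)

x = rng.random(4)
print("MLP output:         ", mlp_forward(x))
print("RBM hidden probs:   ", hidden_given_visible(x))
print("RBM reconstruction: ", visible_given_hidden(hidden_given_visible(x)))
```

Even when the RBM is redrawn as two “layers,” the shared, bidirectional weight matrix (and the absence of any output layer) is what keeps it from being an MLP.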
We can get a better insight into those last two points when we look at how a Boltzmann machine (and its derivative, the restricted Boltzmann machine) evolved from a Hopfield neural network. This is shown in Figure 2.
This blogpost will be continued, as we look at how backpropagation (or, more generally, stochastic gradient descent) requires the MLP architecture, and how the energy equation and contrastive divergence (for the restricted Boltzmann machine) dictate the RBM structure. We’ll play some contrast-and-compare, and deepen our understanding of both types of neural networks.
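As a small preview of that next installment: the standard RBM energy function (quoted here in its usual textbook form; it is not derived in this post) already makes the “restricted” structure explicit. Writing v_i for the visible units, h_j for the hidden units, a_i and b_j for their biases, and w_ij for the visible-to-hidden weights, the energy has bias terms and visible-hidden interaction terms, but no visible-visible or hidden-hidden terms:

```latex
E(\mathbf{v},\mathbf{h}) = -\sum_{i} a_i v_i \;-\; \sum_{j} b_j h_j \;-\; \sum_{i,j} v_i \, w_{ij} \, h_j
```

Contrastive divergence exploits exactly that bipartite structure: it alternates between sampling the hidden units given the visible units and sampling the visible units given the hidden units, which is cheap precisely because neither group has internal connections.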
To your health and outstanding success!
Alianna J. Maren, Ph.D.
Founder and Chief Scientist, Themesis, Inc.
Previous Related Blogs
Maren, Alianna J. (2022). “Entropy in Energy-Based Neural Networks – Seven Key Papers (Part 3 of 3).” Themesis Blogpost Series (April 4, 2022). (Accessed April 3, 2022.) https://themesis.com/2022/04/04/entropy-in-energy-based-neural-networks-seven-key-papers-part-3-of-3/
Maren, Alianna J. (2021). “Latent Variables Enabled Effective Energy-Based Neural Networks: Seven Key Papers (Part 2 of 3).” Themesis Blogpost Series (November 16, 2021). (Accessed April 3, 2022.) https://themesis.com/2021/11/16/latent-variables-enabled-effective-energy-based-neural-networks-seven-key-papers-part-2-of-3/
Maren, Alianna J. (2021). “Seven Key Papers for Energy-Based Neural Networks and Deep Learning (Part 1 of 3).” Themesis Blogpost Series (November 5, 2021). (Accessed April 3, 2022.) https://themesis.com/2021/11/05/seven-key-papers-part-1-of-3/
Good Vibes – Music by Nadia Boulanger
Famous Salonnières & Professors
In this section (and for the next few blogposts), we’ll look at one of the most under-rated music teachers of recent times, and how she directly influenced some of the best-known recent and contemporary composers.
Nadia Boulanger and her sister, Lili, were highly influential figures. More on these two fascinating women to come.
Dr. A.J., Thanks for the article. Since you’ve also written about Nadia Boulanger, you might be interested in the following anecdote. The Australian composer Peggy Glanville-Hicks “…lay in wait on the pavement opposite Nadia’s house (where she could be seen better) at 36 Rue Ballu (now Place Lili Boulanger) and bombarded Boulanger with notes pleading to be taken on as a student…in all weathers, outside the house” [https://books.google.co.nz/books?id=MZ50ojGMfKYC&pg=PA24&lpg=PA24&dq=peggy+glanville+hicks+nadia+boulanger+applied&source=bl&ots=L1h5A2lXMr&sig=ACfU3U0wfq24_GwP3w_qRRguxyPVwnnDqA&hl=en&sa=X&ved=2ahUKEwiY_vnm2aj3AhWaRmwGHciXC8Y4ChDoAXoECBYQAw#v=onepage&q=peggy%20glanville%20hicks%20nadia%20boulanger%20applied&f=false]. It took two months of this to convince Boulanger that she was serious.
This is fabulous, Simon – thank you so much! I’ll be certain to follow up and share! (And yes, credit you w/ the inspiration!) – much appreciated! – AJM