The most important element in creating an AGI (artificial general intelligence) is that the latent node layer needs to allow a range of neural dynamics. The most important of these dynamics will be the ability for the system to rapidly undergo a state change, from mostly “off” nodes to mostly “on.” Neurophysiologists have observed this in mammalian brains; they call it neuronal avalanches. We discussed this in the last post on Next-Era AGI: Neurophysiology Basis.
Three key insights from neurophysiology are that we need:
- More advanced neuron models, allowing more complex behaviors and dynamics,
- Free energy minimization across the latent node layer, allowing us to model metastable states with rapid phase changes – essentially, creating neuronal avalanches, and
- Feedback control loops (or reentrant signal processing, to use an Edelman term) – to govern how latent layer nodes become active and form clusters (among many other things).
In this post, we discuss the second element – free energy minimization.
Cluster Variation Method Free Energy Minimization in the CORTECON Latent Layer
To get the desired free energy-based behavior, we’ve introduced the Kikuchi (1951) and the later Kikuchi-Brush (1967) cluster variation method.
To give us all a simple example – really a toy – we’ve first developed the code for a 1-D cluster variation method system, or 1D CVM.
We presented the full 1D CVM interactive code in a Themesis YouTube video (see Resources and References below).
We use the 1-D CVM (a single zigzag chain of “on” and “off” nodes), which is a good introduction to the CVM because it is relatively easy to use and easy to understand.
This 1D CVM code is available at:
The program to use is:
- simple-1D-CVM-w-turtle-1pt5pt5-2023-11-26.py
(This is the last program uploaded to the repository; uploaded on Sunday, Nov. 26, 2023.)
We invite you to access this code and run it yourself. This blog post walks through the essentials of understanding and running the code, and we will shortly release a YouTube video that discusses it as well (likely with screenshots included).
This code is made available under the MIT License, so you can use it in your own work. We would appreciate your acknowledgement of Themesis, Inc., in any work that you build on this code.
This code is a simple (“toy”) instantiation of a 1-D grid, composed of a single zigzag chain of 24 nodes: 12 each in the “on” and “off” states.
The 1D CVM: Brief Overview
Nodes are binary; they are always either “on” or “off.”
For purposes of understanding this simple illustration of the 1D CVM, we keep the numbers of “on” and “off” nodes equal. This facilitates comparison with analytic results.
In the cluster variation method (CVM), nodes are arranged in a fixed grid, which is constructed using overlapping rows, where each row is offset a half-step from the preceding row. Thus, instead of having a regular “chessboard” type of grid, we have a grid where every other row is offset a half-step.
This particular 1D grid pattern (using a single zigzag chain) was invented by Ryoichi Kikuchi in 1951, and enhanced by Kikuchi and Brush in 1967.
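As a purely illustrative sketch (this is not the Themesis repository code, and the array layout is our own assumption), one way to store such a zigzag chain is as two half-step-offset rows of binary values:

```python
# A minimal sketch: a 24-node single zigzag chain stored as two offset
# rows of 12 binary nodes each. Hypothetical layout, for illustration only.
import numpy as np

rng = np.random.default_rng(seed=0)

# Start with 12 "on" (1) and 12 "off" (0) nodes, shuffled into the chain.
states = np.array([1] * 12 + [0] * 12)
rng.shuffle(states)

# Row 0 holds the even chain positions, row 1 the odd ones;
# conceptually, row 1 sits a half-step to the right of row 0.
grid = states.reshape(12, 2).T   # shape (2, 12)

assert grid.sum() == 12          # equal numbers of "on" and "off" nodes
print(grid)
```

Keeping the “on”/“off” counts fixed while shuffling (or later swapping) node states is what lets the toy system explore different configurations at constant composition.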
In this grid layout, every node has:
- Nearest-neighbors,
- Next-nearest-neighbors, and
- Triplets.
Collectively, these are called the local configuration variables.
The free energy is a function of these configuration variables. This makes the CVM free energy a bit more complex than that typically used in neural networks and variational methods.
In the 1D CVM, every node participates in:
- Two nearest-neighbor pairs (on the diagonals, so these are in the row either above or below the row where the specific node resides),
- Two next-nearest-neighbor pairs (on the same row as the specific node), and
- Two triplets (a combination of nearest-neighbor and next-nearest-neighbor interactions).
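The counting itself is simple to sketch. The following toy tally (our own illustration, with periodic wraparound boundaries assumed for simplicity; the Themesis code may handle boundaries differently) walks a chain in order and counts each configuration type:

```python
from collections import Counter

def count_config_vars(chain):
    """Count pair and triplet configurations along a single zigzag chain.

    chain: list of 0/1 states in chain order (alternating between the
    two offset rows). Periodic (wraparound) boundaries are an assumption
    made here for simplicity.
    """
    n = len(chain)
    y = Counter()  # nearest-neighbor pairs (consecutive positions, on the diagonals)
    w = Counter()  # next-nearest-neighbor pairs (two steps apart, same row)
    z = Counter()  # triplets (three consecutive chain positions)
    for i in range(n):
        y[(chain[i], chain[(i + 1) % n])] += 1
        w[(chain[i], chain[(i + 2) % n])] += 1
        z[(chain[i], chain[(i + 1) % n], chain[(i + 2) % n])] += 1
    return y, w, z

chain = [1, 0] * 12            # a perfectly alternating 24-node toy chain
y, w, z = count_config_vars(chain)
print(y)   # all nearest-neighbor pairs are (1,0) or (0,1)
print(w)   # all next-nearest-neighbor pairs are (1,1) or (0,0)
print(z)   # all triplets are (1,0,1) or (0,1,0)
```

For the alternating chain, every nearest-neighbor pair is unlike, every next-nearest-neighbor pair is like, and every triplet is of the A-B-A form, which is the most “dispersed” configuration possible.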
What the 1D CVM Interactive Code Does
The code performs three basic steps:
- Count the configuration variable values for the initial grid configuration (this initial grid is pre-defined and is embedded into the code).
- Invite the user to identify two nodes to swap, entering the row and column for each; the nodes must have different activations (one “on,” the other “off”). (I typically use the first node in the first row (Row 0, Column 0), which is an “on” node, and the fourth node in that row (Row 0, Column 3), which is the first “off” node in that row.) The system then counts the configuration variable values for this new configuration.
- Finally, the system computes the fractional values for the configuration variables for both the original and user-modified grids, and from there, the negative entropies of the two grids.
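The third step rests on one small building block: converting raw counts to fractional values and forming v·ln(v) terms. The sketch below (our own illustration, not the Themesis program) shows just that piece; the full CVM entropy combines such terms for single nodes, both pair types, and triplets, with the Kikuchi-Brush counting coefficients:

```python
import math
from collections import Counter

def lf(v):
    """The v*ln(v) term that appears in CVM entropy expressions (0 at v = 0)."""
    return v * math.log(v) if v > 0 else 0.0

def fractions(counts):
    """Convert raw configuration counts to fractional values summing to 1."""
    total = sum(counts.values())
    return {cfg: c / total for cfg, c in counts.items()}

# Nearest-neighbor pair counts for a perfectly alternating 24-node chain:
# 12 (on, off) pairs and 12 (off, on) pairs.
y_counts = Counter({(1, 0): 12, (0, 1): 12})
y_frac = fractions(y_counts)

# One building block of the negative entropy: the sum of v*ln(v) terms
# over the fractional nearest-neighbor values.
neg_entropy_term = sum(lf(v) for v in y_frac.values())
print(neg_entropy_term)   # 2 * 0.5*ln(0.5) = -ln(2) ≈ -0.693
```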
Since we are setting the enthalpy parameters to zero, the free energy of each grid is identical with the negative entropy.
What we’ll find, as we experiment with different “node swaps,” is that every time we make a change, we create a higher-value (smaller magnitude) negative entropy – that is, we move the system away from its equilibrium point.
This is the first time that we’ve been able to play – to go “hands-on” with a system where we can modify the free energy just by swapping the on/off-states of different nodes. We can do this because the entropy term takes into account the nearest-neighbor, next-nearest-neighbor, and triplet values surrounding each node.
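To make the “node swap” idea concrete, here is a self-contained sketch (again our own illustration, with wraparound boundaries assumed) that swaps one “on” and one “off” node in a toy chain and recomputes just the nearest-neighbor v·ln(v) term; the actual code combines terms for all of the configuration variables:

```python
import math
from collections import Counter

def lf(v):
    """v*ln(v), with lf(0) = 0."""
    return v * math.log(v) if v > 0 else 0.0

def nn_term(chain):
    """Sum of v*ln(v) over fractional nearest-neighbor pair values,
    with periodic boundaries assumed. This is one building block of
    the CVM negative entropy, not the full expression."""
    n = len(chain)
    pairs = Counter((chain[i], chain[(i + 1) % n]) for i in range(n))
    return sum(lf(c / n) for c in pairs.values())

chain = [1, 0] * 12                      # alternating 24-node toy chain
before = nn_term(chain)                  # -ln(2) for this pattern

swapped = chain[:]                       # swap an "on" node with an "off" node
swapped[0], swapped[3] = swapped[3], swapped[0]
after = nn_term(swapped)

print(before, after)                     # the pair term shifts after the swap
```

Even this single term changes the moment two unlike nodes trade places, which is why the swap immediately moves the grid off its original free energy value.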
Implications for Controlling Equilibrium State
In a follow-on discussion (which will be offered by Themesis as a YouTube), we’ll investigate the next question: If we’ve moved the system away from its original equilibrium state, then:
- How far away are we from that original equilibrium state?
- Is the new state itself possibly at equilibrium, if we had different (non-zero) values for our enthalpy parameters?
- How can we find the enthalpy parameters that would define the equilibrium configuration closest to the new grid pattern that we’ve just created?
These are all questions that involve playing with a simple 1-D grid representation of a physics-based system.
The next step would be to experiment with a 2-D grid, for which the equations are also well-known. (The code is just a bit more complex.) Within the 2-D grid, we can see not only clusters of nodes but also connections between clusters.
With this kind of grid topology, and our ability to control cluster formation through enthalpy parameters, we now have the ability to control dynamic behavior in this system. We can cause clusters to emerge and then disappear. This means that we can use this grid as a mechanism for setting up a new kind of latent node layer in a new class of neural network.
By controlling activation patterns in the latent node layer, we can cause dynamic processes such as neuronal avalanches.
This means that we can now model a much more complex, diverse, and exciting realm of capabilities than has been possible before this time, setting the foundation for AGI.
Resources and References
Themesis YouTube: 1D CVM Code Walkthrough
Themesis YouTube: Worked Example for the 1D CVM
Themesis YouTube: CORTECONs: Architecture, Equations, and Connection with AGI
Themesis YouTube: CORTECONs and AGI: Powerful Inspiration from Brain Processes
Primary Reference on the 1D CVM
- Maren, A.J. (2016). The Cluster Variation Method: A Primer for Neuroscientists. Brain Sci. 6(4), 44. https://doi.org/10.3390/brainsci6040044
2-D Cluster Variation Method: The Earliest Works (Theory Only)
- Kikuchi, R. (1951). A theory of cooperative phenomena. Phys. Rev. 81, 988-1003.
- Kikuchi, R., & Brush, S.G. (1967). Improvement of the cluster-variation method. J. Chem. Phys. 47, 195. (Available for purchase through the American Institute of Physics; $30.00 for non-members.)
CVM Work by Others
Sajid et al. on using the 2-D cluster variation method to model cancer niches.
- Sajid, N., Convertino, L., & Friston, K. (2021). Cancer Niches and Their Kikuchi Free Energy. Entropy 23(5), 609. https://doi.org/10.3390/e23050609