Today, we focus on getting the entropy term in the 1-D cluster variation method (the 1D CVM), using a simple text string as the basis for our worked example.
Our End Goal
Our end goal – the reason that we’re investigating the cluster variation method (whether 1-D or 2-D) – is that we believe it will have three very substantial uses:
- Straightforward modeling of data that can be represented as a 1-D or 2-D grid,
- Enabling artificial general intelligence (AGI) via introducing a new model into the variational inference (and active inference) framework, and
- Enabling AGI even more substantially, by allowing cognitive systems to create internal associations and have behaviors leading to new, temporally-based pattern formations.
All three of these steps are new, and only brief elements of these advances have been published thus far.
In this blogpost, we build on previous work (both YouTube vids and the two prior blogposts) to perform a single worked example. We’re working with a 1-D cluster variation method grid (1D CVM), so we use a text string as the basis for our example.
Our first step, and the focus for this post (and the corresponding YouTube), is to compute values for the local configuration variables. Once we have those, we can compute the entropy term.
The enthalpy term is found by multiplying the interaction enthalpy parameter (epsilon1) by a sum of certain configuration variables. This means that we can’t compute the enthalpy until we’ve selected our epsilon1 parameter.
Actually, we’ll choose a range of such parameters, and leave that for Part 2 of this example.
Context
Moving from a simple, straightforward Ising equation, where the entropy is expressed simply in terms of “on” and “off” units, to the cluster variation method (CVM) entropy is like going from a game of gin rummy to poker.
It takes a bit longer to learn the rules.
In particular, it takes a little while to learn about the various card combinations that are useful for winning.
What we are doing now is similar to learning about “royal flushes” in poker.
We are seeking to learn about the various local configuration variables, as these are the elements used for building our entropy equation.
Specifically, we want to understand:
- Nearest-neighbors,
- Next-nearest-neighbors, and
- Triplets.
Our First Example
We’re going to use a text string as our first example for the 1-D CVM.
Here’s the string that we’ll use:
“Please like me on facebook thank you”
We’ve constructed this text string to fit the criterion of an EQUAL NUMBER of CONSONANTS AND VOWELS. (See more on this below.)
(AJM’s Note: To make this work, we followed the English language convention of letting “y” be a consonant when it is at the beginning of a sentence; since the “y” in “you” falls mid-sentence, it counts here as a vowel.)
See how we construct the 1-D cluster variation method grid (1D CVM grid) in Figure 2 below, incorporating this simple text string.
For this example, we are letting the consonants be the “on” or state A nodes, and the vowels be the “off” or state B nodes.
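For readers who want to follow along in code, here is a minimal sketch (plain Python, with variable names of our own choosing – this is not the Themesis repo code) that maps the string onto C/V states and verifies the equal-count criterion:

```python
# Minimal sketch: map each letter to "C" (consonant, state A) or "V" (vowel, state B).
# Note: "y" is treated as a vowel here, consistent with the 15-and-15 count for this string.

text = "please like me on facebook thank you"
vowels = set("aeiouy")

states = ["V" if ch in vowels else "C" for ch in text if ch.isalpha()]

print("".join(states))
print("C count:", states.count("C"), "| V count:", states.count("V"))  # 15 and 15
```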
So, when we think about the y value formed by a given node and its nearest neighbor (on the diagonal), there are three choices:
- y1: A-A, or (in our case) consonant-consonant (C-C), e.g., “P” + “L.”
- y2: A-B or B-A, or (in our case) consonant-vowel or vowel-consonant (C-V and V-C), e.g., “L” + “E,” or “A” + “S.”
- y3: B-B, or (in our case) vowel-vowel (V-V), e.g., “E” + “A.”
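As a quick illustration, here is one way to tally the three kinds of y pairs in code – a sketch that continues the one above, treating the chain as wrapping around envelope-style (function and variable names are our own assumptions):

```python
def count_y(states):
    """Tally nearest-neighbor (diagonal) pairs: y1 = C-C, y2 = C-V or V-C, y3 = V-V.
    Here y2 lumps both orientations together; see the degeneracy note below."""
    n = len(states)
    y1 = y2 = y3 = 0
    for i in range(n):
        pair = (states[i], states[(i + 1) % n])  # wrap around at the far right
        if pair == ("C", "C"):
            y1 += 1
        elif pair == ("V", "V"):
            y3 += 1
        else:  # C-V or V-C
            y2 += 1
    return y1, y2, y3
```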
Simple Getting-Started Exercise
The objective for our game here is to start by noting the different configuration variable values.
There are three different kinds of local configuration variables, in addition to the basic “on/off” (x) variables. These are:
- The y(i) – nearest neighbors – which are actually on the diagonal to a given node,
- The w(i) – next nearest neighbors – which are actually the nearest neighbors on the same row as a given node, and
- The z(i) – the triplets – which are “chevrons” that include the given node, its nearest-neighbor (diagonal), and its next-nearest-neighbor (same row).
There are three different kinds of nearest neighbors (i.e., “on-on,” “on-off,” and “off-off”). Similarly, there are three different kinds of next-nearest-neighbors, and six different kinds of triplets.
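Continuing the sketch above (same assumed names, same wrap-around convention), the w and z tallies look much the same; the z tally folds both orientations of the degenerate triplets into z2 and z5:

```python
def count_w(states):
    """Tally next-nearest neighbors: same-row pairs, i.e., nodes two steps apart."""
    n = len(states)
    w1 = w2 = w3 = 0
    for i in range(n):
        pair = (states[i], states[(i + 2) % n])
        if pair == ("C", "C"):
            w1 += 1
        elif pair == ("V", "V"):
            w3 += 1
        else:  # C-V or V-C
            w2 += 1
    return w1, w2, w3

def count_z(states):
    """Tally triplets ("chevrons"): a node, its diagonal neighbor, and its same-row neighbor."""
    n = len(states)
    z = {"z1": 0, "z2": 0, "z3": 0, "z4": 0, "z5": 0, "z6": 0}
    categories = {
        ("C", "C", "C"): "z1",
        ("C", "C", "V"): "z2", ("V", "C", "C"): "z2",  # degeneracy of two
        ("C", "V", "C"): "z3",
        ("V", "C", "V"): "z4",
        ("V", "V", "C"): "z5", ("C", "V", "V"): "z5",  # degeneracy of two
        ("V", "V", "V"): "z6",
    }
    for i in range(n):
        triplet = (states[i], states[(i + 1) % n], states[(i + 2) % n])
        z[categories[triplet]] += 1
    return z
```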
Figure 2 shows how we’re counting the triplets for this example. We denote the individual nodes as being either “C” (for consonant) or “V” (for vowel).
The first exercise is to verify the count for the six different z variables that we show in Figure 2. (Remember that the far left of the grid wraps around to connect with the far right, envelope-style.)
As an example, the first count indicates that we have just TWO “z1” triplets, or C-C-C triplets.
We should look at our text string and find the instances of C-C-C patterns.
Here’s one C-C-C triplet: the “k” from “facebook” combined with the “th” from “thank you.”
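One way to hunt for these in code, continuing the sketch above (this version reads each wrap-around triplet once, left to right):

```python
# Print each C-C-C triplet, with its starting position in the 30-letter string.
letters = [ch for ch in text if ch.isalpha()]
n = len(letters)
for i in range(n):
    if all(states[(i + k) % n] == "C" for k in range(3)):
        print(i, "".join(letters[(i + k) % n] for k in range(3)))
```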
The First Really Important Thing
First, for this example – and for a lot of other examples that we’ll run in the near future – it is very important that we have equal numbers of nodes in state A and state B. (In the case of this example, it means equal numbers of consonants and vowels.)
The reason that we need this is that when we have an equiprobable distribution of nodes across our two states (that’s just a fancy way of saying that the number of A nodes equals the number of B nodes), we can derive an analytic solution for what each configuration variable should be at any given value of the interaction enthalpy parameter.
This is like learning how to swim: getting into the pool for the first time, and holding on to the edge of the pool while practicing some swim kicks.
The analytic solution is our safety-grip. It’s like holding on to the edge of the pool.
There will be plenty of time later to go off into the “deep end.”
… Taking this back to our text string example … this means that there may be some awkward “string constructions” – because I’m working hard to get something with equal numbers of consonants and vowels!
The Second Really Important Thing
If we didn’t have an interaction enthalpy between the nodes – that is, if there were nothing to make like-near-like nodes “cling together” – then we would have a very simple probabilistic distribution of our various configuration variables.
Now, we will ALWAYS have:
y1 + 2*y2 + y3 = 1,
because each of the y configuration variables is a fraction, and they have to add up to one.
In the case where the interaction enthalpy (epsilon1) is 0, we would have:
y1 = y2 = y3 = 0.25.
Notice that we count y2 two different ways; we say that it has a degeneracy of two. That’s why we multiply it by the factor of two in the normalization equation above.
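A quick numeric check of this normalization, using the count_y sketch from above (y2 per orientation is half of the combined C-V/V-C tally):

```python
n = len(states)
y1, y2_both, y3 = count_y(states)
y1_frac, y2_frac, y3_frac = y1 / n, (y2_both / 2) / n, y3 / n
assert abs(y1_frac + 2 * y2_frac + y3_frac - 1.0) < 1e-12  # y1 + 2*y2 + y3 = 1
```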
Counting the Triplets
Similar to counting the y configuration variables, we count the z triplets. We have a degeneracy of two for the A-A-B (or C-C-V) and the B-B-A (V-V-C) triplets, because they can be counted two different ways (forward and back).
So just as with the y variables, we know what to expect if we have NO interaction enthalpy, and everything is randomly distributed in terms of consonant-vowel patterns:
- z1: C-C-C – a fraction of 0.125; with 30 triplets, that’s 30/8 = 3.75, or between 3 and 4.
- z2: C-C-V and, alternatively, V-C-C – so 2 × 3.75 = 7.5, or between 7 and 8.
- z3: C-V-C – again 0.125, or 3.75, between 3 and 4.
- z4: V-C-V – again 0.125, or 3.75, between 3 and 4.
- z5: V-V-C and, alternatively, C-V-V – so 2 × 3.75 = 7.5, or between 7 and 8.
- z6: V-V-V – again 0.125, or 3.75, between 3 and 4.
Back to Our Example
We have 30 characters in our text string; 15 each of consonants and vowels.
Based on simple probabilities, if we had NO INTERACTION ENTHALPY (epsilon1 = 0), then we would expect (approximately):
- y1 (C-C): 30 × 0.25 = 7.5, i.e., between 7 and 8,
- y2 (C-V and V-C, both orientations together): 30 × 0.5 = 15,
- y3 (V-V): 30 × 0.25 = 7.5, i.e., between 7 and 8,
- the same expectations again for the w (next-nearest-neighbor) counts, and
- the z counts as listed in the previous section.
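A short comparison sketch, reusing the tally functions from above, sets the actual z counts for our string against these epsilon1 = 0 expectations:

```python
n = len(states)  # 30 nodes
z = count_z(states)
expected = {"z1": n / 8, "z2": 2 * n / 8, "z3": n / 8,
            "z4": n / 8, "z5": 2 * n / 8, "z6": n / 8}
for key in sorted(z):
    print(f"{key}: counted {z[key]}, expected ~{expected[key]:.2f}")
```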
Just a Word on Languages
Some languages have a lot of simple consonant-vowel pairs. Japanese, for example, is one such language.
To say “good morning” in Japanese, one would say: “ohayō gozaimasu” (おはようございます).
There is one (sort-of) complex vowel here, the “ai” vowel pair.
There are other instances where there are two consonants together, as in the “ch” sound.
HOWEVER … by and large, the Japanese language has a lot of short, staccato consonant-vowel syllables.
Hawai’ian similarly has a lot of consonant-vowel syllables. “Good morning” in Hawai’ian is “Aloha kakahiaka.” (There’s one of those complex vowels here also.)
English, however, has a lot of words that have consonant-consonant pairs, and even a fair number of triple-consonants, e.g., the words “knight” and “thought.”
German, and many of the Slavic languages, have LOTS of consonants – and we’re not even going to touch those. (Remember, we need equal numbers of consonants and vowels, and that game is totally not going to happen with German. I tried. And failed.)
Also, we can’t play this game with Chinese words – because for that language, tonality counts, and that introduces many more variables. We need a simple binary variable system in order to play our game.
About the Code
We do not yet have new code to go with this blogpost (and the associated YouTube vid). We will update this blogpost as soon as the new code is ready.
(We MAY pinch-hit, as a temporary measure, by making some old, first-gen code available – simple, structured, non-object-oriented Python.)
The code that we are introducing with this YouTube vid and blogpost series is second-generation code. It is object-oriented. We accomplished the same free energy minimization task that is our first objective here in early, first-gen code (written in straightforward structured Python). However, we need an object-oriented approach in order to do the more interesting things … hence we are recasting our original code into an O-O framework, and are creating tutorial presentations (see the YouTube in Resources and References) as we do so.
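To give a flavor of what the object-oriented recasting involves, here is a purely hypothetical sketch – not the Themesis repo code; only the attribute name wLeft comes from the walkthrough – of a node that knows its own next-nearest neighbor to the left:

```python
class Node:
    """One node in the 1D CVM zigzag chain (hypothetical sketch)."""
    def __init__(self, index, state):
        self.index = index  # position in the chain
        self.state = state  # "C"/"V" here; "A"/"B" in the general grid
        self.wLeft = None   # next-nearest neighbor to the left (same row)

class Grid1DCVM:
    """Builds the chain and wires up wLeft for every node, envelope-style wrap-around."""
    def __init__(self, states):
        self.nodes = [Node(i, s) for i, s in enumerate(states)]
        n = len(self.nodes)
        for node in self.nodes:
            node.wLeft = self.nodes[(node.index - 2) % n]
```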
“Live free or die,” * my friend!
* “Live free or die. Death is not the worst of evils.” – attrib. to U.S. Revolutionary War General John Stark. https://en.wikipedia.org/wiki/Live_Free_or_Die
Alianna J. Maren, Ph.D.
Founder and Chief Scientist
Themesis, Inc.
Resources and References
Prior Related YouTubes
This Themesis YouTube introduces the object-oriented code for a 1D CVM grid, which is a preparation study for working with a 2D CVM grid. We present a code walkthrough for computing the value of a single node attribute, wLeft (the next-nearest-neighbor to the left), for a given node.
In the following YouTube, we discuss the equations a bit more, and offer a few visualizations for the local configuration variables.
Prior Related Blogposts
This blogpost accompanies the code walkthrough YouTube identified in the previous section.
- Maren, Alianna J. 2023. “1D CVM Object Instance Attributes: wLeft Details.” Themesis, Inc. Blogpost Series (Aug. 6, 2023). (Accessed Sept. 5, 2023; available online at https://themesis.com/2023/08/06/1d-cvm-object-instance-attributes-wleft-details/.)
This blogpost accompanies the architecture and equations YouTube identified in the previous section.
- Maren, Alianna J. 2023. “CORTECONs: A New Class of Neural Networks.” Themesis, Inc. Blogpost Series (Sept. 5, 2023). (Accessed Sept. 5, 2023; available online at https://themesis.com/2023/09/05/cortecons-a-new-class-of-neural-networks/.)
Themesis GitHub Repository
AJM’s Note: This is the same code that we’ve referred you to for the previous two blogposts.
Readings
2-D Cluster Variation Method: The Earliest Works (Theory Only)
- Kikuchi, R. (1951). A theory of cooperative phenomena. Phys. Rev. 81, 988-1003. (Accessed 2018/09/17.)
- Kikuchi, R., & Brush, S.G. (1967). “Improvement of the Cluster-Variation Method.” J. Chem. Phys. 47, 195. (Available online for purchase through the American Institute of Physics; $30.00 for non-members.)
1-D Cluster Variation Method: Computational Result
- Maren, A.J. (2016). The Cluster Variation Method: A Primer for Neuroscientists. Brain Sci. 6(4), 44. https://doi.org/10.3390/brainsci6040044. (Accessed 2018/09/19.)