Three “Golden Oldies” Point to AGI

This blogpost accompanies a YouTube video {* still in edit mode, check back for YouTube link coming soon *} in which I review three papers that I’d stashed in my “super-secret-special-storage tote” when I was moving from the mainland to Hawai’i – over ten years ago!

For ten years, the “special tote” languished – until I could move everything from a storage room in Illinois to my new home on Kaua’i. And then – wondering just what it was that made these papers so special – I unpacked the tote.

And I did so while looking at these papers along with you, in the accompanying vid. {* Coming soon. *}

Thumbnail: “Three Golden Oldies Point to AGI” (Image will be replaced by YouTube link when video is published)

TL;DR: Quick Summary; Two YouTubes, One Blogpost, One Paper

This blogpost accompanies the YouTube vid (video link to be inserted once it’s published) where I unpack a tub that stores some favorite papers from ten years ago – and assess their themes and content in terms of AGI relevance.

While I discuss several papers here, there’s only one that I’d recommend for your weekend reading. That would be “Epistemic Communities under Active Inference,” by Albarracin et al. (2022). (Details later in this section.)

The reason that I’m not recommending papers from my “super-special collection” is that while they are worthwhile, they’re VERY meaty.

Instead, here are THREE SUMMARY POINTS, drawn from this set of papers, and pointing towards AGI:

  • Bayesian-based (probabilistic) reasoning will continue to be very important for AGIs. Bayesian reasoning is already a cornerstone of generative AI, but it will be even more important in AGI. One of the reasons that it will be more important is that we’re going to activate concepts within the AGI world model. This means that our AGI, given a certain set of starting inputs, will have to identify which ontologies and protocols (processes) are likely to be important, and activate them (at least do some pre-activation), so that they are ready to be called up and/or integrated as needed. This means some probabilistic reasoning.
  • We will need control structures and control algorithms. Our AGIs will have internal feedback loops. We will want the AGI to adjust its parameters for a variety of internal processes. We’ll talk about this in future YouTubes and blogposts; for now – one of the key requirements for anyone assembling an AGI “Dream Team” will be at least one helluva-good controls engineer.
  • AGI internal behaviors will not be simplistic. Right now, we’re semi-enthralled with what “agentic AI” can do. Next-next gen, when we start working with REAL intelligence, we’ll see some collective behaviors that are VERY different from what current systems are capable of doing. These may be behaviors entirely within a single AGI, or they may involve collectives (“swarms”). That’s the reason that I think the Langton (1990) paper is important. Doesn’t matter that it deals with cellular automata. Doesn’t matter that it’s 35 years old. It triggers our thinking about group behavior, and about phase transitions, or shifts in the overall nature of behavior. This pulls on some more advanced statistical mechanics than is typically involved in AI work, but I do believe that fully capable AGIs will need this.

You’ll find your choice of action items and follow-ups at the end of this blogpost; look for “Going Deeper: What to Read/Watch Next.”


Why We Don’t Have AGI (Yet)

One of the big challenges in creating AGI is that it will be MUCH more complex than today’s relatively simple AI systems.

Current AI systems, based on LLMs (Large Language Models), have succeeded (in large part) because they use the same architecture, repeated many, many times. Conceptually, it’s fairly simple and straightforward.

Figure 1. Current LLM-based AI tools are like brutalist architectures – the same basic structure, repeated again and again.

One of the primary reasons that current AI tools have been so successful is that they use the same basic architecture, just taken to larger and larger scales.

Figure 2. Transformers (the basis for LLMs, or Large Language Models, along with image and video generators) generate the “next” token in a series based on a probabilistic assessment from the string of prior tokens. The newly-generated token becomes part of the string used to generate the next token, and so on.
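If you’d like to see that generation loop in miniature, here’s a toy sketch in Python. To be clear, this is NOT a transformer – there’s no neural network, no attention, and the probability table below is completely made up – but it shows the essential “sample the next token, append it, repeat” cycle that Figure 2 describes. A real LLM computes these probabilities from the entire prior context, using billions of learned weights.

```python
import random

# Toy conditional probabilities P(next_token | last_token).
# All numbers and tokens are invented purely for illustration.
NEXT_TOKEN_PROBS = {
    "the":     {"cat": 0.5, "dog": 0.4, "island": 0.1},
    "cat":     {"sat": 0.7, "ran": 0.3},
    "dog":     {"sat": 0.4, "ran": 0.6},
    "island":  {"sat": 0.2, "ran": 0.8},
    "sat":     {"quietly": 0.6, "down": 0.4},
    "ran":     {"quietly": 0.2, "down": 0.8},
    "quietly": {"<end>": 1.0},
    "down":    {"<end>": 1.0},
}

def generate(start: str = "the", max_tokens: int = 10) -> list[str]:
    """Repeatedly sample a next token; each new token extends the string."""
    tokens = [start]
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS.get(tokens[-1])
        if dist is None:
            break
        candidates, weights = zip(*dist.items())
        nxt = random.choices(candidates, weights=weights, k=1)[0]
        if nxt == "<end>":
            break
        tokens.append(nxt)
    return tokens

print(" ".join(generate()))   # e.g., "the dog ran down"
```

The only point of the sketch is the loop structure: generate, append, condition on the longer string, generate again.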

Ilya Sutskever, a Co-Founder of SSI (Safe Superintelligence), made a point about the computational simplicity of LLMs when he was Chief Scientist at OpenAI, and was bringing ChatGPT to the public. See Sutskever’s interview with Lex Fridman in Fridman’s Podcast #94 – look at about minute 56 and onward … Sutskever points to how the “language model needs to be larger” in order to probabilistically generate good “next” tokens. Also, he gives a good description of how and why transformers work so beautifully, starting at about 1:00:44. (This may be an old interview (2020), but VERY useful in setting context!)

The reasoning capabilities in recent GPTs are due to coupling RLHF (reinforcement learning from human feedback) onto the baseline LLM. RLHF is also fairly straightforward, in that it is very goal-directed. While it may play a role in an AGI, it will not (in and of itself), give us AGI.

Figure 3. Current GPTs use RLHF (reinforcement learning from human feedback) to help the GPT answer questions that involve reasoning or “chain-of-thought” processes.
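And here’s an equally toy-sized sketch of what “goal-directed” fine-tuning looks like. This is NOT how RLHF is actually implemented (real RLHF trains a reward model on human preference rankings and then fine-tunes the transformer policy, typically with PPO); the three canned responses, the hand-coded reward scores, and the simple REINFORCE-style update below are all invented, just to show the reward-driven feedback loop.

```python
import math
import random

responses = ["helpful answer", "rambling answer", "refusal"]
logits = [0.0, 0.0, 0.0]                 # toy "policy" parameters

def reward_model(response: str) -> float:
    """Stand-in for a learned reward model; these scores are invented."""
    return {"helpful answer": 1.0, "rambling answer": 0.2, "refusal": 0.1}[response]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

LEARNING_RATE = 0.1
for step in range(500):
    probs = softmax(logits)
    idx = random.choices(range(len(responses)), weights=probs, k=1)[0]
    r = reward_model(responses[idx])
    # REINFORCE-style update: nudge probability toward higher-reward responses.
    for j in range(len(logits)):
        grad = (1.0 if j == idx else 0.0) - probs[j]   # d log(pi) / d logit
        logits[j] += LEARNING_RATE * r * grad

print({resp: round(p, 3) for resp, p in zip(responses, softmax(logits))})
# After training, most of the probability mass sits on "helpful answer".
```

The key point: the loop is steered by a single reward signal, which is exactly what makes it goal-directed – and also why, by itself, it doesn’t give us AGI.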

But First, a Bit of Backstory

In 2016, when I was getting ready to move from Illinois to the Big Island of Hawai’i, I was doing a major consolidation of my vast treasure trove of paper-printed research articles. This “trove” had originally occupied many drawers in many file cabinets, and had papers going back to the 1980s (and some even earlier).

Figure 4. The Hawai’ian islands, courtesy NASA.

But multiple moves forced multiple trim-downs, and getting ready for the Hawai’i move – that was the biggest “trim down” of them all.

I remember going to the local public library a few afternoons each week, and taking a tub or crate or box of papers into the library, dumping it on a table, and just spending the afternoon … dealing with the contents.

Sometimes it was processing. I’d had several conference presentations that (after several computer deaths) existed only in paper form. Each of those got scanned, consolidated into its own LaTeX file, and re-created as new digital files corresponding to the originals.

Some of that work was … just purging. Massive amounts of just letting go.

Over time, the massive quantities got consolidated. Where there had been boxes and tubs and totes, there were now … just tubs and totes.

Figure 5: Moving from the mainland to the Big Island, and then to the smaller island of Kaua’i meant consolidating LOTS of scientific papers and reports!

And as the day for my departure to Hawai’i got closer, I began putting tubs and totes into a rented storage room. The “big plan” was to ship all of this to myself, once I got established on the Big Island.


The Great Move (and Unload)

Well … it took longer than expected to “ship stuff.”

In fact, it took about ten years – during which time, I’d moved from the Big Island to the much SMALLER island of Kaua’i, on the OTHER side of the Hawai’ian island archipelago.

Figure 6: The two moves: first from the mainland to the Big Island of Hawai’i, and the second from the Big Island to Kaua’i. Image courtesy USGS.

And it was not that I knew anyone on Kaua’i, or had a special reason or invitation.

I simply kept feeling this calling – as in, Kaua’i was where I was supposed to be.

And since I knew (on the Big Island) that another big move was forthcoming, I didn’t ship things … and when I actually did do the Big Island-to-Kaua’i move, I once again put a whole lotta stuff in storage.

On Kaua’i, I lived (and still live) in a lovely little plantation-style cottage – the perfect dreamsicle of a Hawai’ian abode.

And I waited until I had been here for three years before mustering the courage to move two loads of stuff – one from the mainland, and one from the Big Island – to my little home on Kaua’i.

And so, it was just a few months ago, with the shipment from the mainland washed up like tidal wave debris in my living room, that I looked through the contents of that very special tote of “very special stuff.”

And I thought – this would be good stuff to share with you. Hence this blogpost and the accompanying YouTube.


What I Realized When I Looked at These Papers a Second Time

When I took a first pass at these papers, a few months ago, I thought, “OK, these are interesting … I sort of get why I was saving them.”

But when I looked at them the night before shooting vid, I realized that they were more than interesting – each paper was an access point, a pointer of sorts, to a whole topical area that was not just relevant to creating AGIs, but was essential.

In this blogpost (and in the accompanying YouTube), we not only look at certain specific papers – we identify how each paper acts as a pointer to a specific architectural AGI component. For example, a paper that discusses human cognition actually points to how AGIs will need world models with different representation levels (neural up through symbolic).

Figure 7. Each paper that we introduce in this blogpost acts as a “pointer” to a specific component of AGI architecture – in this case, the paper that we’ll discuss on “Operator’s Comprehension” (Yufik and Sheridan, 2002) opens the conversation about world models and multiple representation levels.

Prelude to the World of AGIs

(This paragraph and YouTube link replicate content from my previous post:) Within a very short time, we’ll be living in a world that includes not only AIs, but AGIs – artificial general intelligences. In a 2025 Lex Fridman YouTube interview with Google CEO Sundar Pichai, Pichai hovers around 2030 as an “AGI emergence” date that has been offered by several researchers.

Lex Fridman interviews Google & Alphabet CEO Sundar Pichai (June, 2025).

Dynamic Ontologies and Belief Propagation

The first paper that I pulled was “Understanding Belief Propagation and Its Generalizations” (Yedidia et al., 2001), and it is still relevant today. This paper discusses inference (a very important topic today), but – more important for our needs – it prompts us to think about how we will create ontology activations within our AGI world models.

We know world models will play an important role in AGI. One important component of a world model will be an ontological, or symbolic, representation of the important concepts and entities that it contains.

As our AGI gains information pointing to any one or more concepts or entities within the world model, it needs to ask: “What other concepts or entities, related to the ones that I’ve just activated, are likely to be important and need to be similarly activated?” (The AGI may also have a means for partial activation, or activation readiness, for certain ontological elements.) The notions of belief activation and of belief propagation (the latter introduced by Pearl, 1986) are ones that we may revisit conceptually, and use as a springboard for new AGI-related ontology activation methods.

Pearl’s approach was very probabilistic; Yedidia et al. transition this to a message-passing algorithm that works with graphical representations.
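To make both ideas a little more concrete – probabilistic “pre-activation” of related concepts, and the kind of message passing that Yedidia et al. describe – here’s a hypothetical toy sketch: exact sum-product belief propagation over a three-node chain of binary “concept” nodes, followed by a thresholding step that decides which concepts to pre-activate. The concepts, potentials, and threshold are all invented for illustration; a real AGI world model would be enormously richer.

```python
import numpy as np

# Tiny chain of binary concept nodes: "storm" -- "flooding" -- "road_closure".
# State 0 = inactive, 1 = active.  All potentials are invented.
psi = np.array([[1.0, 0.3],
                [0.3, 1.0]])          # pairwise potential: linked concepts tend to agree

phi = {
    "storm":        np.array([0.1, 0.9]),   # input evidence: a storm is very likely
    "flooding":     np.array([0.5, 0.5]),   # no direct evidence
    "road_closure": np.array([0.5, 0.5]),
}

def normalize(v):
    return v / v.sum()

# Sum-product messages along the chain (exact, since the graph is a tree).
m_storm_to_flood = normalize(psi.T @ phi["storm"])
m_flood_to_road  = normalize(psi.T @ (phi["flooding"] * m_storm_to_flood))
m_road_to_flood  = normalize(psi @ phi["road_closure"])
m_flood_to_storm = normalize(psi @ (phi["flooding"] * m_road_to_flood))

beliefs = {
    "storm":        normalize(phi["storm"] * m_flood_to_storm),
    "flooding":     normalize(phi["flooding"] * m_storm_to_flood * m_road_to_flood),
    "road_closure": normalize(phi["road_closure"] * m_flood_to_road),
}

PREACTIVATION_THRESHOLD = 0.6    # invented threshold for "warming up" a concept
for name, b in beliefs.items():
    flag = "pre-activate" if b[1] > PREACTIVATION_THRESHOLD else "leave dormant"
    print(f"{name:12s}  P(active) = {b[1]:.2f}  ->  {flag}")
```

Run it, and you’ll see the evidence for “storm” ripple through the chain, raising the belief that “flooding” and even “road_closure” are active – which is the flavor of “activate the related concepts, and get them ready” that we want in an AGI world model.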

We’re not asserting that either Pearl’s belief propagation (BP) or Yedidia et al.’s generalized BP approach are exactly what we’ll need and use in AGI world models. Rather, we note that these works stand as important intellectual precursors in an area that must be addressed as we create effective AGIs.


Operator’s Comprehension (“Swiss Army Knife”)

Yufik and Sheridan (2002) proposed a method that they called a “Virtual Associative Network” (VAN), designed to “improve decision aiding in complex tasks involving multiple variables and rapidly changing constraints.” Their goal was to enable a shift from “search” (which we knew all too well in the days leading up to transformers) to “comprehension.” They note that “The VAN model … attributes comprehension to a form of information integration which is unique to the human memory and diminishes the need for search.”

Figure 8. Extract from the Abstract of the “Swiss Army Knife” paper by Yufik and Sheridan (2002). (See References and Resources for full citation and link.)

Once again – we are not necessarily looking to use the VAN model directly; rather to take it as a point of inspiration as we seek to give AGIs the ability to “comprehend.”


Adaptive Critic

Paul Werbos (who invented backpropagation, and who funded Barto and Sutton for early reinforcement learning work while he was a Program Director at NSF) also developed the adaptive critic, which is a neural control method. The paper that I pulled from the tote was actually a patent; the one listed in the References is a 1998 arXiv article on the same topic.

Figure 9. Extract from the Abstract of the “Stable Adaptive Control” paper by Werbos (1998). (See References and Resources for full citation and link.)

AGIs will need to include control methods as integral architectural components. Once again, we are not necessarily suggesting that adaptive critics are the methods that should be used. As an alternative, the active inference method proposed by Friston et al. may also prove useful. (Friston’s work with colleagues extends over more than a decade; see the references to his work in the previous blogpost, “How to Prepare for the AGI World.”)

The reason that we’ll need some sort of control method (possibly more than one, depending on how and where we use it) is that our AGIs will need to vary their parameter settings dynamically over time. Allowing for adaptive control will facilitate this.
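As a very loose illustration of “adjusting internal parameters via feedback,” here’s a toy sketch of such a loop. To be clear: this is NOT Werbos’s adaptive critic (which learns a critic network that approximates cost-to-go), nor Friston’s active inference. It’s just a finite-difference tuning loop, with an invented internal cost function and an invented “temperature” parameter, meant to show the shape of the problem that a real controls engineer would solve far more rigorously.

```python
import random

def internal_cost(temperature: float) -> float:
    """Stand-in for an AGI's internal performance measure (unknown to the controller).
    Here: a noisy quadratic whose best value sits at temperature = 0.7 (invented)."""
    return (temperature - 0.7) ** 2 + random.gauss(0.0, 0.01)

temperature = 0.2          # the internal parameter we want to tune on the fly
step_size = 0.05
probe = 0.05               # small perturbation used to estimate the local slope

for t in range(200):
    # Crude "evaluator": compare observed cost at slightly perturbed settings.
    cost_up = internal_cost(temperature + probe)
    cost_down = internal_cost(temperature - probe)
    slope_estimate = (cost_up - cost_down) / (2 * probe)
    # Adjustment step: move the parameter against the estimated slope.
    temperature -= step_size * slope_estimate
    temperature = min(max(temperature, 0.0), 2.0)   # keep it in a sane range

print(f"tuned temperature ≈ {temperature:.2f}   (the invented optimum was 0.70)")
```

The point is simply that the system observes its own performance, estimates which way to move a parameter, and keeps nudging – a feedback loop, running continuously, inside the architecture.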


Chaos, Phase Transitions, and AGI Swarms

Although I’d planned to pull only three papers for discussion, there was a fourth that was important – Langton’s 1990 work, “Computation at the Edge of Chaos,” which discusses cellular automata behaviors and makes the connection to phase transitions (a notion from statistical mechanics). He notes that “by selecting an appropriate parameterization of the space of CAs [cellular automata], one observes a phase transition between highly ordered and highly disordered dynamics …”
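Langton’s control knob is the parameter lambda – roughly, the fraction of entries in a cellular automaton’s rule table that map to a non-quiescent (“live”) state. Here’s a crude, hypothetical probe of that idea for a binary, radius-1, one-dimensional CA. Langton’s own analysis used entropy and mutual-information measures over much larger rule and state spaces; the “mean end-state activity” below is just an invented stand-in, to show how behavior shifts as lambda increases.

```python
import random

CELLS, STEPS = 200, 200

def random_rule(lmbda: float) -> dict:
    """Binary, radius-1 rule table with a fraction `lmbda` of entries mapping to 1."""
    neighborhoods = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    n_live = round(lmbda * len(neighborhoods))
    live = set(random.sample(neighborhoods, n_live))
    return {nbhd: (1 if nbhd in live else 0) for nbhd in neighborhoods}

def run(rule: dict) -> float:
    """Run the CA from a random start (periodic boundary); return final fraction of 1s."""
    row = [random.randint(0, 1) for _ in range(CELLS)]
    for _ in range(STEPS):
        row = [rule[(row[i - 1], row[i], row[(i + 1) % CELLS])] for i in range(CELLS)]
    return sum(row) / CELLS

for lam in (0.0, 0.125, 0.25, 0.375, 0.5):
    activities = [run(random_rule(lam)) for _ in range(10)]
    print(f"lambda = {lam:5.3f}   mean end-state activity = {sum(activities)/10:.2f}")
```

At low lambda, activity dies out; near the middle of the range, behavior becomes much more varied – which is the region Langton associates with the “edge of chaos,” where interesting computation lives.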

This is important precursor work because our AGIs will not always exist in isolation, or interact with each other in only limited ways. Instead, we envision that some AGIs will operate as communities, or swarms. Albarracin et al. (2022) describe “Epistemic [knowledge-seeking] Communities under Active Inference.”

Albarracin et al. describe their system, saying “We build a model based on active inference, where agents tend to sample information in order to justify their own view of reality, which eventually leads them to have a high degree of certainty about their own beliefs. We show that, once agents have reached a certain level of certainty about their beliefs, it becomes very difficult to get them to change their views.”
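Their actual model is built on active inference – agents minimizing expected free energy over a generative model – which is considerably more sophisticated than anything I can sketch here. Still, the entrenchment effect they describe can be illustrated with a deliberately simple (and entirely invented) toy: agents hold Beta-distributed beliefs about a binary claim, and the more convinced an agent becomes, the more likely it is to consult a source that agrees with it.

```python
import random

TRUE_PROB = 0.5            # the world is genuinely ambiguous (invented setup)
N_AGENTS, N_ROUNDS = 20, 300

# Each agent's belief that "the claim is true" is a Beta(alpha, beta) distribution.
agents = [{"alpha": 1.0, "beta": 1.0} for _ in range(N_AGENTS)]

def mean_belief(agent):
    return agent["alpha"] / (agent["alpha"] + agent["beta"])

for _ in range(N_ROUNDS):
    for agent in agents:
        p = mean_belief(agent)
        # Confirmation-biased sampling: the more convinced the agent is,
        # the more likely it is to consult a source that agrees with it.
        if random.random() < abs(p - 0.5) * 2:
            observation = 1 if p > 0.5 else 0                         # echo-chamber source
        else:
            observation = 1 if random.random() < TRUE_PROB else 0     # unbiased source
        agent["alpha"] += observation
        agent["beta"] += 1 - observation

print(sorted(round(mean_belief(a), 2) for a in agents))
```

Run it a few times: even though the underlying “world” is genuinely ambiguous, the population typically splits into confident believers and confident deniers – a crude echo of the paper’s finding that, past a certain level of certainty, agents become very hard to budge.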

This recent paper is well worth a read, and – most significantly – closes the loop that we opened at the beginning of this survey of “three golden-oldies.” Beliefs, probabilistically expressed, were important over thirty years ago, as researchers were just beginning to think about how AIs could form their “reality representations.”

Now, we are moving away from a simplistic (and not necessarily trustworthy) probabilistic approach embedded into transformer-based systems. AGIs will need a more robust reality representation, one that can be verified and trusted with some known degree of certainty.

The thinking that researchers did between thirty and forty years ago can springboard today’s new methods, suitable for AGIs.


Going Deeper: What to Read/Watch Next

One YouTube and One Blogpost (Mine)

For a little reading on that last point, here’s a YouTube and accompanying blogpost.

YouTube: In particular, the discussion of neuronal avalanches (starting 27 min in) is useful and interesting for thinking about more complex internal AGI behaviors.

Maren, Alianna J. 2023. “CORTECONs and AGI: Powerful Inspirations from Brain Processes.” Themesis, Inc. YouTube Channel (Nov. 23, 2023). (Accessed Oct. 9, 2025; available online at CORTECONs and AGI.)

Blogpost:

  • Maren, Alianna J. 2023. “Next-Era AGI: Neurophysiology Basis.” Themesis, Inc. Blogpost Series (Nov. 15, 2023). (Next-Era AGI.) (NOTE: This blogpost has some links to VERY GOOD REFERENCES re/ neurophysiology behaviors – avalanches and other phenomena; some very old, some current up to 2023.)

The Other YouTube and a More “Meaty” Paper (Albarracin et al., 2022)

I like this YouTube interview, where host Denise Holt (on the Spatial Web AI channel) interviews Dr. Mahault Albarracin (Dir. Research, Verses.ai) and Dr. David Bray (Principal & CEO, LDA Ventures). I’m about half-way through this vid (will finish up this weekend), and find that the points that Drs. Albarracin and Bray are making about the future of AI (specifically, AGI) are well-thought-out and cogent. Highly recommended; we’ll be talking about this in the future. This is a recent vid – July, 2025 – so the topics are very current.

As an accompanying paper, I’m reading back into the past just a bit with the (previously-mentioned) paper by Albarracin et al. (2022). The reason that this is important is that it deals with how communities of AGIs will behave and interact. This paper particularly deals with how – within AGI communities – some forms of “consensus beliefs” will emerge. I’m not going into detail just now. (We’ll pick up on this again in the future.) It’s just that this paper picks up on TWO very important streams-of-thought:

  • How interacting AGI communities will behave and form a consensual world model, and
  • The importance of probabilities – once again – shared across multiple interacting agents, in forming beliefs.

Important to note: the terminology used here is “epistemic,” meaning “knowledge-based.” So this paper, and related ones, deal with creating “knowledge-based beliefs,” that is, beliefs built up over observations made by various community members.

This paper – and the authors – are all squarely within the Fristonian active inference camp. Friston has several other papers that address “epistemic beliefs”; I’ve found one of the Friston classics on this subject – see the next subsection. (Bonus: extra YouTube/blogpost.)


Just a Little Bit More – One Extra YouTube; One Extra Blogpost (Friston on “Epistemic”), One Extra Paper

The Paper

Probably the best paper to get a sense of Friston and colleagues on their notion of “epistemic” behaviors (and active inference) is Friston et al., 2015, “Active Inference and Epistemic Value.” (I’m going to have to re-read this myself, this weekend – and it IS meaty, being a Friston original.) The important thing about this paper is that – coincident with the timeframe in which he was introducing his major active inference concept – he was talking here about different kinds of behaviors, which Friston and colleagues characterize as the difference between “extrinsic and epistemic (or intrinsic) value.”
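For reference, the central quantity in that paper is the expected free energy of a policy, which decomposes into an epistemic (information-seeking) term and an extrinsic (preference-satisfying) term. The form below is how it’s commonly written in the active inference literature (I’m stating it from memory, so check the paper for Friston et al.’s exact notation and sign conventions):

```latex
G(\pi,\tau) =
  \underbrace{-\,\mathbb{E}_{Q(o_\tau \mid \pi)}\Big[ D_{\mathrm{KL}}\big[\, Q(s_\tau \mid o_\tau, \pi) \,\|\, Q(s_\tau \mid \pi) \,\big] \Big]}_{\text{negative epistemic value (expected information gain)}}
  \;-\;
  \underbrace{\mathbb{E}_{Q(o_\tau \mid \pi)}\big[ \ln P(o_\tau) \big]}_{\text{negative extrinsic value (expected log-preference)}}
```

Since policies are chosen to minimize G, an active inference agent is pulled both toward observations that resolve uncertainty about hidden states (epistemic value) and toward observations that match its prior preferences (extrinsic value).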

It takes a while to understand Friston’s concepts and his particular vocabulary; I’m going to restart my studies with this one and build up to more recent works. (Yet again.)

The Blogpost

Here’s a blogpost where I discuss this; good walkthrough on what leads up to this (historical, a time-chart, lots of good refs):

And the YouTube

Here’s a YouTube on the same:

Maren, Alianna J. 2024. “AGI: Comparing Renormalising Generative Models (RGMs) with Action Perception Divergence (APD).” Themesis, Inc. YouTube Channel (Sept. 1, 2024). (Accessed Oct. 9, 2025; available at Renormalising.)

Taken together, these last three will give us a superb grounding in Friston’s approach to active inference and building up “epistemic” beliefs. These form the supporting basis for Albarracin et al.’s (2022) work, where they look at communities or swarms of interacting AGIs. And from there, we can start thinking about what it will be like to create AGIs as interacting and communicating communities, and not just as isolated beings – an important shift in our perspective!

Thank you for being with me – to this (rather substantive) end – and we’ll be following up on these topics soon!

All my best – AJM



References and Resources

  • Albarracin, Mahault, Daphne Demekas, Maxwell J. D. Ramstead, and Conor Heins. 2022. “Epistemic Communities under Active Inference.” Entropy 24(4): 476. doi:10.3390/e24040476. (Accessed Oct. 3, 2025; available online at Entropy.)
  • Fridman, Lex. 2025. “Sundar Pichai: CEO of Google and Alphabet | Lex Fridman Podcast #471.” Lex Fridman YouTube Channel (June, 2025). (Accessed Sept. 12, 2025; available at Lex Fridman Interview with Sundar Pichai.)
  • Friston, K., F. Rigoli, D. Ognibene, C. Mathys, T. Fitzgerald, G. Pezzulo. 2015. “Active Inference and Epistemic Value.” Cogn Neurosci, 1-28. (Accessed Aug. 30, 2024; available online at ResearchGate, and also at ChrisMathys.com.)
  • Langton, Chris. 1990. “Computation at the Edge of Chaos: Phase Transitions and Emergent Computation.” Physica D. 42:12-37. (Accessed Oct. 2, 2025, photocopy of original journal article available online at Georgia Inst. Tech.)
  • Maren, Alianna J. 2024. “Contrast and Compare: Friston et al. (2024) and Hafner et al. (2022).” Themesis, Inc. Blogpost Series (Aug. 30, 2024). (Accessed Oct. 10, 2025; available at Contrast-and-Compare.)
  • Pearl, Judea. 1986. “Fusion, Propagation, and Structuring in Belief Networks.” Artificial Intelligence 29(3) (Sept. 1986): 241-288. (Also issued as Technical Report R-42.) (Accessed Oct. 7, 2025; available as PDF.)
  • Werbos, Paul J. 1998. “Stable Adaptive Control Using New Critic Designs.” arXiv adap-org/9810001v2 (v1: 25 Sep 1998, v2: 20 Nov 2012). doi:10.48550. (Accessed Oct. 2, 2025; available online at arXiv.) Originally published in SPIE Ninth Workshop on Virtual Intelligence/Dynamic Neural Networks, Vol. 3728 (Eds. Lindblad, Thomas; Padgett, Mary Lou; and Kinser, Jason M.) (Mar., 1999), 510-579. doi:10.1117/12.343068.
  • Yedidia, Jonathan S., William T. Freeman, and Yair Weiss, 2001. “Understanding Belief Propagation and Its Generalizations.” Mitsubishi Electric Research Laboratories (http://www.merl.com). TR2001-22 November 2001. (Accessed Oct. 2, 2025; available online at Understanding Belief Propagation.)
  • Yufik, Yan M., and Thomas B. Sheridan. 2002. “Swiss Army Knife and Ockham’s Razor: Modeling and Facilitating Operator’s Comprehension in Complex Dynamic Tasks.” IEEE Trans. SMC Part A: Systems and Humans 32(2) (March 2002): 185. doi:10.1109/TSMCA.2002.1021107. (Accessed Oct. 2, 2025; available online at ResearchGate.)
