“Can an AI have a conscience?”
That was the question that my friend and colleague, Lee Goldberg, asked me as we were putting together a talk that we gave (Lee as moderator, me as speaker) last month at the Trenton Computer Festival.
AI with a Conscience?
The movie 2001: A Space Odyssey suggested that a monolithic AI could decide – on its own – to turn against its human creators.
This isn’t a new concept.
Since before Mary Shelley’s 1818 novel, Frankenstein, we have feared that we might, as gods, make something “in our own image and likeness” that would turn against us.
And interestingly, in Shelley’s Frankenstein, the creature that Dr. Frankenstein created DID have a conscience; it (he?) experienced remorse.
However … over the past few years, movie and TV producers have capitalized on our fears of “AI-without-a-conscience,” producing more and more shows with a malevolent AI.
YouTube: “Conscience, Consciousness, and AGI”
This blogpost is associated with the YouTube video on AGI that riffs on the question: “Can an AI have a conscience?”
Not a New Fear
We keep imagining monolithic AGI-entities with the power and autonomy to decide to turn against us, their creators.
This is a very, VERY old theme; it goes back at least as far as the 1970 movie Colossus: The Forbin Project.
As long as we’ve been able to imagine AGIs, we’ve also imagined that they would independently and autonomously turn against us.
Simultaneously, we’ve also envisioned these AGIs as large, complex, powerful … and monolithic. (Except that more recently, the “evil AGI” is resident in a pretty young woman or a child.)
Whether we’ve thought of these AGIs as literal monoliths (e.g., in the Space Odyssey book and later the movie series), or as figurative ones (the Russian and American AGIs of Colossus), we seem to envision each of them as a single large, very powerful entity.
Until Recently, Still Monolithic
Until very recently, this notion of the increasingly “monolithic” AI has also underlain our actual system design.
We’ve built increasingly large LLMs (large language models), which replicate the same fundamental algorithm … again, and again … and again.
And as we’ve replicated this algorithm, our LLMs have required increasing numbers of parameters (now in the trillions).
In short, we’ve been building monolithic AIs.
First, Take a Deep Breath
So when someone asks me a question that suggests the possibility of an AI turning evil – some sort of independent free will that could emerge within an AI and cause it to “turn against” its creators – my first suggestion is: take a deep breath.
Look, really LOOK, at where this question is coming from.
The first thing we need to do is identify which of our questions are coming from a place of fear.
Psychologists would call this “projecting our shadow.” That means, we each have “stuff” that we don’t like to acknowledge; perhaps we’re even ashamed of it. And we tend to be very willful about holding onto our “stuff.”
But because we don’t feel good about it, we tend to project our own “stuff” (greed, ego, fear) onto others.
And it is very, VERY easy to project these fears, these “shadows” or dark sides of our own personalities, onto AIs – even AIs that are too dumb to tie their own shoelaces, much less take over the world.
Next, Look at the Architecture
Now, to answer the question of “conscience,” or even something remotely like “conscience,” we need to look at the architecture.
It’s important that we remember that an AI – no matter how fancy or complex – is still just lines of computer code working on tons of stored data. For now and for the foreseeable future, that’s all that an AI can be.
So for an AI to do something for which it could have a “conscience,” we first need to figure out where in the architecture this kind of ability would reside.
In the Trenton talk, and in the last YouTube vid, I introduced the five key evolutionary steps that we would need to get a fully-capable AGI.
For different and increasing capabilities, the AGI must have (a toy code sketch of this checklist follows below):
- Level 1: BOTH signal-level and symbolic-level capabilities operating within the system,
- Level 2: Communication BETWEEN these two capabilities (or “representations”),
- Level 3: At least some minimal parameter controls governing the system that mediates this communication,
- Level 4: A “Reasoning” system that operates largely on the world representation encapsulated in the ontology (or symbolic reasoning component), and – FINALLY –
- Level 5: “Goals and Mega-Controls” along with a means of differentiating between “self” and “other.”
The following figure shows the full AGI-5 architecture.
The full set of five evolutionary steps to create an AGI architecture was shown in this YouTube video:
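To make that checklist a bit more concrete, here’s a toy sketch of the five levels as a simple capability check. Every name in it is hypothetical – it just restates the list above as code; it is not an implementation of anything.

```python
# Toy sketch of the five evolutionary levels described above.
# All class and field names are hypothetical; this is the checklist
# restated as code, not a working system.

from dataclasses import dataclass


@dataclass
class CandidateAGI:
    has_signal_level: bool = False          # e.g., an LLM or other signal-level learner
    has_symbolic_level: bool = False        # e.g., an ontology / knowledge graph
    levels_communicate: bool = False        # Level 2: signal <-> symbolic exchange
    has_mediating_controls: bool = False    # Level 3: parameters governing that exchange
    has_reasoning_system: bool = False      # Level 4: reasoning over the ontology
    has_goals_and_self_model: bool = False  # Level 5: goals, mega-controls, "self" vs. "other"


def agi_level(c: CandidateAGI) -> int:
    """Return the highest evolutionary level (0-5) that the candidate satisfies."""
    checks = [
        c.has_signal_level and c.has_symbolic_level,  # Level 1
        c.levels_communicate,                         # Level 2
        c.has_mediating_controls,                     # Level 3
        c.has_reasoning_system,                       # Level 4
        c.has_goals_and_self_model,                   # Level 5
    ]
    level = 0
    for passed in checks:
        if not passed:
            break
        level += 1
    return level


# A pure LLM stalls at Level 0 -- the symbolic side isn't even in the sandbox.
print(agi_level(CandidateAGI(has_signal_level=True)))  # -> 0
```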
Until we get to a Level 5, there’s not even the remotest possibility of a “conscience,” because until then, the AGI doesn’t understand itself as an “agent” that can act on some “other.”
Then, of course, it needs to have all sorts of rules about what it can and cannot do.
THEN … possibly … we can introduce to the AGI the notion that it might not have acted in accordance with its guidelines – or might have made a decision with undesirable outcomes.
This opens up a whole new level of discussion … FAR more than we want to go into in this blogpost.
Instead, we’ll pull back … we’ll pull WAY back … and assess where we are now with regard to building an AGI.
We Don’t Have AGI-1 Yet
We’re looking at a nomenclature for describing AGIs and their capabilities. (We’ll build this out more in future blogposts and YouTubes.)
Suffice it to say, we don’t really have the signal level (e.g., an LLM) and the symbolic level fully in the same sandbox just yet.
This means that we’re not yet at AGI-1.
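To give a feel for what “in the same sandbox” would even mean, here’s a deliberately tiny, purely hypothetical sketch: the signal level is stubbed out with a fake function (a real system would put an LLM there), and the symbolic level is a two-entry toy ontology. The point is only the shape of AGI-1 – a signal-level guess that a symbolic level can check – not a working system.

```python
# Hypothetical sketch of "signal and symbolic in one sandbox" (AGI-1-ish).
# The signal level is a stub standing in for an LLM; the symbolic level
# is a toy, hand-built ontology fragment.

ONTOLOGY = {
    "penguin": {"is_a": "bird", "can_fly": False},
    "bird": {"is_a": "animal", "can_fly": True},
}


def signal_level_guess(question: str) -> str:
    """Stand-in for the signal level (a real system would query an LLM here)."""
    return "Yes, a penguin can fly."  # plausible-sounding, but wrong


def symbolic_level_lookup(entity: str, attribute: str):
    """Symbolic level: resolve an attribute for an entity, walking is_a links."""
    node = ONTOLOGY.get(entity)
    while node is not None:
        if attribute in node:
            return node[attribute]
        node = ONTOLOGY.get(node.get("is_a"))
    return None


guess = signal_level_guess("Can a penguin fly?")
fact = symbolic_level_lookup("penguin", "can_fly")
print(guess, "| ontology says:", fact)  # the two levels disagree -- and can now talk about it
```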
So let’s look at the best baby steps that we have that are moving in the right direction.
What we want are indications that some R&D groups are moving out from the monolithic/megalithic LLMs (where more is infinitely and always “better”), towards something that is not only more sustainable … but also smarter.
What the Big AI Powerhouses are Up To
Honestly, I’m not seeing AGI.
I’m seeing a whole lot of making LLMs more effective and efficient.
I’ll put a BUNCH of links in the Resources and References section below, but so far … it’s looking awfully Ptolemaic.
That is, the kind of astronomy/astrology where the “sun-moves-around-the-earth.” (Geocentric point of view.)
And I can be wrong.
I really, truly, can be so damn wrong about the whole thing.
It may be that the key to AGI is just pouring on more hundreds-of-billions of tokens, and trillions of parameters.
It may be that the key to AGI is using up all of earth’s natural resources to make mega-computers out of NVIDIA’s Blackwells, or multiple instances of Microsoft’s Stargate.
Who really knows for sure?
And it may be that the whole idea of ontologies is just … something whose time has passed, and which never SHALL come again …
But if there’s life after transformers, we need to be on the lookout.
That said, this next section quickly (VERY quickly) highlights some recent work.
The Latest in LLMs. (Sigh.)
A quick summary of the MOST IMPORTANT THINGS that have happened over the past few months:
Mixtures of Experts
Mistral is doing it. And then, the word sneaks out that OpenAI is doing it. It’s the “mixtures of experts” notion …
And the important thing is that these are NOT really “experts” the way that you or I would think of experts … they are still fairly standard feed-forward sub-networks inside the transformer, with a learned router deciding which of them handles each token – really more like “embedded architectures,” just as Doc2Vec is an “embedding architecture” for documents. (See the YouTube vid, link in the early part of this blogpost.)
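For the curious, here’s a minimal sketch of the top-k routing trick at the heart of mixture-of-experts. This is NOT Mixtral’s (or anyone’s) actual code – the class and parameter names are mine – but it shows the core idea: each “expert” is an ordinary feed-forward block, and a learned router sends each token to just a couple of them.

```python
# Minimal sketch of mixture-of-experts routing (not Mixtral's real code).
# Each "expert" is an ordinary feed-forward block; a learned router picks
# the top-k experts for each token and mixes their outputs.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # per-token score for each expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                          # x: (n_tokens, d_model)
        scores = self.router(x)                    # (n_tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)       # mixing weights over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e        # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


tokens = torch.randn(10, 64)                       # ten fake "token" vectors
print(ToyMoELayer()(tokens).shape)                 # torch.Size([10, 64])
```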
I love the tech blogposts by Ignacio de Gregorio Noblejas, and link to two of his blogposts below … see his discussion of Google’s “Mixture of Depths.”
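And since Google’s “Mixture of Depths” (the Raposo et al. paper in the references below) comes up there, here’s an equally toy sketch of its core routing idea: per block, a router picks only some tokens for full processing, and the rest skip the block along the residual path. Again, the names are mine, not the paper’s code.

```python
# Toy sketch of the Mixture-of-Depths routing idea (not the paper's code):
# a router scores each token, only the top fraction goes through the block,
# and the remaining tokens pass along the residual stream unchanged.

import torch
import torch.nn as nn


class ToyMoDBlock(nn.Module):
    def __init__(self, d_model=64, capacity=0.5):
        super().__init__()
        self.capacity = capacity                   # fraction of tokens processed per block
        self.router = nn.Linear(d_model, 1)        # scalar "worth processing?" score per token
        self.block = nn.Sequential(                # stand-in for a full attention + MLP block
            nn.Linear(d_model, d_model), nn.GELU()
        )

    def forward(self, x):                          # x: (n_tokens, d_model)
        k = max(1, int(self.capacity * x.shape[0]))
        scores = self.router(x).squeeze(-1)        # (n_tokens,)
        top_idx = scores.topk(k).indices           # tokens that get the full computation
        out = x.clone()                            # everyone else rides the residual stream
        out[top_idx] = x[top_idx] + self.block(x[top_idx])
        return out


print(ToyMoDBlock()(torch.randn(10, 64)).shape)    # torch.Size([10, 64])
```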
Final Word
The work propelling LLMs forward is making the existing methods work better. However, it is not providing a connection between the signal and symbolic levels.
We’ll pick up on that in the future.
We’ll also talk about Meta’s latest, Llama-3, reputed to be the current winner in LLMs.
Resources and References
Mixture of Experts
Original Research Paper
The original Mistral arXiv paper:
- Jiang, Albert Q., et al. 2024. “Mixtral of Experts.” arXiv:2401.04088v1 [cs.LG] 8 Jan 2024. (Accessed April 20, 2024. Available online at Mixture of Experts.)
Blogposts
- Neves, Miguel Carreira. 2024. “LLM Mixture of Experts Explained.” TensorOps.com (Jan. 29, 2024). (Accessed April 20, 2024. Available online at LLM MoE.)
- TheWhiteBox.ai. 2024. “Mixture-of-Experts, The New Standard for LLMs.” TheWhiteBox.ai Blogpost series (April 5, 2024). (Available online at Mixture of Experts.)
- de Gregorio Noblejas, Ignacio. 2024. “Microsoft’s $100 Billion Bet on OpenAI Is Stupid. Unless… (What AI is Going to Be Really About).” Medium.com (April 5, 2024). (Accessed April 20, 2024. Available online at Microsoft’s Bet.)
Mixture of Depths
Original Research Paper
Here’s the original research paper.
- Raposo, David, Sam Ritter, Blake Richards, Timothy Lillicrap, Peter Conway Humphreys, and Adam Santoro. 2024. “Mixture-of-Depths: Dynamically allocating compute in transformer-based language models.” arXiv:2404.02258v1 [cs.LG] 02 Apr 2024. (Accessed April 20, 2024. Available online at Mixture of Depths.)
Blogposts
I love Ignacio’s blogposts, and this one covers lots of recent news – including Google’s new “Mixture of Depths” algorithm:
- de Gregorio Noblejas, Ignacio. 2024. “Mixture-of-Depths, Unveiling the Future of Siri, LLaMa 3, & More.” The Tech Oasis (online blogpost series) (April 11, 2024). (Accessed April 20, 2024. Available online at Mixture of Depths.)
(Blogpost in progress. Check back for updates. AJM, Saturday, April 20, 2024. Noonish-HI Time.)