The Future of AI: Part 0 (Prelude) – Reductio ad Absurdum

A few weeks ago, I went to the Lihue-based Kaua’i Farmer’s Market for locally-grown fresh veggies. And, of course, I swung by the Kaua’i Master Gardeners booth.

Figure 1. Volunteer Kaua’i Master Gardeners providing free consultation regarding growing plants on the Island of Kaua’i, during one of the local Farmer’s Markets. Picture courtesy CTAHR (Cooperative Extension; University of Hawai’i at Manoa, College of Tropical Agriculture and Human Resources); reproduced under fair use (educational purposes).

Now, I’d visited with these guys before.

And yes, they knew that I taught AI at Northwestern.

But still, my jaw dropped when the first question that one of the volunteers asked me was, “What do you think about ChatGPT?”

I was so stunned that I blurted out the first thing that came to mind: “It is SO f***ed up!”

We all had a good laugh, and here I am to … not defend, but support that response.


Everybody’s Favorite New Toy

Largely because it is SO ACCESSIBLE – and FREE – ChatGPT has attracted a horde of enthusiasts who want to kick its tires and take it for a little spin down the road.

No big surprise, there.

And no big surprise, either, that the results come back VERY MIXED.

The essential answer, as expressed by none other than Andrew Ng, is:

for questions that require complex reasoning or specialized knowledge, today’s LLMs [large language models] may confidently hallucinate an answer, making the result misleading.

Ng, Andrew. 2022. “The Batch, Issue 180.” DeepLearning.AI (Dec. 7, 2022). (Accessed Jan. 24, 2023; available at https://www.deeplearning.ai/the-batch/large-language-models-like-chatgpt-need-a-way-to-express-different-degrees-of-confidence/)

See Andrew Ng’s experiments with ChatGPT showing why it is (sometimes) hilariously wrong!


Reductio ad Absurdum: The AI Forecast

The really beautiful thing about AI, as it exists now, is that it has pushed our leading methods to their limit. We are truly at the point of reductio ad absurdum – Latin for “reduction to the absurd.”

We’ll look at three broad areas that, taken together, make this point:

  • The combination of deep learning and GANs (generative adversarial networks),
  • The evolution of ChatGPT, GPT-3, and other LLMs (large language models) from humble origins in Word2Vec and Doc2Vec, and
  • Reinforcement learning (RL).

Neural Networks: DL and GANs

In the realm of neural networks, we have deep learning (DL), convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and generative adversarial networks (GANs).

We can think of CNNs and LSTMs as DLs with some very specific tweaks – highly useful, but not that different (for the sake of this conversation).

The underlying methods enabling DLs and GANs are the same: a combination of (restricted) Boltzmann machines and backpropagation (or some stochastic gradient descent method).

Figure 2. The leading neural network methods use a combination of (restricted) Boltzmann machine layers and backpropagation, aided by transfer learning, which uses reinforcement learning. Both the Boltzmann machine and transfer learning rest on a statistical mechanics foundation.

The (restricted) Boltzmann learning method (contrastive divergence) operates either in conjunction with or in opposition to the gradient descent learning method.
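
To make that concrete, here’s a minimal sketch of a single CD-1 (one-step contrastive divergence) update for a toy restricted Boltzmann machine, in plain NumPy. The layer sizes, learning rate, and data vector are made-up values for illustration – a sketch of the idea, not production code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy RBM: 6 visible units, 3 hidden units (sizes are arbitrary here).
n_vis, n_hid = 6, 3
W = rng.normal(0.0, 0.1, size=(n_vis, n_hid))  # visible-to-hidden weights
b = np.zeros(n_vis)                            # visible biases
c = np.zeros(n_hid)                            # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, lr=0.1):
    """One contrastive-divergence (CD-1) step on a single binary vector v0."""
    global W, b, c
    # Positive phase: hidden probabilities driven by the data.
    h0_prob = sigmoid(v0 @ W + c)
    h0_sample = (rng.random(n_hid) < h0_prob).astype(float)
    # Negative phase: one Gibbs step down and back up (the "reconstruction").
    v1_prob = sigmoid(h0_sample @ W.T + b)
    h1_prob = sigmoid(v1_prob @ W + c)
    # Update: data-driven statistics minus reconstruction statistics.
    W += lr * (np.outer(v0, h0_prob) - np.outer(v1_prob, h1_prob))
    b += lr * (v0 - v1_prob)
    c += lr * (h0_prob - h1_prob)

# Train on one toy binary pattern; the reconstruction improves over time.
v = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
for _ in range(200):
    cd1_update(v)
```

The update pulls the weights toward the data statistics and away from the model’s own reconstruction statistics; in the deep architectures discussed here, stacks of such layers are then fine-tuned with backpropagation.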

That’s it.

When we have a satellite-level view of what’s going on, that summarizes things in a nutshell: two major learning methods, and they either support each other or act in opposition.

And we have layers and layers and layers of this.

So by now, having had fabulous achievements with both of these approaches (DL systems and GANs, with the specializations of CNNs and LSTMs included), we’re not going to get that huge a leap forward by adding more layers.

Transformers and diffusion methods simply let us transfer what we’ve learned in one system to another. Nice, very useful, very pragmatic – but not a radical breakthrough in fundamentals.

We’ve basically exploited this avenue to its reasonable limit.

This means that DL systems and GANs will, over the next few years, be about as sexy and exciting as ARIMA (auto-regressive integrated moving average). Still a very solid, workhorse set of tools – just not breaking through barriers.

ChatGPT, GPT-3, and All the LLMs

All the LLMs are essentially logical outgrowths of earlier methods: Word2Vec and Doc2Vec (invented by Mikolov et al. at Google, circa 2013), BERT (Bidirectional Encoder Representations from Transformers), and then the GPT series. (See Resources and References at the end.)

Essentially, we’re doing pattern-matching. Like is matched with like, and there is NO REAL INTELLIGENCE involved.
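
As a toy illustration of that “like is matched with like” behavior, here’s a minimal sketch of the cosine-similarity matching that underlies these embedding methods. The four-dimensional vectors below are invented for this example; real models learn vectors with hundreds of dimensions from very large corpora:

```python
import numpy as np

# Toy "word embeddings" – entirely made-up 4-d values for illustration.
vectors = {
    "taro":    np.array([0.9, 0.1, 0.2, 0.0]),
    "kalo":    np.array([0.8, 0.2, 0.3, 0.1]),  # Hawaiian word for taro
    "tractor": np.array([0.1, 0.9, 0.0, 0.4]),
}

def cosine(u, v):
    """Cosine similarity: near 1.0 means "alike," near 0.0 means unrelated."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Like is matched with like: taro/kalo score high; taro/tractor scores low.
print(cosine(vectors["taro"], vectors["kalo"]))     # ~0.98
print(cosine(vectors["taro"], vectors["tractor"]))  # ~0.20
```

No reasoning happens anywhere in that pipeline – just geometry on learned vectors.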

Reinforcement Learning

It’s beginning to sound as though I’m dissing all the recent breakthrough research.

No, not really.

I love the set of two tech-blogs published by Andrey Kurenkov on reinforcement learning (detailed citation in Resources and References, at the end).

I’ll let Kurenkov say it for me – we’ve about hit the limits with RL.


What This Means

So let’s sum this up: the essential AI forecast covering the next three years – from 2023 through 2025 – is this:

“The nature of AI will transform – powerfully and radically – over the next three years.”

Alianna J. Maren, Ph.D. Inventor, Author, Thought-Leader. December 2022

The primary reason for this major AI transformation is that we have reached the functional limit of the primary algorithms and methods now in place. In short, with existing methods, we’ve mined all the ore in these veins.

It’s time for something different.


Assessing the Present and Creating the Future: Three Tools

But before we start suggesting new AI avenues, we’re going to identify three tools that help us assess not only how current AI has hit its culminating glory, but how to discern what will happen next.

These three tools, or perspectives, are:

  • The evolutionary history of neural networks and machine learning – the more that we are clear about how we’ve evolved our current methods, the more that we can go back and identify the “road less traveled,” and use the branchpoints to create next steps,
  • The power of the problem statement – this is the most powerful method that we’ll use. We will assess the problem statements that the initiating researchers crafted as they developed and shared their original works. The contrast-and-compare lessons will help us craft new problem statements and guide our next efforts, and
  • The matrix – a tool that we’ll use to assess the fundamental notions underlying several major AI methods; this will help us identify and extract the most significant and useful insights – the ones that we want to keep as we move forward.

A Rough Editorial Calendar

Here’s what will happen, both in this blogpost and in the subsequent ones:

In this post (Part 0; the Prelude), we discuss why AI is at an evolutionary dead end – in terms of methods that we’ve been using.

Then (carrying forward the points from the previous section):

  • Part 1: Celebrating 50 Years – of the Same Old Ising Equation
  • Part 2: The Power of the Problem Statement – How Clear Questions Motivated Major Inventions
  • Part 3: Introducing the Matrix: Fundamental Concepts Used across Multiple AI & Machine Learning Methods

Once we’ve established a framework, we do have some suggestions to offer. But … that’s for later.


YouTubes – We Made These Points Two Years Ago – and Now, More than Ever!

The best way to get context for what we’ll be presenting here is to watch our December 2020 YouTube vid introducing the Future of AI.

Alianna J. Maren. 2020. “AI Future Directions: The Fundamentals (Part 1).” Alianna J. Maren YouTube Channel (Dec. 24, 2020). (Accessed Dec. 28, 2022; available at https://www.youtube.com/watch?v=C-wy9FHLF8c&t=2s)

The three key points and AI future challenges identified in this vid are:

  • The age-old dichotomy between symbolic and connectionist AI – still not well resolved,
  • Energy-based neural networks (relying on the restricted Boltzmann machine, or RBM) have reached their zenith – deep learning and GANs simply illustrate this limit, and
  • Reinforcement learning – the go-to method for planning, optimization, and other tasks – is also limited; active inference is emerging as a more robust method.

To be continued.


To your health, success, and a fabulous year in 2023!

Alianna J. Maren, Ph.D.

Founder and Chief Scientist

Themesis, Inc.


Resources and References

Works Specifically Identified in This Post

Statistical Mechanics Underlying Transformers

I keep pointing to Matthias Bal’s work because I (like many others) need to dig into how statistical mechanics underlies transformers, along with the more well-known energy-based neural networks.

ChatGPT NON-Technical Resources

Recommended by Yann LeCun, in a Jan. 30, 2023 LinkedIn post:

Good assessment of ChatGPT from an educator’s perspective:

I like Tarry Singh’s (Jan. 24, 2023) post regarding ChatGPT as an alternative to traditional search engines:

And, a perspective from someone who is NOT a techno-geek, but who is observing others using ChatGPT – Nate has a good YouTube channel on … “how to make YouTubes”!

I like his summary comment (min 2:56 in an 8:43 min vid): “… ChatGPT has a fatal flaw (bold-italics mine): its dataset outputs … the average of everything that has been input, and because of that, it brings nothing new to the table (bold-italics again mine).”

So – for anyone who is creating content and using it as a means of positioning themselves – whether writing a report (to be included in a Portfolio), a blog, a video – ANYTHING – I’m not saying that we shouldn’t use ChatGPT. It’s just that we should be exceedingly cautious about relying on ChatGPT (or any LLM).

We should especially be cautious if we’re trying to position ourselves as being “above the herd.” If we want to be in the top 10% – the tip of the iceberg that potential employers see when we are hunting for jobs or seeking any professional relationship – then we need to rely on our OWN insights, not an amalgam of what other people think.

  • Black, Nate. 2023. “ChatGPT Doesn’t Do What YouTubers Think It Does – Here’s Why You Need to be Careful.” ChannelMakers YouTube Channel (Jan. 23, 2023). (Accessed Jan. 24, 2023; https://www.youtube.com/watch?v=7HzQxKJa7Wc.)

Evolution of ChatGPT from Its Humble Origins

The progenitor: Word2Vec (with its document-level extension, Doc2Vec) – invented by Mikolov and colleagues at Google, and thus designed to handle Google-scale corpora:

  • Bojanowski, Piotr, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. “Enriching Word Vectors with Subword Information.” Transactions of the Association for Computational Linguistics 5 (2017): 135-146. doi:10.1162/tacl_a_00051. (Accessed Jan. 25, 2023; available at arXiv.)
  • Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. “Distributed Representations of Words and Phrases and their Compositionality.” Advances in Neural Information Processing Systems 26 (NIPS 2013). (Accessed Jan. 25, 2023.)

The primary contribution of the next algorithmic stage – BERT (Bidirectional Encoder Representations from Transformers) – was to include context; i.e., words both prior and subsequent to the word that is being embedded into vector form.
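
As a quick demonstration of that bidirectional context, here’s a minimal sketch using the Hugging Face transformers library (my choice of tooling for illustration, not something the BERT paper itself prescribes); BERT ranks candidate fillers for a masked word using the words on both sides of the mask:

```python
# Requires: pip install transformers torch
from transformers import pipeline

# Load a pretrained BERT masked-language model (downloads on first run).
fill = pipeline("fill-mask", model="bert-base-uncased")

# BERT uses the words both BEFORE and AFTER the [MASK] slot to rank fillers.
for pred in fill("The farmer took her produce to the [MASK] market."):
    print(f"{pred['token_str']:>12}  score={pred['score']:.3f}")
```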

Here’s the original introduction of BERT in 2018:

  • Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” arXiv:1810.04805v2 [cs.CL]. (Accessed Jan. 25, 2023.)

While I don’t normally recommend Wiki articles, this AI Wiki has a good overview of the transition from Doc2Vec to BERT to GPT:

  • Nicholson, Chris. “A Beginner’s Guide to Word2Vec and Neural Word Embeddings.” A.I. Wiki. (Accessed Jan. 25, 2023; AI Wiki on Word2Vec.)

Here’s a good summary of how NLP (natural language processing) algorithms have evolved up to 2019:

ChatGPT Resource Summary

Here’s a good one, for those who want to get started.

A lot of people want to fine-tune their chatbot for specific organizations or industries. Here’s a good assessment:

Reinforcement Learning

Check out Andrey Kurenkov’s two tech-posts on The Gradient; both exceedingly well-written, with all the references that you’d need:



(She’s) Lost that Lovin’ Feelin’

Sometimes … a girl just has her moments.


Righteous Brothers – You’ve Lost That Lovin’ Feelin’ (Top Gun 1986)
