Reflections on a year studying “Consciousness and AI”

Arthur Juliani
Dec 7, 2022


I know this blog has been a little quiet recently. If you followed my work in the past, you might be wondering what I have been up to over the past year or so, after leaving Unity in early 2021. Working on ML-Agents for the previous four years was an incredibly rewarding (and challenging) experience. I had been with the project from its very inception and had seen it grow in both popularity and maturity. I learned a lot about open source, project management, and machine learning. I also made a number of great friends. By early 2021 I felt I was ready for a new challenge. My formal academic background is in Psychology, and I had just received my PhD. I wanted what I did next to be in a direction that married my two interests: machine learning and the study of the mind.

Thankfully, soon after defending my dissertation, I got the opportunity to join Araya Inc in Tokyo as a research scientist and work under Ryota Kanai. Kanai is an expert in the neuroscience of consciousness, who was recruiting researchers to work at the intersection of consciousness studies and artificial intelligence. It seemed like a perfect fit (getting to visit Tokyo also didn’t hurt). In the year that I was there, I worked on two major projects, each exploring this connection through slightly different lenses.

With renewed discussion around consciousness and AI thanks to David Chalmers' keynote address at NeurIPS 2022, it felt like the right time to share both the outcomes of those projects and my more general reflections on working at this rather unique intersection of fields.

What is consciousness, and can we really study it scientifically?

Perhaps it is worth backing up a bit and discussing what exactly consciousness is. Only then can we even begin to discuss what it might have to do with artificial intelligence. The first thing to do here would be to share the definition of consciousness, but it turns out that no one has a definition which is anywhere close to universally agreed upon. It is one of those things which appeals to our common-sense reasoning. "Of course I am conscious," we say to ourselves. But upon deeper inspection, it is not so easy to say why, or what that consciousness even is.

The philosopher Thomas Nagel, in his now famous essay "What is it like to be a bat?", gave us a useful starting point in the science of consciousness by suggesting that "an organism has conscious mental states if and only if there is something that it is like to be that organism — something it is like for the organism." This seems reasonable enough, and indeed it seems hard to argue with. The problem for a science of consciousness, though, is that here, at the heart of a supposedly objective definition, we introduce subjectivity. Various philosophers, psychologists, and neuroscientists have attempted to deal with this seemingly inescapable subjectivity in different ways.

David Chalmers made the famous distinction between the easy and hard problems of consciousness. The easy problems are those amenable to typical observation-based scientific study, such as what brain regions may be minimally sufficient for a person to retain consciousness. This approach attempts to identify what are called the neural correlates of consciousness, or NCCs. On the other hand, there is the hard problem, which is the problem of why there is any consciousness at all. It is a hard problem because it seems so difficult to apply the scientific method to this question. Indeed, beyond being a hard problem, many have suggested that it is fundamentally impossible to solve.

To make things perhaps a bit more clear, consider the case of gravity. Here again we can define analogous easy and hard problems. The easy problems of gravity consist of understanding what its governing dynamics are: where to expect it, and what to expect it to do. These are the problems which physicists have been solving with increasing accuracy from Newton to Einstein. In contrast, the hard problem of gravity would be to understand why there is any gravity at all. Why should it be necessary that gravity be part of our physical universe? Here the answers do not come so easily, and the domain of science often ends up ceding the reins to the domain of philosophy. Or, more often, the mathematics required to think through these problems becomes so abstract that it may as well be pure philosophy.

Other philosophers have made additional useful distinctions. Ned Block famously made the distinction between what he called phenomenal consciousness and access consciousness. Phenomenal consciousness corresponds to pure subjective experience. If you see the color blue, you are having a phenomenal experience of blue. To use a philosophical term, a blue quale is contained within your phenomenal field. In contrast, access consciousness corresponds to what you can act on or report. In most cases these two things seem to correspond, but not always. In an experiment reported on by Block, individuals were briefly presented with a grid of letters. While all participants reported having experienced the entire contents of the grid, each was able to report the identity of only a small subset of the items in the grid. From here we get the notion that phenomenal consciousness "overflows" access, suggesting that the two are not in fact identical.

Lastly, it is worth discussing one final useful distinction. The philosopher Daniel Dennett introduced a "hard question" to complement the "hard problem" of Chalmers. The "hard question" asks: if only animals (or even only certain animals) are conscious, and even then only conscious of some pieces of information and not others, then what is consciousness for? Here we finally arrive at a functional understanding of consciousness. If consciousness serves some adaptive, evolutionarily derived role, then we have yet another scientific avenue from which to study it.

Together, the insights of Nagel, Chalmers, Block, and Dennett provide enough clarity to begin considering a science of consciousness. By focusing on the easy problems of neural correlates, experimentally verifiable conscious information access, and questions of the functional role of consciousness in living organisms, various actionable research programs present themselves. What also presents itself is the possibility that the study of consciousness in light of computational architectures, information access, and behavioral function might be applied just as easily to artificial life forms (and the artificial intelligence that accompanies them) as it is to natural life. And it is here that we arrive at pronouncements such as the now well-known suggestion that today's large neural networks may already be slightly conscious.

After a year studying consciousness and AI, I don't think that statement is true. To understand my thinking, I want to briefly walk through the two projects I worked on over the past year. Each approaches consciousness from a functional or information-access perspective. And each, I ultimately believe, is not really about consciousness at all.

Are there Global Workspaces among us?

The first project I worked on was an attempt to understand whether connections can be made between contemporary methods in AI research, such as Transformer models, and any of the current functional theories of consciousness. In particular, I focused on a study of the Global Workspace Theory (GWT). GWT is one of the most developed and battle-tested theories in this area, dating back to the mid-80s in its original formulation and updated with empirical support over the subsequent decades.

GWT proposes that the brain can be thought of as a set of modules, each of which is semi-independent from the others. These might include modules for long-term memory, various sensory modalities, motor control, fear responses, etc. In order for these different modules to share information with one another, they need a common language, or more specifically a common representational space. The global workspace is that hypothesized space. In GWT, each module attempts to share information with the global workspace, and when a module "wins out," its signal makes it into the global workspace (an ignition event) and is then translated and shared with all other modules (a broadcasting event). What makes this a theory of consciousness is the proposal that we are conscious only of the information which makes it into the workspace, with everything else remaining unconsciously processed.
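To make this a little more concrete, here is a small toy sketch of a single workspace cycle. It is entirely my own illustration (not code from any GWT paper, and the module names and salience rule are arbitrary): each module proposes content along with a salience score, the most salient proposal "ignites" the workspace, and its content is then broadcast back to every module.

```python
import numpy as np

# Toy sketch of one Global Workspace cycle (illustrative only).
class Module:
    def __init__(self, name, dim=8, seed=0):
        self.name = name
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.received = None                      # last broadcast from the workspace

    def propose(self):
        signal = self.rng.normal(size=self.dim)   # candidate content
        salience = float(np.abs(signal).mean())   # how strongly it competes for access
        return signal, salience

    def receive(self, broadcast):
        self.received = broadcast                 # broadcasting event

def workspace_cycle(modules):
    # Competition: the most salient proposal wins access to the workspace (ignition).
    proposals = [(m, *m.propose()) for m in modules]
    winner, content, _ = max(proposals, key=lambda p: p[2])
    # Broadcast: the winning content is shared with every module.
    for m in modules:
        m.receive(content)
    return winner.name

modules = [Module(name, seed=i) for i, name in enumerate(["vision", "memory", "motor"])]
print(workspace_cycle(modules), "ignited the workspace this cycle")
```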

With this blueprint in mind, I looked to the various recent neural network architectures being developed for similarities. What I found was that the Perceiver (and particularly the PerceiverIO) was a strong potential candidate for a global workspace. I explored this possibility both theoretically and empirically, finding good evidence that it meets all the criteria needed to be a proper Global Workspace. This work was published at the CogSci 2022 conference earlier this year, and you can read more about it here.
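For intuition, here is a rough numpy sketch of the bottleneck mechanism that makes the Perceiver workspace-like: a small latent array cross-attends over a much larger input array, so that all inputs compete for a limited shared representational space. This is my own simplified illustration, not the actual Perceiver implementation or any code from the paper, and the dimensions and weight initialization are arbitrary.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def latent_cross_attention(latents, inputs, d_k=16, seed=0):
    # A small latent array (the putative "workspace") queries a much larger
    # input array; every input competes for space in the latent bottleneck.
    rng = np.random.default_rng(seed)
    d_lat, d_in = latents.shape[-1], inputs.shape[-1]
    W_q = rng.normal(size=(d_lat, d_k)) / np.sqrt(d_lat)
    W_k = rng.normal(size=(d_in, d_k)) / np.sqrt(d_in)
    W_v = rng.normal(size=(d_in, d_lat)) / np.sqrt(d_in)
    q, k, v = latents @ W_q, inputs @ W_k, inputs @ W_v
    attn = softmax(q @ k.T / np.sqrt(d_k))        # (num_latents, num_inputs)
    return latents + attn @ v                     # residual update of the latents

rng = np.random.default_rng(1)
inputs = rng.normal(size=(1024, 32))              # many input tokens
latents = rng.normal(size=(8, 64))                # tiny shared latent array
print(latent_cross_attention(latents, inputs).shape)   # (8, 64)
```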

I wasn't the only one to recognize this connection. Concurrently with my work, Anirudh Goyal and his collaborators in Yoshua Bengio's lab proposed a very similar neural architecture as a global workspace. Others last year also explored the potential for concretely implementing the global workspace in neural networks. These include the work of VanRullen and Kanai, as well as the work of Blum & Blum. It seems clear from this line of work that AI researchers are incorporating aspects of global workspaces, both intentionally and unintentionally, as they develop their systems.

Mental Time Travel and the function of consciousness

GWT isn't the only theory of conscious function taken seriously by the research community. In the second project I worked on while at Araya, I attempted to synthesize multiple different theories of conscious function in order to extract useful design principles which might be applied to developing more general artificial intelligences. This was no easy undertaking, since to do the topic justice, philosophy, psychology, neuroscience, and machine learning all needed to be given serious treatment. I was fortunate to have collaborators with just this diverse skill set.

In addition to GWT, we considered Information Generation Theory (IGT) and Attention Schema Theory (AST). Briefly, IGT suggests that it is the top-down predictions which the brain generates, rather than the bottom-up sensory information, that correspond to conscious experience. IGT is related to a number of other theories such as Hierarchical Predictive Coding and Recurrent Processing Theory. IGT furthermore relies on the ability of the brain to produce complex models of the world from which coherent generated experience can be sampled. This is guided by what researchers sometimes refer to as a cognitive map.
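To give a flavor of the top-down versus bottom-up distinction that IGT and predictive coding appeal to, here is a tiny toy sketch of my own devising (it does not come from IGT or any formal predictive coding model): a latent "belief" generates a top-down prediction of the sensory input and is iteratively updated to reduce the bottom-up prediction error.

```python
import numpy as np

# Toy sketch of top-down prediction vs. bottom-up error correction (illustrative only).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)) * 0.5       # generative weights: latent cause -> predicted input
observation = rng.normal(size=8)        # bottom-up sensory input
latent = np.zeros(4)                    # current top-down "belief" about the cause

for step in range(200):
    prediction = latent @ W             # top-down prediction of the input
    error = observation - prediction    # bottom-up prediction error
    latent += 0.05 * (W @ error)        # adjust the belief to reduce the error

print("remaining prediction error:", np.round(observation - latent @ W, 2))
```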

AST, meanwhile, proposes that consciousness corresponds to the contents of a model of our own attentional process which we possess and utilize. This attention schema allows us to represent our own attention, including within it a sense of self and personal agency. The images below convey the main idea. In the top image, a person is simply perceiving an apple and representing it directly. In the bottom image, the person is representing the entire process of perceiving the apple, including their own attention and sense of self. The second image corresponds to the attention schema.

In order to combine these three theories into a coherent whole which might guide future AI research, we considered a specifically conscious ability that seems to be unique to humans: the ability to perform mental time travel. The concept of mental time travel was proposed by Endel Tulving a few decades ago as a way to account for the variety of possibilities seemingly unlocked by our complex memory systems. His key insight was that we not only utilize our memory systems to remember the past, but also to imagine the future. Mental time travel can then be understood as the ability to project oneself into any possible past or future, and to "play out" what it would be like to act in that environment. My hunch is that this capacity is one of the key contributing factors responsible for the seemingly "general" intelligence we find in humans.

Consider for example our ability to land humans on the moon. This was something never done before, and indeed nothing similar to it had been done before. Despite this, thousands of individuals were able to project themselves into this imagined future scenario in order to help bring it about.

In contrast, it seems that most animals and AIs are not capable of projecting themselves into the past or future in this way. They are, however, capable of things somewhat like it. Many mammals, for example, are capable of replaying past experiences during sleep or when making decisions about future actions. There is even evidence that, within certain controlled contexts, great apes are able to project themselves into the future and act accordingly. We find a similar story in the current domain of AI research. Deep Reinforcement Learning algorithms such as DQN incorporate an experience replay buffer which enables replaying the past. More sophisticated models such as Dreamer or MuZero are even capable of generating hypothetical future trajectories, but only within limited behavioral contexts. Given this array of abilities, it is possible to form a continuum between direct experience and full mental time travel. The figure below shows the ways in which mental time travel can be thought to differ from other forms of memory or imagination.
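Separately from that figure, and for readers less familiar with these systems, here is a minimal sketch of the kind of experience replay buffer used by DQN-style agents. It is a simplified toy version of my own (not the original implementation): past transitions are stored and later re-sampled at random, a crude, non-directed analogue of "replaying the past."

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal DQN-style experience replay buffer (simplified, illustrative)."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)   # oldest experiences are dropped first

    def add(self, state, action, reward, next_state, done):
        # Store one transition from the agent's past experience.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # "Replay" a random batch of past transitions for learning.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

buffer = ReplayBuffer()
for t in range(100):
    buffer.add(state=t, action=t % 4, reward=1.0, next_state=t + 1, done=False)
print(len(buffer.sample(8)), "replayed transitions")
```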

What cognitive architecture might be required for mental time travel? An example from literature is perhaps informative. In Marcel Proust's classic novel "In Search of Lost Time," the story begins with the narrator describing himself eating a madeleine cake with tea. The taste of the cake triggers memories of his childhood, which then form the basis for the narrative in earnest, conveyed from that point onward as he describes his experiences from childhood up to his present age.

“No sooner had the warm liquid mixed with the crumbs touched my palate than a shudder ran through me and I stopped, intent upon the extraordinary thing that was happening to me. An exquisite pleasure had invaded my senses, something isolated, detached, with no suggestion of its origin. And at once the vicissitudes of life had become indifferent to me, its disasters innocuous, its brevity illusory — this new sensation having had on me the effect which love has of filling me with a precious essence; or rather this essence was not in me it was me. … Whence did it come? What did it mean? How could I seize and apprehend it? … And suddenly the memory revealed itself. The taste was that of the little piece of madeleine which on Sunday mornings at Combray (because on those mornings I did not go out before mass), when I went to say good morning to her in her bedroom, my aunt Léonie used to give me, dipping it first in her own cup of tea or tisane. The sight of the little madeleine had recalled nothing to my mind before I tasted it. And all from my cup of tea.”

In this passage from Proust, we have an example of all three systems (GWT, IGT, and AST) working in harmony. First, the taste of the madeleine cake entering consciousness is an example of an ignition event in GWT, and the broadcasting of the taste to his long-term memory then triggers another ignition event: the childhood memory. The memory itself is then re-lived in all of its original detail, but is entirely generated by the mind itself, an example of IGT at play. Finally, the narrator is aware of himself as narrator, and is indeed creatively producing the narrative itself, an example of the sense of self-awareness provided by AST. Together these systems make possible Proust's great novel, and serve as an example of mental time travel.

In the full paper, we explore how these three theories (GWT, IGT, and AST), and their combination, might be used to guide future developments in artificial intelligence. We propose mental time travel as a kind of hypothetical benchmark for a future generally intelligent system. You can find the paper here: https://arxiv.org/abs/2204.05133. It was published in TMLR earlier this year. If you would rather not read the full paper, I also recently gave an hour-long talk on this work at the Consciousness Club Tokyo, a seminar series hosted by one of my former colleagues at Araya.

Concurrent with our work, researcher Adam Safron has been working on a similar program to bridge consciousness research and artificial intelligence. He refers to his approach as the Integrated World Model Theory. If you are interested in this line of work, it is also worth a look!

Coda

The field of AI moves rapidly, and since I wrote this paper there have been significant developments, with new models such as DALL-E 2 and ChatGPT showing generalization performance not previously seen. Both of these systems are unique in their ability to produce images or text (respectively) based on any number of past or future imagined scenarios. While they don't meet all the criteria of mental time travel (they are not fully grounded or coherent), they seem to be much closer than any other systems we know of, aside from humans ourselves. What's more, these capacities are emergent in extremely large transformer-style models trained on internet-scale datasets. Perhaps the cognitive architectures described by GWT, IGT, and AST might all be emergent in such a large system?

A different perspective on consciousness

I enjoyed working with my colleagues at Araya to think through this potential link between consciousness and AI. At the same time, I couldn't help but consider just how much of the bigger picture I was ignoring. This often became apparent when talking with other people about my work. When I told them that I was working with the functional theories mentioned above, two mistaken assumptions would come up. The first was that I regarded these not just as theories of cognitive function, or of information access, but as theories of consciousness itself. The second, related assumption was that I subscribed to these theories myself. In truth, neither is really the case. I think that, by virtue of their very nature as (relatively) straightforward, abstractable computational models, none of these are convincingly strong theories of consciousness in the "hard problem" or "phenomenological" sense. At best, each provides a limited view onto the problem of the correlates of consciousness, or offers an interesting insight into human brain function and computation.

As for what I actually think about consciousness, if I had to take a position, I would likely subscribe to a form of panpsychism. Panpsychism holds that consciousness is just another property of the physical universe (like gravity), and that in specific kinds of configurations of matter (animal brains at least), it looks like the phenomenal field that we are familiar with. This is related to another theory of consciousness, Integrated Information Theory (IIT), which attempts to utilize a set of mathematical principles and equations to precisely measure and quantify the consciousness of physical systems. Here again, though, the more specific a theory becomes, the more likely it is to be specifically wrong. While IIT utilizes very precise axioms and mathematical formulations, its measures are incredibly difficult to compute in practice, and it is hard to know whether they actually correlate with consciousness at all. It does at least point to broad considerations, such as predicting that a brain region like the cerebellum is not conscious, and indeed it appears not to be. Following this same logic, it also precludes CPUs, GPUs, or TPUs from being conscious.

Other perspectives which I appreciated discovering while doing my work included those that take affect seriously as a first-class characteristic of our phenomenal experience. Mark Solms released a great book in the last year called The Hidden Spring, in which he proposes that the most basic form of conscious experience is affective experience (as opposed to sensory or cognitive experience). For him, the "what it is like" of Nagel is always affectively charged in some basic way. There is something that seems quite right to me about this, especially when I perform any kind of deeper introspection. Along these lines, the Qualia Research Institute (QRI) is a non-profit doing research from this affect-first perspective. These approaches are appealing because, similarly to IIT, they attempt to take the phenomenology of conscious experience seriously. Given its inherently subjective nature, I believe that phenomenological investigation still plays a key role in disclosing useful insights into consciousness, regardless of whether those insights can be easily translated into computational terms.

Final Thoughts

What these alternative perspectives suggest, though, is that consciousness may have little to nothing to do with cognitive computation, information access, or intelligence per se. They also suggest that we are unlikely to see a conscious artificial intelligence anytime soon, since the axis along which AI is developing is almost orthogonal to that of actual consciousness in physical systems like us. It isn't a matter of the computations being performed, but rather of the physical system doing those computations. CPUs, GPUs, and TPUs are simply not physical systems which we have any reason to believe produce phenomenal experience, and by extension the programs which these systems run are not likely to be conscious either.

What advanced AI systems like DALL-E and ChatGPT force us to confront is not the possibility of sentient machines which require some kind of ethical consideration, but exactly the inverse: systems which are indistinguishable from a human, but which almost certainly are not conscious. Individuals like Blake Lemoine are in a certain sense a canary in the coal mine for what will inevitably be increasingly human-like agents. It seems to me, then, an ethical imperative not to get caught up in the "AI and Consciousness" hype, and instead to remember that there are billions of humans and non-humans whom we know to be conscious right now, and who are living in situations of suffering that we can collectively do something about.

I am glad that I spent the year that I did working on these problems. It is ultimately not the long-term direction I want to focus on, but thinking through just how far we can take the "AI and Consciousness" connection was fruitful as an exercise in exploring the limits of an idea. After all this time, I still don't know what consciousness is, but at least I have a better sense of what it isn't. I suppose that counts for something.
