Category Archives: Interdisciplinary research

Research that crosses disciplines and does not fit easily into one category.

Metacognition: Part 2 – self-regulation

The previous post on metacognition (part 1) compared metacognition in humans with that in AI agents. Key concepts introduced were meta-level monitoring and control. The main focus was on detecting mistakes in reasoning and gaps in knowledge. This post (part 2) will argue that metacognition is also important in ensuring that requirements are met in the presence of conflicting pressures. For humans, this is often called “self-regulation”.

Two kinds of thinking
The role of self-regulation is best understood within the larger context of decision-making processes. Human cognition is often described as two separate kinds of thinking called “system 1” and “system 2” (Kahneman [1]). System 1 responds quickly to events, but can be biased. System 2 is slower and more effortful, but is good at reasoning. An important property of system 2 is that it can generate hypothetical “what-if” scenarios. In contrast, system 1 only sees information that is immediately available.

Emotions and affective states are closely associated with system 1. This is particularly true of the effects of emotion on cognition. However, system 2 may be involved in generating emotions (such as fear caused by reasoning about hypothetical states). Metacognition is usually associated with system 2. The two kinds of thinking are intended as useful concepts only, and do not correspond to parts of the brain.

Computational models
In computational models of human cognition, metacognition is often represented as an additional level of processing which monitors and controls other components in the architecture, such as perception, learning, reasoning, and planning. An example is H-CogAff (described earlier in https://catmkennedy.com/2020/01/09/what-is-a-cognitive-architecture/). In H-CogAff, the reactive layer is similar to system 1 while the deliberative layer approximates system 2. The metacognition layer monitors and adjusts deliberative and reactive processing. (In H-CogAff, the reactive layer represents an older part of the brain than the deliberative layer, meaning that these layers do not correspond exactly to “system 1” and “system 2”, but their similarities are still important).


In the same way as for cognitive models, applied AI agents can have a hybrid architecture with a reactive and deliberative layer. Deliberation enables the agent to plan in advance while reactivity ensures that it can respond quickly to unexpected events. In this case, the purpose of a hybrid architecture is not to simulate human or animal cognition, but to add useful design features to a real-world system. Metacognition (a meta-level) can be added to monitor and control the reactive and deliberative layers (both of which are “object-level”).

Human self-regulation
For humans, system 1 reacts quickly, but not always in a way that is consistent with our goals or values. So we take corrective action (in psychology this is called “self-regulation”). Examples include:
  • Resisting distractions
  • Healthy eating (e.g. resisting cake)
  • Emotion regulation
The first two are about resisting pressures. Emotion regulation is more complex and includes strategies for re-interpreting the meaning of situations that cause emotions as well as strategies for modifying the emotional response itself. (See for example [2], which reviews emotion regulation theories). Some of my research involves computational modelling of emotion regulation [3].

Agent self-regulation
An AI agent can also have a self-regulation capability. For example, if the environment is unpredictable, it may be necessary to react to potentially dangerous events quickly. But if it is spending too much time reacting to minor events, this can cause a “distraction” problem and prevent a goal from being satisfied within the required time. To solve this problem, the agent must first detect the “distraction” (meta-level monitoring) and then make adjustments to its sensitivity to interruptions (meta-level control). It might reconfigure its priorities so that minor events can be ignored. Meta-level control may also generate learning goals, such as identifying what events are the most time-wasting or what kind of “system 1” reactions should be suppressed.
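The “distraction” scenario above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the severity scale from 0 to 1, the time budget and the fixed cut-off between minor and major events are hypothetical and are not taken from any particular architecture.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    severity: float  # 0.0 (trivial) to 1.0 (dangerous); the scale is an assumption
    cost: float      # time units spent if the agent reacts to the event


@dataclass
class ReactiveLayer:
    """Object-level: reacts to any event at or above the current threshold."""
    threshold: float = 0.0

    def handle(self, event: Event) -> float:
        """Return the time spent reacting (0.0 if the event is ignored)."""
        return event.cost if event.severity >= self.threshold else 0.0


@dataclass
class MetaLevel:
    """Meta-level: monitors time lost to reactions and adjusts sensitivity."""
    budget: float              # reaction time allowed before the goal is at risk
    minor_cutoff: float = 0.5  # hypothetical boundary between minor and major events
    time_reacting: float = 0.0
    time_by_event: dict = field(default_factory=dict)
    learning_goals: list = field(default_factory=list)

    def monitor(self, event: Event, spent: float) -> None:
        """Meta-level monitoring: record how much time each reaction used."""
        self.time_reacting += spent
        if spent > 0:
            self.time_by_event[event.name] = (
                self.time_by_event.get(event.name, 0.0) + spent
            )

    def distracted(self) -> bool:
        return self.time_reacting > self.budget

    def control(self, reactive: ReactiveLayer) -> None:
        """Meta-level control: ignore minor events and generate a learning goal."""
        reactive.threshold = self.minor_cutoff
        if self.time_by_event:
            worst = max(self.time_by_event, key=self.time_by_event.get)
            self.learning_goals.append(
                f"learn when '{worst}' events can be ignored"
            )


def run(events, budget):
    """Process events in order, letting the meta-level intervene at most once."""
    reactive = ReactiveLayer()
    meta = MetaLevel(budget=budget)
    for event in events:
        spent = reactive.handle(event)
        meta.monitor(event, spent)
        if meta.distracted() and reactive.threshold < meta.minor_cutoff:
            meta.control(reactive)
    return reactive, meta
```

In this sketch the dangerous events are still handled after the adjustment, because only the threshold for minor events changes. A more realistic agent would probably adjust its sensitivity gradually and restore it once the goal is no longer at risk.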

Other self-regulation scenarios exist. For example, an AI agent that makes decisions in safety-critical scenarios could monitor the integrity of critical software that it is relying on (e.g. software providing sensor data) and re-configure or replace faulty components as necessary.

Some “self-adaptive” software architectures [4] have the foundations of self-regulation and could be called “meta-reasoning” if they include explicit reasoning and explanation about problems they have detected and corrections they are making.

In later blog posts, I plan to discuss the role of metacognition in ethical reasoning.

References
  1. Kahneman, D. Thinking, Fast and Slow. Farrar, Straus and Giroux (2011)
  2. Kobylińska D. and Kusev P. Flexible Emotion Regulation: How Situational Demands and Individual Differences Influence the Effectiveness of Regulatory Strategies. Frontiers in Psychology Volume 10, Article 72, 2019. https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00072/full
  3. Kennedy, C. M. Computational Modelling of Metacognition in Emotion Regulation. In 8th Workshop on Emotion and Computing at the German Conference on AI (KI-2018), Berlin, Germany, (2018). https://www.cs.bham.ac.uk/~cmk/emotion-regulation-metacognition.pdf
  4. Macías-Escrivá, F., Haber, R., del Toro, R., and Hernandez, V. Self-adaptive systems: A survey of current approaches, research challenges and applications. Expert Systems with Applications, Volume 40, Issue 18, 2013. https://www.sciencedirect.com/science/article/pii/S0957417413005125

Metacognition: Part 1 – reasoning and learning

My research in cognitive systems is focused on metacognition (“thinking about thinking”). In this post, I will summarise some of its key features and briefly discuss some examples in the context of reasoning and learning, both for humans and AI systems.

In psychology, metacognition involves introspective monitoring of our reasoning and mental experiences, as well as the ability to control or adjust our thinking. Monitoring includes questions such as: “Have I made the right decision, or are there some other issues that I need to consider?” Control includes making decisions about what to focus our attention on, or what mental strategies to use for problem-solving. In education, metacognitive strategies include the learning of new concepts by connecting them to familiar concepts (e.g. using mind maps).

Metacognition also includes awareness of emotions and how they might affect learning and decisions. I will talk about this in part 2.

Application to AI systems
Some principles of metacognition can be applied to AI systems, such as robots or automated decision systems. The architecture of such systems is usually divided into two levels:
  • Object-level: solving the problem (e.g. route planning, medical diagnosis).
  • Meta-level: reasoning about the methods used to solve the problem (e.g. algorithms, representations).
The term “meta-reasoning” is often used for these systems. A key feature is transparent reasoning and explanation (see e.g. [Cox 2011]). The term “reasoning” can include a wide range of problem-solving techniques which can happen on the meta-level or object-level. Metacognition happens on the “meta-level” and can be divided into two processes:
  • Meta-level monitoring: monitor progress and recognise problems in object-level methods
  • Meta-level control: make adjustments to object-level methods.

Correcting mistakes in reasoning
Metacognition is often used in everyday situations when things go wrong. For example, if a hill walker is following a route and finds that a landmark has not appeared when expected, they may ask: “Did I make a navigation error?” or “Is my route correct but am I overestimating how fast I am going?”. These questions are metacognitive because they are attempting to diagnose mistakes in reasoning, such as navigation errors or progress overestimation. In contrast, the hill walker’s non-metacognitive reasoning solves problems in the outside world, such as determining the current location and planning a route to the destination.

In a similar way, a robot might detect problems in its automated planning or navigation. For example, it could use an algorithm to predict the duration of a route, or it might have learned to recognise typical landmarks. If the route is unusual, unexpected events can occur, such as a landmark failing to appear. Such a recognition of unexpectedness is part of meta-level monitoring. The robot could respond by recalculating its position and re-planning its route, or it could just ask for assistance. This would involve a minimal level of meta-level control (e.g. initiating new algorithms and stopping current ones). A more complex form of meta-level control would involve the robot making a decision on “what can be learned” from the failure. It could identify specific features of the route that were different from the type of route that it has learned about, and use the information to generate a new learning goal, along with a learning plan. The concept of generating learning goals in AI has been around for some time (see for example [Cox and Ram 1999] and [Radhakrishnan et al. 2009]).

If the robot can reason about its mistakes, explain them and take autonomous corrective action (even if it is just deciding that it needs help), it may be considered to be metacognitive.
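As a rough illustration of the robot example, the sketch below separates meta-level monitoring (noticing that an expected landmark is overdue) from meta-level control (choosing a corrective action and proposing a learning goal). The function names, the tolerance parameter and the representation of routes as lists of feature labels are all assumptions made for this sketch, not part of any existing system.

```python
from dataclasses import dataclass


@dataclass
class Expectation:
    landmark: str
    expected_at: float  # arrival time predicted by the object-level planner
    tolerance: float    # slack allowed before an absence counts as unexpected


def monitor(expectation, now, seen):
    """Meta-level monitoring: return a diagnosis, or None if nothing is wrong."""
    if expectation.landmark in seen:
        return None
    if now <= expectation.expected_at + expectation.tolerance:
        return None
    return (
        f"'{expectation.landmark}' expected by t={expectation.expected_at} "
        f"but not seen at t={now}: possible navigation error "
        f"or overestimated speed"
    )


def control(diagnosis, can_relocalise, route_features, known_features):
    """Meta-level control: pick a corrective action and a learning goal."""
    # Minimal control: start a new object-level method, or ask for help.
    action = "relocalise_and_replan" if can_relocalise else "ask_for_assistance"
    # More complex control: decide what can be learned from the failure by
    # comparing this route with the kinds of route already learned about.
    novel = sorted(set(route_features) - set(known_features))
    learning_goal = (
        f"learn to predict routes with: {', '.join(novel)}" if novel else None
    )
    return {
        "explanation": diagnosis,
        "action": action,
        "learning_goal": learning_goal,
    }
```

The returned dictionary keeps the explanation together with the action, reflecting the point above that a metacognitive robot should be able to explain its mistakes as well as correct them.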

Metacognition is not just about mistakes in reasoning. It may also be about self-regulation. In Part 2, I will talk about this kind of metacognition.

References
  • [Cox 2011] Cox, M. T. Metareasoning, Monitoring and Self-Explanation. In: Cox, M. T. and Raja, A. (eds.) Metareasoning: Thinking about Thinking, pp 3–14, MIT Press (2011).
  • [Cox and Ram 1999] Cox, M. T. and Ram, A. Introspective multi-strategy learning: On the construction of learning strategies. Artificial Intelligence, 112, 1–55, 1999.
  • [Radhakrishnan et al. 2009] Radhakrishnan, J., Ontanon, S. and Ram, A. Goal-Driven Learning in the GILA Integrated Intelligence Architecture. International Joint Conference on Artificial Intelligence (IJCAI 2009).

Human-centered AI Systems

When considering ethical AI, I find it useful to distinguish between two types of AI systems, each of which has different ethical and dependability considerations.

Type 1: “Nature inspired” AI systems: their purpose is to help understand and replicate key features of natural intelligence. They may be simulations or experimental hardware robots. Examples include neuroscience models, swarm intelligence, artificial societies and cognitive architectures. They are not deployed in environments where humans depend on them. Although these models would mostly be developed for research purposes, it is conceivable that they might be applied for a purpose (e.g. games, virtual characters, exploring Mars).

Type 2: “Human-centred” AI systems: they are developed for a particular purpose within a human environment (e.g. an organisation or a home).  Examples include medical decision support systems and home robots. These systems need to be dependable, particularly if they involve autonomous actions. They may contain biologically inspired models originally developed as Type 1 research models (e.g. cognitive architectures, neural network learning), but only if such models can be applied effectively to satisfy the system requirements.

Importance of requirements
The two categories above are approximate and not mutually exclusive. For example, a Type 1 system may involve hardware robots with experimental human interaction (e.g. humans teaching a robot the names of objects and how to manipulate them). In such cases, safety may become important, but to a lesser extent than for most Type 2 systems. Such a robot could misidentify objects or misinterpret human communication without serious consequences. The main difference between the two classes is that the requirements of a Type 2 system are specified by humans who depend on it, while the main requirement for a Type 1 system is to survive in a challenging environment or to be successful at solving problems that biological systems can solve. Both these requirements can be combined in Type 2 systems (but the human-specified requirements would take precedence).

Knowledge and communication
For Type 1 systems, any knowledge that the system acquires would be relevant for its goals but not necessarily relevant for humans.  For example, an agent-based model might be used to study the evolution of communication in a simulated society. The evolving language constructs would be used to label entities in the simulated world which are relevant for the agents, but the labels (and possibly the labelled entities) might not be meaningful for humans. Similarly, applied systems such as a Mars explorer may develop concepts which humans don’t use and need not be explained (unless they become relevant for human exploration). These AI systems would acquire their knowledge autonomously by interacting with their environment (although some innate “scaffolding” may be needed).

For Type 2, the AI system’s knowledge needs to connect with human concepts and values. Knowledge acquisition using machine learning is a possibility, but there are debates on its limitations. A good discussion is in “Rebooting AI” [Marcus and Davis 2019]. Even if machine learning of human values were possible technically, it might not be desirable. See, for example, the discussions in [Dignum 2019] where “bottom-up” learning of values is contrasted with “top-down” specification (hybrids are possible). I think there is an important role for human participation and expertise in the knowledge acquisition process, and it makes sense for some knowledge to be hand-crafted (in addition to learning). In future posts, I plan to explore what this means for system design in practice.

References:
  • [Dignum 2019] Dignum, V. (2019) Responsible Artificial Intelligence. Springer.
  • [Marcus and Davis 2019] Marcus, G. and Davis, E. (2019). Rebooting AI: Building Artificial Intelligence We Can Trust. Pantheon Books, USA.

What is a Cognitive Architecture?

Some of my research involves the computational modelling of cognition and how it interacts with emotion. Computational modelling is useful for the study of human or animal cognition, as well as for the building of artificial cognitive systems (e.g. robots). The cognitive process being modelled may be understood as an autonomous system which senses information from its environment and uses this information to determine its next action. Such an autonomous system is often called an “agent” [Russell and Norvig 2010]. A cognitive architecture is a specification of the internal structure of a cognitive agent, defining the components of cognition and their interactions. The concept of “architecture” is important because it integrates the various functions of cognition into a coherent system. Such integration is necessary for building complete autonomous agents and for the study of interactions between different components of natural cognition, such as reasoning and motivation.

Multiple Levels
Architectures can be defined at different levels of detail. For example, [Marr 1982] defines three levels which can be applied to cognitive architecture as follows:
Level 1: “Computational theory”: this specifies the functions of cognition – what components are involved and what are their inputs and outputs?
Level 2: “Representation and algorithm”: this specifies how each component accepts its input and generates its output. For example, representations may include symbolic logic or neural nets; algorithms may include inference algorithms (for logical deductions) or learning algorithms.
Level 3: “Implementation”: this specifies the hardware, along with any supporting software and configurations (e.g. simulation software, physical robot or IT infrastructure).

At level 1, the architecture specifies the components and their interfaces. For example, a perception component takes raw sense data as input and identifies objects in a scene; a decision component generates an action depending on objects identified by the perception component. Level 2 fills in the detail of how these components work. For example, the perception component might generate a logic-based representation of the raw data that it has sensed, while the decision component uses logic-based planning to generate actions. Level 3 provides an executable instantiation of the architecture. An instantiation may be a physical robot, a software product or prototype, or a model of an agent/robot which can be run as a simulation on a particular platform.

Environment and requirements
When designing an architecture, the environment of the agent needs to be considered. This defines the situations and events that the agent encounters. It is also important to define requirements that the agent must satisfy in a given environment. These may be capabilities (e.g. to detect the novelty of an unforeseen situation or to act on behalf of human values sufficiently accurately so that humans can delegate tasks to the agent in the given environment). If a natural system is being modelled (e.g. an animal), the requirements may simply be survival in the given environment. Assumptions made about the environment help to constrain the requirements.

Architecture examples
Example architectures that are particularly relevant to my research include H-CogAff [Sloman et al. 2005] and MAMID [Hudlicka 2007]. Both model human cognition. H-CogAff emphasises the difference between fast instinctive reactions and slower reasoning. MAMID focuses on emotion generation and its effects on cognition. Architectures need not necessarily be executable (i.e. defined at levels 1 to 3). For example, H-CogAff is not a complete architecture that can be translated into an executable instance, but it is a useful guideline.

Broad-and-shallow architectures
Executable architectures can be developed using iterative stepwise refinement, beginning with simple components and gradually increasing their complexity. The complexity of the environment can also be gradually increased. To experiment with ideas quickly, it is important to use a rapid-prototyping methodology. This allows possibilities to be explored and unforeseen difficulties to be discovered early. To enable rapid-prototyping, an architecture should be made executable as early as possible in the development process. A useful approach is to start with a “broad and shallow” architecture [Bates et al. 1991].  This kind of architecture is mostly defined at level 1, with artificially simplified levels 2 and 3. For example, at level 2, the perception component may be populated temporarily by a simple data query method (does this object exist in the data?) and the decision component might include simplified “if-then” rules. For level 3, a simulation platform may be used which is suitable for rapid-prototyping.
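To make the “broad and shallow” idea concrete, here is a minimal sketch in which the level 1 interfaces are fixed while the level 2 content is deliberately simplistic. The component names, the dictionary used as raw sense data and the example rules are hypothetical and chosen only for illustration; a real prototype would use whatever simulation platform is available for level 3.

```python
from typing import Protocol


class Perception(Protocol):
    """Level 1 interface: raw sense data in, identified objects out."""
    def perceive(self, raw: dict) -> set: ...


class Decision(Protocol):
    """Level 1 interface: identified objects in, an action name out."""
    def decide(self, objects: set) -> str: ...


class QueryPerception:
    """Shallow level 2: a simple query ('does this object exist in the data?')."""

    def __init__(self, known_objects):
        self.known_objects = set(known_objects)

    def perceive(self, raw: dict) -> set:
        return {name for name in self.known_objects if raw.get(name)}


class RuleDecision:
    """Shallow level 2: ordered if-then rules, where the first match wins."""

    def __init__(self, rules, default="wait"):
        self.rules = rules      # list of (required_objects, action) pairs
        self.default = default

    def decide(self, objects: set) -> str:
        for required, action in self.rules:
            if required <= objects:  # all required objects were perceived
                return action
        return self.default


class Agent:
    """Wires the components together using only their level 1 interfaces."""

    def __init__(self, perception: Perception, decision: Decision):
        self.perception = perception
        self.decision = decision

    def step(self, raw: dict) -> str:
        return self.decision.decide(self.perception.perceive(raw))


# Example wiring with hypothetical objects and rules.
agent = Agent(
    QueryPerception(["obstacle", "goal"]),
    RuleDecision([({"obstacle"}, "avoid"), ({"goal"}, "approach")]),
)
```

Because the agent depends only on the interfaces, either shallow component can later be replaced by a deeper one (for example, logic-based planning in place of the rules) without changing the rest of the architecture.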

In later posts, I will discuss how this methodology fits in with AI research more generally and ethical AI systems in particular.

References:

  • [Russell and Norvig 2010] Russell, S. J., Norvig, P., & Davis, E. (2010). Artificial intelligence: A modern approach.
  • [Marr 1982] Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco, Freeman & Co. Full text: http://s-f-walker.org.uk/pubsebooks/epubs/Marr]_Vision_A_Computational_Investigation.pdf
  • [Bates et al. 1991] Bates, J., Loyall, A. B., & Reilly, W. S. (1991). Broad agents. Proceedings AAAI Spring Symposium on Integrated Intelligent Architectures. Stanford, CA. (Reprinted in SIGART Bulletin, 2(4), Aug. 1991, pp. 38-40.)
  • [Sloman et al. 2005] Sloman, A., Chrisley, R., Scheutz, M. (2005). The Architectural Basis of Affective States and Processes. In: Fellous, J.-M., Arbib, M.A. (eds.) Who Needs Emotions? New York: Oxford University Press.
    Full text: http://www.sdela.dds.nl/entityresearch/sloman-chrisley-scheutz-emotions.pdf
  • [Hudlicka 2007] Hudlicka, E. (2007). Reasons for emotions: Modeling emotions in integrated cognitive systems. In W. Gray (Ed.), Integrated Models of Cognitive Systems, 137. New York: Oxford University Press.