ONR: Hybrid Architecture for Complex Learning
A Hybrid Architecture for Metacognitive Learning

There is evidence that higher-level executive, or metacognitive, processes for verifying and improving the results of pattern recognition are important in decision making (Cohen, Freeman, & Wolf, 1994). Such metacognitive processes, once acquired, may also support more rapid explanation-based learning of domain knowledge. The proposed research will investigate an architecture for the connectionist learning of metacognitive skills and the related learning of domain facts within a naval tactical decision domain (Towne). The proposed design is a hybrid which uses (1) a localist connectionist model to support rapid recognitional domain reasoning, (2) a distributed connectionist architecture for the learning of metacognitive behavior, and (3) a symbolic subsystem for the control of inferencing that mediates between the domain and metacognitive systems. We will test implications of the design for three aspects of human learning: learning metacognitive skills in the testbed task; improved domain learning in the testbed task supported by metacognition; and transfer of metacognitive skills to new domains.

The objectives of the research are: (a) to explore the feasibility and benefits of a novel combination of symbolic, localist connectionist, and distributed connectionist processing that captures the role of executive, higher-order skills and advanced domain learning; (b) to expand theoretical and empirical understanding of metacognitive skills; (c) to derive principles for the design of decision aids and training that support the acquisition and use of metacognitive skills; and (d) to produce a self-improving Naval tactical system that can serve as the core for an adaptive tactical decision aid or intelligent training system.

The proposed research will integrate and build on previous technical work by the project team in a number of areas: (i) a theory of human decision making, called the Recognition / Metacognition (R/M) framework, which combines recognition with metacognitive processes that monitor and improve the results of recognition; (ii) empirical research testing that model and its training implications in the Naval tactical decision making context (as part of an on-going project in the TADMUS program); (iii) research on and implementation of innovative connectionist adaptive critic architectures for integrated reinforcement-driven adaptive behavior and model learning; and (iv) development of a connectionist model for rapid recognition-based domain reasoning (Shastri's Shruti system).

The significance of this research lies in what it can tell us about the structure and dynamics of human long-term memory (LTM). We are exploring the interactions of a reflexive, inferential LTM (Shruti) with semi-autonomic metacognitive processes. The key interaction between these two systems is the adaptive direction of focused attention within the reflexive memory in response to learned metacognitive behaviors. There are inherent, and dynamic, limits in the scope of LTM which may be brought to bear in interpreting any evidence. Recency effects and learned attention shifting behaviors are used to assemble such intermediate results into composite assessments. The model suggests that the development of executive attention functions may be necessary for, and integral to, the development of LTM. In addition, and equally significant, we are exploring ways in which an agent may evaluate the utility of a situation estimate and utilize that situation estimate, and a model of utility, to (reflexively) construct plans and take actions. Agents using these mechanisms (attention shifting, a model of utility, and a structured, inferential LTM) will be able to operate at multiple levels of spatial and temporal abstraction, resulting in a non-linear increase in planning horizon.

The algorithms for reflexive inference are computationally bounded by the length of the longest inference chain explored (not the number of facts or rules the agent possesses, nor the number of inferences drawn). When implemented on parallel hardware, the agents will be capable of real-time interactions with their environments. The reflexive system has an intended decision cycle time of less than 500 milliseconds. The metacognitive system integrates reflexive results and operates in longer cycles as it examines and structures uncertainty by exploring alternative interpretations of the evidence. The tradeoff between further exploration and action is moderated by the expected value of not-acting compared with the expected value of continuing to explore (e.g., 'the Quick Test' gating function).
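
The two ideas in the paragraph above can be illustrated with a small Python sketch. It is not the Shruti implementation or the project's Quick Test. The rule format, the function names (propagate, quick_test), and the parameters (cost_of_delay, cost_of_error, uncertainty) are all assumptions made for illustration. The first function propagates activation in synchronous waves, so the number of waves depends on the depth of the inference chain rather than on how many rules exist. The second is one possible reading of a gating function that weighs the expected gain from further exploration against the cost of delay.

```python
from collections import defaultdict


def propagate(rules, evidence, max_depth):
    """Level-synchronous forward propagation over simple if-then rules.

    rules: iterable of (antecedent, consequent) pairs (hypothetical format).
    evidence: set of initially active facts.
    Returns (active_facts, waves), where waves is the number of parallel
    steps taken. On parallel hardware each wave would be one time step.
    """
    successors = defaultdict(set)
    for antecedent, consequent in rules:
        successors[antecedent].add(consequent)

    active = set(evidence)
    frontier = set(evidence)
    waves = 0
    while frontier and waves < max_depth:
        # Every rule whose antecedent is in the frontier fires in this wave.
        next_frontier = set()
        for fact in frontier:
            next_frontier |= successors[fact] - active
        if not next_frontier:
            break
        active |= next_frontier
        frontier = next_frontier
        waves += 1
    return active, waves


def quick_test(cost_of_delay, cost_of_error, uncertainty):
    """Return True if further exploration looks worthwhile (assumed rule).

    The expected gain from exploring is taken to be the chance that the
    current assessment is wrong times the cost of acting on a wrong one.
    """
    expected_gain = uncertainty * cost_of_error
    return expected_gain > cost_of_delay


if __name__ == "__main__":
    chain = [
        ("radar_contact", "aircraft"),
        ("aircraft", "approaching"),
        ("approaching", "possible_threat"),
    ]
    # Many unrelated rules do not add waves; only chain depth does.
    noise = [(f"x{i}", f"y{i}") for i in range(1000)]
    print(propagate(chain, {"radar_contact"}, max_depth=10)[1])
    print(propagate(chain + noise, {"radar_contact"}, max_depth=10)[1])
    print(quick_test(cost_of_delay=1.0, cost_of_error=10.0, uncertainty=0.5))
```

In this sketch both calls to propagate take three waves, because the longest chain from the evidence has three links, even though the second rule set is far larger. The gating rule is deliberately simple; a fuller treatment would let the cost of delay grow as time runs out.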

If borne out by future research, this work will have a profound impact on how we model intelligent behavior and learning, and on what we can hope to accomplish with such models. For example, the present research points to a conceptually novel architecture for model-based adaptive control in real-time systems. An agent based on this research would not need to 'back-up' expectations of future rewards and would automatically (reflexively) compute optimal policies as the model changes in response to (a) discovery by the agent or (b) changes in the dynamics of the environment. In contrast, the existing methods 'crystallize' on a value function. If the model changes, the value function must often undergo catastrophic re-learning or remain markedly sub-optimal. Further research is needed to (1) extend the implementation; (2) demonstrate these capabilities; and (3) firmly map out the theoretical relationship between the present work and (a) other models of human memory, and (b) existing theory and proofs in model-based adaptive control.
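
The 'crystallization' problem mentioned above can be shown with a toy example. The Python sketch below does not implement the proposed reflexive mechanism. It only illustrates, under assumed names and a made-up one-dimensional corridor world, how a conventional backed-up value function keeps recommending the old action after the environment changes, until the values are learned again.

```python
def backed_up_values(n_states, goal, gamma=0.9, sweeps=50):
    """Conventional value backup on a corridor of n_states cells.

    The goal cell has value 1.0; every other cell takes the discounted
    value of its better neighbor. This is the 'back-up' of expected
    future reward that the text contrasts with reflexive computation.
    """
    values = [0.0] * n_states
    for _ in range(sweeps):
        values = [
            1.0 if s == goal
            else gamma * max(values[max(s - 1, 0)],
                             values[min(s + 1, n_states - 1)])
            for s in range(n_states)
        ]
    return values


def greedy_action(values, state):
    """Move toward the neighbor with the higher stored value (-1 or +1)."""
    left = values[max(state - 1, 0)]
    right = values[min(state + 1, len(values) - 1)]
    return -1 if left > right else 1


if __name__ == "__main__":
    # Values learned while the goal is at the right end of the corridor.
    cached = backed_up_values(n_states=5, goal=4)
    print(greedy_action(cached, state=2))   # moves right, toward cell 4

    # The environment changes: the goal is now at the left end.
    # The cached values are stale and still send the agent right.
    print(greedy_action(cached, state=2))

    # Only after re-learning from the changed model does the policy adapt.
    relearned = backed_up_values(n_states=5, goal=0)
    print(greedy_action(relearned, state=2))  # moves left, toward cell 0
```

The middle call is the point of the example: nothing in the cached values reflects the changed model, so the policy stays sub-optimal until the backup is repeated.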

Technology Transfer

The hybrid system will be a self-improving artificial system that can provide the basis for intelligent decision aids and adaptive human-system interfaces. CTI staff have developed and applied a concept of personalized and prescriptive design to decision aids and adaptive information display, in which aids adapt to the decision making strategies of individual users, while at the same time guarding against potential errors. The hybrid system can significantly advance the technology of adaptive decision aiding in three ways. (a) It provides a model of human decision making that can be matched with user performance, in order to infer the user's decision making strategy, to adapt the interface to that strategy, and to diagnose potential problems. (b) It provides a self-improving domain model (essentially, an expert model) to serve as the basis for comparison with the inferred user model, for assessment of the significance of observed discrepancies, and for displays and prompts that mitigate potential errors in the user model. (c) Metacognitive processes for critiquing and correcting problems in the domain model are embodied in the hybrid system, and provide a naturalistic basis for displays and prompts that alert users to potential difficulties with their own domain models and plans. The hybrid system, for example, will already be equipped with processes to determine when time, uncertainty, and costs of potential errors warrant a more careful look at a conclusion or plan. Similarly, it will already be equipped with mechanisms for detecting problems such as incompleteness, conflict, or unreliability in a model or plan, and with methods that can be used to address each type of problem.

This research will also expand the scope of automated instruction. The benefits of embedding the computational model within an Intelligent Tutoring System (ITS) parallel the potential contributions of the model to decision aiding (as discussed above): (a) The hybrid architecture can be utilized as a student model, in order to tailor instruction to individual trainees. Student models can be used, for example, to select and sequence cases and examples, to regulate the level of difficulty, to diagnose specific problems in user domain models or in decision making processes, and to provide adaptive feedback. (b) The hybrid system can also be used in an ITS as the basis for a self-improving expert model of the domain. This expert model can, in turn, be used as the basis of coaching that provides hints, advice, and feedback when there are discrepancies between the inferred student domain model and the expert domain model. (c) Metacognitive processes embedded within the hybrid model provide a naturalistic basis for a set of instructional strategies or rules for intervention. At the same time, they effectively model expert processes of metacognitive monitoring and regulation. Intervention strategies adopted by an ITS would reflect expert metacognitive strategies for interrupting the flow of recognitional processes (when time, stakes, and uncertainty warrant such interruption) and for critiquing recognitional models for such problems as incompleteness, unreliability, and conflict.

In previous work for the Tactical Decision Making under Stress (TADMUS) program, a cognitive and naturalistic approach to training CIC decision making skills has been developed. The basic premise of the training is that proficient decision making requires critical thinking skills. These skills are embodied in effectively structured domain knowledge and in strategies for monitoring and regulating the use of that knowledge. The critical thinking training has resulted in improvements in performance in tests with department head students at the Surface Warfare Officers' School, Newport, RI and with individual students at the Naval Postgraduate School, and is being extended to the context of team training.

The cognitive concepts of a recognition / metacognition model which have been explored in this project were integrated into a prototype DSS (Decision Support System) for the Navy CIC. Working with NAWC/TSD and SPAWAR, CTI developed displays and behaviors for the DSS. These were tested experimentally and found to significantly improve decision-making performance. In addition, the Navy is interested in integrating the computational model of a recognitional / metacognitive agent into the DSS. It would serve a decision support role and aid the CO and TAO in developing a situation estimate and elaborating appropriate responses. In particular, the agent would provide support for reasoning about unreliable, conflicting, and incomplete evidence and the stakes involved in delaying decisions and acting. Another on-going application of this technology is in the context of an Army project on training battlefield critical thinking skills. The hybrid system is being utilized to (1) provide feedback to officers regarding their situation assessments and decisions in battlefield scenarios and (2) represent and filter information through coordinated mental models.

Contract Information.
Sponsor: Office of Naval Research
Contract Number: N00014-95-C-0182
Contract Title: A Hybrid Architecture for Metacognitive Learning
Program Officer: Dr. Helen Gigley (703-696-0407; gigley@itd.nrl.navy.mil)
Funding Program: Hybrid Architectures for Complex Learning
Period of Performance: 3 years  (1 May 95-30 April 98)
Principal Investigators
Cohen, Marvin
Cognitive Technologies, Inc.

4200 Lorcom Lane
Arlington, VA 22207
703-527-6417 (fax)
Thompson, Bryan
Cognitive Technologies, Inc.

1737 Harvard St, NW
Washington, DC 20009
202-265-3342 (fax)
Shastri, Lokendra
International Computer Science Institute
1947 Center St, Suite 600
Berkeley, CA 94704
510-642-4274 x310
510-643-7684 (fax)

In this work we used two different sets of scenarios. One was developed under the TADMUS (Tactical Decision Making Under Stress) program for the DEFTT simulator. The second set was developed for Douglas Towne's Rides-based CIC simulation package. In the course of our research, we developed a tool for translating the DEFTT scenarios, to the extent possible, into the format used by Towne's CIC simulation. In addition, we developed a CIC kinematics simulation to be used with the machine learning experiments. It uses the same scenario format as Towne's.



Copyright © 2000-2011 Cognitive Technologies, Inc.