Agents in a Non-stationary World

We are interested in understanding the mechanisms that enable agents to operate in a complex, changing, and uncertain world. To what extent do agents explore versus exploit? How do they balance internal versus external computation? What kind of dynamics emerge when agents have multiple competing or cooperating learning systems? 


In one strand, we use modular reinforcement learning to study the idea of having “multiple selves”, in an attempt to synthesize longstanding observations of psychological conflict (e.g., those described by psychodynamic theories) with modern computational principles. We evolved to satisfy many ongoing needs; the more independent and possibly conflicting these needs are, the more we expect a modular system to be better suited to balancing them. The computational benefits of modularity in such environments, such as improvements in exploration and learning, provide a normative basis for the internal tug-of-war people tend to experience when making decisions.

Figure 1. Do brains balance our needs in a global fashion (left; monolithic), or might sub-agents compete for control, like the semi-autonomous arms of an octopus (right; modular)? In the framework of reinforcement learning, this contrast corresponds to the one between a network learning a single policy based on a scalar reward value r and multiple sub-networks each learning a distinct policy based on separate reward components (r1, r2, ...).
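
The contrast in Figure 1 can be made concrete with a minimal sketch. It rests on assumptions that are not part of the description above: tabular Q-learning stands in for the networks, the scalar reward r is taken to be the sum of the components, and the modules are arbitrated by summing their action values, which is only one of many possible rules. All function names and parameters are hypothetical.

```python
import numpy as np

def monolithic_update(q, s, a, rewards, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning step on a single table of shape (n_states, n_actions).

    Assumption: the scalar reward r is the sum of the reward components.
    """
    r = float(np.sum(rewards))
    target = r + gamma * np.max(q[s_next])
    q[s, a] += alpha * (target - q[s, a])
    return q

def modular_update(qs, s, a, rewards, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning step per module on tables of shape
    (n_modules, n_states, n_actions); module i sees only component r_i.
    """
    for i, r_i in enumerate(rewards):
        target = r_i + gamma * np.max(qs[i, s_next])
        qs[i, s, a] += alpha * (target - qs[i, s, a])
    return qs

def modular_action(qs, s):
    """Greedy action after summing the modules' values for state s.

    Assumption: summation is the arbitration rule; other rules
    (voting, winner-take-all) would fit the same interface.
    """
    return int(np.argmax(qs[:, s, :].sum(axis=0)))

# Toy transition: two states, two actions, two reward components.
rewards = [1.0, -0.5]                 # (r1, r2), illustrative values only
q = monolithic_update(np.zeros((2, 2)), 0, 1, rewards, 1)
qs = modular_update(np.zeros((2, 2, 2)), 0, 1, rewards, 1)
action = modular_action(qs, 0)
```

In this sketch the monolithic learner sees only the net reward, whereas each module keeps its own estimate, so a conflict between needs stays visible as opposite-signed values for the same action.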

In another strand, we seek to determine the source and effects of the opportunity costs an agent faces when balancing different objectives. Many of these opportunity costs come down to time: agents simply do not have enough of it to fulfill all their desires. We offer an integrative account of these limitations, connecting and dissociating the phenomenological states of flow, fatigue, boredom, and mind-wandering as different instantiations of these tradeoffs.



Overall, we aim to uncover principles that explain features of human psychology and at the same time inform the design of artificial agents.