Two-Step Decision Task
AKA: Alien Task, Space Treasure Task
Background
The Two-Step Decision Task (also referred to as the "Alien Task" or "Space Treasure Task") is a behavioral and computational neuroscience paradigm. The child-friendly version described here was developed by Johannes H. Decker and colleagues (2016). It is designed to separate and measure how the human brain balances two different learning strategies: model-based and model-free reinforcement learning.
The basic goal of the Alien Task is to collect as much 'treasure' as possible. To collect treasure, a player flies to different planets and requests treasure from the aliens that mine it. Some planets have a higher probability of providing treasure than others, and some aliens work in mines that provide more treasure. The underlying structure of the Alien Task is essentially a sequential two-armed bandit design, as the choice is always between two options.
Step 1 (Decision 1): The player chooses between two spaceships. Each spaceship has a high probability (70%) of flying to one specific planet and a low probability (30%) of flying to the other. For example, spaceship A usually takes the player to the 'red' planet (70% of the time) and only infrequently to the 'purple' planet (30% of the time). Spaceship B shows the opposite pattern: it usually flies to the 'purple' planet and only infrequently to the 'red' planet.
Step 2 (Decision 2): Once the player lands on the planet selected by this probabilistic transition, they choose between two aliens. Each alien yields treasure with a certain probability, and these probabilities slowly drift over time.
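The two steps above can be sketched as a small simulation. This is a minimal illustration, not the Millisecond script: the function name `run_trial`, the planet/spaceship labels, and the reward probabilities are assumptions made for the example; only the 70%/30% transition split comes from the description above.

```python
import random

# Only the 70/30 split is taken from the task description;
# all names and reward values below are illustrative assumptions.
COMMON_PROB = 0.7
COMMON_PLANET = {"A": "red", "B": "purple"}
OTHER_PLANET = {"red": "purple", "purple": "red"}


def run_trial(spaceship, choose_alien, reward_probs, rng):
    """Simulate one trial and return (planet, transition, alien, reward)."""
    common = COMMON_PLANET[spaceship]
    # Step 1: the spaceship reaches its usual planet 70% of the time.
    if rng.random() < COMMON_PROB:
        planet, transition = common, "common"
    else:
        planet, transition = OTHER_PLANET[common], "rare"
    # Step 2: pick one of the two aliens (index 0 or 1) on that planet.
    alien = choose_alien(planet)
    reward = 1 if rng.random() < reward_probs[planet][alien] else 0
    return planet, transition, alien, reward


rng = random.Random(0)
reward_probs = {"red": [0.6, 0.3], "purple": [0.4, 0.7]}  # hypothetical values
trials = [run_trial("A", lambda planet: 0, reward_probs, rng) for _ in range(10000)]
share_red = sum(t[0] == "red" for t in trials) / len(trials)
print(f"Spaceship A reached the red planet on {share_red:.1%} of trials")
```

Over many trials, the share of flights that reach the red planet should settle near 70%, which is what makes a transition "common" or "rare" from the player's point of view.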
The goal of the task is to determine whether participants' choices are driven by habit or by knowledge of the task structure. It distinguishes two types of learning:
- Model-free (habitual): The player ignores the transition rules of the environment and simply repeats whatever action led to a reward on the previous trial. If they received treasure from alien B, they will reuse whichever spaceship got them there, regardless of how improbable that route was. Essentially, these players blur both decisions into one event.
- Model-based (goal-directed): The player builds an internal representation of the game that maps the probabilities connected to both steps. If an alien pays out treasure, they work out how to get back to that alien using the spaceship that most often flies to its planet, even if this means switching spaceships after an unusual or "rare" transition.
By mathematically analyzing a participant's sequence of choices, researchers can identify whether that person's decision-making is predominantly habitual or goal-directed. The task is used to study behavioral flexibility and psychiatric conditions such as obsessive-compulsive disorder (OCD), problem gambling, and addiction.
The Millisecond implementation of the game uses mouse or touch input, so that the task can easily be played on computers and touchscreen devices (e.g. iPads) alike.
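A common first step in such an analysis is to tabulate how often the same spaceship is chosen again, split by whether the previous trial was rewarded and whether its transition was common or rare. The sketch below assumes a hypothetical trial record with the keys `spaceship`, `transition`, and `reward`; it is not the data format of any particular script, and trials with missed responses are assumed to have been removed beforehand.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """Return {(reward, transition): P(same spaceship on the next trial)}.

    `trials` is a list of dicts with the (assumed) keys
    'spaceship', 'transition' ('common' or 'rare') and 'reward' (0 or 1).
    """
    stays = defaultdict(int)
    counts = defaultdict(int)
    # Pair every trial with the one that follows it.
    for prev, nxt in zip(trials, trials[1:]):
        key = (prev["reward"], prev["transition"])
        counts[key] += 1
        stays[key] += int(nxt["spaceship"] == prev["spaceship"])
    return {key: stays[key] / counts[key] for key in counts}


# Hand-made example showing a purely model-based pattern.
example = [
    {"spaceship": "A", "transition": "common", "reward": 1},
    {"spaceship": "A", "transition": "rare", "reward": 1},
    {"spaceship": "B", "transition": "common", "reward": 0},
    {"spaceship": "A", "transition": "rare", "reward": 0},
    {"spaceship": "A", "transition": "common", "reward": 1},
]
stay = stay_probabilities(example)
```

A purely model-free player would stay after any rewarded trial regardless of transition type. A purely model-based player, as in the example data, stays after rewarded common and unrewarded rare trials and switches otherwise.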
Task Procedure
The Alien Task starts with a training module in which participants practice selecting one of two aliens that differ in their ability (i.e., probability) to provide treasure. This training module is followed by 20 practice trials that include both steps of the game: (1) choosing a spaceship and (2) choosing an alien. The training images for spaceships, aliens and planets differ from those used in the actual test.
By default, the test phase runs 200 trials. On each trial the participant must (a) select a spaceship within 3 seconds and (b) select an alien within 3 seconds. If no spaceship or alien is chosen in time, the participant loses the opportunity to win any treasure on that trial.
The payoff probabilities associated with each alien on each trial are pre-generated, and the game provides four different versions to choose from. The probabilities were computed so that they 'drift' over time: the alien of a 'good' mine might get tired and no longer produce treasure at the same rate, whereas another alien might finally wake up and work a little harder. This drift forces participants to stay flexible and adapt their treasure hunting over the course of the game. All payoff probabilities are constrained to the range 0.25 ≤ p ≤ 0.75, so they never drift to extreme values.
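One way to produce drifting probabilities of this kind is a Gaussian random walk that is reflected at the bounds. The sketch below is an assumption about how such a sequence could be generated, not a description of how the shipped versions were computed: the function name `drift_probabilities`, the step size `sd=0.025`, and the reflecting-boundary rule are illustrative choices. Only the 200 trials and the 0.25 to 0.75 range come from the text above.

```python
import random


def drift_probabilities(n_trials, start, sd=0.025, low=0.25, high=0.75, seed=None):
    """Return one alien's reward probability for each of n_trials trials.

    Assumes low <= start <= high. Each trial adds Gaussian noise and
    reflects the value back into [low, high] if it steps outside.
    """
    rng = random.Random(seed)
    p = start
    series = []
    for _ in range(n_trials):
        series.append(p)
        p += rng.gauss(0.0, sd)
        # Reflect off the boundaries until the value is back in range.
        while p < low or p > high:
            p = 2 * low - p if p < low else 2 * high - p
    return series


# One hypothetical "version": four aliens, 200 trials each.
starts = [0.3, 0.45, 0.55, 0.7]
version = [drift_probabilities(200, s, seed=i) for i, s in enumerate(starts)]
```

Fixing the seed makes a version reproducible, which mirrors the idea of shipping a small number of pre-generated sequences rather than drawing new ones for each participant.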
What it Measures
The Alien Task measures how individuals adapt their decision-making under uncertainty by balancing habit-based routines against flexible, goal-directed planning.
Psychological domains
- Decision-making: Response to potential rewards and losses over time
- Risk-taking: Preference for high-reward/high-risk or low-reward/low-risk options
- Executive Control: The ability of our prefrontal cortex to override automatic, reward-seeking behavior to execute an exploratory choice
Main Performance Metrics
- Perseveration Score: A simple measure of how often a participant repeats their previous choice regardless of whether it was rewarded
- Habit Score: A simple measure of habit-based decision making ('stay with whatever selection led to a reward')
- Model Score: A simple measure of model-based decision making
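The exact formulas behind these scores are not given here. As a rough illustration only, scores of this kind can be derived from the four first-step stay probabilities (rewarded/unrewarded by common/rare transition). The function `summary_scores` and all three definitions below are assumptions for the sake of the example and may differ from what the script actually computes.

```python
def summary_scores(stay):
    """Derive three illustrative scores from first-step stay probabilities.

    `stay` maps (reward, transition) to P(stay), with reward in {0, 1}
    and transition in {'common', 'rare'}. All definitions are hypothetical.
    """
    rc, rr = stay[(1, "common")], stay[(1, "rare")]
    uc, ur = stay[(0, "common")], stay[(0, "rare")]
    return {
        # Overall tendency to repeat the previous spaceship choice.
        "perseveration": (rc + rr + uc + ur) / 4,
        # Main effect of reward: staying more after any rewarded trial.
        "habit": (rc + rr) / 2 - (uc + ur) / 2,
        # Reward-by-transition interaction: the effect of reward
        # reverses after rare transitions for a model-based player.
        "model": (rc - rr) - (uc - ur),
    }


scores = summary_scores({
    (1, "common"): 0.9, (1, "rare"): 0.6,
    (0, "common"): 0.5, (0, "rare"): 0.8,
})
```

In this hypothetical example the interaction term is much larger than the reward main effect, which would point toward predominantly model-based choices.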
Psychiatric Conditions
The Alien Task has been used with the following patient groups:
- Obsessive-Compulsive Disorder (OCD)
- Substance Use Disorders
- Major Depressive Disorder
- Schizophrenia
A child-friendly two-stage reinforcement learning task that measures probabilistic decision making.
References
Decker JH, Otto AR, Daw ND, Hartley CA. From Creatures of Habit to Goal-Directed Learners: Tracking the Developmental Emergence of Model-Based Reinforcement Learning. Psychol Sci. 2016 Jun;27(6):848-58.