Script Author: Katja Borchert, Ph.D. (katjab@millisecond.com), Millisecond
Created: January 29, 2018
Last Modified: January 24, 2025 by K. Borchert (katjab@millisecond.com), Millisecond
Script Copyright © Millisecond Software, LLC
This script implements the 2-armed bandit task, a decision-making game in which participants trade off pursuing a known resource against exploring a new resource, as described in Knox et al. (2012).
Knox, W.B., Otto, A.R., Stone, P. & Love, B.C. (2012). The nature of belief-directed exploratory choice in human decision-making. Frontiers in Psychology, Volume 2, Article 298, 1-12
Duration: 35 minutes
Participants select between 2 options ("bandits") that are tied to different payoff schedules. One option is always worth 10 points more than the other. The payoff schedules change with a certain probability (flipRate) each trial. When they change, 20 points are added to the lower-paying option, which flips the relative payoffs. Participants are told that their final reward is tied to the number of times they selected the higher payoff choice.
2 phases:
1. Passive Observation Phase: participants watch the computer select choices for 300 trials (default). Payoff changes are made explicit to participants by presenting a text message on screen. Every 100 trials (starting at trial 200), participants are asked to estimate how many reversals in payoff they expect to observe during the next set of 100 trials.
2. Active Game: participants play the game for 300 trials (default).
Two phases:
1. Passive Observation Phase: participants watch the computer select choices for 300 trials (default).
Every 100 trials (starting at trial 200), participants are asked to estimate how many reversals in payoff
they expect to observe during the next set of 100 trials.
Trial Sequence:
choice ("Choose") (2000ms, editable) -> selected choice highlighted and total updated (1000ms, editable)
If the payoff schedule has changed that trial, the choice trial displays "Changed" in red font
2. Active Game Phase: participants play the game for 300 trials (default)
Trial Sequence:
choice ("Choose") (max. 1500ms, editable) -> if no response: feedback, and the trial is repeated until a response is made ->
point update (1000ms, editable) -> etc.
The default flipRate is 0.075 (editable), so on each trial there is a 7.5% chance that the payoffs flip
(on average, you can expect to see 7.5 flips every 100 trials)
• in this script, the fliprate is a true probability.
• the first trial cannot be a flip trial
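The payoff rules described above can be sketched in a few lines of Python. This is only an illustration, not the Inquisit implementation: the function name `simulate_payoffs` and its arguments are hypothetical, and the sketch assumes an independent random draw per trial and records the values after any flip on that trial.

```python
import random

def simulate_payoffs(n_trials, flip_rate=0.075, start_a=10, start_b=20, seed=None):
    """Return one (value_a, value_b, flipped) tuple per trial (hypothetical helper)."""
    rng = random.Random(seed)
    value_a, value_b = start_a, start_b
    schedule = []
    for trial in range(1, n_trials + 1):
        # The first trial can never be a flip trial.
        flipped = trial > 1 and rng.random() < flip_rate
        if flipped:
            # Add 20 points to the lower-paying option; this reverses the
            # ranking while keeping the 10-point gap between the options.
            if value_a < value_b:
                value_a += 20
            else:
                value_b += 20
        schedule.append((value_a, value_b, flipped))
    return schedule

schedule = simulate_payoffs(300, seed=1)
n_flips = sum(flipped for _, _, flipped in schedule)
print(f"{n_flips} flips in {len(schedule)} trials")
```

With the default flipRate of 0.075, a 300-trial run should produce roughly 22 flips on average, although any single run can deviate noticeably from that.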
Stimuli: provided by Millisecond - can be edited under section Editable Stimuli
Instructions: provided by Millisecond - can be edited under section Editable Instructions
File Name: twoarmedbandittask_summary*.iqdat
| Name | Description |
|---|---|
| inquisit.version | Inquisit version number |
| computer.platform | Device platform: win, mac, ios, android |
| computer.touch | 0 = device has no touchscreen capabilities; 1 = device has touchscreen capabilities |
| computer.hasKeyboard | 0 = no external keyboard detected; 1 = external keyboard detected |
| startDate | Date the session was run |
| startTime | Time the session was run |
| subjectId | Participant ID |
| groupId | Group number |
| sessionId | Session number |
| elapsedTime | Session duration in ms |
| completed | 0 = Test was not completed; 1 = Test was completed |
| flipRate | Parameter: p(flip) = the fixed probability with which the payoff schedules will switch and the former lower payoff choice will be the higher payoff choice until the next flip |
| meanFlipRateEstimate | The estimated flip rate based on the mean number of flips estimated per 100 trials (PHASE 1) |
| meanFlipEstimate | The mean estimated number of flips per 100 trials (PHASE 1) |
| sumFlipEstimate | The total number of estimated flips across the passive observation phase (PHASE 1) |
| countFlipsPractice | The total number of actual flips across the passive observation phase (PHASE 1) |
| propHigherPayOffs | Proportion of trials on which the higher payoff option was selected (PHASE 2) |
| propExploit | Proportion of selecting the option with the known higher payoff (PHASE 2). At the time of making the choice, the participant does not know whether a flip occurred and can only use the information gathered up to this point |
| propExplore | Proportion of selecting the option with the known lower payoff (PHASE 2). At the time of making the choice, the participant does not know whether a flip might have occurred and can only use the information gathered up to this point |
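As a worked example of how the three flip-estimate measures relate, the sketch below derives them from a list of per-block estimates. The function `flip_estimate_summary` and its `block_size` argument are hypothetical, and the sketch assumes that meanFlipRateEstimate is simply the mean estimate divided by the 100-trial block length; the script may compute it differently.

```python
def flip_estimate_summary(estimates, block_size=100):
    """Summarize participant flip estimates (one estimate per 100-trial block)."""
    sum_estimate = sum(estimates)
    mean_estimate = sum_estimate / len(estimates)
    return {
        "sumFlipEstimate": sum_estimate,
        "meanFlipEstimate": mean_estimate,
        # Assumption: the rate is the mean estimate per trial of a block.
        "meanFlipRateEstimate": mean_estimate / block_size,
    }

# Example: a participant expects 6 flips in one block and 9 in the next.
summary = flip_estimate_summary([6, 9])
print(summary)
```

In this example the mean estimate is 7.5 flips per 100 trials, which corresponds to an estimated flip rate of 0.075, the same as the default flipRate.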
File Name: twoarmedbandittask_raw*.iqdat
| Name | Description |
|---|---|
| build | Inquisit version number |
| computer.platform | Device platform: win, mac, ios, android |
| computer.touch | 0 = device has no touchscreen capabilities; 1 = device has touchscreen capabilities |
| computer.hasKeyboard | 0 = no external keyboard detected; 1 = external keyboard detected |
| date | Date the session was run |
| time | Time the session was run |
| subject | Participant ID |
| group | Group number |
| session | Session number |
| blockcode | The name of the current block (built-in Inquisit variable) |
| blocknum | The number of the current block (built-in Inquisit variable) |
| trialcode | The name of the currently recorded trial (built-in Inquisit variable) |
| trialnum | The number of the currently recorded trial (built-in Inquisit variable). It counts all trials run, even those that do not store data to the data file. In this script, trial.selectBandit and trial.summary store data to the data file; trial.summary stores the comprehensive summary of all relevant values at the end of each 'successful' trial (= a choice was made). |
| flipRate | Parameter: p(flip) = the fixed probability with which the payoff schedules will switch and the former lower payoff choice will be the higher payoff choice (until the next flip) |
| countSelections | Counts the number of choices A or B made (excludes trials with no responses) |
| attempts | Running total of attempts for the current trial (if the participant takes too long to decide, the current trial terminates and is repeated after feedback; those repeats are called 'attempts' in this script) |
| flip | 1 = a flip of the pay-offs occurred during this trial; 0 = no payoff changes occurred |
| countFlipsPractice | Counts the number of flips |
| valueA | Stores the value of choice A (changes in values after flips are stored in trial.summary only) |
| valueB | Stores the value of choice B (changes in values after flips are stored in trial.summary only). trial.selectBandit: values are the pre-flip values (pre-flip values are used to assess whether participants selected the known higher payoff); trial.summary: values are the post-flip values |
| higherPayOff | Stores the choice with the higher payoff ("A" or "B"). Note: this variable is only updated in trial.summary; see valueA and valueB |
| selectionRT | Stores the response time (in ms) of selecting the last response; measured from start of last trial.selectBandit |
| previousArm | Stores the previously (preceding trial) selected choice ("A" or "B") |
| response | The participant's response during the current trial. trial.selectBandit: 18 = E; 23 = I; 0 = no response. trial.summary: 0 = no response |
| selectedArm | Stores the selected choice ("A" or "B") |
| selectedHigherPayOff | 1 = participant selected the currently higher payoff (note: if there was a payoff flip, the participant was not aware of the flip at the time of making the response); 0 = otherwise |
| choice | 0 = no choice made; 1 = exploitative choice (the choice selected was the one attached to the highest seen payoff up to this point); 2 = exploratory choice (participant selected the other choice) |
| highestSeenPayOff | Stores the current (at time of data saving) highest payoff that the participant has seen. trial.selectBandit: stores the highestSeenPayOff from the participant's perspective at the time of making the response; trial.summary: stores the highestSeenPayOff from the participant's perspective at the end of the trial (after the total points feedback). The highestSeenPayOff can change from trial.selectBandit to trial.summary depending on whether or not there was a payoff flip |
| consecutiveSameChoice | Running total of selecting the same choice consecutively (resets after switch) |
| selectionChange | 1 = participant changed selection; 0 = otherwise |
| totalPoints | Stores the current total points earned |
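The coding of the `choice` column and its link to the exploit/explore proportions can be illustrated with a short Python sketch. The helper names and the `best_seen_arm` argument (the arm attached to the highest payoff seen so far) are hypothetical, and the sketch assumes that trials without a choice are excluded from the proportions.

```python
def classify_choice(selected_arm, best_seen_arm):
    """Return 0 (no choice), 1 (exploitative) or 2 (exploratory)."""
    if selected_arm is None:
        return 0
    return 1 if selected_arm == best_seen_arm else 2

def exploit_explore_proportions(choice_codes):
    """Return (propExploit, propExplore) over trials on which a choice was made."""
    made = [code for code in choice_codes if code != 0]
    if not made:
        return 0.0, 0.0
    prop_exploit = sum(code == 1 for code in made) / len(made)
    return prop_exploit, 1.0 - prop_exploit

# Example: (selected arm, arm with the highest seen payoff) per trial.
trials = [("A", "A"), ("A", "A"), ("B", "A"), (None, "A"), ("B", "B")]
codes = [classify_choice(selected, best) for selected, best in trials]
prop_exploit, prop_explore = exploit_explore_proportions(codes)
print(codes, prop_exploit, prop_explore)
```

Note that an exploitative choice is defined relative to what the participant has seen, not relative to the true current payoffs, so a choice can be exploitative and still land on the lower payoff right after a flip.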
The procedure can be adjusted by setting the following parameters.
| Name | Description | Default |
|---|---|---|
| flipRate | P(flip) = the fixed probability with which the payoff schedules will switch and the former lower payoff choice will be the higher payoff choice until the next flip. For the script to run without error messages, make sure that flipRate*1000 results in an integer | 0.075 |
| startA | The start value of option A | 10 |
| startB | The start value of option B | 20 |
| nrDemoTrials | The number of passive demo trials | 300 |
| nrTestTrials | The number of active test trials. Note: Knox et al. (2012) ran 500 trials in each phase | 300 |
| demoSelectionTime | The time (in ms) it takes computer to make a choice during the passive demo trials | 2000 |
| demoResponseFeedback | The feedback duration (in ms) during the passive demo trials | 1000 |
| readyDuration | The duration (in ms) of the get-ready trial | 3000 |
| selectionTime | The response timeout (in ms) of making a selection during a test trial | 1500 |
| feedbackDuration | The duration (in ms) of the feedback during a test trial | 1000 |
| iti | Inter trial interval (in ms) | 1000 |
| leftKey | The left response button - this key is attached to option A | "E" |
| rightKey | The right response button - this key is attached to option B | "I" |
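Before editing flipRate, the integer constraint on flipRate*1000 can be checked outside of Inquisit. The following Python sketch is only a convenience check under stated assumptions: the function `is_valid_flip_rate` is hypothetical, and it takes the value as text so that the test uses exact fractions instead of floating-point arithmetic.

```python
from fractions import Fraction

def is_valid_flip_rate(flip_rate_text):
    """Check that the rate is a probability and that rate * 1000 is an integer."""
    rate = Fraction(flip_rate_text)  # exact value, e.g. "0.075" -> 3/40
    scaled = rate * 1000
    return 0 <= rate <= 1 and scaled.denominator == 1

print(is_valid_flip_rate("0.075"))   # 0.075 * 1000 = 75, an integer
print(is_valid_flip_rate("0.0755"))  # 0.0755 * 1000 = 75.5, not an integer
```

In practice this means flipRate should have at most three decimal places.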