___________________________________________________________________________________________________________________
Two Armed Bandit Task
___________________________________________________________________________________________________________________

Script Author: Katja Borchert, Ph.D. (katjab@millisecond.com) for Millisecond Software, LLC
Date: 01-29-2018
last updated: 02-17-2022 by K. Borchert (katjab@millisecond.com) for Millisecond Software, LLC

Script Copyright © 02-17-2022 Millisecond Software

___________________________________________________________________________________________________________________
BACKGROUND INFO
___________________________________________________________________________________________________________________
This script implements the 2-armed bandit task, a decision-making game in which participants trade off
pursuing a known resource against exploring a new resource, as described in:

Knox, W.B., Otto, A.R., Stone, P. & Love, B.C. (2012). The nature of belief-directed exploratory choice
in human decision-making. Frontiers in Psychology, Volume 2, Article 298, 1-12.

___________________________________________________________________________________________________________________
TASK DESCRIPTION
___________________________________________________________________________________________________________________
Participants select between 2 options ("bandits") that are tied to different payoff schedules.
One option is always worth 10 points more than the other.
The payoff schedules change with a certain probability (flipRate) each trial. When they change,
20 points are added to the lower-paying option, which flips the relative payoffs.
Participants are told that their final reward is tied to the number of times they selected the
higher payoff choice.

2 phases:
1. Passive Observation Phase: participants watch the computer select choices for 300 trials (default).
Payoff changes are made explicit to participants by presenting a text message on screen.
Every 100 trials (starting at trial 200), participants are asked to estimate how many reversals
in payoff they expect to observe during the next set of 100 trials.
2. Active Game: participants play the game for 300 trials (default)

___________________________________________________________________________________________________________________
DURATION
___________________________________________________________________________________________________________________
The default setup with 300 trials in each phase takes about 35 minutes to run.

___________________________________________________________________________________________________________________
DATA FILE INFORMATION
___________________________________________________________________________________________________________________
The default data stored in the data files are:

(1) Raw data file: 'twoarmedbandittask_raw*.iqdat' (a separate file for each participant)*

build: the specific Inquisit version (the 'build') that was run
computer.platform: the platform the script was run on (win/mac/ios/android)
date, time: date and time the script was run
subject, group: the current subject/group number
session: the current session id

blockcode, blocknum: the name and number of the current block (built-in Inquisit variable)
trialcode, trialnum: the name and number of the currently recorded trial
    (Note: not all trials that are run might record data; by default, data is collected unless
    /recorddata = false is set for a particular trial/block. In this script, trial.selectBandit and
    trial.summary store data to the data file. trial.summary stores the comprehensive summary of all
    relevant values at the end of each 'successful' (= a choice was made) trial.)

(parameter) flipRate: p(flip) = the fixed probability with which the payoff schedules will switch and the
    former lower payoff choice will be the higher payoff choice (until the next flip)

countSelections: counts the number of choices A or B made (excludes trials with no responses)
attempts: running total of attempts for the current trial
    (Note: if the participant takes too long to decide, the current trial terminates and is repeated
    after feedback. Those repeats are called 'attempts' in this script)

flip: 1 = a flip of the payoffs occurred during this trial; 0 = no payoff changes occurred
countFlipsPractice: counts the number of flips

valueA: stores the value of choice A (NOTE: changes in values after flips are stored in trial.summary only)
valueB: stores the value of choice B (NOTE: changes in values after flips are stored in trial.summary only)
    trial.selectBandit: values present the pre-flip values
        (pre-flip values are used to assess whether participants selected the known higher payoff)
    trial.summary: values present the post-flip values

higherPayOff: stores the choice with the higher payoff ("A" or "B")
    !Note: this variable is only updated in trial.summary, see valueA and valueB

selectionRT: stores the response time (in ms) of selecting the last response;
    measured from start of last trial.selectBandit
previousArm: stores the choice ("A" or "B") selected on the preceding trial
response: the participant's response during the current trial
    trial.selectBandit: 18 = E; 23 = I; 0 = no response
    trial.summary: 0 = no response
selectedArm: stores the selected choice ("A" or "B")
selectedHigherPayOff: 1 = participant selected the currently higher payoff (note: if there was a payoff flip,
    the participant wasn't aware of the flip at the time of making the response); 0 = otherwise
choice: 0 = no choice made
    1 = exploitative choice (the choice selected was the one attached to the highest seen payoff up to this point)
    2 = exploratory choice (participant selected the other choice)

highestSeenPayOff: stores the current (at time of data saving) highest payoff that the participant has seen
    trial.selectBandit: stores the highestSeenPayOff from the participant's perspective at the time of making the response
    trial.summary: stores the highestSeenPayOff from the participant's perspective at the end of the trial
        (after the total points are presented as feedback)
    (Note: the highestSeenPayOff can change from trial.selectBandit to trial.summary depending on
    whether or not there was a payoff flip)

consecutiveSameChoice: running total of selecting the same choice consecutively (resets after a switch)
selectionChange: 1 = participant changed selection; 0 = otherwise
totalPoints: stores the current total points earned

(2) Summary data file: 'twoarmedbandittask_summary*.iqdat' (a separate file for each participant)*

inquisit.version: Inquisit version run
computer.platform: the platform the script was run on (win/mac/ios/android)
startDate: date script was run
startTime: time script was started
subjectid: assigned subject id number
groupid: assigned group id number
sessionid: assigned session id number
elapsedTime: time it took to run the script (in ms); measured from onset to offset of script
completed: 0 = script was not completed (prematurely aborted);
    1 = script was completed (all conditions run)

(parameter) flipRate: p(flip) = the fixed probability with which the payoff schedules will switch and the
    former lower payoff choice will be the higher payoff choice until the next flip

*************
Phase 1:
*************
meanFliprateEstimate: the estimated flip rate based on the mean number of flips over 100 trials
meanFlipEstimate: the mean estimate of flips over 100 trials
sumFlipEstimate: the total number of estimated flips for 400 trials (phase 1)
countFlipsPractice: the total number of actual flips for 400 trials (phase 1)

*************
Phase 2:
*************
propHigherPayOffs: proportion of selecting the higher payoff option
propExploit: proportion of selecting the option with the known higher
payoff (Note: at the time of making the choice, the participant does not know whether
    a flip occurred and can only use the information gathered up to this point)
propExplore: proportion of selecting the option with the known lower payoff
    (Note: at the time of making the choice, the participant does not know whether
    a flip might have occurred and can only use the information gathered up to this point)

* separate data files: to change to one data file for all participants (on Inquisit Lab only), go to section
"DATA" and follow further instructions

___________________________________________________________________________________________________________________
EXPERIMENTAL SET-UP
___________________________________________________________________________________________________________________
Two phases:
1. Passive Observation Phase: participants watch the computer select choices for 300 trials (default).
Every 100 trials (starting at trial 200), participants are asked to estimate how many reversals
in payoff they expect to observe during the next set of 100 trials.

Trial Sequence:
choice ("Choose") (2000ms, editable) -> selected choice highlighted and total updated (1000ms, editable)
If the payoff schedule has changed that trial, the choice trial displays "Changed" in red font

2. Active Game Phase: participants play the game for 300 trials (default)

Trial Sequence:
choice ("Choose") (max. 1500ms, editable) -> if no response: feedback and trial gets repeated until a response is made
-> point update (1000ms, editable) -> etc.

The default fliprate is 0.075 (editable) => each trial there is a 7.5% chance that the payoffs flip
(so you could expect to see, on average, 7.5 flips every 100 trials)

Notes:
* in this script, the fliprate is a true probability.
* the first trial cannot be a flip trial

___________________________________________________________________________________________________________________
STIMULI
___________________________________________________________________________________________________________________
provided by Millisecond Software - can be edited under section Editable Stimuli

___________________________________________________________________________________________________________________
INSTRUCTIONS
___________________________________________________________________________________________________________________
provided by Millisecond Software - can be edited under section Editable Instructions

___________________________________________________________________________________________________________________
EDITABLE CODE
___________________________________________________________________________________________________________________
check below for (relatively) easily editable parameters, stimuli, instructions etc.
Keep in mind that you can use this script as a template and therefore always "mess" with the entire code
to further customize your experiment.

The parameters you can change are:

/flipRate: p(flip) = the fixed probability with which the payoff schedules will switch and the
    former lower payoff choice will be the higher payoff choice until the next flip (default: 0.075)
    Note: in order for the script to run without error messages, make sure that
    flipRate*1000 results in an integer number

/start_A: the start value of option A (default: 10)
/start_B: the start value of option B (default: 20)

/nrDemotrials: the number of passive demo trials (default: 300)
/nrTesttrials: the number of active test trials (default: 300)
    Note: Knox et al. (2012) ran 500 trials in each phase

/demoSelectionTime: the time (in ms) it takes the computer to make a choice during the passive demo trials (default: 2000)
/demo_ResponseFeedback: the feedback duration (in ms) during the passive demo trials (default: 1000)

/readyDuration: the duration (in ms) of the get-ready trial (default: 3000ms)
/selectionTime: the response timeout (in ms) for making a selection during a test trial (default: 1500ms)
/feedbackDuration: the duration (in ms) of the feedback during a test trial (default: 1000ms)

/leftKey: the left response button (default: "E") - this key is attached to option A
/rightKey: the right response button (default: "I") - this key is attached to option B
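
Illustration (not part of the Inquisit script): the payoff schedule described above can be sketched in
Python as follows. This is a minimal sketch that assumes each trial's flip is an independent random draw
with probability flipRate; the script's own mechanism may differ. The function names simulate_payoffs and
flip_rate_is_valid are hypothetical, and the defaults mirror /flipRate, /start_A, and /start_B.

```python
import random


def simulate_payoffs(n_trials, flip_rate=0.075, start_a=10, start_b=20, seed=None):
    """Return one (value_a, value_b, flipped) tuple per trial.

    Assumption: flips are independent draws with probability flip_rate.
    """
    rng = random.Random(seed)
    value_a, value_b = start_a, start_b
    history = []
    for trial in range(n_trials):
        # The first trial cannot be a flip trial.
        flipped = trial > 0 and rng.random() < flip_rate
        if flipped:
            # Add 20 points to the lower-paying option, which reverses
            # the relative payoffs while keeping the 10-point gap.
            if value_a < value_b:
                value_a += 20
            else:
                value_b += 20
        history.append((value_a, value_b, flipped))
    return history


def flip_rate_is_valid(flip_rate):
    """Check that flip_rate * 1000 is an integer (allowing for float rounding)."""
    scaled = flip_rate * 1000
    return abs(scaled - round(scaled)) < 1e-9
```

Calling simulate_payoffs(300, seed=1) returns 300 tuples; summing the third element of each tuple gives
the number of flips in that run, which varies from run to run around the expected 7.5 per 100 trials.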