___________________________________________________________________________________________________________________	

										Four Armed Bandit Task
___________________________________________________________________________________________________________________

Script Author: Katja Borchert, Ph.D. (katjab@millisecond.com) for Millisecond Software, LLC
Date: 02-14-2018
Last updated: 03-30-2020 by K. Borchert (katjab@millisecond.com) for Millisecond Software, LLC

Script Copyright © 03-30-2020 Millisecond Software

___________________________________________________________________________________________________________________
BACKGROUND INFO 	
___________________________________________________________________________________________________________________	
This script implements a 'Four Armed Bandit Task', a paradigm to study the conflict between 
the opposing demands of gathering new information and exploiting known information to maximize 
profits in human decision making.
The implemented procedure is similar to the one described in:

Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B. & Dolan, R.J. (2006).
Cortical substrates for exploratory decisions in humans.
Nature. 2006 June 15; 441(7095): 876–879.

and Supplemental Materials (Refer to Web version on PubMed Central for supplementary material)

___________________________________________________________________________________________________________________
TASK DESCRIPTION	
___________________________________________________________________________________________________________________
Participants have to choose between four slots that are each tied to different payoffs.
The payoffs of each slot fluctuate from trial to trial. Specifically,
the payoffs for each slot are drawn from a Gaussian distribution (standard deviation = 4)
around mean M, rounded to the nearest integer (enforced range 1-100).
At the beginning of each selection trial, the mean for each slot diffuses in a decaying 
Gaussian random walk and the new payoffs are drawn from their respective (updated) distributions (Daw et al, 2006).
Participants are instructed to maximize their payoffs.

___________________________________________________________________________________________________________________	
DURATION 
___________________________________________________________________________________________________________________	
The default set-up of the script takes approximately 30 minutes to complete.

___________________________________________________________________________________________________________________	
DATA FILE INFORMATION 
___________________________________________________________________________________________________________________
The default data stored in the data files are:

(1) Raw data file: 'fourarmedbandittask_raw*.iqdat' (a separate file for each participant)*

build:								the specific Inquisit version (the 'build') that was run
computer.platform:					the platform the script was run on (win/mac/ios/android)
date, time, 						date and time script was run 
subject, group, 					the current subject/group number
script.sessionid:					the current session id

blockcode, blocknum:				the name and number of the current block (built-in Inquisit variable)
trialcode, trialnum: 				the name and number of the currently recorded trial (built-in Inquisit variable)
										Note: trialnum is a built-in Inquisit variable; it counts all trials run; even those
										that do not store data to the data file such as feedback trials. Thus, trialnum 
										may not reflect the number of main trials run per block. 
																				
values.countRounds:					counts the number of rounds played
values.TotaltrialCount:				running count of all test trials run across rounds
values.trialCount_perRound: 		running count of test trials per round
values.noResponseCount:				running count of all test trials where no selection was made (in time) across rounds									
									
response:							the participant's response
									=> trial.selection: the selected slot ('slot1' (red), 'slot2' (green), 'slot3' (blue), 'slot4' (yellow) )
									
latency: 							the response latency (in ms) of the current trial (for slot selections, see values.selectionRT)
values.selectionRT:					the latency (ms) of selecting the current slot; measured from onset of all four 'slots'
values.selectedSlot:					the selected slot: 1, 2, 3, 4

values.choice:						0 = no selection made (timed-out)
									1 = exploitative selection made (participant selected the slot with the highest known payoff at this point)
									2 = exploratory selection made (participant selected a slot that did not have the highest known payoff at this point)
									Note: values.choice is determined in 'trial.selection' based on the last known payoffs seen before making the choice
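
The coding of values.choice can be illustrated with a short Python sketch. This is not the script's own code: the function name, the dict argument and the tie handling (the first slot with the maximum wins) are assumptions made for illustration only.

```python
def classify_choice(selected_slot, last_seen_payoffs):
    """Return 0 (no selection), 1 (exploitative) or 2 (exploratory).

    selected_slot: 1-4, or None if the trial timed out (hypothetical convention).
    last_seen_payoffs: dict mapping slot number -> last payoff seen for that slot.
    """
    if selected_slot is None:
        return 0
    # Slot with the highest payoff known BEFORE the current result is revealed.
    # Assumption: on ties, the first slot holding the maximum counts as "best".
    best_slot = max(last_seen_payoffs, key=last_seen_payoffs.get)
    return 1 if selected_slot == best_slot else 2


known = {1: 20, 2: 35, 3: 61, 4: 48}
print(classify_choice(3, known))     # 1 (slot 3 has the highest known payoff)
print(classify_choice(1, known))     # 2
print(classify_choice(None, known))  # 0
```

The classification depends only on payoffs the participant has already seen, which is why it can differ from values.highestPayOffSelected.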

values.currenthighestSeenPayOffSlot:	the slot that has the currently known highest payoff of the four options

values.currenthighestSeenPayOff:		the currently known highest payoff
										Note: 
										trial.selection stores the value that is known when the selection is made (but before the result has been revealed)
										trial.iti stores the potentially updated value that is known at the end of the trial
						
values.lastSeenPayOff1:					stores the last seen payOff for 'slot1'
values.lastSeenPayOff2:					stores the last seen payOff for 'slot2'
values.lastSeenPayOff3:					stores the last seen payOff for 'slot3'
values.lastSeenPayOff4:					stores the last seen payOff for 'slot4'
											Note: values.currenthighestSeenPayOff is calculated as the highest of these 4 values
											Note: all 'lastSeenPayOffs' are updated at the END of each trial sequence
											trial.selection: stores the known payOffs at the start of the trial sequence
											trial.iti: stores the known payOffs at the end of the trial sequence

values.total:							the total points won
values.currentPayoff:					stores the current payoff 'paid' based on the selected slot

values.highestPayOffSelected:			1 = the slot with the currently highest payoff was selected
										2 = a slot with a lesser payoff was selected
										0 = no choice was made
										Note: at time of making their choice, participants were not aware of any potential changes in relative 
										payoffs.

values.currentHighestPayOffSlot:		the slot with the currently highest payoff (is not necessarily known by player)

payOff1:								current payoff for selecting 'slot1'			
payOff2:								current payoff for selecting 'slot2'
payOff3:								current payoff for selecting 'slot3'
payOff4:								current payoff for selecting 'slot4'

mean:								helper variable to calculate the current means for the four slots
mean1:								the current mean payoff for 'slot1'; used to calculate payoff1
mean2:								the current mean payoff for 'slot2'; used to calculate payoff2
mean3:								the current mean payoff for 'slot3'; used to calculate payoff3
mean4:								the current mean payoff for 'slot4'; used to calculate payoff4


(2) Summary data file: 'fourarmedbandittask_summary*.iqdat' (a separate file for each participant)*

computer.platform:					the platform the script was run on (win/mac/ios/android)
script.startdate:					date script was run
script.starttime:					time script was started
script.subjectid:					assigned subject id number
script.groupid:						assigned group id number
script.sessionid:					assigned session id number
script.elapsedtime:					time it took to run script (in ms); measured from onset to offset of script
script.completed:					0 = script was not completed (prematurely aborted); 
									1 = script was completed (all conditions run)

values.TotaltrialCount:				number of test trials run across rounds
values.noResponseCount:				number of test trials where no selection was made (in time) across rounds
expressions.propNoResponses:		proportion no responses (no choice was made)
expressions.prop_highestPayOff:		proportion highest payOff option selected (of all trials in which a choice was made)
expressions.prop_exploitative:		proportion exploitative choices (of all trials in which a choice was made)
									Note: 'exploitative' selection (in this script): 
									participant selected the slot with the highest known payoff at this point	
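
As a rough illustration of how the three summary proportions relate to the trial-level codes, the following Python sketch recomputes them from lists of values.choice and values.highestPayOffSelected codes. The function name and the list inputs are hypothetical, and the sketch assumes that propNoResponses is taken over all test trials.

```python
def summarize(choices, highest_selected):
    """Recompute the summary proportions from per-trial codes.

    choices: values.choice codes (0 = none, 1 = exploitative, 2 = exploratory)
    highest_selected: values.highestPayOffSelected codes (0 = none, 1 = highest, 2 = lesser)
    """
    total = len(choices)
    no_response = choices.count(0)
    responded = total - no_response
    return {
        # Assumption: denominator is the number of all test trials.
        "propNoResponses": no_response / total if total else 0.0,
        # Denominator: only trials in which a choice was made.
        "prop_exploitative": choices.count(1) / responded if responded else 0.0,
        "prop_highestPayOff": highest_selected.count(1) / responded if responded else 0.0,
    }


print(summarize([1, 2, 0, 1, 1], [1, 2, 0, 2, 1]))
# {'propNoResponses': 0.2, 'prop_exploitative': 0.75, 'prop_highestPayOff': 0.5}
```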
									
									
* separate data files: to change to one data file for all participants (on Inquisit Lab only), go to section
"DATA" and follow further instructions

___________________________________________________________________________________________________________________	
EXPERIMENTAL SET-UP 
___________________________________________________________________________________________________________________	

(1) Demo: 5 trials (runs with different payoff values than test trials)
(2) Test: 2 rounds with 150 trials each; break in between

Trial Set-Up:

4 slots: represented by a red (1), green (2), blue (3) and yellow (4) box; 
slots are selected by mouse click

Trial Sequence:
slot selection (max. 1500ms) -> animated slot (2000ms) -> reveal of selected slot's payoff (1000ms) -> blank screen (1000ms)
(if no slot is selected within 1500ms -> error feedback (4200ms) -> blank screen (1000ms))
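
The trial timings add up as follows. This Python sketch is simple arithmetic on the default durations of this script; it assumes every selection uses the full 1500ms window and leaves out the demo, instructions and 'get ready' screens, so it only approximates the running time.

```python
# Default timing parameters (ms) of this script
SELECTION_TIMEOUT = 1500
TIMEOUT_WARNING_DURATION = 4200
ANIMATION_DURATION = 2000
OUTCOME_DURATION = 1000
ITI = 1000
BREAK_DURATION = 60000

ROUNDS = 2
TRIALS_PER_ROUND = 150

# Trial with a selection (assumption: the full selection window is used)
response_trial_ms = SELECTION_TIMEOUT + ANIMATION_DURATION + OUTCOME_DURATION + ITI
# Trial without a selection: timeout, error feedback, blank screen
timeout_trial_ms = SELECTION_TIMEOUT + TIMEOUT_WARNING_DURATION + ITI

total_ms = ROUNDS * TRIALS_PER_ROUND * response_trial_ms + (ROUNDS - 1) * BREAK_DURATION
print(response_trial_ms, timeout_trial_ms)  # 5500 6700
print(total_ms / 60000)                     # 28.5 (minutes)
```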

PayOff Calculations:
PayOffs for each slot calculated at beginning of each trial.selection (with i = 'slots' 1-4):

payOff(i) = round(randgaussian(values.newmean(i), 4)); rounded to the nearest integer 
with Max PayOff = 100 and Min PayOff = 1 (see Daw et al, 2006, Supplementary Methods),

additional constraint implemented in this script: the calculated payOffs have to be different from each other
(if any two of the four payoffs turn out to be identical, new payoffs are drawn from the same means)

with:
newmean(i) = 0.9836*previousmean(i) + (1-0.9836)*50 + randgaussian(0, 2.8) (see Daw et al, 2006 for further explanation, Supplementary Methods)

Notes: 
* function randgaussian(mean, standarddeviation) samples a value from the normal distribution with the given mean
and standarddeviation
* initial means of 20, 40, 60, 80 (editable parameters) are assigned randomly to the four 'slots'
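
The payoff calculations above can be sketched in Python as follows. This is an illustration, not the Inquisit code: the function names are hypothetical, random.gauss stands in for randgaussian, the 1-100 range is enforced here by clamping (the script may enforce it differently), and Python's round() rounds exact halves to the nearest even integer.

```python
import random

DECAY = 0.9836        # decay parameter from the newmean formula
DECAY_CENTER = 50     # value the means decay toward
DIFFUSION_SD = 2.8    # SD of the random-walk noise
PAYOFF_SD = 4         # SD of the payoff distribution
MIN_PAYOFF, MAX_PAYOFF = 1, 100


def update_mean(previous_mean, rng):
    """One step of the decaying Gaussian random walk for a single slot."""
    return DECAY * previous_mean + (1 - DECAY) * DECAY_CENTER + rng.gauss(0, DIFFUSION_SD)


def draw_payoffs(means, rng):
    """Draw one payoff per slot; redraw from the same means until all differ."""
    while True:
        payoffs = [
            # Assumption: out-of-range draws are clamped to 1-100.
            min(MAX_PAYOFF, max(MIN_PAYOFF, round(rng.gauss(m, PAYOFF_SD))))
            for m in means
        ]
        if len(set(payoffs)) == len(payoffs):
            return payoffs


rng = random.Random(1)      # seeded only to make the sketch repeatable
means = [20, 40, 60, 80]    # default start means
rng.shuffle(means)          # random assignment to slot1-slot4
for trial in range(3):
    means = [update_mean(m, rng) for m in means]
    print(trial + 1, draw_payoffs(means, rng))
```

Because the decay term pulls every mean toward 50 on each trial, slots that start far apart drift closer together over time, so the best slot changes during a round.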

___________________________________________________________________________________________________________________	
STIMULI
___________________________________________________________________________________________________________________
provided by Millisecond Software - can be edited under section Editable Stimuli

___________________________________________________________________________________________________________________	
INSTRUCTIONS 
___________________________________________________________________________________________________________________
provided by Millisecond Software (not original to Daw et al, 2006); can be edited under section Editable Instructions
* main instructions are provided via *.htm files. To edit the instructions, replace the existing htm files or edit 
the provided ones using a simple text editor such as Notepad (win) or TextEdit (Mac)

___________________________________________________________________________________________________________________	
EDITABLE CODE 
___________________________________________________________________________________________________________________	
check below for (relatively) easily editable parameters, stimuli, instructions etc. 
Keep in mind that you can use this script as a template and therefore always "mess" with the entire code 
to further customize your experiment.

The parameters you can change are:

/skipTotal:						true (1): the total points won are NOT presented on screen (default)
								false (0): the total points won are presented on screen 
								
/startMean1:					the first mean value to use to calculate payoffs for the first trial (default: 20)
/startMean2:					the second mean value to use to calculate payoffs for the first trial (default: 40)
/startMean3:					the third mean value to use to calculate payoffs for the first trial (default: 60)
/startMean4:					the fourth mean value to use to calculate payoffs for the first trial (default: 80)
								Note: these means are randomly assigned to 'slot1'-'slot4'

/slotSize:						the proportional size of the four 'slots' (proportional to canvas) (default: 40%)

/selectionTimeout:				the response timeout (in ms) for making a choice (default: 1500ms)
/timeoutWarningDuration:		the duration (in ms) of the red X to signal a no response timeout (default: 4200ms)
/animationDuration:				the duration (in ms) of the animated slot (the wait time until result is revealed) (default: 2000ms)
/outcomeDuration:				the duration (in ms) of the result (the reveal of the points won) (default: 1000ms)

/iti:							the intertrial interval in ms (default: 1000)
								Note: in this script, the iti is fixed (compare to: Daw et al, 2006, Supplementary Methods)
								
/breakDuration:					the duration (in ms) of the rest period between rounds (default: 60000ms)
/readyDuration:					'get ready' duration (in ms) (default: 2000ms)