User Manual: Inquisit Two-Armed Bandit Task

___________________________________________________________________________________________________________________	

										Two Armed Bandit Task
___________________________________________________________________________________________________________________

Script Author: Katja Borchert, Ph.D. (katjab@millisecond.com) for Millisecond Software, LLC
Date: 01-29-2018
last updated:  02-17-2022 by K. Borchert (katjab@millisecond.com) for Millisecond Software, LLC

Script Copyright © 02-17-2022 Millisecond Software

___________________________________________________________________________________________________________________
BACKGROUND INFO 	
___________________________________________________________________________________________________________________
This script implements the 2-armed bandit task, a decision-making game in which participants trade off exploiting 
a known resource against exploring a new one, as described in:

Knox, W.B., Otto, A.R., Stone, P. & Love, B.C. (2012). The nature of belief-directed exploratory choice in human 
decision-making. Frontiers in Psychology, Volume 2, Article 298, 1-12

___________________________________________________________________________________________________________________
TASK DESCRIPTION	
___________________________________________________________________________________________________________________
Participants select between 2 options ("bandits") that are tied to different payoff schedules.
One option is always worth 10 points more than the other. On each trial, the payoff schedules change with a fixed
probability (flipRate). When they change, 20 points are added to the lower-paying option, which flips the relative payoffs.
Participants are told that their final reward is tied to the number of times they selected the higher payoff choice.
2 phases:
1. Passive Observation Phase: participants watch the computer select choices for 300 trials (default).
Payoff changes are made explicit to participants by presenting a text message on screen.
Every 100 trials (starting at trial 200), participants are asked to estimate how many reversals in payoff
they expect to observe during the next set of 100 trials
2. Active Game: participants play the game for 300 trials (default)
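
The flip rule described above can be sketched in a few lines of Python. This is an illustration only, not the
Inquisit script code: the function name step_payoffs and its arguments are hypothetical, and the start values
and flip rate are the defaults listed under EDITABLE CODE.

```python
import random

def step_payoffs(value_a, value_b, flip_rate, rng, allow_flip=True):
    """Advance the payoff schedule by one trial (hypothetical helper).

    With probability flip_rate, 20 points are added to the lower-paying
    option, so that it becomes the option that pays 10 points more.
    Returns (value_a, value_b, flipped).
    """
    flipped = allow_flip and rng.random() < flip_rate
    if flipped:
        if value_a < value_b:
            value_a += 20
        else:
            value_b += 20
    return value_a, value_b, flipped

rng = random.Random(1)
a, b = 10, 20  # default start values (start_A, start_B)
for trial in range(1, 301):
    # the first trial is never a flip trial
    a, b, flipped = step_payoffs(a, b, 0.075, rng, allow_flip=(trial > 1))
    # the two options always stay exactly 10 points apart
    assert abs(a - b) == 10
```

Because each flip adds 20 points to the option that was 10 points behind, both payoffs drift upward over
the course of the game while their difference stays at 10 points.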

___________________________________________________________________________________________________________________	
DURATION 
___________________________________________________________________________________________________________________	
The default setup with 300 trials in each phase takes about 35 minutes to run.

___________________________________________________________________________________________________________________	
DATA FILE INFORMATION 
___________________________________________________________________________________________________________________
The default data stored in the data files are:

(1) Raw data file: 'twoarmedbandittask_raw*.iqdat' (a separate file for each participant)*

build:							the specific Inquisit version (the 'build') that was run
computer.platform:				the platform the script was run on (win/mac/ios/android)
date, time: 					date and time script was run 
subject, group: 				the current subject and group number
session:						the current session id

blockcode, blocknum:			the name and number of the current block (built-in Inquisit variable)
trialcode, trialnum: 			the name and number of the currently recorded trial
										(Note: not all trials that are run might record data; 
										by default data is collected unless /recorddata = false is set for a particular trial/block;
										in this script trial.selectBandit and trial.summary store data to the data file.
										Trial.summary stores the comprehensive summary of all relevant values at the end of each
										'successful' (= a choice was made) trial.)
																												
(parameter) flipRate:			p(flip) = the fixed probability with which the payoff schedules will switch and the former
								lower payoff choice will be the higher payoff choice (until the next flip)
countSelections:				counts the number of choices A or B made (excludes trials with no responses)	

attempts:						running total of attempts for the current trial 
								(Note: if participant takes too long to decide, the current trial terminates and is repeated after feedback. 
								Those repeats are called 'attempts' in this script)
								
flip:							1 = a flip of the pay-offs occurred during this trial; 0 = no payoff changes occurred	
countFlipsPractice:				running count of the payoff flips that occurred during phase 1 (passive observation)

valueA:							stores the value of choice A (NOTE: changes in values after flips are stored in trial.summary only)
valueB:							stores the value of choice B (NOTE: changes in values after flips are stored in trial.summary only)
								trial.selectBandit: stores the pre-flip values (pre-flip values are used to assess whether participants selected the known higher payoff)
								trial.summary: stores the post-flip values
										
higherPayOff:					stores the choice with the higher payoff ("A" or "B")
								!Note: this variable is only updated in trial.summary, see valueA and valueB

selectionRT:					stores the response time (in ms) of selecting the last response; measured from start of last trial.selectBandit		
previousArm:					stores the previously (preceding trial) selected choice ("A" or "B")						

response:						the participant's response during current trial
								trial.selectBandit: 18 = E; 23 = I; 0 = no response
								trial.summary: 0 = no response

selectedArm:					stores the selected choice ("A" or "B")

selectedHigherPayOff:			1 = participant selected the currently higher payoff 
								(note: if there was a payoff flip, participant wasn't aware of the flip at time of making response); 
								0 = otherwise
								
choice:							0 = no choice made
								1 = exploitative choice (the choice selected was the one attached to the highest seen payoff up to this point);
								2 = exploratory choice (participant selected the other choice)

highestSeenPayOff:				stores the current (at time of data saving) highest payoff that participant has seen
									trial.selectBandit: stores the highestSeenPayOff from perspective of participant at time of making response
									trial.summary: stores the highestSeenPayOff from perspective of participant at the end of the trial (after the total points feedback)
									(Note: the highestSeenPayOff can change from trial.selectBandit to trial.summary depending on whether or not
									there was a payoff flip)
								
consecutiveSameChoice:			running total of selecting the same choice consecutively (resets after switch)

selectionChange:				1 = participant changed selection; 
								0 = otherwise
								
totalPoints:					stores the current total points earned									


(2) Summary data file: 'twoarmedbandittask_summary*.iqdat' (a separate file for each participant)*

inquisit.version:			Inquisit version run
computer.platform:			the platform the script was run on (win/mac/ios/android)
startDate:					date script was run
startTime:					time script was started
subjectid:					assigned subject id number
groupid:					assigned group id number
sessionid:					assigned session id number
elapsedTime:				time it took to run script (in ms); measured from onset to offset of script
completed:					0 = script was not completed (prematurely aborted); 
							1 = script was completed (all conditions run)
							
(parameter) flipRate:		p(flip) = the fixed probability with which the payoff schedules will switch and the former
							lower payoff choice will be the higher payoff choice until the next flip
*************
Phase 1:
*************
meanFliprateEstimate:		the estimated flip rate, based on the mean number of estimated flips per 100 trials
meanFlipEstimate:			the mean estimated number of flips per 100 trials
sumFlipEstimate:			the total number of estimated flips for all phase 1 trials (default: 300)
countFlipsPractice:			the total number of actual flips for all phase 1 trials (default: 300)

*************
Phase 2:
*************
propHigherPayOffs:			proportion of selecting the higher payoff option
propExploit:				proportion of selecting the option with the known higher payoff
							(Note: at the time of making the choice the participant does not know whether a flip occurred
							and can only use the information gathered up to this point)
									
propExplore:				proportion of selecting the option with the known lower payoff
							(Note: at the time of making the choice the participant does not know whether a flip might have occurred
							and can only use the information gathered up to this point)
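
As an illustration of how the phase 2 proportions relate to the raw data columns, the following Python sketch
recomputes them from a handful of made-up trial.summary rows. It is a sketch under stated assumptions, not the
script's own computation: the function name summarize is hypothetical, the rows are invented, the trialcode
value "summary" is assumed, and trials without a choice (choice = 0) are assumed to be excluded from the
denominator.

```python
def summarize(rows):
    """Recompute the phase 2 proportions from raw rows (hypothetical helper).

    Assumes each row is a dict holding the raw-data columns 'trialcode',
    'selectedHigherPayOff' and 'choice' for phase 2 trials only.
    """
    # keep only summary rows in which a choice was made (assumption)
    done = [r for r in rows if r["trialcode"] == "summary" and r["choice"] != 0]
    n = len(done)
    if n == 0:
        return None
    return {
        "propHigherPayOffs": sum(r["selectedHigherPayOff"] for r in done) / n,
        "propExploit": sum(r["choice"] == 1 for r in done) / n,  # 1 = exploitative
        "propExplore": sum(r["choice"] == 2 for r in done) / n,  # 2 = exploratory
    }

# invented example rows, for illustration only
rows = [
    {"trialcode": "selectBandit", "selectedHigherPayOff": 0, "choice": 0},
    {"trialcode": "summary", "selectedHigherPayOff": 1, "choice": 1},
    {"trialcode": "summary", "selectedHigherPayOff": 0, "choice": 1},
    {"trialcode": "summary", "selectedHigherPayOff": 1, "choice": 2},
    {"trialcode": "summary", "selectedHigherPayOff": 1, "choice": 1},
]
result = summarize(rows)
```

Under these assumptions propExploit and propExplore sum to 1, since every counted trial is coded as either
exploitative or exploratory.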

* separate data files: to change to one data file for all participants (on Inquisit Lab only), go to section
"DATA" and follow further instructions

___________________________________________________________________________________________________________________	
EXPERIMENTAL SET-UP 
___________________________________________________________________________________________________________________

Two phases:
1. Passive Observation Phase: participants watch the computer select choices for 300 trials (default).
Every 100 trials (starting at trial 200), participants are asked to estimate how many reversals in payoff
they expect to observe during the next set of 100 trials.

Trial Sequence:
choice ("Choose") (2000ms, editable) -> selected choice highlighted and total updated (1000ms, editable)
If the payoff schedule has changed that trial, the choice trial displays "Changed" in red font


2. Active Game Phase: participants play the game for 300 trials (default)

Trial Sequence
choice ("Choose") (max. 1500ms, editable) -> if no response: feedback, and the trial is repeated until a response is made ->
point update (1000ms, editable) -> etc.

The default flipRate is 0.075 (editable): on each trial there is a 7.5% chance that the payoffs flip
(so you can expect about 7.5 flips per 100 trials on average)
Notes: 
* in this script, the flipRate is a true probability.
* the first trial cannot be a flip trial
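
To illustrate what a per-trial flip probability means in practice, the Python sketch below simulates many blocks
of 100 trials and averages the number of flips. It is a rough illustration, not the script's implementation:
the function name count_flips and the seed are hypothetical, and each block is treated as if it started with the
first trial of a phase, so only 99 of its trials can be flip trials.

```python
import random

def count_flips(n_trials, flip_rate, rng):
    """Count payoff flips over n_trials; trial 1 is never a flip trial."""
    return sum(
        1
        for trial in range(1, n_trials + 1)
        if trial > 1 and rng.random() < flip_rate
    )

rng = random.Random(2012)  # arbitrary seed
runs = [count_flips(100, 0.075, rng) for _ in range(10_000)]
mean_flips = sum(runs) / len(runs)
# the expected value for such a block is 0.075 * 99 = 7.425
print(f"mean flips per 100 trials: {mean_flips:.2f}")
```

Individual blocks vary around this average, so a participant may well observe noticeably fewer or more than
7 flips in any given set of 100 trials.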

___________________________________________________________________________________________________________________	
STIMULI
___________________________________________________________________________________________________________________
provided by Millisecond Software - can be edited under section Editable Stimuli

___________________________________________________________________________________________________________________	
INSTRUCTIONS 
___________________________________________________________________________________________________________________
provided by Millisecond Software - can be edited under section Editable Instructions

___________________________________________________________________________________________________________________	
EDITABLE CODE 
___________________________________________________________________________________________________________________	
Check below for (relatively) easily editable parameters, stimuli, instructions, etc. 
Keep in mind that you can use this script as a template and can therefore always "mess" with the entire code 
to further customize your experiment.

The parameters you can change are:

/flipRate:					p(flip) = the fixed probability with which the payoff schedules will switch and the former
							lower payoff choice will be the higher payoff choice until the next flip (default: 0.075)
							Note: in order for the script to run without error messages, make sure that flipRate*1000 results in an integer number

/start_A:					the start value of option A (default: 10)
/start_B:					the start value of option B (default: 20)

/nrDemotrials:				the number of passive demo trials (default: 300)
/nrTesttrials:				the number of active test trials (default: 300)
							Note: Knox et al. (2012) ran 500 trials in each phase

/demoSelectionTime:			the time (in ms) it takes computer to make a choice during the passive demo trials (default: 2000)
/demo_ResponseFeedback:	the feedback duration (in ms) during the passive demo trials (default: 1000)
									
/readyDuration:				the duration (in ms) of the get-ready trial (default: 3000ms)

/selectionTime:				the response timeout (in ms) of making a selection during a test trial (default: 1500ms)
/feedbackDuration:			the duration (in ms) of the feedback during a test trial (default: 1000ms)

/leftKey:					the left response button (default: "E") - this key is attached to option A
/rightKey:					the right response button (default: "I") - this key is attached to option B
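
Before editing flipRate, it can help to check a candidate value against the note above (flipRate*1000 must be an
integer). The following Python sketch is one way to do that outside of Inquisit. The function name
check_flip_rate is hypothetical, and the additional range check (0 to 1) is an assumption that follows from
flipRate being a probability.

```python
from decimal import Decimal

def check_flip_rate(flip_rate):
    """Return True if flip_rate lies in [0, 1] and flip_rate * 1000 is an integer."""
    # Decimal(str(...)) avoids binary floating-point rounding artifacts
    rate = Decimal(str(flip_rate))
    return 0 <= rate <= 1 and (rate * 1000) % 1 == 0

for candidate in (0.075, 0.1, 0.0755, 1.5):
    print(candidate, check_flip_rate(candidate))
```

In other words, flipRate should be specified with at most three decimal places.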