regret model #1

Open · wants to merge 3 commits into master
8 changes: 7 additions & 1 deletion vba/bandit_vba.m
@@ -42,6 +42,7 @@
use_first_150 = 0;



%% Where to look for data
%Quick username check, and path setting

@@ -65,11 +66,16 @@
options.GnFigs = 0;
end
%% set up dim defaults
if valence && ~disappointment
if valence && ~disappointment && ~regret
    n_theta = 3; %Number of evolution params (AlphaWin AlphaLoss Decay)
    options.inF.valence = 1;
    options.inF.disappointment = 0;
    options.inF.regret = 0; %initialize so f_bandit_Qlearn can test in.regret

elseif valence && regret
    n_theta = 4; %Number of evolution params (AlphaWin AlphaLoss Decay OmegaR)
    options.inF.valence = 1;
    options.inF.disappointment = 0; %initialize so f_bandit_Qlearn can test in.disappointment
    options.inF.regret = 1;

elseif valence && disappointment
    n_theta = 4; %Number of evolution params (AlphaWin AlphaLoss Decay Omega)
options.inF.valence = 1;
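For reference, the regret branch above declares four evolution parameters, which `f_bandit_Qlearn.m` later squashes through a sigmoid so each stays in (0, 1). A minimal Python sketch of that bounding, with hypothetical raw `theta` values:

```python
import math

def sigmoid(t):
    # Bounds a raw parameter to (0, 1), mirroring 1./(1+exp(-theta)) in the MATLAB code.
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical raw theta for the regret model (n_theta = 4).
theta = [0.0, -1.0, 2.0, 0.5]
alpha_win, alpha_loss, decay, omega_r = (sigmoid(t) for t in theta)
```

A raw value of 0 maps to 0.5, and large positive or negative values saturate toward 1 or 0, so the optimizer can search over unbounded `theta` while the model sees valid rates.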
11 changes: 10 additions & 1 deletion vba/f_bandit_Qlearn.m
@@ -24,18 +24,25 @@
end



if in.fix_decay %This is the fixed version
%if in.decay %This logic is somewhat confusing... leave for relic vba data for now
alpha_win = 1./(1+exp(-theta(1))); % learning rate is bounded between 0 and 1.
alpha_loss = 1./(1+exp(-theta(2))); % learning rate is bounded between 0 and 1.
decay=0.5;
elseif in.valence && ~in.disappointment
elseif in.valence && ~in.disappointment && ~in.regret
%Params
alpha_win = 1./(1+exp(-theta(1))); % learning rate is bounded between 0 and 1.
alpha_loss = 1./(1+exp(-theta(2))); % learning rate is bounded between 0 and 1.
decay = 1./(1+exp(-theta(3))); % decay is bounded between 0 and 1.
% loss_decay = 1./(1+exp(-theta(3))); % learning rate is bounded between 0 and 1.
% win_decay = 1./(1+exp(-theta(4))); % learning rate is bounded between 0 and 1.
elseif in.valence && in.regret
%Params
alpha_win = 1./(1+exp(-theta(1))); % learning rate is bounded between 0 and 1.
alpha_loss = 1./(1+exp(-theta(2))); % learning rate is bounded between 0 and 1.
decay = 1./(1+exp(-theta(3))); % decay is bounded between 0 and 1.
omega_r = 1./(1+exp(-theta(4))); % regret is bounded between 0 and 1.
elseif in.valence && in.disappointment
%Params
alpha_win = 1./(1+exp(-theta(1))); % learning rate is bounded between 0 and 1.
@@ -56,6 +63,8 @@

if in.disappointment
r = (1 - omega)*r + omega*(stake - r);
elseif in.regret
r = omega_r*r + (1-omega_r)*(r-max(x));
end

fx = zeros(length(x),1);
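The regret branch rewrites the reward as a weighted blend of the raw outcome and its shortfall relative to the best current value estimate, `max(x)`. A Python sketch of that one-line update (function name and example values are illustrative, not from the PR):

```python
def regret_adjusted_reward(r, x, omega_r):
    # Mirrors the MATLAB line: r = omega_r*r + (1-omega_r)*(r - max(x));
    # omega_r = 1 keeps the raw reward; omega_r = 0 uses pure regret,
    # i.e. the reward minus the best available value estimate.
    return omega_r * r + (1.0 - omega_r) * (r - max(x))
```

Note the weighting is flipped relative to the disappointment branch, where `omega` scales the counterfactual term `(stake - r)` rather than the raw reward; if that asymmetry is unintentional it may be worth flagging in review.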