I want to apply theano.scan over multiple parameters in a hierarchical model.

The code works great for a single model, however, I’m not sure how to convert the model to a hierarchical model (requiring additional looping).

Here is the code for the single model:

```
with pm.Model() as bayesian_reinforcement_learning:
# data
actions_ = pm.Data('actions', actions)
rewards_ = pm.Data('rewards', rewards)
# priors
alpha = pm.Beta('alpha', 1, 1)
beta = pm.HalfNormal('beta', 10)
# model
# init Qs
Qs = 0.3333 * tt.ones(3, dtype="float64")
# Compute the Q values for each trial
Qs, _ = theano.scan(
fn=lambda action, reward, Qs, alpha: tt.set_subtensor(Qs[action], Qs[action] + alpha * (reward - Qs[action])),
sequences=[actions_, rewards_],
outputs_info=[Qs],
non_sequences=[alpha])
BQs = beta*Qs
pi = tt.nnet.softmax(BQs)
like = pm.Categorical('like', p=pi, observed=actions_)
# opt
trace = pm.sample(tune=1000, chains=2, target_accept=0.85) # tune=5000, target_accept=0.9 njobs=4 ,target_accept=0.99
idata = az.from_pymc3(trace)
pm.traceplot(idata)
```

& here is the structure for the hierarchical model:

```
with pm.Model(coords=coords) as hierarchical_bayes_RL:
# Data
subject_idx = pm.Data("subject_idx", subject_idxs, dims="sub_id")
# data
actions_ = pm.Data('actions', df.actions.values)
rewards_ = pm.Data('rewards', df.rewards.values)
# Hyperpriors
a_alpha = pm.HalfNormal("a_alpha", 5.0)
b_alpha = pm.HalfNormal("b_alpha", 5.0)
mu_b = pm.HalfNormal('mu_b', 10.0)
sigma_b = pm.HalfNormal("sigma_b", 5.0)
# Intercept for each county, distributed around group mean mu_a
# Above we just set mu and sd to a fixed value while here we
# plug in a common group distribution for all a and b (which are
# vectors of length n_counties).
# parameters
alpha = pm.Beta('alpha', a_alpha, b_alpha, dims="subject")
beta = pm.Normal("beta", mu=mu_b, sigma=sigma_b, dims="subject")
# model
# init Qs
# 3 Qs values per parameter are required...?
Qs = 0.3333 * tt.ones((3, 100), dtype="float64")
# Compute the Q values for each trial
# Qs, _ = theano.scan(
# # fn=lambda action, reward, Qs, alpha: Qs[action],
# fn = lambda action, reward, Qs, alpha: tt.set_subtensor(Qs[action], Qs[action] + alpha * (reward - Qs[action])),
# sequences=[actions_[subject_idx], rewards_[subject_idx]],
# outputs_info=[Qs],
# non_sequences=[alpha]
# )
BQs = beta[subject_idx] * Qs[:, subject_idx]
pi = tt.nnet.softmax(BQs)
like = pm.Categorical('P[A]', p=pi, observed=actions_, dims="sub_id")
pm.model_to_graphviz(hierarchical_bayes_RL)
```

If you have a better solution that is completely vectorized that would be awesome too! Anything that works, I’m new to theano…

Thanks!

2

Not sure I grok what is happening here. Is this a recurrence equation? I.e.,

`Q_{n+1} = f(Q_{n}, reward_{n}, action_{n}, alpha)`

. Perhaps it might help to describe the model (edit question).Hi Merv! Yes the recursion is part of the model. The first model works fine, however the second monday (where I attempt to convert the first mode to a hierarchical model) is invalid.. I’lm looking for assistance with making this conversion.