Deep in the Weeds: Complex Hierarchical Models in PyMC3

If you've read Thomas Wiecki's awesome introduction to hierarchical models (here: http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/), you have probably started figuring out how to apply the technique to your own work. A challenge you may run into pretty quickly is how to model N different levels. Even more challenging is how to model N levels while keeping the model vectorized. This post will be fairly terse and will illustrate how to actually set this up in PyMC3.

I've generated a small dataset to act as an example here. Let's pretend we have data on how much money the average person made over time in different states and counties, along with information on whether or not they got a degree. Using hierarchical modeling, we can relate these different factors to understand their impact on income. As I said, this data is all generated solely to illustrate this technical concept, so for this one blog post try not to think critically about the data (or do, and expect to have some good laughs).

If you want to see how to generate the example data, I have included that at the end of this post.

WARNING: My main goal is to put an example online. This post assumes an understanding of PyMC3, hierarchical modeling, pandas, etc.
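
For readers who want a quick picture of what "vectorized" means for nested levels, here is a minimal NumPy-only sketch of the integer-index idea. Everything in it is an assumption made for illustration: the toy numbers, the two-level state/county layout, and names like `county_to_state` and `obs_county` are hypothetical and are not taken from the dataset used in this post.

```python
import numpy as np

# Hypothetical toy hierarchy (not the post's dataset): 2 states, 4 counties.
state_effect = np.array([10.0, 20.0])             # one value per state
county_to_state = np.array([0, 0, 1, 1])          # state index of each county
county_offset = np.array([1.0, -1.0, 2.0, -2.0])  # one offset per county

# County level: integer indexing copies each state's value out to its
# counties, so no Python loop over counties is needed.
county_effect = state_effect[county_to_state] + county_offset

# Observation level: each data row points at its county by integer index.
obs_county = np.array([0, 1, 1, 3, 2, 0])
obs_mean = county_effect[obs_county]

print(county_effect)
print(obs_mean)
```

The same pattern of indexing a parent-level array with a child-to-parent index array should carry over to PyMC3 random variables, which is what lets each extra level be added without looping over groups.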