[Pycon] [new paper] "Shreya Khurana" - Bayesian Modeling with PyMC3: An ecological application
info at pycon.it
Sun 6 Jan 2019 20:26:58 CET
Title: Bayesian Modeling with PyMC3: An ecological application
Duration: 45 (includes Q&A)
Q&A Session: 15
Language: en
Type: Talk
Abstract: In this talk we will learn how to define a Bayesian Hierarchical Model (BHM) and fit it to data for a simple ecological application using PyMC3 - finding the population distribution of a plant species. We'll also go over some diagnostics to check our model.
Markov chain Monte Carlo (MCMC) methods are used to sample from complex probability distributions. They have become very popular because they make it practical to fit sophisticated statistical models whose posterior distributions have no closed form. In this talk, we'll primarily go over two techniques - a Gibbs sampler and a random-walk Metropolis-Hastings sampler - using an ecological application.
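To make the random-walk Metropolis-Hastings idea concrete, here is a minimal sketch in plain Python (standard library only, not PyMC3). The target density, the step size and the function names are illustrative assumptions and are not taken from the talk.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def random_walk_metropolis(log_density, n_samples, start=0.0, step=1.0, seed=0):
    """Draw n_samples from log_density using a Gaussian random-walk proposal."""
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # The proposal is symmetric, so the acceptance probability
        # reduces to min(1, p(proposal) / p(current)).
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples


samples, acceptance_rate = random_walk_metropolis(log_target, 20000)
kept = samples[2000:]  # discard the first draws as burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"acceptance rate: {acceptance_rate:.2f}, mean: {mean:.2f}, variance: {var:.2f}")
```

The step size trades off acceptance rate against how far each move travels; in practice it is tuned rather than fixed at 1.0.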
Ecologists are often interested in approximating the distributions of plant species, and in this talk we'll build a BHM to integrate data from three sources. BHMs split a complicated model into three basic components. The data model occupies the top level of the hierarchy, and the process model sits below it. A third level typically contains the priors: statistical models for the unknown parameters, which can encode additional physical information. The levels are formally generated by a series of conditioning steps in which each level is conditioned on the levels below it.
Statistical jargon aside, all we're doing is building a model by assuming certain priors and then making some more assumptions to explain the data we have - just like any ML model. MCMC sampling techniques help us approximate the resulting posterior distributions, and we'll use the PyMC3 library for this. PyMC3 is a popular Python library for probabilistic programming.
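The three-level conditioning structure can be written as a single factorisation. The symbols are assumptions introduced here for illustration: Y for the observed data, Z for the latent process (for example the true species distribution) and \theta for the unknown parameters.

```latex
\[
  p(Z, \theta \mid Y) \;\propto\;
  \underbrace{p(Y \mid Z, \theta)}_{\text{data model}}\,
  \underbrace{p(Z \mid \theta)}_{\text{process model}}\,
  \underbrace{p(\theta)}_{\text{priors}}
\]
```

The left-hand side is the posterior that the MCMC samplers approximate.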
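As a rough illustration of "assume priors, then approximate the posterior", the sketch below runs a Gibbs sampler for a normal model with unknown mean and precision. It uses only the standard library rather than PyMC3, so the conditioning steps are visible. The data are synthetic, and the prior settings (mu0, tau0, a, b) and function names are assumptions made for this example, not the model from the talk.

```python
import random


def gibbs_normal(y, n_iter, mu0=0.0, tau0=1e-3, a=1e-2, b=1e-2, seed=0):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau) with conjugate priors
    mu ~ Normal(mu0, 1/tau0) and tau ~ Gamma(shape=a, rate=b)."""
    rng = random.Random(seed)
    n = len(y)
    sum_y = sum(y)
    mu, tau = sum_y / n, 1.0  # initial values
    draws = []
    for _ in range(n_iter):
        # Full conditional of mu given tau and the data is normal.
        precision = tau0 + n * tau
        cond_mean = (tau0 * mu0 + tau * sum_y) / precision
        mu = rng.gauss(cond_mean, precision ** -0.5)
        # Full conditional of tau given mu and the data is gamma.
        shape = a + 0.5 * n
        rate = b + 0.5 * sum((v - mu) ** 2 for v in y)
        tau = rng.gammavariate(shape, 1.0 / rate)  # second argument is a scale
        draws.append((mu, tau))
    return draws


# Synthetic measurements (assumed true mean 5.0 and standard deviation 2.0).
data_rng = random.Random(42)
y = [data_rng.gauss(5.0, 2.0) for _ in range(200)]

draws = gibbs_normal(y, n_iter=3000)[500:]  # drop burn-in
post_mu = sum(m for m, _ in draws) / len(draws)
post_sd = sum(t ** -0.5 for _, t in draws) / len(draws)
print(f"posterior mean of mu: {post_mu:.2f}, of sigma: {post_sd:.2f}")
```

Each iteration samples one parameter from its distribution conditional on the other, which is possible here because the priors are conjugate.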
Audience
Audience level:
Python: Beginner
Computational skills: Intermediate
This talk should be equally interesting for statisticians and for researchers in any field who use MCMC sampling. R has been a popular choice among statisticians, but coming from that background myself, I can say that PyMC3 is a really good choice for larger projects that combine the BHM with additional model building and ETL tasks, which call for Python tooling and pipelines.
Since we will be going over the model for a simple application, I expect the audience to have at least some basic knowledge of Bayesian statistics (what priors and posteriors are), but only minimal Python knowledge is required (basic syntax).
By the end of this talk, the audience will have:
1. Learnt how to define a BHM in Python
2. Learnt the basics of using two MCMC sampling techniques in PyMC3 - Gibbs and Metropolis-Hastings
3. Learnt how to diagnose the model using autocorrelation plots, standard errors and histogram plots
Outline
1. Introduction (10 mins)
- Bayesian Hierarchical Models - priors, conjugate posteriors (5 mins)
- MCMC sampling techniques in PyMC3 (5 mins)
2. Building the model (10 mins)
- Defining the model for our ecological application (5 mins)
- Model hyperparameters - initial values and priors (5 mins)
3. Results and Diagnostics (10 mins)
- Diagnostic check of model using metrics mentioned above (5 mins)
- Comparing the probability distribution sampled with the true distribution (5 mins)
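The diagnostics named in the outline can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the chain is a synthetic AR(1) series standing in for correlated MCMC draws, the truncation rule for the effective sample size is a simple one chosen for brevity, and the function names are hypothetical.

```python
import math
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of the chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(chain, max_lag=200):
    """n / (1 + 2 * sum of autocorrelations), truncated at the first
    non-positive autocorrelation (a simple rule assumed for this sketch)."""
    total = 0.0
    for lag in range(1, min(max_lag, len(chain) - 1) + 1):
        rho = autocorrelation(chain, lag)
        if rho <= 0.0:
            break
        total += rho
    return len(chain) / (1.0 + 2.0 * total)


def mc_standard_error(chain):
    """Monte Carlo standard error of the chain mean, adjusted for autocorrelation."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / (n - 1)
    return math.sqrt(var / effective_sample_size(chain))


# Synthetic AR(1) chain with coefficient 0.9, used only as a stand-in for sampler output.
rng = random.Random(1)
chain = [0.0]
for _ in range(4999):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

ess = effective_sample_size(chain)
print(f"lag-1 autocorrelation: {autocorrelation(chain, 1):.2f}")
print(f"effective sample size: {ess:.0f} of {len(chain)} draws")
print(f"Monte Carlo standard error of the mean: {mc_standard_error(chain):.3f}")
```

A slowly decaying autocorrelation means fewer effectively independent draws, which in turn inflates the standard error of any posterior summary.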
Tags: Statistical Learning, statistics, bayesian_statistic, mathematical-modelling, data, statistical-modeling, markov-model-markov-chain, pymc3, sampling, pydata, bayes