THE ONLINE MCMC

EXAMPLES

This site generates two plots: one showing the posterior distribution of each parameter that the user defines...

... and another showing the distribution of all the best fit models that were drawn randomly from the posterior distribution.

While full instructions and explanations can be found here, the following video showing how to use the website might also be useful. In this case a basic linear model and a small data set were used for simplicity.

INPUT

Instructions for using this page can be found below the input options.

Input model equation:

Input the data:

Input the likelihood:
 Options: Gaussian, Student's t

Input the sampler:
 Options: Emcee, Dynesty, Nestle, PYMC3 (under development)

Any results will be available for 15 days following completion.
They will then be deleted, so please download any results that you would like to keep for longer.

INSTRUCTIONS

The model

Firstly, you must input the model that you want to fit to your data. When inputting this model you can use the standard operators "+", "-", "*" (multiplication), "/" (division). Allowable functions (such as trigonometric functions) and constants are listed below. To raise a value to a given power use either "^" or "**".

When entering the model be careful to use parentheses to group the required parts of the equation. Click here to show an example input model.
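As a rough illustration of why the grouping matters, the sketch below evaluates a model string in Python. It is not the site's own parser: the function name `evaluate_model`, the `ALLOWED` whitelist and the example expressions are all hypothetical, and "^" is simply rewritten to "**" before evaluation.

```python
import math

# Hypothetical whitelist; the site's real list of functions and constants may differ.
ALLOWED = {"sin": math.sin, "cos": math.cos, "exp": math.exp, "pi": math.pi}


def evaluate_model(expression, **values):
    """Evaluate a model string for one set of parameter values (illustration only)."""
    python_expr = expression.replace("^", "**")  # accept either power notation
    namespace = {"__builtins__": {}}             # hide Python built-ins from the expression
    namespace.update(ALLOWED)
    namespace.update(values)
    return eval(python_expr, namespace)


print(evaluate_model("m*t + c", m=2.0, t=3.0, c=1.0))  # 7.0
print(evaluate_model("(a + b)^2", a=1.0, b=2.0))       # 9.0
print(evaluate_model("a + b^2", a=1.0, b=2.0))         # 5.0
```

The last two lines differ only in their parentheses, yet they give different results, which is the kind of mistake that careful grouping avoids.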

Parameter types

Once the model is submitted you can choose each parameter's type:

• constant: the parameter is a fixed constant that you can define a numerical value for;
• variable: the parameter is a variable that you would like to fit and for which you will need to define a prior (see here for information on the prior type);
• independent variable / abscissa: the parameter is a value, or set of values, at which the model is defined (e.g. in the above example the t (time) value could be such a parameter) that you can input directly or through file upload (uploaded files can be plain ascii text with whitespace or comma separated values). Currently only one parameter can be given as an independent variable, i.e. only one-dimensional models are allowed.
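The three parameter types can be pictured with a small sketch. It assumes a hypothetical linear model m*t + c in which t is the independent variable, c is held constant and m is the variable to be fitted. The names and values are illustrative only.

```python
def linear_model(t_values, m, c):
    """Evaluate m*t + c at every value of the independent variable t."""
    return [m * t + c for t in t_values]


t = [0.0, 1.0, 2.0]  # independent variable / abscissa: where the model is evaluated
c = 1.0              # constant: fixed to a numerical value by the user

# m is a variable: the sampler would try many values drawn according to its prior.
for m in (0.5, 2.0):
    print(m, linear_model(t, m, c))
```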

Prior

There are currently three prior probability distributions that you can choose for a variable:

If you are unsure about what is best to use then a Uniform distribution with a range broad enough to cover your expectations of the parameter is the simplest option.
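For reference, a Uniform prior assigns the same probability density to every value inside its range and zero outside it. A minimal sketch of its log-probability follows. The function name and the example range are assumptions, not taken from the site's code.

```python
import math


def log_uniform_prior(value, lower, upper):
    """Log of a uniform prior density on [lower, upper]."""
    if lower <= value <= upper:
        return -math.log(upper - lower)  # constant density inside the range
    return -math.inf                     # zero probability outside the range


print(log_uniform_prior(5.0, 0.0, 10.0))   # inside the range: finite
print(log_uniform_prior(11.0, 0.0, 10.0))  # outside the range: -inf
```

A range that is too narrow therefore excludes parameter values outright, which is why the range should comfortably cover your expectations.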

Data input

Input the data that you would like to fit the model to. You can either enter the values directly in the form below (as whitespace or comma separated values), or upload a file containing the data (again as whitespace or comma separated values). The number of input data points must be the same as the number of values input for the independent variable/abscissa parameter provided above.
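If you want to check your values before submitting them, something like the following sketch may help. It assumes plain numbers separated by whitespace and/or commas. The helper names `parse_values` and `check_same_length` are hypothetical and this is not the site's own input handling.

```python
import re


def parse_values(text):
    """Split a string on commas and/or whitespace and convert each token to a float."""
    tokens = [tok for tok in re.split(r"[,\s]+", text.strip()) if tok]
    return [float(tok) for tok in tokens]


def check_same_length(abscissa, data):
    """Raise an error if the data and abscissa do not have the same number of values."""
    if len(abscissa) != len(data):
        raise ValueError(
            f"{len(data)} data points but {len(abscissa)} abscissa values"
        )


t = parse_values("0 1 2 3")
d = parse_values("1.0, 2.5,3 4")
check_same_length(t, d)
print(d)  # [1.0, 2.5, 3.0, 4.0]
```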

Likelihood input

There are currently two allowed likelihood functions:

• Gaussian: a Gaussian (or Normal) probability distribution (this is one of the most common likelihood functions, and is often the least informative). If using this likelihood function there are three additional options:
  • input a single known value for the standard deviation, σ, of noise in the data;
  • input a set of values (either directly into the form as a set of whitespace or comma separated values, or through uploading an ASCII text file of the values) of the standard deviation of the noise, with one value per data point;
  • choose to include the noise standard deviation as another parameter to be fit (i.e. if it is unknown). If you choose this option then a prior (as above) is required.
• Student's t: the Student's t likelihood is similar to the Gaussian likelihood, but it does not require a noise standard deviation to be given (the noise is assumed to be stationary over the dataset and has been analytically marginalised over).
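The two likelihoods can be sketched as follows. The Gaussian form is the standard one. The marginalised form is only one common choice, obtained by assuming a 1/σ prior on the noise standard deviation and dropping constant terms. The exact expression used by the site is not given here and may differ. Function names are hypothetical.

```python
import math


def gaussian_log_likelihood(data, model, sigma):
    """Gaussian log-likelihood; sigma is one number or a list with one value per point."""
    if isinstance(sigma, (int, float)):
        sigma = [sigma] * len(data)
    total = 0.0
    for d, m, s in zip(data, model, sigma):
        total += ((d - m) / s) ** 2 + math.log(2.0 * math.pi * s * s)
    return -0.5 * total


def marginalised_log_likelihood(data, model):
    """Assumed form: sigma integrated out with a 1/sigma prior, up to an additive constant.

    Fails if the model matches the data exactly (sum of squared residuals is zero).
    """
    sum_sq = sum((d - m) ** 2 for d, m in zip(data, model))
    return -0.5 * len(data) * math.log(sum_sq)


data = [1.0, 3.0]
model = [0.0, 1.0]
print(gaussian_log_likelihood(data, model, 1.0))         # single known sigma
print(gaussian_log_likelihood(data, model, [1.0, 2.0]))  # one sigma per data point
print(marginalised_log_likelihood(data, model))          # no sigma required
```

When the noise standard deviation is fitted as an extra parameter, the sampled value would simply be passed as `sigma` in the Gaussian function.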

Sampler Inputs

The MCMC aims to draw samples (a chain of points) from the posterior probability distributions of the parameters. You need to tell it how many points to draw. For Emcee, three inputs are required:

• No. of ensemble points ("walkers"): this is essentially the number of independent chains within the MCMC. This needs to be an even number and in general should be at least twice the number of fitting parameters that you have. Using a large value (e.g. 100) should be fine, but you could run into lack-of-memory issues if the number is too high (1000s);
• No. of iterations: this is the number of points per chain for each of the ensemble points. The product of this number and the number of ensemble points will be the total number of samples that you have for the posterior;
• No. of burn-in iterations: this is the number of iterations (for each "walker") that are thrown away from the start of the chain (the iteration points above come after the burn-in points). This allows time for the MCMC to converge on the bulk of the posterior and for points sampled away from that to not be included in the final results.
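The arithmetic behind these three inputs can be sketched as below. The function name and returned keys are hypothetical. The odd-number check and the "twice the number of parameters" guideline are taken from the descriptions above, not from the sampler's source code.

```python
def check_ensemble_settings(n_walkers, n_params, n_iterations, n_burn_in):
    """Summarise a set of ensemble-sampler inputs (illustrative helper only)."""
    if n_walkers % 2 != 0:
        raise ValueError("the number of walkers must be even")
    return {
        # guideline: at least twice the number of fitting parameters
        "enough_walkers": n_walkers >= 2 * n_params,
        # burn-in steps are run first and then thrown away
        "steps_per_walker": n_burn_in + n_iterations,
        # samples kept for the posterior
        "posterior_samples": n_walkers * n_iterations,
    }


settings = check_ensemble_settings(
    n_walkers=100, n_params=2, n_iterations=1000, n_burn_in=500
)
print(settings["posterior_samples"])  # 100000
```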

For Dynesty, only one input is required:

• No. of live points: this is described in greater detail here. This needs to be a positive integer and in general should be at least one greater than the number of fitting parameters.

Nestle sampling is similar to the MCMC method; however, the nature of the sampling also allows one to calculate the integral of the probability distribution. For Nestle, two inputs are required:

• No. of live points: the number of active points, a positive integer at least one greater than the number of fitting parameters.
• Method: how the sampler chooses new points within the target parameter space. You can currently choose from 'Classic', 'Single' or 'Multi'. Further information can be found here.

For PYMC3, three inputs are required:

• No. of draws: the number of sample draws from the posterior per chain.
• No. of chains: the number of independent MCMC chains to run.
• No. of burn-in iterations: this is the number of iterations (for each chain) that are thrown away from the start of the chain (the draws above come after the burn-in points). This allows time for the MCMC to converge on the bulk of the posterior and for points sampled away from that to not be included in the final results.

If in doubt use the defaults and see how things turn out.

ALLOWABLE FUNCTIONS AND CONSTANTS

Here is a list of allowable functions within your model. When entering your model use the form given in the monospace font, with the function argument surrounded by brackets, e.g. sin(x).

Constants

These constants can be input rather than having to give their numerical values.

CAVEATS

The MCMC algorithm is not guaranteed to produce sensible results every time, and your output may contain errors or look odd. Some information and troubleshooting can be found here.

If you really want to understand what this code is doing, I would advise learning about Bayesian analysis and Markov chain Monte Carlo methods. I would also advise learning Python, or another programming language, and coding the analysis up yourself, particularly if you have a more complex problem. However, this site aims to be a useful starting point.