Book Table of Contents, with model (JAGS code) and data downloads.


1 Recipes for a good statistical analysis

2 A bit of theory
2.1 Axiom 1: Probabilities are in the range zero to one
2.2 Axiom 2: When a probability is either zero or one
2.3 Axiom 3: The sum, or marginalization, axiom
2.4 Product rule
2.5 Bayes Theorem
2.6 Error propagation
2.7 Bringing it all home
2.8 Profiling is not marginalization
2.9 Exercises

3 A bit of numerical computation
3.1 Some technicalities
3.2 How to sample from a generic function
model model model data data

4 Single Parameter Models
4.1 Step-by-step guide for building a basic model
4.1.1 A little bit of (science) background
4.1.2 Bayesian Model Specification
model data
4.1.3 Obtaining the Posterior Distribution
4.1.4 Bayesian Point and Interval Estimation
4.1.5 Checking chain convergence
4.1.6 Model checking and sensitivity analysis
4.1.7 Comparison with older analyses
4.2 Other Useful Distributions with One Parameter
4.2.1 Measuring a rate: Poisson
model data
4.2.2 Combining Two or More (Poisson) Measurements
model model-gen data
4.2.3 Measuring a fraction: Binomial
model data
4.3 Exercises

5 The Prior
5.1 Conclusions depend on the prior ...
5.1.1 ... sometimes a lot. The Malmquist-Eddington bias
model data
5.1.2 ... by lower amounts with increasing data quality
model
5.1.3 ... but eventually becomes negligible
5.1.4 ... and the precise shape of the prior often does not matter
model
5.2 Where to find priors
5.3 Why there are so many uniform priors in this book?
5.4 Other examples on the influence of priors on conclusions
5.4.1 The important role of the prior in the determination of the mass of the most distant known galaxy cluster
model data
5.4.2 The importance of population gradients for photometric redshifts
5.5 Exercises

6 Multi-parameters models
6.1 Common simple problems
6.1.1 Location and spread
model model-gen data
6.1.2 The source intensity in the presence of a background
model data
6.1.3 Estimating a fraction in the presence of a background
model data
6.1.4 Spectral slope: Hardness ratio
model data
6.1.5 Spectral shape
model data
6.2 Mixtures
6.2.1 Modeling a bimodal distribution: the case of Globular Cluster Metallicity
model data
6.2.2 Average of incompatible measurements
model data
6.3 Advanced Analysis
6.3.1 Source intensity with over-Poisson background fluctuations
model data
6.3.2 The cosmological mass fraction derived from the cluster's baryon fraction
model data
6.3.3 Light concentration in the presence of a background
model data
6.3.4 A complex background modeling for geo-neutrinos
model model model data data
6.3.4.1 An initial modeling of the background
6.3.4.2 Discriminating natural from human-induced neutrinos
6.3.4.3 Improving detection of geo-neutrinos
6.3.4.4 Concluding remarks
6.3.5 Upper limits from counting experiments
model model data data
6.3.5.1 Zero observed events
6.3.5.2 Non-zero events
6.4 Exercises

7 Non-random data collection
7.1 The general case
7.2 Sharp selection on the value
model data
7.3 Sharp selection on the value, mixture of Gaussians: measuring the gravitational redshift
model data
7.4 Sharp selection on the true value
model model-gen data
7.5 Probabilistic selection on the true value
model-gen
7.6 Sharp selection on the observed value, mixture of Gaussians
model-gen
7.7 Numerical implementation of the models
7.7.1 Sharp selection on the value
model
7.7.2 Sharp selection on the true value
model
7.7.3 Probabilistic selection on the true value
model data
7.7.4 Sharp selection on the observed value, mixture of Gaussians
model data
7.8 Final remarks
7.9 Exercises

8 Fitting Regression Models
8.1 Clearing up some misconceptions
8.1.1 Pay attention to selection effects
8.1.2 Avoid fishing expeditions
8.1.3 Do not confuse prediction with parameter estimation
model data
8.1.3.1 Prediction and parameter estimation differ
8.1.3.2 Direct and inverse relations also differ
8.1.3.3 Summary
8.2 Non-linear fit with no error on predictor and no spread: Efficiency and completeness
model data
8.3 Fit with spread and no errors on predictor: varying physical constants?
model data
8.4 Fit with errors and spread: the Magorrian relation
model data
8.5 Fit with more than one predictor and a complex link: star formation quenching
model data
8.6 Fit with upper and lower limits: the optical-to-X flux ratio
model model-gen data
8.7 Fit with an important data structure: the mass-richness scaling
model model-gen data
8.8 Fit with a non-ignorable data collection
model
8.9 Fit without anxiety about non-random data collection
model data
8.10 Prediction
model data
8.11 A meta-analysis: combined fit of regressions with different intrinsic scatter
model data
8.12 Advanced Analysis
8.12.1 Cosmological parameters from SNIa
model data
8.12.2 The enrichment history of the ICM
model data
8.12.2.1 Enrichment history
8.12.2.2 Intrinsic scatter
8.12.2.3 Controlling for temperature T
8.12.2.4 Abundances systematics
8.12.2.5 T and Fe abundance likelihood
8.12.2.6 Priors
8.12.2.7 Results
8.12.3 The enrichment history after binning by redshift
model data
8.12.4 With an over-Poissons spread
model data
8.13 Exercises

9 Model checking and sensitivity analysis
9.1 Sensitivity analysis
9.1.1 Check alternative prior distributions
model data
9.1.2 Check alternative link functions
model data
9.1.3 Check alternative distributional assumptions
9.1.4 Prior sensitivity summary
9.2 Model checking
9.2.1 Overview
9.2.2 Start simple: visual inspection of real and simulated data and of their summaries
9.2.3 A deeper exploration: using measures of discrepancy
9.2.4 Another deep exploration
9.3 Summary

10 Bayesian vs simple methods
10.1 Conceptual differences
10.2 Maximum likelihood
10.2.1 Average vs. Maximum Likelihood
10.2.2 Small samples
10.3 Robust estimates of location and scale
10.3.1 Bayes has a lower bias
model-gen
10.3.2 Bayes is fairer and has less noisy errors
10.4 Comparison of fitting methods
10.4.1 Fitting methods generalities
10.4.2 Regressions without intrinsic scatter
model model-gen
10.4.2.1 Preamble: restating the obvious
10.4.2.2 Testing how fitting models perform for a regression without intrinsic scatter
10.4.3 One more comparison, with different data structures
model model-gen
10.5 Summary and experience of a former non-Bayesian astronomer

A Probability Distributions
A.1 Discrete Distributions
A.1.1 Bernoulli
A.1.2 Binomial
A.1.3 Poisson
A.2 Continuous Distributions
A.2.1 Gaussian or Normal
A.2.2 Beta
A.2.3 Exponential
A.2.4 Gamma and Schechter
A.2.5 Lognormal
A.2.6 Pareto or Power Law
A.2.7 Central Student's-t
A.2.8 Uniform
A.2.9 Weibull

B The third axiom of probability, conditional probability, independence and conditional independence
B.1 The third axiom of probability
B.2 Conditional probability
B.3 Independence and conditional independence

References