Bayesian primer for astronomers
Practical applications of Bayesian inference in Astronomy

Or, how to avoid the bad things listed in my talk at Bayesian Methods in Cosmology 2006 and in my 2007 talk at Liège University (and other places as well), how to make better use of the information in the data (see the second part of my November 2007 talk here at Brera), and how to obtain errors that behave as you expect (e.g. that do not include negative values for positively defined quantities).

Start by reading Andreon (2008).

Language definitions:
    'background' is an unrelated population contaminating data (e.g. the 'sky' background in an image or a galaxy background population in a catalog)
    'heteroscedastic errors' means that the error is datum-dependent (each datum has its own error, possibly different from the error of another datum)
    'intrinsic scatter' means that there is extra variance in the data points that is not accounted for by the experimental errors.
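
As a sketch of how the last two terms enter a model, consider a linear trend between two quantities. The symbols below (x_i, y_i, alpha, beta, sigma_int) and the Gaussian form are notation assumed here for illustration, not taken from the references. Heteroscedastic errors appear as a separate sigma for each datum, while the intrinsic scatter is a single extra width around the trend:

```latex
x_i^{\rm obs} \sim \mathcal{N}\left(x_i,\ \sigma_{x,i}^2\right), \qquad
y_i^{\rm obs} \sim \mathcal{N}\left(y_i,\ \sigma_{y,i}^2\right), \qquad
y_i \sim \mathcal{N}\left(\alpha + \beta x_i,\ \sigma_{\rm int}^2\right)
```

Here the first two relations describe the measurement process and the third describes the population; setting sigma_int to zero recovers a trend with no intrinsic scatter.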

How to: -estimation 

What quantity?
Example
Reference
estimate the intensity of a Poisson process in the presence of a background
number of photons (i.e. flux or luminosity), number of galaxies (i.e. richness), X-ray count rate, etc.
Almost every Bayesian book.
Appendix B of Andreon et al. (2006)
Discussion in sec 4.4.1 in Andreon et al. (2006)
estimate the intensity of a structured (i.e. spatially dependent) Poisson process in the presence of a background
- the X-ray flux of an extended source, including upper limit determinations
- the richness of a cluster using spatial information
- the X-ray flux within r500, including uncertainties on T (used to estimate r500) and on the parameters of the radial profile
- the total optical luminosity of galaxies in clusters (integral of the luminosity function)
Appendix A of Andreon et al. (2008)

Andreon et al. (2010).

Andreon (2010)
estimate a scale (width) in the presence of a background and heteroscedastic errors, when
(velocity) dispersion, scatter, width of a distribution, Beers scale, etc.

    only a summary of the data is available
cluster velocity dispersion
Appendix A of Andreon et al. (2006)
    individual data are at hand
cluster velocity dispersion
Appendix B of Andreon et al. (2008)
estimate a fraction in the absence of a background
blue fraction, hardness ratio (=1-2 f_b), AGN fraction, E+A fraction, completeness, fraction of galaxies having large equivalent width, etc.
Laplace (1812), almost every Bayesian textbook.
estimate a fraction in the presence of a background
blue fraction, hardness ratio (=1-2 f_b), AGN fraction
Appendix C of Andreon et al. (2006)
estimate the luminosity or mass function in the presence of a background, or any similar quantity derived from them
luminosity function

dwarf-to-giant ratio
Andreon, Punzi & Grado (2005).
Andreon (2007)
estimate a linear trend (correlation, regression) between quantities with heteroscedastic errors, in the presence of an intrinsic scatter, and in the absence of a background
Fundamental Plane, Tully-Fisher, Colour-Magnitude

stellar baryon fraction vs mass
gas baryon fraction vs mass
ratio of faint to bright galaxies vs z
D'Agostini (2005)


Andreon (2010)

Andreon (2008)
as above, but with noisy errors
richness-mass regression
Andreon & Hurn (2010)
estimate a linear trend (correlation, regression) between quantities with heteroscedastic errors, in the presence of an intrinsic scatter, and in the presence of a background that may itself show a trend
colour-magnitude relation
Appendix A of Andreon (2006)
predict something, using a calibrating sample
predict mass, given richness
Andreon & Hurn (2010)
estimate the colour distribution for a sample contaminated by an uninteresting population
the colour distribution of cluster galaxies
Andreon et al. (2008)
A lot of the above at once.
Estimate the mass growth in galaxies
Andreon (2006)
All the above in an organized paper

Andreon (2008), published by Cambridge University Press
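
As an illustration of the first entry above (the intensity of a Poisson process with a background), here is a minimal sketch that evaluates the posterior of the signal on a grid. It is not the method of any of the cited papers: the function names, the uniform priors, the grid limits and the example counts are all assumptions made for the example. Counts in the source region are modelled as Poisson(s + b), and counts in a background-only region, `area_ratio` times larger, as Poisson(area_ratio * b).

```python
import math


def log_poisson(n, mu):
    """Log of the Poisson probability of n counts given an expected value mu > 0."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)


def signal_posterior(n_tot, n_bkg, area_ratio, s_max, b_max, steps=300):
    """Posterior density of the signal s on a grid, marginalised over the background b.

    Assumed (hypothetical) model: n_tot ~ Poisson(s + b), n_bkg ~ Poisson(area_ratio * b),
    with uniform priors on 0 < s <= s_max and 0 < b <= b_max.
    """
    ds = s_max / steps
    db = b_max / steps
    # Cell midpoints, so that s and b are never exactly zero.
    s_grid = [(i + 0.5) * ds for i in range(steps)]
    b_grid = [(j + 0.5) * db for j in range(steps)]
    post = []
    for s in s_grid:
        total = 0.0
        for b in b_grid:
            total += math.exp(log_poisson(n_tot, s + b)
                              + log_poisson(n_bkg, area_ratio * b))
        post.append(total)
    norm = sum(post) * ds  # normalise so the density integrates to one
    return s_grid, [p / norm for p in post]


def credible_upper_limit(s_grid, post, prob=0.95):
    """Smallest grid edge below which the posterior contains at least `prob`."""
    ds = s_grid[1] - s_grid[0]
    cum = 0.0
    for s, p in zip(s_grid, post):
        cum += p * ds
        if cum >= prob:
            return s + 0.5 * ds  # upper edge of the current cell
    return s_grid[-1] + 0.5 * ds


# Made-up counts: 3 in the source region, 30 in a background region 10 times larger.
s_grid, post = signal_posterior(n_tot=3, n_bkg=30, area_ratio=10.0, s_max=15.0, b_max=8.0)
upper = credible_upper_limit(s_grid, post, prob=0.95)
print(f"95% upper limit on the signal: {upper:.2f}")
```

With these counts the naive estimate (3 minus 30/10) is zero, and a symmetric error bar around it would extend to negative intensities. The posterior, by construction, is confined to s > 0, so the resulting upper limit is a statement about a physically allowed range.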



How to: -model selection


Read Liddle (2004) first, then note that the evidence, the BIC, and the likelihood ratio are three estimates, of decreasing accuracy, of the number you are looking for: the relative probability of two hypotheses. Trotta (2007), especially the first astro-ph version, also clarifies the subject and introduces the Savage-Dickey density ratio, which is computationally quite useful for nested models.
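
To make the Savage-Dickey density ratio concrete, here is a small sketch on a toy nested problem where everything is analytic. The setting is an assumption chosen for illustration (it is not taken from the cited papers): k objects out of n have some property, the simple model M0 fixes the fraction at f0, and the complex model M1 leaves it free with a uniform prior. The function names are hypothetical. The Bayes factor in favour of M0 is then the posterior density of f at f0 under M1 divided by the prior density there, and it can be checked against the ratio of the two evidences.

```python
import math


def log_beta_pdf(f, a, b):
    """Log density of a Beta(a, b) distribution at 0 < f < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(f) + (b - 1) * math.log(1 - f)


def bayes_factor_savage_dickey(k, n, f0=0.5):
    """B01 as posterior density over prior density at f0, both under M1.

    With a uniform prior on f the posterior is Beta(k + 1, n - k + 1)
    and the prior density at f0 is 1.
    """
    prior_density = 1.0
    posterior_density = math.exp(log_beta_pdf(f0, k + 1, n - k + 1))
    return posterior_density / prior_density


def bayes_factor_direct(k, n, f0=0.5):
    """B01 as the ratio of the evidences of M0 (f = f0) and M1 (f uniform)."""
    evidence_m0 = math.comb(n, k) * f0**k * (1 - f0)**(n - k)
    evidence_m1 = 1.0 / (n + 1)  # binomial likelihood integrated over a uniform prior
    return evidence_m0 / evidence_m1


# Made-up data: 14 objects out of 20 have the property of interest.
b01_sd = bayes_factor_savage_dickey(14, 20)
b01_direct = bayes_factor_direct(14, 20)
print(f"Savage-Dickey: {b01_sd:.4f}   direct evidence ratio: {b01_direct:.4f}")
```

The two numbers agree because M0 is nested in M1. The practical appeal is that the Savage-Dickey route only needs the posterior of the complex model, for example from an existing MCMC chain, rather than a separate evidence integral for each model.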
What is the question?
Example
Reference
Is there a trend?
  Model it and compare with no trend at all.
Does the blue fraction depend on the cluster velocity dispersion?
Does the fraction of obscured AGN depend on X-ray luminosity?
Is my source variable?
Andreon et al. (2006), sec 6.2
Tajer et al. (2007), sec 8.1
Is the trend linear?
  Compare the simple linear trend with a more complicated one.
Does the colour-magnitude relation bend?
Andreon et al. (2006), sec 4.1
Are two trends equal?
  Model them as equal first, and then as different. Then, compare them.
Do the fractions of absorbed and obscured AGNs show the same trend?
Is the slope of a trend measured on one dataset the same as the slope measured on another?
Tajer et al. (2007), sec 8.1

Andreon (2009)
Is the feature hinted in my plot significant?
 Add the feature to the model and compare with the simpler model.
 You are really asking: do I need to consider a more complex model?
Is there a dip in the LF?

Is the colour-magnitude relation curved?

Is there an emission line in the spectrum?

Is there a cluster (detection) here?

Is it a cluster or a blend of two groups?
Andreon et al. (2006), sec 4.3
Andreon et al. (2006), sec 4.1
Protassov et al. (2002)
Andreon et al. (2009), sec 3.2
Andreon et al. (2009), sec 3.6
Should I refine my model by adding one more parameter/feature?

 Do it and compare with the simpler model.
Must the mass function of galaxies in clusters expected from a simple model be refined with a further evolutionary term?
Andreon (2006)
Mantra: model what you guess and compare complex and simple models.







