Skip to content

Commit 4422b65

Browse files
committed
add docstring references
1 parent ad2cce4 commit 4422b65

File tree

6 files changed

+96
-12
lines changed

6 files changed

+96
-12
lines changed

docs/src/index.md

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -5,16 +5,20 @@ CurrentModule = NestedSamplers
55
# NestedSamplers.jl
66

77
[![GitHub](https://img.shields.io/badge/Code-GitHub-black.svg)](https://github.com/TuringLang/NestedSamplers.jl)
8-
[![Build Status](https://github.com/turinglang/NestedSamplers.jl/workflows/CI/badge.svg?branch=main)](https://github.com/turinglang/NestedSamplers.jl/actions)
8+
[![Build Status](https://github.com/TuringLang/NestedSamplers.jl/workflows/CI/badge.svg?branch=main)](https://github.com/TuringLang/NestedSamplers.jl/actions)
99
[![PkgEval](https://juliaci.github.io/NanosoldierReports/pkgeval_badges/N/NestedSamplers.svg)](https://juliaci.github.io/NanosoldierReports/pkgeval_badges/report.html)
10-
[![Coverage](https://codecov.io/gh/turinglang/NestedSamplers.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/turinglang/NestedSamplers.jl)
11-
12-
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3950594.svg)](https://doi.org/10.5281/zenodo.3950594)
10+
[![Coverage](https://codecov.io/gh/TuringLang/NestedSamplers.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/turinglang/NestedSamplers.jl)
11+
[![LICENSE](https://img.shields.io/github/license/TuringLang/NestedSamplers.jl?color=yellow)](https://github.com/TuringLang/NestedSamplers.jl/blob/main/LICENSE)
1312

1413
A Julian implementation of single- and multi-ellipsoidal nested sampling algorithms using the [AbstractMCMC](https://github.com/turinglang/abstractmcmc.jl) interface.
1514

1615
This package was heavily influenced by [`nestle`](https://github.com/kbarbary/nestle), [`dynesty`](https://github.com/joshspeagle/dynesty), and [`NestedSampling.jl`](https://github.com/kbarbary/NestedSampling.jl).
1716

17+
## Citing
18+
19+
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3950594.svg)](https://doi.org/10.5281/zenodo.3950594)
20+
21+
If you use this library, or a derivative of it, in your work, please consider citing it. This code is built off a multitude of academic works, which have been noted in the docstrings where appropriate. These references, along with references for the more general calculations, can all be found in [CITATIONS.bib](https://github.com/TuringLang/NestedSamplers.jl/blob/main/CITATIONS.bib)
1822

1923
## Installation
2024

docs/src/intro.md

Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
# Introduction
2+
3+
Nested sampling is a statistical technique first described in Skilling (2004)[^1] as a method for estimating the Bayesian evidence. Conveniently, it also produces samples with importance weighting proportional to the posterior distribution. To understand what this means, we need to comprehend [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem).
4+
5+
## Bayes' theorem
6+
7+
Bayes' theorem, in our nomenclature, is described as the relationship between the *prior*, the *likelihood*, the *evidence*, and the *posterior*. In its entirety-
8+
9+
```math
10+
p(\theta | x) = \frac{p(x | \theta)p(\theta)}{p(x)}
11+
```
12+
13+
### Posterior
14+
15+
``p(\theta | x)`` - the probability of the model parameters (``\theta``) conditioned on the data (``x``)
16+
17+
### Likelihood
18+
19+
``p(x | \theta)`` - the probability of the data (``x``) conditioned on the model parameters (``\theta``)
20+
21+
### Prior
22+
23+
``p(\theta)`` - the probability of the model parameters
24+
25+
### Evidence
26+
27+
``p(x)`` - the probability of the data
28+
29+
If you are familiar with Bayesian statistics and Markov Chain Monte Carlo (MCMC) techniques, you should be somewhat familiar with the relationships between the posterior, the likelihood, and the prior. The evidence, though, is somewhat hard to describe; what does "the probability of the data" mean? Well, another way of writing the evidence, is this integral
30+
31+
```math
32+
p(x) \equiv Z = \int_\Omega{p(x | \theta) \mathrm{d}\theta}
33+
```
34+
35+
which is like saying "the likelihood of the data [``p(x | \theta)``] integrated over *all of parameter space* [``\Omega``]". We have to write the probability this way, because the data are statistically dependent on the model parameters. This integral is intractable for all but the simplest combinations of distributions ([conjugate distributions](https://en.wikipedia.org/wiki/Conjugate_prior)), and therefore it must be estimated or approximated in some way.
36+
37+
## What can we do with the evidence?
38+
39+
Before we get into approximating the Bayesian evidence, let's talk about why it's important. After all, for most MCMC applications it is simply a normalization factor to be ignored (how convenient!).
40+
41+
## Further reading
42+
43+
For further reading, I recommend reading the cited sources in the footnotes, as well as the references below
44+
45+
* [dynesty documentation](https://dynesty.readthedocs.io)
46+
47+
48+
[^1]: Skilling 2004

src/bounds/Bounds.jl

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -77,7 +77,9 @@ end
7777
"""
7878
Bounds.NoBounds([T=Float64], N)
7979
80-
Unbounded prior volume; equivalent to the unit cube in `N` dimensions.
80+
Unbounded prior volume; equivalent to the unit cube in `N` dimensions. This matches the original nested sampling derivation in Skilling (2004).[^1]
81+
82+
[^1]: John Skilling, 2004, AIP 735, 395 ["Nested Sampling"](https://aip.scitation.org/doi/abs/10.1063/1.1835238)
8183
"""
8284
struct NoBounds{T} <: AbstractBoundingSpace{T}
8385
ndims::Int

src/bounds/ellipsoid.jl

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,10 @@ An `N`-dimensional ellipsoid defined by
99
```
1010
1111
where `size(center) == (N,)` and `size(A) == (N,N)`.
12+
13+
This implementation follows the algorithm presented in Mukherjee et al. (2006).[^1]
14+
15+
[^1]: Pia Mukherjee, et al., 2006, ApJ 638 L51 ["A Nested Sampling Algorithm for Cosmological Model Selection"](https://iopscience.iop.org/article/10.1086/501068)
1216
"""
1317
mutable struct Ellipsoid{T} <: AbstractBoundingSpace{T}
1418
center::Vector{T}

src/bounds/multiellipsoid.jl

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,21 @@
33
Bounds.MultiEllipsoid([T=Float64], ndims)
44
Bounds.MultiEllipsoid(::AbstractVector{Ellipsoid})
55
6-
Use multiple [`Ellipsoid`](@ref)s in an optimal clustering to bound prior space. For more details about the bounding algorithm, see the extended help (`??Bounds.MultiEllipsoid`)
6+
Use multiple [`Ellipsoid`](@ref)s in an optimal clustering to bound prior space. This implementation follows the MultiNest implementation outlined in Feroz et al. (2009).[^1] For more details about the bounding algorithm, see the extended help (`??Bounds.MultiEllipsoid`)
7+
8+
[^1]: Feroz et al., 2009, MNRAS 398, 4 ["MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics"](https://academic.oup.com/mnras/article/398/4/1601/981502)
9+
10+
## Extended help
11+
12+
The multiple-ellipsoidal implementation is defined as follows:
13+
14+
1. Fit a [`Bounds.Ellipsoid`](@ref) to the sample.
15+
2. Perform K-means clustering (here using [Clustering.jl](https://github.com/JuliaStats/Clustering.jl)) centered at the endpoints of the bounding ellipsoid. This defines two clusters within the sample.
16+
3. If either cluster has fewer than two points, consider it ill-defined and end any recursion.
17+
4. Fit [`Bounds.Ellipsoid`](@ref) to each of the clusters assigned in (2).
18+
5. If the volume of the parent ellipsoid is more than twice the volume of the two child ellipsoids, recurse (1-5) to each child.
19+
20+
To sample from this distribution, a random ellipsoid is selected, and a random sample is sampled from that ellipsoid. We then reverse this and find all of the ellipsoids which enclose the sampled point, and select one of those randomly for the enclosing bound.
721
"""
822
struct MultiEllipsoid{T} <: AbstractBoundingSpace{T}
923
ellipsoids::Vector{Ellipsoid{T}}

src/proposals/Proposals.jl

Lines changed: 18 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,9 @@ unitcheck(us) = all(u -> 0 < u < 1, us)
4444
"""
4545
Proposals.Rejection(;maxiter=100_000)
4646
47-
Propose a new live point by uniformly sampling within the bounding volume and rejecting samples that do not meet the likelihood constraints.
47+
Propose a new live point by uniformly sampling within the bounding volume and rejecting samples that do not meet the likelihood constraints. This follows the original nested sampling algorithm proposed in Skilling (2004)[^1]
48+
49+
[^1]: John Skilling, 2004, AIP 735, 395 ["Nested Sampling"](https://aip.scitation.org/doi/abs/10.1063/1.1835238)
4850
4951
## Parameters
5052
- `maxiter` is the maximum number of samples that can be rejected before giving up and throwing an error.
@@ -79,7 +81,9 @@ Base.show(io::IO, p::Rejection) = print(io, "NestedSamplers.Proposals.Rejection"
7981
"""
8082
Proposals.RWalk(;ratio=0.5, walks=25, scale=1)
8183
82-
Propose a new live point by random walking away from an existing live point.
84+
Propose a new live point by random walking away from an existing live point. This follows the algorithm outlined in Skilling (2006).[^1]
85+
86+
[^1] Skilling, 2006, Bayesian Anal. 1(4), ["Nested sampling for general Bayesian computation"](https://projecteuclid.org/journals/bayesian-analysis/volume-1/issue-4/Nested-sampling-for-general-Bayesian-computation/10.1214/06-BA127.full)
8387
8488
## Parameters
8589
- `ratio` is the target acceptance ratio
@@ -172,7 +176,9 @@ end
172176
Propose a new live point by random staggering away from an existing live point.
173177
This differs from the random walk proposal in that the step size here is exponentially adjusted
174178
to reach a target acceptance rate _during_ each proposal, in addition to _between_
175-
proposals.
179+
proposals. This follows the algorithm outlined in Skilling (2006).[^1]
180+
181+
[^1] Skilling, 2006, Bayesian Anal. 1(4), ["Nested sampling for general Bayesian computation"](https://projecteuclid.org/journals/bayesian-analysis/volume-1/issue-4/Nested-sampling-for-general-Bayesian-computation/10.1214/06-BA127.full)
176182
177183
## Parameters
178184
- `ratio` is the target acceptance ratio
@@ -263,7 +269,9 @@ end
263269
Proposals.Slice(;slices=5, scale=1)
264270
265271
Propose a new live point by a series of random slices away from an existing live point.
266-
This is a standard _Gibbs-like_ implementation where a single multivariate slice is a combination of `slices` univariate slices through each axis.
272+
This is a standard _Gibbs-like_ implementation where a single multivariate slice is a combination of `slices` univariate slices through each axis. This follows the algorithm outlined in Neal (2003).[^1]
273+
274+
[^1]: Neal, 2003, Ann. Statist. 31(3), ["Slice Sampling"](https://projecteuclid.org/journals/annals-of-statistics/volume-31/issue-3/Slice-sampling/10.1214/aos/1056562461.full)
267275
268276
## Parameters
269277
- `slices` is the minimum number of slices
@@ -321,8 +329,12 @@ end # end of function Slice
321329

322330
"""
323331
Proposals.RSlice(;slices=5, scale=1)
324-
Propose a new live point by a series of random slices away from an existing live point.
325-
This is a standard _random_ implementation where each slice is along a random direction based on the provided axes.
332+
333+
Propose a new live point by a series of random slices away from an existing live point. This is a standard _random_ implementation where each slice is along a random direction based on the provided axes. This more closely matches the PolyChord implementation outlined in Handley et al. (2015a,b).[^1][^2]
334+
335+
[^1]: Handley, et al., 2015a, MNRAS 450(1), ["polychord: nested sampling for cosmology"](https://academic.oup.com/mnrasl/article/450/1/L61/986122)
336+
[^2]: Handley, et al., 2015b, MNRAS 453(4), ["polychord: next-generation nested sampling"](https://academic.oup.com/mnras/article/453/4/4384/2593718)
337+
326338
## Parameters
327339
- `slices` is the minimum number of slices
328340
- `scale` is the proposal distribution scale, which will update _between_ proposals.

0 commit comments

Comments
 (0)