@@ -922,6 +922,10 @@ of applications in statistics.
 :class:`NormalDist` Examples and Recipes
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
+
+Classic probability problems
+****************************
+
 :class:`NormalDist` readily solves classic probability problems.
 
 For example, given `historical data for SAT exams
@@ -947,6 +951,10 @@ Find the `quartiles <https://en.wikipedia.org/wiki/Quartile>`_ and `deciles
    >>> list(map(round, sat.quantiles(n=10)))
    [810, 896, 958, 1011, 1060, 1109, 1162, 1224, 1310]
 
+
+Monte Carlo inputs for simulations
+**********************************
+
 To estimate the distribution for a model that isn't easy to solve
 analytically, :class:`NormalDist` can generate input samples for a `Monte
 Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
@@ -963,6 +971,9 @@ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
    >>> quantiles(map(model, X, Y, Z))             # doctest: +SKIP
    [1.4591308524824727, 1.8035946855390597, 2.175091447274739]
 
+Approximating binomial distributions
+************************************
+
 Normal distributions can be used to approximate `Binomial
 distributions <https://mathworld.wolfram.com/BinomialDistribution.html>`_
 when the sample size is large and when the probability of a successful
@@ -1000,6 +1011,10 @@ probability that the Python room will stay within its capacity limits?
    >>> mean(trial() <= k for i in range(10_000))
    0.8398
 
+
+Naive Bayesian classifier
+*************************
+
 Normal distributions commonly arise in machine learning problems.
 
 Wikipedia has a `nice example of a Naive Bayesian Classifier
@@ -1054,6 +1069,48 @@ The final prediction goes to the largest posterior. This is known as the
     'female'
 
 
+Kernel density estimation
+*************************
+
+It is possible to estimate a continuous probability density function
+from a fixed number of discrete samples.
+
+The basic idea is to smooth the data using `a kernel function such as a
+normal distribution, triangular distribution, or uniform distribution
+<https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use>`_.
+The degree of smoothing is controlled by a single
+parameter, ``h``, representing the variance of the kernel function.
+.. testcode::
+
+   import math
+
+   def kde_normal(sample, h):
+       "Create a continuous probability density function from a sample."
+       # Smooth the sample with a normal distribution of variance h.
+       kernel_h = NormalDist(0.0, math.sqrt(h)).pdf
+       n = len(sample)
+       def pdf(x):
+           return sum(kernel_h(x - x_i) for x_i in sample) / n
+       return pdf
+
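+The same structure accommodates the other kernels linked above.  As an
+illustrative sketch (the ``kde_triangular()`` name and the half-width
+formula are assumptions, not part of the recipe above), a triangular
+kernel with variance ``h`` is nonzero on ``(-a, a)`` where
+``a = sqrt(6 * h)``, reusing the ``math`` import from the previous block:
+
+.. testcode::
+
+   def kde_triangular(sample, h):
+       "Sketch: like kde_normal(), but smoothing with a triangular kernel."
+       a = math.sqrt(6.0 * h)     # half-width giving the kernel variance h
+       def kernel(t):
+           # Triangular density on (-a, a), peaking at t == 0.
+           return max(0.0, 1.0 - abs(t) / a) / a
+       n = len(sample)
+       def pdf(x):
+           return sum(kernel(x - x_i) for x_i in sample) / n
+       return pdf
+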
+`Wikipedia has an example
+<https://en.wikipedia.org/wiki/Kernel_density_estimation#Example>`_
+where we can use the ``kde_normal()`` recipe to generate and plot
+a probability density function estimated from a small sample:
+
+.. doctest::
+
+   >>> sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
+   >>> f_hat = kde_normal(sample, h=2.25)
+   >>> xarr = [i/100 for i in range(-750, 1100)]
+   >>> yarr = [f_hat(x) for x in xarr]
+
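+As a quick sanity check (a suggested aside, not part of the Wikipedia
+example), the estimated density should integrate to approximately 1.0.
+A crude Riemann sum over the 0.01-wide grid steps confirms this:
+
+.. doctest::
+
+   >>> round(sum(y * 0.01 for y in yarr), 2)    # doctest: +SKIP
+   1.0
+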
+The points in ``xarr`` and ``yarr`` can be used to make a PDF plot:
+
+.. image:: kde_example.png
+   :alt: Scatter plot of the estimated probability density function.
+
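+The documentation build does not generate the figure itself.  A minimal
+sketch for reproducing a similar scatter plot, assuming `matplotlib
+<https://matplotlib.org/>`_ is installed (it is not part of the standard
+library, and the styling of the published figure is a guess), might be:
+
+.. code-block:: python
+
+   import matplotlib.pyplot as plt
+
+   plt.scatter(xarr, yarr, s=2)    # the estimated density, point by point
+   plt.xlabel('x')
+   plt.ylabel('estimated pdf(x)')
+   plt.show()
+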
 ..
    # This modeline must appear within the last ten lines of the file.
    kate: indent-width 3; remove-trailing-space on; replace-tabs on; encoding utf-8;