From the Binomial to the Normal DistributionLeslie WrightThe purpose of this worksheet is to demonstrate how the binomial distribution becomes the normal distribution when the sample size approaches infinity.restart;This is the standard expression for the r-th binomial probability after the np-th (or middle) term for the expansion of NiMpLCYlInBHIiIiJSJxR0YmJSJuRw==, where NiMvLCYlInBHIiIiJSJxR0YmRiY=:y:=unapply(convert(binomial(n,n*p+r)*p^(n*p+r)*q^(n*q-r),factorial),r);This constitutes the ratio between two adjacent probabilities:rat:=simplify(y(r+1)/y(r));The following is the numerator of this ratio after dividing through by NiMqKCUibkciIiIlInBHRiUlInFHRiU=:numer(rat)/n/p/q;algsubs(p+q=1,%);top:=expand(%);Ditto for the denominator of that same ratio:bot:=expand(denom(rat)/n/p/q);The following estimates the difference between the logarithms of the numerator and denominator, capitalizing on the Maclaurin series approximation NiMvLSUjbG5HNiMsJiIiIkYoJSJ4R0YoRik= when x is small. Since we will eventually assume n to be big, the approximation is certainly appropriate: logtop:=top-1;logbot:=bot-1;This line of course is based on the relation that the logarithm of a quotient is equal to the logarithm of the dividend less the logarithm of the divisor:logtop-logbot;simplify(%);dlny:=algsubs(p+q=1,%)*dr;Lets do a little scaling of the independent variable axis here--i.e., create a new variable NiMvJSJ4RyomJSJyRyIiIi0lJXNxcnRHNiMlIm5HISIi:dlny:=subs({r=sqrt(n)*x,dr=sqrt(n)*dx}, dlny);Another scale substitution, using the standard deviation of the binomial distribution, NiMvJSZzaWdtYUctJSVzcXJ0RzYjKiglIm5HIiIiJSJwR0YqJSJxR0Yq, scaled down by NiMtJSVzcXJ0RzYjJSJuRw== to reflect the above scaling:algsubs(1/q/p=1/sigma^2,%);Here we begin simplify the element in question, and see what happens when n goes to infinity:simplify(%/dx);expand(%);dlny:=limit(%,n=infinity)*dx;Now take the indefinite integral to get an expression for the logarithm of the ordinate in relation to x. The logarithm of A is the constant of integration here:lny:=int(dlny/dx,x)+ln(A);Now take the exponential of both sides and get an expression for y:y:=expand(exp(lny));We need to find a value of A so that the area under the curve is unity. We need to tell Maple the NiMlJnNpZ21hRw== is a positive number so it can find the definite integral:assume(sigma>0);int(y,x=-infinity..infinity)=1;A:=solve(%,A);Let's set up an expression for what we now have found to be the probability density function of the normal distribution of mean NiMlI211Rw== and standard deviation NiMlJnNpZ21hRw==. Remember that we have been assuming up to now that our distribution has a mean of zero, hence the substitution to account for any mean:f:=unapply(subs(x=X-mu,y),(X,mu,sigma));Here's is the standard normal curve, with mean zero and standard deviation one:plot(f(x,0,1),x=-3..3);Now let's set up an expression for the cumulative distribution function--i.e., the probability that a normal variate will take on a value of x or less. Maple uses the error function to express this analytically:Q:=unapply((int(f(t,0,1),t=-infinity..x)),x);Now let's plot it to see what it looks like. Very familiar shape indeed:plot(Q(x),x=-3..3);What about the reverse issue, i.e. what is the value of the variate associated with a particular percentile (a.k.a. the one-tailed probability)? We need to solve an equation to find out:Q(z)=.975;fsolve(%,z);What about a two-tailed probability? We need to solve a slightly different equation:(Q(z)-1/2)*2=.95;
fsolve(%,z);