Continuous distributions
The numeric variables in the survey, Age, Mileage, and Odometer, can take any value over a continuous interval and are examples of continuous RVs. In the previous section, we dealt with RVs that had discrete output; in this section, we deal with RVs that have continuous output. One distinction from the previous section needs to be pointed out explicitly.
In the case of a discrete RV, the pmf assigns a positive probability to each value that the RV can take. In the continuous case, an RV assumes any specific value with probability zero; the technical reasons are beyond the scope of this book. Thus, whereas in the discrete case the probabilities of individual values are specified by the pmf, in the continuous case probabilities are assigned to intervals and are determined by the probability density function, abbreviated as pdf.
Suppose that we have a continuous RV X with the pdf f(x) defined over the possible x values; that is, we assume that the pdf f(x) is well defined over the range of the RV X, denoted by $R_x$. The integral of f(x) over the range must equal 1; that is, $\int_{R_x} f(x)\,dx = 1$. The probability that the RV X takes a value in an interval [a, b] is defined by:
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
In general, we are interested in the cumulative probabilities of a continuous RV, that is, the probability of the event X < x. In terms of the previous equation, this is obtained as:
$$P(X < x) = \int_{-\infty}^{x} f(s)\,ds$$
A special name for this probability is the cumulative distribution function (cdf). The mean and variance of a continuous RV are then defined by:
$$E(X) = \mu = \int_{R_x} x\,f(x)\,dx, \qquad Var(X) = \sigma^2 = \int_{R_x} (x-\mu)^2 f(x)\,dx$$
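The definitions of interval probability, cumulative probability, mean, and variance can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a hypothetical example density f(x) = 2x on [0, 1] and a simple midpoint rule (the helper name integrate is ours). For this density the exact answers are 1 for the total mass, 0.5 for P(0.25 <= X <= 0.75), 0.25 for P(X < 0.5), 2/3 for the mean, and 1/18 for the variance.

```python
def integrate(g, lo, hi, n=20_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def f(x):
    """Hypothetical example pdf (not from the text): f(x) = 2x on [0, 1]."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


LO, HI = 0.0, 1.0  # range of the RV for this example density

total = integrate(f, LO, HI)                  # integral of the pdf over the range
prob = integrate(f, 0.25, 0.75)               # P(0.25 <= X <= 0.75)
cum = integrate(f, LO, 0.5)                   # cumulative probability P(X < 0.5)
mean = integrate(lambda x: x * f(x), LO, HI)  # E(X)
var = integrate(lambda x: (x - mean) ** 2 * f(x), LO, HI)  # Var(X)

print(f"total = {total:.4f}, P(0.25 <= X <= 0.75) = {prob:.4f}")
print(f"P(X < 0.5) = {cum:.4f}, mean = {mean:.4f}, variance = {var:.4f}")
```

Any other density can be substituted for f, provided LO and HI are changed to match its range.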
As in the previous section, we will begin with the simplest case, the uniform distribution.
Uniform distribution
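An RV is said to have a uniform distribution over the interval $[0, \theta]$, $\theta > 0$, if its probability density function is given by:
$$f(x; \theta) = \begin{cases} \dfrac{1}{\theta}, & 0 \le x \le \theta, \\ 0, & \text{otherwise.} \end{cases}$$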
In fact, it is not necessary to restrict our focus to the positive real line. For any two real numbers a and b with b > a, the uniform RV can be defined by:
$$f(x; a, b) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b, \\ 0, & \text{otherwise.} \end{cases}$$
The uniform distribution has a very important role to play in simulation, as will be seen in Chapter 6, Simulation. As with the discrete counterpart, in the continuous case any two intervals of the same length within [a, b] have an equal probability of occurring. The mean and variance of a uniform RV over the interval [a, b] are respectively given by:
$$E(X) = \frac{a+b}{2}, \qquad Var(X) = \frac{(b-a)^2}{12}$$
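The two properties above, equal probability for intervals of equal length and the closed forms (a + b)/2 and (b - a)^2/12 for the mean and variance, can be illustrated by simulation. This is a minimal sketch using only Python's standard library; the endpoints a = 2 and b = 10, the sample size, and the seed are arbitrary choices made here for illustration.

```python
import random

a, b = 2.0, 10.0        # hypothetical endpoints, chosen only for illustration
n = 200_000             # number of simulated values
rng = random.Random(1)  # fixed seed so that the run is reproducible
sample = [rng.uniform(a, b) for _ in range(n)]

# Sample mean and variance, to be compared with (a + b)/2 and (b - a)^2/12
sample_mean = sum(sample) / n
sample_var = sum((x - sample_mean) ** 2 for x in sample) / n


def fraction_in(lo, hi):
    """Fraction of simulated values that fall in the interval [lo, hi]."""
    return sum(lo <= x <= hi for x in sample) / n


# Two intervals of length 1 in different parts of [a, b];
# each should receive a fraction close to 1/(b - a) = 0.125.
frac_1 = fraction_in(3.0, 4.0)
frac_2 = fraction_in(8.5, 9.5)

print(f"mean: simulated {sample_mean:.3f}, formula {(a + b) / 2:.3f}")
print(f"variance: simulated {sample_var:.3f}, formula {(b - a) ** 2 / 12:.3f}")
print(f"interval fractions: {frac_1:.4f} and {frac_2:.4f}")
```

The simulated values only approximate the formulas; the agreement improves as n grows.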
Example 1.4.1. Horgan (2008), Example 15.3: The International Journal of Circuit Theory and Applications reported in 1990 that researchers at the University of California, Berkeley, had designed a switched capacitor circuit for generating random signals whose trajectory is uniformly distributed over the unit interval [0, 1]. Suppose that we are interested in calculating the probability that the trajectory falls in the interval [0.35, 0.58]. Though the answer is straightforward, we will obtain it using the punif function:
> punif(0.58)-punif(0.35)
[1] 0.23
Of course, we don't need software for such a simple integral: the density equals 1 on [0, 1], so the probability is just the length of the interval, 0.58 - 0.35 = 0.23.
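The same calculation can be sketched in Python. The function below is a hand-written stand-in for R's punif, written for this illustration rather than taken from any library; it returns the uniform cumulative probability (q - a)/(b - a), clipped to 0 below a and to 1 above b.

```python
def punif(q, a=0.0, b=1.0):
    """Uniform cumulative probability on [a, b]; a stand-in for R's punif."""
    if q <= a:
        return 0.0
    if q >= b:
        return 1.0
    return (q - a) / (b - a)


# Probability that the trajectory falls in [0.35, 0.58]
prob = punif(0.58) - punif(0.35)
print(round(prob, 2))  # prints 0.23
```

Rounding is used because the floating-point difference of 0.58 and 0.35 is not exactly 0.23.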
Exponential distribution
The exponential distribution is probably one of the most important probability distributions in statistics, and more so for computer scientists. The time between arrivals in a queuing system, the time between two incoming calls on a mobile, the lifetime of a laptop, and so on, are some of the important applications where this distribution has lasting utility. The pdf of an exponential RV is specified by:
$$f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \ge 0, \ \lambda > 0$$
The parameter $\lambda$ is sometimes referred to as the failure rate. The exponential RV enjoys a special property called the memory-less property, which conveys that:
$$P(X \ge t + s \mid X \ge s) = P(X \ge t), \quad \text{for all } s, t \ge 0$$
The mathematical statement translates into the following property: if X is an exponential RV, the probability that it survives a further t units of time does not depend on how long it has already survived. Its failure in the future depends only on the present, and the past (age) of the RV does not matter. In simple words, the failure rate is constant in time and does not depend on the age of the system. Let us obtain the plots of a few exponential distributions:
> par(mfrow=c(1,2))
> # Left panel: exponential densities with rates between 0.2 and 1
> curve(dexp(x,1),0,10,ylab="f(x)",xlab="x",cex.axis=1.25)
> curve(dexp(x,0.2),add=TRUE,col=2)
> curve(dexp(x,0.5),add=TRUE,col=3)
> curve(dexp(x,0.7),add=TRUE,col=4)
> curve(dexp(x,0.85),add=TRUE,col=5)
> legend(6,1,paste("Rate = ",c(1,0.2,0.5,0.7,0.85)),col=1:5,pch=
+ "___")
> # Right panel: exponential densities with rates between 10 and 50
> curve(dexp(x,50),0,0.5,ylab="f(x)",xlab="x")
> curve(dexp(x,10),add=TRUE,col=2)
> curve(dexp(x,20),add=TRUE,col=3)
> curve(dexp(x,30),add=TRUE,col=4)
> curve(dexp(x,40),add=TRUE,col=5)
> legend(0.3,50,paste("Rate = ",c(50,10,20,30,40)),col=1:5,pch=
+ "___")
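The memory-less property, P(X > s + t | X > s) = P(X > t), can be checked both from the survival function exp(-rate * x) and by simulation. This is a minimal sketch with the Python standard library; the rate 0.5, the horizon t = 2, the ages s, and the seed are hypothetical values chosen for illustration.

```python
import math
import random


def survival(x, rate):
    """P(X > x) for an exponential RV with the given rate."""
    return math.exp(-rate * x) if x > 0 else 1.0


def conditional_survival(s, t, rate):
    """P(X > s + t | X > s), computed as a ratio of survival probabilities."""
    return survival(s + t, rate) / survival(s, rate)


rate, t = 0.5, 2.0  # hypothetical rate and additional survival time

# The conditional probability should be the same for every age s
for s in (0.0, 1.0, 5.0, 10.0):
    print(f"s = {s:4.1f}: P(X > s+t | X > s) = "
          f"{conditional_survival(s, t, rate):.4f}, "
          f"P(X > t) = {survival(t, rate):.4f}")

# Empirical check: among simulated lifetimes that exceed s = 1,
# the fraction that also exceeds s + t should be close to P(X > t).
rng = random.Random(7)
sample = [rng.expovariate(rate) for _ in range(200_000)]
s = 1.0
survivors = [x for x in sample if x > s]
empirical = sum(x > s + t for x in survivors) / len(survivors)
print(f"simulated P(X > s+t | X > s) with s = {s}: {empirical:.4f}")
```

Repeating the empirical check with a different age s should give roughly the same fraction, which is the property in action.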
The mean and variance of the exponential distribution are as follows:
$$E(X) = \frac{1}{\lambda}, \qquad Var(X) = \frac{1}{\lambda^2}$$
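For an exponential RV with rate lambda, the mean is 1/lambda and the variance is 1/lambda^2. The sketch below compares these closed forms with simulated values. It relies on random.expovariate from the Python standard library, whose argument is the rate; the helper name simulate_moments, the three rates, the sample size, and the seed are assumptions made for this illustration.

```python
import random


def simulate_moments(rate, n=100_000, seed=42):
    """Return the sample mean and variance of n simulated exponential values."""
    rng = random.Random(seed)
    sample = [rng.expovariate(rate) for _ in range(n)]
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / n
    return mean, var


results = {}
for rate in (0.2, 1.0, 10.0):  # hypothetical rates, for illustration only
    results[rate] = simulate_moments(rate)
    mean, var = results[rate]
    print(f"rate = {rate:5.1f}: mean {mean:.4f} (formula {1 / rate:.4f}), "
          f"variance {var:.4f} (formula {1 / rate ** 2:.4f})")
```

Note how a small rate gives a long-lived, highly variable RV, while a large rate concentrates the values near zero, in line with the two panels of densities plotted earlier.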
The complete Python code block is given next:
Normal distribution
The normal distribution is in some sense an all-pervasive distribution that arises sooner or later in almost any statistical discussion. In fact, the reader is very likely already familiar with certain aspects of it; for example, the normal density curve is bell-shaped. Its mathematical appeal is perhaps reflected in the fact that, although its expression is fairly simple, its density function includes the three most famous irrational numbers: $\sqrt{2}$, $\pi$, and $e$.
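Suppose that X is normally distributed with the mean $\mu$ and the variance $\sigma^2$. Then, the probability density function of the normal RV is given by:
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}, \quad -\infty < x < \infty$$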
If the mean is zero and the variance is 1, the normal RV is referred to as the standard normal RV, and the standard is to denote it by Z.
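The normal density with mean mu and standard deviation sigma is exp(-(x - mu)^2 / (2 sigma^2)) divided by sqrt(2 pi) sigma. As a quick check, this formula can be coded directly and compared with statistics.NormalDist from the Python standard library. This is only a sketch: the function name normal_pdf and the values mu = 10 and sigma = 3 are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density written out from its formula; sigma is the standard deviation."""
    # sqrt(2), pi, and e (through math.exp) all appear explicitly here
    coef = 1.0 / (math.sqrt(2.0) * math.sqrt(math.pi) * sigma)
    return coef * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


mu, sigma = 10.0, 3.0  # hypothetical mean and standard deviation
reference = NormalDist(mu, sigma)

# The hand-coded formula and the library density should agree
for x in (4.0, 10.0, 13.0):
    print(f"x = {x:5.1f}: formula {normal_pdf(x, mu, sigma):.6f}, "
          f"library {reference.pdf(x):.6f}")

# Standard normal case (mean 0, variance 1): the density peaks at z = 0
standard_peak = normal_pdf(0.0)
print(f"standard normal density at 0: {standard_peak:.4f}")
```

The defaults mu = 0 and sigma = 1 correspond to the standard normal RV Z.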
Example 1.4.2. Shady Normal Curves: We again consider the standard normal RV Z. Some of the most needed probabilities are P(Z > 0), P(-1.96 < Z < 1.96), and P(-2.58 < Z < 2.58); the last two are the familiar 95% and 99% coverage regions. These probabilities are now shaded:
> par(mfrow=c(3,1))
> # Probability Z Greater than 0
> curve(dnorm(x,0,1),-4,4,xlab="z",ylab="f(z)")
> z=seq(0,4,0.02)
> lines(z,dnorm(z),type="h",col="grey")
> # 95% Coverage
> curve(dnorm(x,0,1),-4,4,xlab="z",ylab="f(z)")
> z=seq(-1.96,1.96,0.001)
> lines(z,dnorm(z),type="h",col="grey")
> # 99% Coverage
> curve(dnorm(x,0,1),-4,4,xlab="z",ylab="f(z)")
> z=seq(-2.58,2.58,0.001)
> lines(z,dnorm(z),type="h",col="grey")
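The shaded areas correspond to probabilities that can also be computed directly from the standard normal cumulative distribution function. The following sketch uses statistics.NormalDist from the Python standard library (the variable names are ours) and does not draw anything; it only evaluates the three areas, which come out at exactly one half, roughly 0.95, and roughly 0.99.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area to the right of 0
p_greater_0 = 1.0 - Z.cdf(0.0)

# Central areas between -1.96 and 1.96, and between -2.58 and 2.58
p_196 = Z.cdf(1.96) - Z.cdf(-1.96)
p_258 = Z.cdf(2.58) - Z.cdf(-2.58)

print(f"P(Z > 0)             = {p_greater_0:.4f}")
print(f"P(-1.96 < Z < 1.96)  = {p_196:.4f}")
print(f"P(-2.58 < Z < 2.58)  = {p_258:.4f}")
```

The cut-offs 1.96 and 2.58 are rounded values, which is why the central areas are close to, but not exactly, 0.95 and 0.99.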
The Python program for the shady normal probabilities is given next: