# Statistics & Math Worksheet


John E. Freund’s Mathematical Statistics

with Applications

Irwin Miller Marylees Miller

Eighth Edition

Pearson Education Limited

Edinburgh Gate

Harlow

Essex CM20 2JE

England and Associated Companies throughout the world

Visit us on the World Wide Web at: www.pearsoned.co.uk

© Pearson Education Limited 2014

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted

in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the

prior written permission of the publisher or a licence permitting restricted copying in the United Kingdom

issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.

All trademarks used herein are the property of their respective owners. The use of any trademark

in this text does not vest in the author or publisher any trademark ownership rights in such

trademarks, nor does the use of such trademarks imply any afﬁliation with or endorsement of this

book by such owners.

ISBN 10: 1-292-02500-X

ISBN 13: 978-1-292-02500-1

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library

Printed in the United States of America

Pearson Custom Library

Table of Contents

Irwin Miller / Marylees Miller

1. Introduction
2. Probability
3. Probability Distributions and Probability Densities
4. Mathematical Expectation
5. Special Probability Distributions
6. Special Probability Densities
7. Functions of Random Variables
8. Sampling Distributions
9. Decision Theory
10. Point Estimation
11. Interval Estimation
12. Hypothesis Testing
13. Tests of Hypothesis Involving Means, Variances, and Proportions
14. Regression and Correlation

Appendix: Sums and Products
Appendix: Special Probability Distributions
Appendix: Special Probability Densities
Statistical Tables
Index

Introduction

1 Introduction
2 Combinatorial Methods
3 Binomial Coefficients
4 The Theory in Practice

1 Introduction

In recent years, the growth of statistics has made itself felt in almost every phase

of human activity. Statistics no longer consists merely of the collection of data and

their presentation in charts and tables; it is now considered to encompass the science

of basing inferences on observed data and the entire problem of making decisions

in the face of uncertainty. This covers considerable ground since uncertainties are

met when we flip a coin, when a dietician experiments with food additives, when an

actuary determines life insurance premiums, when a quality control engineer accepts

or rejects manufactured products, when a teacher compares the abilities of students,

when an economist forecasts trends, when a newspaper predicts an election, and

even when a physicist describes quantum mechanics.

It would be presumptuous to say that statistics, in its present state of development, can handle all situations involving uncertainties, but new techniques are

constantly being developed and modern statistics can, at least, provide the framework for looking at these situations in a logical and systematic fashion. In other

words, statistics provides the models that are needed to study situations involving

uncertainties, in the same way as calculus provides the models that are needed to

describe, say, the concepts of Newtonian physics.

The beginnings of the mathematics of statistics may be found in mid-eighteenth-century studies in probability motivated by interest in games of chance. The theory

thus developed for “heads or tails” or “red or black” soon found applications in situations where the outcomes were “boy or girl,” “life or death,” or “pass or fail,” and

scholars began to apply probability theory to actuarial problems and some aspects

of the social sciences. Later, probability and statistics were introduced into physics

by L. Boltzmann, J. Gibbs, and J. Maxwell, and by this century they have found

applications in all phases of human endeavor that in some way involve an element

of uncertainty or risk. The names that are connected most prominently with the

growth of mathematical statistics in the first half of the twentieth century are those

of R. A. Fisher, J. Neyman, E. S. Pearson, and A. Wald. More recently, the work of

R. Schlaifer, L. J. Savage, and others has given impetus to statistical theories based

essentially on methods that date back to the eighteenth-century English clergyman

Thomas Bayes.

Mathematical statistics is a recognized branch of mathematics, and it can be

studied for its own sake by students of mathematics. Today, the theory of statistics is

applied to engineering, physics and astronomy, quality assurance and reliability, drug

development, public health and medicine, the design of agricultural or industrial

experiments, experimental psychology, and so forth. Those wishing to participate

From Chapter 1 of John E. Freund’s Mathematical Statistics with Applications,

Eighth Edition. Irwin Miller, Marylees Miller. Copyright 2014 by Pearson Education, Inc.

All rights reserved.


in such applications or to develop new applications will do well to understand the

mathematical theory of statistics. For only through such an understanding can applications proceed without the serious mistakes that sometimes occur. The applications

are illustrated by means of examples and a separate set of applied exercises, many

of them involving the use of computers. To this end, we have added at the end of the

chapter a discussion of how the theory of the chapter can be applied in practice.

We begin with a brief review of combinatorial methods and binomial

coefficients.

2 Combinatorial Methods

In many problems of statistics we must list all the alternatives that are possible in a

given situation, or at least determine how many different possibilities there are. In

connection with the latter, we often use the following theorem, sometimes called the

basic principle of counting, the counting rule for compound events, or the rule for

the multiplication of choices.

THEOREM 1. If an operation consists of two steps, of which the first can be

done in n1 ways and for each of these the second can be done in n2 ways,

then the whole operation can be done in n1 · n2 ways.

Here, “operation” stands for any kind of procedure, process, or method of selection.

To justify this theorem, let us define the ordered pair (xi , yj ) to be the outcome

that arises when the first step results in possibility xi and the second step results in

possibility yj . Then, the set of all possible outcomes is composed of the following

n1 · n2 pairs:

$$
\begin{array}{c}
(x_1, y_1),\ (x_1, y_2),\ \ldots,\ (x_1, y_{n_2}) \\
(x_2, y_1),\ (x_2, y_2),\ \ldots,\ (x_2, y_{n_2}) \\
\vdots \\
(x_{n_1}, y_1),\ (x_{n_1}, y_2),\ \ldots,\ (x_{n_1}, y_{n_2})
\end{array}
$$

EXAMPLE 1

Suppose that someone wants to go by bus, train, or plane on a week’s vacation to one

of the five East North Central States. Find the number of different ways in which this

can be done.

Solution

The particular state can be chosen in n1 = 5 ways and the means of transportation

can be chosen in n2 = 3 ways. Therefore, the trip can be carried out in 5 · 3 = 15

possible ways. If an actual listing of all the possibilities is desirable, a tree diagram

like that in Figure 1 provides a systematic approach. This diagram shows that there

are n1 = 5 branches (possibilities) for the number of states, and for each of these

branches there are n2 = 3 branches (possibilities) for the different means of transportation. It is apparent that the 15 possible ways of taking the vacation are represented by the 15 distinct paths along the branches of the tree.


[Figure 1 is a tree diagram: each of the five states (Ohio, Indiana, Illinois, Michigan, and Wisconsin) branches into the three means of transportation (bus, train, and plane), giving 15 distinct paths.]

Figure 1. Tree diagram.

EXAMPLE 2

How many possible outcomes are there when we roll a pair of dice, one red and

one green?

Solution

The red die can land in any one of six ways, and for each of these six ways the green

die can also land in six ways. Therefore, the pair of dice can land in 6 · 6 = 36 ways.
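Theorem 1 can also be checked by brute force. The following Python sketch (an illustrative addition, not part of the text) enumerates the ordered pairs for the two dice of Example 2:

```python
# Enumerate all ordered (red, green) outcomes and confirm that the
# count matches n1 * n2 = 6 * 6 of Theorem 1.
from itertools import product

red = range(1, 7)    # six faces of the red die
green = range(1, 7)  # six faces of the green die

outcomes = list(product(red, green))  # all ordered pairs (x_i, y_j)
print(len(outcomes))  # 36
```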

Theorem 1 may be extended to cover situations where an operation consists of

two or more steps. In this case, we have the following theorem.


THEOREM 2. If an operation consists of k steps, of which the first can be

done in n1 ways, for each of these the second step can be done in n2 ways,

for each of the first two the third step can be done in n3 ways, and so forth,

then the whole operation can be done in n1 · n2 · . . . · nk ways.

EXAMPLE 3

A quality control inspector wishes to select a part for inspection from each of four

different bins containing 4, 3, 5, and 4 parts, respectively. In how many different ways

can she choose the four parts?

Solution

The total number of ways is 4 · 3 · 5 · 4 = 240.

EXAMPLE 4

In how many different ways can one answer all the questions of a true–false test

consisting of 20 questions?

Solution

Altogether there are

$$2 \cdot 2 \cdot 2 \cdot \ldots \cdot 2 \cdot 2 = 2^{20} = 1{,}048{,}576$$

different ways in which one can answer all the questions; only one of these corresponds to the case where all the questions are correct and only one corresponds to

the case where all the answers are wrong.

Frequently, we are interested in situations where the outcomes are the different

ways in which a group of objects can be ordered or arranged. For instance, we might

want to know in how many different ways the 24 members of a club can elect a president, a vice president, a treasurer, and a secretary, or we might want to know in how

many different ways six persons can be seated around a table. Different arrangements like these are called permutations.

DEFINITION 1. PERMUTATIONS. A permutation is a distinct arrangement of n different elements of a set.

EXAMPLE 5

How many permutations are there of the letters a, b, and c?

Solution

The possible arrangements are abc, acb, bac, bca, cab, and cba, so the number of

distinct permutations is six. Using Theorem 2, we could have arrived at this answer

without actually listing the different permutations. Since there are three choices to


select a letter for the first position, then two for the second position, leaving only

one letter for the third position, the total number of permutations is 3 · 2 · 1 = 6.

Generalizing the argument used in the preceding example, we find that n distinct

objects can be arranged in n(n − 1)(n − 2) · . . . · 3 · 2 · 1 different ways. To simplify our

notation, we represent this product by the symbol n!, which is read “n factorial.”

Thus, 1! = 1, 2! = 2 · 1 = 2, 3! = 3 · 2 · 1 = 6, 4! = 4 · 3 · 2 · 1 = 24, 5! = 5 · 4 · 3 · 2 · 1 =

120, and so on. Also, by definition we let 0! = 1.

THEOREM 3. The number of permutations of n distinct objects is n!.

EXAMPLE 6

In how many different ways can the five starting players of a basketball team be

introduced to the public?

Solution

There are 5! = 5 · 4 · 3 · 2 · 1 = 120 ways in which they can be introduced.
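As a quick computational check of Theorem 3, the short Python sketch below (an addition for illustration) computes 5! both with the standard library and as an explicit product:

```python
import math

# n! counts the permutations of n distinct objects (Theorem 3).
print(math.factorial(5))  # 120

# The same product built up step by step: 1 * 2 * 3 * 4 * 5.
result = 1
for k in range(1, 6):
    result *= k
print(result)  # 120
```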

EXAMPLE 7

The number of permutations of the four letters a, b, c, and d is 24, but what is the

number of permutations if we take only two of the four letters or, as it is usually put,

if we take the four letters two at a time?

Solution

We have two positions to fill, with four choices for the first and then three choices for

the second. Therefore, by Theorem 1, the number of permutations is 4 · 3 = 12.

Generalizing the argument that we used in the preceding example, we find that n

distinct objects taken r at a time, for r > 0, can be arranged in n(n − 1) · . . . ·

(n − r + 1) ways. We denote this product by n Pr , and we let n P0 = 1 by definition.

Therefore, we can state the following theorem.

THEOREM 4. The number of permutations of n distinct objects taken r at a time is

$${}_nP_r = \frac{n!}{(n-r)!}$$

for r = 0, 1, 2, . . . , n.

Proof The formula ${}_nP_r = n(n-1) \cdot \ldots \cdot (n-r+1)$ cannot be used for r = 0, but we do have

$${}_nP_0 = \frac{n!}{(n-0)!} = 1$$


For r = 1, 2, . . . , n, we have

$${}_nP_r = n(n-1)(n-2) \cdot \ldots \cdot (n-r+1) = \frac{n(n-1)(n-2) \cdot \ldots \cdot (n-r+1)(n-r)!}{(n-r)!} = \frac{n!}{(n-r)!}$$

In problems concerning permutations, it is usually easier to proceed by using

Theorem 2 as in Example 7, but the factorial formula of Theorem 4 is somewhat

easier to remember. Many statistical software packages provide values of n Pr and

other combinatorial quantities upon simple commands. Indeed, these quantities are

also preprogrammed in many hand-held statistical (or scientific) calculators.

EXAMPLE 8

Four names are drawn from among the 24 members of a club for the offices of president, vice president, treasurer, and secretary. In how many different ways can this

be done?

Solution

The number of permutations of 24 distinct objects taken four at a time is

$${}_{24}P_4 = \frac{24!}{20!} = 24 \cdot 23 \cdot 22 \cdot 21 = 255{,}024$$
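For larger counts it is convenient to let software do the arithmetic, as noted above for statistical packages and calculators. A brief Python sketch (an illustrative addition) reproduces the count of Example 8:

```python
import math

# math.perm(n, r) computes n! / (n - r)! directly (Python 3.8+).
print(math.perm(24, 4))  # 255024

# The same value from the factorial formula of Theorem 4.
print(math.factorial(24) // math.factorial(20))  # 255024
```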

EXAMPLE 9

In how many ways can a local chapter of the American Chemical Society schedule

three speakers for three different meetings if they are all available on any of five

possible dates?

Solution

Since we must choose three of the five dates and the order in which they are chosen

(assigned to the three speakers) matters, we get

$${}_5P_3 = \frac{5!}{2!} = \frac{120}{2} = 60$$

We might also argue that the first speaker can be scheduled in five ways, the second speaker in four ways, and the third speaker in three ways, so that the answer is

5 · 4 · 3 = 60.

Permutations that occur when objects are arranged in a circle are called

circular permutations. Two circular permutations are not considered different (and

are counted only once) if corresponding objects in the two arrangements have the

same objects to their left and to their right. For example, if four persons are playing

bridge, we do not get a different permutation if everyone moves to the chair at his

or her right.


EXAMPLE 10

How many circular permutations are there of four persons playing bridge?

Solution

If we arbitrarily consider the position of one of the four players as fixed, we can seat

(arrange) the other three players in 3! = 6 different ways. In other words, there are

six different circular permutations.
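The circular-permutation count of Example 10 can also be confirmed by enumeration. The Python sketch below (an addition, not from the text) treats two seatings as the same arrangement exactly when one is a rotation of the other, which matches the left-and-right-neighbor criterion given above:

```python
from itertools import permutations

# Count circular seatings of four players: two seatings are the same
# arrangement when one is a rotation of the other.
def canonical(seating):
    # Rotate the tuple so that a fixed player (the minimum label) comes first.
    i = seating.index(min(seating))
    return seating[i:] + seating[:i]

players = (1, 2, 3, 4)
distinct = {canonical(p) for p in permutations(players)}
print(len(distinct))  # 6, which is (4 - 1)!
```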

Generalizing the argument used in the preceding example, we obtain the following theorem.

THEOREM 5. The number of permutations of n distinct objects arranged in

a circle is (n − 1)!.

We have been assuming until now that the n objects from which we select r

objects and form permutations are all distinct. Thus, the various formulas cannot be

used, for example, to determine the number of ways in which we can arrange the

letters in the word “book,” or the number of ways in which three copies of one novel

and one copy each of four other novels can be arranged on a shelf.

EXAMPLE 11

How many different permutations are there of the letters in the word “book”?

Solution

If we distinguish for the moment between the two o’s by labeling them o1 and o2 ,

there are 4! = 24 different permutations of the symbols b, o1 , o2 , and k. However, if

we drop the subscripts, then bo1 ko2 and bo2 ko1 , for instance, both yield boko, and

since each pair of permutations with subscripts yields but one arrangement without

subscripts, the total number of arrangements of the letters in the word “book” is 24/2 = 12.

EXAMPLE 12

In how many different ways can three copies of one novel and one copy each of four

other novels be arranged on a shelf?

Solution

If we denote the three copies of the first novel by a1 , a2 , and a3 and the other four

novels by b, c, d, and e, we find that with subscripts there are 7! different permutations of a1 , a2 , a3 , b, c, d, and e. However, since there are 3! permutations of a1 , a2 ,

and a3 that lead to the same permutation of a, a, a, b, c, d, and e, we find that there

are only 7!/3! = 7 · 6 · 5 · 4 = 840 ways in which the seven books can be arranged on a shelf.
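The division by 3! in Example 12 can be checked directly; this Python sketch (added for illustration) deduplicates the 7! orderings by collecting them in a set:

```python
import math
from itertools import permutations

# Three identical copies of novel 'a' plus four distinct novels:
# identical tuples collapse when the orderings are collected in a set.
books = ('a', 'a', 'a', 'b', 'c', 'd', 'e')
distinct = set(permutations(books))
print(len(distinct))  # 840

# Closed form from the argument above: 7! / 3!
print(math.factorial(7) // math.factorial(3))  # 840
```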

Generalizing the argument that we used in the two preceding examples, we

obtain the following theorem.


THEOREM 6. The number of permutations of n objects of which n1 are of

one kind, n2 are of a second kind, . . . , nk are of a kth kind, and

n1 + n2 + · · · + nk = n is

$$\frac{n!}{n_1! \cdot n_2! \cdot \ldots \cdot n_k!}$$

EXAMPLE 13

In how many ways can two paintings by Monet, three paintings by Renoir, and two

paintings by Degas be hung side by side on a museum wall if we do not distinguish

between the paintings by the same artists?

Solution

Substituting n = 7, n1 = 2, n2 = 3, and n3 = 2 into the formula of Theorem 6, we get

$$\frac{7!}{2! \cdot 3! \cdot 2!} = 210$$

There are many problems in which we are interested in determining the number

of ways in which r objects can be selected from among n distinct objects without

regard to the order in which they are selected.

DEFINITION 2. COMBINATIONS. A combination is a selection of r objects taken from

n distinct objects without regard to the order of selection.

EXAMPLE 14

In how many different ways can a person gathering data for a market research organization select three of the 20 households living in a certain apartment complex?

Solution

If we care about the order in which the households are selected, the answer is

20 P3 = 20 · 19 · 18 = 6,840

but each set of three households would then be counted 3! = 6 times. If we do not

care about the order in which the households are selected, there are only 6,840/6 = 1,140 ways in which the person gathering the data can do his or her job.

Actually, “combination” means the same as “subset,” and when we ask for the

number of combinations of r objects selected from a set of n distinct objects, we are

simply asking for the total number of subsets of r objects that can be selected from

a set of n distinct objects. In general, there are r! permutations of the objects in a

subset of r objects, so that the n Pr permutations of r objects selected from a set of

n distinct objects contain each subset r! times. Dividing ${}_nP_r$ by r! and denoting the result by the symbol $\binom{n}{r}$, we thus have the following theorem.


THEOREM 7. The number of combinations of n distinct objects taken r at a time is

$$\binom{n}{r} = \frac{n!}{r!(n-r)!}$$

for r = 0, 1, 2, . . . , n.

EXAMPLE 15

In how many different ways can six tosses of a coin yield two heads and four tails?

Solution

This question is the same as asking for the number of ways in which we can select

the two tosses on which heads is to occur. Therefore, applying Theorem 7, we find

that the answer is

$$\binom{6}{2} = \frac{6!}{2! \cdot 4!} = 15$$

This result could also have been obtained by the rather tedious process of enumerating the various possibilities, HHTTTT, TTHTHT, HTHTTT, . . . , where H stands

for head and T for tail.
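The enumeration mentioned at the end of Example 15 is tedious by hand but immediate by machine; the Python sketch below (an addition, not from the text) lists the toss sequences and counts those with exactly two heads:

```python
import math
from itertools import product

# Enumerate the 2^6 toss sequences and keep those with exactly two heads.
sequences = [s for s in product('HT', repeat=6) if s.count('H') == 2]
print(len(sequences))  # 15

# The same count from Theorem 7.
print(math.comb(6, 2))  # 15
```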

EXAMPLE 16

How many different committees of two chemists and one physicist can be formed

from the four chemists and three physicists on the faculty of a small college?

Solution

Since two of the four chemists can be selected in $\binom{4}{2} = \frac{4!}{2! \cdot 2!} = 6$ ways and one of the three physicists can be selected in $\binom{3}{1} = \frac{3!}{1! \cdot 2!} = 3$ ways, Theorem 1 shows that the number of committees is 6 · 3 = 18.
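The combination of Theorem 7 with Theorem 1 used in Example 16 is a one-liner in code; a Python sketch (added for illustration):

```python
import math

# Two of four chemists, then one of three physicists; Theorem 1
# multiplies the two counts.
chemist_choices = math.comb(4, 2)    # 6
physicist_choices = math.comb(3, 1)  # 3
committees = chemist_choices * physicist_choices
print(committees)  # 18
```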

A combination of r objects selected from a set of n distinct objects may be considered a partition of the n objects into two subsets containing, respectively, the r

objects that are selected and the n − r objects that are left. Often, we are concerned

with the more general problem of partitioning a set of n distinct objects into k subsets, which requires that each of the n objects must belong to one and only one of

the subsets.† The order of the objects within a subset is of no importance.

EXAMPLE 17

In how many ways can a set of four objects be partitioned into three subsets containing, respectively, two, one, and one of the objects?

† Symbolically, the subsets $A_1, A_2, \ldots, A_k$ constitute a partition of set A if $A_1 \cup A_2 \cup \cdots \cup A_k = A$ and $A_i \cap A_j = \varnothing$ for all $i \neq j$.


Solution

Denoting the four objects by a, b, c, and d, we find by enumeration that there are

the following 12 possibilities:

ab|c|d ab|d|c ac|b|d ac|d|b

ad|b|c ad|c|b bc|a|d bc|d|a

bd|a|c bd|c|a cd|a|b cd|b|a

The number of partitions for this example is denoted by the symbol

$$\binom{4}{2, 1, 1} = 12$$

where the number at the top represents the total number of objects and the numbers at the bottom represent the number of objects going into each subset.

Had we not wanted to enumerate all the possibilities in the preceding example, we could have argued that the two objects going into the first subset can be chosen in $\binom{4}{2} = 6$ ways, the object going into the second subset can then be chosen in $\binom{2}{1} = 2$ ways, and the object going into the third subset can then be chosen in $\binom{1}{1} = 1$ way. Thus, by Theorem 2 there are 6 · 2 · 1 = 12 partitions. Generalizing this argument, we have the following theorem.

THEOREM 8. The number of ways in which a set of n distinct objects can be

partitioned into k subsets with n1 objects in the first subset, n2 objects in

the second subset, . . . , and nk objects in the kth subset is

$$\binom{n}{n_1, n_2, \ldots, n_k} = \frac{n!}{n_1! \cdot n_2! \cdot \ldots \cdot n_k!}$$

Proof Since the $n_1$ objects going into the first subset can be chosen in $\binom{n}{n_1}$ ways, the $n_2$ objects going into the second subset can then be chosen in $\binom{n-n_1}{n_2}$ ways, the $n_3$ objects going into the third subset can then be chosen in $\binom{n-n_1-n_2}{n_3}$ ways, and so forth, it follows by Theorem 2 that the total number of partitions is

$$
\begin{aligned}
\binom{n}{n_1, n_2, \ldots, n_k} &= \binom{n}{n_1} \cdot \binom{n-n_1}{n_2} \cdot \ldots \cdot \binom{n-n_1-n_2-\cdots-n_{k-1}}{n_k} \\
&= \frac{n!}{n_1! \cdot (n-n_1)!} \cdot \frac{(n-n_1)!}{n_2! \cdot (n-n_1-n_2)!} \cdot \ldots \cdot \frac{(n-n_1-n_2-\cdots-n_{k-1})!}{n_k! \cdot 0!} \\
&= \frac{n!}{n_1! \cdot n_2! \cdot \ldots \cdot n_k!}
\end{aligned}
$$


EXAMPLE 18

In how many ways can seven businessmen attending a convention be assigned to one

triple and two double hotel rooms?

Solution

Substituting n = 7, $n_1$ = 3, $n_2$ = 2, and $n_3$ = 2 into the formula of Theorem 8, we get

$$\binom{7}{3, 2, 2} = \frac{7!}{3! \cdot 2! \cdot 2!} = 210$$
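Theorem 8 translates directly into code. In the Python sketch below (an illustrative addition; the helper name `multinomial` is our own), the partition count is computed both from the factorial formula and by filling the rooms one at a time as in the proof:

```python
import math

def multinomial(n, *group_sizes):
    # n! / (n1! * n2! * ... * nk!), assuming the group sizes sum to n.
    assert sum(group_sizes) == n
    result = math.factorial(n)
    for g in group_sizes:
        result //= math.factorial(g)
    return result

print(multinomial(7, 3, 2, 2))  # 210

# Equivalently, choose the triple room's occupants, then the first
# double's, then the second double's (the proof of Theorem 8).
print(math.comb(7, 3) * math.comb(4, 2) * math.comb(2, 2))  # 210
```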

3 Binomial Coefficients

If n is a positive integer and we multiply out $(x + y)^n$ term by term, each term will be the product of x’s and y’s, with an x or a y coming from each of the n factors x + y. For instance, the expansion

$$
\begin{aligned}
(x + y)^3 &= (x + y)(x + y)(x + y) \\
&= x \cdot x \cdot x + x \cdot x \cdot y + x \cdot y \cdot x + x \cdot y \cdot y \\
&\quad + y \cdot x \cdot x + y \cdot x \cdot y + y \cdot y \cdot x + y \cdot y \cdot y \\
&= x^3 + 3x^2 y + 3xy^2 + y^3
\end{aligned}
$$

yields terms of the form $x^3$, $x^2 y$, $xy^2$, and $y^3$. Their coefficients are 1, 3, 3, and 1, and the coefficient of $xy^2$, for example, is $\binom{3}{2} = 3$, the number of ways in which we can choose the two factors providing the y’s. Similarly, the coefficient of $x^2 y$ is $\binom{3}{1} = 3$, the number of ways in which we can choose the one factor providing the y, and the coefficients of $x^3$ and $y^3$ are $\binom{3}{0} = 1$ and $\binom{3}{3} = 1$.

More generally, if n is a positive integer and we multiply out $(x + y)^n$ term by term, the coefficient of $x^{n-r} y^r$ is $\binom{n}{r}$, the number of ways in which we can choose the r factors providing the y’s. Accordingly, we refer to $\binom{n}{r}$ as a binomial coefficient. Values of the binomial coefficients for n = 0, 1, . . . , 20 and r = 0, 1, . . . , 10 are given in table Factorials and Binomial Coefficients of “Statistical Tables.” We can now state the following theorem.

THEOREM 9.

$$(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^{n-r} y^r$$

for any positive integer n.

Introduction

DEFINITION 3. BINOMIAL COEFFICIENTS. The coefficient of $x^{n-r} y^r$ in the binomial expansion of $(x + y)^n$ is called the binomial coefficient $\binom{n}{r}$.

The calculation of binomial coefficients can often be simplified by making use

of the three theorems that follow.

THEOREM 10. For any positive integers n and r = 0, 1, 2, . . . , n,

$$\binom{n}{r} = \binom{n}{n-r}$$

Proof We might argue that when we select a subset of r objects from a set

of n distinct objects, we leave a subset of n − r objects; hence, there are as

many ways of selecting r objects as there are ways of leaving (or selecting)

n − r objects. To prove the theorem algebraically, we write

$$\binom{n}{n-r} = \frac{n!}{(n-r)!\,[n-(n-r)]!} = \frac{n!}{(n-r)!\,r!} = \frac{n!}{r!(n-r)!} = \binom{n}{r}$$

Theorem 10 implies that if we calculate the binomial coefficients for r = 0, 1, . . . , n/2 when n is even and for r = 0, 1, . . . , (n − 1)/2 when n is odd, the remaining binomial coefficients can be obtained by making use of the theorem.

EXAMPLE 19

Given $\binom{4}{0} = 1$, $\binom{4}{1} = 4$, and $\binom{4}{2} = 6$, find $\binom{4}{3}$ and $\binom{4}{4}$.

Solution

$$\binom{4}{3} = \binom{4}{4-3} = \binom{4}{1} = 4 \quad\text{and}\quad \binom{4}{4} = \binom{4}{4-4} = \binom{4}{0} = 1$$

EXAMPLE 20

Given $\binom{5}{0} = 1$, $\binom{5}{1} = 5$, and $\binom{5}{2} = 10$, find $\binom{5}{3}$, $\binom{5}{4}$, and $\binom{5}{5}$.

Solution

$$\binom{5}{3} = \binom{5}{5-3} = \binom{5}{2} = 10, \quad \binom{5}{4} = \binom{5}{5-4} = \binom{5}{1} = 5, \quad\text{and}\quad \binom{5}{5} = \binom{5}{5-5} = \binom{5}{0} = 1$$

It is precisely in this fashion that Theorem 10 may have to be used in connection

with table Factorials and Binomial Coefficients of “Statistical Tables.”


EXAMPLE 21

Find $\binom{20}{12}$ and $\binom{17}{10}$.

Solution

Since $\binom{20}{12}$ is not given in the table, we make use of the fact that $\binom{20}{12} = \binom{20}{8}$, look up $\binom{20}{8}$, and get $\binom{20}{12} = 125{,}970$. Similarly, to find $\binom{17}{10}$, we make use of the fact that $\binom{17}{10} = \binom{17}{7}$, look up $\binom{17}{7}$, and get $\binom{17}{10} = 19{,}448$.

THEOREM 11. For any positive integer n and r = 1, 2, . . . , n − 1,

$$\binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1}$$

Proof Substituting x = 1 into $(x + y)^n$, let us write $(1 + y)^n = (1 + y)(1 + y)^{n-1} = (1 + y)^{n-1} + y(1 + y)^{n-1}$ and equate the coefficient of $y^r$ in $(1 + y)^n$ with that in $(1 + y)^{n-1} + y(1 + y)^{n-1}$. Since the coefficient of $y^r$ in $(1 + y)^n$ is $\binom{n}{r}$, and the coefficient of $y^r$ in $(1 + y)^{n-1} + y(1 + y)^{n-1}$ is the sum of the coefficient of $y^r$ in $(1 + y)^{n-1}$, that is, $\binom{n-1}{r}$, and the coefficient of $y^{r-1}$ in $(1 + y)^{n-1}$, that is, $\binom{n-1}{r-1}$, we obtain

$$\binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1}$$

which completes the proof.

Alternatively, take any one of the n objects. If it is not to be included among the r objects, there are $\binom{n-1}{r}$ ways of selecting the r objects; if it is to be included, there are $\binom{n-1}{r-1}$ ways of selecting the other r − 1 objects. Therefore, there are $\binom{n-1}{r} + \binom{n-1}{r-1}$ ways of selecting the r objects, that is,

$$\binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1}$$

Theorem 11 can also be proved by expressing the binomial coefficients on both

sides of the equation in terms of factorials and then proceeding algebraically, but we

shall leave this to the reader in Exercise 12.

An important application of Theorem 11 is a construct known as Pascal’s

triangle. When no table is available, it is sometimes convenient to determine binomial coefficients by means of a simple construction. Applying Theorem 11, we can

generate Pascal’s triangle as follows:


1
1   1
1   2   1
1   3   3   1
1   4   6   4   1
1   5   10  10  5   1
. . . . . . . . . . . . . . .

In this triangle, the first and last entries of each row are the numeral “1”; each other entry in any given row is obtained by adding the two entries in the preceding row immediately to its left and to its right.
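The row-by-row construction just described is exactly the recurrence of Theorem 11; a short Python sketch (an illustrative addition) generates the rows the same way:

```python
# Build rows of Pascal's triangle from the recurrence of Theorem 11:
# each interior entry is the sum of the two entries above it.
def pascal_rows(n_rows):
    row = [1]
    for _ in range(n_rows):
        yield row
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]

rows = list(pascal_rows(6))
for r in rows:
    print(r)
```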

To state the third theorem about binomial coefficients, let us make the following definition: $\binom{n}{r} = 0$ whenever n is a positive integer and r is a positive integer greater than n. (Clearly, there is no way in which we can select a subset that contains more elements than the whole set itself.)

THEOREM 12.

$$\sum_{r=0}^{k} \binom{m}{r} \binom{n}{k-r} = \binom{m+n}{k}$$

Proof Using the same technique as in the proof of Theorem 11, let us prove this theorem by equating the coefficients of $y^k$ in the expressions on both sides of the equation

$$(1 + y)^{m+n} = (1 + y)^m (1 + y)^n$$

The coefficient of $y^k$ in $(1 + y)^{m+n}$ is $\binom{m+n}{k}$, and the coefficient of $y^k$ in

$$(1 + y)^m (1 + y)^n = \left[\binom{m}{0} + \binom{m}{1} y + \cdots + \binom{m}{m} y^m\right] \left[\binom{n}{0} + \binom{n}{1} y + \cdots + \binom{n}{n} y^n\right]$$

is the sum of the products that we obtain by multiplying the constant term of the first factor by the coefficient of $y^k$ in the second factor, the coefficient of y in the first factor by the coefficient of $y^{k-1}$ in the second factor, . . . , and the coefficient of $y^k$ in the first factor by the constant term of the second factor. Thus, the coefficient of $y^k$ in $(1 + y)^m (1 + y)^n$ is

$$\binom{m}{0}\binom{n}{k} + \binom{m}{1}\binom{n}{k-1} + \binom{m}{2}\binom{n}{k-2} + \cdots + \binom{m}{k}\binom{n}{0} = \sum_{r=0}^{k} \binom{m}{r} \binom{n}{k-r}$$

and this completes the proof.


EXAMPLE 22

Verify Theorem 12 numerically for m = 2, n = 3, and k = 4.

Solution

Substituting these values, we get

$$\binom{2}{0}\binom{3}{4} + \binom{2}{1}\binom{3}{3} + \binom{2}{2}\binom{3}{2} + \binom{2}{3}\binom{3}{1} + \binom{2}{4}\binom{3}{0} = \binom{5}{4}$$

and since $\binom{2}{3}$, $\binom{2}{4}$, and $\binom{3}{4}$ equal 0 according to the definition on the previous page, the equation reduces to

$$\binom{2}{1}\binom{3}{3} + \binom{2}{2}\binom{3}{2} = \binom{5}{4}$$

which checks, since 2 · 1 + 1 · 3 = 5.
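The verification in Example 22 is also easy to do mechanically, since Python's `math.comb` already returns 0 when r exceeds n, matching the convention adopted before Theorem 12 (a sketch added for illustration):

```python
import math

# Verify Theorem 12 for m = 2, n = 3, k = 4.  math.comb returns 0
# whenever r exceeds n, matching the convention used in the text.
m, n, k = 2, 3, 4
lhs = sum(math.comb(m, r) * math.comb(n, k - r) for r in range(k + 1))
print(lhs)                  # 5
print(math.comb(m + n, k))  # 5
```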

Using Theorem 8, we can extend our discussion to multinomial coefficients, that is, to the coefficients that arise in the expansion of $(x_1 + x_2 + \cdots + x_k)^n$. The multinomial coefficient of the term $x_1^{r_1} \cdot x_2^{r_2} \cdot \ldots \cdot x_k^{r_k}$ in the expansion of $(x_1 + x_2 + \cdots + x_k)^n$ is

$$\binom{n}{r_1, r_2, \ldots, r_k} = \frac{n!}{r_1! \cdot r_2! \cdot \ldots \cdot r_k!}$$

EXAMPLE 23

What is the coefficient of $x_1^3 x_2 x_3^2$ in the expansion of $(x_1 + x_2 + x_3)^6$?

Solution

Substituting n = 6, $r_1$ = 3, $r_2$ = 1, and $r_3$ = 2 into the preceding formula, we get

$$\frac{6!}{3! \cdot 1! \cdot 2!} = 60$$
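The multinomial coefficient of Example 23 can also be confirmed by brute-force expansion; the Python sketch below (an addition, not from the text) picks one variable from each of the six factors and counts the choices that give the desired term:

```python
import math
from collections import Counter
from itertools import product

# Expand (x1 + x2 + x3)^6 by brute force: choose one variable from each
# of the six factors and count the choices that give x1^3 * x2 * x3^2.
target = Counter({1: 3, 2: 1, 3: 2})
count = sum(1 for choice in product((1, 2, 3), repeat=6)
            if Counter(choice) == target)
print(count)  # 60

# Closed form: 6! / (3! * 1! * 2!)
print(math.factorial(6) // (math.factorial(3) * math.factorial(1) * math.factorial(2)))
```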

Exercises

1. An operation consists of two steps, of which the first

can be made in n1 ways. If the first step is made in the ith

way, the second step can be made in n2i ways.†

(a) Use a tree diagram to find a formula for the total number of ways in which the total operation can be made.

(b) A student can study 0, 1, 2, or 3 hours for a history

test on any given day. Use the formula obtained in part

(a) to verify that there are 13 ways in which the student

can study at most 4 hours for the test on two consecutive

days.

2. With reference to Exercise 1, verify that if n2i equals

the constant n2 , the formula obtained in part (a) reduces

to that of Theorem 1.

3. With reference to Exercise 1, suppose that there is a

third step, and if the first step is made in the ith way and

the second step in the jth way, the third step can be made

in n3ij ways.

(a) Use a tree diagram to verify that the whole operation

can be made in

† The first subscript denotes the row to which a particular element belongs, and the second subscript denotes the column.


$$\sum_{i=1}^{n_1} \sum_{j=1}^{n_{2i}} n_{3ij}$$

different ways.

(b) With reference to part (b) of Exercise 1, use the formula of part (a) to verify that there are 32 ways in which

the student can study at most 4 hours for the test on three

consecutive days.

4. Show that if n2i equals the constant n2 and n3ij equals

the constant n3 , the formula of part (a) of Exercise 3

reduces to that of Theorem 2.

5. In a two-team basketball play-off, the winner is the first team to win m games.
(a) Counting separately the number of play-offs requiring m, m + 1, . . . , and 2m − 1 games, show that the total number of different outcomes (sequences of wins and losses by one of the teams) is

2 \left[ \binom{m-1}{m-1} + \binom{m}{m-1} + \cdots + \binom{2m-2}{m-1} \right]

(b) How many different outcomes are there in a “2 out of 3” play-off, a “3 out of 5” play-off, and a “4 out of 7” play-off?
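Part (b) can be checked numerically against the formula of part (a). The sketch below (the function name is ours) sums the binomial coefficients directly:

```python
from math import comb

def playoff_outcomes(m):
    # total win/loss sequences in a first-to-m-wins play-off:
    # 2 * [C(m-1, m-1) + C(m, m-1) + ... + C(2m-2, m-1)]
    return 2 * sum(comb(k, m - 1) for k in range(m - 1, 2 * m - 1))

print([playoff_outcomes(m) for m in (2, 3, 4)])  # [6, 20, 70]
```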

6. When n is large, n! can be approximated by means of the expression

\sqrt{2\pi n} \left( \frac{n}{e} \right)^n

called Stirling’s formula, where e is the base of natural logarithms. (A derivation of this formula may be found in the book by W. Feller cited among the references at the end of this chapter.)
(a) Use Stirling’s formula to obtain approximations for 10! and 12!, and find the percentage errors of these approximations by comparing them with the exact values given in the table Factorials and Binomial Coefficients of “Statistical Tables.”
(b) Use Stirling’s formula to obtain an approximation for the number of 13-card bridge hands that can be dealt with an ordinary deck of 52 playing cards.
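Part (a) is easy to explore without the table. A minimal sketch of Stirling's approximation and its relative error:

```python
from math import factorial, sqrt, pi, e

def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * (n/e)**n."""
    return sqrt(2 * pi * n) * (n / e) ** n

for n in (10, 12):
    exact, approx = factorial(n), stirling(n)
    # percentage error; both turn out to be under 1 percent
    print(n, 100 * (exact - approx) / exact)
```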

7. Using Stirling’s formula (see Exercise 6) to approximate (2n)! and n!, show that

\binom{2n}{n} \approx \frac{2^{2n}}{\sqrt{\pi n}}

8. In some problems of occupancy theory we are concerned with the number of ways in which certain distinguishable objects can be distributed among individuals,

urns, boxes, or cells. Find an expression for the number of

ways in which r distinguishable objects can be distributed

among n cells, and use it to find the number of ways in

which three different books can be distributed among the

12 students in an English literature class.

9. In some problems of occupancy theory we are concerned with the number of ways in which certain indistinguishable objects can be distributed among individuals,

urns, boxes, or cells. Find an expression for the number

of ways in which r indistinguishable objects can be distributed among n cells, and use it to find the number of

ways in which a baker can sell five (indistinguishable)

loaves of bread to three customers. (Hint: We might argue

that L|LLL|L represents the case where the three customers buy one loaf, three loaves, and one loaf, respectively, and that LLLL||L represents the case where the

three customers buy four loaves, none of the loaves, and

one loaf. Thus, we must look for the number of ways

in which we can arrange the five L’s and the two vertical bars.)

10. In some problems of occupancy theory we are concerned with the number of ways in which certain indistinguishable objects can be distributed among individuals,

urns, boxes, or cells with at least one in each cell. Find

an expression for the number of ways in which r indistinguishable objects can be distributed among n cells with

at least one in each cell, and rework the numerical part

of Exercise 9 with each of the three customers getting at

least one loaf of bread.
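The three occupancy counts asked for in Exercises 8–10 have standard closed forms (n^r for distinguishable objects, C(r+n−1, r) for indistinguishable ones, and C(r−1, n−1) with at least one per cell). A quick numerical check of the three worked examples:

```python
from math import comb

print(12 ** 3)        # 3 distinguishable books among 12 students: 1728 ways
print(comb(5 + 3 - 1, 5))  # 5 indistinguishable loaves to 3 customers: C(7, 5) = 21
print(comb(5 - 1, 3 - 1))  # same, but at least one loaf each: C(4, 2) = 6
```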

11. Construct the seventh and eighth rows of Pascal’s triangle and write the binomial expansions of (x + y)6 and

(x + y)7 .
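The rows asked for in Exercise 11 can be generated by repeated pairwise addition (the function name is ours; row 1 is taken to be the single entry 1, matching the numbering in the answers):

```python
def pascal_row(k):
    """k-th row of Pascal's triangle (row 1 is [1])."""
    row = [1]
    for _ in range(k - 1):
        # each interior entry is the sum of the two entries above it
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row

print(pascal_row(7))  # [1, 6, 15, 20, 15, 6, 1], the coefficients of (x + y)^6
print(pascal_row(8))  # [1, 7, 21, 35, 35, 21, 7, 1], the coefficients of (x + y)^7
```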

12. Prove Theorem 11 by expressing all the binomial

coefficients in terms of factorials and then simplifying

algebraically.

13. Expressing the binomial coefficients in terms of factorials and simplifying algebraically, show that

(a) \binom{n}{r} = \frac{n-r+1}{r} \cdot \binom{n}{r-1};

(b) \binom{n}{r} = \frac{n}{n-r} \cdot \binom{n-1}{r};

(c) n \binom{n-1}{r} = (r+1) \binom{n}{r+1}.

14. Substituting appropriate values for x and y into the formula of Theorem 9, show that

(a) \sum_{r=0}^{n} \binom{n}{r} = 2^n;

(b) \sum_{r=0}^{n} (-1)^r \binom{n}{r} = 0;

(c) \sum_{r=0}^{n} \binom{n}{r} (a-1)^r = a^n.

15. Repeatedly applying Theorem 11, show that

\binom{n}{r} = \sum_{i=1}^{r+1} \binom{n-i}{r-i+1}

16. Use Theorem 12 to show that

\sum_{r=0}^{n} \binom{n}{r}^2 = \binom{2n}{n}

17. Show that \sum_{r=0}^{n} r \binom{n}{r} = n 2^{n-1} by setting x = 1 in Theorem 9, then differentiating the expressions on both sides with respect to y, and finally substituting y = 1.

18. Rework Exercise 17 by making use of part (a) of Exercise 14 and part (c) of Exercise 13.

19. If n is not a positive integer or zero, the binomial expansion of (1 + y)^n yields, for −1 < y < 1, the infinite series

1 + \binom{n}{1} y + \binom{n}{2} y^2 + \binom{n}{3} y^3 + \cdots + \binom{n}{r} y^r + \cdots

where

\binom{n}{r} = \frac{n(n-1) \cdot \ldots \cdot (n-r+1)}{r!} \quad \text{for } r = 1, 2, 3, \ldots

Use this generalized definition of binomial coefficients to evaluate

(a) \binom{1/2}{4} and \binom{-3}{3};

(b) \sqrt{5}, writing \sqrt{5} = 2(1 + \frac{1}{4})^{1/2} and using the first four terms of the binomial expansion of (1 + \frac{1}{4})^{1/2}.

20. With reference to the generalized definition of binomial coefficients in Exercise 19, show that

(a) \binom{-1}{r} = (-1)^r;

(b) \binom{-n}{r} = (-1)^r \binom{n+r-1}{r} for n > 0.

21. Find the coefficient of x^2 y^3 z^3 in the expansion of (x + y + z)^8.

22. Find the coefficient of x^3 y^2 z^3 w in the expansion of (2x + 3y − 4z + w)^9.

23. Show that

\binom{n}{n_1, n_2, \ldots, n_k} = \binom{n-1}{n_1 - 1, n_2, \ldots, n_k} + \binom{n-1}{n_1, n_2 - 1, \ldots, n_k} + \cdots + \binom{n-1}{n_1, n_2, \ldots, n_k - 1}

by expressing all these multinomial coefficients in terms of factorials and simplifying algebraically.
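Two of these exercises are easy to sanity-check numerically. The sketch below verifies the identity of Exercise 16 for one value of n and evaluates the generalized binomial coefficients of Exercise 19(a) exactly with `fractions.Fraction` (the helper name `gen_binom` is ours):

```python
from fractions import Fraction
from math import comb, factorial

# Exercise 16: the sum of squared binomial coefficients equals C(2n, n)
n = 6
assert sum(comb(n, r) ** 2 for r in range(n + 1)) == comb(2 * n, n)

def gen_binom(n, r):
    """Generalized binomial coefficient n(n-1)...(n-r+1)/r! for arbitrary n."""
    num = Fraction(1)
    for i in range(r):
        num *= (n - i)
    return num / factorial(r)

print(gen_binom(Fraction(1, 2), 4))  # -5/128, i.e. -15/384
print(gen_binom(-3, 3))              # -10
```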
4 The Theory in Practice
Applications of the preceding theory of combinatorial methods and binomial coefficients are quite straightforward, and a variety of them have been given in Sections 2 and 3. The following examples illustrate further applications of this theory.

EXAMPLE 24
An assembler of electronic equipment has 20 integrated-circuit chips on her table, and she must solder three of them as part of a larger component. In how many ways can she choose the three chips for assembly?

Solution
Using Theorem 6, we obtain the result

20P3 = 20!/17! = 20 · 19 · 18 = 6,840

EXAMPLE 25
A lot of manufactured goods, presented for sampling inspection, contains 16 units. In how many ways can 4 of the 16 units be selected for inspection?

Solution
According to Theorem 7,

\binom{16}{4} = \frac{16!}{4! \, 12!} = \frac{16 \cdot 15 \cdot 14 \cdot 13}{4 \cdot 3 \cdot 2 \cdot 1} = 1{,}820 \text{ ways}
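Both counts can be reproduced with the standard library (`math.perm` and `math.comb`, available from Python 3.8):

```python
from math import perm, comb

print(perm(20, 3))  # ordered choices of 3 chips from 20: 20!/17! = 6840
print(comb(16, 4))  # unordered samples of 4 units from 16: 16!/(4! 12!) = 1820
```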
Applied Exercises
24. A thermostat will call for heat 0, 1, or 2 times a night.
Construct a tree diagram to show that there are 10 different ways that it can turn on the furnace for a total of 6
times over 4 nights.
25. On August 31 there are five wild-card teams in the American League that can make it to the play-offs, and only two will win spots. Draw a tree diagram which shows the various possible play-off wild-card teams.
26. There are four routes, A, B, C, and D, between a person’s home and the place where he works, but route B
is one-way, so he cannot take it on the way to work, and
route C is one-way, so he cannot take it on the way home.
(a) Draw a tree diagram showing the various ways the
person can go to and from work.
(b) Draw a tree diagram showing the various ways he
can go to and from work without taking the same route
both ways.
27. A person with $2 in her pocket bets $1, even money,
on the flip of a coin, and she continues to bet $1 as long
as she has any money. Draw a tree diagram to show the
various things that can happen during the first four flips
of the coin. After the fourth flip of the coin, in how many
of the cases will she be
(a) exactly even;
(b) exactly $2 ahead?
28. The pro at a golf course stocks two identical sets of
women’s clubs, reordering at the end of each day (for
delivery early the next morning) if and only if he has sold
them both. Construct a tree diagram to show that if he
starts on a Monday with two sets of the clubs, there are
altogether eight different ways in which he can make sales
on the first two days of that week.
29. Suppose that in a baseball World Series (in which the
winner is the first team to win four games) the National
League champion leads the American League champion
three games to two. Construct a tree diagram to show the
number of ways in which these teams may win or lose the
remaining game or games.
30. If the NCAA has applications from six universities
for hosting its intercollegiate tennis championships in two
consecutive years, in how many ways can they select the
hosts for these championships
(a) if they are not both to be held at the same university;
(b) if they may both be held at the same university?
31. Counting the number of outcomes in games of chance
has been a popular pastime for many centuries. This was
of interest not only because of the gambling that was
involved, but also because the outcomes of games of
chance were often interpreted as divine intent. Thus, it
was just about a thousand years ago that a bishop in what
is now Belgium determined that there are 56 different
ways in which three dice can fall provided one is interested only in the overall result and not in which die does
what. He assigned a virtue to each of these possibilities
and each sinner had to concentrate for some time on the
virtue that corresponded to his cast of the dice.
(a) Find the number of ways in which three dice can all
come up with the same number of points.
(b) Find the number of ways in which two of the three
dice can come up with the same number of points, while
the third comes up with a different number of points.
(c) Find the number of ways in which all three of the dice
can come up with a different number of points.
(d) Use the results of parts (a), (b), and (c) to verify
the bishop’s calculations that there are altogether 56
possibilities.
32. In a primary election, there are four candidates for
mayor, five candidates for city treasurer, and two candidates for county attorney.
(a) In how many ways can a voter mark his ballot for all
three of these offices?
(b) In how many ways can a person vote if he exercises
his option of not voting for a candidate for any or all of
these offices?
33. The five finalists in the Miss Universe contest are Miss
Argentina, Miss Belgium, Miss U.S.A., Miss Japan, and
Miss Norway. In how many ways can the judges choose
(a) the winner and the first runner-up;
(b) the winner, the first runner-up, and the second
runner-up?
34. A multiple-choice test consists of 15 questions, each
permitting a choice of three alternatives. In how many different ways can a student check off her answers to these
questions?
35. Determine the number of ways in which a distributor
can choose 2 of 15 warehouses to ship a large order.
36. The price of a European tour includes four stopovers
to be selected from among 10 cities. In how many different ways can one plan such a tour
(a) if the order of the stopovers matters;
(b) if the order of the stopovers does not matter?
37. A carton of 15 light bulbs contains one that is defective. In how many ways can an inspector choose 3 of the
bulbs and
(a) get the one that is defective;
(b) not get the one that is defective?
38. In how many ways can a television director schedule a sponsor’s six different commercials during the six
time slots allocated to commercials during a two-hour
program?
39. In how many ways can the television director of Exercise 38 fill the six time slots for commercials if there are
three different sponsors and the commercial for each is to
be shown twice?
40. In how many ways can five persons line up to get on
a bus? In how many ways can they line up if two of the
persons refuse to follow each other?
41. In how many ways can eight persons form a circle for
a folk dance?
42. How many permutations are there of the letters in the
word
(a) “great”;
(b) “greet”?
43. How many distinct permutations are there of the letters in the word “statistics”? How many of these begin
and end with the letter s?
44. A college team plays 10 football games during a season. In how many ways can it end the season with five
wins, four losses, and one tie?
45. If eight persons are having dinner together, in how
many different ways can three order chicken, four order
steak, and one order lobster?
46. In Example 4 we showed that a true–false test consisting of 20 questions can be marked in 1,048,576 different
ways. In how many ways can each question be marked
true or false so that
(a) 7 are right and 13 are wrong;
(b) 10 are right and 10 are wrong;
(c) at least 17 are right?
47. Among the seven nominees for two vacancies on a
city council are three men and four women. In how many
ways can these vacancies be filled
(a) with any two of the seven nominees;
(b) with any two of the four women;
(c) with one of the men and one of the women?
48. A shipment of 10 television sets includes three that
are defective. In how many ways can a hotel purchase
four of these sets and receive at least two of the defective
sets?
49. Ms. Jones has four skirts, seven blouses, and three
sweaters. In how many ways can she choose two of the
skirts, three of the blouses, and one of the sweaters to take
along on a trip?
50. How many different bridge hands are possible containing five spades, three diamonds, three clubs, and two
hearts?
51. Find the number of ways in which one A, three B’s,
two C’s, and one F can be distributed among seven students taking a course in statistics.
52. An art collector, who owns 10 paintings by famous
artists, is preparing her will. In how many different ways
can she leave these paintings to her three heirs?
53. A baseball fan has a pair of tickets for six different
home games of the Chicago Cubs. If he has five friends
who like baseball, in how many different ways can he take
one of them along to each of the six games?
54. At the end of the day, a bakery gives everything that
is unsold to food banks for the needy. If it has 12 apple
pies left at the end of a given day, in how many different
ways can it distribute these pies among six food banks for
the needy?
55. With reference to Exercise 54, in how many different ways can the bakery distribute the 12 apple pies
if each of the six food banks is to receive at least
one pie?
56. On a Friday morning, the pro shop of a tennis club
has 14 identical cans of tennis balls. If they are all sold
by Sunday night and we are interested only in how many
were sold on each day, in how many different ways could
the tennis balls have been sold on Friday, Saturday, and
Sunday?
57. Rework Exercise 56 given that at least two of the cans
of tennis balls were sold on each of the three days.
References
Among the books on the history of statistics there are
Walker, H. M., Studies in the History of Statistical
Method. Baltimore: The Williams & Wilkins Company,
1929,
Westergaard, H., Contributions to the History of Statistics. London: P. S. King & Son, 1932,
and the more recent publications
Kendall, M. G., and Plackett, R. L., eds., Studies in the
History of Statistics and Probability, Vol. II. New York:
Macmillan Publishing Co., Inc., 1977,
Pearson, E. S., and Kendall, M. G., eds., Studies in the
History of Statistics and Probability. Darien, Conn.:
Hafner Publishing Co., Inc., 1970,
Porter, T. M., The Rise of Statistical Thinking, 1820–
1900. Princeton, N.J.: Princeton University Press, 1986,
and
Stigler, S. M., The History of Statistics. Cambridge,
Mass.: Harvard University Press, 1986.
A wealth of material on combinatorial methods can be
found in
Cohen, D. A., Basic Techniques of Combinatorial Theory. New York: John Wiley & Sons, Inc., 1978,
Eisen, M., Elementary Combinatorial Analysis. New
York: Gordon and Breach, Science Publishers, Inc.,
1970,
Feller, W., An Introduction to Probability Theory and Its
Applications, Vol. I, 3rd ed. New York: John Wiley &
Sons, Inc., 1968,
Niven, I., Mathematics of Choice. New York: Random
House, Inc., 1965,
Roberts, F. S., Applied Combinatorics. Upper Saddle
River, N.J.: Prentice Hall, 1984,
and
Whitworth, W. A., Choice and Chance, 5th ed. New
York: Hafner Publishing Co., Inc., 1959, which has
become a classic in this field.
More advanced treatments may be found in
Beckenbach, E. F., ed., Applied Combinatorial Mathematics. New York: John Wiley & Sons, Inc., 1964,
David, F. N., and Barton, D. E., Combinatorial
Chance. New York: Hafner Publishing Co., Inc., 1962,
and
Riordan, J., An Introduction to Combinatorial Analysis.
New York: John Wiley & Sons, Inc., 1958.
Answers to Odd-Numbered Exercises
1 (a) \sum_{i=1}^{n_1} n_{2i}.
5 (b) 6, 20, and 70.
9 \binom{r+n-1}{r} and 21.
11 Seventh row: 1, 6, 15, 20, 15, 6, 1; Eighth row: 1, 7, 21, 35, 35, 21, 7, 1.
19 (a) −15/384 and −10; (b) 2.236.
21 560.
27 (a) 5; (b) 4.
31 (a) 6; (b) 30; (c) 20; (d) 56.
33 (a) 20; (b) 60.
35 105.
37 (a) 91; (b) 364.
39 90.
41 5040.
43 50,400 and 3360.
45 280.
47 (a) 21; (b) 6; (c) 12.
49 630.
51 420.
53 15,625.
55 462.
57 45.
Probability

1 Introduction
2 Sample Spaces
3 Events
4 The Probability of an Event
5 Some Rules of Probability
6 Conditional Probability
7 Independent Events
8 Bayes’ Theorem
9 The Theory in Practice

1 Introduction
Historically, the oldest way of defining probabilities, the classical probability concept, applies when all possible outcomes are equally likely, as is presumably the case in most games of chance. We can then say that if there are N equally likely possibilities, of which one must occur and n are regarded as favorable, or as a “success,” then the probability of a “success” is given by the ratio n/N.
EXAMPLE 1
What is the probability of drawing an ace from an ordinary deck of 52 playing cards?
Solution
Since there are n = 4 aces among the N = 52 cards, the probability of drawing an ace is 4/52 = 1/13. (It is assumed, of course, that each card has the same chance of being drawn.)
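Classical probabilities of this kind are ratios of counts, so exact rational arithmetic shows the reduction directly:

```python
from fractions import Fraction

n, N = 4, 52        # favorable outcomes (aces) and equally likely outcomes (cards)
p = Fraction(n, N)  # Fraction reduces automatically
print(p)            # 1/13
```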
Although equally likely possibilities are found mostly in games of chance, the
classical probability concept applies also in a great variety of situations where gambling devices are used to make random selections—when office space is assigned to
teaching assistants by lot, when some of the families in a township are chosen in such
a way that each one has the same chance of being included in a sample study, when
machine parts are chosen for inspection so that each part produced has the same
chance of being selected, and so forth.
A major shortcoming of the classical probability concept is its limited applicability, for there are many situations in which the possibilities that arise cannot all
be regarded as equally likely. This would be the case, for instance, if we are concerned with the question whether it will rain on a given day, if we are concerned
with the outcome of an election, or if we are concerned with a person’s recovery
from a disease.
Among the various probability concepts, most widely held is the frequency interpretation, according to which the probability of an event (outcome or happening) is
the proportion of the time that events of the same kind will occur in the long run.
If we say that the probability is 0.84 that a jet from Los Angeles to San Francisco
will arrive on time, we mean (in accordance with the frequency interpretation) that
such flights arrive on time 84 percent of the time. Similarly, if the weather bureau
From Chapter 2 of John E. Freund’s Mathematical Statistics with Applications,
Eighth Edition. Irwin Miller, Marylees Miller. Copyright 2014 by Pearson Education, Inc.
All rights reserved.
predicts that there is a 30 percent chance for rain (that is, a probability of 0.30), this
means that under the same weather conditions it will rain 30 percent of the time.
More generally, we say that an event has a probability of, say, 0.90, in the same sense
in which we might say that our car will start in cold weather 90 percent of the time.
We cannot guarantee what will happen on any particular occasion—the car may start
and then it may not—but if we kept records over a long period of time, we should
find that the proportion of “successes” is very close to 0.90.
The approach to probability that we shall use in this chapter is the axiomatic
approach, in which probabilities are defined as “mathematical objects” that behave
according to certain well-defined rules. Then, any one of the preceding probability
concepts, or interpretations, can be used in applications as long as it is consistent
with these rules.
2 Sample Spaces
Since all probabilities pertain to the occurrence or nonoccurrence of events, let us
explain first what we mean here by event and by the related terms experiment, outcome, and sample space.
It is customary in statistics to refer to any process of observation or measurement as an experiment. In this sense, an experiment may consist of the simple process of checking whether a switch is turned on or off; it may consist of counting the
imperfections in a piece of cloth; or it may consist of the very complicated process
of determining the mass of an electron. The results one obtains from an experiment, whether they are instrument readings, counts, “yes” or “no” answers, or values
obtained through extensive calculations, are called the outcomes of the experiment.
DEFINITION 1. SAMPLE SPACE. The set of all possible outcomes of an experiment is
called the sample space and it is usually denoted by the letter S. Each outcome
in a sample space is called an element of the sample space, or simply a sample
point.
If a sample space has a finite number of elements, we may list the elements in
the usual set notation; for instance, the sample space for the possible outcomes of
one flip of a coin may be written
S = {H, T}
where H and T stand for head and tail. Sample spaces with a large or infinite number
of elements are best described by a statement or rule; for example, if the possible
outcomes of an experiment are the set of automobiles equipped with satellite radios,
the sample space may be written
S = {x|x is an automobile with a satellite radio}
This is read “S is the set of all x such that x is an automobile with a satellite radio.”
Similarly, if S is the set of odd positive integers, we write
S = {2k + 1|k = 0, 1, 2, . . .}
How we formulate the sample space for a given situation will depend on the
problem at hand. If an experiment consists of one roll of a die and we are interested
in which face is turned up, we would use the sample space
S1 = {1, 2, 3, 4, 5, 6}
However, if we are interested only in whether the face turned up is even or odd, we
would use the sample space
S2 = {even, odd}
This demonstrates that different sample spaces may well be used to describe an
experiment. In general, it is desirable to use sample spaces whose elements cannot
be divided (partitioned or separated) into more primitive or more elementary kinds
of outcomes. In other words, it is preferable that an element of a sample space not
represent two or more outcomes that are distinguishable in some way. Thus, in the
preceding illustration S1 would be preferable to S2 .
EXAMPLE 2
Describe a sample space that might be appropriate for an experiment in which we
roll a pair of dice, one red and one green. (The different colors are used to emphasize
that the dice are distinct from one another.)
Solution
The sample space that provides the most information consists of the 36 points given by
S1 = {(x, y)|x = 1, 2, . . . , 6; y = 1, 2, . . . , 6}
where x represents the number turned up by the red die and y represents the number
turned up by the green die. A second sample space, adequate for most purposes
(though less desirable in general as it provides less information), is given by
S2 = {2, 3, 4, . . . , 12}
where the elements are the totals of the numbers turned up by the two dice.
Sample spaces are usually classified according to the number of elements that
they contain. In the preceding example the sample spaces S1 and S2 contained a
finite number of elements; but if a coin is flipped until a head appears for the first
time, this could happen on the first flip, the second flip, the third flip, the fourth flip,
. . ., and there are infinitely many possibilities. For this experiment we obtain the
sample space
S = {H, TH, TTH, TTTH, TTTTH, . . .}
with an unending sequence of elements. But even here the number of elements can
be matched one-to-one with the whole numbers, and in this sense the sample space
is said to be countable. If a sample space contains a finite number of elements or an
infinite though countable number of elements, it is said to be discrete.
The outcomes of some experiments are neither finite nor countably infinite. Such
is the case, for example, when one conducts an investigation to determine the distance that a certain make of car will travel over a prescribed test course on 5 liters
of gasoline. If we assume that distance is a variable that can be measured to any
desired degree of accuracy, there is an infinity of possibilities (distances) that cannot be matched one-to-one with the whole numbers. Also, if we want to measure
the amount of time it takes for two chemicals to react, the amounts making up the
sample space are infinite in number and not countable. Thus, sample spaces need
not be discrete. If a sample space consists of a continuum, such as all the points of
a line segment or all the points in a plane, it is said to be continuous. Continuous
sample spaces arise in practice whenever the outcomes of experiments are measurements of physical properties, such as temperature, speed, pressure, length, . . ., that
are measured on continuous scales.
3 Events
In many problems we are interested in results that are not given directly by a specific
element of a sample space.
EXAMPLE 3
With reference to the first sample space S1 on the previous page, describe the event
A that the number of points rolled with the die is divisible by 3.
Solution
Among 1, 2, 3, 4, 5, and 6, only 3 and 6 are divisible by 3. Therefore, A is represented
by the subset {3, 6} of the sample space S1 .
EXAMPLE 4
With reference to the sample space S1 of Example 2, describe the event B that the
total number of points rolled with the pair of dice is 7.
Solution
Among the 36 possibilities, only (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), and (6, 1) yield
a total of 7. So, we write
B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
Note that in Figure 1 the event of rolling a total of 7 with the two dice is represented
by the set of points inside the region bounded by the dotted line.
[Figure 1. Rolling a total of 7 with a pair of dice: the 36 points (red die, green die), with the six points whose coordinates total 7 enclosed by a dotted line.]
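The event B of Example 4 can be enumerated by brute force over the 36-point sample space:

```python
# the sample space S1 of Example 2 and the event B of Example 4
S1 = [(x, y) for x in range(1, 7) for y in range(1, 7)]
B = [(x, y) for (x, y) in S1 if x + y == 7]
print(len(S1), len(B))  # 36 6
```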
In the same way, any event (outcome or result) can be identified with a collection
of points, which constitute a subset of an appropriate sample space. Such a subset
consists of all the elements of the sample space for which the event occurs, and in
probability and statistics we identify the subset with the event.
DEFINITION 2. EVENT. An event is a subset of a sample space.
EXAMPLE 5
If someone takes three shots at a target and we care only whether each shot is a hit
or a miss, describe a suitable sample space, the elements of the sample space that
constitute event M that the person will miss the target three times in a row, and the
elements of event N that the person will hit the target once and miss it twice.
Solution
If we let 0 and 1 represent a miss and a hit, respectively, the eight possibilities (0, 0, 0),
(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), and (1, 1, 1) may be displayed
as in Figure 2. Thus, it can be seen that
M = {(0, 0, 0)}
and
N = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
[Figure 2. Sample space for Example 5: the eight outcomes (first shot, second shot, third shot) displayed as the vertices of a cube.]
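The eight outcomes and the events M and N of Example 5 can be generated rather than listed by hand:

```python
from itertools import product

S = list(product((0, 1), repeat=3))   # 1 = hit, 0 = miss; 8 outcomes in all
M = [s for s in S if sum(s) == 0]     # three misses in a row
N = [s for s in S if sum(s) == 1]     # exactly one hit, two misses
print(len(S), M, len(N))              # M is [(0, 0, 0)]; N has 3 outcomes
```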
EXAMPLE 6
Construct a sample space for the length of the useful life of a certain electronic
component and indicate the subset that represents the event F that the component
fails before the end of the sixth year.
Solution
If t is the length of the component’s useful life in years, the sample space may be written S = {t | t ≥ 0}, and the subset F = {t | 0 ≤ t < 6} is the event that the component fails before the end of the sixth year.
According to our definition, any event is a subset of an appropriate sample
space, but it should be observed that the converse is not necessarily true. For discrete sample spaces, all subsets are events, but in the continuous case some rather
abstruse point sets must be excluded for mathematical reasons. This is discussed further in some of the more advanced texts listed among the references at the end of
this chapter.
In many problems of probability we are interested in events that are actually
combinations of two or more events, formed by taking unions, intersections, and
complements. Although the reader must surely be familiar with these terms, let us
review briefly that, if A and B are any two subsets of a sample space S, their union
A ∪ B is the subset of S that contains all the elements that are either in A, in B,
or in both; their intersection A ∩ B is the subset of S that contains all the elements
that are in both A and B; and the complement A′ of A is the subset of S that contains all the elements of S that are not in A. Some of the rules that control the
formation of unions, intersections, and complements may be found in Exercises 1
through 4.
Sample spaces and events, particularly relationships among events, are often
depicted by means of Venn diagrams, in which the sample space is represented by
a rectangle, while events are represented by regions within the rectangle, usually by
circles or parts of circles. For instance, the shaded regions of the four Venn diagrams
of Figure 3 represent, respectively, event A, the complement of event A, the union
of events A and B, and the intersection of events A and B. When we are dealing
with three events, we usually draw the circles as in Figure 4. Here, the regions are
numbered 1 through 8 for easy reference.
Figure 3. Venn diagrams.
Figure 4. Venn diagram.
Figure 5. Diagrams showing special relationships among events.
To indicate special relationships among events, we sometimes draw diagrams
like those of Figure 5. Here, the one on the left serves to indicate that events A and
B are mutually exclusive.
DEFINITION 3. MUTUALLY EXCLUSIVE EVENTS. Two events having no elements in common are said to be mutually exclusive.
When A and B are mutually exclusive, we write A ∩ B = ∅, where ∅ denotes
the empty set, which has no elements at all. The diagram on the right serves to
indicate that A is contained in B, and symbolically we express this by writing A ⊂ B.
Exercises
1. Use Venn diagrams to verify that
(a) (A ∪ B) ∪ C is the same event as A ∪ (B ∪ C);
(b) A ∩ (B ∪ C) is the same event as (A ∩ B) ∪ (A ∩ C);
(c) A ∪ (B ∩ C) is the same event as (A ∪ B) ∩ (A ∪ C).
2. Use Venn diagrams to verify the two De Morgan laws:
(a) (A ∩ B)′ = A′ ∪ B′;
(b) (A ∪ B)′ = A′ ∩ B′.
3. Use Venn diagrams to verify that
(a) (A ∩ B) ∪ (A ∩ B′) = A;
(b) (A ∩ B) ∪ (A ∩ B′) ∪ (A′ ∩ B) = A ∪ B;
(c) A ∪ (A′ ∩ B) = A ∪ B.

4. Use Venn diagrams to verify that if A is contained in B, then A ∩ B = A and A ∩ B′ = ∅.
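Venn-diagram identities like these can also be spot-checked with Python sets on a small concrete sample space (the particular sets A and B below are our choice, not the text's):

```python
S = set(range(1, 9))
A, B = {1, 2, 3, 4}, {3, 4, 5, 6}

def comp(X):
    """Complement of X relative to the sample space S."""
    return S - X

# the two De Morgan laws of Exercise 2
assert comp(A & B) == comp(A) | comp(B)
assert comp(A | B) == comp(A) & comp(B)
# part (c) of Exercise 3
assert A | (comp(A) & B) == A | B
print("all identities hold")
```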
4 The Probability of an Event
To formulate the postulates of probability, we shall follow the practice of denoting
events by means of capital letters, and we shall write the probability of event A as
P(A), the probability of event B as P(B), and so forth. The following postulates of
probability apply only to discrete sample spaces, S.
POSTULATE 1 The probability of an event is a nonnegative real number; that is, P(A) ≥ 0 for any subset A of S.
POSTULATE 2 P(S) = 1.
POSTULATE 3 If A1, A2, A3, . . . is a finite or infinite sequence of mutually exclusive events of S, then

P(A1 ∪ A2 ∪ A3 ∪ · · ·) = P(A1) + P(A2) + P(A3) + · · ·
Postulates per se require no proof, but if the resulting theory is to be applied,
we must show that the postulates are satisfied when we give probabilities a “real”
meaning. Let us illustrate this in connection with the frequency interpretation; the
relationship between the postulates and the classical probability concept will be
discussed below, while the relationship between the postulates and subjective probabilities is left for the reader to examine in Exercises 16 and 82.
Since proportions are always positive or zero, the first postulate is in complete
agreement with the frequency interpretation. The second postulate states indirectly
that certainty is identified with a probability of 1; after all, it is always assumed that
one of the possibilities in S must occur, and it is to this certain event that we assign
a probability of 1. As far as the frequency interpretation is concerned, a probability
of 1 implies that the event in question will occur 100 percent of the time or, in other
words, that it is certain to occur.
Taking the third postulate in the simplest case, that is, for two mutually exclusive
events A1 and A2 , it can easily be seen that it is satisfied by the frequency interpretation. If one event occurs, say, 28 percent of the time, another event occurs 39 percent
of the time, and the two events cannot both occur at the same time (that is, they are
mutually exclusive), then one or the other will occur 28 + 39 = 67 percent of the
time. Thus, the third postulate is satisfied, and the same kind of argument applies
when there are more than two mutually exclusive events.
Before we study some of the immediate consequences of the postulates of probability, let us emphasize the point that the three postulates do not tell us how to
assign probabilities to events; they merely restrict the ways in which it can be done.
EXAMPLE 7
An experiment has four possible outcomes, A, B, C, and D, that are mutually exclusive. Explain why the following assignments of probabilities are not permissible:
(a) P(A) = 0.12, P(B) = 0.63, P(C) = 0.45, P(D) = −0.20;
(b) P(A) = 9/120, P(B) = 45/120, P(C) = 27/120, P(D) = 46/120.
Solution
(a) P(D) = −0.20 violates Postulate 1;
(b) P(S) = P(A ∪ B ∪ C ∪ D) = 9/120 + 45/120 + 27/120 + 46/120 = 127/120 ≠ 1,
and this violates Postulate 2.
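The two checks in this solution are easy to mechanize. The following Python sketch (the helper name `check` is ours, not the text's) tests an assignment against Postulates 1 and 2, using exact fractions so the totals are not blurred by rounding:

```python
from fractions import Fraction

def check(assignment):
    """Test a probability assignment for mutually exclusive, exhaustive
    outcomes against Postulate 1 (nonnegativity) and Postulate 2 (total 1)."""
    nonnegative = all(p >= 0 for p in assignment.values())
    total_is_one = sum(assignment.values()) == 1
    return nonnegative, total_is_one

# (a) P(D) = -0.20 is negative:
print(check({"A": Fraction("0.12"), "B": Fraction("0.63"),
             "C": Fraction("0.45"), "D": Fraction("-0.20")}))  # (False, True)

# (b) the probabilities sum to 127/120:
print(check({k: Fraction(v, 120) for k, v in zip("ABCD", (9, 45, 27, 46))}))
# (True, False)
```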
Of course, in actual practice probabilities are assigned on the basis of past experience, on the basis of a careful analysis of all underlying conditions, on the basis
of subjective judgments, or on the basis of assumptions—sometimes the assumption
that all possible outcomes are equiprobable.
To assign a probability measure to a sample space, it is not necessary to specify
the probability of each possible subset. This is fortunate, for a sample space with as
few as 20 possible outcomes already has 2^20 = 1,048,576 subsets, and the number
of subsets grows very rapidly when there are 50 possible outcomes, 100 possible
outcomes, or more. Instead of listing the probabilities of all possible subsets, we
often list the probabilities of the individual outcomes, or sample points of S, and
then make use of the following theorem.
THEOREM 1. If A is an event in a discrete sample space S, then P(A) equals
the sum of the probabilities of the individual outcomes comprising A.
Proof Let O1 , O2 , O3 , . . ., be the finite or infinite sequence of outcomes
that comprise the event A. Thus,
A = O1 ∪ O2 ∪ O3 ∪ · · ·
and since the individual outcomes, the O’s, are mutually exclusive, the
third postulate of probability yields
P(A) = P(O1 ) + P(O2 ) + P(O3 ) + · · ·
This completes the proof.
To use this theorem, we must be able to assign probabilities to the individual
outcomes of experiments. How this is done in some special situations is illustrated
by the following examples.
EXAMPLE 8
If we twice flip a balanced coin, what is the probability of getting at least one head?
Solution
The sample space is S = {HH, HT, TH, TT}, where H and T denote head and tail.
Since we assume that the coin is balanced, these outcomes are equally likely and we
assign to each sample point the probability 1/4. Letting A denote the event that we
will get at least one head, we get A = {HH, HT, TH} and
P(A) = P(HH) + P(HT) + P(TH) = 1/4 + 1/4 + 1/4 = 3/4
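Small finite sample spaces like this one can be enumerated directly. A Python sketch of Example 8 (variable names are ours):

```python
from fractions import Fraction
from itertools import product

# Enumerate the sample space for two flips of a balanced coin; each of the
# four sample points gets probability 1/4.
sample_space = ["".join(t) for t in product("HT", repeat=2)]
p = Fraction(1, len(sample_space))

# A: at least one head.
p_A = sum(p for outcome in sample_space if "H" in outcome)
print(p_A)  # 3/4
```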
EXAMPLE 9
A die is loaded in such a way that each odd number is twice as likely to occur as each
even number. Find P(G), where G is the event that a number greater than 3 occurs
on a single roll of the die.
Solution
The sample space is S = {1, 2, 3, 4, 5, 6}. Hence, if we assign probability w to each
even number and probability 2w to each odd number, we find that 2w + w + 2w +
w + 2w + w = 9w = 1 in accordance with Postulate 2. It follows that w = 1/9 and
P(G) = 1/9 + 2/9 + 1/9 = 4/9
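A quick check of Example 9 in Python, using exact fractions (the weighting scheme mirrors the text's w and 2w):

```python
from fractions import Fraction

# Loaded die: each odd face is twice as likely as each even face.
# Give weight 2 to odd faces and 1 to even faces, then normalize.
weights = {face: (2 if face % 2 == 1 else 1) for face in range(1, 7)}
total = sum(weights.values())  # 9
prob = {face: Fraction(w, total) for face, w in weights.items()}

# G: a number greater than 3 occurs.
p_G = sum(prob[face] for face in (4, 5, 6))
print(p_G)  # 4/9
```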
If a sample space is countably infinite, probabilities will have to be assigned to
the individual outcomes by means of a mathematical rule, preferably by means of a
formula or equation.
EXAMPLE 10
If, for a given experiment, O1, O2, O3, . . . is an infinite sequence of outcomes, verify that
P(Oi) = (1/2)^i for i = 1, 2, 3, . . .
is, indeed, a probability measure.
Solution
Since the probabilities are all positive, it remains to be shown that P(S) = 1. Getting
P(S) = 1/2 + 1/4 + 1/8 + 1/16 + · · ·
and making use of the formula for the sum of the terms of an infinite geometric
progression, we find that
P(S) = (1/2) / (1 − 1/2) = 1
In connection with the preceding example, the word “sum” in Theorem 1 will
have to be interpreted so that it includes the value of an infinite series.
The probability measure of Example 10 would be appropriate, for example, if
Oi is the event that a person flipping a balanced coin will get a tail for the first time
on the ith flip of the coin. Thus, the probability that the first tail will come on the
third, fourth, or fifth flip of the coin is
(1/2)^3 + (1/2)^4 + (1/2)^5 = 7/32
and the probability that the first tail will come on an odd-numbered flip of the coin is
(1/2)^1 + (1/2)^3 + (1/2)^5 + · · · = (1/2) / (1 − 1/4) = 2/3
Here again we made use of the formula for the sum of the terms of an infinite geometric progression.
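Both geometric-series results can be confirmed numerically. A Python sketch (using exact fractions; the partial sum for the odd-flip case only approximates the limit 2/3):

```python
from fractions import Fraction

# P(O_i) = (1/2)^i for i = 1, 2, 3, ...
def p(i):
    return Fraction(1, 2) ** i

# First tail on the third, fourth, or fifth flip:
print(p(3) + p(4) + p(5))  # 7/32

# First tail on an odd-numbered flip: the partial sum over i = 1, 3, ..., 99
# is already indistinguishable from the limit 2/3.
print(float(sum(p(i) for i in range(1, 100, 2))))
```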
If an experiment is such that we can assume equal probabilities for all the sample
points, as was the case in Example 8, we can take advantage of the following special
case of Theorem 1.
THEOREM 2. If an experiment can result in any one of N different equally
likely outcomes, and if n of these outcomes together constitute event A,
then the probability of event A is
P(A) = n/N
Proof Let O1, O2, . . . , ON represent the individual outcomes in S, each
with probability 1/N. If A is the union of n of these mutually exclusive
outcomes, and it does not matter which ones, then
P(A) = P(O1 ∪ O2 ∪ · · · ∪ On)
= P(O1) + P(O2) + · · · + P(On)
= 1/N + 1/N + · · · + 1/N   (n terms)
= n/N
Observe that the formula P(A) = n/N of Theorem 2 is identical with the one for
the classical probability concept (see below). Indeed, what we have shown here is
that the classical probability concept is consistent with the postulates of
probability—it follows from the postulates in the special case where the individual
outcomes are all equiprobable.
EXAMPLE 11
A five-card poker hand dealt from a deck of 52 playing cards is said to be a full house
if it consists of three of a kind and a pair. If all the five-card hands are equally likely,
what is the probability of being dealt a full house?
Solution
The number of ways in which we can be dealt a particular full house, say three kings
and two aces, is (4 choose 3)(4 choose 2). Since there are 13 ways of selecting the face value for the
three of a kind and for each of these there are 12 ways of selecting the face value for
the pair, there are altogether
n = 13 · 12 · (4 choose 3)(4 choose 2)
different full houses. Also, the total number of equally likely five-card poker
hands is
N = (52 choose 5)
and it follows by Theorem 2 that the probability of getting a full house is
P(A) = n/N = [13 · 12 · (4 choose 3)(4 choose 2)] / (52 choose 5) ≈ 0.0014
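The counting in Example 11 maps directly onto `math.comb`. A Python sketch:

```python
from math import comb

# Face values: 13 choices for the three of a kind, then 12 for the pair;
# comb(4, 3) and comb(4, 2) count the suit selections.
n = 13 * 12 * comb(4, 3) * comb(4, 2)   # 3744 full houses
N = comb(52, 5)                          # 2,598,960 five-card hands
print(n, N, round(n / N, 4))  # 3744 2598960 0.0014
```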
5 Some Rules of Probability
Based on the three postulates of probability, we can derive many other rules that
have important applications. Among them, the next four theorems are immediate
consequences of the postulates.
THEOREM 3. If A and A′ are complementary events in a sample space S, then
P(A′) = 1 − P(A)
Proof In the second and third steps of the proof that follows, we make
use of the definition of a complement, according to which A and A′ are
mutually exclusive and A ∪ A′ = S. Thus, we write
1 = P(S)   (by Postulate 2)
= P(A ∪ A′)
= P(A) + P(A′)   (by Postulate 3)
and it follows that P(A′) = 1 − P(A).
In connection with the frequency interpretation, this result implies that if an
event occurs, say, 37 percent of the time, then it does not occur 63 percent of
the time.
THEOREM 4. P(∅) = 0 for any sample space S.
Proof Since S and ∅ are mutually exclusive and S ∪ ∅ = S in accordance
with the definition of the empty set ∅, it follows that
P(S) = P(S ∪ ∅)
= P(S) + P(∅)
(by Postulate 3)
and, hence, that P(∅) = 0.
It is important to note that it does not necessarily follow from P(A) = 0 that
A = ∅. In practice, we often assign 0 probability to events that, in colloquial terms,
would not happen in a million years. For instance, there is the classical example that
we assign a probability of 0 to the event that a monkey set loose on a typewriter will
type Plato’s Republic word for word without a mistake. The fact that P(A) = 0 does
not imply that A = ∅ is of relevance, especially, in the continuous case.
THEOREM 5. If A and B are events in a sample space S and A ⊂ B, then
P(A) ≤ P(B).
Proof Since A ⊂ B, we can write
B = A ∪ (A′ ∩ B)
as can easily be verified by means of a Venn diagram. Then, since A and
A′ ∩ B are mutually exclusive, we get
P(B) = P(A) + P(A′ ∩ B)   (by Postulate 3)
≥ P(A)   (by Postulate 1)
In words, this theorem states that if A is a subset of B, then P(A) cannot be
greater than P(B). For instance, the probability of drawing a heart from an ordinary
deck of 52 playing cards cannot be greater than the probability of drawing a red card.
Indeed, the probability of drawing a heart is 1/4, compared with 1/2, the probability of
drawing a red card.
THEOREM 6. 0 ≤ P(A) ≤ 1 for any event A.
Proof Using Theorem 5 and the fact that ∅ ⊂ A ⊂ S for any event A in S,
we have
P(∅) ≤ P(A) ≤ P(S)
Then, P(∅) = 0 and P(S) = 1 leads to the result that
0 ≤ P(A) ≤ 1
The third postulate of probability is sometimes referred to as the special addition rule; it is special in the sense that events A1 , A2 , A3 , . . ., must all be mutually
exclusive. For any two events A and B, there exists the general addition rule, or the
inclusion–exclusion principle:
THEOREM 7. If A and B are any two events in a sample space S, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Proof Assigning the probabilities a, b, and c to the mutually exclusive
events A ∩ B, A ∩ B′, and A′ ∩ B as in the Venn diagram of Figure 6, we
find that
P(A ∪ B) = a + b + c
= (a + b) + (c + a) − a
= P(A) + P(B) − P(A ∩ B)
Figure 6. Venn diagram for proof of Theorem 7.
EXAMPLE 12
In a large metropolitan area, the probabilities are 0.86, 0.35, and 0.29, respectively,
that a family (randomly chosen for a sample survey) owns a color television set, an
HDTV set, or both kinds of sets. What is the probability that a family owns either or
both kinds of sets?
Solution
If A is the event that a family in this metropolitan area owns a color television set
and B is the event that it owns an HDTV set, we have P(A) = 0.86, P(B) = 0.35, and
P(A ∩ B) = 0.29; substitution into the formula of Theorem 7 yields
P(A ∪ B) = 0.86 + 0.35 − 0.29
= 0.92
EXAMPLE 13
Near a certain exit of I-17, the probabilities are 0.23 and 0.24, respectively, that
a truck stopped at a roadblock will have faulty brakes or badly worn tires. Also,
the probability is 0.38 that a truck stopped at the roadblock will have faulty brakes
and/or badly worn tires. What is the probability that a truck stopped at this roadblock
will have faulty brakes as well as badly worn tires?
Solution
If B is the event that a truck stopped at the roadblock will have faulty brakes and T
is the event that it will have badly worn tires, we have P(B) = 0.23, P(T) = 0.24, and
P(B ∪ T) = 0.38; substitution into the formula of Theorem 7 yields
0.38 = 0.23 + 0.24 − P(B ∩ T)
Solving for P(B ∩ T), we thus get
P(B ∩ T) = 0.23 + 0.24 − 0.38 = 0.09
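Theorem 7 turns Examples 12 and 13 into one-line computations. A Python sketch (function and variable names are ours):

```python
# General addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
def p_union(p_a, p_b, p_both):
    return p_a + p_b - p_both

# Example 12 (television sets):
print(round(p_union(0.86, 0.35, 0.29), 2))  # 0.92

# Example 13, rearranged to solve for P(B ∩ T):
print(round(0.23 + 0.24 - 0.38, 2))  # 0.09
```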
Repeatedly using the formula of Theorem 7, we can generalize this addition rule
so that it will apply to any number of events. For instance, for three events we obtain
the following theorem.
THEOREM 8. If A, B, and C are any three events in a sample space S, then
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C)
− P(B ∩ C) + P(A ∩ B ∩ C)
Proof Writing A ∪ B ∪ C as A ∪ (B ∪ C) and using the formula of Theorem 7 twice, once for P[A ∪ (B ∪ C)] and once for P(B ∪ C), we get
P(A ∪ B ∪ C) = P[A ∪ (B ∪ C)]
= P(A) + P(B ∪ C) − P[A ∩ (B ∪ C)]
= P(A) + P(B) + P(C) − P(B ∩ C)
− P[A ∩ (B ∪ C)]
Then, using the distributive law that the reader was asked to verify in part
(b) of Exercise 1, we find that
P[A ∩ (B ∪ C)] = P[(A ∩ B) ∪ (A ∩ C)]
= P(A ∩ B) + P(A ∩ C) − P[(A ∩ B) ∩ (A ∩ C)]
= P(A ∩ B) + P(A ∩ C) − P(A ∩ B ∩ C)
and hence that
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C)
− P(B ∩ C) + P(A ∩ B ∩ C)
(In Exercise 12 the reader will be asked to give an alternative proof of this theorem, based on the method used in the text to prove Theorem 7.)
EXAMPLE 14
If a person visits his dentist, suppose that the probability that he will have his teeth
cleaned is 0.44, the probability that he will have a cavity filled is 0.24, the probability
that he will have a tooth extracted is 0.21, the probability that he will have his teeth
cleaned and a cavity filled is 0.08, the probability that he will have his teeth cleaned
and a tooth extracted is 0.11, the probability that he will have a cavity filled and
a tooth extracted is 0.07, and the probability that he will have his teeth cleaned,
a cavity filled, and a tooth extracted is 0.03. What is the probability that a person
visiting his dentist will have at least one of these things done to him?
Solution
If C is the event that the person will have his teeth cleaned, F is the event that he
will have a cavity filled, and E is the event that he will have a tooth extracted, we
are given P(C) = 0.44, P(F) = 0.24, P(E) = 0.21, P(C ∩ F) = 0.08, P(C ∩ E) = 0.11,
P(F ∩ E) = 0.07, and P(C ∩ F ∩ E) = 0.03, and substitution into the formula of
Theorem 8 yields
P(C ∪ F ∪ E) = 0.44 + 0.24 + 0.21 − 0.08 − 0.11 − 0.07 + 0.03
= 0.66
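The same bookkeeping for three events, applied to Example 14 (a Python sketch; the helper name is ours):

```python
# Inclusion-exclusion for three events (Theorem 8).
def p_union3(pa, pb, pc, pab, pac, pbc, pabc):
    return pa + pb + pc - pab - pac - pbc + pabc

# Example 14 (cleaning C, filling F, extraction E):
print(round(p_union3(0.44, 0.24, 0.21, 0.08, 0.11, 0.07, 0.03), 2))  # 0.66
```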
Exercises
5. Use parts (a) and (b) of Exercise 3 to show that
(a) P(A) ≥ P(A ∩ B);
(b) P(A) ≤ P(A ∪ B).
6. Referring to Figure 6, verify that
P(A ∩ B′) = P(A) − P(A ∩ B)
7. Referring to Figure 6 and letting P(A′ ∩ B′) = d, verify that
P(A′ ∩ B′) = 1 − P(A) − P(B) + P(A ∩ B)
8. The event that “A or B but not both” will occur can be written as
(A ∩ B′) ∪ (A′ ∩ B)
Express the probability of this event in terms of P(A), P(B), and P(A ∩ B).
9. Use the formula of Theorem 7 to show that
(a) P(A ∩ B) ≤ P(A) + P(B);
(b) P(A ∩ B) ≥ P(A) + P(B) − 1.
10. Use the Venn diagram of Figure 7 with the probabilities a, b, c, d, e, f, and g assigned to A ∩ B ∩ C, A ∩ B ∩ C′, . . . , and A ∩ B′ ∩ C′ to show that if P(A) = P(B) = P(C) = 1, then P(A ∩ B ∩ C) = 1. [Hint: Start with the argument that since P(A) = 1, it follows that e = c = f = 0.]
11. Give an alternative proof of Theorem 7 by making use of the relationships A ∪ B = A ∪ (A′ ∩ B) and B = (A ∩ B) ∪ (A′ ∩ B).
12. Use the Venn diagram of Figure 7 and the method by which we proved Theorem 7 to prove Theorem 8.
13. Duplicate the method of proof used in Exercise 12 to show that
P(A ∪ B ∪ C ∪ D) = P(A) + P(B) + P(C) + P(D)
− P(A ∩ B) − P(A ∩ C) − P(A ∩ D) − P(B ∩ C) − P(B ∩ D) − P(C ∩ D)
+ P(A ∩ B ∩ C) + P(A ∩ B ∩ D) + P(A ∩ C ∩ D) + P(B ∩ C ∩ D)
− P(A ∩ B ∩ C ∩ D)
(Hint: With reference to the Venn diagram of Figure 7, divide each of the eight regions into two parts, designating one to be inside D and the other outside D and letting a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, and p be the probabilities associated with the resulting 16 regions.)
14. Prove by induction that
P(E1 ∪ E2 ∪ · · · ∪ En) ≤ P(E1) + P(E2) + · · · + P(En)
for any finite sequence of events E1, E2, . . . , and En.
15. The odds that an event will occur are given by the ratio of the probability that the event will occur to the probability that it will not occur, provided neither probability is zero. Odds are usually quoted in terms of positive integers having no common factor. Show that if the odds are A to B that an event will occur, its probability is
p = A/(A + B)
16. Subjective probabilities may be determined by exposing persons to risk-taking situations and finding the odds at which they would consider it fair to bet on the outcome. The odds are then converted into probabilities by means of the formula of Exercise 15. For instance, if a person feels that 3 to 2 are fair odds that a business venture will succeed (or that it would be fair to bet $30 against $20 that it will succeed), the probability is 3/(3 + 2) = 0.6 that the business venture will succeed. Show that if subjective probabilities are determined in this way, they satisfy
(a) Postulate 1;
(b) Postulate 2.
See also Exercise 82.
Figure 7. Venn diagram for Exercises 10, 12, and 13.
6 Conditional Probability
Difficulties can easily arise when probabilities are quoted without specification of
the sample space. For instance, if we ask for the probability that a lawyer makes
more than $75,000 per year, we may well get several different answers, and they may
all be correct. One of them might apply to all those who are engaged in the private
practice of law, another might apply to lawyers employed by corporations, and so
forth. Since the choice of the sample space (that is, the set of all possibilities under
consideration) is by no means always self-evident, it often helps to use the symbol
P(A|S) to denote the conditional probability of event A relative to the sample space
S or, as we also call it, “the probability of A given S.” The symbol P(A|S) makes it
explicit that we are referring to a particular sample space S, and it is preferable to
the abbreviated notation P(A) unless the tacit choice of S is clearly understood. It is
also preferable when we want to refer to several sample spaces in the same example.
If A is the event that a person makes more than $75,000 per year, G is the event that
a person is a law school graduate, L is the event that a person is licensed to practice
law, and E is the event that a person is actively engaged in the practice of law, then
P(A|G) is the probability that a law school graduate makes more than $75,000 per
year, P(A|L) is the probability that a person licensed to practice law makes more
than $75,000 per year, and P(A|E) is the probability that a person actively engaged
in the practice of law makes more than $75,000 per year.
Some ideas connected with conditional probabilities are illustrated in the following example.
EXAMPLE 15
A consumer research organization has studied the services under warranty provided
by the 50 new-car dealers in a certain city, and its findings are summarized in the
following table.
                                 Good service      Poor service
                                 under warranty    under warranty
In business 10 years or more          16                 4
In business less than 10 years        10                20
If a person randomly selects one of these new-car dealers, what is the probability that
he gets one who provides good service under warranty? Also, if a person randomly
selects one of the dealers who has been in business for 10 years or more, what is the
probability that he gets one who provides good service under warranty?
Solution
By “randomly” we mean that, in each case, all possible selections are equally likely,
and we can therefore use the formula of Theorem 2. If we let G denote the selection
of a dealer who provides good service under warranty, and if we let n(G) denote
the number of elements in G and n(S) the number of elements in the whole sample
space, we get
P(G) = n(G)/n(S) = (16 + 10)/50 = 0.52
This answers the first question.
For the second question, we limit ourselves to the reduced sample space, which
consists of the first line of the table, that is, the 16 + 4 = 20 dealers who have been
in business 10 years or more. Of these, 16 provide good service under warranty, and
we get
P(G|T) = 16/20 = 0.80
where T denotes the selection of a dealer who has been in business 10 years or
more. This answers the second question and, as should have been expected, P(G|T)
is considerably higher than P(G).
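Both answers of Example 15 come straight from the counts in the table. A Python sketch (the dictionary encoding of the table is ours):

```python
from fractions import Fraction

# Counts from the dealer table, keyed by (years in business, service quality).
counts = {("10+", "good"): 16, ("10+", "poor"): 4,
          ("<10", "good"): 10, ("<10", "poor"): 20}
total = sum(counts.values())  # 50 dealers

p_G = Fraction(counts[("10+", "good")] + counts[("<10", "good")], total)
p_T = Fraction(counts[("10+", "good")] + counts[("10+", "poor")], total)
p_T_and_G = Fraction(counts[("10+", "good")], total)

print(float(p_G))              # 0.52
print(float(p_T_and_G / p_T))  # 0.8
```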
Since the numerator of P(G|T) is n(T ∩ G) = 16 in the preceding example, the
number of dealers who have been in business for 10 years or more and provide good
service under warranty, and the denominator is n(T), the number of dealers who
have been in business 10 years or more, we can write symbolically
P(G|T) = n(T ∩ G)/n(T)
Then, if we divide the numerator and the denominator by n(S), the total number of
new-car dealers in the given city, we get
P(G|T) = [n(T ∩ G)/n(S)] / [n(T)/n(S)] = P(T ∩ G)/P(T)
and we have, thus, expressed the conditional probability P(G|T) in terms of two
probabilities defined for the whole sample space S.
Generalizing from the preceding, let us now make the following definition of
conditional probability.
DEFINITION 4. CONDITIONAL PROBABILITY. If A and B are any two events in a sample
space S and P(A) ≠ 0, the conditional probability of B given A is
P(B|A) = P(A ∩ B)/P(A)
EXAMPLE 16
With reference to Example 15, what is the probability that one of the dealers who
has been in business less than 10 years will provide good service under warranty?
Solution
Since P(T′ ∩ G) = 10/50 = 0.20 and P(T′) = (10 + 20)/50 = 0.60, substitution into the
formula yields
P(G|T′) = P(T′ ∩ G)/P(T′) = 0.20/0.60 = 1/3
Although we introduced the formula for P(B|A) by means of an example in
which the possibilities were all equally likely, this is not a requirement for its use.
EXAMPLE 17
With reference to the loaded die of Example 9, what is the probability that the number of points rolled is a perfect square? Also, what is the probability that it is a
perfect square given that it is greater than 3?
Solution
If A is the event that the number of points rolled is greater than 3 and B is the event
that it is a perfect square, we have A = {4, 5, 6}, B = {1, 4}, and A ∩ B = {4}. Since
the probabilities of rolling a 1, 2, 3, 4, 5, or 6 with the die are 2/9, 1/9, 2/9, 1/9, 2/9, and 1/9, we
find that the answer to the first question is
P(B) = 2/9 + 1/9 = 1/3
To determine P(B|A), we first calculate
P(A ∩ B) = 1/9   and   P(A) = 1/9 + 2/9 + 1/9 = 4/9
Then, substituting into the formula of Definition 4, we get
P(B|A) = P(A ∩ B)/P(A) = (1/9)/(4/9) = 1/4
To verify that the formula of Definition 4 has yielded the “right” answer in the
preceding example, we have only to assign probability v to the two even numbers
in the reduced sample space A and probability 2v to the odd number, such that the
sum of the three probabilities is equal to 1. We thus have v + 2v + v = 4v = 1, v = 1/4, and,
hence, P(B|A) = 1/4 as before.
EXAMPLE 18
A manufacturer of airplane parts knows from past experience that the probability
is 0.80 that an order will be ready for shipment on time, and it is 0.72 that an order
will be ready for shipment on time and will also be delivered on time. What is the
probability that such an order will be delivered on time given that it was ready for
shipment on time?
Solution
If we let R stand for the event that an order is ready for shipment on time and D be
the event that it is delivered on time, we have P(R) = 0.80 and P(R ∩ D) = 0.72, and
it follows that
P(D|R) = P(R ∩ D)/P(R) = 0.72/0.80 = 0.90
Thus, 90 percent of the shipments will be delivered on time provided they are shipped
on time. Note that P(R|D), the probability that a shipment that is delivered on time
was also ready for shipment on time, cannot be determined without further information; for this purpose we would also have to know P(D).
If we multiply the expressions on both sides of the formula of Definition 4 by
P(A), we obtain the following multiplication rule.
THEOREM 9. If A and B are any two events in a sample space S and P(A) ≠ 0,
then
P(A ∩ B) = P(A) · P(B|A)
In words, the probability that A and B will both occur is the product of the probability of A and the conditional probability of B given A. Alternatively, if P(B) Z 0, the
probability that A and B will both occur is the product of the probability of B and
the conditional probability of A given B; symbolically,
P(A ∩ B) = P(B) · P(A|B)
To derive this alternative multiplication rule, we interchange A and B in the formula
of Theorem 9 and make use of the fact that A ∩ B = B ∩ A.
EXAMPLE 19
If we randomly pick two television sets in succession from a shipment of 240 television sets of which 15 are defective, what is the probability that they will both
be defective?
Solution
If we assume equal probabilities for each selection (which is what we mean by “randomly”
picking the sets), the probability that the first set will be defective is 15/240, and
the probability that the second set will be defective given that the first set is defective
is 14/239. Thus, the probability that both sets will be defective is 15/240 · 14/239 = 7/1,912.
This assumes that we are sampling without replacement; that is, the first set is not
replaced before the second set is selected.
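Exact fractions avoid any rounding in Example 19. A Python sketch:

```python
from fractions import Fraction

# Multiplication rule, sampling without replacement:
p_first_defective = Fraction(15, 240)
p_second_given_first = Fraction(14, 239)
print(p_first_defective * p_second_given_first)  # 7/1912
```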
EXAMPLE 20
Find the probabilities of randomly drawing two aces in succession from an ordinary
deck of 52 playing cards if we sample
(a) without replacement;
(b) with replacement.
Solution
(a) If the first card is not replaced before the second card is drawn, the probability
of getting two aces in succession is
4/52 · 3/51 = 1/221
(b) If the first card is replaced before the second card is drawn, the corresponding
probability is
4/52 · 4/52 = 1/169
In the situations described in the two preceding examples there is a definite
temporal order between the two events A and B. In general, this need not be the
case when we write P(A|B) or P(B|A). For instance, we could ask for the probability
that the first card drawn was an ace given that the second card drawn (without
replacement) is an ace—the answer would also be 3/51.
Theorem 9 can easily be generalized so that it applies to more than two events;
for instance, for three events we have the following theorem.
THEOREM 10. If A, B, and C are any three events in a sample space S such
that P(A ∩ B) ≠ 0, then
P(A ∩ B ∩ C) = P(A) · P(B|A) · P(C|A ∩ B)
Proof Writing A ∩ B ∩ C as (A ∩ B) ∩ C and using the formula of Theorem 9 twice, we get
P(A ∩ B ∩ C) = P[(A ∩ B) ∩ C]
= P(A ∩ B) · P(C|A ∩ B)
= P(A) · P(B|A) · P(C|A ∩ B)
EXAMPLE 21
A box of fuses contains 20 fuses, of which 5 are defective. If 3 of the fuses are selected
at random and removed from the box in succession without replacement, what is the
probability that all 3 fuses are defective?
Solution
If A is the event that the first fuse is defective, B is the event that the second fuse
is defective, and C is the event that the third fuse is defective, then P(A) = 5/20,
P(B|A) = 4/19, P(C|A ∩ B) = 3/18, and substitution into the formula yields
P(A ∩ B ∩ C) = 5/20 · 4/19 · 3/18 = 1/114
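The chain of conditional probabilities in Example 21 multiplies out exactly. A Python sketch:

```python
from fractions import Fraction

# Theorem 10: P(A ∩ B ∩ C) = P(A) · P(B|A) · P(C|A ∩ B).
p_all_defective = Fraction(5, 20) * Fraction(4, 19) * Fraction(3, 18)
print(p_all_defective)  # 1/114
```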
Further generalization of Theorems 9 and 10 to k events is straightforward, and
the resulting formula can be proved by mathematical induction.
7 Independent Events
Informally speaking, two events A and B are independent if the occurrence or nonoccurrence of either one does not affect the probability of the occurrence of the other.
For instance, in the preceding example the selections would all have been independent
had each fuse been replaced before the next one was selected; the probability
of getting a defective fuse would have remained 5/20.
Symbolically, two events A and B are independent if P(B|A) = P(B) and
P(A|B) = P(A), and it can be shown that either of these equalities implies the other
when both of the conditional probabilities exist, that is, when neither P(A) nor P(B)
equals zero (see Exercise 21).
Now, if we substitute P(B) for P(B|A) into the formula of Theorem 9, we get
P(A ∩ B) = P(A) · P(B|A)
= P(A) · P(B)
and we shall use this as our formal definition of independence.
DEFINITION 5. INDEPENDENCE. Two events A and B are independent if and only if
P(A ∩ B) = P(A) · P(B)
Reversing the steps, we can also show that Definition 5 implies the definition of independence that we gave earlier.
If two events are not independent, they are said to be dependent. In the derivation of the formula of Definition 5, we assume that P(B|A) exists and, hence, that
P(A) ≠ 0. For mathematical convenience, we shall let the definition apply also when
P(A) = 0 and/or P(B) = 0.
EXAMPLE 22
A coin is tossed three times and the eight possible outcomes, HHH, HHT, HTH,
THH, HTT, THT, TTH, and TTT, are assumed to be equally likely. If A is the event
that a head occurs on each of the first two tosses, B is the event that a tail occurs
on the third toss, and C is the event that exactly two tails occur in the three tosses,
show that
(a) events A and B are independent;
(b) events B and C are dependent.
Solution
Since
A = {HHH, HHT}
B = {HHT, HTT, THT, TTT}
C = {HTT, THT, TTH}
A ∩ B = {HHT}
B ∩ C = {HTT, THT}
the assumption that the eight possible outcomes are all equiprobable yields
P(A) = 1/4, P(B) = 1/2, P(C) = 3/8, P(A ∩ B) = 1/8, and P(B ∩ C) = 1/4.
(a) Since P(A) · P(B) = 1/4 · 1/2 = 1/8 = P(A ∩ B), events A and B are independent.
(b) Since P(B) · P(C) = 1/2 · 3/8 = 3/16 ≠ P(B ∩ C), events B and C are not independent.
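Independence claims like those of Example 22 can be verified by brute-force enumeration. A Python sketch (the event predicates mirror the definitions of A, B, and C):

```python
from itertools import product
from fractions import Fraction

# Enumerate the eight equally likely outcomes of three coin tosses.
outcomes = ["".join(t) for t in product("HT", repeat=3)]
p = Fraction(1, len(outcomes))          # 1/8 for each outcome

def prob(event):
    return sum(p for o in outcomes if event(o))

A = lambda o: o[:2] == "HH"             # head on each of the first two tosses
B = lambda o: o[2] == "T"               # tail on the third toss
C = lambda o: o.count("T") == 2         # exactly two tails

print(prob(lambda o: A(o) and B(o)) == prob(A) * prob(B))  # True
print(prob(lambda o: B(o) and C(o)) == prob(B) * prob(C))  # False
```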
In connection with Definition 5, it can be shown that if A and B are independent,
then so are A and B′, A′ and B, and A′ and B′. For instance, consider the following
theorem.
THEOREM 11. If A and B are independent, then A and B′ are also independent.
Proof Since A = (A ∩ B) ∪ (A ∩ B′), as the reader was asked to show in
part (a) of Exercise 3, A ∩ B and A ∩ B′ are mutually exclusive, and A and
B are independent by assumption, we have
P(A) = P[(A ∩ B) ∪ (A ∩ B′)]
= P(A ∩ B) + P(A ∩ B′)
= P(A) · P(B) + P(A ∩ B′)
It follows that
P(A ∩ B′) = P(A) − P(A) · P(B)
= P(A) · [1 − P(B)]
= P(A) · P(B′)
and hence that A and B′ are independent.
In Exercises 22 and 23 the reader will be asked to show that if A and B are
independent, then A′ and B are independent and so are A′ and B′, and if A and B
are dependent, then A and B′ are dependent.
To extend the concept of independence to more than two events, let us make the
following definition.
DEFINITION 6. INDEPENDENCE OF MORE THAN TWO EVENTS. Events A1 , A2 , . . . , and
Ak are independent if and only if the probability of the intersections of any 2, 3,
. . . , or k of these events equals the product of their respective probabilities.
For three events A, B, and C, for example, independence requires that
P(A ∩ B) = P(A) · P(B)
P(A ∩ C) = P(A) · P(C)
P(B ∩ C) = P(B) · P(C)
and
P(A ∩ B ∩ C) = P(A) · P(B) · P(C)
It is of interest to note that three or more events can be pairwise independent
without being independent.
EXAMPLE 23
Figure 8 shows a Venn diagram with probabilities assigned to its various regions.
Verify that A and B are independent, A and C are independent, and B and C are
independent, but A, B, and C are not independent.
Solution
As can be seen from the diagram, P(A) = P(B) = P(C) = 1/2, P(A ∩ B) =
P(A ∩ C) = P(B ∩ C) = 1/4, and P(A ∩ B ∩ C) = 1/4. Thus,
P(A) · P(B) = 1/4 = P(A ∩ B)
P(A) · P(C) = 1/4 = P(A ∩ C)
P(B) · P(C) = 1/4 = P(B ∩ C)
but
P(A) · P(B) · P(C) = 1/8 ≠ P(A ∩ B ∩ C)
Figure 8. Venn diagram for Example 23.
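The probabilities of Example 23 can be checked from the four regions carrying mass in Figure 8 (our reading of the diagram, consistent with the values in the solution: probability 1/4 on A ∩ B ∩ C and on each region where exactly one of the three events occurs):

```python
from fractions import Fraction

# Region probabilities, keyed by membership flags (in A, in B, in C).
q = Fraction(1, 4)
regions = {(1, 1, 1): q, (1, 0, 0): q, (0, 1, 0): q, (0, 0, 1): q}

def prob(pred):
    return sum(p for r, p in regions.items() if pred(r))

pA = prob(lambda r: r[0] == 1)
pB = prob(lambda r: r[1] == 1)
pAB = prob(lambda r: r[0] == 1 and r[1] == 1)
pABC = prob(lambda r: r == (1, 1, 1))

print(pAB == pA * pB)        # True: pairwise independence holds
print(pABC == pA * pA * pA)  # False: mutual independence fails
```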
Incidentally, the preceding example can be given a “real” interpretation by considering a large room that has three separate switches controlling the ceiling lights.
These lights will be on when all three switches are “up” and hence also when one
of the switches is “up” and the other two are “down.” If A is the event that the first
switch is “up,” B is the event that the second switch is “up,” and C is the event that
the third switch is “up,” the Venn diagram of Figure 8 shows a possible set of probabilities associated with the switches being “up” or “down” when the ceiling lights
are on.
It can also happen that P(A ∩ B ∩ C) = P(A) · P(B) · P(C) without A, B, and C
being pairwise independent—this the reader will be asked to verify in Exercise 24.
Of course, if we are given that certain events are independent, the probability
that they will all occur is simply the product of their respective probabilities.
EXAMPLE 24
Find the probabilities of getting
(a) three heads in three random tosses of a balanced coin;
(b) four sixes and then another number in five random rolls of a balanced die.
Solution
(a) The probability of a head on each toss is 1/2, and the three outcomes are independent. Thus we can multiply, obtaining
1/2 · 1/2 · 1/2 = 1/8
(b) The probability of a six on each roll is 1/6; thus the probability of rolling a number other than 6 is 5/6. Inasmuch as the rolls are independent, we can multiply the respective probabilities to obtain
1/6 · 1/6 · 1/6 · 1/6 · 5/6 = 5/7,776
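Both parts of Example 24 are direct products of probabilities of independent events, so they are easy to verify exactly with Python's `fractions` module (a sketch; the variable names are ours).

```python
from fractions import Fraction

# (a) Three heads in three tosses of a balanced coin.
p_head = Fraction(1, 2)
p_three_heads = p_head ** 3          # 1/2 · 1/2 · 1/2

# (b) Four sixes and then another number in five rolls of a balanced die.
p_six = Fraction(1, 6)
p_four_sixes_then_other = p_six ** 4 * (1 - p_six)

assert p_three_heads == Fraction(1, 8)
assert p_four_sixes_then_other == Fraction(5, 7776)
```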
8 Bayes’ Theorem
In many situations the outcome of an experiment depends on what happens in various intermediate stages. The following is a simple example in which there is one
intermediate stage consisting of two alternatives:
EXAMPLE 25
The completion of a construction job may be delayed because of a strike. The probabilities are 0.60 that there will be a strike, 0.85 that the construction job will be
completed on time if there is no strike, and 0.35 that the construction job will be
completed on time if there is a strike. What is the probability that the construction
job will be completed on time?
Solution
If A is the event that the construction job will be completed on time and B is the
event that there will be a strike, we are given P(B) = 0.60, P(A|B′) = 0.85, and
P(A|B) = 0.35. Making use of the formula of part (a) of Exercise 3, the fact that A ∩
B and A ∩ B′ are mutually exclusive, and the alternative form of the multiplication
rule, we can write
P(A) = P[(A ∩ B) ∪ (A ∩ B′)]
= P(A ∩ B) + P(A ∩ B′)
= P(B) · P(A|B) + P(B′) · P(A|B′)
Then, substituting the given numerical values, we get
P(A) = (0.60)(0.35) + (1 − 0.60)(0.85)
= 0.55
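The two-alternative computation of Example 25 can be sketched in a few lines (the variable names are ours):

```python
# Example 25: P(A) = P(B) · P(A|B) + P(B′) · P(A|B′)
p_strike = 0.60
p_on_time_given_strike = 0.35
p_on_time_given_no_strike = 0.85

p_on_time = (p_strike * p_on_time_given_strike
             + (1 - p_strike) * p_on_time_given_no_strike)
# 0.21 + 0.34 = 0.55
assert abs(p_on_time - 0.55) < 1e-9
```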
An immediate generalization of this kind of situation is the case where the
intermediate stage permits k different alternatives (whose occurrence is denoted by
B1 , B2 , . . . , Bk ). It requires the following theorem, sometimes called the rule of total
probability or the rule of elimination.
THEOREM 12. If the events B1 , B2 , . . . , and Bk constitute a partition of the sample space S and P(Bi ) ≠ 0 for i = 1, 2, . . . , k, then for any event A in S
P(A) = Σ_{i=1}^{k} P(Bi ) · P(A|Bi )
The B’s constitute a partition of the sample space if they are pairwise mutually exclusive and if their union equals S. A formal proof of Theorem 12 consists, essentially,
of the same steps we used in Example 25, and it is left to the reader in Exercise 32.
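The partition condition is easy to check for a finite sample space. The following is a sketch with a hypothetical helper `is_partition` (not from the text), which tests that the B's are pairwise mutually exclusive and that their union equals S:

```python
from itertools import combinations

def is_partition(blocks, S):
    """True if the events in `blocks` are pairwise disjoint and cover S."""
    disjoint = all(X & Y == set() for X, Y in combinations(blocks, 2))
    covers = set().union(*blocks) == set(S)
    return disjoint and covers

S = set(range(6))  # e.g., the six faces of a die
assert is_partition([{0, 1}, {2, 3, 4}, {5}], S)
assert not is_partition([{0, 1}, {1, 2, 3, 4, 5}], S)  # overlap at 1
```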
EXAMPLE 26
The members of a consulting firm rent cars from three rental agencies: 60 percent
from agency 1, 30 percent from agency 2, and 10 percent from agency 3. If 9 percent
of the cars from agency 1 need an oil change, 20 percent of the cars from agency 2
need an oil change, and 6 percent of the cars from agency 3 need an oil change, what
is the probability that a rental car delivered to the firm will need an oil change?
Solution
If A is the event that the car needs an oil change, and B1 , B2 , and B3 are the events that the car comes from agencies 1, 2, or 3, we have P(B1 ) = 0.60, P(B2 ) = 0.30, P(B3 ) = 0.10, P(A|B1 ) = 0.09, P(A|B2 ) = 0.20, and P(A|B3 ) = 0.06. Substituting these values into the formula of Theorem 12, we get
P(A) = (0.60)(0.09) + (0.30)(0.20) + (0.10)(0.06) = 0.12
Thus, 12 percent of the rental cars delivered to the firm will need an oil change.
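Theorem 12 translates directly into code. Here is a minimal sketch, applying a hypothetical helper `total_probability` to the rental-agency numbers of Example 26:

```python
def total_probability(priors, conditionals):
    """Rule of total probability: P(A) = Σ P(B_i) · P(A|B_i)."""
    return sum(p * c for p, c in zip(priors, conditionals))

p_agency = [0.60, 0.30, 0.10]      # P(B1), P(B2), P(B3)
p_oil_given = [0.09, 0.20, 0.06]   # P(A|B1), P(A|B2), P(A|B3)

p_oil = total_probability(p_agency, p_oil_given)
# 0.054 + 0.060 + 0.006 = 0.12
assert abs(p_oil - 0.12) < 1e-9
```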