STAT 200: Introduction to Statistics
Final Examination, Summer 2018 OL2
STAT 200
OL2 Sections
Final Exam
Summer 2018
The final exam will be posted at 12:01 am on July 27, 2018, and it
is due at 11:59 pm on July 29, 2018 Eastern Time.
This is an open-book exam. You may refer to your text and other course materials
for the current course as you work on the exam, and you may use a calculator,
applets, or Excel. You must complete the exam individually. Neither collaboration
nor consultation with others is allowed. It is a violation of the UMUC Academic
Dishonesty and Plagiarism policy to use unauthorized materials or work from
others.
Answer all 20 questions. Make sure your answers are as complete as possible,
particularly when a question asks you to show your work. Answers that come straight
from calculators, programs, or software packages without any explanation will not
be accepted. If you need to use technology (for example, Excel, online or handheld
calculators, or statistical packages) to aid in your calculations, you must cite the
sources and explain how you get the results. For example, state the Excel function
along with the required parameters when using Excel; describe the detailed steps
when using a handheld calculator; or provide the URL and detailed steps when
using an online calculator, and so on.
Record your answers and work on the separate answer sheet provided.
This exam has 20 problems, each worth 5%.
You must include the Honor Pledge on the title page of your submitted final exam.
Exams submitted without the Honor Pledge will not be accepted.
1. Research has suggested that breakfast is the most important meal of the day. A nutritionist randomly
selects 100 individuals and asks them: “Did you have breakfast this morning? Yes or no?”
(a) What is an appropriate method for graphing the data?
(b) Why is it appropriate?
2. A pet store owner is interested in the number of pets owned by her customers. She takes a random
sample of 100 customers and asks them: “How many pets do you own?”
(a) What is an appropriate method for graphing the data?
(b) Why is it appropriate?
3. Choose the best answer. Justify for full credit.
(a) The Knot.com surveyed nearly 13,000 couples who married in 2017 and asked how much they
spent on their wedding. The average amount of money spent was $33,391. The value $33,391
is a:
(i) parameter
(ii) statistic
(iii) cannot be determined from information provided.
(b) A marketing agent asked people to rank the quality of a new soap on a scale from
1 (poor) to 5 (excellent). The level of this measurement is
(i) nominal
(ii) ordinal
(iii) interval
(iv) ratio
4. A school district wanted to assess the effectiveness of a new reading readiness program for 1st
graders. The school district is divided into the individual first grade classrooms and 10
classrooms are randomly selected. All of the children in each of the 10 selected classrooms are
assessed.
(a) What type of sampling method is being used?
(b) Please explain your answer.
5. The frequency distribution below shows the distribution of average seasonal rainfall in San Francisco,
as measured in inches, for the years 1967–2017. (Show all work. Just the answer, without supporting
work, will receive no credit.)
Season Rainfall (in Inches)   Frequency   Cumulative Relative Frequency
0 – 9.99                      1
10 – 19.99                    22
20 – 29.99                                0.84
30 – 39.99                                0.98
40 – 40.99                                1.00
Total                         50
(a) Complete the frequency table with frequency and cumulative relative frequency. Express the
cumulative relative frequency to two decimal places.
(b) What percentage of seasons in this sample have a seasonal rainfall between 30 and 40.99 inches,
inclusive?
(c) Which of the following seasonal rainfall groups does the median of this distribution belong to?
10 – 19.99, 20 – 29.99, or 30 – 39.99? Why?
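The bookkeeping behind a table like this can be checked with a short script. The following is a minimal sketch in Python, assuming the printed cumulative relative frequencies are 0.84, 0.98, and 1.00 and that the fourth class is 30 – 39.99; the variable names are illustrative only.

```python
# Rebuild the rainfall frequency table from the values printed in the problem.
total = 50
class_labels = ["0-9.99", "10-19.99", "20-29.99", "30-39.99", "40-40.99"]

# Cumulative counts: the first two follow from the given frequencies (1 and 22);
# the rest are cumulative relative frequency * total (assumed 0.84, 0.98, 1.00).
cumulative_counts = [1, 1 + 22] + [round(r * total) for r in (0.84, 0.98, 1.00)]

# A class frequency is the difference between consecutive cumulative counts.
frequencies = [cumulative_counts[0]] + [
    later - earlier
    for earlier, later in zip(cumulative_counts, cumulative_counts[1:])
]
cumulative_relative = [round(c / total, 2) for c in cumulative_counts]

for label, freq, cum_rel in zip(class_labels, frequencies, cumulative_relative):
    print(f"{label:>9}  {freq:>3}  {cum_rel:.2f}")
```

The median class is the first class whose cumulative relative frequency reaches 0.50, which can be read directly from the last column.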
6. Consider selecting one card at a time from a 52-card deck. What is the probability that the first
card is a diamond and the second card is also a diamond? Express the probability in fraction
format. (Note: There are 13 diamonds in a deck of cards.) (Show all work. Just the answer, without
supporting work, will receive no credit.)
(a) Assuming the card selection is without replacement.
(b) Assuming the card selection is with replacement.
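Exact fractions for a two-draw problem like this one can be verified with the multiplication rule. The sketch below uses Python's fractions module; the variable names are illustrative.

```python
from fractions import Fraction

diamonds, deck = 13, 52

# Probability that the first card is a diamond.
first = Fraction(diamonds, deck)

# Without replacement: one diamond and one card are gone before the second draw.
without_replacement = first * Fraction(diamonds - 1, deck - 1)

# With replacement: the deck is restored, so the second draw matches the first.
with_replacement = first * Fraction(diamonds, deck)

print(without_replacement, with_replacement)
```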
7. Mimi has seven new summer outfits. She plans to pack three of the new summer outfits for her
trip to Tokyo.
(a) How many different ways can the three summer outfits be selected?
(b) Please describe the method used and the reason why it is appropriate for answering the
question.
Just the answer, without the description and reason, will receive no credit.
8. A businessman needs to visit clients in 5 different cities.
(a) How many different routes can he take?
(b) Please describe the method used and the reason why it is appropriate for answering the question.
Just the answer, without the description and reason, will receive no credit.
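Problems 7 and 8 differ in whether order matters. The sketch below illustrates the two counting rules in Python, under the assumptions (not stated in the problems) that the order in which outfits are packed is irrelevant, and that the businessman visits each city exactly once and may start in any city.

```python
import math

# Problem 7: choosing 3 of 7 outfits, order irrelevant, so a combination.
outfit_selections = math.comb(7, 3)

# Problem 8: ordering 5 distinct cities, order matters, so a permutation (5!).
city_routes = math.factorial(5)

print(outfit_selections, city_routes)
```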
9. Recent research suggests that car ownership may have peaked. The following probability
distribution table shows the random variable, x, where x is number of cars owned by household:
x    P(x)
0    0.10
1    0.19
2    0.45
3    0.22
4    0.04
(a) Determine the mean of x (Round the answer to two decimal places). Show all work. Answers
without supporting work will not receive credit.
(b) Determine the standard deviation of x. (Round the answer to two decimal places) Show all
work. Answers without supporting work will not receive credit.
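The mean and standard deviation of a discrete distribution follow from the definitions μ = Σ x·P(x) and σ = sqrt(Σ (x − μ)²·P(x)). The sketch below applies them in Python, assuming P(2) reads 0.45 so that the probabilities sum to 1; names are illustrative.

```python
import math

# Probability distribution for the number of cars per household.
distribution = {0: 0.10, 1: 0.19, 2: 0.45, 3: 0.22, 4: 0.04}

# A valid probability distribution must sum to 1.
assert abs(sum(distribution.values()) - 1.0) < 1e-9

# Mean: sum of x * P(x).
mean = sum(x * p for x, p in distribution.items())

# Variance: sum of (x - mean)^2 * P(x); the standard deviation is its square root.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
std_dev = math.sqrt(variance)

print(round(mean, 2), round(std_dev, 2))
```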
10. Max Scherzer, the starting pitcher for the Nationals, on average, has a 0.250 probability of hitting
the ball in a single “at bat”. In one game, he gets 6 “at bats.”
(a) Let X be the number of hits that Max gets. As we know, the distribution of X is a binomial
probability distribution. What are the number of trials (n), probability of success (p), and
probability of failure (q), respectively?
(b) Find the probability that he gets at least 4 hits in the one game. (Round the answer to 3
decimal places.) Show all work. Just the answer, without supporting work, will receive no
credit.
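An "at least" binomial probability is the sum of the individual probabilities P(X = k) = C(n, k)·p^k·q^(n−k) over the qualifying values of k. A minimal Python sketch with the values from this problem (the helper name binomial_pmf is illustrative):

```python
from math import comb

n, p = 6, 0.25   # trials and probability of a hit
q = 1 - p        # probability of no hit


def binomial_pmf(k: int) -> float:
    """P(X = k) for a binomial(n, p) random variable."""
    return comb(n, k) * p**k * q ** (n - k)


# P(X >= 4) = P(4) + P(5) + P(6)
p_at_least_4 = sum(binomial_pmf(k) for k in range(4, n + 1))
print(round(p_at_least_4, 3))
```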
11. Assume that gas mileage for cars is normally distributed with a mean of 23.5 miles
per gallon and a standard deviation of 10 miles per gallon. Show all work. Just the
answer, without supporting work, will receive no credit.
(a) What is the probability that a randomly selected car gets between 20 and 25 miles per gallon?
(Round the answer to 4 decimal places)
(b) Find the 80th percentile of the miles per gallon distribution. (Round the answer to 2 decimal
places)
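Normal probabilities and percentiles of this kind can be cross-checked against a table lookup with Python's statistics.NormalDist (available in Python 3.8 and later). The sketch below uses the mean and standard deviation given in the problem; variable names are illustrative.

```python
from statistics import NormalDist

mileage = NormalDist(mu=23.5, sigma=10)

# P(20 < X < 25) is the difference of the two cumulative probabilities.
prob_between = mileage.cdf(25) - mileage.cdf(20)

# The 80th percentile is the value with 80% of the distribution below it.
percentile_80 = mileage.inv_cdf(0.80)

print(round(prob_between, 4), round(percentile_80, 2))
```

Results from a printed z table may differ slightly in the last digit because table z-scores are rounded to two decimal places.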
12. Based on the performance of all individuals who tested between July 1, 2013 and June 30,
2016, the GRE Verbal Reasoning scores are normally distributed with a mean of 149.97 and a
standard deviation of 8.49 (https://www.ets.org/s/gre/pdf/gre_guide_table1a). Show all
work. Just the answer, without supporting work, will receive no credit.
(a) For a sample of size 64, state the standard deviation of the sample mean (the “standard error of
the mean”). (Round your answer to three decimal places)
(b) Suppose a sample of size 64 is taken. Find the probability that the sample mean GRE Verbal
Reasoning score is more than 152. (Round your answer to three decimal places)
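The sampling distribution of the mean has standard deviation σ/√n, and the tail probability follows from a normal distribution centered at the population mean. A minimal Python sketch with the values from this problem (names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 149.97, 8.49, 64

# Standard error of the mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# The sample mean is normal with mean mu and standard deviation standard_error.
sample_mean_dist = NormalDist(mu=mu, sigma=standard_error)
prob_above_152 = 1 - sample_mean_dist.cdf(152)

print(round(standard_error, 3), round(prob_above_152, 3))
```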
13. The color distribution of plain M&M’s varies by the factory in which they were made. The
Hackettstown, New Jersey plant uses the following color distribution for plain M&M’s: 12.5%
red, 25% orange, 12.5% yellow, 12.5% green, 25% blue, and 12.5% brown. Each piece of candy
in a random sample of 100 plain M&M’s from the Hackettstown factory was classified according
to color, and the results are listed below. Use a 0.05 significance level to test the claim that the
Hackettstown factory color distribution is correct. Describe the method used to calculate your answers.
Color Red Orange Yellow Green Blue Brown
Number 11 28 20 9 20 12
(a) Identify the appropriate hypothesis test and explain the reasons why it is appropriate for analyzing
this data.
(b) Identify the null hypothesis and the alternative hypothesis.
(c) Determine the test statistic. (Round your answer to two decimal places)
(d) Determine the p-value. (Round your answer to two decimal places)
(e) Compare the p-value and the significance level α. What decision should be made regarding the null
hypothesis (e.g., reject or fail to reject) and why?
(f) Is there sufficient evidence to support the claim that the Hackettstown factory color distribution is
correct? Justify your answer.
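A goodness-of-fit statistic compares observed counts with the counts expected under the claimed distribution, χ² = Σ (O − E)²/E, with (number of categories − 1) degrees of freedom. The Python sketch below computes the statistic and compares it with the tabulated critical value for df = 5 at α = 0.05 (11.070). The p-value itself is not computed here; it would come from a chi-square table or a statistical package. Names are illustrative.

```python
colors = ["Red", "Orange", "Yellow", "Green", "Blue", "Brown"]
claimed = [0.125, 0.25, 0.125, 0.125, 0.25, 0.125]
observed = [11, 28, 20, 9, 20, 12]

n = sum(observed)                      # sample size (100)
expected = [p * n for p in claimed]    # expected counts under the claim

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(colors) - 1

# Tabulated critical value for df = 5, alpha = 0.05 (right tail).
CRITICAL_VALUE = 11.070
reject_null = chi_square > CRITICAL_VALUE

print(round(chi_square, 2), degrees_of_freedom, reject_null)
```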
14. A survey showed that 680 of the 1000 adult respondents believe in global warming.
(a) Construct a 95% confidence interval estimate of the proportion of adults believing in global
warming. (Round the lower bound and upper bound of the confidence interval to three decimal
places)
Include a description of how the confidence interval was constructed.
(b) Describe the results of the survey in everyday language.
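A large-sample confidence interval for a proportion has the form p̂ ± z*·sqrt(p̂(1 − p̂)/n). The sketch below applies that formula in Python to the survey counts in this problem; it assumes the normal approximation is acceptable, and the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 680, 1000
confidence = 0.95

p_hat = successes / n

# Critical z value leaving (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error: z* times the standard error of the sample proportion.
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(round(lower, 3), round(upper, 3))
```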
15. In a study to assess the effectiveness of garlic for lowering cholesterol, 60 adults were treated with
garlic tablets. Cholesterol levels were measured before and after treatment. The changes in their LDL
cholesterol (in mg/dL) have a mean of 7 and a standard deviation of 4.
(a) Construct a 90% confidence interval estimate of the mean change in LDL cholesterol after the garlic tablet
treatment. (Round the lower bound and upper bound of the confidence interval to two decimal places)
Include a description of how the confidence interval was constructed.
(b) Describe the results of the study in everyday language.
16. A health educator was interested in determining whether college students at her college really do gain
weight during their freshman year. A random sample of 5 college students was chosen and the weight
of each student was recorded in August and May. Do the data below suggest that college students
gain weight during their freshman year? The health educator wants to use a 0.05 significance
level to test the claim.
Weight (pounds)
Student   August   May
1         175      180
2         170      164
3         135      142
4         160      166
5         200      208
(a) What is the appropriate hypothesis test to use for this analysis? Please identify and explain why
it is appropriate.
(b) Let μ1 = mean weight in May. Let μ2 = mean weight in August. Which of the following
statements correctly defines the null hypothesis?
(i) μ1 – μ2 > 0 (μd > 0)
(ii) μ1 – μ2 = 0 (μd = 0)
(iii) μ1 – μ2 < 0 (μd < 0)
(c) Let μ1 = mean weight in May. Let μ2 = mean weight in August. Which of the following
statements correctly defines the alternative hypothesis?
(i) μ1 – μ2 > 0 (μd > 0)
(ii) μ1 – μ2 = 0 (μd = 0)
(iii) μ1 – μ2 < 0 (μd < 0)
(d) Determine the test statistic. Round your answer to three decimal places. Describe the method used
for obtaining the test statistic.
(e) Determine the p-value. Round your answer to three decimal places. Describe the method used for
obtaining the p-value.
(f) Compare the p-value and the significance level α. What decision should be made regarding the null
hypothesis (e.g., reject or fail to reject) and why?
(g) What do the results of this study tell us about freshman college student weight gain? Justify your
conclusion.
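With before-and-after measurements on the same students, the test statistic is built from the per-student differences: t = d̄ / (s_d/√n), with n − 1 degrees of freedom. The Python sketch below computes it for the five students listed (differences taken as May minus August) and compares it with the tabulated one-tailed critical value for df = 4 at α = 0.05 (2.132). The p-value is not computed here; it would come from a t table or a statistical package. Names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

august = [175, 170, 135, 160, 200]
may = [180, 164, 142, 166, 208]

# Paired differences, May minus August, so a weight gain is positive.
differences = [m - a for m, a in zip(may, august)]

d_bar = mean(differences)    # mean difference
s_d = stdev(differences)     # sample standard deviation of the differences
n = len(differences)

# Paired t statistic with n - 1 degrees of freedom.
t_stat = d_bar / (s_d / sqrt(n))
degrees_of_freedom = n - 1

# Tabulated one-tailed critical value for df = 4, alpha = 0.05.
T_CRITICAL = 2.132
reject_null = t_stat > T_CRITICAL

print(differences, round(t_stat, 3), reject_null)
```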
17. A psychologist is interested in studying the effectiveness of different therapies for depression. The
psychologist selects 90 clients and randomly assigns thirty to each of the following groups:
cognitive-behavioral treatment, psychodynamic psychotherapy, or client-centered treatment. The
dependent measure is a score on a depression inventory after 4 weeks of treatment. The psychologist
wants to test the claim that all three therapies are equally effective in reducing symptoms of
depression.
(a) Which statistical approach should be used?
i. confidence interval
ii. t-test
iii. ANOVA
iv. Chi-square
(b) Explain the rationale for your selection in (a). Specifically, why would this be the appropriate
statistical approach?
18. A study was conducted to see whether monetary incentives to use less water during times of drought
had an effect on water usage. Sixty single family homeowners were randomly assigned to one of two
groups: 1) monetary incentives and 2) no monetary incentives. At the end of three months, the total
amount of water usage for each household, in gallons, was measured.
(a) What would be the appropriate hypothesis test to use to test the claim that monetary incentives
reduce water usage?
i. t-test for two independent samples
ii. t-test for dependent samples
iii. z-test for population mean
iv. correlation
(b) Explain the rationale for your selection in (a). Specifically, why would this be the appropriate
statistical approach?
19. A researcher claims the proportion of auto accidents that involve teenage drivers is less than 20%.
ABC Insurance Company checks police records on 400 randomly selected auto accidents and notes
that teenagers were at the wheel in 64 of them. Assume the company wants to use a 0.05
significance level to test the researcher’s claim.
(a) What is the appropriate hypothesis test to use for this analysis? Please identify and explain
why it is appropriate.
(b) Identify the null hypothesis and the alternative hypothesis.
(c) Determine the test statistic. Round your answer to two decimal places. Describe the method used
for obtaining the test statistic.
(d) Determine the p-value. Round your answer to three decimal places. Describe the method used for
obtaining the p-value.
(e) Compare the p-value and the significance level α. What decision should be made regarding the null
hypothesis (e.g., reject or fail to reject) and why?
(f) Is there sufficient evidence to support the researcher’s claim that the proportion of auto
accidents that involve teenage drivers is less than 20%? Explain your conclusion.
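For a claim about a single proportion, the usual large-sample statistic is z = (p̂ − p₀) / sqrt(p₀(1 − p₀)/n), and a "less than" claim gives a left-tailed p-value. The Python sketch below applies this to the accident counts in the problem, taking p₀ = 0.20 as the hypothesized proportion; names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

teen_accidents, n = 64, 400
p0 = 0.20        # hypothesized proportion under the null hypothesis
alpha = 0.05

p_hat = teen_accidents / n

# The standard error uses p0, because the null hypothesis is assumed true.
standard_error = sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / standard_error

# Left-tailed test: the p-value is the area below the observed z.
p_value = NormalDist().cdf(z_stat)
reject_null = p_value < alpha

print(round(z_stat, 2), round(p_value, 3), reject_null)
```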
20. A business analyst believes that December holiday sales in 2016 are a good predictor of December
holiday sales in 2017. A random sample of 8 toy stores produced the following data, where x is the
amount of December holiday sales in 2016 and y is the amount of December holiday sales in 2017, in dollars.
x y
10257 11689
6556 6438
7224 8662
9987 9454
11568 12004
8453 8231
4235 5048
5576 4850
(a) Find an equation of the least squares regression line. Round the slope and y-intercept
values to two decimal places. Describe the method for obtaining the results.
(b) Based on the equation from part (a), what are the predicted 2017 December holiday sales if the 2016
December holiday sales are 6,000 dollars? Show all work and justify your answer.
(c) Based on the equation from part (a), what are the predicted 2017 December holiday sales if the 2016
December holiday sales are 20,000 dollars? Show all work and justify your answer.
(d) Which of the predicted 2017 holiday sales values that you calculated for (b) and (c) do you think is closer to
the true 2017 holiday sales, and why?
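The least squares slope and intercept follow from b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄. The Python sketch below computes them from the eight data pairs above and flags whether a given x value lies inside the observed range, which bears on how far a prediction can be trusted. The helper names predict and within_observed_range are illustrative.

```python
x = [10257, 6556, 7224, 9987, 11568, 8453, 4235, 5576]   # 2016 sales
y = [11689, 6438, 8662, 9454, 12004, 8231, 5048, 4850]   # 2017 sales

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and of squared deviations about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar


def predict(sales_2016: float) -> float:
    """Predicted 2017 sales from the fitted line (unrounded coefficients)."""
    return intercept + slope * sales_2016


def within_observed_range(sales_2016: float) -> bool:
    """True if the x value lies inside the range of the sample data."""
    return min(x) <= sales_2016 <= max(x)


print(round(slope, 2), round(intercept, 2))
for value in (6000, 20000):
    print(value, round(predict(value), 2), within_observed_range(value))
```

Predictions made with the rounded coefficients will differ slightly from those printed here, because the sketch keeps full precision in the slope and intercept.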
