# 33 stats multiple choices

Question 1 (10 points)
Which of the following characteristics are true of Nominal/Categorical
variables?
Question 1 options:
A. The “order” of values is known
B. It is possible to calculate the Mode of the values
C. It is possible to calculate the Median of the values
D. It is possible to calculate the Mean of the values
E. It is possible to quantify the difference between each value
F. It is possible to add or subtract values
G. It is possible to multiply and divide values
H. The variable has a “true” zero point
Question 2 (10 points)
Which of the variable types below are considered “continuous?”
Question 2 options:
A. Nominal / Categorical
B. Ordinal
C. Interval
D. Ratio
Question 3 (10 points)
Which of the following characteristics are true of Ordinal variables?
Question 3 options:
A. The “order” of values is known
B. It is possible to calculate the Mode of the values
C. It is possible to calculate the Median of the values
D. It is possible to calculate the Mean of the values
E. It is possible to quantify the difference between each value
F. It is possible to add or subtract values
G. It is possible to multiply and divide values
H. The variable has a “true” zero point
Question 4 (10 points)
Which of the following characteristics are true of Interval variables?
Question 4 options:
A. The “order” of values is known
B. It is possible to calculate the Mode of the values
C. It is possible to calculate the Median of the values
D. It is possible to calculate the Mean of the values
E. It is possible to quantify the difference between each value
F. It is possible to add or subtract values
G. It is possible to multiply and divide values
H. The variable has a “true” zero point
Question 5 (10 points)
Volume, Velocity, and Variety are characteristics commonly associated with:
Question 5 options:
A. Prescriptive analytics
B. Predictive analytics
C. The Analytical Framework
D. The CoNVO model
F. Big Data
G. The Internet of Things (IoT)
H. Measures of variability
Question 6 (10 points)
Online services like Netflix, Pandora, and Amazon provide their customers with
product recommendations based on their past behavior by primarily using:
Question 6 options:
A. Live human assistance
B. Descriptive analysis
C. Inferential analysis
D. Predictive analytics
E. Prescriptive analytics
F. Unstructured text analysis
Question 9
The ________ identifies the number of standard deviations a particular value is
from the mean of its distribution.
Question 9 options:
A. Z-score
B. F statistic
C. coefficient of variation
D. t-score
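
As a rough illustration of the quantity described in Question 9, the sketch below computes how many standard deviations a value lies from the mean of its distribution. The data set `scores` and the helper name `standard_score` are hypothetical, and the population standard deviation is an assumption; a sample standard deviation (`statistics.stdev`) may be more appropriate depending on the course convention.

```python
from statistics import mean, pstdev


def standard_score(value, data):
    """Return the number of standard deviations `value` lies from the mean of `data`."""
    mu = mean(data)
    sigma = pstdev(data)  # assumption: population standard deviation
    return (value - mu) / sigma


# Hypothetical data, not taken from the quiz.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
result = standard_score(9, scores)
print(result)
```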
Question 10 (10 points)
You conduct a random survey of 1000 SPS graduate students at NYU to collect
basic demographic data. After doing some descriptive statistical analysis, you
notice that there is a unimodal distribution of age among the students with a
median that is much lower than the mean age. This would indicate that the age
data has a:
Question 10 options:
A. Normal distribution
B. Negatively skewed distribution
C. Positively skewed distribution
D. Distribution of unknown skewness given the information provided
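
The relationship between mean, median, and skew in Question 10 can be explored with a small sketch. The `ages` list is hypothetical (it is not the survey data), and the moment-based skewness formula used here is one common definition among several.

```python
from statistics import mean, median


def skewness(data):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical ages: most students are in their twenties, a few are much older.
ages = [22, 23, 23, 24, 24, 25, 26, 27, 35, 51]
mean_age = mean(ages)
median_age = median(ages)
print(mean_age, median_age, skewness(ages))
```

In this made-up sample the few older students pull the mean above the median, and the skewness coefficient comes out positive.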
Question 11 (10 points)
Open the attached Excel file, Sales Associate Performance Report.xlsx. It lists
the number of sales that each of the company’s 25 sales associates made in the
month of January.
Conduct a Pareto analysis of the data in Excel (do NOT simply create a Pareto
chart) and answer the following question:
Which sales associates are responsible for approximately the top 80% of all
sales in January? (Check all answers that apply)
List all of the steps you followed to calculate your Pareto analysis.
Sales Associate Performance Report.xlsx
Question 11 options:
A. Quentin
B. Laura
C. Ellen
D. Nancy
E. Jason
F. Isaac
G. Tanya
H. Charles
I. Charles
J. Karin
K. Brett
L. Jason
M. Thomas
N. Barbara
O. Bryan
P. Edgar
Q. Morgan
R. Roberta
S. Miguel
T. Beatrice
U. Allen
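
The Pareto steps asked for in Question 11 might be sketched as follows: sort by sales in descending order, accumulate a running total, convert it to a cumulative share, and keep the associates up to the threshold. The names and counts below are hypothetical (the attached Excel file is not reproduced here), and including the associate who crosses the 80% line is an assumed convention.

```python
def pareto_table(sales):
    """Return (name, count, cumulative_share) rows sorted by count, largest first."""
    ranked = sorted(sales.items(), key=lambda item: item[1], reverse=True)
    total = sum(sales.values())
    rows = []
    running = 0
    for name, count in ranked:
        running += count
        rows.append((name, count, running / total))
    return rows


def top_contributors(rows, threshold=0.80):
    """Names whose cumulative share reaches the threshold (crossing row included)."""
    selected = []
    for name, _count, cumulative_share in rows:
        selected.append(name)
        if cumulative_share >= threshold:
            break
    return selected


# Hypothetical sales counts, not the values from the attached workbook.
example_sales = {"Rep A": 50, "Rep B": 32, "Rep C": 10, "Rep D": 5, "Rep E": 3}
table = pareto_table(example_sales)
print(top_contributors(table))
```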
Question 12 (10 points)
Consider the following data set:
Variable 1    Variable 2
15            2
15            9
14            5
10            3
5             8
5             10
8             17
3             15
The sample correlation coefficient for this data set is __________.
Question 12 options:
A. -0.18
B. -0.55
C. 0.55
D. 0.47
E. -0.47
F. 0.18
G. -0.80
H. -0.61
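
A sample correlation coefficient like the one in Question 12 can be computed directly from its definition. This sketch assumes the sixteen values listed in the question pair up row by row (Variable 1, then Variable 2); if the original table was laid out differently, the result would change.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Assumption: values are read as (Variable 1, Variable 2) pairs, row by row.
variable_1 = [15, 15, 14, 10, 5, 5, 8, 3]
variable_2 = [2, 9, 5, 3, 8, 10, 17, 15]
r = pearson_r(variable_1, variable_2)
print(round(r, 2))
```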
Question 13 (10 points)
In a hypothesis test, if we fail to reject the null hypothesis when it is NOT true,
we are committing a:
Question 13 options:
A. Type I error
B. Type II error
C. Type III error
D. Type IV error
Question 14 (10 points)
A marketing analyst at a magazine publisher wants to determine if there are
statistically significant differences in the subscription renewal rates for two of
the fashion magazines that it sells. If the analyst conducts a one-way ANOVA
test on the renewal rate data in JMP and sees in the results that “Prob>F” is
equal to 0.0058, which of the following conclusions should be drawn?
Question 14 options:
A. If the “Prob>F” is greater than the F Ratio, we can reject the null hypothesis and conclude that the
means are statistically different.
B. Since the “Prob>F” is less than 0.05, we can accept the null hypothesis. Therefore, the means are
NOT statistically different.
C. Since the “Prob>F” is less than 0.05, we can reject the null hypothesis. Therefore, the means are
statistically different.
D. Since the “Prob>F” is positive, we can reject the null hypothesis. Therefore, the means are
statistically different.
E. Since the “Prob>F” is a non-zero value, we can accept the null hypothesis and conclude that the
means are NOT statistically different.
F. Since the “Prob>F” is positive, we can accept the null hypothesis. Therefore, the means are NOT
statistically different.
Question 15 (10 points)
Which analytical technique would you use to conduct hypothesis testing on a
sample of approximately 17,000 consumers of whether their marital status
(single, married, divorced, or widowed) plays a significant role in whether they
prefer using 1) a standard dishwasher, 2) a mini counter-top dishwasher, 3) a
premium “smart home” dishwasher, or 4) simply washing dishes by hand?
Question 15 options:
A. Z-Score Analysis
B. Correlation Analysis
C. Prescriptive Analytics
D. Pareto Analysis
E. Chi-Square Analysis
F. Fisher’s Exact Test
Question 16 (10 points)
The marketing analyst at a national travel agency is evaluating a new
promotional email to customers against one that was used last time by using
A/B Testing. The new promotional email seems to have a higher conversion rate
than the older one, but the analyst still wants to conduct a chi squared test to
determine if the results are significant.
How should the analyst interpret the results?
Question 16 options:
A. If p value is less than 0.05, then we can say with 5% certainty that the results are not due to chance.
B. If p value is less than 0.05, then we can say with 95% certainty that the results are due to chance.
C. If p value is less than 0.95, then we can say with 95% certainty that the results are not due to
chance.
D. If p value is less than 0.05, then we can say with 95% certainty that the results are not due to
chance.
E. If p value is more than 0.05, then we can say with 95% certainty that the results are not due to
chance.
Question 17 (10 points)
Westworld Vacations has a customer database of 2000 people and decides to
create an email campaign with a discount code in order to generate sales
through its website. It creates an email and then modifies the Call to Action (the
part of the copy which encourages customers to do something — in the case of
a sales campaign, make a purchase).

To 1000 people it sends the email with the call to action stating, “Offer
ends this Saturday! Use code A1”
To another 1000 people it sends the email with the call to action stating,
“Offer ends soon! Use code B1”.
All other elements of the email’s copy and layout are identical.
The company then monitors which campaign has the higher success rate by
analyzing the use of the promotional codes.
The email using the code A1 has a 6.5% response rate (65 of the 1000 people
emailed used the code to buy a product), and the email using the code B1 has a
4.2% response rate (42 of the recipients used the code to buy a product).
Conduct a chi squared analysis using the attached Excel worksheet to analyze
the email campaign results.
What do the results show?
Test for Significance of A-B Test.xlsx
Question 17 options:
A. The p-value is less than 0.05, so the difference is due to chance
B. The p-value cannot be interpreted
C. The p-value is more than 0.05, so the difference is not due to chance
D. The p-value is less than 0.05, so the difference is not due to chance
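
For readers without the attached worksheet, the chi-squared calculation in Question 17 can be sketched in plain Python. The function name `chi_square_2x2` is hypothetical, and the sketch assumes a Pearson chi-squared test on a 2x2 table with no continuity correction; the worksheet may use a different convention. For one degree of freedom, the p-value follows from the complementary error function.

```python
from math import erfc, sqrt


def chi_square_2x2(a_success, a_total, b_success, b_total):
    """Pearson chi-squared statistic and p-value (1 degree of freedom, no correction)."""
    observed = [
        [a_success, a_total - a_success],
        [b_success, b_total - b_success],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Response counts stated in the question: 65 of 1000 (code A1), 42 of 1000 (code B1).
chi2, p_value = chi_square_2x2(65, 1000, 42, 1000)
print(chi2, p_value)
```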
Question 18 (40 points)
Match each of the following analytical terms with its intended purpose.
Question 18 options:
1. Chi-Square Analysis
2. Box-and-Whisker Plot (Box Plot)
3. Z-score
4. p-value
5. Coefficient of Determination
6. Kurtosis
7. Skewness
8. Fisher’s Exact Test

Using a statistical significance test on contingency tables with a small sample size (for example, one where
any of the expected values is less than 5 or the total of the expected values is less than 50).

Showing a visualization of the distribution of a variable’s values around its median, as well as the
boundaries of its quartiles.

Measuring the degree of “tailedness” (i.e., the propensity to produce outliers) for the probability
distribution of a random value

Measuring the degree of asymmetry of the distribution of a variable’s values around its mean

Determining the proportion of the variance in the dependent variable that is predictable from the
independent variable

Estimating the probability for a given statistical model that, when the null hypothesis is true, the statistical
summary (such as the sample mean difference between two compared groups) would be greater than or equal to
the actual observed results

Hypothesis testing applied to two sets of data (observed vs expected frequencies) to evaluate how likely it
is that the differences between the sets arose by random chance due to sampling error.

Determining the number of standard deviations a particular value is from the mean of its distribution
Question 19 (10 points)
A ________ predicts the change in a dependent variable due to a one-unit increase in an
independent variable while holding other variables constant.
Question 19 options:
A. residual
B. coefficient of determination
C. regression coefficient
D. correlation coefficient
Question 20 (10 points)
A strong, robust multiple linear regression model requires which of the following assumptions?
(Be sure to choose ALL correct answers.)
Question 20 options:
A. The model needs to show a high degree of overfitting
B. The coefficient of determination can never be positive
C. The presence of collinearity among the predictor variables
D. The absence of collinearity among the predictor variables
E. The intercept coefficient should always be positive
F. All of the independent variables must be continuous.
G. Homoscedasticity in regard to the error terms in the fitted model
H. Heteroscedasticity in regard to the error terms in the fitted model
Question 21 (10 points)
Z-Mobile, a mobile phone company, is trying to predict each month whether its customers will
cancel their service based on the following variables: 1) customer income, 2) monthly spend on
domestic calls, 3) monthly spend on international calls, and 4) monthly spend on data service.
Which type of analytical technique would be the most appropriate one for the company to
use?
Question 21 options:
A. Analysis of Variance (ANOVA)
B. Fisher’s Exact Test
C. Logistic Regression
D. T-Test
E. Multiple Linear Regression
F. Simple Linear Regression
G. Factor Analysis
H. Chi-Square Testing
Question 22 (10 points)
Which analytical technique would you use to determine how a museum’s monthly attendance
is affected by the following factors?

number of elementary and high school students in the area
monthly budget spent on “special exhibits”
number of “special exhibits” currently featured
amount spent on advertising that month and the month prior
number of days in the month that fall on a holiday
Question 22 options:
A. Chi-Squared Analysis
B. Logistic Regression
C. Factor Analysis
D. Multivariate Linear Regression
E. Fisher’s Exact Test
F. Simple Linear Regression
G. Analysis of Variance (ANOVA)
Question 23 (10 points)
Which of the following terms is used to describe standard deviations of the error terms in a
linear regression model that are constant and do not depend on the x-value?
Question 23 options:
A. Kurtosis
B. Lepto-Kurtosis
C. Heteroscedasticity
D. Multicollinearity
E. Homoscedasticity
F. Skewness
Question 24 (10 points)
The Least Squares Estimates method is used to calculate the value of the dependent
variable in Logistic Regression analysis.
Question 24 options:
A. True
B. False
Question 25 (10 points)
You are an entertainment blogger who runs a popular YouTube channel about movies, TV, and
video games that people can subscribe to.
Several months ago, you created a collection of movie/TV character-themed t-shirts which
seems to be selling well but could be doing much better in your opinion. While thinking up
You go to your customer database and pull up a large random sample of people and data on
whether they bought any of your t-shirts or not and whether they subscribe to your YouTube
channel or not. The variable BuyTshirt is equal to “1” if they purchased any of your t-shirts and
“0” if they did not. The variable YouTube is equal to “1” if they subscribe to your YouTube
channel and “0” if they do not.
You then run a Logistic Regression on the two variables to calculate odds ratios for buying a
shirt given whether one subscribes or not to your YouTube channel. You first decide to
compare those who are non-subscribers ( YouTube = 0 in the numerator) to those who are
subscribers ( YouTube = 1 in the denominator) as so:
When you calculate this in JMP, the result produced is: 0.07
True or False?: The result means that YouTube channel non-subscribers have a 0.07% chance
Question 25 options:
A. True
B. False
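
To make the quantity in Question 25 concrete, the sketch below computes an odds ratio with non-subscribers in the numerator and subscribers in the denominator, as the question describes. All counts are hypothetical (they are not the blogger's data, and they are not chosen to reproduce 0.07). The point of the sketch is that an odds ratio compares two odds; it is not itself a probability or a percentage.

```python
def odds(bought, not_bought):
    """Odds of buying: buyers divided by non-buyers."""
    return bought / not_bought


def odds_ratio(numerator_group, denominator_group):
    """Ratio of two groups' odds; each group is a (bought, not_bought) pair."""
    return odds(*numerator_group) / odds(*denominator_group)


# Hypothetical counts as (bought a t-shirt, did not buy).
non_subscribers = (10, 200)   # YouTube = 0
subscribers = (100, 100)      # YouTube = 1

ratio = odds_ratio(non_subscribers, subscribers)

# For contrast: the plain probability that a non-subscriber bought a t-shirt.
p_non_subscriber = non_subscribers[0] / sum(non_subscribers)

print(ratio, p_non_subscriber)
```

With these made-up counts, the ratio says the odds of buying for non-subscribers are a small fraction of the odds for subscribers, while the probability for non-subscribers is a separate number.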
Question 26 (40 points)
Which of the following factors make a data visualization more effective in a presentation to an
important audience? (check all that apply)
Question 26 options:
B. A high signal-to-noise ratio
C. A low signal-to-noise ratio
D. Choosing 3D charts over 2D charts in order to increase audience engagement
E. Minimizing the “white space” as much as possible
F. A sense of narrative
G. An emphasis on functional design over decorative design
H. Always choosing style over substance
I. Using as wide a variety of colors as possible to make data visualizations eye-catching
J. A greater emphasis on making all data visualizations more “Referenceable” rather than “Glanceable”
Question 27 (10 points)
Which of the following data visualizations is best for presentation to an important audience?
(If the images do not appear in your browser, consult the attached file.)
Which of the following data visualizations is best for presentation to an important
audience.pdf
Question 27 options:
A. It depends upon the goals, narrative and the audience of the presentation.
B. It depends specifically upon the viewer’s knowledge of statistics.
C. It depends upon your aesthetic or artistic tastes.
Kuiper Car Company
Kuiper is a (fictional) car manufacturer that has collected some data on its competitors’ cars.
You work for the marketing department of the car manufacturer and your boss has asked
you to help with a competitive analysis of the market.
Use the following data file to conduct your analysis in JMP: Kuiper-b.jmp
Question 28 (10 points)
What is the price of the most expensive car?
Question 28 options:
A. \$74,819.22
B. \$69,872.92
C. None of these options
D. \$71,917.44
E. \$70,755.47
Question 29 (10 points)
What is the average (mean) price of a car?
Question 29 options:
A. \$21,235.53
B. \$18,024.99
C. None of these options
D. \$21,343.14
E. \$26,787.88
Question 30 (10 points)
What is the shape of the price distribution?
Question 30 options:
A. Positively-skewed
B. Symmetric
C. Bimodal
D. None of these options
E. Negatively-skewed
Question 31 (10 points)
On average, which manufacturer has the lowest priced cars?
Question 31 options:
A. Buick
B. Chevrolet
C. Saturn
D. Pontiac
E. Mercury
F. None of these options
Question 32 (10 points)
Which of the following best describes the relationship between liters and cylinders?
Question 32 options:
A. None of these options
B. There is no correlation between liters and cylinders
C. There is a strong positive correlation between liters and cylinders
D. There is a weak positive correlation between liters and cylinders
E. There is a strong negative correlation between liters and cylinders
Question 33 (10 points)
What are the slope and intercept of the regression line predicting liters based on cylinders?
Question 33 options:
A. The slope is 0.318 and the intercept is 0.917
B. The slope is 0.7632361 and the intercept is -0.983916
C. The slope is 0.917 and the intercept is 0.318
D. None of these options
E. The slope is -0.983916 and the intercept is 0.7632361
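
The slope and intercept asked for in Question 33 come from an ordinary least squares fit, which can be sketched without JMP. The `cylinders` and `liters` values below are hypothetical stand-ins, since Kuiper-b.jmp is not reproduced here, so the printed numbers will not match the quiz options.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical engine data, not taken from Kuiper-b.jmp.
cylinders = [4, 4, 6, 6, 8, 8]
liters = [2.0, 2.4, 3.2, 3.6, 4.6, 5.0]

slope, intercept = fit_line(cylinders, liters)
print(slope, intercept)
```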
