# Statistics

A doctor’s surgery assumes that about 10% of their patients will come in with emergencies, 40% with non-emergencies, and 50% with general questions.

One day a doctor sees 15 emergencies, 20 non-emergencies and 19 patients with general questions.

Calculate the Chi-square goodness of fit value.

Answer:
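One way to check the arithmetic is a minimal Python sketch of the goodness-of-fit statistic, the sum over categories of (observed − expected)² / expected. The variable names are illustrative, and the expected counts are assumed to be the stated percentages applied to the 54 patients seen that day.

```python
# Observed counts from the question: emergencies, non-emergencies, general questions.
observed = [15, 20, 19]
# Proportions the surgery assumes for the same three categories.
expected_props = [0.10, 0.40, 0.50]

total = sum(observed)
# Expected counts (assumption: stated proportions applied to the day's total).
expected = [p * total for p in expected_props]

# Chi-square goodness of fit: sum of (O - E)^2 / E over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 2))
```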

A farmer is having problems with birds eating her crops. She tries putting up different numbers of scarecrows to keep the birds away. The average number of crops eaten is 200, and the average number of scarecrows she has put up is 20.

The farmer finds part of the linear regression equation:

number of crops eaten = a – 0.8*(number of scarecrows).

What is the value of the intercept?

Answer:
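A least-squares regression line passes through the point of means, so the intercept can be recovered from the two averages and the slope. The sketch below assumes the equation is an ordinary least-squares fit; the variable names are illustrative.

```python
# Values given in the question.
mean_crops = 200        # average number of crops eaten
mean_scarecrows = 20    # average number of scarecrows
slope = -0.8            # coefficient on number of scarecrows

# The fitted line passes through (mean_x, mean_y), so
# mean_y = a + slope * mean_x, which gives a = mean_y - slope * mean_x.
intercept = mean_crops - slope * mean_scarecrows
print(intercept)
```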

You noticed that the bus is a lot more crowded when the weather is bad, and decided to run a regression analysis of the amount of rainfall and the number of people on the bus. Below is a table containing the observed number of people on the bus and the number predicted by your regression model. The mean number of people on the bus is 16.

| Observed value | Predicted value |
| --- | --- |
| 23 | 16.8 |
| 16 | 18.4 |
| 10 | 15.2 |
| 15 | 13.6 |

Use the values in the table to calculate the R-squared.

Answer:
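A minimal Python sketch of the calculation, using the definition R² = 1 − SS_error / SS_total. It pairs each observed value with the predicted value beside it in the table; the variable names are illustrative.

```python
# Observed and predicted numbers of people on the bus, paired by table row.
observed = [23, 16, 10, 15]
predicted = [16.8, 18.4, 15.2, 13.6]

mean_observed = sum(observed) / len(observed)  # matches the stated mean of 16

# Total variation around the mean, and variation left unexplained by the model.
ss_total = sum((y - mean_observed) ** 2 for y in observed)
ss_error = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))

r_squared = 1 - ss_error / ss_total
print(round(r_squared, 3))
```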

Sometimes it’s scary to ask people out on dates, and sometimes it’s easier. A dating researcher decides to try to build a model to predict how likely a person is to ask someone on a date based on the following predictors: level of attraction, amount of loneliness, desperation, fear of rejection.

How many parameters are in the model?

Answer:

After 20 observations, the model predicting how likely a person is to ask someone on a date based on level of attraction, amount of loneliness, desperation and fear of rejection has an error sum of squares of 10.6 and a total sum of squares of 26.2.

What is the F-test statistic?

Answer:
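The F-statistic is the ratio of the model mean square to the error mean square. The sketch below assumes a linear regression with an intercept and the four listed predictors, so the degrees of freedom are 4 for the model and 20 − 4 − 1 = 15 for the error.

```python
# Values given in the question.
n_observations = 20
n_predictors = 4          # attraction, loneliness, desperation, fear of rejection
ss_error = 10.6
ss_total = 26.2

# Model sum of squares is what remains after removing the error part.
ss_model = ss_total - ss_error

# Degrees of freedom (assumption: the model includes an intercept).
df_model = n_predictors
df_error = n_observations - n_predictors - 1

f_statistic = (ss_model / df_model) / (ss_error / df_error)
print(round(f_statistic, 2))
```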

Based on the politician’s model expressed in terms of log-odds, with an intercept of 0.400 and a regression coefficient of 1.166, what is the probability that a person with a conscientiousness score of 0.5 will turn up to vote?

Answer:
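A minimal sketch of converting log-odds to a probability with the logistic function. It assumes conscientiousness is the only predictor in the model and that the coefficient 1.166 belongs to it.

```python
import math

# Values given in the question.
intercept = 0.400
coefficient = 1.166       # assumed to be the coefficient on conscientiousness
conscientiousness = 0.5

# Linear predictor on the log-odds scale.
log_odds = intercept + coefficient * conscientiousness

# Logistic (inverse logit) transform: p = 1 / (1 + exp(-log_odds)).
probability = 1 / (1 + math.exp(-log_odds))
print(round(probability, 3))
```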

The table below shows the predicted and observed values for 20 voters using the politician’s logistic regression model.

Predicted

Vote

Observed

Vote

8

Did not vote

2

Calculate the sensitivity.

Answer:

A vegetable shop owner was having trouble selling tomatoes, so he went to a sales convention where vegetable shop owners could discuss their favourite strategies. Five said that placing tomatoes at the window was best, five said near the cash desk, and five said outside. To figure this out once and for all, he gathered data of tomato sales from all 15 of these vegetable shop owners, and conducted a one-way ANOVA comparing mean tomato sales for different locations. The table below shows mean tomato sales for each location. The within-group sum of squares is 102.

| Window | Cash desk | Outside |
| --- | --- | --- |
| 15 | 25 | 10 |

Calculate the upper boundary of the 95% confidence interval for the difference between the cash desk location and the outside location (in your calculation use: cash desk – outside). Use an alpha level of 0.05.

Answer:
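One possible route is an uncorrected t-based interval built on the pooled within-group variance. That choice is an assumption: the question does not say whether a multiple-comparison correction is intended. The critical value 2.179 is the two-sided 95% t value for 12 degrees of freedom, taken from a standard t table.

```python
import math

# Values given in the question.
mean_cash_desk = 25
mean_outside = 10
n_per_group = 5
n_groups = 3
ss_within = 102

# Pooled within-group variance (mean square within).
df_within = n_per_group * n_groups - n_groups
ms_within = ss_within / df_within

# Standard error of the difference between two group means.
difference = mean_cash_desk - mean_outside
standard_error = math.sqrt(ms_within * (1 / n_per_group + 1 / n_per_group))

# Two-sided 95% critical t value for 12 df (table value; no correction assumed).
T_CRITICAL = 2.179

upper_bound = difference + T_CRITICAL * standard_error
print(round(upper_bound, 2))
```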

The pet shop decides to run their analysis as a regression, and finds an error mean sum of squares of 6.8, and an F-statistic of 6.2.

What is the between-group variance?

Answer:

You’re interested in Wayne East’s new album. People seem to be very excited about it, but is it really that good? You ask your friends to rate it on a scale of 1 to 4, and expect that the median score will be higher than 2. Your friends’ ratings were 3.5, 4.2, 2.3, 3.0 and 1.0.

You decide to check this with a Wilcoxon signed rank test. What is the test statistic?

Answer:
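A minimal sketch of the signed-rank calculation against the hypothesised median of 2, with tied absolute differences given their average rank. Conventions differ on which sum is reported as the test statistic (the positive-rank sum, or the smaller of the two sums), so the sketch computes both; the helper `average_rank` is illustrative.

```python
# Ratings from the question and the hypothesised median.
ratings = [3.5, 4.2, 2.3, 3.0, 1.0]
hypothesised_median = 2

# Differences from the hypothesised median (zero differences would be dropped).
differences = [r - hypothesised_median for r in ratings if r != hypothesised_median]
abs_sorted = sorted(abs(d) for d in differences)


def average_rank(value):
    """Return the 1-based rank of value, averaging over tied positions."""
    positions = [i + 1 for i, v in enumerate(abs_sorted) if v == value]
    return sum(positions) / len(positions)


# Sum the ranks separately for positive and negative differences.
w_plus = sum(average_rank(abs(d)) for d in differences if d > 0)
w_minus = sum(average_rank(abs(d)) for d in differences if d < 0)
print(w_plus, w_minus)
```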

You’re still not convinced of your findings about Wayne East’s album, so you decide to do an experiment on how happy people are after listening to Wayne’s new album, and compare this to a group that listens to a lecture on Astrophysics. The table below shows the enjoyment levels in both groups.

| Wayne | Astrophysics |
| --- | --- |
| 2.5 | 8.0 |
| 7.4 | 5.5 |
| 7.2 | 3.2 |
| 6.5 | 6.2 |

What is the sum of the ranks in the Wayne group?

Answer:
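A minimal sketch of the rank-sum step: pool both groups, rank from lowest to highest, and add up the ranks belonging to the Wayne group. It assumes the table is read row by row, so the Wayne scores are 2.5, 7.4, 7.2 and 6.5 and the Astrophysics scores are 8.0, 5.5, 3.2 and 6.2.

```python
# Enjoyment scores (assumption: table values alternate Wayne, Astrophysics).
wayne = [2.5, 7.4, 7.2, 6.5]
astrophysics = [8.0, 5.5, 3.2, 6.2]

# Pool the two groups and rank from lowest (rank 1) to highest.
pooled = sorted(wayne + astrophysics)

# Simple position-based ranks are only valid when there are no ties.
assert len(set(pooled)) == len(pooled)
rank_of = {value: position + 1 for position, value in enumerate(pooled)}

wayne_rank_sum = sum(rank_of[value] for value in wayne)
print(wayne_rank_sum)
```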

You are interested in whether the time spent studying leads to better grades. The table below shows the number of hours spent studying, and the rank order of grades in a class (1 = best grade, 5 = worst grade).

| Study hours | Grade rank |
| --- | --- |
| 10 | 1 |
| 13 | 4 |
| 3 | 2 |
| 16 | 3 |
| 4 | 5 |

What is the Spearman correlation coefficient for the relationship between study hours and

grades?

Answer:
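A minimal sketch using the shortcut formula ρ = 1 − 6Σd² / (n(n² − 1)), which applies here because neither variable has ties. It assumes study hours are ranked in ascending order (fewest hours = rank 1); ranking them in descending order would flip the sign of the result.

```python
# Data from the question, paired by student.
study_hours = [10, 13, 3, 16, 4]
grade_rank = [1, 4, 2, 3, 5]   # 1 = best grade, 5 = worst grade

# Rank study hours in ascending order (assumption; there are no ties).
sorted_hours = sorted(study_hours)
hours_rank = [sorted_hours.index(h) + 1 for h in study_hours]

# Sum of squared rank differences.
d_squared = sum((h - g) ** 2 for h, g in zip(hours_rank, grade_rank))

n = len(study_hours)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(round(rho, 2))
```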

A one-way analysis of variance (ANOVA) tested whether mean scores on the Deadline Avoidance Scale were similar for the three course completion status groups (graduated, dropped out before the end of the first year, or dropped out after the first year). See the table below.

| | Sum of Squares |
| --- | --- |
| Between Groups | 2.727 |
| Within Groups | 114.637 |
| Total | 117.364 |

Answer:

The correlation between the procrastination questionnaire score and course duration is 0.375. What proportion of variance is explained by the variable ‘procrastination’ in a bivariate regression with course duration as the dependent variable?

Answer:

The researcher wonders if study success is associated with faculty (here, just Law and Social Sciences). These are all nominal variables. On the basis of the table below, the researcher tests whether the variables are associated.

| | Graduated | Dropped out before first year | Dropped out after first year |
| --- | --- | --- | --- |
| Law | 51 | 15 | 37 |
| Social Sciences | 206 | 39 | 80 |

What is the value of the test statistic?

Answer:
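A minimal sketch of the Pearson chi-square test of independence for the 2 × 3 table, assuming that is the intended test for two nominal variables. Each expected count is row total × column total / grand total; the variable names are illustrative.

```python
# Observed counts per faculty, in the column order of the table.
table = {
    "Law": [51, 15, 37],
    "Social Sciences": [206, 39, 80],
}

rows = list(table.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over all six cells.
chi_square = 0.0
for row, row_total in zip(rows, row_totals):
    for observed, col_total in zip(row, col_totals):
        expected = row_total * col_total / grand_total
        chi_square += (observed - expected) ** 2 / expected

print(round(chi_square, 2))
```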

