Statistics Question

PART ONE

Students should complete this Research Application Activity after other work in this Module is finished.


Students should open and read the attached document, write their responses as requested in the attached document, and submit the essay and SPSS file as an attachment to this link.

USE THE FOLLOWING ATTACHED DOCUMENTS: “Data Cleaning Application Essay PSYC 3430.docx” AND “Data Cleaning Application Essay Raw Data.xls”

PART TWO

Each discussion forum allows students to identify the most challenging or confusing concept from the current module to discuss with others in a helpful, mutually supportive manner.

Students are expected to identify the concept from this week’s chapter that they find the most challenging and write a brief explanation of 1) why they find the concept challenging or confusing and 2) the concept in the student’s own words after researching the concept in the textbook and online. Students must cite all online sources and include a working hyperlink to the source to receive credit for the post. At least one online source must be used for all muddiest point explanations. Quoting or plagiarizing from the textbook and/or online sources will not receive credit.

USE THE FOLLOWING DOCUMENTS FOR THIS ASSIGNMENT

Chapter 2
Frequency Distributions
PowerPoint Lecture Slides
Essentials of Statistics for the Behavioral Sciences
Tenth Edition
by Frederick J Gravetter, Larry B. Wallnau, and Lori-Ann B. Forzano
Learning Outcomes
1. Understand how frequency distributions are used
2. Organize data into a frequency distribution table…
3. … and into a grouped frequency distribution table
4. Know how to interpret frequency distributions
5. Organize data into frequency distribution graphs
6. Know how to interpret and understand graphs
Tools You Will Need

Proportions (Appendix A)
• Fractions
• Decimals
• Percentages
Scales of measurement (Chapter 1)
• Nominal, ordinal, interval, and ratio
Continuous and discrete variables (Chapter 1)
Real limits (Chapter 1)
2-1 Frequency Distributions and Frequency Distribution Tables

A frequency distribution is
• An organized tabulation
• Showing the number of individuals located in each category on the scale of measurement
Can be either a table or a graph
Always shows
• The set of categories that make up the original measurement scale
• A record of the frequency, or number, of individuals in each category
Frequency Distribution Tables

Structure of a frequency distribution table
• Categories in a column (often ordered from highest to lowest)
• Frequency count (f) next to category (X values)
• Σf = N
To compute ΣX (sum of the scores) from a table
• Convert table back to original scores or
• Compute ΣfX
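The two routes to ΣX listed above can be checked against each other in a few lines of Python. This is only a sketch: the (X, f) pairs are the values from Example 2.4 in this chapter, and the function names are made up for the illustration.

```python
# Frequency table as (X, f) pairs, using the values from Example 2.4.
table = [(5, 1), (4, 2), (3, 3), (2, 3), (1, 1)]

def n_from_table(table):
    """N is the sum of the frequencies: Σf = N."""
    return sum(f for _, f in table)

def sum_x_from_table(table):
    """ΣX computed directly from the table as ΣfX."""
    return sum(f * x for x, f in table)

def expand_scores(table):
    """Convert the table back to the original list of scores."""
    return [x for x, f in table for _ in range(f)]

print(n_from_table(table))        # number of scores, N
print(sum_x_from_table(table))    # ΣX via ΣfX
print(sum(expand_scores(table)))  # ΣX via the recovered scores
```

Both routes give the same total because each score X appears f times in the recovered list.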
Proportions and Percentages

Proportions
• Measures the fraction of the total group that is associated with each score
• proportion = p = f/N
• Called relative frequencies because they describe the frequency (f) in relation to the total number (N)
Percentages
• Expresses relative frequency out of 100
• percentage = p(100) = (f/N)(100)
• Can be included as a separate column in a frequency distribution table
Example 2.4 Frequency, Proportion, and Percentage

X    f    p = f/N       percent = p(100)
5    1    1/10 = .10    10%
4    2    2/10 = .20    20%
3    3    3/10 = .30    30%
2    3    3/10 = .30    30%
1    1    1/10 = .10    10%
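The proportion and percentage columns of Example 2.4 can be reproduced with a short Python sketch. The function name and the row layout are assumptions made for this illustration; only the X and f values come from the example.

```python
from fractions import Fraction

# (X, f) pairs from Example 2.4.
table = [(5, 1), (4, 2), (3, 3), (2, 3), (1, 1)]

def relative_frequencies(table):
    """Return (X, f, p, percent) rows, where p = f/N and percent = p(100)."""
    n = sum(f for _, f in table)          # N = Σf
    rows = []
    for x, f in table:
        p = Fraction(f, n)                # exact fraction f/N
        rows.append((x, f, float(p), float(p * 100)))
    return rows

for x, f, p, pct in relative_frequencies(table):
    print(f"{x}  {f}  {p:.2f}  {pct:.0f}%")
```

Working with exact fractions before converting to decimals avoids small rounding surprises, and the percentages across all rows add up to 100.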
Learning Check 1 (1 of 2)

Use the frequency distribution table to determine how many subjects were in the study.
A. 10
B. 15
C. 33
D. Impossible to determine

X    f
5    2
4    4
3    1
2    0
1    3
Learning Check 1 – Answer (1 of 2)

Use the frequency distribution table to determine how many subjects were in the study.
A. 10 (Σf = 2 + 4 + 1 + 0 + 3 = 10)
B. 15
C. 33
D. Impossible to determine

X    f
5    2
4    4
3    1
2    0
1    3
Learning Check 1 (2 of 2)

For the frequency distribution shown, is each of these statements True or False?
T/F  More than 50% of the individuals scored above 3.
T/F  The proportion of scores in the lowest category was p = 3.

X    f
5    2
4    4
3    1
2    0
1    3
Learning Check 1 – Answer (2 of 2)

For the frequency distribution shown, is each of these statements True or False?
True   Six out of ten individuals scored above 3 = 60% = more than half
False  A proportion is a fractional part; 3 out of 10 scores = 3/10 = .3

X    f
5    2
4    4
3    1
2    0
1    3
2-2 Grouped Frequency Distribution Tables

If the number of categories is very large, they are combined (grouped into intervals) to make the table easier to understand
However, some information is lost when categories are grouped
• Individual scores cannot be retrieved
• The wider the grouping interval, the more information is lost
Guidelines for Constructing Grouped Frequency Distributions

Guidelines
• Ten or fewer class intervals is typical (but use good judgment for the specific situation)
• The width of each interval should be a relatively simple number (e.g., 2, 5, 10, or 20)
• The bottom score in each class interval should be a multiple of the width
• All intervals should be the same width
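As a rough illustration of these guidelines, the Python sketch below builds a grouped frequency table for whole-number scores. The function name and the sample scores are invented for this example, and the code assumes integer scores with the interval width chosen by the reader.

```python
def grouped_frequency_table(scores, width):
    """Return (low, high, f) rows for whole-number scores, highest interval first."""
    # The bottom score of the lowest interval is a multiple of the width.
    low = (min(scores) // width) * width
    rows = []
    while low <= max(scores):
        high = low + width - 1  # apparent upper limit of the interval
        f = sum(low <= x <= high for x in scores)
        rows.append((low, high, f))
        low += width            # every interval has the same width
    return list(reversed(rows))

# Hypothetical scores, invented for this illustration.
scores = [3, 7, 12, 15, 18, 21, 22, 25, 29, 31, 34, 38]
for low, high, f in grouped_frequency_table(scores, 10):
    print(f"{low}-{high}  {f}")
```

With a width of 10 the scores fall into the intervals 0-9, 10-19, 20-29, and 30-39, and the frequencies still add up to N even though the individual scores can no longer be read from the table.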
Real Limits and Frequency Distributions (1 of 3)

Constructing either frequency distributions or grouped frequency distributions for discrete variables is uncomplicated
• Individuals with the same recorded score had precisely the same measurements
• The score is an exact score
Real Limits and Frequency Distributions (2 of 3)

Constructing frequency distributions for continuous variables requires understanding that a score actually represents an interval
• A given “score” actually could have been any value within the score’s real limits
• The recorded value was rounded off to the middle value between the score’s real limits
• Individuals with the same recorded score probably differed slightly in their actual measurements (the measurements are simply located in the same interval)
Real Limits and Frequency Distributions (3 of 3)

Constructing grouped frequency distributions for continuous variables also requires understanding that a score actually represents an interval
Consequently, grouping several scores actually requires grouping several intervals
The distance between the apparent limits of a (grouped) class interval is always one unit smaller than the distance between its real limits. (Why?)
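A small Python sketch can make the relationship between apparent and real limits concrete. It assumes scores are recorded in whole units by default (the `unit` parameter and the function names are assumptions for this illustration) and uses the 20–29 interval that appears in Learning Check 2.

```python
def real_limits(apparent_low, apparent_high, unit=1):
    """Real limits extend half a measurement unit beyond each apparent limit."""
    half = unit / 2
    return apparent_low - half, apparent_high + half

def interval_width(apparent_low, apparent_high, unit=1):
    """Width of a class interval, measured between its real limits."""
    lower, upper = real_limits(apparent_low, apparent_high, unit)
    return upper - lower

print(real_limits(20, 29))      # (19.5, 29.5)
print(interval_width(20, 29))   # 10.0
print(29 - 20)                  # 9, the distance between the apparent limits
```

Each real limit sits half a unit outside its apparent limit, so the two halves together account for the one-unit difference.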
Learning Check 2 (1 of 2)

A grouped frequency distribution table has
categories 0–9, 10–19, 20–29, and 30–39. What is
the width of the interval 20–29?
A. 9 points
B. 9.5 points
C. 10 points
D. 10.5 points
Learning Check 2 – Answer (1 of 2)

A grouped frequency distribution table has
categories 0–9, 10–19, 20–29, and 30–39. What is
the width of the interval 20–29?
A. 9 points
B. 9.5 points
C. 10 points (29.5 – 19.5 = 10)
D. 10.5 points
Learning Check 2 (2 of 2)

Decide if each of the following statements is True or False.
T/F  You can determine how many individuals had each score from a frequency distribution table.
T/F  You can determine how many individuals had each score from a grouped frequency distribution.
Learning Check 2 – Answer (2 of 2)

True   The original scores can be recreated from the frequency distribution table
False  Only the number of individuals in the class interval is available once the scores are grouped
2-3 Frequency Distribution Graphs

Pictures of the data organized in tables
• All have two axes
• X-axis (abscissa) typically has categories of the measurement scale increasing from left to right
• Y-axis (ordinate) typically has frequencies with values increasing from bottom to top
General principles
• Both axes should have a value of zero (0) where they intersect
• Height should be about ⅔ to ¾ of the length
Data Graphing Questions

• Level of measurement? (nominal, ordinal, interval, or ratio)
• Discrete or continuous data?
• Describing samples or populations?
The answers to these questions determine which is the appropriate graph
Graphs for Interval or Ratio Data (1 of 4)

• Require numerical scores (measured on an interval or ratio scale)
• Represent all scores on X-axis from minimum through maximum observed values
• Include all scores with frequency of zero
Draw bars above each score (interval)
• The height of the bar corresponds to the frequency for that category
• For continuous variables, the width of the bar extends to the real limits of the category
• For discrete variables, each bar extends exactly half the distance to the adjacent category on each side
Figure 2.3 Frequency Distribution Histogram
Graphs for Interval or Ratio Data (2 of 4)

Grouped data: data grouped into class intervals
Draw bars above each (grouped) class interval
• Bar width is the class interval real limits
• Consequence? Apparent limits are extended out one-half score unit at each end of the interval
Figure 2.4 Frequency Distribution Histogram for Grouped Data
Graphs for Interval or Ratio Data (3 of 4)

A standard histogram can be made into an informal histogram (“block” histogram)
• Create a bar of the correct height by drawing a stack of blocks
• Each block represents one individual
• Therefore, block histograms show the frequency count in each bar
Figure 2.5 Frequency Distribution Block Histogram
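The block idea can be tried out as plain text. The Python sketch below stacks one "#" per individual above each score; the function name is made up, the data are the (X, f) pairs from Example 2.4, and the column alignment assumes single-digit scores and frequencies.

```python
def block_histogram(table):
    """Return text lines for a block histogram; `table` holds (X, f) pairs."""
    table = sorted(table)                       # X increases from left to right
    top = max(f for _, f in table)
    lines = []
    for level in range(top, 0, -1):             # draw from the top row down
        cells = ["#" if f >= level else " " for _, f in table]
        lines.append(f"{level} " + " ".join(cells))
    lines.append("  " + " ".join(str(x) for x, _ in table))  # X-axis labels
    return lines

# (X, f) pairs from Example 2.4.
for line in block_histogram([(5, 1), (4, 2), (3, 3), (2, 3), (1, 1)]):
    print(line)
```

Counting the blocks in a column gives the frequency for that score, which is the point of the block histogram.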
Graphs for Interval or Ratio Data (4 of 4)

Constructing a polygon
• List all numeric scores on the X-axis
• Include those with a frequency of f = 0
• Draw a dot above the center of each interval
• Height of dot corresponds to frequency
• Connect the dots with a continuous line
• Close the polygon with lines to the Y = 0 point
Can also be used with grouped frequency distribution data
Figure 2.6 Frequency Distribution Polygon
Figure 2.7 Frequency Distribution Polygon for Grouped Data
Graphs for Nominal or Ordinal Data

For non-numerical scores (nominal and ordinal data), use a bar graph
• Similar to a histogram
• Spaces between adjacent bars indicate discrete categories
  • Without a particular order (nominal)
  • Nonmeasurable width (ordinal)
Figure 2.8 Bar Graph
Box 2.1, Figure 2.11 The Use and Misuse of Graphs
Graphs for Population Distributions

When a population is small, scores for each member are used to construct a frequency distribution graph such as a histogram or bar graph
When a population is large, it is not possible to obtain scores for each member
• Graphs based on relative frequencies are used
• Graphs use smooth curves to indicate exact scores were not used
Normal distribution
• Symmetric with greatest frequency in the middle
• Common data structure for many variables
Figure 2.9 Bar Graph of Relative Frequencies
Figure 2.10 The Population Distribution of IQ scores
The Shape of a Frequency Distribution

Researchers describe a distribution’s shape in words rather than drawing it
Symmetrical distribution: each side is a mirror image of the other
Skewed distribution: scores pile up on one side and taper off in a tail on the other
• Tail on the right (high scores) = positive skew
• Tail on the left (low scores) = negative skew
Figure 2.12 Shapes for Frequency Distributions
Learning Check 3 (1 of 2)

What is the shape of this
distribution?
A. symmetrical
B. negatively skewed
C. positively skewed
D. discrete
Learning Check 3 – Answer (1 of 2)

What is the shape of this
distribution?
A. symmetrical
B. negatively skewed
C. positively skewed
D. discrete
Learning Check 3 (2 of 2)

Decide if each of the following statements is True or False.
T/F  It would be correct to use a histogram to graph parental marital status data (single, married, divorced…) from a treatment center for children.
T/F  It would be correct to use a histogram to graph the time children spent playing with other children from data collected in a children’s treatment center.
Learning Check 3 – Answer (2 of 2)

False  Marital status is a nominal variable; a bar graph is required
True   Time is measured continuously and is an interval variable
2-4 Stem and Leaf Displays

A simple alternative to a grouped frequency distribution table or graph
Each score is separated into two parts: a stem and a leaf
• The first digit (or digits) is called the stem
• The last digit is called the leaf
• Example: X = 85 would be separated into a stem of 8 and a leaf of 5
• Every individual score can be identified
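Splitting scores into stems and leaves is easy to sketch in Python. The function name and the sample scores are invented for this illustration, and the code assumes non-negative whole-number scores.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Map each stem (all digits but the last) to its list of leaves (last digits)."""
    display = defaultdict(list)
    for x in scores:
        stem, leaf = divmod(x, 10)   # for example, 85 gives stem 8 and leaf 5
        display[stem].append(leaf)
    return dict(display)

# Hypothetical scores, invented for this illustration.
scores = [85, 62, 91, 77, 68, 83, 70, 54]
display = stem_and_leaf(scores)
for stem in sorted(display, reverse=True):
    print(stem, "|", "".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is kept, each original score can be rebuilt as 10 times the stem plus the leaf, which a grouped frequency table does not allow.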
Learning Check 4 (1 of 2)

For the scores shown in the stem and leaf display, what is the lowest score in the distribution?
A. 7
B. 15
C. 50
D. 51

9 | 374
8 | 945
7 | 7042
6 | 68
5 | 14
Learning Check 4 – Answer (1 of 2)

For the scores shown in the stem and leaf display, what is the lowest score in the distribution?
A. 7
B. 15
C. 50
D. 51 (the lowest stem, 5, has leaves 1 and 4, giving the scores 51 and 54)

9 | 374
8 | 945
7 | 7042
6 | 68
5 | 14
Learning Check 4 (2 of 2)

Decide if each of the following statements is True or False.
T/F  Any frequency distribution is suitable for a stem and leaf display.
T/F  A score of 54 is displayed as 5 (stem) and 4 (leaf) in a stem and leaf display.
Learning Check 4 – Answer (2 of 2)

False  A stem and leaf display is a simple alternative for a grouped frequency distribution
True   The first digit (5) is the stem and the last digit (4) is the leaf
Clear Your Doubts, Ask Questions
Application Essay
Module 2 – Cleaning Data Prior to Statistical Analysis
Gathering data from research participants allows psychologists and other professionals to test how
effective their techniques are. Without data gathering and systematic analysis of that data, we could not be
sure that new methods of helping people are any more effective than traditional ways of doing things or
doing nothing at all. So, data collection and data analysis – via the statistical procedures you will learn in
this course – are critical to our body of knowledge as we engage in our careers in or related to psychology.
Data collection can be a messy process. Sometimes research participants drop out of a study, misunderstand
the directions for participation, or provide information that is not accurate, either
unintentionally (e.g., when they are asked to provide information they don’t know) or intentionally (e.g.,
when they try to guess the researcher’s prediction and prove it wrong). There are many ways that data
could be difficult to interpret, so it is vital that the data is cleaned before analysis.
Data cleaning involves making decisions about the data that you can explain and defend to others. It
requires us to carefully examine all the information that a research participant provides and verify that it
makes sense. Though the information might include extreme values (e.g., a participant could be 99 years
old, a participant could be 7 feet tall), the information should not include values that are impossible and
were likely entered in error (e.g., a participant could not be 215 years old, a participant could not be 22
feet tall).
Also, the data should not include out of range values, which are scores that are not within the range of
scores that were measured on a particular scale. For example, a scale assessing self-esteem may ask
participants to rate how much they value themselves on a scale from 1 (not at all) to 7 (very much). A
score of 6 would indicate that the participant valued themselves a lot, which could be a sign of high self-esteem; however, a score of 9 would not be possible because the measured values range only from 1 to 7.
A value of 9 would be a data entry error because it is higher than the highest possible value on the scale.
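The out-of-range check described here amounts to a single comparison. The Python sketch below uses the 1-to-7 self-esteem scale from the example; the function name is an assumption made for the illustration.

```python
def is_out_of_range(value, low, high):
    """True when a score falls outside the scale's possible values."""
    return not (low <= value <= high)

# Self-esteem item rated from 1 (not at all) to 7 (very much), as in the text.
print(is_out_of_range(6, 1, 7))   # False: 6 is a possible rating
print(is_out_of_range(9, 1, 7))   # True: 9 exceeds the highest possible rating
```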
In addition to checking the values that research participants provide one variable at a time (e.g., making
sure the ages are plausible then verifying there are no out of range values on self-esteem), a researcher
also needs to verify that the responses make sense across the multiple variables that are measured for each
participant. For example, a 65-year-old individual (age variable) should not claim to be currently enrolled
in middle school (grade level variable), an individual who reports never having served in the military
(military service variable) should not report that they served two tours in combat with the Army (combat
exposure variable), and so on. Data that is logically impossible when considered across variables should
be carefully scrutinized because its inclusion could alter the results of the research and render it
meaningless.
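A cross-variable check compares two answers from the same participant. The Python sketch below is based on the military service example from the paragraph above; the field names `served` and `combat_tours` and the three cases are hypothetical, invented only to show the shape of such a rule.

```python
def inconsistent_military_report(record):
    """Flag a case that reports combat tours but no military service.

    `served` and `combat_tours` are hypothetical field names standing in for
    the military service and combat exposure variables mentioned in the text.
    """
    return (not record["served"]) and record["combat_tours"] > 0

# Hypothetical cases, invented for this illustration.
cases = [
    {"id": "A", "served": True, "combat_tours": 2},
    {"id": "B", "served": False, "combat_tours": 2},
    {"id": "C", "served": False, "combat_tours": 0},
]
flagged = [c["id"] for c in cases if inconsistent_military_report(c)]
print(flagged)   # ['B']
```

Neither value in case B is impossible on its own; the problem only appears when the two variables are read together.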
To reduce the likelihood that inaccurate data may affect the statistical analysis of a research project, the
researcher must critically consider the data each research participant provides and write rules regarding
when data will be excluded from the analyses for the project. These rules must be applied in exactly the
same manner to every research participant’s data in the data set. The researcher must be careful NOT to
selectively apply these data cleaning rules to just those cases that might refute their hypotheses. The rules
must be applied to every case regardless of whether that case is consistent or inconsistent with what the
researcher hopes the data will reveal. Selective application of rules is a breach of research ethics.
Unfortunately, the data sets provided by textbook publishers to practice statistical analyses are already
cleaned so that the prepackaged data makes sense and is logically plausible. Data collected in the real
world is rarely so neat and tidy.
To provide exposure to critically evaluating research data and writing cleaning rules that will be applied
to all cases in a data set prior to data analysis, students should review the hypothetical data set in the
Excel file associated with this assignment. An explanation of each variable follows:
ID – This is a number assigned to each participant based on when they completed the study. #1 is the first
person to complete it, #2 is the second person to complete it, and so on.
Age – This is the participant’s age in years. Because minors were not approved by the IRB for this
research, all research participants must be able to legally provide their consent to participate, which is the
age of 18 in Texas.
Gender – This is the participant’s self-reported gender identification. Numerical codes are used for the
data with 0 = cisgender female and 1 = cisgender male.
Military – This is the participant’s military affiliation. Numerical codes are used for the data with 0 = no
affiliation, 1 = active duty service member, 2 = spouse of active duty service member, 3 = child of active
duty service member.
Satisfaction – This is the participant’s average rating on a satisfaction with life scale. Values
range from 1 (not satisfied) to 7 (very satisfied).
Intention – This value is the participant’s rating on intention to enroll in college in the upcoming fall
semester. Values range from 1 (not at all likely) to 10 (extremely likely).
Marital – This value is the participant’s marital status. Numerical codes are used for the data with 1 =
never married, 2 = married, 3 = divorced/widowed.
Children – This value represents the number of children under the age of 18 who live in the home with
the research participant.
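One way to keep cleaning rules consistent is to write them down once and run every case through the same checks. The Python sketch below encodes the allowed values from the variable descriptions above; the rule for Children is an added assumption, the two cases are hypothetical (they are not the assignment data), and the function names are made up for the illustration.

```python
# Allowed values taken from the variable descriptions above.
RULES = {
    "Age": lambda v: v >= 18,                 # participants must be able to consent
    "Gender": lambda v: v in (0, 1),
    "Military": lambda v: v in (0, 1, 2, 3),
    "Satisfaction": lambda v: 1 <= v <= 7,
    "Intention": lambda v: 1 <= v <= 10,
    "Marital": lambda v: v in (1, 2, 3),
    "Children": lambda v: v >= 0,             # assumption: a count cannot be negative
}

def violations(case):
    """Return the names of the variables whose values break a rule."""
    return [name for name, ok in RULES.items() if not ok(case[name])]

def clean(cases):
    """Apply the same rules to every case and keep the cases with no violations."""
    return [case for case in cases if not violations(case)]

# Hypothetical cases, invented for this illustration (not the assignment data).
cases = [
    {"ID": 101, "Age": 30, "Gender": 1, "Military": 0, "Satisfaction": 4,
     "Intention": 8, "Marital": 2, "Children": 1},
    {"ID": 102, "Age": 16, "Gender": 0, "Military": 0, "Satisfaction": 8,
     "Intention": 5, "Marital": 1, "Children": 0},
]
print(violations(cases[1]))
print([case["ID"] for case in clean(cases)])
```

These checks cover only one variable at a time. Plausibility limits (such as a maximum believable age) and cross-variable consistency still call for rules the researcher can explain and defend.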
To complete the application assignment, students should examine each variable one at a time to identify
problematic data points and write rules that would allow them to remove each such data point (and others with
the same problem) from future analyses. In addition, students should review the variables measured
across each participant’s responses to verify that the responses are possible/logical. Students should
identify and write rules to exclude any cases (and others like them) that are logically inconsistent or
impossible. Then, students should practice the SPSS lessons in this module by importing the Excel file
into SPSS.
Students should respond to each item below in complete sentences.
1. For participant #1, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
2. For participant #2, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
3. For participant #3, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
4. For participant #4, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
5. For participant #5, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
6. For participant #6, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
7. For participant #7, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
8. For participant #8, should this individual be included in the data set for analysis? If so, why? If
not, why not? Explain the information you considered for your decision. If the data should be
excluded, state the data cleaning rule that could be applied to every research participant’s data in
the data set. (1 point)
9. Import the Excel file into SPSS. To document this portion of the assignment, 1) students may take
a screenshot of the data in the Data View in SPSS and upload the screenshot to the assignment
link, or 2) students may save the data file in SPSS and upload the data file to the assignment link.
(7 points)
ID   Age   Gender   Military   Satisfaction   Intention   Marital   Children
1    17    0        3          5              5           1         0
2    22    0        0          5              0           1         0
3    35    1        2          6              6           1         2
4    75    0        1          10             6           3         0
5    43    0        0          7              6           2         2
6    27    0        3          7              10          2         2
7    19    0        0          7              6           2         1
8    49    1        0          1              3           2         7
