SPSS Data Analysis
1. Discuss the issues related to the z test versus the t test and their applications in research.
2. Follow the instructions in Doing Data Analysis with SPSS, Session 11, and practice.
(Link will be attached below)
Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Doing Data Analysis with SPSS®
Version 18
Robert H. Carver
Stonehill College
Jane Gradwohl Nash
Stonehill College
Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States
Doing Data Analysis with SPSS®
Version 18
Robert H. Carver, Jane Gradwohl Nash
Publisher: Richard Stratton
Senior Sponsoring Editor: Molly Taylor
Assistant Editor: Shaylin Walsh
Media Editor: Andrew Coppola
Marketing Manager: Jennifer Jones
Marketing Coordinator: Michael Ledesma
Marketing Communications Manager: Mary Anne Payumo
Content Project Management: PreMediaGlobal
Art Director: Linda Helcher
Print Buyer: Diane Gibbons
Production Service: PreMediaGlobal
Cover Designer: Rokusek Design
Cover Image: Kostyantyn Ivanyshen/©Shutterstock
Compositor: PreMediaGlobal
© 2012, 2009, 2006, Brooks/Cole Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the
copyright herein may be reproduced, transmitted, stored or used
in any form or by any means graphic, electronic, or mechanical,
including but not limited to photocopying, recording, scanning,
digitizing, taping, Web distribution, information networks, or
information storage and retrieval systems, except as permitted
under Section 107 or 108 of the 1976 United States Copyright Act,
without the prior written permission of the publisher.
For permission to use material from this text or product, submit all
requests online at www.cengage.com/permissions
Further permissions questions can be e-mailed to
permissionrequest@cengage.com
Library of Congress Control Number: 2010942243
Student Edition:
ISBN-13: 978-0-8400-4916-2
ISBN-10: 0-8400-4916-1
Cengage Learning
20 Channel Center Street
Boston, MA 02210
USA
Represented in Canada by Nelson Education, Ltd.
tel: (416) 752 9100 / (800) 668 0671
www.nelson.com
Cengage Learning is a leading provider of customized learning solutions
with office locations around the globe, including Singapore, the United
Kingdom, Australia, Mexico, Brazil and Japan. Locate your local office at
international.cengage.com/region
Cengage Learning products are represented in Canada by Nelson
Education, Ltd.
For your course and learning solutions, visit www.cengage.com.
Purchase any of our products at your local college store or at
our preferred online store www.cengagebrain.com.
Instructors: Please visit login.cengage.com and log in to access
instructor-specific resources.
Printed in the United States
In loving memory of my brother and teacher Barry,
and for Donna, Sam, and Ben, who teach me daily.
RHC
For Justin, Hanna and Sara—you are my world.
JGN
Contents
Session 1. A First Look at SPSS Statistics 18
Objectives
Launching SPSS/PASW Statistics 18
Entering Data into the Data Editor
Saving a Data File
Creating a Bar Chart
Saving an Output File
Getting Help
Printing in SPSS
Quitting SPSS
Session 2. Tables and Graphs for One Variable
Objectives
Opening a Data File
Exploring the Data
Creating a Histogram
Frequency Distributions
Another Bar Chart
Printing Session Output
Moving On…
Session 3. Tables and Graphs for Two Variables
Objectives
Cross-Tabulating Data
Editing a Recent Dialog
More on Bar Charts
Comparing Two Distributions
Scatterplots to Detect Relationships
Moving On…
Session 4. One-Variable Descriptive Statistics
Objectives
Computing One Summary Measure for a Variable
Computing Additional Summary Measures
A Box-and-Whiskers Plot
Standardizing a Variable
Moving On…
Session 5. Two-Variable Descriptive Statistics
Objectives
Comparing Dispersion with the Coefficient of Variation
Descriptive Measures for Subsamples
Measures of Association: Covariance and Correlation
Moving On…
Session 6. Elementary Probability
Objectives
Simulation
A Classical Example
Observed Relative Frequency as Probability
Handling Alphanumeric Data
Moving On…
Session 7. Discrete Probability Distributions
Objectives
An Empirical Discrete Distribution
Graphing a Distribution
A Theoretical Distribution: The Binomial
Another Theoretical Distribution: The Poisson
Moving On…
Session 8. Normal Density Functions
Objectives
Continuous Random Variables
Generating Normal Distributions
Finding Areas under a Normal Curve
Normal Curves as Models
Moving On…
Session 9. Sampling Distributions
Objectives
What Is a Sampling Distribution?
Sampling from a Normal Population
Central Limit Theorem
Sampling Distribution of the Proportion
Moving On…
Session 10. Confidence Intervals
Objectives
The Concept of a Confidence Interval
Effect of Confidence Coefficient
Large Samples from a Non-normal (Known) Population
Dealing with Real Data
Small Samples from a Normal Population
Moving On…
Session 11. One-Sample Hypothesis Tests
Objectives
The Logic of Hypothesis Testing
An Artificial Example
A More Realistic Case: We Don’t Know Mu or Sigma
A Small-Sample Example
Moving On…
Session 12. Two-Sample Hypothesis Tests
Objectives
Working with Two Samples
Paired vs. Independent Samples
Moving On…
Session 13. Analysis of Variance (I)
Objectives
Comparing Three or More Means
One-Factor Independent Measures ANOVA
Where Are the Differences?
One-Factor Repeated Measures ANOVA
Where Are the Differences?
Moving On…
Session 14. Analysis of Variance (II)
Objectives
Two-Factor Independent Measures ANOVA
Another Example
One Last Note
Moving On…
Session 15. Linear Regression (I)
Objectives
Linear Relationships
Another Example
Statistical Inferences in Linear Regression
An Example of a Questionable Relationship
An Estimation Application
A Classic Example
Moving On…
Session 16. Linear Regression (II)
Objectives
Assumptions for Least Squares Regression
Examining Residuals to Check Assumptions
A Time Series Example
Issues in Forecasting and Prediction
A Caveat about “Mindless” Regression
Moving On…
Session 17. Multiple Regression
Objectives
Going Beyond a Single Explanatory Variable
Significance Testing and Goodness of Fit
Residual Analysis
Adding More Variables
Another Example
Working with Qualitative Variables
A New Concern
Moving On…
Session 18. Nonlinear Models
Objectives
When Relationships Are Not Linear
A Simple Example
Some Common Transformations
Another Quadratic Model
A Log-Linear Model
Adding More Variables
Moving On…
Session 19. Basic Forecasting Techniques
Objectives
Detecting Patterns over Time
Some Illustrative Examples
Forecasting Using Moving Averages
Forecasting Using Trend Analysis
Another Example
Moving On…
Session 20. Chi-Square Tests
Objectives
Qualitative vs. Quantitative Data
Chi-Square Goodness-of-Fit Test
Chi-Square Test of Independence
Another Example
Moving On…
Session 21. Nonparametric Tests
Objectives
Nonparametric Methods
Mann-Whitney U Test
Wilcoxon Signed Ranks Test
Kruskal-Wallis H Test
Spearman’s Rank Order Correlation
Moving On…
Session 22. Tools for Quality
Objectives
Processes and Variation
Charting a Process Mean
Charting a Process Range
Another Way to Organize Data
Charting a Process Proportion
Pareto Charts
Moving On…
Appendix A. Dataset Descriptions
Appendix B. Working with Files
Objectives
Data Files
Viewer Document Files
Converting Other Data Files into SPSS Data Files
Index
Preface
Quantitative Reasoning, Real Data, and Active Learning
Most undergraduate students in the U.S. now take an
introductory course in statistics, and many of us who teach statistics
strive to engage students in the practice of data analysis and quantitative
thinking about real problems. With the widespread availability of
personal computers and statistical software, and the near-universal
application of quantitative methods in many professions, introductory
statistics courses now emphasize statistical reasoning more than
computational skill development. Questions of how have given way to
more challenging questions of why, when, and what?
The goal of this book is to supplement an introductory
undergraduate statistics course with a comprehensive set of self-paced
exercises. Students can work independently, learning the software skills
outside of class, while coming to understand the underlying statistical
concepts and techniques. Instructors can teach statistics and statistical
reasoning, rather than teaching algebra or software. Both students and
teachers can devote their energies to using data analysis in ways that
inform their understanding of the world and investigate problems that
really matter.
The Approach of This Book
The book reflects the changes described above in several ways.
First, and most obviously, it provides some training in the use of a
powerful software package to relieve students of computational drudgery.
Second, each session is designed to address a statistical issue or need,
rather than to feature a particular command or menu in the software.
Third, nearly all of the datasets in the book are real, reflecting a variety
of disciplines and underscoring the wide applicability of statistical
reasoning. Fourth, the sessions follow a traditional sequence, making the
book compatible with many texts. Finally, as each session leads students
through the techniques, it also includes thought-provoking questions
and challenges, engaging the student in the processes of statistical
reasoning. In designing the sessions, we kept four ideas in mind:
• Statistical reasoning, not computation, is the goal of the course.
This book asks students questions throughout, balancing
software instruction with reflection on the meaning of results.
• Students arrive in the course ready to engage in statistical
reasoning. They need not slog all the way through descriptive
techniques before encountering the concept of inference. The
exercises invite students to think about inferences from the
start, and the questions grow in sophistication as students
master new material.
• Exploration of real data is preferable to artificial datasets. With
the exception of the famous Anscombe regression dataset and
a few simulations, all of the datasets are real. Some are very
old and some are quite current, and they cover a wide range
of substantive areas.
• Statistical topics, rather than software features, should drive
the design of each session. Each session features several SPSS
functions selected for their relevance to the statistical concept
under consideration.
This book provides a rigorous but limited introduction to the
software produced by SPSS, an IBM company.¹ The SPSS/PASW²
Statistics 18 system is rich in features and options; this book makes no
attempt to “cover” the entire package. Instead, the level of coverage is
commensurate with an introductory course. There may be many ways to
perform a given task in SPSS; generally, we show one way. This book
provides a “foot in the door.” Interested students and other users can
explore the software possibilities via the extensive Help system or other
standard SPSS documentation.
¹ SPSS was acquired by IBM in October 2009.
² SPSS Statistics 18 was formerly known as PASW Statistics 18, and the
PASW name appears on several screens in the software. The book will reference
the SPSS name only, but note that SPSS and PASW are interchangeable terms.
Using This Book
We presume that this book is being used as a supplementary text
in an introductory-level statistics course. If your courses are like ours
(one in a psychology department, the other in a business department),
class time is a scarce resource. Adding new material is always a
balancing act. As such, supplementary readings and assignments must
be carefully integrated. We suggest that instructors use the sessions in
this book in four different ways, tailoring the approach throughout the
term to meet the needs of the students and course.
• In-class activity: Part or all of some sessions might best be
done together in class, with each student at a computer. The
instructor can comment on particular points and can roam to
offer assistance. This may be especially effective in the earliest
sessions.
• Stand-alone assignments: In conjunction with a topic covered
in the principal text, sessions can be assigned as independent
out-of-class work, along with selected Moving On… questions.
This is our most frequently used approach. Students
independently learn the software, reinforce the statistical
concepts, and come to class with questions about any
difficulties they encountered in the lab session.
• Preparation for text-based case or problem: An instructor may
wish to use a textbook case for a major assignment. The
relevant session may prepare the class with the software skills
needed to complete the case.
• Independent projects: Sessions may be assigned to prepare
students to undertake an independent analysis project
designed by the instructor. Many of the data files provided
with the book contain additional variables that are never used
within sessions. These variables may form the basis for
original analyses or explorations.
Solutions are available to instructors for all Moving On… and
bold-faced questions. Instructors should consult their Cengage Learning
sales representatives for details. A companion website is available to both
instructors and students at www.cengage.com/statistics/carver.
The Data Files
As previously noted, each of the data files provided with this book
contains real data, much of it downloaded from public sites on the World
Wide Web. The companion website to accompany the book contains all of
the data files. Appendix A describes each file and its source, and provides
detailed definitions of each variable. Many of the files include variables in
addition to those featured in exercises and examples. These variables
may be useful for projects or other assignments.
The data files were chosen to represent a variety of interests and
fields, and to illustrate specific statistical concepts or techniques. No
doubt, each instructor will have some favorite datasets that can be used
with these exercises. Most textbooks provide datasets as well. For some
tips on converting other datasets for use with SPSS, see Appendix B.
Note on Software Versions
The sessions and screen images in this book mostly used SPSS
Base 18 running under Windows XP. Users of other versions will notice
minor differences with the figures and instructions in this book. Before
starting Sessions 9−11, users of the Student Version of SPSS should be
aware that the student version does not support the use of syntax files,
and therefore will not be able to run the simulations in those sessions.
We’ve provided the results of our simulation runs so that you’ll still get
the point. Read the sessions closely and you will still be able to follow the
discussion.
To the Student
This book has two goals: to help you understand the concepts
and techniques of statistical analysis, and to teach you how to use one
particular tool—SPSS—to perform such analysis. It can supplement but
not replace your primary textbook or your classroom time. To get the
maximum benefit from the book, you should take your time and work
carefully. Read through a session before you sit down at the computer.
Each session should require no more than about 30 minutes of computer
time; there’s little need to rush through them.
You’ll often see boldfaced questions interspersed through the
computer instructions. These are intended to shift your focus from
mouse-clicking and typing to thinking about what the answers mean,
whether they make sense, whether they surprise or puzzle you, or how
they relate to what you have been doing in class. Attend to these
questions, even when you aren’t sure of their purpose.
Each session ends with a section called Moving On…. You should
also respond to the numbered questions in that section, as assigned by
your instructor. Questions in the Moving On… sections are designed to
challenge you. Sometimes, it is quite obvious how to proceed with your
analysis; sometimes, you will need to think a bit before you issue your
first command. The goal is to get you to engage in statistical thinking,
integrating what you have learned throughout your course. There is
much more to doing data analysis than “getting the answer,” and these
questions provide an opportunity to do realistic analysis.
As noted earlier, SPSS is a large and very powerful software
package, with many capabilities. Many of the features of the program are
beyond the scope of an introductory course, and do not figure in these
exercises. However, if you are curious or adventurous, you should
explore the menus and Help system. You may find a quicker, more
intuitive, or more interesting way to approach a problem.
Typographical Conventions
Throughout this book, certain symbols and typefaces are used
consistently. They are as follows:
Menu ➤ Sub-menu ➤ Command The mouse icon indicates an
action you take at the computer, using the mouse or keyboard.
The bold type lists menu selections for you to make.
Dialog box headings are in this typeface.
Dialog box choices, variable names, and items you should type appear in
this typeface.
File names (e.g., Colleges) appear in this typeface.
A box like this contains an instruction requiring special care or information
about something that may work differently on your computer system.
Bold italics in the text indicate a question that you should
answer as you write up your experiences in a session.
Acknowledgments
Like most authors, we owe many debts of gratitude for this book.
This project enjoyed the support of Stonehill College through the annual
Summer Grants and the Stonehill Undergraduate Research Experience
(SURE) programs. As the SURE scholar in the preparation of the first
edition of the book, Jason Boyd contributed in myriad ways, consistently
doing reliable, thoughtful, and excellent work. He tested every session,
prepared instructors’ solutions, researched datasets, critiqued sessions
from a student perspective, and tied up loose ends. His contributions
and collegiality were invaluable.
For the previous edition we enlisted the help of two very able
students, Jennifer Karp and Elizabeth Wendt. Their care and affable
approach to the project have made all the difference.
Many colleagues and students suggested or provided datasets.
Student contributors were Jennifer Axon, Stephanie Duggan, Debra
Elliott, Tara O’Brien, Erin Ruell, and Benjamin White. A big thank you
goes out to our students in Introduction to Statistics and Quantitative
Analysis for Business for pilot-testing many of the sessions and for
providing useful feedback about them.
We thank our Stonehill colleagues Ken Branco, Lincoln Craton,
Roger Denome, Jim Kenneally, and Bonnie Klentz for suggesting or
sharing data, and colleagues from other institutions who supported our
work: Chris France, Roger Johnson, Stephen Nissenbaum, Mark
Popovksy, and Alan Reifman. Thanks also to the many individuals and
organizations granting permission to use published data for these
sessions; they are all identified in Appendix A.
Over the years working with Cengage Learning, we have enjoyed
the guidance and encouragement of Richard Stratton, Curt Hinrichs,
Carolyn Crockett, Molly Taylor, Dan Seibert, Catherine Ronquillo,
Jennifer Risden, Ann Day, Sarah Kaminskis, and Seema Atwal. We also
thank Paul Baum at California State University, Northridge, and
Dennis Jowaisas at Oklahoma City University, two reviewers whose
constructive suggestions improved the quality of the first edition.
Finally, we thank our families.
I want to thank my husband, Justin, for his
unwavering support of my professional work, and our
daughters, Hanna and Sara, for providing an enjoyable
distraction from this project.
JGN
The Carver home team has been fabulous, as always.
To Donna, my partner and counsel; to Sam and Ben, my
cheering section and assistants. Thanks for the time, space,
and encouragement.
RHC
About the Authors
Robert H. Carver is Professor of Business Administration at
Stonehill College in Easton, Massachusetts and an Adjunct Professor at
the International School of Business at Brandeis University, and has
received awards for teaching excellence at both institutions. He teaches
courses in applied statistics, research methods, information systems,
strategic management, and business and society. He holds an A.B. from
Amherst College and a Ph.D. in Public Policy Studies from the University
of Michigan. He is the author of Doing Data Analysis with Minitab 14
(Cengage Learning), and articles in Case Studies in Business, Industry
and Government Statistics; Publius; The Journal of Statistics Education;
The Journal of Business Ethics; PS: Political Science & Politics; Public
Administration Review; Public Productivity Review; and The Journal of
Consumer Marketing.
Jane Gradwohl Nash is Professor of Psychology at Stonehill
College. She earned her B.A. from Grinnell College and her Ph.D. from
Ohio University. She enjoys teaching courses in the areas of statistics,
cognitive psychology, and general psychology. Her research interests are
in the area of knowledge structure and knowledge change (learning) and
more recently, social cognition. She is the author of articles that have
appeared in the Journal of Educational Psychology; Organizational
Behavior and Human Decision Processes; Computer Science Education;
Headache; Journal of Chemical Education; Research in the Teaching of
English; and Written Communication.
Session 1
A First Look at SPSS Statistics 18
Objectives
In this session, you will learn to do the following:
• Launch and exit the program
• Enter quantitative and qualitative data in a data file
• Create and print a graph
• Get Help
• Save your work to a disk
Launching SPSS/PASW Statistics 18
Before starting this session, you should know how to run a
program within the various Windows operating systems. All the
instructions in this manual presume basic familiarity with the Windows
environment.
Check with your instructor for specific instructions about running the
program on your school’s system. Your instructor will also tell you where
to find the software and its related files.
Click on the Start button at the lower left of your screen, and
among the programs, find SPSS Inc and select PASW Statistics 18 ➤
PASW Statistics 18. Depending on how the program was installed, you
may also have a shortcut icon on your desktop.
On the next page is an image of the screen you will see when the
software is ready. First you will see a menu dialog box listing several
options; behind it is the Data Editor, which is used to display the data
that you will analyze using the program. Later you will encounter the
Output Viewer window that displays the results of your analysis. Each
window has a unique purpose, to be made clear in due course. It’s
important at the outset to know there are several windows with different
functions.
At any point in your session, only one window is selected,
meaning that mouse actions and keystrokes will affect that window
alone. When you start, there’s a special start-up window. For now, click
Cancel and the Data Editor will be selected.
Since the software operates upon data, we generally start by
placing data into the Editor, either from the keyboard or from a stored
disk file. The Data Editor looks much like a spreadsheet. Cells may
contain numbers or text, but unlike a spreadsheet, they never contain
formulas. Except for the top row, which is reserved for variable names,
rows are numbered consecutively. Each variable in your dataset will
occupy one column of the data file, and each row represents one
observation. For example, if you have a sample of fifty observations on
two variables, your worksheet will contain two columns and fifty rows.
The menu bar across the top of the screen identifies broad
categories of SPSS’ features. There are two ways to issue commands in
SPSS: choose commands from the menu or icon bars, or type them
directly into a Syntax Editor. This book always refers you to the menus
and icons. You can do no harm by clicking on a menu and reading the
choices available, and you should expect to spend some time exploring
your choices in this way.
Entering Data into the Data Editor
For most of the sessions in this book, you will start by accessing
data already stored on a disk. For small datasets or class assignments,
though, it will often make sense simply to type in the data yourself. For
this session, you will transfer the data displayed below into the Data
Editor.
In this first session, our goal is simple: to create a small data file,
and then use the software to construct two graphs using the data. This is
typical of the tasks you will perform throughout the book.
The coach of a high school swim team runs a practice for 10
swimmers, and records their times (in seconds) on a piece of paper.1
Each swimmer is practicing the 50-meter freestyle event, and the boys on
the team assert that they did better than the girls. The coach wants to
analyze these results to see what the facts are. He codes gender as F
(female) for the girls and M (male) for the boys.
Swimmer    Gender    Time
Sara       F         29.34
Jason      M         30.98
Joanna     F         29.78
Donna      F         34.16
Phil       M         39.66
Hanna      F         44.38
Sam        M         34.80
Ben        M         40.71
Abby       F         37.03
Justin     M         32.81
The first step in entering the data into the Data Editor is to define
three variables: Swimmer, Gender, and Time. Creating a variable
requires us to name it, specify the type of data (qualitative, quantitative,
number of decimal places, etc.) and assign labels to the variable and data
values if we wish.
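As a rough sketch of what these three variable definitions amount to, the Python snippet below (our own illustration, not part of SPSS) records each variable's name, type, and label, and then checks the first observation against those definitions. The `Variable` class, its field names, and the `matches` helper are hypothetical; the labels are the ones assigned later in this session.

```python
from dataclasses import dataclass

@dataclass
class Variable:
    """One variable definition (hypothetical structure, for illustration only)."""
    name: str
    kind: str    # "string" for qualitative data, "numeric" for quantitative data
    label: str

variables = [
    Variable("Swimmer", "string", "Name of Swimmer"),
    Variable("Gender", "string", "Sex of swimmer"),
    Variable("Time", "numeric", "Practice time (secs)"),
]

# First observation from the coach's table.
row = {"Swimmer": "Sara", "Gender": "F", "Time": 29.34}

def matches(value, kind):
    """Return True if a value is compatible with the declared variable type."""
    if kind == "string":
        return isinstance(value, str)
    return isinstance(value, (int, float))

row_ok = all(matches(row[v.name], v.kind) for v in variables)
print(row_ok)
```

The point of the sketch is only that each column has a declared type before any data are typed in, which is the same order of work the Variable View imposes.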
1 Nearly every dataset in this book is real. For the sake of starting
modestly, we have taken a minor liberty in this session. This example is actually
extracted from a dataset you will use later in the book. The full dataset appears
in two forms: Swimmer and Swimmer2.
Move your cursor to the bottom of the Data Editor, where you will
see a tab labeled Variable View. Click on that tab. A different grid
appears, with these column headings (widen the window to see all
columns):
For each variable we create, we need to specify all or most of the
attributes described by these column headings.
Move your cursor into the first empty cell in Row 1 (under Name)
and type the variable name Swimmer. Press Enter (or Tab).
Now click within the Type column, and a small gray button
marked with three dots will appear; click on it and you’ll
see this dialog box. Numeric is the default variable type.
Click on the circle labeled String in the lower left corner of the
dialog box. The names of the swimmers constitute a nominal or
categorical variable, represented by a “string” of characters rather
than a number. Click OK.
Notice that the Measure column (far right column) now reads
Nominal, because you chose String as the variable type.
In SPSS, each variable may carry a descriptive label to help
identify its meaning. Additionally, as we’ll soon see, we can also label
individual values of a variable. Here’s how we add the variable label:
Move the cursor into the Label column, and type Name of
Swimmer. As you type, notice that the column gets wider. This
completes the definition of our first variable.
Now let’s create a variable to represent gender. Move to the first
column of row 2, and name the new variable Gender.
Like Swimmer, Gender is also a nominal scale variable, so we will
proceed as in the prior step. Change the variable type from
Numeric to String, and reduce the width of the variable from 8
characters down to 1.
Throughout the book, we’ll often ask you to carry out a step on your own
after demonstrating the technique in an earlier example. In
this way you will eventually build facility with these skills.
Label this variable Sex of swimmer.
Now we can assign text labels to our coded values. In the Values
column, click on the word None and then click the gray box with
three dots. This opens the Value Labels dialog box (completed
version shown here). Type F in the Value box and type Female in
the Value Label box. Click Add.
Then type M in Value, and Male in Value Label. Click Add, and
then click OK.
Finally, we’ll create a scale variable in this dataset: Time.
Begin as you have done twice now, by naming the third variable
Time. You may leave Type, Width, and Decimals as they are, since
Time is a numeric variable and the default setting of 8 spaces
wide with two decimal places is appropriate here.2
Label this variable “Practice time (secs).”
Switch to the Data View by clicking the appropriate tab in the
lower left of your screen.
Follow the directions below, using the data table shown earlier in this session.
If you make a mistake, just return to the cell and retype the entry.
Move the cursor to the first cell below Swimmer, and type Sara;
then press Enter. In the next cell, type Jason. When you’ve
completed the names, move to the top cell under Gender, and go
on. When you are finished, the Data Editor should look like this:
In the View menu at the top of your screen, select Value Labels;
do you see the effect in the Data Editor? Return to the View menu
and click Value Labels again. You can toggle labels on and off in
this way.
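One way to picture the Value Labels toggle is as a lookup table applied only when the data are displayed: the stored codes never change. The short Python sketch below is our own analogy, not a description of how SPSS is implemented, and the `display` function is a hypothetical name.

```python
# Value labels defined for Gender in this session.
value_labels = {"F": "Female", "M": "Male"}

# Gender codes in the order the swimmers were entered.
gender = ["F", "M", "F", "F", "M", "F", "M", "M", "F", "M"]

def display(codes, show_labels):
    """Return what the column would show with labels switched on or off."""
    if show_labels:
        # Fall back to the raw code if no label has been defined for it.
        return [value_labels.get(code, code) for code in codes]
    return list(codes)

print(display(gender, show_labels=True)[:3])
print(display(gender, show_labels=False)[:3])
```

Whichever view is chosen, the underlying list `gender` still holds the one-character codes.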
Saving a Data File
It is wise to save all of your work in a disk file. SPSS distinguishes
between two types of files (output and data) that one might want to
save. At this point, we’ve created a data file and ought to save it on a
disk. Let’s call the data file Swim.
2 When we create a numeric variable, we specify the maximum length of
the variable and the number of decimal places. For example, the data type
“Numeric 8.2” refers to a number eight characters long, of which the final two
places follow the decimal point: e.g., 12345.78.
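The “Numeric 8.2” data type described in footnote 2 has a close analogue in Python's format specification, where `8.2f` also means a total width of eight characters with two digits after the decimal point. The snippet below is only an analogy to make the width-and-decimals idea concrete; it says nothing about how SPSS stores numbers internally.

```python
value = 12345.78                 # the example from the footnote
text = format(value, "8.2f")     # width 8 (including the decimal point), 2 decimals
print(text)                      # prints 12345.78

# A shorter number is padded on the left to fill the same eight characters.
swim_time = format(29.34, "8.2f")
print(repr(swim_time))
```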
Check with your instructor to see if you can save the data file on a hard drive
or network drive in your system. On your own computer, it is wise to establish
a folder to hold your work related to this book.
On the File menu, choose Save As…. In the Save in box, select the
destination directory that you have chosen (in our example, we’re saving it
to the Desktop). Then, next to File Name, type swim. Click Save.
A new output Viewer window will open, with an entry that
confirms you’ve saved your data file.
Creating a Bar Chart
With the data entered and saved, we can begin to look for an
answer for the coach. We’ll first use a bar graph to display the average
time for the males in comparison to the females. In SPSS, we’ll use the
Chart Builder to generate graphs.
Click on Graphs in the menu bar, and choose Chart Builder….
You will see an information window noting that variables must be
specified as we did earlier. Close the window and you’ll find the
dialog box shown at the top of the next page.
From now on in this book, we’ll abbreviate menu selections with the name of
the menu and the submenu or command. The command you just gave would be
Graphs > Chart Builder…
The Chart Builder shows a list of graph types and allows us to
specify which variable(s) to summarize as well as many options. This is
true for many commands; we’ll typically use the default options early in
this book, moving to other choices as you become more familiar with
statistics and with SPSS.
1. In the Gallery of chart types, we’ll first select Bar.
2. Drag the Simple Bar chart icon to the Preview area.
In the lower left of the dialog, note that Bar chart is the default
option. There are several basic types of bar chart here, symbolized by the
icons in the lower center of the dialog. The first of these icons
represents a simple bar chart; drag it to the Preview area.
The Preview area of the Chart Builder displays a prototype of the
graph we are starting to build. In our graph, we’ll want to display two
bars to represent the average practice times of the girls and the boys. To
do this, we’ll place sex on the horizontal axis and average practice time
on the vertical. In the Chart Builder, this is easily accomplished by
dragging the variables to the axes.
Notice that the three variables are initially listed by description
and name on the left side of the dialog box, along with special symbols:
Nominal variable (qualitative)
Scale variable (quantitative)
In the upper left of the dialog, highlight Sex of swimmer and drag it
to the horizontal axis within the preview.
Similarly, click and drag Practice time to the vertical axis. In the
preview, note that the axis is now labeled Mean Practice Time. By
default, SPSS suggests summarizing this quantitative variable.
It is good practice to place a title on graphs. In the lower portion
of the dialog, click the tab marked Titles/Footnotes. Check the
Title 1 box. In the Content area of the Element Properties dialog,
type a title (we’ve chosen “Comparison of Female & Male Practice
Times”). Then click Apply at the bottom of the Element Properties
dialog and OK at the bottom of the Chart Builder dialog.
You will now see a new window appear, containing a bar chart
(see next page). This is the output Viewer, and contains two “panes.” On
the left is the Outline pane, which displays an outline of all of your
output. The Content pane, on the right, contains the output itself.
Also, notice the menu bar at the top of the Viewer window. It is
very similar to the one in the Data Editor, with some minor differences.
In general, we can perform statistical analysis from either window. Later,
we’ll learn some data manipulation commands that can only be given
from the Data Editor.
[Figure: the Viewer window, with the Outline pane on the left and the Contents pane on the right.]
Now look at the chart. The height of each bar corresponds to the
simple average time of the males and females. What does the chart tell
you about the original question: Did the males or females have a
better practice that day?
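The bar heights can be cross-checked by hand. The sketch below, written in Python rather than SPSS, groups the ten practice times by the Gender code and averages each group. The function name `mean_by_group` is our own; the data are the values from the coach's table.

```python
from statistics import mean

# (Gender code, time in seconds) pairs from the coach's table.
times = [
    ("F", 29.34), ("M", 30.98), ("F", 29.78), ("F", 34.16), ("M", 39.66),
    ("F", 44.38), ("M", 34.80), ("M", 40.71), ("F", 37.03), ("M", 32.81),
]

def mean_by_group(pairs):
    """Average the second element of each pair within each group code."""
    groups = {}
    for code, seconds in pairs:
        groups.setdefault(code, []).append(seconds)
    return {code: mean(values) for code, values in groups.items()}

means = mean_by_group(times)
for code, average in sorted(means.items()):
    print(code, round(average, 2))
```

Running it gives one number per bar, which you can compare with the heights you read off the chart.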
There is much more to a set of data than its average. Let’s look at
another graph that can give us a feel for how the swimmers did
individually and collectively. This graph is called a box-and-whiskers plot
(or boxplot), and displays how the swimmers’ times were spread out.
Boxplots are fully discussed in Session 4, but we’ll take a first look now.
You may issue this command either from the Data Editor or the Viewer.
Graphs > Chart Builder… The dialog reopens where we last left it,
with the Titles tab foremost. Return to the Gallery tab and choose
Boxplot from the gallery, dragging Simple Boxplot to the preview.
Notice that the earlier selections still apply; our choice of
variables is unchanged. This is often a very helpful feature of the Chart
Builder: we can explore different graphing alternatives without needing to
redo all prior steps. Go ahead and click OK.
The boxplot shows results for the males and females. There are
two boxes, and each has “whiskers” extending above and below the box.
In this case, the whiskers extend from the shortest to the longest time.
The outline of the box reflects the middle three times, and the line
through the middle of the box represents the median value for the
swimmers.3
Looking now at the boxplot, what impression do you have of
the practice times for the male and female swimmers? How does
this compare to your impression from the first graph?
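The landmarks of each box can also be worked out directly from the ten times. The Python sketch below follows the description given above for this particular chart: with five swimmers per group, the whiskers run from the shortest to the longest time and the box covers the middle three ordered times. It is deliberately limited to five observations and is not a general quartile rule; `boxplot_summary` is a hypothetical name.

```python
from statistics import median

# Practice times (seconds) from the coach's table, split by Gender code.
female = [29.34, 29.78, 34.16, 44.38, 37.03]
male = [30.98, 39.66, 34.80, 40.71, 32.81]

def boxplot_summary(times):
    """Locate whiskers, box edges, and median for exactly five observations."""
    ordered = sorted(times)
    if len(ordered) != 5:
        raise ValueError("this sketch only handles five observations")
    return {
        "whisker_low": ordered[0],    # shortest time
        "box_low": ordered[1],        # lower edge of the middle three
        "median": median(ordered),    # third of the five ordered times
        "box_high": ordered[3],       # upper edge of the middle three
        "whisker_high": ordered[4],   # longest time
    }

female_summary = boxplot_summary(female)
male_summary = boxplot_summary(male)
print(female_summary)
print(male_summary)
```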
Saving an Output File
At this point, we have the Viewer open with some output and the
Data Editor with a data file. We have saved the data, but have not yet
saved the output on a disk. This can sometimes be confusing for new
users—the raw data files are maintained separately from the results we
generate during a working session.
3 The median of a set of points is the middle value when the observations
are ranked from smallest to largest. With only five swimmers of each gender, the
median values are just the time recorded for the third female and the third male.
File > Save As… In this dialog box, assign a name to the file (such
as Session 1). This new file will save both the Outline and Content
panes of the Viewer.
Getting Help
You may have noticed the Help button in the dialog boxes. SPSS
features an extensive on-line Help system. If you aren’t sure what a term
in the dialog box means, or how to interpret the results of a command,
click on Help. You can also search for help on a variety of topics via the
Help menu at the top of your screen. As you work your way through the
sessions in this book, Help may often be valuable. Spend some time
experimenting with it before you genuinely need it.
Printing in SPSS
Now that you have created some graphs, let’s print them. Be sure
that no part of the outline is highlighted; if it is, click once in a clear area
of the Outline pane. If a portion of the outline is selected, only that
portion will print.
Check with your instructor about any special considerations in selecting a
printer or issuing a print command. Every system works differently in this
matter.
File > Print… This command will print the Contents pane of the
Viewer. Click OK.
Quitting SPSS
When you have completed your work, it is important to exit the
program properly. Virtually all Windows programs follow the same
method of quitting.
File > Exit. You will generally see a message asking if you wish to
save changes. Since we saved everything earlier, click No.
That’s all there is to it. Later sessions will explain menus and
commands in greater detail. This session is intended as a first look; you
will return to these commands and others at a later time.
Session 2
Tables and Graphs for One Variable
Objectives
In this session, you will learn to do the following:
• Retrieve data stored in an SPSS data file
• Explore data with a Stem-and-Leaf display
• Create and customize a histogram
• Create a frequency distribution
• Print output from the Viewer window
• Create a bar chart
Opening a Data File
In the previous session, you created an SPSS data file by entering
data into the Data Editor. In this lab, you’ll use several data files that are
available on your disk. This session begins with some data about traffic
accidents in the United States. Our goal is to get a sense of how
prevalent fatal accidents were in 2005.
NOTE:
The location of SPSS files depends on the configuration of your
computer system. Check with your instructor.
Choose File > Open > Data… A dialog box like the one shown on
the next page will open. In the Look in: box, select the appropriate
directory for your system or network, and you will see a list of
available worksheet files. Select the one named States. (This file
name may appear as States.sav on your screen, but it’s the same
file.)
Click Open, and the Data Editor will show the data from the
States file. Using the scroll bars at the bottom and right side of the
screen, move around the worksheet, just to look at the data. Move the
cursor to the row containing variable names (e.g., state, MaleDr, FemDr,
etc.). Notice that the variable labels appear as the cursor passes each
variable name. Consult Appendix A for a full description of the data files.
Exploring the Data
SPSS offers several tools for exploring data, all found in the
Explore command. To start, we’ll use the Stem-and-Leaf plot to look at
the number of people killed in automobile accidents in 2005.
Analyze > Descriptive Statistics > Explore… We want to select
Number of fatalities in accidents in 2005 [accfat2005]. As shown in this
dialog box, the variable names appear to be truncated.
1. Highlight this variable and click once
2. Click on arrow to
select the variable
You can increase the size of a dialog box by placing the cursor on
any edge and dragging the box out. Try it now to make it easier to
find the variable of interest here. Once you select the variable,
click OK.
Many SPSS dialog boxes show a list of variables, as this one does.
Here the variables are listed in the same order as in the Data Editor. In
other dialog boxes, they may be listed alphabetically by variable label.
When you move your cursor into the list, the entire label becomes
visible. The variable name appears in square brackets after the label.
This book often refers to variables by name, rather than by label. If you
cannot find the variable you are looking for, consult Appendix A.
By default, the Explore command reports on the extent of missing
data, generates a table of descriptive statistics, creates a stem-and-leaf
plot, and constructs a box-and-whiskers plot. The descriptive statistics
and boxplot are treated later in Session 4.
The first item in the Viewer window summarizes how many
observations we have in the dataset; here there are 51 “cases,” or
observations, in all. For every one of the 50 states plus the District of
Columbia, we have a valid data value, and there is no missing data.
Below that is a table of descriptive statistics. For now, we bypass
these figures, and look at the Stem-and-Leaf plot, shown on the next
page and explained below.
In this output, there are three columns of information,
representing frequency, stems, and leaves. Looking at the notes at the
bottom of the plot, we find that each stem line represents a 1000’s digit,
and each leaf represents 1 state.
Note that the first five rows have a 0 stem. The first row
represents states with between 0 and 199 fatalities, while the second row
represents states with 200 to 399 fatalities, and so on. Thus, in the first
row of output we find that 11 states had between 0 and 199 automobile
accident fatalities in 2005. There are four “0-leaves” in that first row;
these represent four states that had fewer than 100 fatalities that year.
The seven “1-leaves” (highlighted below) represent seven states with
between 100 and 199 fatalities. Moving down the plot, the row with a
stem of 1 and leaves of 6 and 7 indicates that one state had fatalities in
the 1600s and one state had fatalities in the 1700s. Finally, in the last
row, we find 3 states that had at least 3504 fatalities, and that these are
considered extreme values.
There are 5 rows with a
stem of 0. Leaves in the
first row are values under
200; the second row is for
values 200-399, etc.
Each stem is a
1000’s digit (e.g. 2
stands for 2000)
Let’s take a close look at the first row of output to review what it
means.
Frequency    Stem &  Leaf
    11.00       0 .  00001111111
The whole row: 11 states had fewer than 200 fatalities.
The seven 1-leaves: these 7 states had between 100 and 199 fatalities.
The Stem-and-Leaf plot helps us to represent a body of data in a
comprehensible way, and permits us to get a feel for the “shape” of the
distribution. It can help us to develop a meaningful frequency
distribution, and provides a crude visual display of the data. For a better
visual display, we turn to a histogram.
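To make the construction of the display concrete, here is a minimal Python sketch of a stem-and-leaf tally that uses the same layout as the output discussed above: each stem is a 1000s digit, each leaf is a 100s digit, and every stem is split across five rows of two leaf digits each. The input values are made up for illustration (they are not the States data), the function name is our own, and the sketch ignores the separate "extremes" row that SPSS reports.

```python
from collections import defaultdict

def stem_and_leaf(values, stem_unit=1000, lines_per_stem=5):
    """Return (frequency, stem, leaves) rows for non-negative whole numbers."""
    leaf_unit = stem_unit // 10              # leaves are the next digit down
    leaves_per_line = 10 // lines_per_stem   # 2 leaf digits per row here
    rows = defaultdict(list)
    for value in sorted(values):
        stem = value // stem_unit
        leaf = (value % stem_unit) // leaf_unit
        rows[(stem, leaf // leaves_per_line)].append(str(leaf))
    return [(len(leaves), stem, "".join(leaves))
            for (stem, _line), leaves in sorted(rows.items())]

# Hypothetical counts chosen only to illustrate the layout.
sample = [45, 73, 88, 95, 110, 130, 150, 160, 170, 180, 190, 250, 1650, 1720]

rows = stem_and_leaf(sample)
for frequency, stem, leaves in rows:
    print(f"{frequency:9.2f}   {stem} .  {leaves}")
```

With these made-up values the first row has four 0-leaves and seven 1-leaves, the same pattern as the first row of the output examined above.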
Creating a Histogram
In the first session, we created a bar graph and boxplots. In this
session, we’ll begin by making a histogram.
Graphs > Chart Builder… Under the Choose From menu,
select Histogram. Four choices of histograms will now appear.
Drag the first histogram (simple) to the preview area. As
shown below, select AccFat2005 by clicking on it and dragging
it to the X axis. By default, histograms have frequency on the
Y axis so this part of the graph is all set.
Click on the Titles/Footnotes tab, select Title 1, and type a title
for this graph (e.g., 2005 Traffic Fatalities) in the space marked
Content within the Element Properties window. Click Apply.
Now place your name on your graph by selecting Footnote 1,
typing in the content box, and clicking Apply. Your histogram
will appear in the Viewer window after you click OK.
The horizontal axis represents the number of fatalities, and the vertical
axis represents the number of states reporting that many cases. The
histogram provides a visual sense of the frequency distribution. Notice
that the vast majority of the states appear on the left end of the graph.
How would you describe the shape of this distribution?
Compare this histogram to the Stem-and-Leaf plot. What important
differences, if any, do you see?
Also notice the short bars at the extreme right end of the graph.
What state do you think might lie furthest to the right? Look in the
Data Editor to find that outlier.
In this histogram, SPSS determined the number of bars, which
affects the apparent shape of the distribution. Using the Chart Editor we
can change the number of bars as follows:
Double-click anywhere on your histogram to open the
Chart Editor (see next page).
Now double-click on the bars of the histogram. A Properties dialog
box will appear. Under the Binning tab, choose Custom for the X
axis. Type in 24 as the number of intervals as shown in the
illustration on the next page. Click Apply and you’ve changed the
number of intervals in your histogram.
You can experiment with other numbers of bars as well. When
you are satisfied, close the Chart Editor by clicking on the close (×)
button in the upper right corner.
How does this compare to your first histogram? Which
graph better summarizes the dataset? Explain.
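To see why the number of intervals changes the apparent shape, it can help to count the same values into different numbers of equal-width bins. The Python sketch below does this for a handful of made-up numbers (not the States data). The function `bin_counts` is our own, and its rule for placing bin edges is an assumption that may differ from the one SPSS applies.

```python
def bin_counts(values, n_bins):
    """Count values into n_bins equal-width intervals from min to max."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("values must not all be identical")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        # The largest value would land one past the end, so clamp it.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical data: mostly small values plus one large one.
data = [0, 10, 20, 30, 40, 50, 60, 200]

coarse = bin_counts(data, 2)
medium = bin_counts(data, 4)
fine = bin_counts(data, 24)
print(coarse)
print(medium)
print(fine)
```

Every version accounts for all eight values; only the grouping changes, and with it the picture.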
We would expect more populous states to have more fatalities
than smaller states. As such, it might make more sense to think in terms
of the proportion of the population killed in accidents in each state. In
our dataset, we have a variable called Traffic fatalities per 100,000 pop, 2005
[RateFat].
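A rate of this kind is the count of fatalities divided by the population, scaled to 100,000 residents. The Python sketch below illustrates the arithmetic with made-up figures for two imaginary states; neither the numbers nor the function name `rate_per_100k` come from the States file.

```python
def rate_per_100k(fatalities, population):
    """Fatalities per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return fatalities * 100_000 / population

# Hypothetical figures: a populous state and a much smaller one.
large_state = rate_per_100k(3000, 30_000_000)
small_state = rate_per_100k(500, 2_500_000)
print(large_state, small_state)
```

In this invented example the larger state has six times as many fatalities but half the rate, which is why the two histograms can tell different stories.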
Use the Chart Builder to construct a histogram for the variable
Ratefat. Note that you can replace Accfat2005 with Ratefat by
dragging the new variable into the horizontal axis position.
In the Element Properties box, find Edit Properties and
choose Title 1. Notice that the title of the previous graph is still
there. Replace it with a new title, click Apply, and then click OK.
How would you describe the shape of this distribution?
What was the approximate average rate of fatalities per 100,000
residents in 2005? Is there an outlier in this analysis? In which
states are traffic fatalities most prevalent?
Now, return to the Chart Builder. In the Element Properties box,
under Statistics, choose Cumulative Count; click Apply and OK.
A cumulative histogram displays cumulative frequency. As you
read along the horizontal axis from left to right, the height of the bars
represents the number of states experiencing a rate less than or equal to
the value on the horizontal axis. Compare the results of this graph to the
prior graph. About how many states had traffic fatality rates of less
than 20 fatalities per 100,000 population?
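To make the idea of cumulative frequency concrete, here is a small Python sketch that turns per-interval counts into running totals. The interval edges and counts are hypothetical values chosen for illustration; they are not the results you will see for RateFat.

```python
from itertools import accumulate

# Hypothetical per-interval counts for a histogram, for illustration only.
upper_bounds = [10, 15, 20, 25, 30]   # upper edge of each interval
counts = [4, 18, 16, 9, 3]            # number of cases in each interval

# Running total from left to right: each entry is the number of cases
# at or below the corresponding upper bound.
cumulative = list(accumulate(counts))

for bound, total in zip(upper_bounds, cumulative):
    print(f"rate <= {bound}: {total} cases")
```

The last cumulative value always equals the total number of cases, which is why the tallest bar in a cumulative histogram is the right-most one.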
Frequency Distributions
Let’s look at some questions concerning qualitative data. Switch
from the Viewer window back to the Data Editor window.
File ➤ Open ➤ Data… Choose the data file Census2000. SPSS
allows you to work with multiple data files, but you may wish to
close States.
This file contains a random sample of 1270 Massachusetts
residents, with their responses to selected questions on the 2000 United
States Decennial Census. One question on the census form asked how
they commute to work. In our dataset, the relevant variable is called
Means of Transportation to Work [TRVMNS]. This is a categorical, or nominal,
variable. The Bureau of the Census has assigned the following code numbers
to represent the various categories:
Value   Meaning
0       n/a, not a worker or in the labor force
1       Car, Truck, or Van
2       Bus or trolley bus
3       Streetcar or trolley car
4       Subway or elevated
5       Railroad
6       Ferryboat
7       Taxicab
8       Motorcycle
9       Bicycle
10      Walked
11      Worked at Home
12      Other Method
To see how many people in the sample used each method, we can
have SPSS generate a simple frequency distribution.
Analyze ➤ Descriptive Statistics ➤ Frequencies… Select the
variable Means of Transportation to Work [TRVMNS] and click OK.
In the Viewer window, you should now see this:
Among people who work, which means of transportation is
the most common? The least common? Be careful: the most common
response was “not working” at all.
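The caution about “not working” can be illustrated with a short Python sketch: tally every code, then set aside code 0 before looking for the most common means of transportation. The responses below are hypothetical and only a few of the category labels are included; this is not the Census2000 sample.

```python
from collections import Counter

# A few of the TRVMNS codes, with shortened labels.
LABELS = {0: "n/a", 1: "Car, Truck, or Van", 2: "Bus or trolley bus", 10: "Walked"}

# Hypothetical responses, for illustration only (not the Census2000 data).
responses = [0, 1, 1, 0, 2, 1, 10, 0, 0, 1, 0, 2]

freq = Counter(responses)                       # frequency of every code
workers = {code: n for code, n in freq.items() if code != 0}
n_workers = sum(workers.values())               # cases with a real commute

for code, n in sorted(workers.items()):
    print(f"{LABELS[code]:<20} {n:>3} {100 * n / n_workers:5.1f}%")
```

In this made-up sample the single most frequent code is 0, but among workers the most frequent code is 1. Percentages computed over workers only are the ones that answer the question.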
Another Bar Chart
To graph this distribution, we should make a bar chart.
Graphs ➤ Chart Builder… Choose Bar. Select the first bar graph
option (simple) by dragging it to the preview area. Drag the
TRVMNS variable to the X axis. Place a title and your name on
the graph and click OK.
The bar chart and frequency distribution should contain the
same information. Do they? Comment on the relative merits of
using a frequency table versus a bar chart to display the data.
Printing Session Output
Sometimes you will want to print all or part of a Viewer window.
Before printing your session, be sure you have typed your name into the
output. To print the entire session, click anywhere in the Contents pane
of the Viewer window (be sure not to select a portion of the output), and
then choose File ➤ Print. To print part of a Viewer window, do this:
In the Outline pane of the Viewer window (the left side of the
screen), locate the first item of the output that you want to print.
Position the cursor on the name of that item, and click the left
mouse button.
Using the scroll bars (if necessary), move the cursor to the end of
the portion you want to print. Then press Shift on the keyboard
and click the left mouse button. You’ll see your selection
highlighted, as shown here.
File ➤ Print… Notice that the Selection button is already marked,
meaning that you’ll print a selection of the output within the
Contents pane. Click OK.
[Figure: Viewer window with the Outline pane and the Contents pane labeled. Only the highlighted portions will print.]
Moving On…
Using the skills you have practiced in this session, now answer
the following questions. In each case, provide an appropriate graph or
table to justify your answer, and explain how you drew your conclusion.
1. (Census2000 file) Note that the TRVMNS variable includes the
responses of people who don’t have jobs. Among those who do
have jobs, what proportion use some type of public
transportation (bus, subway, or railroad)?
For the following questions, you will need to use the files States,
Marathon, AIDS, BP, and Nielsen (see Appendix A for detailed file
descriptions). You may be able to use several approaches or commands
to answer the question; think about which approach seems best to you.
States
2. The variable named BAC2004 refers to the legal blood alcohol
threshold for driving while intoxicated. All states set the
threshold at either .08 or .10. About what percentage of states
use the .08 standard?
3. The variable called Inc2004 is the median per capita income
for state residents in 2004. Did residents of all states earn
about the same amount of income? What seems to be a
typical amount? How much variation is there across states?
4. The variable called mileage is the average number of miles
driven per year by a state’s drivers. With the help of a
Stem-and-Leaf plot, locate (in the Data Editor) two states where
drivers lead the nation in miles driven; what are they?
Marathon
This file contains the finish times for the wheelchair racers in the
100th Boston Marathon.
5. The variable Country is a three-letter abbreviation for the
home country of the racer. Not surprisingly, most racers were
from the USA. What country had the second highest number
of racers?
6. Use a cumulative histogram to determine approximately what
percentage of wheelchair racers completed the 26-mile course
in less than 2 hours, 10 minutes (130 minutes).
7. How would you characterize the shape of the histogram of the
variable Minutes? (Experiment with different numbers of
intervals in this graph.)
AIDS
This file contains data related to the incidence of AIDS around
the world.
8. How would you characterize the shape of the distribution of
the number of adults living with HIV/AIDS in 2005? Are there
any outlying countries? If so, what are they?
9. Consider the 2003 infection rate (%). Compare the shape of
this distribution to the shape of the distribution in the
previous question.
BP
This file contains data about blood pressure and other vital signs
for subjects after various physical and mental activities.
10. The variable sbprest is the subject’s systolic blood pressure at
rest. How would you describe the shape of the distribution of
systolic blood pressure for these subjects?
11. Using a cumulative histogram, approximately what percent of
subjects had systolic pressure of less than 140?
12. The variable dbprest is the subject’s diastolic blood pressure at
rest. How would you describe the shape of the distribution of
diastolic blood pressure for these subjects?
13. Using a cumulative histogram, approximately what percent of
subjects had diastolic pressure of less than 80?
Nielsen
This file contains the Nielsen ratings for the 20 most heavily
watched television programs for the week ending September 24, 2007.
14. Which of the networks reported had the most programs in the
top 10? Which had the fewest?
15. Approximately what percentage of the programs enjoyed
ratings in excess of 11.5?
Session 3
Tables and Graphs for Two Variables
Objectives
In this session, you will learn to do the following:
• Cross-tabulate two variables
• Create several bar charts comparing two variables
• Create a histogram for two variables
• Create an XY scatterplot for two quantitative variables
Cross-Tabulating Data
The prior session dealt with displays of a single variable. This
session covers some techniques for creating displays that compare two
variables. Our first example considers two qualitative variables. The
example involves the Census data that you saw in the last session, and
in particular addresses the question: “Do men and women use the same
methods to get to work?” Since sex and means of transportation are
both categorical data, our first approach will be a joint frequency table,
also known as a cross-tabulation.
Open the Census file by selecting File ➤ Open ➤ Data…, and
choosing Census2000.
Analyze ➤ Descriptive Statistics ➤ Crosstabs… In the dialog box
(next page), select the variables Means of transportation to work
[TRVMNS] and Sex [sex], and click OK. You’ll find the
cross-tabulation in the Viewer window. Who makes greater use of
cars, trucks, or vans: Men or women? Explain your
reasoning.
The results of the Crosstabs command are not mysterious. The
case processing summary indicates that there were 1270 cases, with no
missing data. In the crosstab itself, the rows of the table represent the
various means of transportation, and the columns refer to males and
females. Thus, for instance, 243 women commuted in a car, truck, or
van.
Simply looking at the frequencies could be misleading, since the
sample does not have equal numbers of men and women. It might be
more helpful to compare the percentage of men commuting in this way to
the percentage of women doing so. Even percentages can be misleading if
the samples are small. Here, fortunately, we have a large sample. Later
we’ll learn to evaluate sample information more critically with an eye
toward sample size.
The cross-tabulation function can easily convert the frequencies
to relative frequencies. We could do this by returning to the Crosstabs
dialog box following the same menus as before, or by taking a slightly
different path.
Editing a Recent Dialog
Often, we’ll want to repeat a command using different variables or
options. For quick access to a recent command, SPSS provides a special
button on the toolbar below the menus. Click on the Dialog Recall
button (shown to the right), and you’ll see a list of recently issued
commands. Crosstabs will be at the top of the list; click on
Crosstabs, and the last dialog box will reappear.
To answer the question posed above, we want the values in each
cell to reflect frequencies relative to the number of women and
men, so we want to divide each by the total of each respective
column. To do so, click on the button marked Cells, check Column
Percentages, click Continue, and then click OK. Based on this
table, would you say that men or women are more likely to
commute by car, truck, or van?
Now try asking for Row Percentages (click on Dialog Recall). What
do these numbers represent?
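The arithmetic behind the Cells options can be sketched in a few lines of Python. This is only an illustration of dividing each cell by its column total or its row total; the cases are hypothetical (not the Census2000 sample), and the helper names column_pct and row_pct are made up for this example.

```python
from collections import Counter

# Hypothetical (means of transportation, sex) pairs, for illustration only.
cases = [("Car", "F"), ("Car", "F"), ("Bus", "F"), ("Car", "M"),
         ("Car", "M"), ("Car", "M"), ("Bus", "M"), ("Walked", "F")]

cell = Counter(cases)                            # joint frequencies
col_total = Counter(sex for _, sex in cases)     # column totals (by sex)
row_total = Counter(mode for mode, _ in cases)   # row totals (by mode)


def column_pct(mode, sex):
    """Cell count as a percentage of its column (sex) total."""
    return 100 * cell[(mode, sex)] / col_total[sex]


def row_pct(mode, sex):
    """Cell count as a percentage of its row (mode) total."""
    return 100 * cell[(mode, sex)] / row_total[mode]


print(column_pct("Car", "F"), column_pct("Car", "M"))
print(row_pct("Car", "F"), row_pct("Car", "M"))
```

The same cell count yields two different percentages because the denominators differ. Deciding which denominator matches your question is the key step in reading a crosstab.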
More on Bar Charts
We can also use a bar chart to analyze the relationship between
two variables. Let’s look at the relationship between two qualitative
variables in the student survey: gender and seat belt usage. Students
were asked how frequently they wear seat belts when driving: Never,
Sometimes, Usually, and Always. What do you think the students said?
Do you think males and females responded similarly? We will create a
bar chart to help answer these questions.
In the Data Editor, open the file called Student.
Graphs ➤ Chart Builder… We used this command in the prior
session. From the Gallery choices, choose Bar. Then drag the
second bar graph icon (clustered) to the preview area. We must
specify a variable for the horizontal axis, and may optionally
specify other variables.
Drag Frequency of seat belt usage [belt] to the horizontal axis. If we
were to click OK now, we would see the total number of students
who gave each response. But we are interested in the comparison
of responses by men and women.
Drag Gender to the Cluster on X box. Click OK.
[Figure: Chart Builder dialog. We want to cluster the bars by Gender; the Cluster setting creates side-by-side bars for males and females.]
Look closely at the bar chart that you have just created. What
can you say about the seat belt habits of these students?
In this bar chart, the order of axis categories is alphabetical. With
this ordinal variable, it would be more logical to have the categories
sequenced by frequency: Never, Sometimes, Usually, and Always. We can
change the order of the categories either by opening the Chart Editor or
by recalling the Chart Builder dialog. Return to the prior Chart Builder
dialog.
In the Element Properties box on the right, select X-axis1 (Bar1).
Under Categories, use the up and down arrows to arrange the
categories on the horizontal axis in the following order: Never,
Sometimes, Usually, Always.
Click Apply in Element Properties and then OK in the main Chart
Builder dialog. The resulting graph should be easier to read and
interpret.
This graph uses clustered bars to
compare the responses of the men and the
women. A clustered bar graph highlights
the differences in belt use by men and
women, but it’s hard to tell how many
students are in each usage category. A stacked bar chart is a useful
alternative.
Select the dialog recall icon as we did previously and choose Chart
Builder. Drag the third bar graph icon (stacked) to the preview
area. The horizontal axis variable (frequency of seat belt usage) will
stay the same. However, you will need to drag Gender to the Stack
box.
Arrange the categories by frequency as done previously.
Here are the clustered and stacked versions of this graph. Do
they show different information? What impressions would a viewer
draw from these graphs?
[Figure: the Stacked and Clustered versions of the bar chart.]
We can also analyze a quantitative variable in a bar chart. Let’s
compare the grade point averages (GPA) of the men and women in the
student survey. We might compare the averages of the two groups.
Graphs ➤ Chart Builder… Choose Bar and drag the first bar graph
icon (simple) to the preview area.
Drag Current GPA [gpa] to the vertical axis and Gender to the
horizontal axis. Click OK.
The bars in the graph represent the mean, or average, of the GPA
variable. How do the average GPAs of males and females compare?
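Each bar in such a chart is simply a group mean. As a rough illustration of the computation behind the bars, the Python sketch below groups values by category and averages each group. The records are hypothetical and are not taken from the Student file.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (gender, GPA) records, for illustration only.
records = [("F", 3.2), ("M", 2.8), ("F", 3.6), ("M", 3.0), ("F", 2.8)]

# Collect the GPA values belonging to each gender.
by_group = defaultdict(list)
for gender, gpa in records:
    by_group[gender].append(gpa)

# One mean per group: these are the bar heights.
group_means = {g: mean(vals) for g, vals in by_group.items()}

for g, m in sorted(group_means.items()):
    print(f"{g}: mean GPA = {m:.2f} (n = {len(by_group[g])})")
```

Printing the group size next to each mean is a useful habit, since a bar chart of means does not show how many cases stand behind each bar.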
Comparing Two Distributions
The bar chart compared the mean GPAs for men and women.
How do the whole distributions compare? As a review, we begin by
looking at the distribution of GPAs for all students.
Graphs ➤ Chart Builder… Choose Histogram and drag the first
histogram icon (simple) to the preview area. Select Current GPA
[gpa] as the variable, and click OK. You’ll see the graph shown
here. How do you describe the shape of this distribution?
Let’s compare the distribution of grades for male and female
students. We’ll create two side-by-side histograms, using the same
vertical and horizontal scales:
Click the Dialog Recall button, and choose Chart Builder. We need
to indicate that the graph should distinguish between the GPAs
for women and men.
Click on the Groups/Point ID tab. Select Columns Panel Variable,
then drag Gender into the Panel box, and click OK.
What does this graph show about the GPAs of these
students? In what ways are they different? What do they have in
common? What reasons might explain the patterns you see?
Scatterplots to Detect Relationships
The prior example involved a quantitative and a qualitative
variable. Sometimes, we might suspect a connection between two
quantitative variables. In the student data, for example, we might think
that taller students generally weigh more than shorter ones. We can
create a scatterplot or XY graph to investigate.
Graphs ➤ Chart Builder… From the gallery choices, choose
Scatter/Dot. Then drag the first scatter graph icon (simple) to the
preview area. Select Weight in pounds [wt] as the y, or vertical axis
variable, and Height in inches [ht] as the x variable. Click OK.
Look at the scatterplot, reproduced here. Describe what you see
in the scatterplot. By eye, approximate the range of weights of
students who are 5’2” (or 62 inches) tall. Roughly how much more
do 6’2” students weigh?
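Reading a range of weights off the scatterplot at one height amounts to looking at a vertical slice of the points. The Python sketch below shows that idea with hypothetical (height, weight) pairs, not the Student data; the function name weight_range is made up for this example.

```python
# Hypothetical (height in inches, weight in pounds) pairs, for illustration only.
points = [(62, 105), (62, 130), (62, 118), (68, 150), (74, 170), (74, 205)]


def weight_range(points, height):
    """Return (min, max) weight among the cases with the given height."""
    weights = [w for h, w in points if h == height]
    if not weights:
        raise ValueError(f"no cases with height {height}")
    return min(weights), max(weights)


print(weight_range(points, 62))   # vertical slice at 62 inches
print(weight_range(points, 74))   # vertical slice at 74 inches
```

Comparing the two slices gives a rough sense of how much weight tends to change as height increases, which is the pattern the scatterplot shows at a glance.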
We can easily incorporate a third variable into this graph.
Recall the Chart Builder and drag the second scatterplot icon
(grouped) to the preview area. Drag Gender to the box marked
Set Color in the preview area. Click OK.
In what ways is this graph different from the first
scatterplot? What additional information does it convey? What
generalizations can you make about the heights and weights of
men and women? Which points might we consider to be outliers?
Moving On…
Create the tables and graphs described below. Refer to Appendix
A for complete data descriptions. Be sure to title each graph,
including your name. Print the completed graphs.
Student
1. Generate side-by-side histograms of the distribution of
heights, separating men and women. Comment on the
similarities and differences between the two groups.
2. Do the same for students’ weights.
Bev
3. Using the Interactive Bar Chart command, display the mean of
Revenue per Employee, by SIC category. Which beverage
industry generates the highest average revenue per employee?
4. Make a similar comparison of Inventory Turnover averages.
How might you explain the pattern you see?
SlavDiet
In Time on the Cross: The Economics of American Negro Slavery, by
Robert William Fogel and Stanley Engerman, the diets of slaves and the
general population are compared.
5. Create two bar charts summing up the calories consumed by
each group, by food type. How did the diets of slaves compare
to the rest of the population, according to these data? [NOTE:
you want the bars to represent the sum of calories]
Galileo
In the early 17th century, Galileo conducted a series of famous
experiments concerning gravity and projectiles. In one experiment, he
released a ball to roll down a ramp. He then measured the total
horizontal distance which the ball traveled until it came to a stop. The
data from that experiment occupy the first two columns of the data file.
In a second experiment, a horizontal shelf was added to the base
of the ramp, so that the ball rolled directly onto the shelf from the ramp.
Galileo recorded the vertical height and horizontal travel for this
apparatus as well, which are in the third and fourth columns of the file.¹
6. Construct a scatterplot for the first experiment, with release
height on the x axis and horizontal distance on the y axis.
Describe the relationship between x and y.
7. Do the same for the second experiment.
¹ Sources: Drake, Stillman, Galileo at Work (Chicago: University of
Chicago Press, 1978); Dickey, David A., and Arnold, J. Tim, “Teaching Statistics
with Data of Historic Significance,” Journal of Statistics Education, v. 3, no. 1,
1995.
AIDS
8. Construct a bar chart that displays the mean adult infection
rate in 2003, by World Health Organization region. Which
region of the world had the highest incidence of HIV/AIDS in
2003?
Mendel
Gregor Mendel’s early work laid the foundations for modern
genetics. In one series of experiments with several generations of pea
plants, his theory predicted the relative frequency of four possible
combinations of color and texture of peas.
9. Construct bar charts of both the actual experimental
(observed) results and the predicted frequencies for the peas.
Comment on the similarities and differences between what
Mendel’s theory predicted, and what his experiments showed.
Salem
In 1692, twenty persons were executed in connection with the
famous witchcraft trials in Salem, Massachusetts. At the center of the
controversy was Rev. Samuel Parris, minister of the parish at Salem
Village. The teenage girls who began the cycle of accusations often
gathered at his home, and he spoke out against witchcraft. This data file
represents a list of all residents who paid taxes to the parish in 1692. In
1695, many villagers signed a petition supporting Rev. Parris.
10. Construct a crosstab of proParris status and the accuser
variable. (Hint: Compute row or column percents, using the
Cells button.) Based on the crosstab, is there any indication
that accusers were more or less likely than nonaccusers to
support Rev. Parris? Explain.
11. Construct a crosstab of proParris status and the defend
variable. Based on the crosstab, is there any indication that
defenders were more or less likely than nondefenders to
support Rev. Parris? Explain.
12. Create a chart showing the mean (average) taxes paid, by
accused status. Did one group tend to pay higher taxes than
the other? If so, which group paid more?
Impeach
This file contains the results of the U.S. Senate votes in the
impeachment trial of President Clinton in 1999.
13. The variable called conserv is a rating scale indicating how
conservative a senator is (0 = very liberal, 100 = very
conservative). Use a bar chart to compare the mean ratings of
those who cast 0, 1, or 2 votes to convict the President.
Comment on any pattern you see.
14. The variable called Clint96 indicates the percentage of the
popular vote cast for President Clinton in the senator’s home
state in the 1996 election. Use a bar chart to compare the
mean percentages for those senators who cast 0, 1, or 2 votes
to convict the President. Comment on any pattern you see.
GSS2004
These questions were selected from the 2004 General Social
Survey. For each, construct a crosstab and discuss any possible
relationship indicated by your analysis.
15. Does a person’s political outlook (liberal vs. conservative)
appear to vary by their highest educational degree?
16. One question asks respondents if they consider themselves
happily married. Did women and men tend to respond
similarly? Did responses to this question tend to vary by
region of the country?
17. One question asks respondents about how frequently they
have sex. Did men and women respond similarly?
18. How does attendance at religious services vary by region of
the country?
GSS942004
This file contains responses to a series of General Social Survey
questions from 1994 and 2004. Respondents were different in the two
years. Use a bar chart to display the percentages of responses to the
following questions, comparing the 1994 and 2004 results. Comment on
the changes, if any, you see in the ten-year comparison.
19. Should marijuana be legalized?
20. Should abortion be allowed if a woman wants one for any
reason?
21. Should colleges permit racists to teach?
22. Are you afraid to walk in your neighborhood at night?
States
23. Use a scatterplot to explore the relationship between the
number of fatal injury accidents in a state and the population
of the state in 2005. Comment on the pattern, if any, in the
scatterplot.
24. Use a scatterplot to explore the relationship between the
number of fatal injury accidents in a state and the mileage
driven within the state in 2005. Comment on the pattern, if
any, in the scatterplot.
Nielsen
25. Chart the mean (average) rating by network. Comment on
how well each network did that week. (Refer to your work in
Session 2.)
Session 4
One-Variable Descriptive Statistics
Objectives
In this session, you will learn to do the following:
• Compute measures of central tendency and dispersion for a
variable
• Create a box-and-whiskers plot for a single variable
• Compute z-scores for all values of a variable
Computing One Summary Measure for a Variable
There are several measures of central tendency (mean, median,
and mode) and of dispersion (range, variance, standard deviation, etc.)
for a single variable. You can use SPSS to compute these measures. We’ll
start with the mode of an ordinal variable.
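For readers who want to see these measures computed outside SPSS, here is a minimal sketch using Python's standard statistics module. The list named scores is made-up illustrative data and does not come from the Student file; the variable names are likewise only for illustration.

```python
import statistics

# Hypothetical sample values for illustration only (not from the Student data file).
scores = [2, 3, 3, 1, 2, 3, 2, 3]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion
value_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(scores)      # sample standard deviation

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
```

Note that for an ordinal variable coded 1 to 3, the mode and median are usually the more meaningful summaries; the mean is shown here only to illustrate the calculation.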
Open the data file called Student. The variables in this file are
student responses to a first-day-of-class survey.
One variable in the file is called Drive. This variable represents
students’ responses to the question, “How would you rate yourself as a
driver?” The answer codes are as follows:
1 = Below average
2 = Average
3 = Above Average
We’ll begin by creating a frequency distribution:
Analyze > Descriptive Statistics > Frequencies… Scroll down the
list of variables until you find How do you rate your driving? [drive].
Select the variable, and click OK. Look at the results. What was
the modal response? What strikes you about this frequency
distribution? How many students are in the “middle”? Is
there anything peculiar about these students’ view of
“average”?
[Figure: Frequencies dialog box. Callouts: 1. Highlight this variable. 2. Click here to select the variable.]
Frequencies

Statistics
How do you rate your driving?
N    Valid      218
     Missing      1    (One student did not answer)

How do you rate your driving?
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Below Average           8       3.7             3.7                  3.7
          Average               106      48.4            48.6                 52.3
          Above Average         104      47.5            47.7                100.0
          Total                 218      99.5           100.0
Missing   System                  1        .5
Total                           219     100.0
What does each column above tell you?
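As a check on how the columns relate to one another, the following Python sketch recomputes them from the frequencies in the SPSS output above. It is only an illustration of the arithmetic, not how SPSS itself works internally; the variable names are assumptions made for this example.

```python
# Counts read from the SPSS Frequencies output above.
counts = {"Below Average": 8, "Average": 106, "Above Average": 104}
missing = 1

valid_n = sum(counts.values())  # 218 valid responses
total_n = valid_n + missing     # 219 cases in all

rows = []
cumulative = 0.0
for label, freq in counts.items():
    percent = 100 * freq / total_n        # share of all cases, including missing
    valid_percent = 100 * freq / valid_n  # share of cases that answered
    cumulative += valid_percent           # running total of valid percent
    rows.append((label, freq, round(percent, 1),
                 round(valid_percent, 1), round(cumulative, 1)))

# The mode is the category with the largest frequency.
modal_response = max(counts, key=counts.get)

for row in rows:
    print(row)
print("Mode:", modal_response)
```

Percent divides each frequency by all 219 cases, while Valid Percent divides by the 218 students who answered, which is why the two columns differ slightly.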
Drive is a qualitative variable with three possible values. Some
categorical variables have only two values, and are known as binary
variables. Gender, for instance, is binary. In this dataset, there are tw…