# Economics Statistics Stata Problem Set

Name:
Email ID: @psu.edu
Worked with these other students:
ECON 306 Problem Set 5
INSTRUCTIONS: Solve the following questions to the best of your ability. Ask me if
you do not know how to solve any of these questions before the due date. I will work with
you if you are having trouble solving these.
To receive full credit for this assignment, the problem set needs to be submitted to Canvas
in a single PDF document containing your 1) Stata log file in a .pdf file and 2) any
written explanations and answers. All of these components need to be attached together in
that order. Late submissions will NOT be accepted. DO NOT email! No assignments will be
accepted via email.
ECON 306 Problem Set 5, Fall 2022
Boston Mortgages
1 Probability Replication
In this section, we are going to replicate the results presented in Chapter 11 of the Stock
the Problem Set 5 assignment. In Stata, you are used to performing an OLS regression with
the command regress. You can perform a probit estimation in the exact same way except
you replace regress with probit. If you want to perform a logit model, the command is logit.
You will use the data set hmda_PS5.dta for this question. You can view the data description
with the file hmda.pdf on Canvas.
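As a rough illustration of the command syntax described above, the sketch below runs all three estimators on simulated data. The variable names y, x, and d are hypothetical placeholders rather than names from the assignment data set, and the vce(robust) option is an assumption about how you may want the standard errors computed.

```stata
* Simulated data only; y, x, and d are hypothetical placeholder names
clear
set seed 12345
set obs 500

* One continuous regressor, one binary regressor, one binary outcome
generate x = runiform()
generate d = (runiform() < 0.3)
generate y = (0.8*x + 0.5*d + rnormal() > 1)

* Linear probability model: OLS with a binary dependent variable
regress y x d, vce(robust)

* Probit: same syntax, only the command name changes
probit y x d, vce(robust)

* Logit: same syntax again
logit y x d, vce(robust)
```

Probit and logit coefficients are not on the same scale as the linear probability model coefficients, so compare signs and significance rather than magnitudes.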
a) Replicate the results of the linear probability model, equation 11.1
b) Replicate the results of the probit models, equations 11.7 and 11.8
c) Replicate the results of the logit model, equation 11.10
d) The LTV variable is the loan-to-value ratio and it represents the fraction of the value of
the home that is being borrowed for the mortgage. It is much less risky for banks to make
a loan when the LTV is low. Using the probit model from equation 11.8, add variables
for coapplicant, LTV, and years of schooling.
e) You probably see the result that years of schooling is not statistically significant in the
previous regression. That result is incorrect. Examine your data and decide how to
perform another regression using the same variables where you find the legitimate result
that years of schooling is statistically significant.
Test Scores
2 Differences in Differences
The data in FFT102sp15.dta come from my ECON102 class in Spring 2015. After the first
midterm, I sent an email to students who failed the exam (got_message=1 for these students).
The email expressed my desire to help students do better and included links to resources
for the course, encouragement to visit regular office hours, and a link to schedule a 1-on-1
meeting with me. I would like you to help me investigate whether this message of encouragement
made a difference in these students' grades on the midterm and final exam.
a) Calculate the difference in differences using a 4-number summary (average midterm 1
score and average midterm 2 score for students who did vs. did not get the message).
b) Find the appropriate variable in the data set and rename it “treat”.
c) Generate a variable called “post” to use in your DID regression.
d) Create the interaction term and call it “treat_post”.
e) Perform the basic DID regression like we did in class (see Mastering Metrics equation 5.3
if you don’t remember what to include).
f) Interpret the coefficients on treat, post, and treat_post.
g) Do you think it is fair to compare the students who failed the first test to those who did
not? Why or why not?
h) Add the control variable dropped to your DID regression. Interpret the coefficient on
dropped and comment if you think that leaving it out would cause omitted variable bias
in any of the other variables. NOTE: for students who dropped the class, we only see
their second midterm score to compare to the first midterm score. For students who did
not drop, we can compare using their second midterm and final exam scores in the post
period.
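The mechanics of the steps above can be sketched on simulated data. Everything in this example is made up for illustration: the sample size, the effect sizes, and the assumption of exactly two tests per student. Only the names treat, post, treat_post, id, testnum, and testpct are borrowed from this problem, and clustering the standard errors by id is one reasonable choice, not a requirement stated in the assignment.

```stata
* Simulated data only; the effect sizes below are invented
clear
set seed 306
set obs 200
generate id = _n
generate treat = (runiform() < 0.3)

* Two test scores per student: one before and one after the message
expand 2
bysort id: generate testnum = _n
generate post = (testnum == 2)
generate treat_post = treat * post
generate testpct = 75 - 25*treat + 2*post + 8*treat_post + rnormal(0, 5)

* Four-number summary: mean score in each treat-by-post cell
tabulate treat post, summarize(testpct) means

* DID regression; the coefficient on treat_post is the DID estimate
regress testpct treat post treat_post, vce(cluster id)
```

Because this specification has a separate parameter for each of the four cells, the coefficient on treat_post equals the difference in differences computed by hand from the four cell means.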
Note: I actually transformed the original data I had for the class in order to make this
problem easier for you. If you’re curious and want to know what it used to look like, you
can type the following command into Stata:
reshape wide testpct, i(id) j(testnum)
If you want to get back, you can type reshape long
or just close the data set without saving.
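To see what the reshape round trip does before trying it on the class data, here is a minimal sketch on a four-row data set typed in by hand. The scores are invented; only the variable names id, testnum, and testpct match the command above.

```stata
* Tiny made-up data set in long form: one row per student per test
clear
input id testnum testpct
1 1 55
1 2 70
2 1 80
2 2 85
end

* Long to wide: one row per student, with columns testpct1 and testpct2
reshape wide testpct, i(id) j(testnum)
list

* Back to long; with no arguments, reshape long undoes the last reshape wide
reshape long
list
```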
Beer Tax
3 Panel Data
Download the description and data set fatalityPS5.dta for this problem. We will replicate
the results from Table 10.1 in Stock & Watson, columns 1-3.
a) Replicate the results of column (1) by performing a simple regression with the variables
beertax and fatalityrate treating the data as cross-sectional.
b) Let’s perform a simple panel data regression without any fixed effects. In order to do
this, we first need to let Stata know that we are working with panel data. In order to do
that, you need to use the xtset command. This basically tells Stata what variables relate
to the i and t in panel data. For this data set, state is the cross-sectional observational
unit and year represents the different times. Therefore, type xtset state year in order
to let Stata understand how your panel data are organized. In order to take advantage
of panel data regression, you have to use the command xtreg in Stata instead of regress.
c) Perform the regression from part b) again, this time using
robust standard errors by typing , vce(robust) at the end of your regression command.
d) Replicate the results of column (2) by adding state fixed effects. In order to do this, you
need to add , fe vce(cluster state) to the end of your previous regression command.
The fe tells Stata to use entity fixed effects. The vce(cluster state) tells Stata to cluster
your standard errors by State (this does not have to be the same as your cross-sectional
ID…for example, if you had data on housing prices over time, you might want to cluster
standard errors by neighborhood).
e) Replicate the results of column (3) by adding time fixed effects: include dummy variables
for the years 1982 – 1987 in your previous specification. Comment on whether these time
dummies are significant.
f) Comment on what you notice about the difference from moving one step at a time from
parts a) through e) of this question.
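The sequence of commands in parts a) through e) can be sketched on a simulated panel. The data below are invented, with 48 hypothetical units observed for 7 years and arbitrary coefficients, so the output will not match Table 10.1. Using i.year factor-variable notation for the time dummies and testparm for the joint test are assumptions about one convenient way to do it, not the only way.

```stata
* Simulated panel only: 48 hypothetical units observed for 7 years
clear
set seed 101
set obs 48
generate state = _n
* alpha is a unit-level effect that is correlated with beertax
generate alpha = rnormal()
expand 7
bysort state: generate year = 1981 + _n
generate beertax = 1 + 0.5*alpha + 0.2*rnormal()
generate fatalityrate = 2 + alpha - 0.6*beertax + 0.3*rnormal()

* Pooled OLS, treating the data as cross-sectional
regress fatalityrate beertax, vce(robust)

* Declare the panel structure: state is i, year is t
xtset state year

* Panel regression without fixed effects (xtreg defaults to random effects)
xtreg fatalityrate beertax, vce(robust)

* State fixed effects with standard errors clustered by state
xtreg fatalityrate beertax, fe vce(cluster state)

* Add year dummies, then test whether they are jointly significant
xtreg fatalityrate beertax i.year, fe vce(cluster state)
testparm i.year
```

In this simulation the unit effect alpha raises both beertax and fatalityrate, so the pooled estimate is biased upward relative to the fixed effects estimate.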
Stock/Watson – Introduction to Econometrics 4th Edition
THE STATE TRAFFIC FATALITY DATA SET
The data are for the “lower 48” U.S. states (excluding Alaska and Hawaii),
annually for 1982 through 1988. The traffic fatality rate is the number of traffic deaths in
a given state in a given year, per 10,000 people living in that state in that year. Traffic
fatality data were obtained from the U.S. Department of Transportation Fatal Accident
Reporting System. The beer tax is the tax on a case of beer, which is an available
measure of state alcohol taxes more generally. The drinking age variables in Table 10.1
are binary variables indicating whether the legal drinking age is 18, 19, or 20. The two
binary punishment variables in Table 10.1 describe the state’s minimum sentencing
requirements for an initial drunk driving conviction: “Mandatory jail?” equals one if the
state requires jail time and equals zero otherwise, and “Mandatory community service?”
equals one if the state requires community service and equals zero otherwise. Total
vehicle miles traveled annually by state was obtained from the Department of
Transportation. Personal income was obtained from the U.S. Bureau of Economic
Analysis, and the unemployment rate was obtained from the U.S. Bureau of Labor
Statistics.
These data were graciously provided to us by Professor Christopher J. Ruhm of
the Department of Economics at the University of North Carolina.
Data Series:

| Series | Descriptions |
|---|---|
| state | State ID (FIPS) Code |
| year | Year |
| spircons | Spirits Consumption |
| unrate | Unemployment Rate |
| perinc | Per Capita Personal Income |
| emppop | Employment/Population Ratio |
| beertax | Tax on Case of Beer |
| sobapt | % Southern Baptist |
| mormon | % Mormon |
| mlda | Minimum Legal Drinking Age |
| dry | % Residing in Dry Counties |
| yngdrv | % of Drivers Aged 15-24 |
| vmiles | Ave. Miles per Driver |
| breath | Prelim. Breath Test Law |
| jaild | Mandatory Jail Sentence |
| comserd | Mandatory Community Service |
| allmort | # of Vehicle Fatalities (#VF) |
| mrall | Vehicle Fatality Rate (VFR) |
| allnite | # of Night-time VF (#NVF) |
| mralln | Night-time VFR (NVFR) |
| allsvn | # of Single VF (#SVN) |
| a1517 | #VF, 15-17 year olds |
| mra1517 | VFR, 15-17 year olds |
| a1517n | #NVF, 15-17 year olds |
| mra1517n | NVFR, 15-17 year olds |
| a1820 | #VF, 18-20 year olds |
| a1820n | #NVF, 18-20 year olds |
| mra1820 | VFR, 18-20 year olds |
| mra1820n | NVFR, 18-20 year olds |
| a2124 | #VF, 21-24 year olds |
| mra2124 | VFR, 21-24 year olds |
| a2124n | #NVF, 21-24 year olds |
| mra2124n | NVFR, 21-24 year olds |
| aidall | # of alcohol-involved VF |
| mraidall | Alcohol-Involved VFR |
| pop | Population |
| pop1517 | Population, 15-17 year olds |
| pop1820 | Population, 18-20 year olds |
| pop2124 | Population, 21-24 year olds |
| miles | Total vehicle miles (millions) |
| unus | U.S. unemployment rate |
| epopus | U.S. Emp/Pop Ratio |
| gspch | GSP Rate of Change |
Stock/Watson – Introduction to Econometrics 4th Edition
THE BOSTON HMDA DATA SET
The Boston HMDA data set was collected by researchers at the Federal Reserve
Bank of Boston. The data set combines information from mortgage applications and a
follow-up survey of the banks and other lending institutions that received these mortgage
applications. The data pertain to mortgage applications made in 1990 in the greater
Boston metropolitan area. The full data set has 2925 observations, consisting of all
mortgage applications by blacks and Hispanics plus a random sample of mortgage
applications by whites.
To narrow the scope of the analysis in this chapter, we use a subset of the data for
single-family residences only (thereby excluding data on multi-family homes) and for
black and white applicants only (thereby excluding data on applicants from other
minority groups). This leaves 2380 observations. Definitions of the variables used in
this chapter are given in Table 11.1.
These data were graciously provided to us by Geoffrey Tootell of the Research
Department of the Federal Reserve Bank of Boston. More information about this data
set, along with the conclusions reached by the Federal Reserve Bank of Boston
researchers, is available in the article by Alicia H. Munnell, Geoffrey M.B. Tootell,
Lynne E. Browne, and James McEneaney, “Mortgage Lending in Boston: Interpreting
HMDA Data,” American Economic Review, 1996, pp. 25 – 53.
Two datasets have been included on the website. HMDA_AER is the full HMDA
data set used in the Munnell, Tootell, Browne and McEneaney paper. HMDA_SW
contains the 2380 observations that are used in the analysis in Chapter 11. See the
replication files for Chapter 11 for the variable definitions used in the chapter.
The description of the data set given below was supplied by the Federal Reserve Bank of
Boston.
(list rev. 8/1/01)
Federal Reserve Bank of Boston
Research Department
General Research Data Set
FOLLOW-UP TO 1990 HOME MORTGAGE DISCLOSURE ACT (HMDA) REPORTS
LOAN/APPLICATION REGISTER (LAR)
DETAILED LIST OF VARIABLES
(Abbreviated as Question Number on HMDA Surveys)
I. (SEQ) – sequence number, unique identifier for observations
II. Original HMDA data
A. Loan Information
1. (S3) Type of Loan
Codes:
1 – Conventional
3. (S4) Purpose of Loan
Codes:
1 – Home purchase
2 – Home improvement
3 – Refinancing
4 – Multifamily
4. (S5) Occupancy
Codes:
1 – Owner-occupied
2 – Not owner-occupied
3 – Not applicable
5. (S6) Loan amount (in thousands)
6. (S7) Type of action taken
Codes:
1 – Loan originated
2 – Application approved but not accepted
by applicant
3 – Application denied
4 – Application withdrawn
5 – File closed for incompleteness
6 – Loan purchased by institution
B. Property Location:
1. (S9) MSA (Boston Metropolitan Statistical Area) number where property
located
2. (S11) County where property located
Codes:
1 – Suffolk
0 – Other
C. Applicant Information
1. (S13) Applicant race
Codes:
1 – American Indian or Alaskan Native
2 – Asian or Pacific Islander
3 – Black
4 – Hispanic
5 – White
6 – Other
7 – Information not provided by applicant in
mail or telephone application
8 – Not applicable
2. (S14) Co-applicant race*
3. (S15) Applicant sex
Codes:
1 – Male
2 – Female
3 – Information not provided by applicant in
mail or telephone application
4 – Not applicable
4. (S16) Co-applicant sex*
5. (S17) Applicant income (in thousands)
D. Other Loan Information
1. (S18) Type of purchaser of loan
Codes:
0 – Loan was not sold in calendar year covered by register
1 – FNMA
2 – GNMA
3 – FHLMC
4 – FMHA
5 – Commercial bank
6 – Savings bank or savings association
7 – Life insurance company
8 – Affiliate institution
9 – Other type of purchaser
2. (S19A) Original HMDA report, reasons for denial
Codes:
1 – Debt-to-income ratio
2 – Employment history
3 – Credit history
4 – Collateral
5 – Insufficient cash
6 – Unverifiable information
7 – Credit application incomplete
8 – Mortgage insurance denied
9 – Other
III. Follow-up Survey Data
1. (S19B, S19C, S19D) Additions or corrections to reasons for denial from
Boston survey data*
2. (S20) Number of units in property purchased
3. (S23A) Marital status of applicant
Codes:
M – Married
U – Unmarried (includes single, divorced and widowed)
S – Separated
4. (S24A) Number of dependents claimed by applicant
5. (S25A) Years employed in applicable line of work
6. (S26A) Years employed on applicable job
7. (S27A) Self-employed applicant
Codes:
0 – Not self-employed
1 – Self-employed
8. (S30A) Base employment monthly income of applicant (in dollars)
9. (S30C) Base employment monthly income of coapplicant (in dollars)
10. (S31A) Total monthly income of applicant (in dollars)
11. (S31C) Total monthly income of coapplicant (in dollars)
12. (S32) Proposed monthly housing expense (in dollars)
13. (S33) Purchase price (in thousands)
14. (S34) Other financing (in thousands)
15. (S35) Liquid assets (in thousands)**
16. (S39) Number of commercial credit reports in loan file
17. (S40) Applicants’ credit history meets loan policy guidelines for approval
Codes:
0 – No
1 – Yes
18. (S41) Number of separate consumer credit lines on credit reports
19. (S42) Credit history – mortgage payments
Codes:
1 – No late mortgage payments
2 – No mortgage payment history
3 – One or two late mortgage payments
4 – More than two late mortgage payments
20. (S43) Credit history – consumer payments
Codes:
1 – No “slow pay” or delinquent accounts, but sufficient references for
determination
2 – One or two “slow pay” account(s) (each with one or two payments 30
days past due)
3 – More than two “slow pay” accounts (each with one or two payments
30 days past due); or one or two chronic “slow pay” account(s) (with
three or more payments 30 days past due in any 12-month period)
4 – Insufficient credit history or references for determination
5 – Delinquent credit history (containing account(s) with a history of
payments 60 days past due)
6 – Serious delinquencies (containing account(s) with a history of
payments 90 days past due)
* Same codes as preceding variable
** Applicant and coapplicant data were summed if separate statements were completed.
21. (S44) Credit history – public records
0 – Information not considered
0 – No public record defaults
1 – Bankruptcy
1 – Bankruptcy and charge offs
1 – One or two charge-off(s), public record(s), or collection action(s),
totaling less than \$300
1 – Charge-off(s), public record(s), or collection action(s) totaling more
than \$300
22. (S45) Debt-to-income ratio (the banks’ calculation of housing
expense/income)
23. (S46) Debt-to-income ratio (the banks’ calculation of total
obligations/income)
24. (S47) Fixed or adjustable rate loan (F or A)
Codes:
2 – Fixed
3 – Not Available
25. (S48) Term of loan (months)
26. (S49) Special loan application program
27. (S50) Appraised value (in thousands)
28. (S51) Type of property purchased
Codes:
1 – Condominium
2 – Single family
3 – 2 to 4 families
29. (S52) Private mortgage insurance (PMI) sought?
Codes:
0 – No or information not available
1 – Yes
30. (S53) Private mortgage insurance (PMI) denied?
Codes:
0 – PMI approved, did not apply, or information not available
1 – PMI sought and denied
31. (S54) Was a gift or grant as part of down payment?
Codes:
0 – No or information not available
1 – Yes
32. (S55) Was there a co-signer for the application?
Codes:
0 – No or information not available
1 – Yes
33. (S56) Unverifiable information
Codes:
0 – Not applicable (all verifiable)
1 – Some information unverifiable
34. (S57) Number of times application was reviewed by underwriter
III. Variables Added for Analysis, taken from the Census Survey
1. (netw) Net worth (Total assets – Total liabilities)***
2. (uria) Probability of unemployment by industry
3. (rtdum) Minority population share in tract
Codes:
0 – if ≤ 0.30
1 – if > 0.30
4. (bd) Boarded-up value of tract
Codes:
0 – if ≤ MSA median
1 – if > MSA median
5. (mi) Median tract income
Codes:
0 – if ≤ MSA median
1 – if > MSA median
6. (old) Applicant age
Codes:
0 – if ≤ MSA median
1 – if > MSA median
2 – missing
7. (vr) Tract vacancy
Codes:
0 – if ≤ MSA median
1 – if > MSA median
8. (school) Years of education
9. (chvalc) Change in median value of property in a given tract, 1980-1990
IV. Dummy variables created from HMDA data
1. (dnotown) Owner occupied property
0 – Owner occupied
1 – Not owner occupied, or information not available
2. (dprop) Type of property
0 – Condominium or single family
1 – 2-4 families
Notes:
1. 999,999.4 is used in the database to signify missing observations in numerical columns.
2. NA is used to signify missing observations in character columns.
*** Applicant and coapplicant combined
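Note 1 above matters in practice because Stata will treat 999,999.4 as an ordinary number unless it is recoded. The sketch below shows one possible way to recode such a sentinel to Stata's missing value, using made-up numbers. The variable name school is borrowed from the list above only as an example, and the cutoff of 999999 is an assumption chosen to avoid an exact floating-point comparison.

```stata
* Made-up values; school stands in for any numeric column using the sentinel
clear
input double school
12
16
999999.4
14
end

* Before recoding, the sentinel value inflates the mean
summarize school

* Recode anything at or above 999999 to missing (.)
* A range check avoids testing exact equality with 999999.4
replace school = . if school >= 999999 & !missing(school)
summarize school
```

Estimation commands such as regress and probit drop observations with missing values automatically, so no further filtering is needed after the recode.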
