Sample Paper

This document provides an example of the structure and concepts illustrated in an empirical economics paper. It includes sections on introduction, data, empirical results, and conclusion. Marginal notes explain the purpose of each aspect to help the reader understand how to write their own paper.

Sample Paper in Econometrics

This is a sample research paper for an introductory course in econometrics. It shows how to communicate econometric work
in written form. The paper integrates many writing instructions and rules into a single example and shows how they all fit
together. You should pay attention to the structure of the paper: how it is divided into sections and how each section serves a
distinct purpose. You should also note how the descriptive statistics and empirical results are presented.

The paper includes numerous notes in the margins. These notes explain the purpose of each paragraph, and provide
comments on tables and other aspects of the paper. The margin notes are there to make you aware of the writing process.
They are designed to help you bridge the gap between reading and understanding on one hand, and writing and creating
knowledge on the other. The readings which have been assigned in your economics courses are finished products which you
are able to read and understand. However, in order for you to be able to create a finished product yourself, you need to
become aware of how such a product is created. The notes in the margins reveal the thinking and consideration that go into
each section, paragraph and table, and should therefore help you in writing your own paper.

It is worth emphasizing that you should use this paper only as a guide. You should not copy the paper and simply fill in your
own names, words and numbers. You can deviate from the order and purpose of each paragraph in order to meet the needs
of your own work. You can add separate sections on prior literature, methodology or theory. Such sections would normally
come after the introduction. The sample paper includes the discussion of prior literature in the introduction. The theory and
methodology are folded into the Introduction, Data and Empirical Results sections. The absence of separate theory or
methodology sections is not uncommon in applied empirical papers. However, theory or methodology sections are a must
when the empirical question is derived from an explicit theoretical model or when the methodology requires a longer
explanation. You are also welcome to include additional tables or graphs. What should remain the same, though, is that each
section, paragraph, table and graph has a purpose, and that they are organized in a logical manner.

Concepts illustrated in the paper


Structure:

• Introduction (see example)
  o The Introduction should convey four things. First, what is the question that the paper asks. Second, why is the question important. Third, how is the paper going to answer the question. Finally, how is the paper related to existing work. The introduction is the most important part of any paper. No one will continue to read any further if the introduction is confusing or poorly written.
• Data (see example)
  o The Data section should accomplish three things: First, state the sources of data. Second, discuss the variables used and how they relate to the concepts that they are supposed to measure. Finally, present the data’s descriptive statistics.
• Empirical Results (see example)
  o The Empirical Results section should present and discuss the empirical results. The presentation of results is usually done with a table. The discussion of results typically includes a statement of whether the results support or refute the hypothesis, a statement of whether the results are statistically significant, interpretation of the magnitude of the coefficients and a comment on functional form.
• Conclusion (see example)
  o The conclusion should accomplish three things: summarize the results, explore the implications of the results, and point to future research.

Writing Style:

• Citation Style (see example)
  o The citation and bibliography styles most commonly used in economics are detailed in the Chicago Manual of Style.
• Use of acronyms (see example)
  o The first time an acronym is used it should be written out, followed by the acronym in parentheses.
• Use of first person (see example)
  o It is acceptable to use first person (I) in an economics paper.
• Coherence (see example)
  o Make each sentence linked to the previous one.
• Tense (see example)
  o It is appropriate to use past tense when describing the construction of your variables. However, use present tense when referring to tables or your results.

Conventions in an Empirical Paper:

• Descriptive Statistics Table (see example)
  o A descriptive statistics table should include the list of variables and the mean, median, standard deviation, minimum and maximum. In cases where the number of observations varies from variable to variable, a column specifying the number of observations is necessary. The orientation of the table should be such that the variables are in rows and the statistics in columns. This way, even if a large number of variables are used, the table will fit on one page.
• Discussing Descriptive Statistics (see example)
  o Discussing the minimum and maximum and the corresponding data points makes the data “come alive.” It also reassures the reader that the data was put together correctly.
• Rounding numbers in the text (see example)
  o When discussing quantities in the text, use round numbers.
• Presentation of regression results (see example)
  o Regression results are typically presented in this compact form. The columns show results from 3 different regressions. The rows show the intercept, independent variables and the R-squared. The estimated coefficients and their associated standard errors in parentheses appear inside the table. Some authors prefer to show each coefficient’s t-statistics in parentheses; therefore it is always necessary to specify this in the table’s footnote. If the independent variable is not included in a specification, the cell corresponding to that independent variable and specification is left blank. If the number of observations varies across specifications, it can be included as the last row. The asterisks are for easy identification of the significance level - the more asterisks, the higher the significance.
• Converting variables to convenient units (see example)
  o In order to be able to present regression results in a compact and readable form, it is necessary to convert the variables to appropriate units. For example, the appropriate units for payroll are millions of dollars. This is because if payroll were in dollars, the coefficient in specification (2) would appear as 0.0000001 which is more difficult to fit in a table and more difficult to read.
• Interpreting estimated coefficients (see example)
  o It is very important to include the units of both the independent and the dependent variables.
• Assessing economic significance (see example)
  o Assessing economic significance requires judgment. Unlike statistical significance, there is no "official" benchmark for assessing economic significance.

Other:

• Title (see example)
  o The title should concisely express what the paper is about. It can also be used to capture the reader's attention.
• Searching for existing literature (see example)
  o EconLit is the most commonly used database for searching published papers in Economics. Working papers can be found via IDEAS, SSRN, NBER or even Google.
• Effect vs. affect (see example)
  o "Effect" is usually a noun (that is, it could be preceded by "the"). "Affect" is usually a verb.
• Appeal to authority (see example)
  o It is appropriate to cite other studies when justifying the use of a variable or technique. This also makes the comparison to other work easier.
• Acknowledge shortcomings of data (see example)
  o It is appropriate to acknowledge the shortcomings of your data. The shortcomings could come from unreliability of the source, lack of observations or, as in this case, lack of time to properly adjust the data for inflation.

Does pay inequality within a team affect performance?

Tomas Dvorak*

[Margin note: The title should concisely express what the paper is about. It can also be used to capture the reader's attention.]

1. Introduction

[Margin note: The Introduction should convey four things. First, what is the question that the paper asks. Second, why is the question important. Third, how is the paper going to answer the question. Finally, how is the paper related to existing work. The introduction is the most important part of any paper. No one will continue to read any further if the introduction is confusing or poorly written.]

[Margin note: This paragraph explains the question that the paper is asking. Notice how each sentence in this paragraph is linked to the previous one. This makes the paragraph coherent.]

The business of sports draws considerable attention
from the media and the general public. Fans and sports
writers frequently speculate about the effects of money
on athletic performance. There is general agreement
that more financial resources usually lead to better
athletic performance. In team sports, higher pay can be
used to lure better players from other teams and
therefore improve performance. However, performance
can also be affected by pay inequality among players
within a team. On the one hand, pay inequality could
have a negative effect because it may hinder
cooperation among team members. In many sports,
team cooperation is critical for good performance. If
pay inequality creates tensions or animosity among
team members, performance is likely to suffer. On the
other hand, inequality could have a positive effect on
performance by providing incentives. The prospect of a
very large salary could be a powerful drive behind an
athlete’s performance. Pay inequality might also
enhance performance if low paid players learn from
high paid players. This would happen when pay
inequality is associated with skill inequality. For
example, if a highly paid superstar can teach other
players, the overall performance of a team may
improve. Given that arguments can be made both ways,
it is not surprising that there is little agreement on the
effects of pay inequality on team performance. The
purpose of this paper is to determine whether, on
balance, the effect of pay inequality on performance is
positive or negative.

[Margin note: This paragraph explains why the question is important.]

Understanding the effect of pay inequality on a team’s
performance is important for at least two reasons. First,
team managers can use this information to make
decisions about which players to hire. For example,
should they hire one expensive superstar and two
inexpensive players, or three medium-priced players? If
we find that pay inequality leads to poor team
performance, then the team may perform better with
three medium-priced players than one superstar and
two low-priced players. Second, because salaries are a
large part of contract negotiations between player
associations and team owners, understanding the
effects of pay inequality on performance can help
determine optimal policies. For example, if pay
inequality has a negative effect on performance, an
argument for a higher minimum salary could be made.

[Margin note: This paragraph lists existing papers on the topic and states their findings.]

[Margin note: The two papers were found in EconLit by searching for different combinations of the following key words: “pay,” “salary,” “inequality” and “performance.”]

[Margin note: The first time an acronym is used it should be written out, followed by the acronym in parentheses.]

There are a number of studies that look at the effects of
pay inequality on performance. DeBrock, Hendricks
and Koenker (2004) study the effects of pay inequality
on performance in Major League Baseball (MLB).
They find that pay inequality is associated with poor
performance. Frick, Prinz and Winkelmann (2003)
look at the effects of pay inequality in all four major
leagues in North America. They find that inequality
improves team performance in basketball and worsens
team performance in baseball. They find no statistically
significant effect of inequality on performance in
football and hockey.

[Margin note: This paragraph explains how the current paper differs from what has been done before. It is important to explain how the paper fits in the existing literature.]

[Margin note: The citation and bibliography styles most commonly used in economics are detailed in the Chicago Manual of Style.]

[Margin note: It is appropriate to use first person in an economics paper.]

This paper looks at the effects of inequality on
performance in MLB. It differs from the study of DeBrock,
Hendricks and Koenker (2004) in that it uses the most
recent data. While the previous authors use data from
1985 through 1998, I use data from the latest two
seasons: 2003 and 2004. Another difference is that I
use a different measure of pay inequality. Rather than
the Herfindahl index, I use the percentage of payroll
earned by the best paid 20% of players. I chose the
share earned by the top 20% of players for two reasons: it
is somewhat easier to calculate, and its magnitude is
easier to interpret.

2. Data

[Margin note: The Data section should accomplish three things: First, state the sources of data. Second, discuss the variables used and how they relate to the concepts that they are supposed to measure. Finally, present the data’s descriptive statistics.]

[Margin note: The next three paragraphs discuss the sources of data and the construction of variables used in the paper. This one describes the measure of inequality.]

[Margin note: The word “data” is used as both plural and singular.]

[Margin note: It is appropriate to use past tense when describing the construction of your variables. However, use present tense when referring to tables or your results.]

The data on pay inequality was constructed in the
following way. From the USA Today salary database, I
collected annual salaries for each player in all MLB
teams during the 2003 and 2004 seasons. I summed the
salaries of all players for each team and each season to
obtain the total payroll. The active roster in baseball is
25, but the database includes salaries of disabled
players as well. Therefore, the number of players for
each team ranges from 25 to 31. As the measure of pay
inequality, I calculated the percentage of payroll earned
by the highest paid 20% of players. For example, for a
30 player team I summed the salaries of the highest
paid 6 players and divided that amount by total payroll.
If every player earned the same amount, the best paid
20% would earn exactly 20% of the payroll. When pay
is unequal, this measure is higher than 20%. The higher
the share of payroll earned by the top 20% of players,
the higher the pay inequality.

[Margin note: This paragraph describes the measure of performance.]

[Margin note: It is appropriate to cite other studies when justifying the use of a variable or technique. This also makes the comparison to other work easier.]

To measure performance I use the percentage of games
won in the regular season. This data comes from
BaseballReference.com. It does not include
performance during league championships or the World
Series. However, with 162 games per regular season,
the winning percentage can be regarded as a reasonable
measure of performance. This is also the measure used
by DeBrock, Hendricks and Koenker (2004).
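
The pay inequality measure constructed at the start of this section can be sketched in a few lines of Python. This is only an illustration: the function name `top20_share` and the salary figures are hypothetical, and the rule for rosters whose size is not a multiple of five (rounding to the nearest whole player) is an assumption, since the paper only spells out the 30 player case.

```python
def top20_share(salaries):
    """Percentage of total payroll earned by the highest paid 20% of players."""
    if not salaries:
        raise ValueError("salaries must not be empty")
    # Assumption: the number of top players is 20% of the roster, rounded to
    # the nearest whole player (the paper only describes the 30 player case).
    n_top = max(1, round(len(salaries) / 5))
    top = sorted(salaries, reverse=True)[:n_top]
    return 100 * sum(top) / sum(salaries)


# Hypothetical 30 player rosters (salaries in millions of dollars), not real data.
equal_team = [2.0] * 30
unequal_team = [10.0] * 6 + [1.0] * 24

print(top20_share(equal_team))    # equal pay: the top 20% earn exactly 20%
print(top20_share(unequal_team))  # unequal pay: the share rises above 20%
```

With equal salaries the function returns 20, and any inequality pushes the value above 20, which matches the interpretation given in the text.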

[Margin note: This paragraph describes the last variable - total payroll.]

[Margin note: It is appropriate to acknowledge the shortcomings of your data. The shortcomings could come from unreliability of the source, lack of observations or, as in this case, lack of time to properly adjust the data for inflation.]

In addition to pay inequality and performance, I use
data on the total payroll of each team. This is a measure
of financial resources which could be an important
determinant of performance. I measure payroll in
current dollars and do not adjust for inflation. While
2003 dollars are not exactly comparable to 2004
dollars, 2003 inflation was low enough not to influence
the results significantly.

[Margin note: The next two paragraphs discuss the descriptive statistics table. Discussing the minimum and maximum and the corresponding data points makes the data “come alive.” It also reassures the reader that the data was put together correctly. For example, many would be justifiably alarmed if the maximum payroll did not turn out to be the famously wealthy New York Yankees.]

Table 1 shows the descriptive statistics of each
variable. In the first row we see that on average the
highest paid 20% of players earn about 61% of the total
payroll. This implies that on a 30 player team, the six
best paid players earn more than the remaining 24
combined. According to this measure, the team with the
most equitable pay is the New York Yankees during
the 2003 season when the top 20% of players earned
only 42% of total payroll. The team with the highest
inequality was the Colorado Rockies during the 2004
season. On that team, five players earned more than
78% of the team’s total payroll.

[Margin note: When discussing quantities in the text we use round numbers. For example, instead of 19.6 million I use “less than 20 million.”]

The second row in Table 1 shows that the average
winning percentage is 50% which has to be the case
since for every game won there is a game lost. The
Detroit Tigers have the lowest winning percentage in
the data with only 26% of games won during the 2003
season. The maximum winning percentage in the data
is for the St. Louis Cardinals, who won nearly 65% of
their games during the 2004 season. Finally, the last
row in Table 1 shows that the average payroll is about
70 million dollars. The range of payroll is quite
striking. It goes from less than 20 million dollars for
the Tampa Bay Rays to over 184 million for the New
York Yankees.

[Margin note: A descriptive statistics table should include the list of variables and the mean, median, standard deviation, minimum and maximum. In cases where the number of observations varies from variable to variable, a column specifying the number of observations is necessary.]

[Margin note: The orientation of the table should be such that the variables are in rows and the statistics in columns. This way, even if a large number of variables are used, the table will fit on one page.]

Table 1: Descriptive Statistics

                        mean   median  st.dev.  min    max
Top20share (in %)       61.0   61.4    8.0      42.2   78.3
Games Won (in %)        50.0   51.6    8.2      26.5   64.8
Payroll (in mil. USD)   70.0   65.3    30.3     19.6   184.2

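A table laid out like Table 1, with variables in rows and statistics in columns, could be assembled along the following lines. This is a minimal sketch using only the Python standard library: the helper `describe` and the toy values are hypothetical and are not the paper's data, and the use of the sample standard deviation is an assumption because the paper does not say which version it reports.

```python
import statistics


def describe(columns):
    """Return one row per variable with mean, median, st.dev., min and max."""
    rows = {}
    for name, values in columns.items():
        rows[name] = {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # Assumption: sample standard deviation (n - 1 in the denominator).
            "st.dev.": statistics.stdev(values),
            "min": min(values),
            "max": max(values),
        }
    return rows


# Hypothetical values for illustration only, not the paper's data.
toy = {
    "Top20share (in %)": [55.0, 60.0, 65.0, 70.0],
    "Games Won (in %)": [40.0, 48.0, 52.0, 60.0],
}

table = describe(toy)
header = ["mean", "median", "st.dev.", "min", "max"]
print(f"{'':<20}" + "".join(f"{h:>9}" for h in header))
for name, stats in table.items():
    print(f"{name:<20}" + "".join(f"{stats[h]:>9.1f}" for h in header))
```

Keeping one dictionary entry per variable means that adding a variable adds a row rather than a column, which is the orientation the margin note recommends.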

3. Empirical Results

[Margin note: The Empirical Results section should present and discuss the empirical results. The presentation of results is usually done with a table. The discussion of results typically includes a statement of whether the results support or refute the hypothesis, a statement of whether the results are statistically significant, interpretation of the magnitude of the coefficients and a comment on functional form.]

I estimate three different specifications. The dependent
variable in each specification is performance, as
measured by the percentage of games won. Pay
inequality and total payroll are the independent
variables. Table 2 shows the results. In the first
specification, I regress performance on the share earned
by the top 20% of players. The coefficient on the share
of top 20% is negative and statistically significant. This
indicates that teams with higher pay inequality tend to
win fewer games. A one percentage point increase in
the share of payroll earned by the top 20% of players is
associated with about half of a percentage point decline
in the percentage of games won.

[Margin note: Regression results are typically presented in this compact form. The columns show results from 3 different regressions. The rows show the intercept, independent variables and the R-squared. The estimated coefficients and their associated standard errors appear inside the table. Some authors prefer to show each coefficient’s t-statistics in parentheses; therefore it is always necessary to specify this in the table’s footnote. If the independent variable is not included in a specification, the cell corresponding to that independent variable and specification is left blank. If the number of observations varies across specifications, it can be included as the last row. The asterisks are for easy identification of the significance level - the more asterisks, the higher the significance.]

[Margin note: In order to be able to present regression results in this form it is necessary to convert the variables to appropriate units. For example, the appropriate units for payroll are millions of dollars. This is because if payroll were in dollars, the coefficient in specification (2) would appear as 0.0000001 which is more difficult to fit in a table and more difficult to read.]

Table 2: Regression Results

Dependent variable: winning percentage (in %)

                          (1)        (2)        (3)
Intercept                 77.3       59.9       37.35
                          (7.43)**   (9.51)**   (15.37)*
Top20share (in %)         -0.45      -0.27      -0.28
                          (0.12)**   (0.13)*    (0.13)*
Payroll (in mil. USD)                0.10
                                     (0.04)**
Log of Payroll                                  7.09
                                                (2.43)**
R-squared                 0.19       0.29       0.29
Adjusted R-squared        0.18       0.26       0.27

Number of observations is 60.
Standard errors are in parentheses.
** significant at 1%, * significant at 5%

In the second specification I include total payroll as an
independent variable. Payroll is a measure of the
financial resources which can affect performance - the
higher the payroll, the higher the quality of players and,
generally, the better the performance. Therefore,
including payroll may increase the precision of the
estimated coefficient on pay inequality. More
importantly, it is possible that pay inequality is
correlated with total payroll. If low payroll teams tend
to have more pay inequality, then the coefficient on pay
inequality in specification (1) is biased. Indeed, the
correlation coefficient between the share earned by the
top 20% of players and total payroll is -0.5. Teams with
high pay inequality may perform worse not because of
pay inequality, but because they are also the teams with
a lower payroll. Therefore, in order to measure the
effect of pay inequality on performance, I need to
control for total payroll.
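
The bias argument in this paragraph can be checked against the reported numbers. The sketch below applies the standard omitted variable bias identity for least squares (the coefficient from the short regression equals the coefficient from the long regression plus the omitted variable's coefficient times the slope from regressing the omitted variable on the included one). The identity is not stated in the paper, and all inputs are the rounded values from Tables 1 and 2 and the text, so the match is only approximate.

```python
# Rounded values reported in the paper (Tables 1 and 2 and the text).
beta_short = -0.45    # Top20share coefficient, specification (1)
beta_long = -0.27     # Top20share coefficient, specification (2)
beta_payroll = 0.10   # Payroll coefficient, specification (2)
corr = -0.5           # correlation between Top20share and payroll
sd_top20 = 8.0        # st.dev. of Top20share (in %)
sd_payroll = 30.3     # st.dev. of payroll (in mil. USD)

# Slope from an auxiliary regression of payroll on Top20share,
# recovered from the correlation and the two standard deviations.
delta = corr * sd_payroll / sd_top20

# Omitted variable bias: short coefficient = long coefficient + bias.
bias = beta_payroll * delta
implied_short = beta_long + bias

print(f"delta = {delta:.2f} million USD per percentage point")
print(f"bias = {bias:.2f}")
print(f"implied coefficient in (1) = {implied_short:.2f}, reported {beta_short}")
```

The implied coefficient is close to the one reported for specification (1), which is consistent with the explanation that the larger magnitude in that specification partly reflects the omitted payroll variable.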

[Margin note: The sentence beginning “Holding payroll constant” interprets the magnitude of the estimated coefficient. It is very important to include the units of both the independent and the dependent variables.]

[Margin note: The two sentences that follow it try to assess the economic significance of the estimated coefficient. This requires judgment. Unlike statistical significance, there is no "official" benchmark for assessing economic significance. The approach adopted here looks at how much a given shift in the ranking by the independent variable changes the ranking by the dependent variable. Another popular approach is to calculate the change in the dependent variable per one standard deviation change in the independent variable.]

Once I control for total payroll, the coefficient on the
share of the top 20% remains statistically significant
but the magnitude drops substantially. Holding payroll
constant, a one percentage point increase in the share
earned by the highest paid 20% is associated with a
0.27 percentage point decline in the percentage of
games won. The impact of inequality on performance
does not seem enormous. For example, a five
percentage point increase in inequality for the team
with median inequality would shift the team up 13
spots in the inequality ranking, but its performance
ranking would drop by only 2 spots. The coefficient on
total payroll is positive and statistically significant. A
one million dollar increase in total payroll is associated
with about 0.1 percentage point increase in the
percentage of games won. This indicates that greater
financial resources tend to improve performance.
Adding payroll as an independent variable led to an
increase in R-squared from about 0.19 to 0.29.

Finally, in specification (3) I include the logarithm of
payroll instead of payroll. I want to verify that the
result in specification (2) is robust to different
functional forms. In addition, the effect of an additional
one million dollars may be smaller for a team with a
100 million payroll than for one with a 20 million
payroll. Thus, including payroll in logarithm seems
appropriate. The coefficient on the share of the top 20%
remains statistically significant with roughly the same
magnitude. The log of payroll is statistically
significant. A one percent increase in payroll is
associated with about 0.07 percentage points increase
in the percentage of games won.
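
The magnitudes discussed in this section follow from simple arithmetic on the coefficients in Table 2. The sketch below reproduces them from the rounded published values. The variable names are illustrative, the one standard deviation calculation is the alternative approach mentioned in the margin note rather than something the paper itself computes, and the log specification is assumed to use the natural logarithm.

```python
import math

# Rounded values reported in Tables 1 and 2 of the paper.
beta_top20 = -0.27        # specification (2), per percentage point of Top20share
beta_payroll = 0.10       # specification (2), per million USD of payroll
beta_log_payroll = 7.09   # specification (3), coefficient on log of payroll
sd_top20 = 8.0            # st.dev. of Top20share (in %)

# Change in winning percentage for a five percentage point rise in Top20share.
five_point_change = 5 * beta_top20

# Change in winning percentage for a one standard deviation rise in Top20share.
one_sd_change = sd_top20 * beta_top20

# The payroll coefficient if payroll were measured in dollars, not millions.
per_dollar = beta_payroll / 1_000_000

# Change in winning percentage for a one percent rise in payroll
# (assumption: the log in specification (3) is the natural logarithm).
one_percent_payroll = beta_log_payroll * math.log(1.01)

print(f"five point change: {five_point_change:.2f} percentage points")
print(f"one st.dev. change: {one_sd_change:.2f} percentage points")
print(f"payroll coefficient per dollar: {per_dollar:.7f}")
print(f"one percent more payroll: {one_percent_payroll:.2f} percentage points")
```

The last line reproduces the 0.07 percentage point figure quoted in the text, and the per dollar coefficient shows why payroll is reported in millions of dollars.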

4. Conclusion

[Margin note: The conclusion should accomplish three things: summarize the results, explore the implications of the results, and point to future research.]

[Margin note: This paragraph sums up the results and asserts the paper’s contribution.]

The analysis in this paper shows that pay inequality
within MLB teams has a negative effect on
performance. The effect remains statistically significant
even after controlling for total payroll. The result is the
same as that of DeBrock et al. (2004) who use data
from 1985 through 1998. My paper confirms their
finding using the most recent data and using a different
measure of pay inequality.

[Margin note: This paragraph draws conclusions. Notice how the argument goes back to the motivation for the question that was stated in the introduction.]

The fact that pay inequality leads to worse performance
implies that managers should strive for pay equality in
their teams. For example, instead of hiring two low-priced
players and one superstar, performance may be
better if three medium-priced players are hired. Given
these results, it is surprising that there is not a more
equal distribution of pay in baseball. One possible
explanation is that managers may care about attendance
as well as winning. They may be willing to sign up an
expensive superstar who will attract fans even though it
will increase pay inequality and may hinder
performance.

[Margin note: This paragraph lists the limitations of the paper. It considers both external and internal validity. The external validity asks if the conclusions can be generalized to other settings. The internal validity asks if the specifications used to reach the conclusions were appropriate and free of any biases.]

The conclusions above are subject to a number of
limitations. First, it is unclear to what extent the results
can be generalized to other sports. Each sport requires a
different degree of cooperation among team members.
Therefore, the relationship between pay inequality and
performance is likely to differ across sports. Second,
the error terms for each team could be correlated over
time. For example, if a team wins a lot of games one
year given its payroll and pay inequality, that team is
likely to win a lot of games the next year as well.
Therefore, the estimation procedure may need to
correct for this autocorrelation. Finally, there may be
other variables that affect performance, e.g. coach
salary or quality of training facilities. Including these in
the regression would increase the precision of my
estimates as well as eliminate potential omitted variable
bias.

[Margin note: This paragraph reflects on the results and points to further questions that may be addressed in future research. In this example, I consider possible reasons driving the negative relationship between pay inequality and performance. In other studies you may consider alternative explanations for your findings. Notice the tentative language.]

[Margin note: "Effect" is usually a noun. "Affect" is usually a verb. Two sentences in this paragraph illustrate each case.]

The channels through which pay inequality affects
performance are not clear. I can think of two
possibilities. One is that pay inequality leads to
tensions within the team and impairs performance. The
other possibility is that baseball requires players of
similar quality. Pay inequality is probably associated
with skill inequality, and it may be the skill inequality
that drives down performance. An excellent pitcher
cannot win the game when the outfielders cannot catch
or throw. It may be possible to distinguish these two
channels empirically. Using statistics on individual
player skill level, one could construct a measure of skill
inequality for a team and include it as an additional
control. The coefficient on pay inequality in that case
would capture the effect of pay inequality on
performance while holding skill inequality constant. A
negative impact of pay inequality would then support
the idea that pay inequality leads to tensions which
affect performance. This investigation, however, is left
for future research.

References:

[Margin note: References should be listed according to the Chicago Manual of Style.]

DeBrock, Lawrence, Wallace Hendricks, and Roger Koenker. 2004. Pay and performance: The impact of salary distribution on firm-level outcomes in baseball. Journal of Sports Economics 5 (August): 243–261.

Frick, Bernd, Joachim Prinz, and Karina Winkelmann. 2003. Pay inequalities and team performance: Empirical evidence from the North American major leagues. International Journal of Manpower 24: 472–491.

Appendix:

[Margin note: It is important that you keep a well documented file with your data and analysis.]

Data with documentation and results: MLB.xls



* I would like to thank Mary Mar, Youghwan Song, Stephen Schmidt and two anonymous referees for their helpful comments. I am
also grateful to many Union College students for their useful feedback.
