| Statistic | Purpose | Level of Measurement | How to Compute |
| --- | --- | --- | --- |
| Mean | Describes central tendency | interval | Add up the cases and divide by N. |
| Median | Describes central tendency | ordinal | Put the cases in order and select the one in the middle. |
| Mode | Describes central tendency (most frequent case) | nominal | Choose the most frequent case. |
| Range | Describes dispersion | ordinal | Put the data in order and measure the distance from the lowest to the highest. |
| Standard Deviation | Describes dispersion | interval | For each case, compute the distance between that case and the mean, square the differences, add them up, divide by n-1, and take the square root. See the detailed description on the Descriptive Statistics handout. |
| Inter-quartile range | Describes dispersion | ordinal | Put the data in order and measure the distance from the 25th percentile to the 75th percentile. |
| Margin of Error for a percent | Inferential: tells how much a sample percent is likely to differ from the true population value | nominal, typically used for questionnaire items, e.g., 55% for Candidate A, 40% for B, etc. | m = 1/sqrt(n), where n is the size of the sample and sqrt means take the square root. This is described at more length in the class notes. |
| Margin of Error for a Mean score | Inferential: tells how much a mean score from a sample is likely to differ from the true population mean | interval, because that is needed to get the mean. Used for data such as income, test scores, age, etc. | M = 2 * sd / sqrt(N), where sd is the standard deviation, N is the sample size, and sqrt means take the square root. This is described at more length in the class notes. |
| Sample Size | Tells the size of a sample needed to obtain a given margin of error | nominal | n = 1/m², where m is the desired margin of error expressed as a proportion (not as a percent), e.g., 0.05, not 5%. This is described at more length in the class notes. |
| Observed Frequency | The number of cases observed in a particular category | Nominal data. It can be univariate, e.g., 35 men and 55 women, or bivariate, e.g., 15 Tall Men, 25 Short Men, 10 Tall Women, etc. | Tabulate the data, either by hand or with a computer. Microcase or other statistical packages will do this. To do it in Excel, create a "pivot table." Usually you will be given the observed frequencies. |
| Row Percent | Describes the frequency in a cell as a percent of the row total | two nominal variables in a cross-tabulation | observed frequency / row total. All the row percents in a row will add to 100%. |
| Column Percent | Describes the frequency in a cell as a percent of the column total | two nominal variables in a cross-tabulation | observed frequency / column total. All the column percents in a column will add to 100%. |
| Total Percent | Describes the frequency in a cell as a percent of the grand total | two nominal variables in a cross-tabulation | observed frequency / grand total. All the total percents in the entire table will add to 100%. |
| Expected Frequency | Used to compute chi-square; tells what frequency we would expect if there were no relationship between two variables | two nominal variables in a cross-tabulation table | For each cell, take the row total for the row it is in, multiply it by the column total for the column it is in, then divide by the grand total. These are frequencies, not percents, so do not give them a percent sign. If you add them all up, they add to the same number as the observed frequencies. |
| chi-square | Inferential: tells if the relationship between two variables in a cross-tabulation is "statistically significant" | A cross-tabulation of two nominal variables (or it can be used to compare one variable to theoretical values) | Use the web chi-square calculator or get it from Microcase. (To get it by hand, subtract each observed frequency from the corresponding expected frequency, square the difference, divide by the expected frequency, then add up the results for all the cells.) |
| ANOVA or Analysis of Variance | Inferential: usually used to compare scores of two or more groups (e.g., experimental and control groups) on a variable measured on an interval scale. | One nominal variable (often groups of respondents) and one interval variable | Microcase or other software will compute this for you. It can be used with the GSS data set in student Microcase, e.g., to compare groups such as religions or racial groups on continuous variables. |
| correlation coefficient | Describes the strength of the relationship between two interval variables | interval, but can also be used with ordinal variables if they are a reasonable approximation to interval. | Microcase or Excel will calculate it for you. It ranges from -1 through 0 to +1. Squaring it tells you the proportion of variance the regression equation explains. |
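| Cramer's V | Describes how well one variable in a cross-tabulation can explain the other. | two nominal variables in a cross-tabulation table | Microcase computes it. It is derived from the chi-square: divide chi-square by N times (k - 1), where k is the smaller of the number of rows and the number of columns, and take the square root. |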
| Multiple R² | Describes how well all the variables in a multiple regression equation explain the variance in the dependent variable | interval or dichotomous | Microcase or Excel will compute it for you. |
| regression equation | Describes the line that best fits the relationship between two variables. Can be used to predict the dependent variable with the independent variable, if the relationship is close to linear. | interval (or dichotomous) | Normally, you will get the formula from Microcase or Excel or it will be given to you. It will be of the form Y = a + b X, where X is the independent variable, Y is the dependent variable, b is the regression coefficient and a is the intercept. a and b will be numbers (parameters) that define the equation for a particular case. To make a prediction, multiply the value for X by b and add a (or subtract if it is negative). |
| beta coefficient | Describes how good a predictor each of the independent variables in a multiple regression equation is. | interval (or dichotomous) | These are standardized regression coefficients. Microcase or Excel will compute them for you. They are used to describe the strength of each arrow on a path diagram. Note: for bivariate regressions the correlation coefficient is the same as the beta coefficient. |
| path diagram | Describes how a number of regression equations can be used to describe a pattern of causal relationships. | interval or dichotomous | Draw a diagram with the dependent variable on the right and the independent variables on the left. Insert the antecedent and intervening variables. Draw arrows going from left to right to represent each hypothesized link. Compute a regression equation for each variable that has an arrow going into it. Each arrow represents an independent variable to be included in that equation. This is explained in Introduction to Path Analysis. |
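
The hand computations for the descriptive statistics, the two margins of error, and the sample size in the table above can be sketched in Python using only the standard library. This is an illustrative sketch, not part of the course materials: the `scores` data and the function names are assumptions made up for the example, and `statistics.quantiles` uses one of several common quartile conventions, so other software may report slightly different quartiles.

```python
import math
import statistics

# Hypothetical interval-level data (not from the handout).
scores = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(scores)            # add up the cases, divide by N
median = statistics.median(scores)        # middle case after sorting
mode = statistics.mode(scores)            # most frequent case
value_range = max(scores) - min(scores)   # lowest to highest
sd = statistics.stdev(scores)             # sample SD, divides by n - 1

# Quartiles: statistics.quantiles returns the three cut points for n=4.
q1, _q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                             # 25th to 75th percentile


def margin_of_error_percent(n):
    """m = 1/sqrt(n), returned as a proportion (0.05 means 5%)."""
    return 1 / math.sqrt(n)


def margin_of_error_mean(sd, n):
    """M = 2 * sd / sqrt(N), in the same units as the scores."""
    return 2 * sd / math.sqrt(n)


def sample_size_for_margin(m):
    """n = 1/m^2, with m given as a proportion; rounded up to a whole case.

    The inner round() guards against tiny floating-point error before
    rounding up.
    """
    return math.ceil(round(1 / m ** 2, 9))


print(mean, median, mode, value_range, round(sd, 3), iqr)
print(margin_of_error_percent(400))      # sample of 400 respondents
print(margin_of_error_mean(10, 100))     # sd of 10, sample of 100
print(sample_size_for_margin(0.05))      # cases needed for a 5% margin
```

Note that the margin of error and the sample size formulas are inverses of each other: a sample of 400 gives a margin of 0.05, and asking for a margin of 0.05 gives back a sample of 400.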
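
The cross-tabulation rows of the table (row, column, and total percents, expected frequencies, chi-square, and Cramer's V) can be worked through on a small example. The sketch below is only an illustration under stated assumptions: the 2 x 2 `observed` table is invented, and Cramer's V is computed with the usual textbook formula, the square root of chi-square / (N * (k - 1)), where k is the smaller of the number of rows and columns. It does not produce a significance level; for that you would still use a chi-square calculator or statistical package.

```python
import math

# Hypothetical observed frequencies: rows are one nominal variable,
# columns are the other (values invented for illustration).
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Percents: each cell divided by its row, column, or grand total.
row_percents = [[100 * cell / row_totals[i] for cell in row]
                for i, row in enumerate(observed)]
col_percents = [[100 * cell / col_totals[j] for j, cell in enumerate(row)]
                for row in observed]
total_percents = [[100 * cell / grand_total for cell in row]
                  for row in observed]

# Expected frequency: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square: squared difference over expected, summed over all cells.
chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)

# Cramer's V (textbook formula, see the assumption stated above).
k = min(len(row_totals), len(col_totals))
cramers_v = math.sqrt(chi_square / (grand_total * (k - 1)))

print(row_percents)
print(expected)
print(chi_square, cramers_v)
```

As the table notes, the expected frequencies add up to the same grand total as the observed frequencies, which is a quick check on hand calculations.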
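
For the correlation coefficient, regression equation, and beta coefficient rows, the following sketch shows one way the numbers that Microcase or Excel report could be reproduced by hand for a bivariate case. The data in `x` and `y` and the helper name `predict` are hypothetical, and the least-squares formulas used here (slope = sum of cross-products / sum of squared X deviations) are the standard ones rather than anything specific to the course software.

```python
import math

# Hypothetical paired interval-level data (invented for illustration).
x = [1, 2, 3, 4, 5]   # independent variable X
y = [2, 4, 5, 4, 5]   # dependent variable Y

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Sums of squared deviations and of cross-products.
sxx = sum((xi - x_mean) ** 2 for xi in x)
syy = sum((yi - y_mean) ** 2 for yi in y)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

# Regression equation Y = a + b X.
b = sxy / sxx                 # regression coefficient (slope)
a = y_mean - b * x_mean       # intercept

# Correlation coefficient and proportion of variance explained.
r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Standardized (beta) coefficient: b rescaled by the two standard
# deviations. In a bivariate regression it equals r.
sd_x = math.sqrt(sxx / (n - 1))
sd_y = math.sqrt(syy / (n - 1))
beta = b * sd_x / sd_y


def predict(x_value):
    """Multiply the value for X by b and add a."""
    return a + b * x_value


print(a, b, r, r_squared, beta)
print(predict(6))
```

The last comparison illustrates the table's note that, for bivariate regressions, the beta coefficient and the correlation coefficient are the same number.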