Notes for Methods and Techniques of Social Research, Spring 2005
Grading formulas:
Total Score =
([Attendance]*0.1+[Quizzes and Assignments]*0.2+[Grade on Midterm
One]*0.20+[Grade on Midterm Two]*0.20+[Midterm2 Makeup]*0.2+[Final
Exam]*0.3)+[Extra Credit Points]
Final Exam = ([Final Multiple
Choice Items]*0.75+[Final Statistics Items]*0.25)
Quizzes and Assignments =
([Microcase Intro]+[Workbook 1]+[Workbook 2a]+[Workbook
2b]+[PercentQuiz]+[Workbook 3]+[Enrolling]+[Sampling]+[Workbook
5]+[Historical Trends]+[Workbook 8]+(2*[Human Subjects Letter])+[Crime
Drop]+(2*[Excel Regression])+[Research Design])/17
April 26 - Review
Here are some points covered with multiple choice
questions. This material is covered in our text. The
"review glossary" at the end of each chapter is useful for review:
- The differences between survey research, field research,
experimental research, focus groups and content analysis
- The field research, content analysis and other studies covered in
the last two weeks of class. If you weren't in class, you can
check these out on the notes.
- History effects, maturation effects, testing effects, regression
to the mean and subject mortality in experimental research.
- Criteria for establishing causation. Independent,
dependent, antecedent and intervening variables.
- Levels of measurement: dichotomous, nominal, ordinal,
interval and ratio. The levels of measurement required to use
statistical techniques such as percentages, chi square, correlation,
regression, means, standard deviation.
- Interpreting a time series graph, such as those we did with the
Historical Trends module of WEBCT and in interpreting the Crime Drop
article.
- Tests of reliability: inter-rater, test-retest, internal
consistency.
- Tests of validity: criterion, construct, convergent, face.
- Types of samples: simple random, stratified, quota,
systematic, cluster.
- Interpretation of scattergrams, e.g., height and weight.
Statistics questions will include:
- Percentages: row, column and total. Covered in these notes.
- Expected frequencies. Covered in these notes.
- Computation of the mean.
- Computation of the standard deviation.
- Computation of margins of error, covered in these notes.
- Use of an unstandardized regression equation to make a prediction. This was part of the Excel Regression assignment and is also discussed in these notes. A sample item is question 9 in the test called "Review Items for Final Two".
- Use of standardized regression coefficients (beta weights) to interpret a path diagram. This is explained in these notes.
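As a quick refresher on the first two items, here is a minimal Python sketch. The 2x2 table of counts is made up for illustration and is not from the text or from any class data set. It computes row, column and total percentages, and the expected frequency for each cell as (row total * column total) / N.

```python
# Hypothetical 2x2 table of counts (made-up numbers for illustration only).
# Rows might be "likes basketball" / "does not"; columns "tall" / "short".
table = [[30, 10],
         [20, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Row percentages: each cell as a percent of its row total.
row_pct = [[100 * c / row_totals[i] for c in row] for i, row in enumerate(table)]

# Column percentages: each cell as a percent of its column total.
col_pct = [[100 * c / col_totals[j] for j, c in enumerate(row)] for row in table]

# Total percentages: each cell as a percent of the grand total.
total_pct = [[100 * c / n for c in row] for row in table]

# Expected frequency for each cell if the two variables were unrelated.
expected = [[row_totals[i] * col_totals[j] / n for j in range(len(col_totals))]
            for i in range(len(row_totals))]

print(row_pct)
print(col_pct)
print(total_pct)
print(expected)
```

Which percentages to read depends on where the independent variable sits: percentage within categories of the independent variable, then compare across them.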
There are also some WEB pages, one on Descriptive Statistics and one on
Inferential Statistics Review. The class notes for October 1 (below) show
how to calculate many of the statistics. On December 10 we did some
statistics review questions taken from last semester's final.
Depression Scale.
Weight and Mortality.
April 21:
Content Analysis - "unobtrusive data." Data created by a bureaucratic
system, e.g., police records, or often by the media: television or
newspapers. We study the media either because the media themselves are
our interest, or as a way of getting information, e.g., on crime
reported in the news.
Similar to survey research, except that you do coding instead of
interviewing. Coding means that you assign numbers to phenomena that
you observe. Counting things. Each of your variables is coded from the
published information.
Conceptualization. Measurement. Reliability and Validity.
Manifest Content - what it's about on the surface.
Latent Content - things that we infer about the content, e.g., does the
writer sound angry, indignant, sexy?
A Content Analysis Study of Editorial Cartoons.
A Content Analysis of Internet-Accessible Written Pornographic Depictions.
April 19
Experimental Research. Experimental Designs. See the graphs in the book
or on Trochim's WEB site: Types of
Designs.
Essential
characteristics:
- Two or more groups are matched, usually by random assignment, sometimes by a kind of stratified random selection, e.g., an equal number of men and women or blacks and whites in each group. But the key is random assignment so that the groups can be assumed to be the same on all variables. "Quasi-experiments" are when we use groups that are pretty much the same but we didn't assign people at random.
- The Independent Variable is "manipulated," i.e., it is applied to one group and not to the other.
- Change in the Dependent Variable is measured.
Experiments can be done:
- In laboratory settings with volunteers, e.g.,
student volunteers
- In institutional settings such as prisons,
hospitals, rehabilitation
centers, etc., where people are assigned to treatment groups
- New drugs and medical treatments generally must be shown to work in experiments before they are approved for use. Often, treatment is compared to a placebo. These experiments are usually "double-blind," to control for the psychological effects of knowing one is getting treatment. This is a way of controlling subject bias and experimenter bias.
- In criminal justice, one might do an experiment comparing a "halfway house" to a drug treatment program to a prison term for offenders. To do this, you would have to get the judge to assign offenders to different programs at random. Ethical issues are raised here and there are likely to be objections.
- Occasionally in natural settings, for example:
  - welfare reform experiment: assign some recipients to the old program, some to the new. This didn't work very well; there were errors in the group assignments and the women often forgot which group they were in anyway
  - vaccination experiments
  - guaranteed annual income experiments
Although logically experiments are the most rigorous
way
to test causal hypotheses, there are practical problems:
- It may be hard to manipulate the independent variable effectively; it may not have enough importance to people that they notice it
- Experimental conditions may not be realistic enough, e.g., the Milgram experiments having people apply electric shock to people, or experiments that simulate being in prison. An experiment is not the real world and people know it. This is the question of external validity: does the experiment match real world conditions?
- There may be problems of internal validity, difficulties in carrying out the experiment:
  - "History" effects - the world changes during the experiment, people get older, more mature, they are affected by things in the real world
  - Maturation - people get older, learn more
  - Testing effects - taking the pretest measure affects people, causes them to change. Sometimes we have a matched but untested control group that is measured only after the experiment.
  - Instrument effects - the testing instrument may change. You can't use the same exact test sometimes because people will remember it, so items change
  - Regression to the mean - just by chance the people who got extremely high or low scores on a pretest are likely to get more average scores on the second test.
  - Subject "mortality" - we may lose people. This is especially a problem in testing things like drug rehabilitation: it works for the people who stick with it, the failures drop out
- Ethical concerns: people may not be willing to be experimented on, or it may be harmful to subject them to experimental conditions, e.g.,
  - Tuskegee syphilis experiment denied some men penicillin. You can only deny an experimental drug if you are not "certain" that it works or if the condition is not serious, e.g., common cold research
- A big strength of experiments is resolving
questions that
involve different recollections of events, e.g., children's reports of
abuse. You don't know what "really" happened and people disagree
on how well they accept the recollections of different people. In
an experiment, you know what really happened, so you can check the
accuracy
of perception. We find that children often remember things that
didn't
really happen. A "20/20" report on Child Abuse experiments
(VIDEO shown in class) demonstrates false memory because we know what
really happened since it happened in a controlled experimental
setting. This is much more difficult to establish in real life
case histories: Loftus: Who Abused Jane Doe?
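Regression to the mean, listed above among the threats to internal validity, is easy to see in a simulation. The sketch below is only an illustration under made-up assumptions: each hypothetical subject has a stable "true" score, and each test adds independent random luck. The people who score highest on the pretest score closer to average on the posttest even though nothing was done to them.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumption: observed score = stable true score + independent luck on each test.
true_scores = [random.gauss(50, 10) for _ in range(10000)]
pretest = [t + random.gauss(0, 10) for t in true_scores]
posttest = [t + random.gauss(0, 10) for t in true_scores]

# Select the 10% of subjects with the highest pretest scores.
ranked = sorted(range(len(pretest)), key=lambda i: pretest[i], reverse=True)
top = ranked[:1000]

top_pre_mean = sum(pretest[i] for i in top) / len(top)
top_post_mean = sum(posttest[i] for i in top) / len(top)
overall_mean = sum(pretest) / len(pretest)

print("Top group, pretest mean: ", round(top_pre_mean, 1))
print("Top group, posttest mean:", round(top_post_mean, 1))
print("Everyone, pretest mean:  ", round(overall_mean, 1))
```

The top group's posttest mean falls between its own pretest mean and the overall mean. An experiment that selects subjects for extreme pretest scores and has no control group could mistake this drift for a treatment effect.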
April 14: Field Research:
Some examples of field research:
Margaret Mead, the only anthropologist (or sociologist) to get her own
postage stamp, won fame through field work, primarily her book Coming
of Age in Samoa. Later, this book was denounced by anthropologist
Derek Freeman in his book Margaret Mead and the Heretic: The Making and
Unmaking of an Anthropological Myth. Anthropologists have come to
Mead's defense, and have restudied the case, but I would have to agree
with your text that "had Mead come back from Samoa with an accurate
ethnographic report, it would not have made her famous." Here is the
NY Times Review of Freeman's critique of Mead.
More recently, there has been a raging controversy about the book
Darkness in El Dorado, about research on the Yanomamo in Venezuela.
It is the latest ethical controversy, and it also raises important
methodological questions. Many of the book's allegations, however, have
been contested by the National Academy of Sciences.
The combining of fiction with factual research is increasingly common
both in anthropology and in biographies. Sometimes this is openly done
as a literary form; in other cases, such as that of Rigoberta Menchu,
it is only admitted when critics discover it.
The Rigoberta Menchu Controversy by Arturo Arias.
There are many problems with field research: ethical issues, problems
of reliability and validity when data are gathered by only one
researcher, etc. A controversial book is Laud Humphreys' Tearoom
Trade, which raises ethical issues. He studied gay sex in a men's
room in a park in St. Louis, without informing the participants what he
was doing.
Field researchers sometimes seem to find examples that fit their
preconceptions, and their work is often ignored by those who do not
like the results, e.g., Leon Dash's books When Children Want Children
and Rosa Lee, which are just ignored by welfare advocates who prefer
more sympathetic treatments. One of the best field studies is
Kathryn Edin's book Making Ends Meet, which is highly sympathetic to
the mothers. However, Edin collected statistical data as well as her
illustrative observations. The statistics showed that almost none of
the mothers actually lived off their grants alone. Elijah Anderson's
book Streetwise on men in a Philadelphia ghetto has been well received,
in large part because it goes beyond one-sided advocacy.
A great strength of field work is observing behaviors that the people
themselves don't understand or aren't even aware of, or at any event,
are unable or unwilling to talk about. Anthropologist Jules Henry
spent a week living in each of the homes of several children who had
grown up mentally ill, trying to discern patterns in the family
interactions that contributed to the illness. Myra Bluebond-Langner's
book The Private Worlds of Dying Children has been very influential;
she has just published a sequel called In the Shadow of Illness:
Parents and Siblings of the Chronically Ill Child.
Field research offers a richness of description and possibility of new
insights that is unparalleled by any other method. Unless it is
supplemented with other methods, it does not provide statistical data,
and it is hard to replicate.
Myra Bluebond-Langner of our Anthropology Department wrote a classic, The
Private Worlds of Dying Children, and more recently, In
The Shadow of Illness.
Coming
of Age in New Jersey.
The
Corner.
Black American Students in an Affluent Suburb, by John Ogbu.
Commentary
on Ogbu's research.
Many scholars who have disputed those findings rely on a
continuing survey of about 17,000 nationally representative students,
which is conducted by the National Center for Education Statistics, an
arm of the federal government. This self-reported survey shows that
black students actually have more favorable attitudes than whites
toward education, hard work and effort.
But that has by no means settled the debate. In the February
issue of the American Sociological Review, for example, scholars who
tackled the subject came to opposite conclusions. One article (by three
scholars) said that the government data were not reliable because there
was often a gap between what students say and what they do; another
article by two others said they found that high-achieving black
students were especially popular among their peers.
"It's difficult to determine what's going on," said Vincent
J. Roscigno, a professor of sociology at Ohio State University who has
studied racial differences in achievement. "I'm sort of split on Ogbu.
It's hard to compare a case analysis to a nationally representative
statistical analysis. I do have a hunch that rural white poor kids are
doing the same thing as poor black kids. I'm tentative about saying
it's race-based."
Indeed, Professor Mickelson of the University of North
Carolina found that working class whites as well as middle-class blacks
were more apt to believe that doing well in school compromised their
identity.
All these years later, Professor Fordham said, she fears that
the acting-white idea has been distorted into blaming the victim. She
said she wanted to advance the debate by looking at how race itself was
a social fiction, rooted not just in skin color but also in behaviors
and social status.
"Black kids don't get validation and are seen as trespassing
when they exceed academic expectations," Professor Fordham said,
echoing her initial research. "The kids turn on it, they sacrifice
their spots in gifted and talented classes to belong to a group where
they feel good."
April 12: We
went over
The
Crime
Drop in America: Disaggregating Violence Trends.
April 7: we learned how to
plot regression lines on graphs in Excel.
April Fifth. Department
Newsletter.
March 31, Second Midterm: Grading formulas:
Quizzes and Assignments = ([Microcase Intro]+[Workbook 1]+[Workbook
2a]+[Workbook 2b]+[PercentQuiz]+[Workbook
3]+[Enrolling]+[Sampling]+[Workbook 5]+[Historical Trends]+[Workbook
8])/11
Grade on Midterm Two = 0.8*[Midterm Two Multiple Choice]+0.2*[Midterm
Two Statistics]
Estimated Course Grade = ([Attendance]*0.1+[Quizzes and
Assignments]*0.2+[Grade on Midterm One]*0.35+[Grade on Midterm
Two]*0.35)+2
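For anyone who wants to check their own numbers, the two formulas above can be written as a short Python sketch. The function names and the sample scores are hypothetical; only the weights come from the formulas in these notes.

```python
def midterm_two_grade(multiple_choice, statistics):
    """Grade on Midterm Two: 80% multiple choice, 20% statistics items."""
    return 0.8 * multiple_choice + 0.2 * statistics


def estimated_course_grade(attendance, quizzes, midterm_one, midterm_two):
    """Estimated Course Grade, using the weights given above plus 2 points."""
    return (attendance * 0.1 + quizzes * 0.2
            + midterm_one * 0.35 + midterm_two * 0.35) + 2


# Made-up scores for one hypothetical student, all on a 0-100 scale.
m2 = midterm_two_grade(multiple_choice=85, statistics=90)
estimate = estimated_course_grade(attendance=100, quizzes=90,
                                  midterm_one=80, midterm_two=m2)
print(round(m2, 1), round(estimate, 1))
```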
March 29 Amputation
Survey, mentioned in class. General review going over these
notes. In your review, you may find it better to start with the
older notes at the bottom of the file. The exam will cover
chapters one, two, three, four, five, seven and eight plus material
covered in class (see the class notes) including path analysis and the
computation of margins of error. It will NOT include the use of
regression equations to calculate predicted scores on dependent
variables (we'll do that after the midterm).
March 24
Capital
Punishment and Homicide Rates: Sociological
Realities and Econometric Distortions, a paper which is
in WEBCT.
The ABSTRACT summarizes the main point of the paper:
Sociological methods have
consistently succeeded while econometric methods have failed in
research on capital punishment and homicide. But econometricians
aggressively promote their findings in public policy venues, while
sociologists are less assertive. This is due to cultural
differences between the disciplines, and to a philosophy of science
that values falsification of hypotheses over progress in answering
research questions. This problem has occurred and is likely to
reoccur in other policy areas where sociologists are insufficiently
assertive in defending their accomplishments.
- In this paper the "sociological method" discussed is comparative
analysis as discussed in Chapter 8 of our text, especially the use of
time series graphs and scattergrams. Several are at the end of
the paper.
- The "econometric" method is multiple regression, discussed in
chapter 5 of our text.
- The dependent variable is the homicide rate in a state, the
independent variable is either the presence or absence of capital
punishment in a state or the number of executions in a given year in a
state.
- Econometric studies have given highly variable results. Many
econometricians conclude that each execution deters a number of
homicides, thus saving lives.
- But others conclude the opposite, showing the importance of
reviewing the literature comprehensively instead of just relying on the
studies that happen to confirm one's position.
- Why have the regression studies failed to give a consistent
result? 1) The available data are limited and do not meet
the assumptions required for regression analysis, especially a normal
distribution. 2) The State of Texas is a prominent outlier,
dominating the results. 3) There are so many different ways
to adjust the data to fit them into a regression equation that
researchers seem to be able to come up with any result they believe in.
- This has been the case in many other areas, including gun
control, race and intelligence, assessing the effects of minimum wage
laws, etc. You can find an economist on either side of the
issue, just as you can find a psychiatrist on both sides in many
criminal trials. This is why Harry Truman said he wanted a
"one-handed" economist.
- A better method is to examine the data in graphs, then interpret
it using your qualitative knowledge of the states in question.
This method does not distort the data by assuming it fits into normal,
linear distributions.
- People who used this method have agreed that no effect of capital
punishment on homicide rates can be found in the available data.
We will go over Exercise 8 in the Workbook.
Discussion of paper on Capital Punishment and Homicide Rates, which is
available in WEBCT. An older, more popular version is available online.
March 21
Mr.
Goertzel:
Guess what! As the weeks have been passing, and you've been teaching
us all this research data stuff... I've asked myself a hundred times -
"when the hell am I ever going to need to know all of this?"
The past few days I was in Chicago with J'ona Meyer for a CJ
convention and during one of the presentations a gentleman put up his data on a
projector......and I was ASTONISHED that I understood EVERYTHING he was
talking about - from his variables, to the standard deviation, to it's statistical
signifigance, etc.! I'm happy to report I'm actually learning
something of use in your class! Had I not taken this class, I would have sat there
feeling like a total dummy not knowing a thing what he was talking about.
Thanks - Denise Gilboy
Comparative
Research Using Aggregate Units, Chapter 8 in the text. This
research method uses data about social or geographic units.
Consistent
criminal justice statistics are important for evaluating CJ
policies. Thorsten
Sellin, a professor at Penn, was instrumental in getting consistent
CJ statistics established. We can find examples on the Bureau of Justice
Statistics
WEB site.
Comparative methods are particularly useful for studying change because
we can get data about trends over time. Look, for example, at
some Trend
Graphs taken from the "Historical Trends" module in the
Professional Microcase. This is available in the computer center
on the networked Windows computers (click on Statistics and Microcase
on the Windows menu, then open "Microcase Curriculum Plan 2003-2004" and
load the TrendSmp data set). Our next assignment requires using
this data set in the computer lab.
Some concepts:
Rate: A statistic that reduces numbers to a common
base. The base is often, but not necessarily, the total
population in an area. If we are looking at voting participation,
we might compute rates using the base of the number of adults 18 or
over. If we are trying to predict an election, we might use a
base of registered voters.
A crude birth rate is the number of births per 1,000
population. Fertility rate is the number of births per female
during her lifetime.
Time Series analysis: uses time periods as the unit of
analysis, looks at how things change over time often in one
case. A lagged time series takes into account the time it
takes for one variable to influence another, thus incarcerations in one
year might be related to crimes in the next year.
Cross-sectional analysis compares a number of cases at
one point in time.
Reliability: are statistics computed the same way in
different geographic units or different time periods? This causes
all sorts of problems - it is better to improve statistics, but doing
so causes us to lose comparability.
Validity: do the statistics measure what we want them to
measure? Crimes reported to the police are not a valid measure of
the amount of actual crime, especially for crimes that are often not
reported.
Case oriented vs. variable oriented. The case oriented
approach is more qualitative, although quantitative trend data can be
used. The variable oriented approach assumes that the same
variables are causally related in the same way in a large number of
cases, e.g., "capital punishment" and "homicide rates" in a number of
states or countries.
Outliers: especially in variable-oriented research, it is
important to look for exceptional cases that are very different from
the norm. These tend to cause a disproportionate impact on our
results.
Lagged: Using statistics from past years to predict events in
current years. This is done because our theory says that causal
linkages take some time to take place.
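Two of these concepts, rates and lagged series, come down to simple arithmetic. Here is a minimal Python sketch; all of the numbers and the function names are made up for illustration and do not come from any real data set.

```python
def rate(events, base, per=1000):
    """Number of events per `per` units of the base population."""
    return events * per / base


def lagged_pairs(cause, effect, lag=1):
    """Pair each year's `cause` value with the `effect` value `lag` years later."""
    return list(zip(cause[:len(cause) - lag], effect[lag:]))


# Hypothetical area with 1,300 births in a population of 100,000.
crude_birth_rate = rate(1300, 100_000)  # births per 1,000 population

# Hypothetical yearly series, oldest year first.
incarcerations = [100, 120, 150, 170]
crimes = [900, 880, 850, 800]

# Each year's incarcerations matched with the NEXT year's crimes.
pairs = lagged_pairs(incarcerations, crimes, lag=1)

print(crude_birth_rate)
print(pairs)
```

Note that lagging by one year costs one data point: four years of data give only three pairs.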
March 10
To understand regression, we first need to understand what it means to
plot an equation on a graph. If we draw two axes on a piece of paper
or on the whiteboard, we have a Cartesian coordinate plane with an
x-axis (for our independent variable) and a y-axis (for our dependent
variable). We can then plot lines on this graph by using a regression
equation:
Y = a + b X
where X and Y are our variables, and a and b are parameters or fixed
numbers given to us by the computer software.
For example, plot the following lines:
If a is zero and b is one, then Y = X.
We can say: if X is 0, Y is 0. If X is 2, Y is 2,
etc. If we plot these points on the graph we get a straight
diagonal line going from the lower left to the upper right (to be
demonstrated in class):
If a is one and b is one, we get a line
parallel to the first, but one notch up.
If a is 0 and b is minus one, the line
will go down... etc.
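The three lines just described can be checked by computing a few points for each. This is a small Python sketch; the function name and the choice of X values are only for illustration.

```python
def line_points(a, b, xs):
    """Return (X, Y) points on the line Y = a + b X."""
    return [(x, a + b * x) for x in xs]


xs = [0, 1, 2, 3]

diagonal = line_points(0, 1, xs)    # a = 0, b = 1: Y = X
shifted_up = line_points(1, 1, xs)  # a = 1, b = 1: same slope, one notch up
downward = line_points(0, -1, xs)   # a = 0, b = -1: the line goes down

print(diagonal)
print(shifted_up)
print(downward)
```

Changing a moves the whole line up or down; changing b changes how steeply it rises or falls.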
Regression is a method that computes equations like this to fit straight
lines to bivariate relationships between continuous or linear variables.
It works best when the variables are "normally distributed," i.e. when
they fit a bell-shaped normal curve with most of the cases near the
mean and few extremes.
We can best see how regression works by using the scatterplot program
in Microcase with the USA data set, which has many continuous variables
using the US States as the unit of measurement, and clicking on "reg
line". For example, look at the graph of % college and Median
family income (open Microcase to see this).
At the bottom it says "Line Equation Y = 15254 + 902.229 X". This is
the equation of the straight line that appears on the graph.
What does it mean to say that it is the equation for a line? It
means that if you use the equation to plot points on a graph they will
look like that line. The more general form of this equation is Y
= a + b X where:
X is the independent variable (in
this case % college)
Y is the dependent variable (in
this case Med Fam $)
a is the "intercept" - this is a
"parameter" of the equation which means it stays fixed while the
variables vary
b is the "unstandardized regression coefficient" - it is also a
parameter.
The software computes the equation for us, which is called "fitting a
regression equation to the data".
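Once the software gives us a and b, making a prediction is just plugging a value of X into the equation. A minimal Python sketch using the line equation quoted above; the 20% college figure is a hypothetical state chosen for illustration.

```python
def predict(x, a=15254, b=902.229):
    """Predicted Y from the unstandardized regression equation Y = a + b X."""
    return a + b * x


# Hypothetical state where 20% of adults are college graduates.
predicted_income = predict(20)
print(round(predicted_income, 2))  # about 33298.58 dollars

# b is the change in predicted Y for a one-unit change in X:
# one more percentage point of college graduates adds about $902.
difference = predict(21) - predict(20)
print(round(difference, 3))
```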
We can also do Multiple Regression which means we have more than one
independent variable. For example, we could use both the %College
and the %urban and the %smokers to predict median family income.
We would have an equation such as:
Y = a + b1 X1 + b2 X2 + b3 X3
with as many b's and X's as we include variables.
With multiple regression, however, we can't plot a scatterplot
unless it is three-dimensional. Going beyond three dimensions is
impossible to visualize. Plus, it is hard to compare the b's
because they are measured in different units. So we create:
standardized regression coefficients also referred to as BETA Coefficients
which vary from -1 to 0 to +1 like correlation coefficients.
These are used to compare how well each of the independent variables
helps us to predict the dependent variables. We can also
construct complex networks of regression equations
where A, B and C predict D, then D and E
predict F, etc. etc. This method is best illustrated with path
analysis, a way of graphing complex regression models.
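To see what "standardizing" does to a coefficient, here is a Python sketch for the simplest case of one independent variable. The five data points are made up. It uses the standard conversion beta = b * (sd of X) / (sd of Y); with a single independent variable the beta weight equals the correlation coefficient r. With several independent variables the software computes the betas for us.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sd(values):
    """Sample standard deviation."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def slope(x, y):
    """Unstandardized regression coefficient b for Y = a + b X."""
    mx, my = mean(x), mean(y)
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return cov / sqrt(sum((xi - mx) ** 2 for xi in x)
                      * sum((yi - my) ** 2 for yi in y))


# Made-up data: % college graduates and median family income for five areas.
pct_college = [10, 15, 20, 25, 30]
med_income = [30000, 36000, 39000, 47000, 50000]

b = slope(pct_college, med_income)            # in dollars per percentage point
beta = b * sd(pct_college) / sd(med_income)   # unit-free, comparable across variables
r = pearson_r(pct_college, med_income)

print(round(b, 1), round(beta, 3), round(r, 3))
```

The b depends on the units (dollars, percentage points), while the beta does not, which is why betas can be compared across independent variables.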
Path analysis is useful because it enables us to visualize our ideas
much better than we can when we see a list of equations. It
approaches the complexity of our qualitative thinking.
Unfortunately, however, the mathematics requires a lot of simplifying
assumptions. The method does not really PROVE that the model is
correct, it simply illustrates our ideas and shows the strength of the
correlations. It assumes linear relationships, which we often do
not have. In my opinion, graphing trends and interpreting them in
view of our qualitative knowledge is more valid, although less
"hi-tech". I have published two articles arguing this point in
the Skeptical Inquirer magazine. Here is a link to
the text of one of them in case you are interested: this is
something we will come back to later in the course.
Before getting into the criticism, however, we can learn how to do a
path analysis. Although the
mathematics is a bit complex, the
computer does it for us, so it is not actually difficult. You can
find the basic ideas in Brief Intro to Path Analysis. If anyone wants
a longer introduction with more examples it is available.
For the exam, you should know how to set up the regression equations
to fit a path diagram. All you have to do to actually do a path
analysis is put these equations into Microcase. The rules are as
follows:
- There should be a regression equation for each variable that has
an arrow pointing towards it.
- For each equation, the variable having arrows pointing into it is
the dependent variable, and goes to the left of the equals sign.
- For each equation, the variables on the left of the dependent
variable that have arrows pointing into it are the independent
variables. These are listed to the right of the equals sign and
connected with + signs.
- There is no need to include an intercept, because we are
interested only in the standardized regression equations or beta
weights.
An example. Suppose we have the following diagram:

For this diagram, we would need the following equations:
vote for perot = alienation from government + alienation from society + finances worse
alienation from government = status deficiency
alienation from society = status deficiency
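The rules for turning a path diagram into equations are mechanical enough to write as a short program. In this Python sketch each arrow is a (from, to) pair. The list of arrows is inferred from the three equations above, and the function name is made up for illustration.

```python
# Each arrow in the diagram, written as (variable it leaves, variable it points to).
# This list is an assumption reconstructed from the equations in these notes.
arrows = [
    ("status deficiency", "alienation from government"),
    ("status deficiency", "alienation from society"),
    ("alienation from government", "vote for perot"),
    ("alienation from society", "vote for perot"),
    ("finances worse", "vote for perot"),
]


def path_equations(arrows):
    """One equation per variable that has at least one arrow pointing into it."""
    causes = {}
    for source, target in arrows:
        causes.setdefault(target, []).append(source)
    # Dependent variable on the left, independent variables joined with + signs.
    return [f"{target} = {' + '.join(sources)}" for target, sources in causes.items()]


equations = path_equations(arrows)
for equation in equations:
    print(equation)
```

Variables with no arrows pointing into them (here status deficiency and finances worse) get no equation of their own; they appear only on the right-hand side.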
If we got measures for these variables from a National Election Survey
(Status Deficiency would be an index we would have to calculate), we
could use the Regression procedure in Microcase to enter the three
regression equations and get Beta coefficients which we could put on
the diagram, as follows:

March 8:
Today we will look at testing causal hypotheses.
On page 93 in the text, we have the example of the relationship between
Height and Liking Basketball. These are an IV and a DV. An obvious
TEST VARIABLE is Gender. This would be Antecedent: Gender determines
both your height and liking for basketball. We could draw this as
a path diagram (on board).
When we introduce the control, we split the table into two parts, e.g.,

            Males           Females         Total
            Tall    Short   Tall    Short   Tall    Short
Likes BB    85%     85%     25%     25%     65%     45%
Does Not    15%     15%     75%     75%     35%     55%
Total       100%    100%    100%    100%    100%    100%
In the real world, things are never this
sharp.
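To see how a table like this can arise, here is a Python sketch with made-up counts. The counts are not from the text; they were chosen so that the percentages match the table above. Tall people are mostly male and short people mostly female, so height appears to matter in the total table even though it makes no difference within either gender.

```python
# Hypothetical counts: (likes basketball, does not), keyed by (gender, height).
counts = {
    ("male", "tall"): (170, 30),
    ("male", "short"): (85, 15),
    ("female", "tall"): (25, 75),
    ("female", "short"): (50, 150),
}


def pct_likes(cells):
    """Percent who like basketball among the given cells combined."""
    likes = sum(c[0] for c in cells)
    total = sum(c[0] + c[1] for c in cells)
    return 100 * likes / total


# Within each gender (the control), height makes no difference.
by_gender = {key: pct_likes([cell]) for key, cell in counts.items()}

# Ignoring gender (the total table), tall people seem to like basketball more.
total_tall = pct_likes([counts[("male", "tall")], counts[("female", "tall")]])
total_short = pct_likes([counts[("male", "short")], counts[("female", "short")]])

print(by_gender)
print(total_tall, total_short)
```

When the relationship disappears within every category of the control variable, as it does here, the original relationship was spurious.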
Let's look at some real data, using FEAR
WALK, PLACE SIZE and R.INCOME from the GSS data set:
In the total sample, the low income
respondents are more likely to feel there are areas near them where
they should fear walking. However, this effect disappears for
some of the respondents when we control for the size of the town in
which they live.
To make it a finished Table:

            Small Town or rural Small City          City/Suburb         Total
            Low   Med   Hi      Low   Med   Hi      Low   Med   Hi      Low   Med   Hi
Fear Walk   30%   27%   24%     48%   42%   20%     56%   41%   43%     51%   39%   41%
No Fear     70%   73%   76%     52%   58%   80%     44%   59%   57%     49%   61%   59%
            p = .710            p = .043            p = .000            p = .000
            N = 251             N = 133             N = 1253            N = 1637
To build a more complete causal model of
Fear of Walking at Night, we should introduce more variables.
Some of them may be in our data set, others not.
What variables should we look at?
Variables               Hypotheses
Gender                  Females more fearful than males.
Age                     Elderly more fearful, also Children. Might be curvilinear.
Crime Rate              People in high crime communities
Street Lighting
Freq of Patrols
Graffiti, Broken Windows, Trash, other indicators of an "out of control" neighborhood
Bicycles
Number of Pedestrians
Physical Shape
Training in Self Defense
We can examine some of these variables with our
data. We may find it useful to use regression rather than
cross-tabulation.
We can also use pages 114-122 in the workbook as examples.
March 3:
Causal Analysis - Chapter 5.
The Art and Science of Cause and Effect. (powerpoint)
Probabilistic cause, not an absolute cause, not a cause that is
sufficient or necessary. "Cigarette smoking causes cancer." What we
mean is, smoking cigarettes increases the likelihood of getting cancer.
How much?
There are multiple causes for everything. What we want to find out is
how much each thing contributes. There are also causal linkages, or
indirect causes. A causes B and then B causes C.
Diagramming causal models. We put the dependent variable at the right.
We draw arrows going into it for each causal variable that affects it
directly. Then we can have arrows that go into those causal variables,
further steps in the causal analysis, as in this sample file:
http://crab.rutgers.edu/~goertzel/homomale.htm
Criteria of Causation - how do we know that something is a cause of
something else?
1. Time Order. The cause comes before the effect. Sometimes we sort
out the time order theoretically, e.g., we assume that education
precedes employment. Or we can use a research design that involves
gathering data at two points in time. If you don't have measurements
at two points in time, this is shaky.
2. Correlation. The two variables vary together. When one is high,
the other is high, OR when one is low the other is high. This gets at
the degree of causation: the higher the correlation, the stronger the
causal relationship.
3. Non-spuriousness. We want to know that the correlation is not
caused by something else. We can test this with an experimental
design, if feasible. Or we can use statistical controls, which are not
quite as convincing, but it's all you can do in many cases.
We test for non-spuriousness by introducing controls.
Causal Models: representations of the complex causal relationships
between variables. Variables have different causal roles, but this is
determined by our causal model, it is not inherent in the variables.
One person's cause can be another's effect.
Dependent Variable - that is what we want to explain. Often these are
opinions or behaviors.
Independent Variable - what we use to explain it. Often these are
traits or physical characteristics, e.g., sex or race, which are almost
always independent.
If you study the effect of race on voting, for example, race would be
independent and voting dependent.
Antecedent variables - things that come before the independent
variable. This helps us to deal with a causal chain. The antecedent
variable causes the IV, which causes the DV.
If the antecedent variable "explains" the relationship, we have an
"explanation"; we say the relationship is "spurious".
Intervening Variables - things that intervene between the IV and the
DV, e.g., Race determines ideology which determines the vote.
This is an "interpretation": it tells WHY the causal relationship
exists.
Path
Models: a way of graphically expressing complex causal models.
Example: Determinants
of Adult Homosexuality in White Males.
Example: The Seattle
Social Development Project.
Today we will learn the formula for margins of error for mean scores:
If you need a margin of error for a mean score (an average such as
income in dollars or scores on a test), you need to know the standard
deviation (sd) and the sample size (N). Ignore any other information
you are given, including the size of the population. Use the following
formula:
M = 2 * sd / SQRT(N)
Here is an example question: A study of Rutgers Camden Sociology
Department graduates showed that the mean annual salary was $55,000
with a standard deviation of $3500. Three hundred graduates were
sampled. What is the margin of error for this statistic?
Answer: M = 2 * 3500/SQRT(300) = 7000/17.3205 = $404.15. Note that this
is a dollar amount, since the question was in dollars. It is not a
percentage.
What is the confidence interval for this mean score? The answer is
$55,000 plus or minus $404.15, or $54,595.85 to $55,404.15.
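The salary example above can be checked with a few lines of Python. This is only a sketch of the class rule of thumb M = 2 * sd / SQRT(N); the function names are made up for illustration and are not from the text or the Workbook.

```python
import math

def margin_of_error_mean(sd, n):
    """Class rule of thumb for a mean score: M = 2 * sd / sqrt(n)."""
    return 2 * sd / math.sqrt(n)

def confidence_interval(mean, margin):
    """Sample mean plus or minus the margin of error."""
    return (mean - margin, mean + margin)

# Figures from the Rutgers Camden salary example
m = margin_of_error_mean(3500, 300)
low, high = confidence_interval(55000, m)
print(f"margin of error: ${m:,.2f}")
print(f"confidence interval: ${low:,.2f} to ${high:,.2f}")
```

Changing the 300 to a larger sample size shows how slowly the margin shrinks: you need four times the sample to cut the margin in half.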
The formula for percentages or proportions is:
m = 1/sqrt(n)
March 1:
SAMPLING is used when we are
interested in studying a population that is too large for us to study
each individual. The first step is to define the
population
we wish to make statements about, e.g. adults in New Jersey, probable
voters, people convicted of felonies, graduates of our
department. We might want to study the entire population of the
USA. If we try to collect data from everyone, this is a
census. The Census Bureau does this once every decade, and misses
a lot of people. Everyone else does sampling: we select a
cross-section to represent the population. If you
try to study the whole population, you often fail to do a good job.
Gallup:
How Polls are Conducted.
Size of the sample. How big of a sample do I need?
Size of the sample does not depend on the size of the population.
How do we select the sample size? Decide on the margin of error you
will tolerate. Margin of error is equal to one divided by the square
root of the sample size. With a sample of 400, the square root is 20.
1/20 = .05 or 5%. Suppose you interviewed 400 people: 300 were white,
50 were black and 50 were others. For the blacks, with a sample of 50,
we would have a 14% margin of error. For the whites, with a sample of
300, we would have a 5.8% margin of error.
Take 300: the square root of 300 is 17.32.
1/17.32 = .0577 * 100 = 5.8%
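As a quick check on the sub-group arithmetic above, here is a minimal Python sketch of the rule m = 1/sqrt(n). The function name and the labels are illustrative only.

```python
import math

def margin_of_error_proportion(n):
    """Class rule of thumb for a percentage: m = 1 / sqrt(n), as a proportion."""
    return 1 / math.sqrt(n)

# Sample sizes from the example: the full sample and two sub-groups
for label, n in [("full sample", 400), ("whites", 300), ("blacks", 50)]:
    m = margin_of_error_proportion(n)
    print(f"{label}: n = {n}, margin of error = {m:.1%}")
```

Note that the margin for each sub-group depends only on the number of people in that sub-group, not on the 400 interviewed overall.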
Sample statistic - what the sample says
population parameter - what the real figure is
Even if the sampling is done well, the response rate is less than 100%.
Weighting is done to make the sample more like the population.
This formula is for proportions or percents (if you move the decimal
over two places):
m = 1/sqrt(n)
Solve for n: m^2 = 1/n, so n * m^2 = 1 and n = 1/m^2
If we need a margin of error of 3%, or .03, then n = 1/.03^2 =
1/.0009, or about 1,111.
If you have a sample size and need to know the margin of error, use
m = 1/sqrt(n)
If you are given a margin of error and asked how large a sample you
need, use n = 1/m^2
In these formulas n = the size of the sample (not the population) and
m = the margin of error expressed as a proportion, not as a percent.
Thus, if the question says "we need a margin of error of 5%", then
m = .05.
If our sample is stratified, this means we really have several
sub-samples, and we need the same size sample for each of them,
regardless of each group's size in the population. For example, if we
want to sample white, black and Hispanic respondents and make
statements about each group, we need the same size sample of each
group. Thus, if we need a margin of error of 5% for each of the three
groups, then the answer is 3 * (1/m^2) = 3 * (1/.05^2) = 3 * 400 =
1,200.
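The sample-size rule can be sketched the same way. The function names here are hypothetical, and rounding to the nearest whole person is an assumption made for the sketch, not a rule from the text.

```python
def required_sample_size(m):
    """n = 1 / m^2, with m given as a proportion (e.g. .05 for 5%).

    Rounded to the nearest whole respondent (an assumption of this sketch).
    """
    return round(1 / m ** 2)

def stratified_sample_size(m, groups):
    """Each group needs its own full-size sample, so multiply by the number of groups."""
    return groups * required_sample_size(m)

print(required_sample_size(0.05))       # one group, 5% margin
print(required_sample_size(0.03))       # one group, 3% margin
print(stratified_sample_size(0.05, 3))  # three groups, 5% margin each
```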
If you need a margin of error for a mean score (an average such as
income in dollars or scores on a test), you need to know the standard
deviation (sd) and the sample size (N). Ignore any other information
you are given, including the size of the population. Use the following
formula:
M = 2 * sd / SQRT(N)
Terms:
Margin of Error: How much a sample statistic is likely to vary
from the population parameter. We say that we are 95% sure that
the sample is not off by more than the margin of error. How this
is presented in the NY Times: "19 out of 20" is another way of saying
95%.
Confidence level: we always use a 95% confidence level.
Confidence interval: the range within which we think the true figure
falls, e.g., if the margin of error is 3% and the sample statistic is
67%, the confidence interval is from 64% to 70%. We are 95% sure that
the true figure is within these limits.
All of this assumes a simple random sample, which means that each
person (or other sampling unit) in the population has the same chance
of appearing in the sample. In practice, however, we often do not
use simple random samples, for several reasons:
- We may not have a list of the population. If we do not, we
first divide the population into sub-groups of some kind (census
tracts, blocks, classrooms, organizations, depending on the nature of
the study). We then sample the subgroups and list the populations in
them. This is called cluster sampling.
- We may be interested in differences between sub-groups of the
population and need to make sure we have enough people in each of them.
In this case we select random samples of each of the relevant
sub-groups, and weight the results appropriately. This is called
stratified sampling.
- Sometimes we just go down a list, taking names at a fixed interval,
which is called systematic sampling. This gives the same results as
simple random sampling, unless there is some systematic ordering to the
list that causes a distortion.
- Sometimes we use non-random or "quota" sampling. This is
done for convenience, or because we just want to know what the range of
differences is without putting numbers on them.
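To make the difference between these sampling designs concrete, here is a small Python sketch. Everything in it is hypothetical: the population is just a list of ID numbers, the three strata are invented, and the function names are for illustration only. Cluster and quota sampling are left out to keep it short.

```python
import random

def simple_random_sample(population, n, rng):
    """Every unit has the same chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random starting point, then take every k-th unit down the list."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate simple random sample of the same size from each sub-group."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(2005)        # fixed seed so the sketch is repeatable
population = list(range(1000))   # hypothetical ID numbers standing in for people
strata = {                       # hypothetical sub-groups of very different sizes
    "group_a": population[:700],
    "group_b": population[700:900],
    "group_c": population[900:],
}

srs = simple_random_sample(population, 50, rng)
systematic = systematic_sample(population, 50, rng)
stratified = stratified_sample(strata, 50, rng)

print(sorted(srs)[:5])
print(systematic[:5])
print({name: len(sample) for name, sample in stratified.items()})
```

The stratified design returns 50 people from each group even though the groups differ greatly in size, which is why the results must be weighted before describing the whole population.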
Feb 24: midterm exam. Grades are in WEBCT.
Grading formulas:
Quizzes and Assignments =
([Microcase Intro]+[Workbook 1]+[Workbook 2a]+[Workbook
2b]+[PercentQuiz]+[Workbook 3]+[Enrolling])/7
Grade on the Midterm=[Midterm One Stats]*0.25+[Midterm One Multiple
Choice]*0.75
Estimated Course Grade = ([Attendance]*0.1+[Quizzes and
Assignments]*0.2+[Midterm One Total]*0.7)
Feb 22 was the review for the midterm.
Feb 17: Why do we gather statistics? One reason is to
make policy decisions. We decide whether policies we are
following are effective by gathering statistical data. An example
is a book on "The Crime Drop in America" which is based on crime
statistics. It is very difficult to establish WHY the trends are as
they are. Often it is discouraging, e.g., giving out
speeding tickets does not cut traffic accidents.
Another purpose is for evaluating local efforts, a policy that is often
referred to as "compstat", which is just an abbreviation for "computers
and statistics". What this means is that police units are evaluated
according to the statistics in their precinct or other
jurisdiction. New York City and Philadelphia have done a lot of this.
San Diego had an exemplary web site communicating this data to the
public, but its mapping software doesn't seem to be up at the moment.
Camden let its mapping system go dead and has promised to get it
started again. Some good work is being done by community
groups, who have produced some power points on Camden Crime and the
Camden police that are on our WEBCT site.
Links to sources of crime statistics are on a separate page.
Feb 15:
We did an in-class exercise with the Keirsey Temperament Sorter.
Here are some hypotheses we can test:
| | Sociology | Criminal Justice | total |
| Thinking: expected percent | 40% | 60% | |
| Thinking: observed percent of majors who are thinking | 37.5% | 33.3% | |
| Thinking: observed frequencies | 3 | 6 | 9 |
| Thinking: expected based on the 40/60 hypothesis | 3.2 | 10.8 | |
| Thinking: expected based on the null hypothesis of no difference | 2.8 | 6.2 | |
| Feeling: expected percent | 60% | 40% | |
| Feeling: observed percent of majors who are feeling | 62.5% | 66.7% | |
| Feeling: observed frequencies | 5 | 12 | 17 |
| Feeling: expected based on the 40/60 hypothesis | 4.8 | 7.2 | |
| Feeling: expected based on the null hypothesis of no difference | 5.2 | 11.8 | |
| Total | 100% (8 people) | 100% (18 people) | 26 |
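The two kinds of expected frequencies in the Sociology / Criminal Justice table can be recomputed with a short Python sketch. The observed counts and the 40/60 hypothesis come from the table; the variable and function names are made up for illustration.

```python
# Observed frequencies from the in-class exercise
observed = {
    "Sociology": {"Thinking": 3, "Feeling": 5},
    "Criminal Justice": {"Thinking": 6, "Feeling": 12},
}
# Hypothesized proportions: sociology 40/60, criminal justice 60/40
hypothesized = {
    "Sociology": {"Thinking": 0.40, "Feeling": 0.60},
    "Criminal Justice": {"Thinking": 0.60, "Feeling": 0.40},
}

column_totals = {major: sum(cells.values()) for major, cells in observed.items()}
row_totals = {t: sum(observed[m][t] for m in observed) for t in ("Thinking", "Feeling")}
grand_total = sum(column_totals.values())

def expected_under_hypothesis(major, temperament):
    """Hypothesized proportion times the number of students in that major."""
    return hypothesized[major][temperament] * column_totals[major]

def expected_under_null(major, temperament):
    """Row total times column total divided by grand total (no difference between majors)."""
    return row_totals[temperament] * column_totals[major] / grand_total

for major in observed:
    for temperament in ("Thinking", "Feeling"):
        print(major, temperament,
              "observed:", observed[major][temperament],
              "hypothesis:", round(expected_under_hypothesis(major, temperament), 1),
              "null:", round(expected_under_null(major, temperament), 1))
```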
| | Bush | Kerry |
| Thinking | 1 | 3 |
| Feeling | 4 | 20 |
In this second case, it is obvious that there are not enough "Bush"
voters to provide an adequate sample. The bias in the "feeling"
direction is strong.
These hypotheses relate to findings from the
Alumni Survey done last semester.
Scaling or
index construction is when we use a number of items, such as
questionnaire items, to measure a more general
concept. We can do this by adding them up (in which case your
text would call it an "index", although many people still use the term
scale), or by ordering them from lowest to highest (in which case
it is a true scale as the term is used in your book). Your test
is an example. I just add up the points, to measure the general
variable "knowledge of research methods as covered in the first part of
the course." Another approach would be to rank the items from
easy to hard and see which you could do. This is tricky, because
some people can do the hard ones and not the easy ones. When we
make an index or scale, we get measures that can be treated as
interval, even if they are not strictly interval. Scaling methods
can be more precise, but these are not used as often in sociology or
CJ because they are more difficult and the added information is not
always needed.
Scaling methods include Thurstone and Guttman Scaling. Likert or
summative scaling is actually a method of "index" construction as
defined in our book. A powerpoint on Thurstone scaling.
For example, we could scale the seriousness of crimes. There are
various methods of measuring this. One method, paired comparisons,
means asking a sample of people to compare pairs of crimes based on
their perceived seriousness.
A very popular test is the Myers-Briggs Type Indicator, based on
Jungian personality theory. You can take a free version online.
Another is the Keirsey Temperament Sorter.
Feb 10
Reliability - you get the same thing over and over. Consistency.
Inter-rater - two different raters get the same answer.
Test-retest - if you take it twice the answers are the same.
Internal consistency - are the items on a test consistent?
Cronbach's alpha is a statistic that measures inter-item reliability.
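For those curious how an inter-item reliability statistic is computed, here is a minimal Python sketch using the standard textbook formula for Cronbach's alpha: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The item scores below are invented for illustration; they are not class data, and this formula was not covered in class.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per respondent."""
    k = len(item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]  # each respondent's total
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical answers of five respondents to three questionnaire items (1-5 scale)
items = [
    [4, 5, 2, 3, 1],
    [4, 4, 2, 3, 2],
    [5, 5, 1, 3, 1],
]
print(round(cronbach_alpha(items), 2))
```

Values near 1 mean the items move together; values near 0 mean they do not seem to measure the same thing.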
Validity - is it "really" measuring what it is supposed to measure?
Face validity - does it look right?
Predictive or criterion validity - does it predict what we want to
predict, some "true" measure? The SAT test predicts college or law or
medical school grades.
Convergent validity - do several measures give the same result?
Construct validity - does the measure perform as our theory says it
should? We use this when we have no criterion. This is the most
difficult; it is used when things are inherently difficult to measure.
We will examine some materials in a chapter called "Connecting
Conceptualization and Measurement" that will be distributed in class.
An example: a study of UFO Abduction Status.
February 8
Measurement means putting observations into categories. Often
these categories are given numbers, although not always.
Sometimes we do this just to keep track of things, e.g., each American
has a social security number, we have a library number, a student
number, etc. But often the numbers give us more information than
that, e.g., the NJ driver's license gives height in feet and
inches. It also gives sex and eye color, which are described in
words but could be given arbitrary numbers. But the numbers given
for height are not arbitrary. In some sciences, e.g., astronomy,
numerical measurement has led to important insights, e.g., to
understanding the motion of the planets.
This is because our observations can be summarized with mathematical
equations that enable us to predict events.
When we measure something, we need to be clear exactly what the
measure means. Especially when we use a number, we want to know
what it means. What
is a number? It is not so obvious as one might think.
Bertrand Russell said "A number is the class of all classes similar to
a given class." I.e., all sets of three have something in common,
which we could call "threeness."
Levels of Measurement. What is our measurement really saying about the
relationship between the values?
Dichotomous Measurement - Two and only two categories. This can be a
natural dichotomy or a set of "dummy variables" - we take a complex
variable and divide it into a series of dichotomous variables.
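Here is a minimal Python sketch of what "dividing a complex variable into a series of dichotomous variables" looks like in practice. The function name and the short list of answers are hypothetical.

```python
def dummy_variables(values):
    """Turn one nominal variable into a set of 0/1 (dichotomous) variables."""
    categories = sorted(set(values))
    return [{category: int(value == category) for category in categories}
            for value in values]

# Hypothetical answers to a "religion" question
religions = ["Catholic", "Protestant", "Jewish", "Catholic"]
for person, dummies in zip(religions, dummy_variables(religions)):
    print(person, dummies)
```

Each person scores 1 on exactly one of the new variables and 0 on the rest, so every dummy variable is a simple yes/no dichotomy.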
Nominal Measurement. Categories that could be put in any order.
Catholic, Protestant, Jewish, Moslem, LDS, Buddhist, Episcopalian,
Baptist. This list mixes two variables: variable one is the category of
religion, variable two is the denomination.
Mental illnesses (DSM-IV), e.g., adjustment disorder, borderline
personality disorder, paranoid schizophrenic.
Crimes: burglary, assault, murder. What do these terms mean? Look at
the US Criminal Code.
Each individual should go into one and only one category on a variable,
one value on a variable. For example: What is your favorite food? We
have a long list, but each person is allowed only one.
Sorting people into categories must be reliable and accurate or valid.
Ordinal Measurement. Here we have categories in a logical order: very
short, short, medium, tall, very tall. Often we take continuous
variables and make them ordinal. Income: under $20,000; $20,000 to
$40,000; $40,000 to $60,000; $60,000 plus.
Interval Measurement: temperature in Fahrenheit or Centigrade. 0
degrees is not the absence of heat. How about the day that the
"temperature doubled" in New York City?
Ratio Measurement: Income in dollars: a continuous numerical value PLUS
a meaningful zero point. Height in inches.
Scaling is when we use a number of measures, such as test scores or
questionnaire items, to measure a more general concept. This often
allows us to move to a higher level of measurement. For example, we can
add up test score items (in which case your text would call it an
"index", although many people still use the term scale), or they may be
ordered from lowest to highest (in which case it is a true scale as the
term is used in your book). Your test is an example. I just add up the
points, to measure the general variable "knowledge of research methods
as covered in the first part of the course." Another approach would be
to rank the items from easy to hard and see which you could do. This is
tricky, because some people can do the hard ones and not the easy ones.
When we make an index or scale, we get measures that can be treated as
interval, even if they are not strictly interval. Scaling methods
can be more precise, but these are not used much in sociology or
CJ. For example, we could scale the seriousness of crimes. There are
various methods of measuring this. One method, paired comparisons,
means asking a sample of people to compare pairs of crimes based on
their perceived seriousness.
One of the reasons we have to be clear about levels of measurement is
that the statistics we use depend on how the data are measured.
Statistics for Nominal Data: Percentages and Chi Square. The
percentages are descriptive (they summarize our data), the chi square
is inferential (it tells us if we can generalize from our
sample). Survey data usually produces nominal (or
ordinal) statistics. Cramer's V is a correlation
coefficient for nominal data. Scores on it vary from 0 to 1, but there
are no negatives since the data are not ordered.
Statistics for Ordinal Data: The
median is the only statistic we have covered that is specifically
designed for ordinal data - it finds the case in the middle once all
the cases are sorted in order. There are correlation coefficients
for ordinal data which you can find on the "statistics" page for
crosstabulations (gamma, tau) but it is more common to use interval
statistics (Pearson's r) or nominal ones (Cramer's V) with ordinal data.
Statistics for Interval
Data: Scattergrams, means, standard deviations, correlation
coefficients. Tests of statistical significance for correlations.
February 3:
Today we will begin with Amar
Patel's Chi-Square lesson. This covers the concept of
expected frequencies and observed frequencies, and introduces the
concept of "fairness", the difference statistic and the chi-square
statistic. These are applied to problems where the expected
frequencies are given by a null hypothesis of "fairness".
We can apply this to any distribution where we have a theoretical
reason to expect a certain result. E.g., with two dice, each with
six sides. What results are possible and what likelihood do we
have?
- 2: * (1 and 1) Snake-eyes!
- 3: ** (1 and 2; 2 and 1)
- 4: *** (1 and 3; 3 and 1; 2 and 2)
- 5: **** (1 and 4; 4 and 1; 3 and 2; 2 and 3)
- 6: ***** (1 and 5; 5 and 1; 4 and 2; 2 and 4; 3 and 3)
- 7: ****** (4 and 3; 3 and 4; 5 and 2; 2 and 5; 6 and 1; 1 and 6)
- 8: ***** (4 and 4; 5 and 3; 3 and 5; 6 and 2; 2 and 6)
- 9: **** (5 and 4; 4 and 5; 6 and 3; 3 and 6)
- 10: *** (5 and 5; 6 and 4; 4 and 6)
- 11: ** (6 and 5; 5 and 6)
- 12: * (6 and 6) Boxcars!
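The list of possible results can be generated rather than written out by hand. This Python sketch simply enumerates all 36 combinations of two six-sided dice and counts how many ways each total can come up.

```python
from collections import Counter
from itertools import product

# Every ordered pair (first die, second die): 6 * 6 = 36 combinations
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    stars = "*" * ways[total]
    print(f"{total:2d}  {stars:6s}  {ways[total]}/36")
```

The number of ways for each total is also the expected frequency for that total in 36 fair rolls.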
Suppose we try real dice 36 times and see what we get:
| Total | Expected | Observed |
| 2 | 1 | |
| 3 | 2 | |
| 4 | 3 | |
| 5 | 4 | |
| 6 | 5 | |
| 7 | 6 | |
| 8 | 5 | |
| 9 | 4 | |
| 10 | 3 | |
| 11 | 2 | |
| 12 | 1 | |
We can compute the chi-square with Graph Pad QuickCalcs on the
Internet.
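The chi-square statistic itself is simple enough to compute directly: for each category, take (observed - expected) squared, divide by expected, and add the results up. The Python sketch below does this for the dice example. Since the observed column was filled in during class, the observed rolls here are simulated by the computer and are purely hypothetical.

```python
import random
from collections import Counter

# Expected frequencies for 36 rolls of two fair dice: 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1
expected = {total: 6 - abs(7 - total) for total in range(2, 13)}

def chi_square(observed, expected):
    """Sum over all categories of (observed - expected)^2 / expected."""
    return sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())

# Hypothetical data: 36 simulated rolls of two dice (not the rolls from class)
rng = random.Random(36)
rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(36)]
observed = Counter(rolls)

statistic = chi_square(observed, expected)
print("observed:", dict(sorted(observed.items())))
print("chi-square:", round(statistic, 2))
# With 11 categories there are 10 degrees of freedom; the .05 critical
# value for 10 degrees of freedom is 18.31.
```

A statistic above the critical value would suggest the dice are not behaving "fairly"; with only 36 rolls the expected frequencies are small, so this is best treated as an illustration.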
We will then apply the same statistic to crosstabulations where the
expected frequencies are determined by the marginal frequencies.
Last class we calculated expected frequencies; see the notes below.
For a criminal justice example, consider the study of racial
profiling by the San Diego police.
February 1: We will go over the examples on pages 47-53 in the
Workbook, as well as Exercise 2b.
We will also introduce the concept of Expected Frequencies. For
this purpose we will use a simple 2 by 2 distribution as follows.
The
variables are gender and opinion on an issue, each of which has two
values:
25 men agreed
17 men disagreed
65 women agreed
30 women disagreed
| Observed Frequencies or Obtained Frequencies | Men | Women | total |
| Agree | 25 | 65 | 90 |
| disagree | 17 | 30 | 47 |
| total | 42 | 95 | 137 |
We can compute expected frequencies, based on the null hypothesis that
men and women do not differ in their opinions. We can compute these
knowing only the marginal or total frequencies. The easy way to
compute them is to multiply the row total for each cell by the column
total for that cell, then divide by the grand total. Another way
would be to convert the row totals to proportions, then multiply them
by the column totals. Expected Frequencies = rt * ct / gt
| Expected Frequencies | men | women | total |
| agree | 90*42/137=27.59 | 90*95/137=62.41 | 90 |
| disagree | 47*42/137=14.41 | 47*95/137=32.59 | 47 |
| total | 42 | 95 | 137 |
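The rt * ct / gt rule can be sketched in Python as follows. The observed counts are the ones from the gender and opinion example; the variable names are illustrative only. The last lines also compute the chi-square statistic for this table, using the same (observed - expected)^2 / expected sum as in the dice problem.

```python
# Rows: agree, disagree. Columns: men, women.
observed = [
    [25, 65],
    [17, 30],
]

row_totals = [sum(row) for row in observed]        # 90, 47
col_totals = [sum(col) for col in zip(*observed)]  # 42, 95
grand_total = sum(row_totals)                      # 137

# Expected frequency for each cell = row total * column total / grand total
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

for label, row in zip(["agree", "disagree"], expected):
    print(label, [round(cell, 2) for cell in row])

# Chi-square: sum of (observed - expected)^2 / expected over all four cells
chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))
print("chi-square:", round(chi_square, 2))
```

Notice that the expected frequencies add up to the same row and column totals as the observed ones.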
What would we get if we used the expected frequencies to make
a column percentage table? The percentages would be the
January 27: We spent most of the class on descriptive
statistics. The required
reading is on the Internet and will be distributed in class on
February 1. You should know how to do frequency distributions and
how to calculate means and standard deviations exactly as explained in
the reading. (Do not group values together as we did in
class). There are mean and standard deviation items at the end of
the Percents and Expected Frequencies quiz, which is now open.
January 25:
How does social science
differ from other
ways of
thinking: poetry, philosophy, theology, physical or biological
sciences, history, journalism?
How would
we divide up fields of study? Physical Science,
Social Science, Humanities? Science, Art and Morality? Or,
in
Greek, Episteme, Techne, Phronesis: Three
approaches
to knowledge. At Rutgers Camden we divide knowledge up
differently: Rutgers
Camden requirements. How does social science differ from the
other
categories? Some sociologists like to think of us as a science
similar to chemistry or physics, others see us as closer to history or
journalism. The latter conceptions might make us exempt from
human subjects regulations, if we are not doing research aimed at
generalization. But we do not want to give up the hope of
establishing generalizations.
Social science begins with concepts, as do other fields such as
philosophy and even mathematics, if we recognize that numbers are
concepts. The small integers are especially important, especially Zero
and One (or nothing and something). Religion may also start with
concepts. The Bible says "In the beginning was the Word, and the Word
was with God, and the Word was God." What does that mean? Ask a
theologian. Religious concepts are good if they provoke spiritual
reflection, as in reciting a Mantra in Buddhism. Literary concepts are
good if they are beautiful, which social sciences seldom are. W.H.
Auden's poem Under Which Lyre is an aesthetic attack on social science
and other applied sciences. Social science may not appeal to poets, but
it can provide objective evidence of important points.
Florence Nightingale used social research to advocate for better
nursing care in the British armed forces during the Crimean War. She
was a pioneer of statistical graphics and popularized a form of pie
chart, the polar area diagram. Felton Earls and his colleagues used a
combination of research methods to study the causes of urban crime.
Their organizing concept was "collective efficacy".
In Social Science, a concept is good if it helps us to understand
empirical reality. A good concept leads to useful generalizations or
theories. Theories are general statements about relationships between
concepts that reflect how people think and behave. A good concept can
also be operationalized, which means finding indicators to measure it.
A very common way of operationalizing a concept is to write a survey
question. Other concepts may be operationalized by observation, by
physical measurement or by counting things. In criminal justice,
concepts are often operationalized by having police officers fill out
reports on incidents. We can find a good list of sociological concepts
by going to survey research archives, where concepts are translated
into survey questions. Check the General Social Survey and the Eagleton
poll. Criminal justice concepts can be found on the Bureau of Justice
Statistics Web site.
There are also bad concepts. For an example of one I
think
is bad, click on virtropy. What's wrong with this
concept?
Recently there has been some controversy over "race" as a
concept. Some people say races do not "really" exist.
Biologically, that is true if by "exist" you mean that people fall
into distinct categories. Physical differences exist with regard
to skin
color and other traits, but they are distributed continuously, not in
distinct
categories. Sociologically, racial differences exist and are
important. The people who say they do not "exist" are usually
in favor of using them for affirmative action programs, or even for
reparations, so they concede that they have sociological meaning.
That
meaning differs from society to society, and may change over
time. The
growth of the Hispanic population in the US
is forcing a change in how we think about this. Census
Racial Categories. Census
Document
on Racial and Ethnic Categories. Racial categories in
Latin
America.
Other concepts we can consider are: poverty, power, crime, murder,
race, IQ, liberalism/conservatism, homelessness. Or we could look at
Personality Types as defined by Carl Jung and measured by Isabel Briggs
Myers.
January 20:
We began with the Microcase software. If you
miss class today, you should work through pages 1 to 11 in the Workbook
on your own.
Computation of percentages. A percent is calculated on a given
base. In a cross-tabulation the base can be the row total,
the column total or the grand total. To get the percents,
you first put the frequencies in a two by two table. Then you
compute the row, column and grand totals. Then you use the totals
as the base to compute the appropriate percent.
| Observed Frequencies | Male | Female | total |
| Agree | 55 | 79 | 134 |
| Disagree | 89 | 47 | 136 |
| totals | 144 | 126 | 270 |
____ % OF THE MEN AGREED: number of men who agreed / the number of
men * 100 = 55/144 * 100 = 38.2%
____ % OF THE WOMEN AGREED: 79/126 * 100 = 62.7%
____ % OF THE PEOPLE WHO AGREED WERE MEN: number of men who agreed /
the number of people who agreed * 100 = 55/134 * 100 = 41.0%
____ % OF THE RESPONDENTS WERE AGREEABLE MEN: number of men who
agreed / the number of respondents * 100 = 55/270 * 100 = 20.4%
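The three kinds of percentages differ only in the base they are divided by. Here is a minimal Python sketch using the frequencies from the table above; the function names are made up for illustration. In this table sex is the column variable, so "percent of the men who agreed" is a column percent.

```python
# Observed frequencies: rows are opinions, columns are sexes
table = {
    "Agree":    {"Male": 55, "Female": 79},
    "Disagree": {"Male": 89, "Female": 47},
}

row_totals = {row: sum(cells.values()) for row, cells in table.items()}
column_totals = {col: sum(table[row][col] for row in table) for col in ("Male", "Female")}
grand_total = sum(row_totals.values())

def column_percent(row, col):
    """Base = column total, e.g. percent of the men who agreed."""
    return table[row][col] / column_totals[col] * 100

def row_percent(row, col):
    """Base = row total, e.g. percent of the people who agreed who were men."""
    return table[row][col] / row_totals[row] * 100

def total_percent(row, col):
    """Base = grand total, e.g. percent of all respondents who were agreeable men."""
    return table[row][col] / grand_total * 100

print(f"{column_percent('Agree', 'Male'):.1f}% of the men agreed")
print(f"{column_percent('Agree', 'Female'):.1f}% of the women agreed")
print(f"{row_percent('Agree', 'Male'):.1f}% of the people who agreed were men")
print(f"{total_percent('Agree', 'Male'):.1f}% of the respondents were agreeable men")
```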
January 18: We went over the syllabus, schedule
and assignments page, enrolling
assignment, and the use of WEBCT.
All "quizzes" should be taken for the first time at least two days
before they are due. No allowance will be made for technical
difficulties if you wait until the last day to try the
quiz. Please remember to sign the attendance sheet each
day. I allow three missed classes for good reasons such as
illness and funerals. It is not necessary to bring excuses until
your excused absences exceed three. Students who add the class
late and miss the first class or two have used up some of their excused
absences and are responsible for completing all assignments on
time.