Here are some points covered with multiple choice questions:
Content Analysis - uses "unobtrusive data": data created by a bureaucratic system (e.g., police records) or, often, by the media. We study television or newspapers either because the media themselves are our interest or as a way of getting information, e.g., on crime reported in the news.
Similar to survey research, except that you do coding instead of interviewing. Coding means that you assign numbers to phenomena that you observe. Counting things. Each of your variables is coded from the published information.
Conceptualization.
Measurement. Reliability and Validity.
Experimental Research. Experimental Designs. See the graphs in the book or on Trochim's WEB site: Types of Designs.
Essential characteristics:
treatment is compared to a placebo. These experiments are usually "double-blind," to control for the psychological effects of knowing one is getting treatment. This is a way of controlling subject bias and experimenter bias.
the new. This didn't work very well; there were errors in the group assignments and the women often forgot which group they were in anyway.
The Review Glossary is not adequate as a guide to this chapter. Some points to be covered:
sociologist) to get her own postage stamp, won fame through field work, primarily her book Coming of Age in Samoa. Later, this book was denounced by anthropologist Derek Freeman in his book Margaret Mead and the Heretic: The Making and Unmaking of an Anthropological Myth. Anthropologists have come to Mead's defense, and have restudied the case, but I would have to agree with your text that "had Mead come back from Samoa with an accurate ethnographic report, it would not have made her famous." Here is the NY Times Review of Freeman's critique of Mead.
openly done as a literary form; in other cases, such as that of Rigoberta Menchu, it is only admitted when critics discover it. The Rigoberta Menchu Controversy by Arturo Arias.
trying to discern patterns in the family interactions that contributed to the illness. Myra Bluebond-Langner's book The Private Worlds of Dying Children has been very influential; she has just published a sequel called In the Shadow of Illness: Parents and Siblings of the Chronically Ill Child.
Field research offers a richness of description and a possibility of new insights that are unparalleled by any other method. Unless it is supplemented with other methods, it does not provide statistical data, and it is hard to replicate.
Research Design. How research is organized or structured to accomplish different ends.
| Purpose of Study | Preferred Design | Advantages/Disadvantages |
| Exploration - To get some new ideas, or at least ideas that are new to you. | 1. Literature Review - library research. 2. Secondary Analysis - using data that is already being collected by a country, a government office, or a company. Criminal justice systems generate a lot of data for their own purposes. You are limited to the questions someone else designed and asked. 3. Field Observation - go into the natural setting and observe what is going on. You may talk to people and ask questions as well, but the really unique aspect is observation. 4. Focus Groups - group interviews lasting about an hour and a half. 5. Case Studies - based on documents, interviews or sometimes observations. | 1. Get insights of others. Avoid reinventing the wheel. / Tends to repeat the past, not generate new ideas. 2. Access a tremendous amount of information quickly and cheaply. / Limited to the questions asked by others. 3. Get new insights in natural setting. / Difficult and time consuming, small sample. Access difficult. 4. Detailed, inductive, subtle understanding of patterns. / Difficult to generalize. |
| Description - To get accurate and relatively precise information, especially about large groups or | 1. Secondary Analysis - data banks of surveys are available, many other kinds of data also. 2. Surveys - questionnaires or interviews, often on the telephone. 3. Content Analysis - looking at media as a source of data: TV shows, letters to the editor, newspaper articles, written documents. You can go back in time. | 1. Excellent data, especially for trends over time. / Limited to questions asked by others. 2. Ask your own questions, choose your own sample. / Time consuming or expensive. Limited to topics people can answer accurately. 3. Unobtrusive, allows study of media. / Limited to topics that involve published media. |
| Explanation - To answer questions about cause and effect. | 1. Experiment - in an experiment we manipulate the independent variable. The independent variable is the "cause". Then we measure the dependent variable, or "effect", both before and after, on the experimental and control groups. 2. Multivariate Statistical Analysis of Survey Data. | 1. Best method of proving causal relationships. / Hard to maintain rigor of design (internal validity) and to generalize beyond the limits of the experiment (external validity). Serious ethical and practical limitations. 2. Can use survey and secondary data and address a wide range of important topics. / Data sets must include good measures of all relevant variables and a wide range of data. Not valid unless the models can be shown to predict trends in fresh data. Most useful for making predictions to be evaluated with fresh data. |
For the exam, you should know how to set up the regression equations to fit a path diagram. The rules are as follows:
alienation from government = status deficiency
alienation from society = status deficiency
Today we will look at testing causal hypotheses. On page 93 in the text, we have the example of the relationship between Height and Liking Basketball. These are an IV and a DV. An obvious TEST VARIABLE is Gender. This would be an antecedent variable: gender determines both your height and your liking for basketball. We could draw this as a path diagram (on board).
When we introduce the control, we split the table into two parts, e.g.,
| | Males: Tall | Males: Short | Females: Tall | Females: Short | Total: Tall | Total: Short |
| Likes BB | 85% | 85% | 25% | 25% | 65% | 45% |
| Does Not | 15% | 15% | 75% | 75% | 35% | 55% |
| Total | 100% | 100% | 100% | 100% | 100% | 100% |
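To see how gender can produce the whole Tall/Short difference, here is a minimal Python sketch. The cell counts are hypothetical (the text gives only percentages); I chose them so that the percentages match the table. The names `counts` and `likes_by_height` are mine, for illustration only.

```python
# Hypothetical counts, chosen only so the percentages match the table above.
counts = {
    ("male", "tall"): {"likes": 170, "does_not": 30},
    ("male", "short"): {"likes": 85, "does_not": 15},
    ("female", "tall"): {"likes": 25, "does_not": 75},
    ("female", "short"): {"likes": 50, "does_not": 150},
}


def likes_by_height(gender=None):
    """Percent who like basketball among tall and short people.

    With gender=None we pool everyone (the Total columns);
    otherwise we look only at that gender (the control tables).
    """
    result = {}
    for height in ("tall", "short"):
        cells = [
            cell
            for (g, h), cell in counts.items()
            if h == height and gender in (None, g)
        ]
        likes = sum(c["likes"] for c in cells)
        total = sum(c["likes"] + c["does_not"] for c in cells)
        result[height] = 100 * likes / total
    return result


total = likes_by_height()
males = likes_by_height("male")
females = likes_by_height("female")

print("Total:  ", total)    # tall and short differ
print("Males:  ", males)    # no difference by height
print("Females:", females)  # no difference by height
```

In the pooled sample the tall respondents look more likely to like basketball, but only because two thirds of the (hypothetical) tall people are male. Within each gender the difference disappears, which is what we mean by a spurious relationship.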
In the real world, things are never this sharp.
Let's look at some real data, using FEAR WALK, PLACE SIZE and R.INCOME from the GSS data set:
In the total sample, the low income respondents are more likely to feel there are areas near them where they should fear walking. However, this effect disappears for some of the respondents when we control for the size of the town in which they live.
To make it a finished Table:
| | Small Town or Rural: Low | Med | Hi | Small City: Low | Med | Hi | City/Suburb: Low | Med | Hi | Total: Low | Med | Hi |
| Fear Walk | 30% | 27% | 24% | 48% | 42% | 20% | 56% | 41% | 43% | 51% | 39% | 41% |
| No Fear | 70% | 73% | 76% | 52% | 58% | 80% | 44% | 59% | 57% | 49% | 61% | 59% |
| p | .710 | | | .043 | | | .000 | | | .000 | | |
| N | 251 | | | 133 | | | 1253 | | | 1637 | | |
To build a more complete causal model of Fear of Walking at Night, we should introduce more variables. Some of them may be in our data set, others not.
What variables should we look at?
| Variables | Hypotheses |
| Gender | Females more fearful than males. |
| Age | Elderly more fearful, also children. Might be curvilinear. |
| Crime Rate | People in high crime communities |
| Street Lighting | |
| Freq of Patrols | |
| Graffiti, Broken Windows, Trash, other indicators of an "out of control" neighborhood | |
| Bicycles | |
| Number of Pedestrians | |
| Physical Shape | |
| Training in Self Defense | |
We can examine some of these variables with our data. We may find it useful to use regression rather than cross-tabulation.
We can also use pages 114-122 in the workbook as examples.
The Art and Science of Cause and Effect. (powerpoint)
Probabilistic cause, not an absolute cause, not a cause that is sufficient or necessary. "Cigarette smoking causes cancer." What we mean is that smoking cigarettes increases the likelihood of getting cancer. How much?
There are multiple causes for everything. What we want to find out is how much each thing contributes. There are also causal linkages, or indirect causes: A causes B and then B causes C.
Diagramming causal models. We put the dependent variable at the right. We draw arrows going into it for each causal variable that affects it directly. Then we can have arrows that go into those causal variables, adding earlier steps to the causal analysis, as in this sample file:
http://crab.rutgers.edu/~goertzel/homomale.htm
Criteria of Causation - how do we know that something is a cause of something else?
1. Time Order. The cause comes before the effect. Sometimes we sort out the time order theoretically: we assume that education precedes employment. Or we can use a research design that involves gathering data at two points in time. If you don't have measurements at two points in time, this is shaky.
2. Correlation. The two variables vary together. When one is high, the other is high (a positive correlation), or when one is low, the other is high (a negative correlation). This gets at the degree of causation: the higher the correlation, the stronger the causal relationship.
3. Non-spuriousness. We want to know that the correlation is not caused by something else. This can be tested rigorously with experimental designs, when feasible. But with most sociological or criminal justice problems experimental rigor is not possible, so we may use statistical controls as an alternative. This is much less rigorous, but often all we can do is see whether the relationship holds up when we control for other variables that might account for it.
Causal Models: representations of the complex causal relationships between variables. Variables have different causal roles, but this is determined by our causal model; it is not inherent in the variables. One person's cause can be another's effect.
Dependent Variable - what we want to explain. Often these are opinions or behaviors.
Independent Variable - what we use to explain it. Often these are traits or physical characteristics, e.g., sex or race, which are almost always independent. If you study the relationship of race to voting, for example, race would be independent and voting dependent.
Antecedent Variables - things that come before the independent variable. This helps us to deal with a causal chain: the antecedent variable causes the IV, which causes the DV. If the antecedent variable "explains" the relationship, we have an "explanation", and we say the relationship is "spurious".
Intervening Variables - things that come between the independent and dependent variables, e.g., race determines ideology, which determines the vote. This is an "interpretation": it tells WHY the causal relationship exists.
Path Models: a way of graphically expressing complex causal models.
Example: Determinants of Adult Homosexuality in White Males.
Example: The Seattle Social Development Project.
Take a sample of 300: the square root of 300 is 17.32.
1/17.32 = .0577, and .0577 * 100 = 5.8%
m = 1/sqrt(n)
Solve for n: m^2 = 1/n, so n * m^2 = 1, so n = 1/m^2
If we need a margin of error of 3%, or .03: n = 1/.03^2 = 1/.0009 = 1,111
If you have a sample size and need to know the margin of error, use m = 1/sqrt(n)
If you are given a margin of error and asked how large a sample you need, use n = 1/m^2
In these formulas, n = the size of the sample (not the population), and m = the margin of error expressed as a proportion, not as a percent. Thus, if the question says "we need a margin of error of 5%", then m = .05.
If our sample is stratified, this means we really have several sub-samples and we need the same size sample for each of them, regardless of each group's size in the population. For example, if we want to sample white, black and Hispanic respondents and make statements about each group, we need the same size sample for each of them regardless of their size in the population. Thus, if we need a margin of error of 5% for each of the three groups, then the answer is 3 * (1/m^2) = 3 * (1/.05^2) = 3 * 400 = 1,200.
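The two rules of thumb above can be checked with a small Python sketch. The function names are mine, and rounding the sample size up to a whole respondent is my assumption; the notes give only the formulas m = 1/sqrt(n) and n = 1/m^2.

```python
import math


def margin_of_error(n):
    """Approximate 95% margin of error (as a proportion) for a sample of size n."""
    return 1 / math.sqrt(n)


def sample_size(m):
    """Sample size needed for margin of error m (a proportion, e.g. 0.05).

    Rounded up to a whole respondent; the inner round() guards against
    floating-point noise such as 399.99999999999994.
    """
    return math.ceil(round(1 / m ** 2, 6))


def stratified_sample_size(m, groups):
    """Each group needs its own full-size sub-sample."""
    return groups * sample_size(m)


print(round(margin_of_error(300) * 100, 1))   # margin of error in percent for n = 300
print(sample_size(0.03))                      # sample needed for a 3% margin
print(stratified_sample_size(0.05, 3))        # three groups at 5% each
```

Remember to enter the margin of error as a proportion (.05), not as a percent (5).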
Terms:
Margin of Error: How much a sample statistic is likely to vary from the population parameter. We say that we are 95% sure that the sample is not off by more than the margin of error. How this is presented in the NY Times: "19 out of 20" is another way of saying 95%.
Confidence level: we always use a 95% confidence level.
An example: a study of UFO Abduction Status.
Levels of Measurement. What is our measurement really saying about the relationship between the values?
Dichotomous Measurement - Two and only two categories. Can be a natural dichotomy or a set of "dummy variables": we take a complex variable and divide it into a series of dichotomous variables.
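As a rough illustration of dummy variables, the Python sketch below turns one variable with several categories into a series of 0/1 variables. The function name `dummy_code` and the four sample respondents are hypothetical, made up for this example.

```python
def dummy_code(values):
    """Turn one categorical variable into a set of 0/1 (dichotomous) variables.

    Returns a dict mapping each category to a list with 1 where the
    respondent is in that category and 0 otherwise.
    """
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical answers from four respondents.
religion = ["Catholic", "Protestant", "Jewish", "Catholic"]
dummies = dummy_code(religion)

for category, column in dummies.items():
    print(category, column)
```

Each respondent gets a 1 on exactly one of the new variables, because each person belongs to one and only one category of the original variable.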
Nominal Measurement. Categories that could be put in any order.
Religion: Catholic, Protestant, Jewish, Moslem, LDS, Buddhist, Episcopalian, Baptist. This list actually mixes two variables: variable one is the category of religion, variable two is the denomination.
Illnesses: adjustment disorder, borderline personality disorder, paranoid schizophrenic.
Crimes: burglary, assault.
Each individual should go into one and only one category on a variable, one value on a variable. For example: What is your favorite food? We have a long list, but each person is allowed only one.
Sorting people into categories must be reliable and accurate (valid).
Ordinal Measurement. Here we have categories in a logical order: very short, short, medium, tall, very tall. Often we take continuous variables and make them ordinal. Income: under $20,000; $20,000 to $40,000; $40,000 to $60,000; $60,000 plus.
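Here is a minimal Python sketch of turning a continuous variable into an ordinal one, using the income categories above. The notes do not say which category a boundary value such as exactly $20,000 belongs to; putting it in the higher category is my assumption, as is the function name.

```python
# Upper limits and labels for the first three categories.
# Assumption: an income exactly on a boundary goes into the higher category.
INCOME_CATEGORIES = [
    (20_000, "Under $20,000"),
    (40_000, "$20,000 to $40,000"),
    (60_000, "$40,000 to $60,000"),
]


def income_category(income):
    """Return (rank, label) for an income in dollars; rank 1 is the lowest."""
    for rank, (upper, label) in enumerate(INCOME_CATEGORIES, start=1):
        if income < upper:
            return rank, label
    return len(INCOME_CATEGORIES) + 1, "$60,000 plus"


for dollars in (15_000, 20_000, 59_999, 75_000):
    print(dollars, income_category(dollars))
```

The ranks 1 through 4 keep the order of the categories, but the distance between ranks no longer means anything in dollars, which is what makes the new variable ordinal rather than ratio.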
Interval Measurement: Temperature in Fahrenheit or Centigrade; 0 degrees is not the absence of heat. How about the day that the "temperature doubled" in New York City?
Ratio Measurement: Income in dollars: a continuous numerical value PLUS a meaningful zero point. Height in inches.
Scaling is when we use a number of measures, such as test scores or questionnaire items, to measure a more general concept. We can do this by adding them up (in which case your text would call it an "index", although many people still use the term scale), or they may be ordered from lowest to highest (in which case it is a true scale as the term is used in your book). Your test is an example. I just add up the points, to measure the general variable "knowledge of research methods as covered in the first part of the course." Another approach would be to rank the items from easy to hard and see which you could do. This is tricky, because some people can do the hard ones and not the easy ones. When we make an index or scale, we get measures that can be treated as interval, even if they are not strictly interval. Scaling methods can be more precise, but these are not used much in sociology or CJ. For example, we could scale the seriousness of crimes. There are various methods of measuring this. One of them, paired comparisons, means asking a sample of people to rate crimes based on their perceived seriousness.
February 11:
Today we will begin with Amar Patel's Chi-Square lesson. This covers the concept of expected frequencies and observed frequencies, and introduces the concept of "fairness", the difference statistic and the chi-square statistic. These are applied to problems where the expected frequencies are given by a null hypothesis of "fairness". We can apply this to any distribution where we have a theoretical reason to expect a certain result. E.g., with two dice, each with six sides. What results are possible and what likelihood do we have?
| Total | Expected (per 36 rolls) | Observed |
| 2 | 1 | |
| 3 | 2 | |
| 4 | 3 | |
| 5 | 4 | |
| 6 | 5 | |
| 7 | 6 | |
| 8 | 5 | |
| 9 | 4 | |
| 10 | 3 | |
| 11 | 2 | |
| 12 | 1 | |
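The expected column can be derived by listing all 36 equally likely combinations of two fair dice and counting how many give each total. A short Python sketch follows; the helper `expected_counts` and the choice of 360 rolls in the example are mine.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 combinations produce each total.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))


def expected_counts(n_rolls):
    """Expected frequency of each total in n_rolls rolls of two fair dice."""
    return {total: n_rolls * ways[total] / 36 for total in range(2, 13)}


for total in range(2, 13):
    print(total, ways[total], f"{ways[total] / 36:.3f}")

# Example: expected frequencies if we rolled the dice 360 times.
print(expected_counts(360))
```

These expected frequencies are what we would compare our observed rolls against under the null hypothesis that the dice are fair.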
We will then apply the same statistic to cross-tabulations where the expected frequencies are determined by the marginal frequencies.
Last class we worked with observed frequencies, row percent, column percent, and total percent. Today we will compute expected frequencies for each cell in a cross-tabulation table, and show how the difference statistic and chi-square statistic are computed.
We will use a simple 2 by 2 distribution as follows. The variables are gender and opinion on an issue, each of which has two values:
25 men agreed
17 men disagreed
65 women agreed
30 women disagreed
| Observed Frequencies or Obtained Frequencies | Men | Women | Total |
| Agree | 25 | 65 | 90 |
| Disagree | 17 | 30 | 47 |
| Total | 42 | 95 | 137 |
We can compute expected frequencies, based on the null hypothesis that men and women do not differ in their opinions. We can compute these knowing only the marginal or total frequencies. The easy way to compute them is to multiply the row total for each cell by the column total for that cell, then divide by the grand total. Another way would be to convert the row totals to proportions, then multiply them by the column totals. Expected Frequencies = rt * ct / gt
| Expected Frequencies | Men | Women | Total |
| Agree | 90*42/137=27.59 | 90*95/137=62.41 | 90 |
| Disagree | 47*42/137=14.41 | 47*95/137=32.59 | 47 |
| Total | 42 | 95 | 137 |
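The rt * ct / gt rule is easy to check by machine. This is a minimal Python sketch using the observed frequencies from the table above; the function name is mine.

```python
def expected_frequencies(observed):
    """Expected cell frequencies under the null hypothesis: rt * ct / gt."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


# Rows are Agree / Disagree, columns are Men / Women.
observed = [[25, 65], [17, 30]]
expected = expected_frequencies(observed)

for row in expected:
    print([round(cell, 2) for cell in row])
```

Note that the expected frequencies add up to the same row and column totals as the observed frequencies.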
What would we get if we used the expected frequencies to make a column percentage table? The percentages would be the same in each column (except for rounding error). That is the point of expected frequencies: they are the frequencies we would get if all the columns were the same in percentage terms.
| Percents Computed from Expected Frequencies | Men | Women | Total |
| Agree | 65.7% | 65.7% | 65.7% |
| Disagree | 34.3% | 34.3% | 34.3% |
| Total | 100% | 100% | 100% |
We can use the expected frequencies to compute the "difference
statistic" as described by Patel. This tells us how much each
cell is off from what was expected. As you can see, each cell is
off by 2.59, in either the positive or the negative
direction. This is a rough measure of how much our
observations differ from the expected, plus or minus 2.59, but it is
not widely used. The sum of the differences is zero because the
negatives cancel out the positives.
The statistic that is used is the chi-square statistic. This is designed to give more weight to bigger differences and to make all differences positive so they can be added up to a number that can be used for probability testing. We have probability distributions for chi-square, which enable us to tell the likelihood that the difference could have appeared by chance. Chi-square is computed by squaring the differences between the observed (Fo) and expected (Fe) for each cell, then dividing them by the expected for that cell, then adding them up.
To get the chi-square, we add up the computations for each cell: .2431 + .1075 + .4655 + .2058 = 1.0219 (1.023 when the expected frequencies are not rounded first).
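To check the arithmetic, here is a minimal Python sketch that computes the difference statistic and the chi-square for the 25/65/17/30 table without rounding the expected frequencies along the way. The function name is mine, and the sketch applies no Yates correction.

```python
def chi_square(observed):
    """Return (differences, chi-square) for a table of observed frequencies.

    Expected frequencies are rt * ct / gt; chi-square is the sum over all
    cells of (Fo - Fe)**2 / Fe. No Yates correction is applied.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    differences = []
    chi2 = 0.0
    for i, row in enumerate(observed):
        diff_row = []
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / grand_total
            diff_row.append(fo - fe)
            chi2 += (fo - fe) ** 2 / fe
        differences.append(diff_row)
    return differences, chi2


# Rows are Agree / Disagree, columns are Men / Women.
differences, chi2 = chi_square([[25, 65], [17, 30]])

print([[round(d, 2) for d in row] for row in differences])
print(round(chi2, 3))
```

The differences come out as plus or minus 2.59 in every cell and sum to zero, while the squared and weighted differences add up to the chi-square.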
Programs such as Microcase compute this for us. We can also get the chi-square by typing the observed frequencies into the WEB chi-square calculator (using the version without the "Yates correction"). The result is 1.023. The computer then tells us that the result is not "statistically significant" by chi-square test. In the days before computers, we looked these up in a table in the back of a statistics book.
To see these tables, open the EXCEL 2 by 2 chi-square calculator I have prepared. It has all the tables: observed frequencies, row percents, column percents, total percents, difference statistic, chi-square. In this spreadsheet, if we change the numbers in the observed frequencies table, the other numbers will change accordingly.
In computing percents, take our observed frequencies and put them in a contingency or cross-tabulation table. For example, if we ask men and women an agree/disagree survey question, we might get the following results:
55 men agreed
33 women agreed
27 men disagreed
42 women disagreed
The first thing we do is put these into a contingency or cross-tabulation table. We usually put the independent (or causal) variable in the column and the dependent variable in the row. It is best not to have too many categories on either variable, unless you have a very large number of cases. This is the smallest possible table, a 2 by 2 table.
| Observed Frequencies | men | women | Total |
| Agree | 55 | 33 | 88 |
| Disagree | 27 | 42 | 69 |
| Total | 82 | 75 | 157 |
There are three ways to do the percents.
In the row percent, the total is the number in the row, which is used as the base.
1. What percent of the men agreed?
2. What percent of the women disagreed?
3. What percent of those who agreed were men?
4. What percent of those who disagreed were women?
5. What percent of the respondents agreed?
6. What percent of the respondents were women?
Here is the kind of table we would put in a report. It gives the
column percents because the column variable is the Independent
Variable. For most purposes, the percents are based on the
Independent Variable:
| Column Percents | Men | Women | Total |
| Agree | 67.1% | 44.0% | 56.1% |
| Disagree | 32.9% | 56.0% | 43.9% |
| Total | 100% | 100% | 100% |
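The three kinds of percents can be computed directly from the observed frequencies (55, 33, 27, 42). This is a small Python sketch; the variable names and the helper `pct` are mine.

```python
# Observed frequencies: rows are Agree / Disagree, columns are Men / Women.
observed = [[55, 33], [27, 42]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)


def pct(part, base):
    """Percent of base, rounded to one decimal place."""
    return round(100 * part / base, 1)


# Column percents: each cell divided by its column total (the usual report table).
column_percents = [[pct(cell, col_totals[j]) for j, cell in enumerate(row)] for row in observed]
# Row percents: each cell divided by its row total.
row_percents = [[pct(cell, row_totals[i]) for cell in row] for i, row in enumerate(observed)]
# Total percents: each cell divided by the grand total.
total_percents = [[pct(cell, grand_total) for cell in row] for row in observed]

print("Column percents:", column_percents)
print("Row percents:   ", row_percents)
print("Total percents: ", total_percents)
```

"What percent of the men agreed?" is a column percent (55 out of 82), while "What percent of those who agreed were men?" is a row percent (55 out of 88), so the base changes with the question.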
Other concepts we can consider are: poverty, power, crime, murder, race, IQ, liberalism/conservatism, homelessness. Or we could look at Personality Types as defined by Carl Jung and measured by Isabel Briggs Myers.
September 10: No regular class was held. A human subjects movie was offered, along with a laboratory session.
September 8: Ethics of research with human subjects. We went through the material in the course on WEBCT.