Notes for Methods and Techniques of Social Research, Fall 2004, Goertzel.

Here are the formulas I used to compute the grades:

Total Score = [Attendance]*0.05+[Enrolling]*0.05+[Human Subjects Letter]*0.05+[Quizzes]*0.25+[Midterm Grade]*0.25+[Final Exam]*0.35+[Extra Credit Points]

Assignments = ([Enrolling]+[Human Subjects Letter]*3+[Interviews or Content Analysis]*3+[Historical Trends]+[ExcelRegression])/9

Quizzes = ([Microcase Intro]+[Workbook 1]+[Workbook 2a]+[Workbook 2b]+[PercentQuiz]+[Sampling]+[Workbook 5]+[Crime Drop]+[Workbook 8]+[Workbook 3]+[Research Design]+[Statistics Review]+[Review Quiz])/13

Final Exam = ([Final Multiple Choice Items]*0.75+[Final Statistics Items]*0.25)
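For concreteness, the Total Score and Final Exam formulas can be written as a short Python sketch.  The function and argument names are mine; the weights come straight from the formulas above.

```python
def final_exam(multiple_choice, statistics_items):
    """Final Exam = 75% multiple choice items, 25% statistics items."""
    return multiple_choice * 0.75 + statistics_items * 0.25

def total_score(attendance, enrolling, human_subjects_letter,
                quizzes, midterm, final, extra_credit=0):
    """Total Score formula; every input is a grade out of 100."""
    return (attendance * 0.05 + enrolling * 0.05
            + human_subjects_letter * 0.05 + quizzes * 0.25
            + midterm * 0.25 + final * 0.35 + extra_credit)

# A student with 100 on everything scores 100 before extra credit,
# since the weights sum to 1.
print(total_score(100, 100, 100, 100, 100, final_exam(100, 100)))
```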

December 13: 

Here are some points covered with multiple choice questions:


December 8 and 10:  Statistics Review  This is not covered in our book, but on some WEB pages, one on Descriptive Statistics and one on  Inferential Statistics Review.  The class notes for October 1 (below) show how to calculate many of the statistics.  On December 10 we did some statistics review questions taken from last semester's final.

December 1

Content Analysis - "unobtrusive data":  data created by a bureaucratic system, e.g., police records, or often by the media.  We study television or newspapers either because the media themselves are our interest, or as a way of getting information, e.g., on crime reported in the news.

Similar to survey research, except that you do coding instead of interviewing.  Coding means that you assign numbers to phenomena that you observe:  counting things.  Each of your variables is coded from the published information.

Conceptualization.
Measurement.  Reliability and Validity.

Manifest Content - what it's about on the surface.
Latent Content - things that we infer about the content, e.g., does the writer sound angry?  Indignant?  Sexy?
A Content Analysis Study of Editorial Cartoons.  A Content Analysis of Internet-Accessible Written Pornographic Depictions.  Links to Content Analysis studies.

Video:  Junk Science.  Critique:  JunkScience.com.  Prayer Study:  Science or Not?  Vitamin C:  defense of Pauling.

November 29

Experimental Research.   Experimental Designs.  See the graphs in the book or on Trochim's WEB site:  Types of Designs

Essential characteristics:

  1. Two or more groups are matched, usually by random assignment, sometimes by a kind of stratified random selection, e.g., an equal number of men and women, or of blacks and whites, in each group.  But the key is random assignment, so that the groups can be assumed to be the same on all variables.  "Quasi-experiments" are when we use groups that are pretty much the same but we didn't assign people at random.
  2. The Independent Variable is "manipulated," i.e., it is applied to one group and not to the other
  3. Change in the Dependent Variable is measured
Experiments can be done:
  1. In laboratory settings with volunteers, e.g., student volunteers (Stanley Milgram's Obedience Experiment:  http://www.new-life.net/milgram.htm)
  2. In institutional settings such as prisons, hospitals, rehabilitation centers, etc., where people are assigned to treatment groups
    1. New drugs and medical treatments generally must be shown to work in experiments before they are approved for use.  Often, treatment is compared to a placebo.  These experiments are usually "double-blind," to control for the psychological effects of knowing one is getting treatment.  This is a way of controlling subject bias and experimenter bias.
    2. In criminal justice, one might do an experiment comparing a "half way house" to a drug treatment program to a prison term for offenders.  To do this, you would have to get the judge to assign offenders to different programs at random.  Ethical issues are raised here, and there are likely to be objections.
  3. Occasionally in natural settings, for example
    1. the welfare reform experiment:  assign some recipients to the old program, some to the new.  This didn't work very well; there were errors in the group assignments, and the women often forgot which group they were in anyway
    2. vaccination experiments
    3. guaranteed annual income experiments
Although logically experiments are the most rigorous way to test causal hypotheses, there are practical problems with doing them.

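The random-assignment idea in point 1 above can be sketched in a few lines of Python.  The names here are illustrative, not from the course; the point is only that group membership is decided by chance, so it is unrelated to any characteristic of the subjects.

```python
import random

def random_assignment(subjects, seed=None):
    """Shuffle the pool of subjects, then deal them alternately into
    two groups, so assignment is unrelated to their characteristics."""
    pool = list(subjects)
    random.Random(seed).shuffle(pool)
    return pool[::2], pool[1::2]   # treatment group, control group

volunteers = ["subject_%d" % i for i in range(20)]
treatment, control = random_assignment(volunteers, seed=42)
print(len(treatment), len(control))   # 10 10
```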
November 19

The Review Glossary is not adequate as a guide to this chapter.  Some points to be covered:

  Some examples of field research:
Margaret Mead, the only anthropologist (or sociologist) to get her own postage stamp, won fame through field work, primarily her book Coming of Age in Samoa.  Later, this book was denounced by anthropologist Derek Freeman in his book Margaret Mead and the Heretic:  The Making and Unmaking of an Anthropological Myth.  Anthropologists have come to Mead's defense, and have restudied the case, but I would have to agree with your text that "had Mead come back from Samoa with an accurate ethnographic report, it would not have made her famous."  Here is the NY Times Review of Freeman's critique of Mead.
     More recently, there has been a raging controversy about the book Darkness in El Dorado, on research among the Yanomamo in Venezuela; it is the latest ethical controversy in the field, and it raises important methodological questions as well.  Many of the book's allegations, however, have been contested by the National Academy of Sciences.
  The combining of fiction with factual research is increasingly common, both in anthropology and in biographies.  Sometimes this is openly done as a literary form; in other cases, such as that of Rigoberta Menchu, it is only admitted when critics discover it.  The Rigoberta Menchu Controversy by Arturo Arias.
There are many problems with field research:  ethical issues, problems of reliability and validity when data are gathered by only one researcher, etc.  A controversial book is Laud Humphreys's Tearoom Trade, which raises ethical issues.  He studied gay sex in a men's room in a park in St. Louis, without informing the participants what he was doing.
    Field researchers sometimes seem to find examples that fit their preconceptions, and their work is often ignored by those who do not like the results, e.g., Leon Dash's books When Children Want Children and Rosa Lee, which are simply ignored by welfare advocates who prefer more sympathetic treatments.  One of the best field studies is Kathryn Edin's book Making Ends Meet, which is highly sympathetic to the mothers.  However, Edin collected statistical data as well as her illustrative observations.  The statistics showed that almost none of the mothers actually lived off their grants alone.  Elijah Anderson's book Streetwise, on men in a Philadelphia ghetto, has been well received, in large part because it goes beyond one-sided advocacy.
    A great strength of field work is observing behaviors that the people themselves don't understand, aren't even aware of, or at any event are unable or unwilling to talk about.  Anthropologist Jules Henry spent a week living in each of the homes of several children who had become mentally ill, trying to discern patterns in the family interactions that contributed to the illness.  Myra Bluebond-Langner of our Anthropology Department wrote a classic, The Private Worlds of Dying Children, which has been very influential; she has just published a sequel called In the Shadow of Illness:  Parents and Siblings of the Chronically Ill Child.  Field research offers a richness of description and a possibility of new insights that is unparalleled by any other method.  Unless it is supplemented with other methods, however, it does not provide statistical data, and it is hard to replicate.

Coming of Age in New Jersey

The Corner.

Black American Students in an Affluent Suburb, by John Ogbu.



November 5:  optional class in BSB 108 to work with Excel.

November 3:  Height-Weight Excel file.  Election data on CNN.com.  Use of Excel to prepare secondary data analyses.  Sources of Data

November 1

Research Design.  How research is organized or structured to accomplish different ends.


Purpose of Study - Preferred Designs - Advantages/Disadvantages

Exploration - To get some new ideas, or at least ideas that are new to you.
  1. Literature Review - library research.  Get the insights of others; avoid reinventing the wheel. / Tends to repeat the past, not generate new ideas.
  2. Secondary Analysis - using data that is already being collected by a country, a government office, a company.  Criminal justice systems generate a lot of data for their own purposes.  Access a tremendous amount of information quickly and cheaply. / You are limited to the questions someone else designed and asked.
  3. Field Observation - go into the natural setting and observe what is going on.  You may talk to people and ask questions as well, but the really unique aspect is observation.  Get new insights in a natural setting. / Difficult and time consuming; small sample; access difficult.
  4. Focus Groups - group interviews lasting about an hour and a half.  Detailed, inductive, subtle understanding of patterns. / Difficult to generalize.
  5. Case Studies - based on documents, interviews or sometimes observations.

Description - To get accurate and relatively precise information, especially about large groups.
  1. Secondary Analysis - data banks of surveys are available, and many other kinds of data as well.  Excellent data, especially for trends over time. / Limited to questions asked by others.
  2. Surveys - questionnaires or interviews, often on the telephone.  Ask your own questions, choose your own sample. / Time consuming or expensive; limited to topics people can answer accurately.
  3. Content Analysis - looking at media as a source of data:  TV shows, letters to the editor, newspaper articles, written documents.  You can go back in time.  Unobtrusive; allows study of the media. / Limited to topics that involve published media.

Explanation - To answer questions about cause and effect.
  1. Experiment - we manipulate the independent variable (the "cause"), then measure the dependent variable (the "effect") both before and after, on an experimental and a control group.  Best method of proving causal relationships. / Hard to maintain rigor of design (internal validity) and to generalize beyond the limits of the experiment (external validity); serious ethical and practical limitations.
  2. Multivariate Statistical Analysis of Survey Data - can use survey and secondary data and address a wide range of important topics. / Data sets must include good measures of all relevant variables and a wide range of data; not valid unless the models can be shown to predict trends in fresh data; most useful for making predictions to be evaluated with fresh data.

  Samples:
 Crackdowns.doc   Crackdownsgraph.doc
 A study of UFO Abduction Status.
NY Times Election Survey on today's Times home page

October 25:

We will discuss path analysis and interpreting regression models, following the textbook and the discussions in:
  A Brief Intro to Path Analysis.  A longer introduction with more examples.  A more technical Intro to Path Analysis.  Skeptical Inquirer Cover.  Text.

For the exam, you should know how to set up the regression equations to fit a path diagram.  The rules are as follows:

  1. There should be a regression equation for each variable that has an arrow pointing towards it.
  2. For each equation, the variable having arrows pointing into it is the dependent variable, and goes to the left of the equals sign.
  3. For each equation, the variables whose arrows point into the dependent variable are the independent variables.  These are listed to the right of the equals sign and connected with + signs.
  4. There is no need to include an intercept, because we are interested only in the standardized regression equations or beta weights.
An example.  Suppose we have the following diagram:

Perot Vote Input Path Diagram

For this diagram, we would need the following equations:

vote for perot = alienation from government + alienation from society + finances worse

alienation from government = status deficiency

alienation from society = status deficiency

If we got measures for these variables from a National Election Survey (Status Deficiency would be an index we would have to calculate), we could use the Regression procedure in Microcase to enter the three regression equations and get Beta coefficients which we could put on the diagram, as follows:

Output Path Diagram
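What Microcase's Regression procedure does for the two single-predictor equations can be sketched in Python:  with standardized (z-scored) variables, the beta weight for a lone predictor is simply the Pearson correlation between predictor and outcome.  (The vote equation has three predictors and needs full multiple regression.)  The data below are made up for illustration, standing in for National Election Survey measures.

```python
import statistics

def standardize(values):
    """Convert scores to z-scores (mean 0, standard deviation 1)."""
    mean, sd = statistics.fmean(values), statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def beta_single_predictor(x, y):
    """Standardized regression slope of y on a single predictor x,
    which equals the Pearson correlation of x and y."""
    zx, zy = standardize(x), standardize(y)
    return statistics.fmean(a * b for a, b in zip(zx, zy))

# Hypothetical index scores for eight respondents.
status_deficiency = [1, 3, 2, 5, 4, 6, 2, 5]
alienation_gov    = [2, 4, 2, 6, 5, 7, 3, 6]
print(round(beta_single_predictor(status_deficiency, alienation_gov), 2))
```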

October 22:    NJ polling.  Post Tracking Poll.  Rasmussen.  Zogby.

Today we will look at testing causal hypotheses.  On page 93 in the text, we have the example of the relationship between Height and Liking Basketball.  This is an IV and a DV.  An obvious TEST VARIABLE is Gender.  This would be an antecedent variable:  Gender determines both your height and your liking for basketball.  We could draw this as a path diagram (on board).

When we introduce the control, we split the table into two parts, e.g.,

                             Males                  Females                 Total
                        Tall     Short           Tall   Short         Tall    Short

Likes BB           85%    85%            25%   25%         65%    45%
Does Not           15%    15%            75%   75%         35%    55%

Total                 100%  100%          100%  100%      100%   100%

In the real world, things are never this sharp.

Let's look at some real data, using FEAR WALK, PLACE SIZE and R.INCOME from the GSS data set:

In the total sample, the low income respondents are more likely to feel there are areas near them where they should fear walking.  However, this effect disappears for some of the respondents when we control for the size of the town in which they live.

To make it a finished Table:
                Small Town or rural         Small City        City/Surb          Total
                  Low  Med  Hi             Low Med Hi        Low Med Hi      Low Med Hi

Fear Walk         30%  27%  24%            48%  42%  20%      56%  41%  43%    51%  39%  41%
No Fear           70%  73%  76%            52%  58%  80%      44%  59%  57%    49%  61%  59%
                     p = .710                p = .043           p = .000       p=.000
                     N = 251                 N = 133            N = 1253      N = 1637
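The logic of introducing a control variable can be sketched in Python:  we compute the IV-by-DV percentage table separately within each category of the test variable.  The records below are synthetic, constructed only to reproduce the within-group percentages of the basketball example above.

```python
from collections import Counter

# Synthetic (gender, height, likes-basketball) records shaped so that,
# within each gender, tall and short respondents like basketball equally.
records = (
      [("male", "tall", "likes")] * 85 + [("male", "tall", "not")] * 15
    + [("male", "short", "likes")] * 85 + [("male", "short", "not")] * 15
    + [("female", "tall", "likes")] * 25 + [("female", "tall", "not")] * 75
    + [("female", "short", "likes")] * 25 + [("female", "short", "not")] * 75
)

def percent_table(records, control_value):
    """Percent who like basketball, by height, within one gender."""
    counts = Counter((h, l) for g, h, l in records if g == control_value)
    table = {}
    for height in ("tall", "short"):
        total = counts[(height, "likes")] + counts[(height, "not")]
        table[height] = round(100 * counts[(height, "likes")] / total)
    return table

print(percent_table(records, "male"))    # {'tall': 85, 'short': 85}
print(percent_table(records, "female"))  # {'tall': 25, 'short': 25}
```

Within each gender the tall/short difference vanishes, which is what it means for the original relationship to be explained by the control variable.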
 

To do a more complete causal model of Fear of Walking at Night, we should introduce more variables.  Some of them may be in our data set, others not.

What variables should we look at?

Variables      Hypotheses
Gender           Females more fearful than males.
Age              Elderly more fearful, also Children.  Might be curvilinear.
Crime Rate       People in high crime communities
Street Lighting
Freq of Patrols
Graffiti, Broken Windows, Trash, other indicators of an "out of control" neighborhood
Bicycles
Number of Pedestrians
Physical Shape
Training in Self Defense

We can examine some of these variables with our data.  We may find it useful to use regression rather than cross-tabulation.

We can also use pages 114-122 in the workbook as examples.

October 20:  It's hard to poll.  NJ Race.   Causal Analysis - Chapter 5.

The Art and Science of Cause and Effect. (powerpoint)

Probabilistic cause:  not an absolute cause, not a cause that is sufficient or necessary.  "Cigarette smoking causes
cancer."  What we mean is, smoking cigarettes increases the likelihood of getting cancer.  How much?

There are multiple causes for everything.  What we want to find out is how much each thing contributes.  There are also
causal linkages, or indirect causes.  A causes B and then B causes C.

Diagraming causal models.  We put the dependent variable at the right.  We draw arrows going into it for each causal
variable that affects it directly.  Then we can have arrows going into those causal variables in turn, adding steps to
the causal analysis, as in this sample file:
http://crab.rutgers.edu/~goertzel/homomale.htm

Criteria of Causation - how do we know that something is a cause of something else.

1.  Time Order.  The cause comes before the effect.  Sometimes we sort out the time order theoretically:  we assume that
education precedes employment.  Or we can use a research design that involves gathering data at two points in time.  If
you don't have measurements at two points in time, this is shaky.

2.  Correlation.  The two variables vary together:  when one is high, the other is high (a positive relationship), or when
one is high, the other is low (a negative relationship).  This gets at the degree of causation:  the higher the correlation,
the stronger the causal relationship.

3.  Non-spuriousness.  We want to know that the correlation is not caused by something else.  This can be tested rigorously with experimental designs, when feasible.  But with most sociological or criminal justice problems experimental rigor is not possible, so we may use statistical controls as an alternative.  This is much less rigorous, but often all we can do is see whether the relationship holds up when we control for other variables that might account for it.
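Criterion 2 can be made concrete with a short sketch of Pearson's r, the usual correlation measure:  its sign gives the direction of the relationship and its magnitude the strength.  (Python 3.10+ ships statistics.correlation; the hand-rolled version below just spells out the formula.)

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation: average cross-product of deviations,
    scaled by the two standard deviations."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sx, sy = statistics.pstdev(x), statistics.pstdev(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) * sx * sy)

print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 3))    # 1.0: positive
print(round(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]), 3))    # -1.0: negative
```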

Causal Models:  representations of the complex causal relationships between variables.  Variables have different causal roles, but this is determined by our causal model; it is not inherent in the variables.  One person's cause can be another's effect.

Dependent Variable - what we want to explain.  Often these are opinions or behaviors.

Independent Variable - what we use to explain it.  Often these are traits or physical characteristics, e.g., sex or race,
which are almost always independent.

If you study the relationship of race on voting, for example, race would be independent and voting dependent.

Antecedent variables:  things that come before the independent variable.  This helps us to deal with a causal chain:
the antecedent variable causes the IV, which causes the DV.
If the antecedent variable "explains" the relationship, we have an "explanation"; we say the relationship is "spurious".

Intervening variables:  things that come between the IV and the DV, e.g., race determines ideology, which determines the vote.
This is an "interpretation":  it tells WHY the causal relationship exists.

Path Models:  a way of graphically expressing complex causal models.

Example:  Determinants of Adult Homosexuality in White Males.

Example:  The Seattle Social Development Project. 


October 18:  Midterms returned.  Polling issues.  Literary digest poll.

Grading formulas at Midterm:

Quizzes = ([Microcase Intro]+[Workbook 1]+[Workbook 2a]+[Workbook 2b]+[PercentQuiz]+[Sampling])/6
Midterm Grade = [Midterm Statistics]*0.25+[Midterm Multiple Choice]*0.75
Predicted Grade = [Attendance]*0.05+[Enrolling]*0.05+[Human Subjects Letter]*0.05+[Quizzes]*0.25+[Midterm Grade]*0.6

When you look at the "predicted grade" ignore the (out of 35.00).  I have never been able to figure out where it gets those numbers or how to get rid of them.  All the grades are  out of 100.

October 10:   Interviewing Guidelines  Trochim on interviewing

October 6.   NY Times Poll.   Other Polls

Notes, October 1
SAMPLING is used when we are interested in studying a population that is too large for us to study each individual.  The first step is to define the population we wish to make statements about, e.g., adults in New Jersey, probable voters, people convicted of felonies, graduates of our department.  We might want to study the entire population of the USA.  If we try to collect data from everyone, this is a census.  The Census Bureau does this once every decade, and misses a lot of people.  Everyone else does sampling:  we select a cross-section to represent the population.  If you try to study the whole population, you often fail to do a good job.   Gallup:  How Polls are Conducted.

Size of the sample.  How big a sample do I need?  The size of the sample needed does not depend on the size of the population.

How do we select the sample size?  Decide on the margin of error you will tolerate.  Margin of error is equal to one divided by the square root of the sample size.  With a sample of 400, the square root is 20, and 1/20 = .05, or 5%.  Suppose you interviewed 400 people:  300 were white, 50 were black and 50 were others.  For the blacks, with a sample of 50, we would have a 14% margin of error.  For the whites, with a sample of 300, we would have a 5.8% margin of error.

Take 300, the square root of 300 is = 17.32     1 /17.32 = .0577  * 100 = 5.8%

Sample statistic - what the sample says
population parameter - what the real figure is
Even if the sampling is done well, the response rate is less than 100%.
Weighting is done to make the sample more like the population.

   m = 1/sqrt(n).    Solve for n:    m^2 = 1/n      n * m^2 = 1      n = 1/m^2.    If we need a margin of error of 3%, or .03, then n = 1/.03^2, or about 1,111.

  If you have a sample size and need to know the margin of error, use    m = 1/sqrt(n)

   If you are given a margin of error and asked how large a sample you need, use  n = 1/m^2

          In these formulas n = the size of the sample (not the population).    m = the margin of error expressed as a proportion, not as a percent.  Thus, if the questions says "we need a margin of error of 5%, then m = .05.   

If our sample is stratified, this means we really have several sub-samples, and we need the same size sample for each of them, regardless of their share of the population.  For example, if we want to sample white, black and Hispanic respondents and make statements about each group, we need the same size sample for each group regardless of its size in the population.  Thus, if we need a margin of error of 5% for each of the three groups, the total needed is 3 * (1/m^2).
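The two formulas above translate directly into Python functions.  As in the text, m is a proportion rather than a percent (5% becomes 0.05), and n is the size of the sample, not the population.

```python
import math

def margin_of_error(n):
    """m = 1/sqrt(n): margin of error for a simple random sample of size n."""
    return 1 / math.sqrt(n)

def sample_size(m):
    """n = 1/m^2, rounded up to a whole respondent."""
    return math.ceil(1 / m ** 2)

print(margin_of_error(400))      # 0.05, i.e., 5%
print(sample_size(0.03))         # 1112 respondents for a 3% margin
print(3 * sample_size(0.05))     # stratified: three groups at 5% each
```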

Terms:

Margin of Error:  How much a sample statistic is likely to vary from the population parameter.  We say that we are 95% sure that the sample is not off by more than the margin of error.  How this is presented in the NY Times.  "19 out of 20" is another way of saying 95%.

 Confidence level:  we always use a 95% confidence level.

Confidence interval:  the range within which we think a statistic would fall, e.g., if the margin of error is 3% and the sample statistic is 67%, the confidence interval is from 64% to 70%.  We are 95% sure that the true figure is within this limit.

September 29 -

Discussion of designing research projects.  How do we decide what to study?  Supplementary reading in Trochim on the structure of research.  You may prefer his "hourglass" metaphor to the circular one on page 14 of our textbook.
  1. Selecting a topic.  Typical motives include:
    1. Finding out something we don't know.  This may include something local, e.g., what do people in Camden think about the new Governor's actions, something that has been unresolved in earlier research, something that hasn't been studied because it is new, etc.  This is what the authors of your book mean when they say "research always starts with wondering."
    2. Another purpose that motivates research is proving to other people that what we "know" is true really is true.  This is "advocacy" research, and it can be very one-sided and lead to sloppy work.  Often this involves causal arguments, proving "why" something happens.  This kind of research may not start with "wondering" but with "arguing."
    3. Answering a question posed to us by our employer or by a client, applied research.  Here someone else really chooses the topic.
  2. Formulating a Research Question.  This means formulating a "statement" which will involve variables.  We have an argument or story in mind at this point.
  3. Defining the Concepts.  Usually not a lot of time goes into this stage of empirical research, but some people do write articles focusing on this, e.g., what does "race" or "poverty" mean?  What is the difference between "sex" and "gender"?  An example:  the measurement of romantic love.
  4. Operationalizing the Concepts.  A lot of effort goes into this.  Quantitative  research means you have to measure your variables and a lot depends on having good measurement.  Sometimes this is difficult, e.g., measuring "intelligence" or "liberalism-conservatism" or "mental illness" or "crime rates (various kinds)".  Often we use standard measures created by the government agencies that collect statistics.
  5. Formulating Hypotheses.  This is usually pretty easy.  There is a distinction between "null hypotheses" and regular hypotheses, which is explained on page 13.  It means testing the hypothesis that your hypothesis is not true.  Thus, you hope to "reject the null hypothesis" rather than "accept the (regular, not-null) hypothesis".  So far as I know, there is no word for the opposite of Null; it might be Substantive?  Type One Error:  concluding that a relationship exists when it doesn't.  Type Two Error:  failing to detect a relationship that really does exist.
  6. Making observations.  This is a major step unless we just get the observations from someone who already did the work.
  7. Analyzing the Data.  This is "number crunching":  running data through the computer.  Of course, one can also analyze qualitative data from interviews or observations, but today even that tends to get quantified (content analysis).
  8. Assessing the results.  This is really part of the analysis.  If the hypothesis doesn't work out, researchers often go back, change the hypotheses, and pretend they knew all along what was going to happen.
  9. Publishing the findings.  This assumes that you are doing "scientific" or "pure" research; much applied research is actually distributed only within the organization that paid for it.  This may be done in person, with a "power point" presentation.  Refereed publications:  your paper is sent to other specialists for review to decide if it should be published.  "Refereed journal."  Press release.  Publication can be online as well as on paper.  You publish the research so you can get credit, see your name in print, get promoted, and also so that you can inform others, and perhaps most important, so that other people can criticize or attempt to replicate it.  Usually people replicate research in the hope of overthrowing it; if you just find the same thing as before, there is less interest.  This cancels out a lot of the bias in social research, since there is usually someone with the opposite bias to correct it.
    Here are some samples we can look at:  Papers presented at the 2000 ASA meetings in Washington; a Study of Tire-Crash Patterns (Word format with the Excel file used to reproduce the graphs); the controversy over a study on the effects of sex abuse; Compstat in the NYC and Philadelphia Police Departments.  The origin and development of the project on South Jersey's Identity that we worked on in this class in 2000; results are on my home page.  Last semester we worked on a survey of graduates of this department; the Questionnaire is available online.  We did an earlier survey in 1995, and a Report is available.  Contacts between Police and the Public.  The 2002 Final Report on the National Drug Control Strategy, and the 2003 version - the emphasis on the goals has been lessened, with the excuse of discontinuities in data collection.  2003 Tables in HTML presentation form.




September 27:
Quality of Measures -
   Reliability - you get the same thing over and over.  Consistency.
         Inter-rater - two different raters get the same answer.
         Test-retest - if you take it twice, the answers are the same.
         Internal consistency - are the items on a test consistent with each other?  Cronbach's alpha is a statistic that measures inter-item reliability.
    Validity - is it "really" measuring what it is supposed to measure?
          Face validity - does it look right?
          Predictive or criterion validity - does it predict what we want to predict, some "true" measure?  The SAT predicts college or law or medical school grades.
          Convergent validity - do several measures give the same result?
          Construct validity - does the measure perform as our theory says it should?  We use this when we have no criterion.  This is the most difficult; it is used when things are inherently difficult to measure.

                  An example:  a study of UFO Abduction Status.            
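Cronbach's alpha, mentioned under reliability above, is easy to sketch.  The standard formula is alpha = k/(k-1) * (1 - sum of the item variances / variance of the total score), where k is the number of items; the data below are invented for illustration.

```python
import statistics

def cronbach_alpha(scores):
    """Inter-item reliability. `scores` has one row per respondent
    and one column per test item."""
    k = len(scores[0])
    item_variances = [statistics.pvariance(item) for item in zip(*scores)]
    total_variance = statistics.pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Two items that always agree are perfectly consistent: alpha = 1.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```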


September 24:  We will do some statistical computations in class, and do an exercise which is available online.

September 22

Levels of Measurement.  What is our measurement really saying about the relationship between the values?

Dichotomous Measurement -   Two and only two categories.  Can be a natural dichotomy, or "dummy variables":  we take a complex variable and divide it into a series of dichotomous variables.

Nominal Measurement.  Categories that could be put in any order.
      Catholic, Protestant, Jewish, Moslem, LDS, Buddhist, Episcopalian, Baptist
                       variable one, category of religion, variable two denomination.
            Illnesses:    adjustment disorder, borderline personality disorder, paranoid schizophrenic
               Crimes:   burglary, assault,  

  Each individual should go into one and only one category on a variable - one value on a variable.
For example:  What is your favorite food?  We have a long list, but each person is allowed only one answer.
       Sorting people into categories must be reliable, and accurate or valid.

Ordinal Measurement.   Here we have categories in a logical order:  very short, short, medium, tall, very tall.  Often we take continuous variables and make them ordinal.  Income:  Under $20,000;  $20,000 to $40,000;  $40,000 to $60,000;  $60,000 plus.

Interval Measurement:   Temperature in Fahrenheit or centigrade - 0 degrees is not the absence of heat.  How about the day that the "temperature doubled" in New York City?

Ratio Measurement:    Income in dollars - a continuous numerical value PLUS a meaningful zero point.  Height in inches.
 
Scaling is when we use a number of measures, such as test scores or questionnaire items, to measure a more general concept.  We can do this by adding them up (in which case your text would call it an "index," although many people still use the term "scale"), or they may be ordered from lowest to highest (in which case it is a true scale as the term is used in your book).

Your test is an example.  I just add up the points to measure the general variable "knowledge of research methods as covered in the first part of the course."  Another approach would be to rank the items from easy to hard and see which you could do.  This is tricky, because some people can do the hard ones and not the easy ones.

When we make an index or scale, we get measures that can be treated as interval, even if they are not strictly interval.  More precise scaling methods exist, but they are not used much in sociology or criminal justice.  For example, we could scale the seriousness of crimes.  There are various methods of measuring this; "paired comparisons" means asking a sample of people to judge, for pairs of crimes, which of the two is more serious.
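The additive approach can be sketched in a few lines of Python.  The item scores below are made up for illustration; they are not from any actual test.

```python
# A minimal sketch of an additive index: sum several questionnaire or
# test items to measure a more general concept.  Scores are hypothetical.
respondents = {
    "A": [4, 5, 3, 4, 4],   # five items, each scored 1-5
    "B": [2, 1, 2, 3, 2],
    "C": [5, 5, 4, 5, 5],
}

# The index for each respondent is simply the sum of the item scores.
index = {name: sum(items) for name, items in respondents.items()}
print(index)  # {'A': 20, 'B': 10, 'C': 24}
```

Because the sum can take many values, the index behaves much like an interval-level variable even though each item by itself is only ordinal.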

September 20:

Today we will begin with Amar Patel's Chi-Square lesson.   This covers the concepts of expected frequencies and observed frequencies, and introduces the concept of "fairness," the difference statistic, and the chi-square statistic.  These are applied to problems where the expected frequencies are given by a null hypothesis of "fairness."

We can apply this to any distribution where we have a theoretical reason to expect a certain result - e.g., with two dice, each with six sides.  What totals are possible, and how likely is each?

  1.                      (a total of 1 is impossible with two dice)
  2.   *               Snake-eyes!
  3.   **             (1 and 2; 2 and 1)
  4.   ***           (1 and 3; 3 and 1; 2 and 2)
  5.   ****         (1 and 4; 4 and 1; 3 and 2; 2 and 3)
  6.   *****       (1 and 5;  5 and 1; 4 and 2;  2 and 4; 3 and 3)
  7.   ******      (4 and 3;  3 and 4;  5 and 2;  2 and 5; 6 and 1; 1 and 6)
  8.   *****       (4 and 4; 5 and 3; 3 and 5; 6 and 2; 2 and 6)
  9.   ****         (5 and 4;  4 and 5; 6 and 3;  3 and 6)
  10.   ***            (5 and 5; 6 and 4; 4 and 6)
  11.   **              (6 and 5; 5 and 6)
  12.  *                  Boxcars!
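The chart above can be checked with a short Python sketch that enumerates all 36 equally likely ordered pairs:

```python
from collections import Counter

# Count the number of ways each total from 2 to 12 can occur
# with two six-sided dice (36 equally likely ordered pairs).
ways = Counter(a + b for a in range(1, 7) for b in range(1, 7))

for total in range(2, 13):
    print(f"{total:3d}  {'*' * ways[total]}")   # reproduces the star chart

# In 36 rolls, the expected frequency of each total equals its number of ways.
```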
Suppose we roll real dice 36 times and see what we get:

Total    Expected    Observed
  2          1
  3          2
  4          3
  5          4
  6          5
  7          6
  8          5
  9          4
 10          3
 11          2
 12          1

We can compute the chi-square with Graph Pad QuickCalcs on the Internet.

We will then apply the same statistic to cross-tabulations where the expected frequencies are determined by the marginal frequencies.  Last class we worked with observed frequencies, row percents, column percents, and total percents.  Today we will compute expected frequencies for each cell in a cross-tabulation table, and show how the difference statistic and chi-square statistic are computed.

We will use a simple 2 by 2 distribution as follows.  The variables are gender and opinion on an issue, each of which has two values:

25 men agreed
17 men disagreed
65 women agreed
30 women disagreed
 
 

Observed Frequencies    Men    Women    Total
Agree                    25      65       90
Disagree                 17      30       47
Total                    42      95      137

    We can compute expected frequencies, based on the null hypothesis that men and women do not differ in their opinions.  We can compute these knowing only the marginal or total frequencies.  The easy way to compute them is to multiply the row total for each cell by the column total for that cell, then divide by the grand total.  Another way would be to convert the row totals to proportions, then multiply them by the column totals.  Expected frequency = row total * column total / grand total.

 
Expected Frequencies          Men                Women          Total
Agree                   90*42/137=27.59    90*95/137=62.41       90
Disagree                47*42/137=14.41    47*95/137=32.59       47
Total                         42                  95            137
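The same computation can be sketched in Python; the numbers below are the ones from the gender-by-opinion table above:

```python
# Expected frequency for each cell = row total * column total / grand total.
observed = [[25, 65],    # agree:    men, women
            [17, 30]]    # disagree: men, women

row_totals = [sum(row) for row in observed]          # [90, 47]
col_totals = [sum(col) for col in zip(*observed)]    # [42, 95]
grand_total = sum(row_totals)                        # 137

expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]
for row in expected:
    print([round(f, 2) for f in row])   # [27.59, 62.41] then [14.41, 32.59]
```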

  What would we get if we used the expected frequencies to make a column percentage table?  The percentages would be the same in each column (except for rounding error).  That is the point of expected frequencies:  they are the frequencies we would get if all the columns were the same in percentage terms.

Percents Computed from Expected Frequencies    Men      Women    Total
Agree                                         65.7%     65.7%    65.7%
Disagree                                      34.3%     34.3%    34.3%
Total                                          100%      100%     100%

We can use the expected frequencies to compute the "difference statistic" as described by Patel.  This tells us how much each cell is off from what was expected.  As you can see, each cell is off by 2.59, in either the positive or the negative direction.   This is a rough measure of how much our observations differ from the expected, plus or minus 2.59, but it is not widely used.  The sum of the differences is zero because the negatives cancel out the positives.

The statistic that is used is the chi-square statistic.  This is designed to give more weight to bigger differences and to make all differences positive, so they can be added up to a number that can be used for probability testing.  We have probability distributions for chi-square, which enable us to tell the likelihood that the difference could have appeared by chance.   Chi-square is computed by squaring the difference between the observed (Fo) and expected (Fe) frequencies for each cell, dividing each squared difference by the expected frequency for that cell, then adding them up.

To get the chi-square, we add up the computations for each cell:  .2431 + .1075 + .4655 + .2058 = 1.0229.   Programs such as Microcase compute this for us.  We can also get the chi-square by typing the observed frequencies into the WEB chi-square calculator (using the version without the "Yates correction").  The result is 1.023.  The calculator also tells us that the result is not "statistically significant" by the chi-square test.  In the days before computers, we looked these up in a table in the back of a statistics book.
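As a check on the hand computation, here is a minimal Python sketch of the same chi-square calculation (no Yates correction):

```python
# Chi-square = sum over cells of (Fo - Fe)**2 / Fe.
observed = [25, 65, 17, 30]                  # agree men/women, disagree men/women
expected = [90 * 42 / 137, 90 * 95 / 137,    # row total * column total / grand total
            47 * 42 / 137, 47 * 95 / 137]

chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
print(round(chi_square, 4))   # 1.0229, matching the cell-by-cell sum above
```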

To see these tables, open the EXCEL 2 by 2 chi-square calculator I have prepared.  It has all the tables:  observed frequencies, row percents, column percents, total percents, difference statistic, and chi-square.  In this spreadsheet, if we change the numbers in the observed frequencies table, the other numbers will change accordingly.

We do not normally compute these statistics by hand, so I have stopped requiring students to compute a chi-square as a test question.  However, it is important to understand what the computer is doing and what the results mean.  I do ask you to compute row, column and total percents and expected frequencies with a hand calculator, since this is not arduous.

September 17: 

Survey data are largely nominal or categorical, which means that each variable has a limited set of distinct answers, rather than continuous values.  Continuous variables vary on a scale with a large number of values.  This includes things such as height and weight if measured in inches or pounds, or rates of all kinds, e.g., crime rate, divorce rate, birth rate.  Votes can be continuous, e.g., if you say 56% voted for Kerry, 26% for Dean, etc.  However, if you ask how an individual voted, there is a distinct set of categories:  Kerry, Dean, Kucinich, etc.  A continuous variable can be collapsed into categories.  A categorical variable can also be converted to continuous when you are talking about a large population, e.g., the categorical votes of a number of individuals can be converted to percents voting for each.

So to work with survey data we need to understand "percent" and the different ways of computing and using percents.  "Cent" means 100.  A percent is a ratio with the denominator being 100 - a rate.  We have other rates, such as per 1,000 or per 100,000, or even per million or per billion.

To compute percents, we take our observed frequencies and put them in a contingency or cross-tabulation table.  For example, if we ask men and women an agree/disagree survey question, we might get the following results:

       55 men agreed
       33 women agreed
       27 men disagreed
       42 women disagreed

  The first thing we do is put these into a contingency or cross-tabulation table.  We usually put the independent (or causal) variable in the columns and the dependent variable in the rows.  It is best not to have too many categories on either variable, unless you have a very large number of cases.  This is the smallest possible table, a 2 by 2 table.
 
Observed Frequencies men women Total
Agree 55 33 88
Disagree 27 42 69
Total 82 75 157

There are three ways to do the percents:

In the row percent, the row total is used as the base.
In the column percent, the column total is the base.
In the total percent, the grand total is the base.
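The three bases can be illustrated with a short Python sketch using the table above:

```python
# Row, column, and total percents for the 2 by 2 table above.
observed = {("Agree", "Men"): 55, ("Agree", "Women"): 33,
            ("Disagree", "Men"): 27, ("Disagree", "Women"): 42}

grand = sum(observed.values())                                    # 157
row_totals = {r: sum(v for (rr, _), v in observed.items() if rr == r)
              for r in ("Agree", "Disagree")}                     # 88, 69
col_totals = {c: sum(v for (_, cc), v in observed.items() if cc == c)
              for c in ("Men", "Women")}                          # 82, 75

for (r, c), v in observed.items():
    print(f"{r}/{c}: row {100 * v / row_totals[r]:.1f}%, "
          f"column {100 * v / col_totals[c]:.1f}%, "
          f"total {100 * v / grand:.1f}%")
```

For example, the first line prints row 62.5%, column 67.1%, total 35.0% for the 55 men who agreed:  55/88 of the agreers were men, 55/82 of the men agreed, and 55/157 of all respondents were men who agreed.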

1. What percent of the men agreed?  
2. What percent of the women disagreed? 
3. What percent of those who agreed were men?  
4. What percent of those who disagreed were women? 
5. What percent of the respondents agreed? 
6. What percent of the respondents were women?

Here is the kind of table we would put in a report.  It gives the column percents, because the column variable is the independent variable.  For most purposes, the percents are based on the independent variable:

 
 
Column Percents    Men      Women    Total
Agree             67.1%     44.0%    56.1%
Disagree          32.9%     56.0%    43.9%
Total              100%      100%     100%



September 13:  How does social science differ from other ways of thinking:  poetry, philosophy, theology, physical or biological sciences, history, journalism?  How would we divide up fields of study?  Physical science, social science, humanities?  Science, art and morality?  Or, in Greek, episteme, techne, phronesis - three approaches to knowledge.  At Rutgers Camden we divide knowledge up differently:  Rutgers Camden requirements.  How does social science differ from the other categories?  Some sociologists like to think of us as a science similar to chemistry or physics; others see us as closer to history or journalism.  The latter conceptions might make us exempt from human subjects regulations, if we are not doing research aimed at generalization.  But we do not want to give up the hope of establishing generalizations.

Social science begins with concepts, as do other fields such as philosophy and even mathematics, if we recognize that numbers are concepts.  The small integers are especially important, especially zero and one (or nothing and something).  Religion may also start with concepts.  The Bible says, "In the beginning was the Word, and the Word was with God, and the Word was God."  What does that mean?  Ask a theologian.  Religious concepts are good if they provoke spiritual reflection, as in reciting a mantra in Buddhism.  Literary concepts are good if they are beautiful, which social science concepts seldom are.   W.H. Auden's poem "Under Which Lyre" is an aesthetic attack on social science and other applied sciences.
 In social science, a concept is good if it helps us to understand empirical reality.  A good concept leads to useful generalizations or theories.  Theories are general statements about relationships between concepts that reflect how people think and behave.  A good concept can also be operationalized, which means finding indicators to measure it.  A very common way of operationalizing a concept is to write a survey question.  Other concepts may be operationalized by observation, by physical measurement, or by counting things.  In criminal justice, concepts are often operationalized by having police officers fill out reports on incidents.  We can find a good list of sociological concepts by going to survey research archives, where concepts are translated into survey questions.  Check the General Social Survey and the Eagleton poll.  Criminal justice concepts can be found on the Bureau of Justice Statistics WEB site.
  There are also bad concepts.   For an example of one I think is bad, click on virtropy.  What's wrong with this concept?  Recently there has been some controversy over "race" as a concept.  Some people say races do not "really" exist.  Biologically, that is true if by "exist" you mean that people fall into distinct categories.  Physical differences exist with regard to skin color and other traits, but they are distributed continuously, not in distinct categories.  Sociologically, racial differences exist and are important.  The people who say they do not "exist" are usually in favor of using them for affirmative action programs, or even for reparations, so they concede that they have sociological meaning.  That meaning differs from society to society, and may change over time.  The growth of the Hispanic population in the US is forcing a change in how we think about this.  Census Racial Categories.  Census Document on Racial and Ethnic Categories.  Racial categories in Latin America.

Other concepts we can consider are:  poverty, power, crime, murder, race, IQ, liberalism/conservatism, homelessness.  Or we could look at Personality Types as defined by Carl Jung and measured by Isabel Myers-Briggs.

September 10:  No regular class was held.  A human subjects movie was offered, along with a laboratory session.

September 8:  Ethics of research with human subjects.  We went through the material in the course on WebCT.

September 1:  Introduction to the class.  WebCT and Microcase.