Class Notes for Methods and Techniques of Social Research, Spring 2004 

 

See the Schedule and Assignments Page for a daily class schedule.  These class notes include some material prepared to be shown on the screen in class and some notes typed in class.  They are not intended to cover everything said in lecture.

May 3: 

Review for Final.  The exam is 9 to 12 officially, but we should be done by 11.  Please come promptly at 9.  Bring a  pencil with an eraser and a calculator.  The exam will be similar to the two midterms, just longer.  It will have 96 multiple choice questions and three pages of math questions including percentages, expected frequencies, margins of error, mean scores, regression, path diagrams and standard deviation.  I will put the formulas for margins of error and the standard deviation on the board as reminders.

You must complete the Percent Review and Review for Final WEBCT quizzes before the test.  These count as assignments, and you can take them as often as you like.

A good way to review is to go over these class notes and the review glossaries at the end of each chapter in the textbook.  Also, several of the semester's WEBCT quizzes will be reopened May 4 and 5.  This is a chance to raise your score, as well as to use the quizzes for review.

Here are some points covered with multiple choice questions:

April 28:

Content Analysis - uses "unobtrusive data," that is, data created by a bureaucratic system (e.g., police records) or, often, by the media.  We study television or newspapers either because the media themselves are our interest, or as a way of getting information, e.g., on crime reported in the news.

Similar to survey research, except that you do coding instead of interviewing.  Coding means that you assign numbers to phenomena that you observe.  Counting things.  Each of your variables is coded from the published information.

Conceptualization.
Measurement.  Reliability and Validity.

Manifest Content - what it's about on the surface
Latent Content - things that we infer about the content, e.g., does the writer sound angry?  Indignant?  Sexy?
Examples:  A Content Analysis Study of Editorial Cartoons.   A Content Analysis of Internet-Accessible Written Pornographic Depictions.   Links to Content Analysis studies.

April 26:

Experimental Research.   Experimental Designs.  See the graphs in the book or on Trochim's WEB site:  Types of Designs

Essential characteristics:

  1. Two or more groups are matched, usually by random assignment, sometimes by a kind of stratified random selection, e.g., an equal number of men and women or blacks and whites in each group.  But the key is random assignment so that the groups can be assumed to be the same on all variables.  "Quasi-experiments" are when we use groups that are pretty much the same but we didn't assign people at random.
  2. The Independent Variable is "manipulated," i.e., it is applied to one group and not to the other.
  3. Change in the Dependent Variable is measured.
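Random assignment can be sketched in a few lines of Python.  This is only an illustration:  the function name `randomly_assign`, the subject IDs and the fixed seed are assumptions made for the example, not part of the course materials.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seed only makes the illustration repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs 1 through 20
experimental, control = randomly_assign(range(1, 21), seed=2004)
print("Experimental group:", sorted(experimental))
print("Control group:     ", sorted(control))
```

Because chance alone decides who lands in which group, the two groups should differ only by chance on every variable, measured or not.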
Experiments can be done:
  1. In laboratory settings with volunteers, e.g., student volunteers.  A well-known example is Stanley Milgram's Obedience Experiment, described at http://www.new-life.net/milgram.htm
  2. In institutional settings such as prisons, hospitals, rehabilitation centers, etc., where people are assigned to treatment groups
    1. New drugs and medical treatments generally must be shown to work in experiments before they are approved for use.  Often, treatment is compared to a placebo.  These experiments are usually "double-blind," to control for the psychological effects of knowing one is getting treatment.  This is a way of controlling subject bias and experimenter bias.
    2. In criminal justice, one might do an experiment comparing a "halfway house," a drug treatment program and a prison term for offenders.  To do this, you would have to get the judge to assign offenders to different programs at random.  Ethical issues are raised here and there are likely to be objections.
  3. Occasionally in natural settings, for example
    1. A welfare reform experiment assigned some recipients to the old program and some to the new.  This didn't work very well:  there were errors in the group assignments, and the women often forgot which group they were in anyway.
    2. vaccination experiments
    3. guaranteed annual income experiments
Although logically experiments are the most rigorous way to test causal hypotheses, there are practical problems:


April 23: 

The Review Glossary is not adequate as a guide to this chapter.  Some points to be covered:

  Some examples of field research:
Margaret Mead, the only anthropologist (or sociologist) to get her own postage stamp, won fame through field work, primarily her book Coming of Age in Samoa.  Later, this book was denounced by anthropologist Derek Freeman in his book Margaret Mead and the Heretic : The Making and Unmaking of an Anthropological Myth.  Anthropologists have come to Mead's defense, and have restudied the case, but I would have to agree with your text that "had Mead come back from Samoa with an accurate ethnographic report, it would not have made her famous."  Here is the NY Times Review of Freeman's critique of Mead.
     More recently, there has been a raging controversy about the book Darkness in El Dorado, on research among the Yanomamo in Venezuela.  This is the latest ethical controversy, and it also raises important methodological questions.  Many of the book's allegations, however, have been contested by the National Academy of Sciences.
  The combining of fiction with factual research is increasingly common both in anthropology and in biographies.  Sometimes this is openly done as a literary form, in other cases such as that of Rigoberta Menchu, it is only admitted when critics discover it.   The Rigoberta Menchu Controversy by Arturo Arias. 
There are many problems with field research:  ethical issues, problems of reliability and validity when data are gathered by only one researcher, etc.  A controversial book is Laud Humphreys's Tearoom Trade, which raises ethical issues.  He studied gay sex in a men's room in a park in St. Louis, without informing the participants what he was doing.
    Field researchers sometimes seem to find examples that fit their preconceptions, and their work is often ignored by those who do not like the results, e.g., Leon Dash's books When Children Want Children and Rosa Lee, which are just ignored by welfare advocates who prefer more sympathetic treatments.  One of the best field studies is Kathryn Edin's book Making Ends Meet, which is highly sympathetic to the mothers.  However, Edin collected statistical data as well as her illustrative observations.  The statistics showed that almost none of the mothers actually lived off their grants alone.  Elijah Anderson's book Streetwise, on men in a Philadelphia ghetto, has been well received, in large part because it goes beyond one-sided advocacy.
    A great strength of field work is observing behaviors that the people themselves don't understand or aren't even aware of, or at any event, are unable or unwilling to talk about.  Anthropologist Jules Henry spent a week living in each of the homes of several children who had grown up mentally ill, trying to discern patterns in the family interactions that contributed to the illness.   Myra Bluebond-Langner of our Anthropology Department wrote a classic, The Private Worlds of Dying Children, which has been very influential;  she has just published a sequel called In the Shadow of Illness : Parents and Siblings of the Chronically Ill Child.    Field research offers a richness of description and possibility of new insights that is unparalleled by any other method.  Unless it is supplemented with other methods, it does not provide statistical data, and it is hard to replicate.

April 21:
  Chapter 7 on Survey Research.    How Polls are Conducted (Gallup).   Questionnaire Design.   Questionnaire Construction.   Interviewing Guidelines.   Interviewing Techniques.   Questionnaire from alumni survey.   Preliminary Report on 2003 Alumni Survey.

April 19:

Research Design.  How research is organized or structured to accomplish different ends.


Purpose of Study:  Exploration - To get some new ideas, or at least ideas that are new to you.

  Preferred Design:
    1. Literature Review - library research
    2. Secondary Analysis - Using data that is already being collected by a country, a government office, a company.  Criminal justice systems generate a lot of data for their own purposes.  You are limited to the questions someone else designed and asked.
    3. Field Observation - Go into the natural setting and observe what is going on.  You may talk to people and ask questions as well, but the really unique aspect is observation.
    4. Focus Groups - Group interviews lasting about an hour and a half.
    5. Case Studies - based on documents, interviews or sometimes observations
  Advantages/Disadvantages:
    1. Get insights of others.  Avoid reinventing the wheel. / Tends to repeat the past, not generate new ideas.
    2. Access a tremendous amount of information quickly and cheaply. / Limited to the questions asked by others.
    3. Get new insights in natural setting. / Difficult and time consuming, small sample.  Access difficult.
    4. Detailed, inductive, subtle understanding of patterns. / Difficult to generalize.

Purpose of Study:  Description - To get accurate and relatively precise information, especially about large groups or

  Preferred Design:
    1. Secondary Analysis - Data banks of surveys are available, many other kinds of data also.
    2. Surveys - Questionnaires or interviews.  Often on the telephone.
    3. Content Analysis - Looking at media as a source of data:  tv shows, letters to the editor, newspaper articles.  Written documents.  You can go back in time.
  Advantages/Disadvantages:
    1. Excellent data, especially for trends over time. / Limited to questions asked by others.
    2. Ask your own questions, choose your own sample. / Time consuming or expensive.  Limited to topics people can answer accurately.
    3. Unobtrusive, allows study of media. / Limited to topics that involve published media.

Purpose of Study:  Explanation - To answer questions about cause and effect.

  Preferred Design:
    1. Experiment - In an experiment we manipulate the independent variable.  The independent variable is the "cause."  Then we measure the dependent variable or "effect" both before and after on the experimental and control groups.
    2. Multivariate Statistical Analysis of Survey Data
  Advantages/Disadvantages:
    1. Best method of proving causal relationships. / Hard to maintain rigor of design (internal validity) and to generalize beyond the limits of the experiment (external validity).  Serious ethical and practical limitations.
    2. Can use survey and secondary data and address a wide range of important topics. / Data sets must include good measures of all relevant variables and a wide range of data.  Not valid unless the models can be shown to predict trends in fresh data.  Most useful for making predictions to be evaluated with fresh data.


April 12:  SAMPLING is used when we are interested in studying a population that is too large for us to study each individual.  The first step is to define the population we wish to make statements about, e.g., adults in New Jersey, probable voters, people convicted of felonies, graduates of our department.  We might want to study the entire population of the USA.  If we try to collect data from everyone, this is a census.  The Census Bureau does this once every decade, and misses a lot of people.  Everyone else does sampling:  we select a cross-section to represent the population.  If you try to study the whole population, you often fail to do a good job.   Gallup:  How Polls are Conducted.

Size of the sample.  How big of a sample do I need? Size of the sample does not depend on the size of the population.

How do we select the sample size?  Decide on the margin of error you will tolerate.  The margin of error is approximately equal to one divided by the square root of the sample size.  With a sample of 400, the square root is 20, and 1/20 = .05 or 5%.  Suppose you interviewed 400 people:  300 were white, 50 were black and 50 were others.  For the blacks, with a sample of 50, we would have a 14% margin of error.  For the whites, with a sample of 300, we would have a 5.8% margin of error.

Take 300:  the square root of 300 is 17.32,  1/17.32 = .0577,  and .0577 * 100 = 5.8%.
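The arithmetic above can be checked with a short Python sketch.  The function name `margin_of_error` is just a label chosen for this example;  the rule it applies is the class rule of thumb m = 1/sqrt(n).

```python
import math

def margin_of_error(n):
    """Rule-of-thumb margin of error (as a proportion) for a sample of size n."""
    return 1 / math.sqrt(n)

# The three sample sizes used in the notes: whole sample, whites, blacks
for n in (400, 300, 50):
    print(f"n = {n:3d}   margin of error = {margin_of_error(n) * 100:.1f}%")
```

Note how slowly the margin shrinks:  the sample has to be made four times as large to cut the margin of error in half.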

Sample statistic - what the sample says
population parameter - what the real figure is
Even if the sampling is done well, the response rate is less than 100%.
Weighting is done to make the sample more like the population.

   m = 1/sqrt(n)    Solve for n:      m^2 = 1/n      n * m^2 = 1     n = 1/m^2    If we need a margin of error of 3%, or .03:   n = 1/.03^2 = 1/.0009 = 1,111 (rounded)

  If you have a sample size and need to know the margin of error, use    m = 1/sqrt(n)

   If you are given a margin of error and asked how large a sample you need, use  n = 1/m^2

          In these formulas n = the size of the sample (not the population).    m = the margin of error expressed as a proportion, not as a percent.  Thus, if the question says "we need a margin of error of 5%," then m = .05.

If our sample is stratified, this means we really have several sub-samples and we need the same size sample for each of them, regardless of each group's size in the population.  For example, if we want to sample white, black and Hispanic respondents and make statements about each group, we need the same size sample of each group regardless of its size in the population.  Thus, if we need a margin of error of 5% for each of the three groups, then the answer is  3 * (1/m^2) = 3 * 400 = 1,200.
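Going the other way, from a margin of error to a sample size, can be sketched as follows.  The function names are made up for this example, and rounding to the nearest whole person is an assumption (some texts always round up).

```python
def sample_size(m):
    """Sample size needed for margin of error m (a proportion, e.g. .05)."""
    return round(1 / m ** 2)

def stratified_sample_size(m, groups):
    """Total sample when each of several groups needs margin of error m."""
    return groups * sample_size(m)

print(sample_size(0.05))                 # one group, 5% margin
print(sample_size(0.03))                 # one group, 3% margin
print(stratified_sample_size(0.05, 3))   # three groups, 5% margin each
```

Remember to enter the margin of error as a proportion (.05), not as a percent (5).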

Terms:

Margin of Error:  How much a sample statistic is likely to vary from the population parameter.  We say that we are 95% sure that the sample is not off by more than the margin of error.  How this is presented in NY Times.  "19 out of 20" is another way of saying 95%. 

 Confidence level:  we always use a 95% confidence level.

Confidence interval:  the range within which we think a statistic would fall, e.g., if the margin of error is 3% and the sample statistic is 67%, the confidence interval is from 64% to 70%.  We are 95% sure that the true figure is within this limit.
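A confidence interval is just the sample statistic plus and minus the margin of error.  The sketch below works in percentage points;  the function names are assumptions for this example, and the second function uses the class rule of thumb m = 1/sqrt(n).

```python
import math

def confidence_interval(statistic, margin):
    """Return (low, high), both in the same units as the inputs."""
    return statistic - margin, statistic + margin

def interval_from_sample(percent, n):
    """95% confidence interval, in percent, using m = 1/sqrt(n)."""
    margin = 100 / math.sqrt(n)
    return confidence_interval(percent, margin)

print(confidence_interval(67, 3))      # the example in the notes: 64 to 70
print(interval_from_sample(67, 400))   # same statistic from a sample of 400
```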

March 31:  We will discuss path analysis and interpreting regression models, following the textbook and the discussions in:
  A Brief Intro to Path Analysis. A longer introduction with more examples.   A more technical  Intro to Path Analysis

For the exam, you should know how to set up the regression equations to fit a path diagram.  The rules are as follows:

  1. There should be a regression equation for each variable that has an arrow pointing towards it.
  2. For each equation, the variable having arrows pointing into it is the dependent variable, and goes to the left of the equals sign.
  3. For each equation, the variables to the left of the dependent variable in the diagram that have arrows pointing into it are the independent variables.  These are listed to the right of the equals sign and connected with + signs.
  4. There is no need to include an intercept, because we are interested only in the standardized regression equations or beta weights.
An example.  Suppose we have the following diagram:

Perot Vote Input Path Diagram

For this diagram, we would need the following equations:

vote for perot =  alienation from government + alienation from society + finances worse

    alienation from government = status deficiency

     alienation from society =  status deficiency
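The rules for turning a path diagram into regression equations can be sketched in Python.  The list of arrows below is read off the three equations above (the diagram image itself is not reproduced here), and the function name `regression_equations` is made up for this example.

```python
def regression_equations(arrows):
    """Group (cause, effect) arrows by the variable each arrow points to.

    Every variable with at least one arrow pointing into it becomes the
    dependent variable of its own equation.
    """
    equations = {}
    for cause, effect in arrows:
        equations.setdefault(effect, []).append(cause)
    return equations

arrows = [
    ("status deficiency", "alienation from government"),
    ("status deficiency", "alienation from society"),
    ("alienation from government", "vote for perot"),
    ("alienation from society", "vote for perot"),
    ("finances worse", "vote for perot"),
]

equations = regression_equations(arrows)
for dependent, independents in equations.items():
    print(dependent, "=", " + ".join(independents))
```

Variables with no arrows pointing into them (here, status deficiency and finances worse) never appear on the left of an equals sign.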

If we got measures for these variables from a National Election Survey (Status Deficiency would be an index we would have to calculate), we could use the Regression procedure in Microcase to enter the three regression equations and get Beta coefficients which we could put on the diagram, as follows:

Output Path Diagram

March 29:  Multiple Regression.  For predicting a dependent variable with one or more independent variables, we need both an "unstandardized regression coefficient" and an "intercept."  This is what we did in the Excel-Regression assignment.  Excel referred to them as the "Coefficients," one of which is the X-variable coefficient (the unstandardized regression coefficient), the other the intercept.  In that example, we used only one independent variable.  However, we could use more than one.  See the "Regression Analysis" in-class exercise.

Each of the unstandardized regression coefficients is on a different scale because it is designed to be multiplied with the independent variable with which it is associated.  If we want to compare them, we standardize them so they all vary from zero to one or minus one, like correlation coefficients (they are standardized by multiplying them by the ratio of the standard deviations of the IV and the DV, but we don't have to worry about this since the software does it for us).  These are called "standardized regression coefficients" or Beta Weights.  They are used in Path Diagrams because they are comparable:  the larger they are (in absolute value), the stronger the relationship between variables.
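To see what the software is doing, here is a minimal sketch with one independent variable.  The data (years of schooling and income in thousands) are invented for illustration, and the function names are assumptions for this example.  With a single IV, the Beta Weight works out to be the same as the correlation coefficient.

```python
from math import sqrt
from statistics import mean, stdev

def slope_and_intercept(x, y):
    """Unstandardized regression coefficient and intercept (least squares)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def beta_weight(slope, x, y):
    """Standardize the slope: multiply by sd of the IV over sd of the DV."""
    return slope * stdev(x) / stdev(y)

def pearson_r(x, y):
    """Correlation coefficient, computed directly for comparison."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical data: years of schooling (IV) and income in $1,000s (DV)
schooling = [8, 10, 12, 12, 14, 16, 16, 18]
income = [18, 22, 30, 27, 35, 44, 40, 52]

slope, intercept = slope_and_intercept(schooling, income)
beta = beta_weight(slope, schooling, income)
r = pearson_r(schooling, income)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"beta = {beta:.3f}, r = {r:.3f}")
```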

March 26:  We will continue with testing causal relationships through cross-tabulation.

Today we will look at testing causal hypotheses.  On page 93 in the text, we have the example of the relationship between Height and Liking Basketball.  Height is the IV and Liking Basketball is the DV.  An obvious TEST VARIABLE is Gender.  This would be an Antecedent variable:  Gender determines both your height and your liking for basketball.  We could draw this as a path diagram (on board).

When we introduce the control, we split the table into two parts, e.g.,

                             Males                  Females                 Total
                        Tall     Short           Tall   Short         Tall    Short

Likes BB           85%    85%            25%   25%         65%    45%
Does Not           15%    15%            75%   75%         35%    55%

Total                 100%  100%          100%  100%      100%   100%

In the real world, things are never this sharp.
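A short Python sketch shows how a table like the basketball example can arise.  The cell counts are hypothetical, chosen so that they reproduce the percentages in the table;  the notes give only the percentages.

```python
# Hypothetical counts: (gender, height) -> (likes basketball, does not)
counts = {
    ("male", "tall"): (170, 30),
    ("male", "short"): (85, 15),
    ("female", "tall"): (25, 75),
    ("female", "short"): (50, 150),
}

def percent_likes(height, gender=None):
    """Percent liking basketball for a height group, optionally within one gender."""
    cells = [v for (g, h), v in counts.items()
             if h == height and (gender is None or g == gender)]
    likes = sum(l for l, _ in cells)
    total = sum(l + d for l, d in cells)
    return 100 * likes / total

for gender in ("male", "female", None):
    label = gender or "total"
    print(f"{label:7s} tall: {percent_likes('tall', gender):.0f}%   "
          f"short: {percent_likes('short', gender):.0f}%")
```

Within each gender, height makes no difference.  The difference in the total column appears only because most of the tall people are male and most of the short people are female.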

Let's look at some real data, using FEAR WALK, PLACE SIZE and R.INCOME from the GSS data set:

In the total sample, the low income respondents are more likely to feel there are areas near them where they should fear walking.  However, this effect disappears for some of the respondents when we control for the size of the town in which they live.

To make it a finished Table:
                    Small Town or Rural      Small City           City/Suburb          Total
                    Low   Med   Hi           Low   Med   Hi       Low   Med   Hi       Low   Med   Hi

Fear Walk           30%   27%   24%          48%   42%   20%      56%   41%   43%      51%   39%   41%
No Fear             70%   73%   76%          52%   58%   80%      44%   59%   57%      49%   61%   59%
                    p = .710                 p = .043             p = .000             p = .000
                    N = 251                  N = 133              N = 1253             N = 1637
 

To build a more complete causal model of Fear of Walking at Night, we should introduce more variables.  Some of them may be in our data set, others not.

What variables should we look at?

Variables      Hypotheses
Gender           Females more fearful than males.
Age              Elderly more fearful, also Children.  Might be curvilinear.
Crime Rate       People in high crime communities more fearful.
Street Lighting
Freq of Patrols
Graffiti, Broken Windows, Trash, other indicators of an "out of control" neighborhood
Bicycles
Number of Pedestrians
Physical Shape
Training in Self Defense

We can examine some of these variables with our data.  We may find it useful to use regression rather than cross-tabulation.

March 24:  we will go through pages 114-122 in the workbook.

March 22: 

Causal Analysis - Chapter 5.

The Art and Science of Cause and Effect. (powerpoint)

Probabilistic cause, not an absolute cause, not a cause that is sufficient or necessary.   "Cigarette smoking causes
cancer."  What we mean is that smoking cigarettes increases the likelihood of getting cancer.  How much?

There are multiple causes for everything.  What we want to find out is how much each thing contributes.  There are also
causal linkages, or indirect causes.  A causes B and then B causes C.

Diagramming causal models.  We put the dependent variable at the right.  We draw arrows going into it for each causal
variable that affects it directly.  Then we can have arrows that go into those variables, adding steps to the causal
analysis, as in this sample file:
http://crab.rutgers.edu/~goertzel/homomale.htm

Criteria of Causation - how do we know that something is a cause of something else.

1.  Time Order.  The cause comes before the effect.  Sometimes we sort out the time order theoretically:  we assume that
education precedes employment.  Or we can use a research design that involves gathering data at two points in time.  If
you don't have measurements at two points in time, this is shaky.

2.  Correlation.  The two variables vary together.  When one is high, the other is high OR when one is low the other is
high.  This gets at the degree of causation:  the higher the correlation, the stronger the causal relationship.

3.  Non-spuriousness.  We want to know that the correlation is not caused by something else.  We can test this with an
experimental design, if feasible.  Or we can use statistical controls, which are not quite as convincing, but that is
all you can do in many cases.

We test for non-spuriousness by introducing controls.

Causal Models:  representations of the complex causal relationships between variables.  Variables have different causal roles, but this is determined by our causal model;  it is not inherent in the variables.   One person's cause can be another's effect.

Dependent Variable - that is what we want to explain.  Often these are opinions or behaviors

Independent Variable - what we use to explain it.  Often these are traits or physical characteristics, e.g., sex or race,
which are almost always independent.

If you study the effect of race on voting, for example, race would be independent and voting dependent.

Antecedent variables are things that come before the independent variable.  This helps us to deal with a causal chain:
the antecedent variable causes the IV, which causes the DV.
If the antecedent variable "explains" the relationship, we have an "explanation," and we say the original relationship is "spurious."

Intervening Variables are things that come between the IV and the DV, e.g., race determines ideology, which determines the vote.
This is an "interpretation" it tells WHY the causal relationship exists.
Path Models:  a way of graphically expressing complex causal models.

Example:  Determinants of Adult Homosexuality in White Males.

Example:  The Seattle Social Development Project. 

 

March 12:  we will meet in the BSB 108 computer lab for help with Microcase Professional and Excel.  This class is optional;  attendance will not be taken.  You should be able to complete the Excel Regression assignment during this class.

March 8 and 10:  More on trends and regression modelling, including multiple regression.

March 5:  Linear regression as a tool for data analysis.  Online regression applet.  We will learn to do regression in Microsoft Excel.  This is on the "tools"/"data analysis" menu.  You may need to install this from the CD-rom if you have Excel on your home computer.  Regression by Eye applet.

March 3:  We discussed time series analysis using the Historical Trends module in the professional Microcase software.  Details are on the Microcase Trends assignment page.

March 1:  Comparative Research Using Aggregate Units, Chapter 8 in the text.  This research method uses data about social or geographic units.  Consistent criminal justice statistics are important for evaluating CJ policies.  Thorsten Sellin, a professor at Penn, was instrumental in getting consistent CJ statistics established.  We can find examples on the Bureau of Justice Statistics WEB site.

Comparative methods are particularly useful for studying change because we can get data about trends over time.  Look, for example, at some Trend Graphs taken from the "Historical Trends" module in the Professional Microcase.  This is available in the computer center on the networked Windows computers (click on Statistics and Microcase on the Windows menu, then open "Microcase Curriculum Plan 2003-2004" and load the TrendSmp data set).  Our next exercise, after the Quiz on Workbook 8, will involve using this data set.

Some concepts:

Rate:  A statistic that reduces numbers to a common base.  The base is often, but not necessarily, the total population in an area.  If we are looking at voting participation, we might compute rates using the base of the number of adults 18 or over.  If we are trying to predict an election, we might use a base of registered voters. 

A crude birth rate is the number of births per 1,000 population.  Fertility rate is the number of births per female during her lifetime. 
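A rate is simply a count divided by a base and multiplied by a convenient constant.  The sketch below uses invented figures for a hypothetical area;  the function name `rate` is an assumption for this example.

```python
def rate(events, base, per=1000):
    """Number of events per `per` units of the base population."""
    return events * per / base

# Hypothetical area: 14,000 births in a population of 1,000,000
print("Crude birth rate:", rate(14_000, 1_000_000), "per 1,000 population")

# The same count of voters gives different rates with different bases
voters = 450_000
print("Turnout, adults 18 or over:", rate(voters, 750_000, per=100), "%")
print("Turnout, registered voters:", rate(voters, 600_000, per=100), "%")
```

The choice of base matters:  the same number of voters looks like much higher participation when the base is registered voters rather than all adults.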

Time Series analysis:  uses time periods as the unit of analysis, looks at how things change over time often in one case.   A lagged time series takes into account the time it takes for one variable to influence another, thus incarcerations in one year might be related to crimes in the next year.
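Lagging a time series just means pairing each year's value of one variable with a later year's value of the other.  A minimal sketch, with invented yearly figures and a made-up function name:

```python
def lagged_pairs(cause, effect, lag=1):
    """Pair cause[t] with effect[t + lag]; the last `lag` cause values drop out."""
    return list(zip(cause, effect[lag:]))

# Hypothetical yearly figures for one state, listed in year order
incarcerations = [100, 120, 150, 170]
crimes = [900, 880, 850, 800]

for inc, crime in lagged_pairs(incarcerations, crimes, lag=1):
    print(f"incarcerations this year: {inc}   crimes next year: {crime}")
```

With a one-year lag, four years of data yield only three usable pairs, so lagging always costs a few cases.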

 Cross-sectional analysis compares a number of cases at one point in time.

Reliability:  are statistics computed the same way in different geographic units or different time periods?  This causes all sorts of problems:  it is better to improve statistics, but doing so causes us to lose comparability.

Validity:  do the statistics measure what we want them to measure?  Crimes reported to the police are not a valid measure of the amount of actual crime, especially for crimes that are often not reported.

Case oriented vs. variable oriented.  The case oriented approach is more qualitative, although quantitative trend data can be used.  The variable oriented approach assumes that the same variables are causally related in the same way in a large number of cases, e.g., "capital punishment" and "homicide rates" in a number of states or countries. 

Outliers:  especially in variable-oriented research, it is important to look for exceptional cases that are very different from the norm.  These tend to cause a disproportionate impact on our results. 


February 27:  Quality of Measures -
   Reliability -  you get the same thing over and over.  Consistency.
         inter-rater - two different raters get the same answer.
         test-retest, if you take it twice the answers are the same.
           internal consistency - are the items on a test consistent?  Cronbach's alpha is a statistic that measures inter-item reliability.
    Validity  is it "really" measuring what it is supposed to measure.
          Face Validity - does it look right?
          Predictive or criterion validity - does it predict what we want to predict, some "true" measure.  SAT test predicts college or law or medical school grades.
          Convergent validity -  do several measures give the same result.
             
          Construct validity - does the measure perform as our theory says it should.  We use this when we have no criterion.
  
This is the most difficult, it is used when things are inherently difficult to measure. 

                  An example:  a study of UFO Abduction Status.
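For the internal consistency statistic mentioned under Reliability, here is a minimal sketch of the standard Cronbach's alpha formula.  The scores are invented (three items answered by five respondents), and in practice the software computes this for us.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per test item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's total
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores: three items, five respondents
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 1, 3],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha gets closer to 1 as the items move together, that is, as respondents who score high on one item also score high on the others.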
            

February 25:  Measurement Chapter 3 in both books

Variables are characteristics or aspects that take different values among the units of analysis being studied.

In a questionnaire, often each question is a variable, but if it has a lot of choices, they may each be a variable.

Are you Democrat, Republican, Independent or what?  One variable with three values.

Which of the following foods did you eat last week (check all that apply)
1. spaghetti
2. soup
3. artichoke hearts
4. hamburger
5. chicken

This would be a series of variables: 
Spaghetti - yes or no
Soup - yes or no
Hamburger - yes or no
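Coding a "check all that apply" question as a series of yes/no variables can be sketched like this.  The function name `to_dummies` and the sample answer are made up for the example;  1 stands for yes and 0 for no.

```python
FOODS = ["spaghetti", "soup", "artichoke hearts", "hamburger", "chicken"]

def to_dummies(checked):
    """Turn the set of foods one respondent checked into one 0/1 variable per food."""
    return {food: int(food in checked) for food in FOODS}

# Hypothetical respondent who checked soup and chicken
respondent = to_dummies({"soup", "chicken"})
for food, value in respondent.items():
    print(f"{food:17s} {value}")
```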

Some variables are natural dichotomies, such as Gender (male or female), Age (Child, Adult) or Opinion on an Issue (agree, disagree).  Or at least we choose to think of them that way.  We might think differently:  an opinion could be (strongly agree, agree, undecided, disagree, strongly disagree).  Or it could be, rate your opinion on a scale from 1 to 10.

Levels of Measurement.  What is our measurement really saying about the relationship between the values?

Dichotomous Measurement -   Two and only two categories.  Can be a natural dichotomy or a set of "dummy variables":  we take a complex variable and divide it into a series of dichotomous variables.

Nominal Measurement.  Categories that could be put in any order.
      Catholic, Protestant, Jewish, Moslem, LDS, Buddhist, Episcopalian, Baptist
                       variable one, category of religion, variable two denomination.
            Illnesses:    adjustment disorder, borderline personality disorder, paranoid schizophrenic
               Crimes:   burglary, assault,  

  Each individual should go into one and only one category on a variable, one value on a variable.  
For example:  What is your favorite food, we have a long list, but each person is allowed only one.
       Sorting people into categories must be reliable and accurate or valid.

Ordinal Measurement.   Here we have categories in a logical order.       Very short, short, medium, tall, very tall.  Often we take continuous variables and make them ordinal.    Income:   Under $20,000   $20,000 to $40,000   $40,000 to $60,000   $60,000 plus.

Interval Measurement:   Temperature in Fahrenheit or Centigrade.  The intervals between values are equal, but 0 degrees is not the absence of heat, so ratios are not meaningful.  How about the day that the "temperature doubled" in New York City?

Ratio Measurement:    Income in dollars:  a continuous numerical value PLUS a meaningful zero point.  Height in inches.
 
Scaling is when we use a number of measures, such as test scores or questionnaire items, to measure a more general concept.  We can do this by adding them up (in which case your text would call it an "index," although many people still use the term scale), or they may be ordered from lowest to highest (in which case it is a true scale as the term is used in your book).  Your test is an example.  I just add up the points, to measure the general variable "knowledge of research methods as covered in the first part of the course."  Another approach would be to rank the items from easy to hard and see which you could do.  This is tricky, because some people can do the hard ones and not the easy ones.  When we make an index or scale, we get measures that can be treated as interval, even if they are not strictly interval.  Scaling methods can be more precise, but these are not used much in sociology or CJ.  For example, we could scale the seriousness of crimes.  There are various methods of measuring this.  One of them, paired comparisons, means asking a sample of people to compare crimes in pairs and judge which is more serious.

February 11:

Today we will begin with Amar Patel's Chi-Square lesson.   This covers the concept of expected frequencies and observed frequencies, and introduces the concept of "fairness", the difference statistic and the chisquare statistic.  These are applied to problems where the expected frequencies are given by a null hypothesis of "fairness".

We can apply this to any distribution where we have a theoretical reason to expect a certain result.  E.g., with two dice, each with six sides.  What results are possible and what likelihood do we have?

  1.            (not possible with two dice)
  2.   *        Snake-eyes! (1 and 1)
  3.   **       (1 and 2; 2 and 1)
  4.   ***      (1 and 3; 3 and 1; 2 and 2)
  5.   ****     (1 and 4; 4 and 1; 3 and 2; 2 and 3)
  6.   *****    (1 and 5; 5 and 1; 4 and 2; 2 and 4; 3 and 3)
  7.   ******   (4 and 3; 3 and 4; 5 and 2; 2 and 5; 6 and 1; 1 and 6)
  8.   *****    (4 and 4; 5 and 3; 3 and 5; 6 and 2; 2 and 6)
  9.   ****     (5 and 4; 4 and 5; 6 and 3; 3 and 6)
  10.  ***      (5 and 5; 6 and 4; 4 and 6)
  11.  **       (6 and 5; 5 and 6)
  12.  *        Boxcars! (6 and 6)

Suppose we try real dice 36 times and see what we get:

Total Expected Observed
2 1
3 2
4 3
5 4
6 5
7 6
8 5
9 4
10 3
11 2
12 1
(The Observed column is left blank.)

We can compute the chi-square with GraphPad QuickCalcs on the Internet.
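For anyone who wants to see the arithmetic behind the online calculator, here is a rough sketch in Python (not part of the class software).  It counts the 36 combinations to get the expected frequencies for 36 rolls, then computes the chi-square.  The observed counts are invented for illustration only; they are not results from class.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) combinations give each total.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

rolls = 36
# Expected frequency of each total under the "fair dice" null hypothesis.
expected = {total: rolls * ways[total] / 36 for total in range(2, 13)}

# HYPOTHETICAL observed counts, made up for this example (they sum to 36).
observed = {2: 2, 3: 1, 4: 3, 5: 5, 6: 4, 7: 7, 8: 4, 9: 5, 10: 2, 11: 2, 12: 1}

# Chi-square: squared difference over expected, summed across all totals.
chi_square = sum((observed[t] - expected[t]) ** 2 / expected[t] for t in expected)

for t in range(2, 13):
    print(t, expected[t], observed[t])
print("chi-square =", round(chi_square, 3))
```

With 11 possible totals there are 10 degrees of freedom.  Note that most of the expected counts here are below 5, which is smaller than the usual rule of thumb for a chi-square test, so a real test would need more rolls.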

We will then apply the same statistic to crosstabulations where the expected frequencies are determined by the marginal frequencies.  Last class we worked with observed frequencies, row percent, column percent, and total percent.  Today we will compute expected frequencies for each cell in a cross-tabulation table, and show how the difference statistic and chisquare statistic are computed. 

We will use a simple 2 by 2 distribution as follows.  The variables are gender and opinion on an issue, each of which has two values:

25 men agreed
17 men disagreed
65 women agreed
30 women disagreed
 
 

Observed Frequencies (or Obtained Frequencies) Men Women Total
Agree 25 65 90
Disagree 17 30 47
Total 42 95 137

    We can compute expected frequencies, based on the null hypothesis that men and women do not differ in their opinions.  We can compute these knowing only the marginal or total frequencies.  The easy way to compute them is to multiply the row total for each cell by the column total for that cell, then divide by the grand total.  Another way would be to convert the row totals to proportions, then multiply them by the column totals.  Expected frequency = rt * ct / gt (row total times column total, divided by grand total).

 
Expected Frequencies Men Women Total
Agree 90*42/137=27.59 90*95/137=62.41 90
Disagree 47*42/137=14.41 47*95/137=32.59 47
Total 42 95 137
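The same rt * ct / gt rule can be sketched in a few lines of Python.  This is only an illustration of the hand calculation, using the gender and opinion counts from the observed frequencies table; the variable names are my own.

```python
# Observed frequencies: rows are Agree, Disagree; columns are Men, Women.
observed = [
    [25, 65],
    [17, 30],
]

row_totals = [sum(row) for row in observed]          # 90 and 47
col_totals = [sum(col) for col in zip(*observed)]    # 42 and 95
grand_total = sum(row_totals)                        # 137

# Expected frequency for each cell = row total * column total / grand total.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

for label, row in zip(("Agree", "Disagree"), expected):
    print(label, [round(value, 2) for value in row])
```

Each row of expected frequencies adds up to the same row total as the observed table, and each column to the same column total, which is a quick way to check your work.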

  What would we get if we used the expected frequencies to make a column percentage table?  The percentages would be the same in each column (except for rounding error).  That is the point of expected frequencies:  they are the frequencies we would get if all the columns had the same percentage distribution.

Percents Computed from Expected Frequencies Men Women Total
Agree 65.7% 65.7% 65.7%
Disagree 34.3% 34.3% 34.3%
Total 100% 100% 100%

We can use the expected frequencies to compute the "difference statistic" as described by Patel.  This tells us how much each cell is off from what was expected.  As you can see, each cell is off by 2.59, in either the positive or the negative direction.   This is a rough measure of how much our observations differ from the expected, plus or minus 2.59, but it is not widely used.  The sum of the differences is zero because the negatives cancel out the positives.

The statistic that is used is the chi-square statistic.  This is designed to give more weight to bigger differences and to make all differences positive so they can be added up to a number that can be used for probability testing.  We have probability distributions for chi-square, which enables us to tell the likelihood that the difference could have appeared by chance.   Chisquare is computed by squaring the differences between the observed (Fo) and expected (Fe) for each cell, then dividing them by the expected for that cell, then adding them up.

To get the chi square, we add up the computations for each cell:  .2431+.1075+.4655+.2058 = 1.0219 (with the differences rounded to 2.59; carrying more decimal places gives 1.023).   Programs such as MicroCase compute this for us.  We can also get the chi square by typing the observed frequencies into the WEB chisquare calculator (using the version without the "Yates correction").  The result is 1.023.  The computer tells us that the result is not "statistically significant" by the chi-square test.  In the days before computers, we looked these up in a table in the back of a statistics book.
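Here is a minimal Python sketch of what the calculator is doing for our 2 by 2 table, without the Yates correction.  The function name is my own.  The probability line relies on the fact that, for one degree of freedom only, the chi-square probability can be obtained from the standard library function math.erfc; for larger tables you would need a statistics package or a printed table.

```python
import math

def chi_square_2x2(observed):
    """Return (chi-square, p-value) for a 2 by 2 table, no Yates correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, rt in enumerate(row_totals):
        for j, ct in enumerate(col_totals):
            fe = rt * ct / grand_total             # expected frequency
            fo = observed[i][j]                    # observed frequency
            chi_square += (fo - fe) ** 2 / fe      # (Fo - Fe) squared, over Fe

    # Valid for 1 degree of freedom only (a 2 by 2 table).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

chi_square, p_value = chi_square_2x2([[25, 65], [17, 30]])
print(round(chi_square, 3), round(p_value, 2))
```

The chi-square comes out at about 1.023, and the probability is well above .05, so the difference between men and women could easily have appeared by chance.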

To see these tables, open the EXCEL 2 by 2 chi-square calculator I have prepared. It has all the tables:  observed frequencies, row percents, column percents, total percents, difference statistic, chi square.  In this spreadsheet, if we change the numbers in the observed frequencies table, the other numbers will change accordingly.

We do not normally compute these statistics by hand, so I have stopped requiring students to compute a chi-square as a test question.  However, it is important to understand what the computer is doing and what the results mean.  I do ask you to compute row, column and total percents and expected frequencies with a hand calculator, since this is not arduous.   On Friday, we will do some practice questions in class.  Hiten will also do practice sessions with anyone who needs help with this on Friday, Feb 20.

February 9:   Survey data are largely nominal or categorical, which means that each question has a small number of distinct answers (often just two or three), rather than continuous.  Continuous variables vary on a scale with a large number of values.  This includes things such as height and weight if measured in inches or pounds, or rates of all kinds, e.g., crime rate, divorce rate, birth rate.  Votes can be continuous, e.g., if you say 56% voted for Kerry, 26% for Dean, etc.  However, if you ask how an individual voted, there is a distinct set of categories:  Kerry, Dean, Kucinich, etc.  A continuous variable can be collapsed into categories.  A categorical variable can also be converted to continuous when you are talking about a large population, e.g., the categorical votes of a number of individuals can be converted to percents voting for each.  So to work with survey data we need to understand "per cent" and the different ways of computing and using percents.  "Cent" means 100.  Per cent is a ratio, with the denominator being 100.  A rate.  We have other rates, such as per 1000 or per 100000 or even per million or per billion.

The problem with percents is knowing the base, and how it adds to 100.   What are the other components of the total?

men 55 agreed
women 33 agreed
men 27 disagreed
women 42 disagreed

  The first thing we do is put these into a contingency or cross-tabulation table.  We usually put the independent (or causal) variable in the column and the dependent variable in the row.  It is best not to have too many categories on either variable, unless you have a very large number of cases.  This is the smallest possible table, a 2 by 2 table.
 
Observed Frequencies men women Total
Agree 55 33 88
Disagree 27 42 69
Total 82 75 157

There are three ways to do the percents.

In the row percent, the base is the total for the row.
In the column percent, the base is the total for the column.
In the total percent, the base is the grand total.

1. What percent of the men agreed?  
2. What percent of the women disagreed? 
3. What percent of those who agreed were men?  
4. What percent of those who disagreed were women? 
5. What percent of the respondents agreed? 
6. What percent of the respondents were women?
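As a check on hand-calculator answers to the six questions above, here is a short Python sketch.  It is only an illustration (the names are my own); the point is that each question differs only in which total is used as the base.

```python
# Observed frequencies: rows are opinions, columns are genders.
observed = {
    "agree":    {"men": 55, "women": 33},
    "disagree": {"men": 27, "women": 42},
}

row_totals = {opinion: sum(cells.values()) for opinion, cells in observed.items()}
col_totals = {g: sum(observed[o][g] for o in observed) for g in ("men", "women")}
grand_total = sum(row_totals.values())

def percent(part, base):
    """Return part as a percent of base."""
    return 100 * part / base

answers = {
    "1. men who agreed (column percent)": percent(observed["agree"]["men"], col_totals["men"]),
    "2. women who disagreed (column percent)": percent(observed["disagree"]["women"], col_totals["women"]),
    "3. agreers who were men (row percent)": percent(observed["agree"]["men"], row_totals["agree"]),
    "4. disagreers who were women (row percent)": percent(observed["disagree"]["women"], row_totals["disagree"]),
    "5. respondents who agreed (total percent)": percent(row_totals["agree"], grand_total),
    "6. respondents who were women (total percent)": percent(col_totals["women"], grand_total),
}

for question, value in answers.items():
    print(f"{question}: {value:.1f}%")
```

Questions 1 and 2 use a column total as the base, 3 and 4 use a row total, and 5 and 6 use the grand total of 157.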

Here is the kind of table we would put in a report.  It gives the column percents because the column variable is the Independent Variable.  For most purposes, the percents are based on the Independent Variable:

 
 
Column Percents Men Women Total
Agree 67.1% 44.0% 56.1%
Disagree 32.9% 56.0% 43.9%
Total 100% 100% 100%

Next class we will do expected frequencies and chi-square.  There are some expected frequency questions on the WEBCT Quiz for Exercise 2b.


February 4:
Discussion of designing research projects.  How do we decide what to study?  Supplementary reading in Trochim on the structure of research.  You may prefer his "hourglass" metaphor to the circular one on page 14 of our textbook.

  1. Selecting a topic.  Typical motives include:
    1. Finding out something we don't know.  This may include something local, e.g., what do people in Camden think about the new Governor's actions, something that has been unresolved in earlier research, something that hasn't been studied because it is new, etc.  This is what the authors of your book mean when they say "research always starts with wondering."
    2. Another purpose that motivates research is proving to other people that what we "know" is true really is true.  This is "advocacy" research, and it can be very one-sided and lead to sloppy work.  Often this involves causal arguments, proving "why" something happens.  This kind of research may not start with "wondering" but with "arguing."
    3. Answering a question posed to us by our employer or by a client, applied research.  Here someone else really chooses the topic.
  2. Formulating a Research Question.  This means formulating a "statement" which will involve variables.  We have an argument or story in mind at this point.
  3. Defining the Concepts.  Usually not a lot of time goes into this stage of empirical research, but some people do write articles focusing on this, e.g., what does "race" or "poverty" mean, or what is the difference between "sex" and "gender"?  An example:  the measurement of romantic love.
  4. Operationalizing the Concepts.  A lot of effort goes into this.  Quantitative  research means you have to measure your variables and a lot depends on having good measurement.  Sometimes this is difficult, e.g., measuring "intelligence" or "liberalism-conservatism" or "mental illness" or "crime rates (various kinds)".  Often we use standard measures created by the government agencies that collect statistics.
  5. Formulating Hypotheses.  This is usually pretty easy.  There is a distinction between "null hypotheses" and regular hypotheses, which is explained on page 13.  Working with a null hypothesis means testing the hypothesis that your hypothesis is not true.  Thus, you hope to "reject the null hypothesis" rather than "accept the (regular, not-null) hypothesis".  The usual name for the opposite of the null is the "alternative" or "research" hypothesis.  Type One Error:  accepting that a relationship exists when it doesn't.  Type Two Error:  rejecting a relationship when it really does exist.
  6. Making observations.  This is a major step unless we just get the observations from someone who already did the work.
  7. Analyzing the Data.  This is "number crunching", running data through the computer.  Of course, one can also analyze qualitative data from interviews or observations, but today even that tends to get quantified (content analysis).
  8. Assessing the results.  This is really part of the analysis.  If the hypothesis doesn't work out, often researchers go back and change the hypotheses and pretend they knew all along what was going to happen.
  9. Publishing the findings. This assumes that you are doing "scientific" or "pure" research; much applied research is actually distributed only within the organization that paid for it.  This may be done in person, with a PowerPoint presentation.  Refereed publications:  your paper is sent to other specialists for review to decide if it should be published ("refereed journal").  Press release.   Publication can be online as well as on paper.  You publish the research so you can get credit, see your name in print, get promoted, and also so that you can inform others, and perhaps most important, so that other people can criticize or attempt to
  10. Replicate it.    Usually people replicate research in the hope of overthrowing it, if you just find the same thing as before, there is less interest.  This cancels out a lot of the bias in social research, since there is usually someone with the opposite bias to correct it.
    Here are some samples we can look at:  Papers presented at the 2000 ASA meetings in Washington; a Study of Tire-Crash Patterns (Word format, with the Excel file used to reproduce the graphs); the controversy over a study on the effects of sex abuse; Compstat in the NYC and Philadelphia Police Departments; and the origin and development of the project on South Jersey's Identity that we worked on in this class in 2000 (results are on my home page).  Last semester we worked on a survey of graduates of this department, not yet written up.  The Questionnaire is available online.  We did an earlier survey in 1995; a Report is available.  Other examples:  Contacts between Police and the Public; the 2002 Final Report on the National Drug Control Strategy; and the 2003 version, in which the emphasis on the goals has been lessened, with the excuse of discontinuities in data collection (2003 Tables in HTML presentation form).



February 2:  Today we will look at the use of scatterplots.  A scatterplot is a two (or three) dimensional plot of the relationship between continuous variables.  Height and weight are an example, as we can see in the plot of heights and weights from a previous class.  The plot also gives summary statistics and a regression equation.

Height and Weight

 

The Line Equation gives us a formula for plotting the straight line that best fits the points.  The r= is a measure of how closely the points fit a straight line.  If it has asterisks, the relationship is “statistically significant”, which means that it is strong enough that it probably is not just due to random chance.   The Prob = gives us the probability that a relationship this strong would occur by chance; if it is less than .05, often given as p < .05, we say it is “significant”.  It may not be meaningful, just something other than random chance.  In this case, we might say that “height” is the independent variable or cause because it is more reasonable to say that one’s height determines one’s weight than the other way around.  To predict someone’s weight from their height, we use the line equation.  We multiply the height (in inches) by 6.596 and subtract 291.542.  This is what the person would weigh if they were of average weight for people of their height in our class.
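The prediction step can be written as a one-line function.  This is just a sketch of the arithmetic in Python, using the slope and intercept quoted above; the 68-inch example is made up, and I am assuming the weights were recorded in pounds.

```python
# Slope and intercept from the class height/weight line equation.
SLOPE = 6.596        # predicted change in weight per additional inch of height
INTERCEPT = -291.542

def predicted_weight(height_inches):
    """Predicted weight (assumed to be in pounds) for a given height in inches."""
    return SLOPE * height_inches + INTERCEPT

# Hypothetical example: someone 68 inches (5 feet 8 inches) tall.
print(round(predicted_weight(68), 1))
```

The slope tells us that each extra inch of height goes with about 6.6 more units of predicted weight.  The equation only makes sense for heights in the range we actually observed; a very short height would give a negative weight.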

We can use an online 2D scatterplot program to generate examples.  There is also a program for 3-d scatterplots.  When you have more than three variables, you can still do the mathematics, but diagrams don’t make much sense (sometimes color-coding is used for a variable).

Sometimes variables are not related in a “linear” fashion, which means that regression doesn’t make much sense.  An example is “Anscombe’s Quartet”  These are four scatterplots that are all fitted by the same linear regression equation.

Anscombe's Quartet
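To see that the four data sets really do share a regression line, here is a Python sketch that fits each one by ordinary least squares.  The numbers are Anscombe's published values as I have them; the function name is my own.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares slope, intercept and correlation r for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / sqrt(sxx * syy)

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]   # shared by sets I, II and III
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  (x4,   [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: fit_line(xs, ys) for name, (xs, ys) in quartet.items()}
for name, (slope, intercept, r) in results.items():
    print(f"{name}: y = {intercept:.2f} + {slope:.3f}x, r = {r:.3f}")
```

All four come out at roughly y = 3 + 0.5x with r of about .82, even though only the first looks like an ordinary linear relationship when plotted.  That is why we look at the scatterplot before trusting the regression.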


Using linear regression when the data do not fit a regression line can lead to problems in research.   Here, for example, is a scatterplot of executions and homicide rates in US States:

Executions and Homicide Rates


January 30 - How does social science differ from other ways of thinking:  poetry, philosophy, theology, physical science?  How would we divide up fields of study?  Physical Science, Social Science, Humanities?  Science, Art and Morality?  Or, in Greek, Episteme, Techne, Phronesis?  Three approaches to knowledge.  At Rutgers Camden we divide knowledge up differently:  Rutgers Camden requirements.  How does social science differ from the other categories?   We begin with concepts, as we discussed in the last class, but so do other fields, especially philosophy and even mathematics if we recognize that numbers are concepts.  The small integers are especially important, above all Zero and One (or nothing and something).  Religion may also start with concepts.  The Bible says "In the beginning was the Word, and the Word was with God, and the Word was God."  What does that mean?  Ask a theologian.  Religious concepts are good if they provoke spiritual reflection, as in reciting a Mantra in Buddhism.  Literary concepts are good if they are beautiful, which social science concepts seldom are.   W.H. Auden's poem Under Which Lyre is an aesthetic attack on social science and other applied sciences.
 In Social Science, a concept is good if it helps us to understand empirical reality.  A good concept leads to useful generalizations or theories.  Theories are general statements about relationships between concepts that reflect how people think and behave.  A good concept can also be operationalized, which means finding indicators to measure it.  A very common way of operationalizing a concept is to write a survey question.  Others may be operationalized by observation or by physical measurement or by counting things.  In criminal justice, concepts are often operationalized by having police officers fill out reports on incidents.  We can find a good list of sociological concepts by going to survey research archives, where concepts are translated into survey questions.  Check the General Social Survey and the Eagleton poll.  Criminal justice concepts can be found on the Bureau of Justice Statistics WEB site.
  There are also bad concepts.   For an example of one I think is bad, click on virtropy.  What's wrong with this concept?  Recently there has been some controversy over "race" as a concept.  Some people say races do not "really" exist.  Biologically, that is true if by "exist" you mean that people fall into distinct categories.  Physical differences exist with regard to skin color and other traits, but they are distributed continuously, not in distinct categories.  Sociologically, racial differences exist and are important.  The people who say they do not "exist" are usually in favor of using them for affirmative action programs, or even for reparations, so they concede that they have sociological meaning.  That meaning differs from society to society, and may change over time.  The growth of the Hispanic population in the US is forcing a change in how we think about this.  Census Racial Categories.  Census Document on Racial and Ethnic Categories.  Racial categories in Latin America.

Other concepts we can consider are: poverty, power, crime, murder, race, IQ, liberalism/conservatism, homelessness. Or we could look at Personality Types as defined by Carl Jung and measured by Isabel Briggs Myers (the Myers-Briggs Type Indicator).

January 28:  We will look at the material on Rutgers Policy, Informed Consent and Behavioral Research in the Human Subjects Certification Course.  You can find the same material in WEBCT.

Categories of Exemption include most survey or interviewing research unless it involves confidential information or people can be identified in the data set.  Research using previously existing data is also exempt.  These actually are most of the things that we do in the social sciences.  Most "behavioral" research is exempt from review.  However, you still have to fill out a form and state why your research is exempt.

We will also look at some cases that have raised ethical issues:

The latest ethical controversy is the raging one over the book Darkness in El Dorado, about research on the Yanomamo in Venezuela, which also raises important methodological questions. The allegation is that researchers gave the tribe measles, but some people argue that any contact with isolated tribes violates their rights.  Many of the book's allegations, however, have been contested by the National Academy of Sciences.

Another controversial book that raises ethical issues is Laud Humphreys's Tearoom Trade. He studied gay sex in a men's room in a park in St. Louis, without informing the participants what he was doing.  This violated the principle of "respect for persons", which requires getting "informed consent" from anyone you study.  However, it was for a good cause, so it can be justified under the principle of "beneficence".  It might be questioned under the criterion of "justice" because it focused on a minority group, but one could argue that it benefited them and therefore was not unjust.

Concepts and theories.  Concepts are words, or the meaning behind the word.  Mother or madre, same concept.  "surrogate mother"  "birth mother"  "adoptive mother" What is the difference?  "gender" (social) "sex" (biological)  Race?
A theory is when you make statements about how concepts fit together, the relationship between them.  Two kinds of relationships:
                       logical or tautological - true by definition  falls into philosophy, metaphysics

                       empirical or testable - true by observation - this is what we are interested in.
                       authority or tradition or religion - fall into a religious or political category


January 26:  We went through parts 1, 2 and 3 on the Table of Contents of the "Course Content" of the Human Subjects Certification Program WEBCT course.  You can find the same material in WEBCT.