BULLETIN BOARD (Q & A)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question.  How will the problem sets affect the overall grade in the class? I remember you saying in class that they will only matter if your grade is on the borderline between two grades, but the syllabus says that they are worth 20% of your grade.

Answer. 
The short answer is that, in practice, the problem sets will affect your course grade only if you are near AND BELOW the borderline between two letter grades.  The first thing I will do is average the two test grades (25% each) and the final exam grade (50%); you are guaranteed to get at least that average as your course grade.  But if that average comes out to, say, 2.45 (a high C+, just short of the 2.5 borderline between a B and a C), then I will look at how many problem sets you turned in and how well they were done.  They could count for up to 20% of the overall grade (depending on how many you turned in), and they might boost you from a C+ (recorded on your transcript as a C) to a B- (recorded as a B).  On the other hand, if your tests average out to, say, 2.55 (a low B-), you'd get a B in the course even if you turned in no problem sets.
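
For anyone who wants to check the arithmetic, here is a rough Python sketch of the guaranteed average.  It is only an illustration: the function names, the sample scores, and the 0.1 margin standing in for "near" are all made up, and the problem-set boost itself is left out because it is a judgment call rather than a formula.

```python
# Weights from the answer above: two tests at 25% each, final exam at 50%.
TEST_WEIGHT = 0.25
FINAL_WEIGHT = 0.50
B_C_BORDERLINE = 2.5  # borderline between a B and a C on the 4-point scale


def guaranteed_average(test1, test2, final):
    """Weighted average of the two tests and the final exam.

    The course grade is guaranteed to be at least this average.
    """
    return TEST_WEIGHT * test1 + TEST_WEIGHT * test2 + FINAL_WEIGHT * final


def problem_sets_could_matter(average, borderline=B_C_BORDERLINE, margin=0.1):
    """True if the average is below the borderline but within `margin` of it.

    The margin of 0.1 is a hypothetical stand-in for "near"; the answer
    above does not give a precise cutoff.
    """
    return borderline - margin <= average < borderline


# Hypothetical scores on the 4-point scale.
avg = guaranteed_average(2.0, 2.5, 2.65)
print(round(avg, 2), problem_sets_could_matter(avg))
```

An average of 2.55 would fail the second check, since it is already above the borderline and the problem sets are not needed.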
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question:   I am trying to decide whether it would be best to take POLI 300 or POLI 400.  I am planning on going into a law-related field after graduation and law school.  Which of these would you recommend?  Also, is statistics recommended for either of these courses?

Answer:  POLI 300 (Quantitative Methods) arguably has more relevance beyond academic social/political science than POLI 400 (Qualitative Methods).  If the LSAT still has quantitative/logical/mathematical questions (along the lines of the SAT Math test), POLI 300 should be helpful there also.  POLI 300 introduces some statistical ideas and computations, but does so pretty informally.  My general recommendation is that, if students are going to take a STAT course (even STAT 121) at some point, it's probably best to do this after taking POLI 300, on the grounds that POLI 300 will give you a bit of a head start on some of the topics and should also give you a sense of the practical utility of the topics taught in a STAT course.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question:  Would POLI 300 be more appropriate to take this fall as a sophomore, or later as a junior or a senior?

Answer:   While POLI 300 probably should not be taken by freshmen and we really want students to take it before their senior year (though unfortunately many students do put it off), there are no powerful reasons to prefer sophomore to junior year or vice versa.  But if it fits into your schedule as a sophomore, it would probably make sense to take it then, since it can be helpful in some 300-level POLI courses (POLI 324, 325, etc.) that you may be interested in.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question:   I found the SPSS directions fairly simple to follow.  I did, however, find it difficult to conceptualize the crosstabulation percentages (i.e., rows, columns, totals...what info does each provide?).  I also found recoding to be difficult, but I assume that is because we have not covered that material in class.

Answer:    Yes, we will cover these and other topics in much more detail later in the course.  The point at this stage [Problem Set #1A] is to make sure that you can carry out the SPSS commands discussed in the handout, even if you don't fully understand what they mean.
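
In the meantime, here is a rough illustration of what the three kinds of crosstabulation percentages mean.  It is a Python sketch rather than SPSS output, and the counts are invented purely for illustration.

```python
# Hypothetical 2x2 crosstabulation of counts.
# Each inner list is one row of the table; positions within it are columns.
counts = [
    [30, 10],  # row 1
    [20, 40],  # row 2
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


# Row percentages: each cell as a share of its ROW total (each row sums to 100).
row_pct = [[pct(c, row_totals[i]) for c in row] for i, row in enumerate(counts)]

# Column percentages: each cell as a share of its COLUMN total.
col_pct = [[pct(c, col_totals[j]) for j, c in enumerate(row)] for row in counts]

# Total percentages: each cell as a share of ALL cases in the table.
total_pct = [[pct(c, grand_total) for c in row] for row in counts]

print(row_pct)
print(col_pct)
print(total_pct)
```

The same cell count gives three different percentages because the denominator changes: the row total, the column total, or the number of cases in the whole table.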
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question:   I was searching around on the Web and I found a decent SPSS tutorial.  I don't expect anyone would have much trouble using it, but if they do perhaps this tutorial would further help.

Answer:   Thanks very much for the information on the SPSS tutorial website.  I was aware of that faculty group before, have heard some of their presentations, and have some of their printed materials and diskettes, but I was not aware of the website, which is much more useful. I'll announce it in class and put a link on the course web page.  It goes beyond the topics needed for POLI 300, but the hypertext format makes it easy to skip around to find what you need (and I'm sure it's more helpful than the SPSS Help function).  I'm sure I can learn useful things from it, and will refer other students and UMBC faculty and staff members to it.  [SPSS Tutorial]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question:   I have recently read in the news that a researcher used an online poll for his study.  It is obvious, from what we've learned in class, that there are serious reliability [really bias -- NRM] issues here.  According to the news excerpt [Reuters], out of some 40,000 responses from an MSNBC website he examined a random sample of some 7,000 respondents and further narrowed the group down to 384.  How is this valid?  Even if he took a random sample, wouldn't the response be biased anyway?  Further, why would PhD level researchers use polls like these when stupid undergrads like me know that it is a no-no?

Answer:   The researcher may have regarded the 40,000 self-selected MSNBC respondents as the population of interest but did not have the resources to study all 40,000 cases, and therefore took a random sample of 7,000 to study and then a subsample of 384 for more detailed study.  This would be an entirely reasonable and proper procedure.  Indeed, it would be of considerable interest to compare the opinions of people who (voluntarily) view the MSNBC website and (voluntarily) respond to its on-line survey with the opinions of the general public (based on a standard survey asking the same questions).  What would not be proper (as you recognize, and cf. the Ann Landers example) would be to assume that the 40,000 original respondents constituted a representative sample of the general population.
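
The two-stage procedure I am supposing here (which may or may not be exactly what the researcher did) can be sketched in Python.  The ID numbers and the seed are arbitrary placeholders, and the 384 are assumed to be drawn at random from the 7,000.

```python
import random

rng = random.Random(300)  # fixed, arbitrary seed so the sketch is repeatable

# Stand-in for the 40,000 self-selected respondents: just ID numbers here.
population = list(range(40_000))

# Simple random sample of 7,000 from the population of respondents...
sample = rng.sample(population, 7_000)

# ...and a random subsample of 384 of those for more detailed study.
subsample = rng.sample(sample, 384)

print(len(sample), len(subsample))
```

Note that both samples are random with respect to the 40,000 respondents only.  Nothing in this procedure makes them representative of the general public.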
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 

Question:   In Problem Set #3A (Identifying Variables), don't we sometimes need to include more than one unit of analysis?  For example, in question #4, wouldn't you need to analyze both elections (to see if they're competitive), and individuals (to find out Congressional responsiveness)?

Answer:   A very good question.  The fact that reduces this proposition to a single unit of analysis is that there is a one-to-one correspondence between members of the House and House districts -- that is, each district has exactly one member and each member comes from exactly one district.  Think about setting up an (Excel or SPSS) data array or spreadsheet for the data you would need to assess the empirical truth of statement #4, such that each row of data corresponds to one case and each column to one variable.  In turn, each case can be deemed to represent either one district (in which event the variables are DEGREE OF COMPETITIVENESS [of each district] and DEGREE OF RESPONSIVENESS OF MEMBER [from each district]) or one member (in which event the variables are DEGREE OF COMPETITIVENESS [of each member's district] and DEGREE OF RESPONSIVENESS [of each member]).  Either way you think of it, it really comes to the same data arranged in the same way.
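
As a rough illustration of that last point, here is a small Python sketch of such a data array.  The district labels, member names, and values are hypothetical placeholders, not real data.

```python
# Each row is one case; each key is one variable (one column of the spreadsheet).
rows = [
    {"district": "District A", "member": "Member A",
     "competitiveness": "high", "responsiveness": "high"},
    {"district": "District B", "member": "Member B",
     "competitiveness": "low", "responsiveness": "low"},
    {"district": "District C", "member": "Member C",
     "competitiveness": "high", "responsiveness": "low"},
]

# Reading each case as a DISTRICT...
by_district = {
    r["district"]: (r["competitiveness"], r["responsiveness"]) for r in rows
}

# ...or as a MEMBER gives the same pairs of values in the same order,
# because districts and members correspond one-to-one.
by_member = {
    r["member"]: (r["competitiveness"], r["responsiveness"]) for r in rows
}

print(list(by_district.values()) == list(by_member.values()))
```

Only the label attached to each row changes; the two variables and their values stay the same.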
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question:  We are confused by a number of terms.  Can you clarify?

Answer:

RELIABILITY -- see the Handout on Measuring Variables; also Moore, pp. 168-171; and (with respect to survey measurement and coding) Weisberg, pp. 94, 143-144.

RAW DATA/DATA ARRAY (or Data Spreadsheet, etc.) -- illustrated by the Student Survey Raw Data (Spreadsheet) that was distributed and that you used in Problem Sets #5 (Q1&2), #6 (Q2&3), and #12, or by the NES/SETUPS data you see when you open SPSS.

OBSERVATIONS (or OBSERVED VALUES) -- refers simply to the entries in a data spreadsheet, e.g., we "observe" (e.g., on the basis of a response to a survey question) that the value of the variable PARTY ID in a particular case is "Weak Democrat."

MISSING DATA -- where we have failed, for one reason or another, to observe any value of a variable, e.g., the respondent failed to answer the question.  In both the SETUPS and Student Survey data, all missing data is coded "9."  The difference between (Unadjusted) Relative Frequencies and Adjusted Relative Frequencies ("Valid Percent" in SPSS) is that missing data is excluded from the latter calculations.

UNIVARIATE ANALYSIS -- data analysis that involves only ONE variable at a time (frequency distributions, histograms, measures of central tendency and dispersion), as opposed to TWO (bivariate) or more (multivariate) variables at a time (crosstabulations, scattergrams, measures of association, regression equations).
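
To make the difference between unadjusted and adjusted relative frequencies concrete, here is a small Python sketch.  The ten observed values are made up; only the missing-data code of 9 comes from the course data.

```python
from collections import Counter

MISSING = 9  # missing-data code used in the SETUPS and Student Survey data

# Hypothetical observed values of one variable for ten cases.
observed = [1, 2, 2, 3, 1, 9, 2, 3, 9, 2]


def relative_frequencies(values, exclude_missing=False):
    """Percent of cases at each value, optionally dropping missing data first."""
    if exclude_missing:
        values = [v for v in values if v != MISSING]
    counts = Counter(values)
    n = len(values)
    return {value: round(100 * count / n, 1)
            for value, count in sorted(counts.items())}


# Unadjusted: missing cases count in the denominator.
unadjusted = relative_frequencies(observed)

# Adjusted ("Valid Percent"): missing cases are excluded before dividing.
adjusted = relative_frequencies(observed, exclude_missing=True)

print(unadjusted)
print(adjusted)
```

With two of the ten cases missing, the denominator drops from 10 to 8, so every adjusted percentage is larger than its unadjusted counterpart.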
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Question:   Referring to the charts for PS #7, we seemed to conclude in class that the Bush chart has the smallest SD, and the Perot chart has the largest.  But then you said the Bush chart didn't actually have the smallest SD.  Also, I calculated the rough SD for the Bush and Perot charts, and the Perot SD is approx. 6.51, while the Bush one is about 15.68.

Answer:  The general class discussion concluded (once we established that the question pertains not to dispersion in the height of the bars but to dispersion in the data represented by the frequency bar charts) that dispersion was clearly greatest in the Perot chart and seemed to be smallest in the Bush chart.  I agreed with the latter "eyeball" assessment but noted right at the end of the period that the Bush chart turns out to have a slightly greater SD than the Clinton chart.

I don't know how you reached them, but your calculations for the Perot and Bush SDs are way off. Consider the following.  The range in all three charts is 4, because in each the maximum observed value is 5 and the minimum is 1.  Since the range is identical in all three charts, we can't use it to compare and contrast their dispersions, so we need to turn to a more informative measure of dispersion, such as the SD.

Once we think through the logic of the formula, it should be clear that the SD can't exceed half the range, so your estimated SDs of 6.51 for the Perot chart and 15.68 for the Bush chart must be wrong. Remember that the "building blocks" of the SD formula are the deviations from the mean in each case.  The largest positive deviation is the deviation in the case that has the maximum value, and the largest negative deviation is the deviation in the case that has the minimum value.  Ignoring the minus sign of the latter deviation, the sum of these two deviations is equal to the range, while the average of these two deviations is just half of that.  Since by definition no other cases have larger deviations and some may have smaller deviations, the average (and also the standard) deviation from the mean must be less than half the range if there are any cases with values intermediate between the two extremes.

Let's proceed more step-by-step.

The range is the answer to this question: how far apart are the maximum and minimum observed values in the data? (In this case, the answer is 4 ideological "points" or "steps.")

The mean deviation MD is the answer to this question: how far apart on average are all the observed values from the mean value, i.e., what is the average absolute deviation from the mean?   For the reason noted above, the MD cannot be larger than half the range -- indeed it can be that big only when half the cases have identical maximum values and the other half have identical minimum values (maximum "polarization").  Otherwise, the MD clearly must be less than half the range, usually much less.  (In the case of the Bush chart, the MD happens to be approximately 1.)

The variance is the answer to this question:  what is the average squared deviation from the mean? The standard deviation SD, which is the square root of the variance, is never less than and usually somewhat larger than the MD.  In the special case of maximum polarization described above, the SD (like the MD) is equal to half the range.  Otherwise the SD is less than half of the range (but greater than the MD).  (In the case of the Bush chart, the variance is about 1.7 and the SD is about 1.3.)
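
These relationships among the range, MD, and SD can be checked with a short Python sketch.  The two small data sets are made up for illustration (they are not the PS #7 data), and the variance divides by the number of cases, matching the "average squared deviation" definition above.

```python
from math import sqrt


def dispersion(values):
    """Return (range, mean deviation, variance, standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    value_range = max(values) - min(values)
    md = sum(abs(v - mean) for v in values) / n        # average absolute deviation
    variance = sum((v - mean) ** 2 for v in values) / n  # average squared deviation
    return value_range, md, variance, sqrt(variance)


# Maximum "polarization": half the cases at the minimum, half at the maximum.
polarized = [1, 1, 1, 5, 5, 5]

# Same range, but with cases at intermediate values.
spread = [1, 2, 3, 3, 4, 5]

for name, data in [("polarized", polarized), ("spread", spread)]:
    value_range, md, variance, sd = dispersion(data)
    print(name, value_range, round(md, 2), round(variance, 2), round(sd, 2))
```

In the polarized data the MD and SD both equal half the range.  In the second data set the MD is smaller than the SD, and both fall below half the range.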

The question did not ask you to actually calculate the SDs, but here is how it would be done in the Bush case.  We start with a relative frequency table (as in Question 3), where the relative frequencies are read off the bar chart and turned into decimal fractions.  The mean perceived Bush ideological position is 3.96 (calculated from the relative frequencies as described in Handout #6, top of p. 3, and PS #6:A&D, Q2(c)) -- let's round this off to 4 to simplify the arithmetic.

        Value   Rel. Freq.   Deviation   Sq. Dev. x Freq.    Abs. Dev. x Freq.
          1        .08          -3       9 x .08 = .72       3 x .08 = .24
          2        .08          -2       4 x .08 = .32       2 x .08 = .16
          3        .14          -1       1 x .14 = .14       1 x .14 = .14
          4        .20           0       0 x .20 = .00       0 x .20 = .00
          5        .50          +1       1 x .50 = .50       1 x .50 = .50
        Total     1.00           *           var = 1.68           MD = 1.04
                                              SD = 1.30

* The deviations sum to zero once they have been weighted by relative frequency.  If you go to the final column and put the minus signs back in the first three rows, you see that the weighted deviations sum to -0.04.  (This differs slightly from zero because we used an approximation of the mean.)
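
If you want to verify the table, the same arithmetic can be sketched in Python.  The relative frequencies are the ones in the table; the function name and its layout are just one way of setting it up.

```python
from math import sqrt

# Relative frequencies read off the Bush chart (from the table above).
rel_freq = {1: 0.08, 2: 0.08, 3: 0.14, 4: 0.20, 5: 0.50}


def summarize(freqs, mean=None):
    """Return (mean, variance, SD, MD, weighted sum of deviations).

    If `mean` is not supplied, it is computed from the relative frequencies.
    """
    if mean is None:
        mean = sum(value * f for value, f in freqs.items())
    variance = sum((value - mean) ** 2 * f for value, f in freqs.items())
    md = sum(abs(value - mean) * f for value, f in freqs.items())
    dev_sum = sum((value - mean) * f for value, f in freqs.items())
    return mean, variance, sqrt(variance), md, dev_sum


# With the mean rounded to 4, as in the table.
print([round(x, 2) for x in summarize(rel_freq, mean=4)])

# With the mean computed from the frequencies (3.96).
print([round(x, 2) for x in summarize(rel_freq)])
```

Using the unrounded mean changes the variance and MD only slightly, and the weighted deviations then sum to zero (apart from floating-point rounding) instead of -0.04.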