Sample Files
nhis2000_subset.sav. The National Health Interview Survey (NHIS) is a large, population-based
survey of the U.S. civilian population. Interviews are carried out face-to-face in a nationally
representative sample of households. Demographic information and observations about
health behaviors and status are obtained for members of each household. This data
file contains a subset of information from the 2000 survey. National Center for Health
Statistics. National Health Interview Survey, 2000. Public-use data file and documentation.
ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Datasets/NHIS/2000/. Accessed 2003.
ozone.sav. The data include 330 observations on six meteorological variables for predicting
ozone concentration from the remaining variables. Previous researchers, among them Breiman and
Friedman (1985) and Hastie and Tibshirani (1990), found nonlinearities among
these variables, which hinder standard regression approaches.
pain_medication.sav. This hypothetical data file contains the results of a clinical trial for
anti-inflammatory medication for treating chronic arthritic pain. Of particular interest is the
time it takes for the drug to take effect and how it compares to an existing medication.
patient_los.sav. This hypothetical data file contains the treatment records of patients who were
admitted to the hospital for suspected myocardial infarction (MI, or “heart attack”). Each case
corresponds to a separate patient and records many variables related to their hospital stay.
patlos_sample.sav. This hypothetical data file contains the treatment records of a sample
of patients who received thrombolytics during treatment for myocardial infarction (MI, or
“heart attack”). Each case corresponds to a separate patient and records many variables
related to their hospital stay.
poll_cs.sav. This is a hypothetical data file that concerns pollsters’ efforts to determine the
level of public support for a bill before the legislature. The cases correspond to registered
voters. Each case records the county, township, and neighborhood in which the voter lives.
poll_cs_sample.sav. This hypothetical data file contains a sample of the voters listed in
poll_cs.sav. The sample was taken according to the design specified in the poll.csplan plan
file, and this data file records the inclusion probabilities and sample weights. Note, however,
that because the sampling plan makes use of a probability-proportional-to-size (PPS) method,
there is also a file containing the joint selection probabilities (poll_jointprob.sav). The
additional variables corresponding to voter demographics and their opinion on the proposed
bill were collected and added to the data file after the sample was taken.
property_assess.sav. This is a hypothetical data file that concerns a county assessor’s efforts to
keep property value assessments up to date on limited resources. The cases correspond to
properties sold in the county in the past year. Each case in the data file records the township
in which the property lies, the assessor who last visited the property, the time since that
assessment, the valuation made at that time, and the sale value of the property.
property_assess_cs.sav. This is a hypothetical data file that concerns a state assessor’s efforts
to keep property value assessments up to date on limited resources. The cases correspond
to properties in the state. Each case in the data file records the county, township, and
neighborhood in which the property lies, the time since the last assessment, and the valuation
made at that time.
property_assess_cs_sample.sav. This hypothetical data file contains a sample of the properties
listed in property_assess_cs.sav. The sample was taken according to the design specified in
the property_assess.csplan plan file, and this data file records the inclusion probabilities