Project: Analysis of Risk Factors
The data file ncbirth.sav contains
a sample of birth records taken by the North Carolina State Center for
Health and Environmental Statistics. The data set represents a sample of
1450 births taken within the state of North Carolina. Of particular interest
will be incidents of Low Infant Birth Weight. Low birth weight has been
associated with weaker development of many characteristics such as intelligence,
coordination, strength, etc. Low birth weight is commonly defined as less
than 2500 grams (approximately 88.18 ounces).
The variables examined are:
Sex of child (1=Male, 2=Female)
Race of child (0=other Nonwhite, 1=White, 2=Black, 3=American Indian, 4=Chinese, 5=Japanese, 6=Hawaiian, 7=Filipino,
Age of mother
Education level of mother
Completed Weeks of Gestation
Birth weight (grams) group (0=500 or less, 3=1501-2000, 4=2001-2500, 5=2501-3000, 6=3001-3500, 7=3501-4000, 8=4001-4500, 9=4501 and over)
Marital status (1=married, 2=not married)
Number of pounds in actual birth weight
Number of remaining ounces in actual birth weight
Average number of cigarettes daily (98=smokes an unknown amount)
Average number of alcoholic drinks weekly (98=drinks an unknown amount)
Apgar score at 1 minute (0-10)
Fetal Alcohol Syndrome (0=No, 1=Yes)
Number of children born of the pregnancy
Weight of child in total ounces
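The pounds, remaining-ounces, and total-ounces variables are presumably related at 16 ounces per pound. The following is a minimal Python sketch of that relationship and of where the 2500 gram cutoff falls in ounces. The function names are illustrative and are not variable names from ncbirth.sav.

```python
GRAMS_PER_OUNCE = 28.349523125  # definition of the avoirdupois ounce

def total_ounces(pounds, ounces):
    """Combine whole pounds and remaining ounces into total ounces."""
    return 16 * pounds + ounces

def grams_to_ounces(grams):
    """Convert a weight in grams to ounces."""
    return grams / GRAMS_PER_OUNCE

cutoff_oz = grams_to_ounces(2500)
print(round(cutoff_oz, 2))    # 88.18
print(total_ounces(5, 8))     # 88, so 5 lb 8 oz falls just under the cutoff
```

If the recorded total ounces are whole numbers, a birth of 88 ounces or less counts as low birth weight and 89 ounces or more does not.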
Answer the following questions. Answers to questions need to be properly
labeled. Use SPSS output to support your answers wherever it is needed.
SPSS outputs must also be properly labeled and referenced in your answers,
summaries, or conclusions. Be aware of the data value "98" in the cigarette
and drinks variables. Find a reasonable way (or form) to include these values
in the analysis unless the variable is not considered in the analysis.
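To see why the code 98 needs attention, here is a small Python sketch (an illustration outside SPSS, using made-up values rather than the project data). Treating 98 as a real count inflates the average, while setting it to missing keeps the known amounts intact. Treating 98 as missing is only one possible choice, shown here as an assumption.

```python
from statistics import mean

UNKNOWN_AMOUNT = 98  # code meaning "uses an unknown amount"

def amount_or_missing(value):
    """Return the reported amount, or None when the amount is unknown."""
    return None if value == UNKNOWN_AMOUNT else value

# Hypothetical weekly drink counts for five mothers (not real data).
drinks = [0, 0, 2, 98, 1]

naive_mean = mean(drinks)  # wrongly treats 98 as 98 drinks per week
known = [v for v in map(amount_or_missing, drinks) if v is not None]
clean_mean = mean(known)   # average over mothers with a known amount

print(naive_mean)  # 20.2
print(clean_mean)  # 0.75
```

A mother coded 98 still reported some use, so for a yes/no indicator she can reasonably be counted as a user even though her amount is unknown.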
Dichotomize the totounc variable using 88.18 ounces as the cutoff
and save it as btotounc, which will have the value 1 if underweight
and the value 0 if over 88.18 ounces. Create an indicator variable
for smoking status and name it bsmoke. This variable will have the value
1 if the mother smoked and 0 if the mother did
not smoke. Create an indicator variable from drinks and name
it bdrankal. This variable will have the value 1 if
the mother drank and 0 if the mother did not
drink. Create an indicator variable, bmothed, for whether a mother
completed high school or not (0 if mothed ≤
12, 1 if mothed > 12).
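The recoding rules above can be sketched in Python as follows. This is an illustration of the logic, not SPSS syntax. The name `cigs` for the cigarette variable is a hypothetical placeholder, the sample records are made up, and missing values are not handled.

```python
LOW_WEIGHT_CUTOFF_OZ = 88.18  # about 2500 grams

def to_btotounc(totounc):
    """1 if the birth weight is under the cutoff (underweight), else 0."""
    return 1 if totounc < LOW_WEIGHT_CUTOFF_OZ else 0

def to_any_use(amount):
    """1 if any use is reported, else 0.

    Code 98 (unknown amount) is greater than 0, so it counts as use.
    """
    return 1 if amount > 0 else 0

def to_bmothed(mothed):
    """0 if 12 or fewer years of education, 1 if more than 12."""
    return 0 if mothed <= 12 else 1

def add_indicators(record):
    """Return a copy of the record with the four indicator variables added."""
    out = dict(record)
    out["btotounc"] = to_btotounc(record["totounc"])
    out["bsmoke"] = to_any_use(record["cigs"])      # "cigs" is a placeholder name
    out["bdrankal"] = to_any_use(record["drinks"])
    out["bmothed"] = to_bmothed(record["mothed"])
    return out

# Two hypothetical birth records (not real data).
births = [
    {"totounc": 80, "cigs": 10, "drinks": 0, "mothed": 11},
    {"totounc": 120, "cigs": 0, "drinks": 98, "mothed": 16},
]
coded = [add_indicators(row) for row in births]
for row in coded:
    print(row["btotounc"], row["bsmoke"], row["bdrankal"], row["bmothed"])
# prints: 1 1 0 0
#         0 0 1 1
```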
The odds ratio for studying a risk factor can be calculated from a simple
2 by 2 contingency table and can also be estimated from a logistic regression
model as in problem 3. Is there any difference between these two odds ratios?
If yes, explain the difference and the advantages and/or disadvantages
for each method.
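For reference, the table-based calculation can be sketched in Python as below. It computes the crude odds ratio from the four cell counts of a 2 by 2 table, together with an approximate 95% confidence interval based on the standard error of the log odds ratio. The counts are hypothetical and are not taken from ncbirth.sav.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 by 2 table.

    a = exposed, low birth weight      b = exposed, normal weight
    c = unexposed, low birth weight    d = unexposed, normal weight
    """
    return (a * d) / (b * c)

def wald_interval(a, b, c, d, z=1.96):
    """Approximate confidence interval computed on the log odds ratio scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts: smokers (30 low, 170 normal), non-smokers (60 low, 940 normal).
crude_or = odds_ratio(30, 170, 60, 940)
lower, upper = wald_interval(30, 170, 60, 940)
print(round(crude_or, 2))  # 2.76
```

The formula breaks down if any cell count is zero, so a table with an empty cell needs a different approach.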
What are the limitations of the analysis above?
The data was from North
Carolina Vital Statistics Institute for Research in Social Science.
Visiting the web site may help you to gain more insight about the data.
The data for this project come from the 1995 birth registry at the North Carolina
State Center for Health and Environmental Statistics. Use is allowed
if reference is cited to the above agency.
Perform a logistic regression for predicting btotounc using bsmoke,
bdrankal, bmothed, mothage, and gest as predictor variables.
Identify the significant risk factors for giving birth to a low birth weight child.
Include the SPSS output to explain your findings.
What was the odds ratio of having a low birth weight child for smoking mothers
versus non-smoking mothers? What was the odds ratio of having a low birth
weight child for mothers who drank alcohol versus non-drinkers? What was
the odds ratio of having a low birth weight child for mothers who had less
than 12 years of education versus those who had more years of education?
Interpret these odds ratios.
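SPSS logistic regression output lists each coefficient as B alongside Exp(B), which is the odds ratio for a one-unit increase in that predictor. The Python sketch below shows the conversion, using hypothetical coefficient values rather than results from this data set. It also shows the reciprocal, which is needed when the comparison asked for runs in the opposite direction from the coding. For example, bmothed is coded 1 for more than 12 years of education, so an odds ratio for less education versus more would be 1 / Exp(B).

```python
import math

def odds_ratio_from_b(b):
    """Exp(B): odds ratio for a one-unit increase in the predictor."""
    return math.exp(b)

def reversed_odds_ratio(b):
    """Odds ratio with the comparison reversed (category 0 versus category 1)."""
    return 1 / math.exp(b)

def percent_change_in_odds(b):
    """Percent change in the odds for a one-unit increase in the predictor."""
    return (math.exp(b) - 1) * 100

b_risk = math.log(2)   # hypothetical coefficient for a risk factor
b_protective = -0.5    # hypothetical coefficient for a protective factor

print(round(odds_ratio_from_b(b_risk), 2))             # 2.0
print(round(odds_ratio_from_b(b_protective), 2))       # 0.61
print(round(percent_change_in_odds(b_protective), 1))  # -39.3
print(round(reversed_odds_ratio(b_protective), 2))     # 1.65
```

An odds ratio above 1 means higher odds of low birth weight in the group coded 1, and a value below 1 means lower odds.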