Regression for Classification
The marketing department of a
retail company was about to launch a periodic campaign to convince existing
customers to apply for one of the company’s discount shopping advantage cards,
which carry a fixed annual fee. The major decision facing the marketing department
is which type of customer should be targeted for the campaign. Information
was recorded from a random sample of 32 current customers who had been invited
to apply for the shopping card in the past. The variables to be used to assist
the marketing department in the analysis are: Shopping card
status (1=Yes “have the discount shopping card”, 0=No “don’t have the discount
shopping card”); Annual spending in thousands of dollars; Possession of credit
card (1=Yes “have credit card”, 0=No “don’t have credit card”); and Education level
(1=Middle School or Below, 2=High School, 3=College or Above).
Use the data above to evaluate the
significance of the predictor variables for predicting the
willingness of customers to obtain the discount shopping advantage card.
Answer the following questions.
- Use a logistic regression model
to model the probability that a customer obtains the discount shopping card,
using all the other variables mentioned above as predictors. Show the
regression output with all variables in the model and identify the
significant factors for predicting whether a customer will obtain a shopping card.
- Test whether the logistic
regression model above fits the data well at the 5% level of significance.
- Check and report whether there is collinearity among the predictors.
Use only the significant factors found in the model for the following questions:
- Use only the significant
factors you identified above to build a logistic regression model. Estimate
the odds ratio that compares how likely customers who possess a
credit card are to obtain a discount shopping card with how likely
those who do not possess a credit card are, and interpret this number.
- Report the 95% confidence
interval for the odds ratio in the question above. If more than
one predictor is used in the model, this odds ratio is an adjusted
odds ratio.
- Find the prediction equation
for estimating the probability that a customer will obtain a discount
shopping card.
- What is the predicted
probability that a customer who owns a credit card and spent $50,000
last year will obtain a discount shopping card?
- If the amount of spending
last year is the only predictor in the analysis, how much must a customer
have spent last year to have a probability of at least 0.5 of obtaining a
discount shopping card? (Use the estimated logit model to solve for x.)
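
The last few questions all reduce to arithmetic on the fitted coefficients, so a small sketch may help when checking hand calculations. The Python below is only an illustration: the function names are made up for this example, and the coefficient and standard-error values are hypothetical placeholders, not estimates from the 32-customer sample. Substitute the values from your own regression output. It assumes the usual logit form, where the log odds equal b0 + b1*x1 + ..., and that spending is entered in thousands of dollars, so $50,000 is x = 50.

```python
import math


def predicted_probability(intercept, coefs, x):
    """Return p = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) for one customer."""
    logit = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-logit))


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in a predictor: exp(beta)."""
    return math.exp(beta)


def odds_ratio_ci(beta, se, z=1.959964):
    """Wald confidence interval for the odds ratio (z = 1.96 gives 95%)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


def spending_threshold(intercept, slope, p=0.5):
    """Solve intercept + slope * x = log(p / (1 - p)) for x.

    With a positive slope, any x at or above this value gives a
    predicted probability of at least p.
    """
    return (math.log(p / (1.0 - p)) - intercept) / slope


# Hypothetical placeholder values, NOT results from the assignment data.
B0, B_SPEND, B_CREDIT = -6.0, 0.1, 1.5   # two-predictor model
SE_CREDIT = 0.75                          # standard error of B_CREDIT
B0_ONLY, B_SPEND_ONLY = -6.0, 0.12        # spending-only model

# Customer with a credit card (1) who spent $50,000 (x = 50).
prob = predicted_probability(B0, [B_SPEND, B_CREDIT], [50, 1])
print("predicted probability:", round(prob, 4))

# Adjusted odds ratio for credit card possession and its 95% interval.
print("odds ratio:", round(odds_ratio(B_CREDIT), 3))
print("95% CI:", tuple(round(v, 3) for v in odds_ratio_ci(B_CREDIT, SE_CREDIT)))

# Spending (in thousands) at which the spending-only model reaches p = 0.5.
print("threshold:", round(spending_threshold(B0_ONLY, B_SPEND_ONLY), 2))
```

At p = 0.5 the log odds are zero, so the threshold simplifies to x = -b0/b1. The spending-only model uses its own coefficients because estimates generally change when a predictor is dropped.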