Project: Logistic Regression for Classification


The marketing department for a retail company was about to embark on a periodic campaign to convince existing customers to apply for one of the company’s discount shopping advantage card with a fixed annual fee. The major decision facing the marketing department concerns which type of customers should be targeted for the campaign. The information from a random sample of 32 current customers who had been invited to apply of shopping card in the past was recorded, and three of the variables that are to be used to assist the marketing department in the analysis are: Shopping card status (1=Yes “have the discount shopping card”, 0=No “don’t have the discount shopping card”), Annual spending in thousands of dollars, Possession of credit card (1=Yes “have credit card”, 0=No “don’t have credit card”), Education level (1=Middle School or Below, 2=High School, 3=College or Above).



Data: shoppingcard_more.sav



Use the data above to evaluate the significance of the two predictor variables selected for predicting the willingness of the customers to obtain the discount shopping advantage card. Answer the following questions.

  1. Use logistic regression model to model the probability of customer to obtain the discount shopping card with all other variables mentioned above. Show the regression output with all variables in the model and identify the significant factors for predicting whether the customer will obtain a shopping card.
  2. Test to see if the logistic regression model above fits the data well at the 5% level of significance.
  3. Check and report if there is collinearity problem.


Use only the significant factors found in the model for the following questions:

  1. Use only the significant factors you identified above to build a logistic regression model, to estimate the odds ratio to see how likely the customers who possessed credit card would obtain a discount shopping card when comparing with those who do not possess credit card, and interpret this number.
  2. Report the 95% confidence interval for the odds ratio for the question above. If there are more than one predictor used in the model then this odds ratio would be an adjusted odds ratio.
  3. Find the prediction equation for estimating the probability of a customer will obtain a discount shopping card.
  4. What is the predicted probability of a customer who own a credit card and spent $50,000 on last year will obtain a discount shopping card?
  5. If the amount of spending last year is the only predictor in the analysis, a customer with how much spending last year will have a probability of at least .5 to obtain a discount shopping card? (Use the estimated logit model to solve for x.)