4. Logistic Regression

This section summarizes Section 4.6.2, Logistic Regression, of the Chapter 4 Lab walkthrough. Consult that section for details.

StatsModels requires the output variable to be numeric. Recode strings like “Larger” and “Smaller” as the numbers 0 and 1 before fitting. The fitted model then estimates the probability of the class coded as 1.
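The recoding step can be sketched with pandas as follows. The column name size and the data values are made up for illustration, and the choice of which label becomes 1 is an assumption; pick whichever class you want the model to predict.

```python
import pandas as pd

# Hypothetical data: "size" holds the string labels, p1 and p2 are predictors.
df = pd.DataFrame({
    "size": ["Larger", "Smaller", "Larger", "Smaller"],
    "p1": [2.3, 0.7, 1.9, 1.1],
    "p2": [10, 14, 9, 13],
})

# Map each label to a number; here "Larger" is the class coded as 1.
df["output"] = df["size"].map({"Smaller": 0, "Larger": 1})

# An equivalent alternative using a boolean comparison.
df["output_alt"] = (df["size"] == "Larger").astype(int)

print(df[["size", "output", "output_alt"]])
```

Any label missing from the dictionary passed to map() becomes NaN, so it is worth checking the new column for missing values before fitting.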

import statsmodels.api as sm

# DataFrame stands for your own pandas DataFrame with columns output, p1, and p2.
result = sm.Logit.from_formula("output ~ p1 + p2",
           data=DataFrame).fit()

The returned result object exposes the LogitResults interface, so check the StatsModels documentation for everything it provides. Some important methods and attributes of LogitResults:

  • summary(): A printable table of the fit.
  • summary2(): An alternative summary table. We found this was needed on old installations.
  • params: Coefficients.
  • pvalues: p-values for the coefficients.

Less important, but still commonly used:

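  • tvalues: Test statistic for each coefficient (params divided by bse). This is an attribute, not a method.
  • fittedvalues: The linear predictor (log-odds) for each training observation, not a probability. Use predict() for probabilities.
  • bse: Coefficient standard errors.
  • aic: Akaike information criterion, a measure of model fit. Lower is better when comparing models fit to the same data.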