Employees and Residents

You are a public health investigator in Alabama, where a second plant is located. You want to use the data collected on the NY employees and residents to build a logistic regression model so that you can assess the probability that Alabama residents who work or live near the plant will develop the cancer. Because the cancer test is very expensive, your office is unable to test everyone, so you want to use the logistic regression model as a preliminary screen of a large population of 133 people to determine which are ‘at a higher risk’ of developing the cancer, based on the exposure variables identified in the New York population.

For this, you will

1 – review the data and determine which variables are relevant to the investigation

2 – construct a Logistic Regression model

3 – apply the logistic regression model to the Alabama resident data

4 – determine which of the Alabama residents should be considered for additional testing for the cancer

5 – compile a report that clearly states

  • which of the exposure variables were rejected from the model, and for what reason
  • which of the Alabama residents were recommended for the additional screening test, and why these individuals were selected
  • your recommendations for a larger testing regimen based on the assessment of the 133 people screened for the preliminary study
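
Steps 2 to 4 above can be sketched in a few lines of Python. This is only a minimal illustration, not the required method: the two exposure variables (a scaled "years near plant" value and a "works at plant" flag), the toy New York rows, the resident IDs, and the 0.5 cutoff are all made up for the example and are not taken from the assignment data. The model is fitted with plain gradient descent so the sketch needs no external packages; in practice a statistics package would also report the coefficient significance needed to decide which variables to reject in step 1.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w = [intercept, b1, ..., bk]; x = [x1, ..., xk]
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=5000):
    # Batch gradient descent on the mean log-loss.
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def flag_high_risk(w, residents, threshold=0.5):
    # Return the IDs whose predicted probability meets the cutoff.
    return [rid for rid, x in residents if predict_proba(w, x) >= threshold]

# Hypothetical training rows standing in for the New York data:
# [years_near_plant (scaled 0-1), works_at_plant (0/1)] -> developed cancer (0/1)
NY_X = [[0.1, 0], [0.2, 0], [0.3, 0], [0.4, 1],
        [0.7, 0], [0.8, 1], [0.9, 1], [0.6, 1]]
NY_Y = [0, 0, 0, 0, 1, 1, 1, 1]

weights = fit_logistic(NY_X, NY_Y)

# Hypothetical Alabama residents to screen (IDs are invented).
alabama = [("AL-001", [0.15, 0]), ("AL-002", [0.85, 1])]
flagged = flag_high_risk(weights, alabama)
```

The cutoff is a judgment call: because the follow-up test is expensive but a missed case is serious, a lower threshold than 0.5 may be appropriate, and the choice should be justified in the report.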


