Classification analysis of toxic releases in Texas and socioeconomic data from zipcodes where releases are present

david-cuervo/Toxic_Releases_in_Texas

Capstone Project

David Cuervo

Objective

Explore the relationship between industrial toxic releases and the socioeconomic demographics of the neighborhoods they are in.

Do poorer or predominantly minority neighborhoods receive more pounds of toxic releases than wealthier, white neighborhoods?

Repository Contents

  • Raw Data folder
  • Cleaned Data folder
  • EDA notebook
  • Modeling notebook
  • Presentation slides

Exploratory Data Analysis

Top 10 most polluted counties (figure: counties_texas)

Top polluters in Texas (figure: companies_texas)

Top 10 chemicals released (figure: chemicals_texas)
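
Each of the rankings above (counties, companies, chemicals) amounts to summing pounds released per group and keeping the largest totals. The following is a minimal sketch of that aggregation, not the notebook's actual code: the field names `county`, `chemical`, and `pounds`, the helper `top_n_by_pounds`, and the sample records are all hypothetical, since this README does not show the cleaned data's column names.

```python
from collections import defaultdict


def top_n_by_pounds(records, key, n=10):
    """Sum pounds released per value of `key` and return the n largest totals."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key]] += rec["pounds"]
    # Sort by total, descending; ties are broken alphabetically for stable output.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up illustrative records, not data from this project.
sample = [
    {"county": "A", "chemical": "X", "pounds": 100.0},
    {"county": "B", "chemical": "Y", "pounds": 250.0},
    {"county": "A", "chemical": "Y", "pounds": 200.0},
    {"county": "C", "chemical": "X", "pounds": 50.0},
]

top_counties = top_n_by_pounds(sample, "county", n=2)
top_chemicals = top_n_by_pounds(sample, "chemical", n=2)
print(top_counties)   # [('A', 300.0), ('B', 250.0)]
print(top_chemicals)  # [('Y', 450.0), ('X', 150.0)]
```

The same helper covers all three rankings by changing `key`, which is why the grouping field is a parameter rather than hard-coded.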

Results

  • Built linear and logistic regression, decision tree, and random forest models
  • The random forest was the most accurate model, with an accuracy of 45.22%
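
An accuracy figure such as 45.22% is easier to interpret next to a simple baseline, such as always predicting the most common class. The sketch below shows how both numbers can be computed. It is illustrative only: this README does not state how many classes the models predicted or what baseline was used, and the labels `low`, `medium`, and `high` and both label lists are made-up assumptions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most frequent class."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical labels, not results from this project.
y_true = ["low", "low", "medium", "high", "medium", "low", "high", "low"]
y_pred = ["low", "medium", "medium", "high", "low", "low", "low", "low"]

model_acc = accuracy(y_true, y_pred)
baseline_acc = majority_baseline(y_true)
print(f"model accuracy:    {model_acc:.3f}")     # 0.625
print(f"majority baseline: {baseline_acc:.3f}")  # 0.500
```

A model adds value only to the extent that its accuracy exceeds the baseline, so reporting the two together makes a single accuracy number easier to judge.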

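Feature importances (figure: feature_importance_revised)

Top 4 Features

  • American Indian population (figure: American_Indian_bargraph)
  • Population (figure: Population_boxplot)
  • Asian population (figure: Asian_bargraph)
  • Population_125_PL (figure: Population_125_PL_boxplot)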

Conclusions

  • There is some relationship between poverty and the pounds of toxins released
  • The higher the Asian population in a ZIP code, the fewer pounds of toxins are released there
  • American Indians make up only 1% of the Texas population
  • Future studies could explore:
    • More states, not just Texas
    • The number of industrial facilities in a neighborhood, rather than the pounds of toxins released
    • Other sources of pollution, such as bus depots, water treatment facilities, and municipal waste sites
