Credit Score Modeling and Classification

Which features are most useful in classifying credit scores?

Statistics 811, "Applied Statistical Modeling for Data Scientists," at Michigan State is part of the Master's in Data Science degree program and is taught by Dr. Paul Speaker.

The course covered a range of statistical modeling and data science project methods, including data visualization, regression, variance analysis, linear models, variable selection, categorical data analysis, experiment design, classification, and time series modeling.

Final course deliverables included the development of a classification model using a large dataset. Our team, which included my partners Steven Strachan and Chris Grandy, chose to build a classifier that predicts credit score category from a variety of features. The dataset we used can be found here; it is a synthetic dataset of 100,000 records with financial, occupational, and personal data. All modeling was completed in Python, with libraries including:

keras
scikit-learn
xgboost
scipy
pandas
numpy
matplotlib
seaborn

To read about the full process of classifier development, please see the report below, which details every stage, from exploratory data analysis and feature engineering through model generation and hyperparameter tuning.
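
For readers who want a feel for the workflow before opening the report, here is a minimal sketch of the model generation, hyperparameter tuning, and feature ranking steps using scikit-learn. It is not the code from our report: the data comes from `make_classification` as a stand-in for the credit dataset, the `feature_i` column names are hypothetical, and the random forest and its small parameter grid are assumptions chosen only to keep the example short.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption): three classes play the role of
# credit score categories. This is NOT the project's dataset.
X_raw, y = make_classification(
    n_samples=2000,
    n_features=8,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X_raw.shape[1])]  # hypothetical names
X = pd.DataFrame(X_raw, columns=feature_names)

# Hold out a stratified test set so each class keeps its share of rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameter tuning: 3-fold cross-validated grid search over a small,
# illustrative grid (assumption, not the grid used in the report).
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the best model on the held-out data.
best_model = search.best_estimator_
test_accuracy = best_model.score(X_test, y_test)

# Rank features by the forest's impurity-based importance (values sum to 1).
importances = pd.Series(
    best_model.feature_importances_, index=feature_names
).sort_values(ascending=False)

print("Best parameters:", search.best_params_)
print(f"Held-out accuracy: {test_accuracy:.3f}")
print(importances)
```

Impurity-based importances are quick to compute but can favor high-cardinality features, so a permutation-based check is a reasonable follow-up when the ranking matters.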

Latest Updates

After many mini-projects in front-end web development, this portfolio autobiography is the most substantial project for which I have used web development languages and GitHub. Keep up to date on my self-taught adventure by checking out the repository.

Get in Touch

Contact me or connect with any questions or collaborative interests, or just to chat with a fellow impassioned "Data Guru".

mikayla.e.norton@gmail.com
(517) 304-6875
Hillsboro, OR