Movie Recommendation Algorithms

The impact of mathematical optimization algorithms on selecting your next movie to watch

CMSE 831, also known as "Computational Optimization," is part of the Master's in Data Science degree program at Michigan State and was taught by Dr. Longxiu Huang.

The primary goal of this course was to emphasize the role of optimization algorithms in "Big Data" analysis. The course material highlights the mathematical foundations of optimization algorithm development, especially linear algebra and multivariable calculus.

By the conclusion of the course, students were expected to demonstrate their understanding of these algorithms on a large dataset. Working with fellow Data Science students Steven Strachan, Saumya Shah, and Lacey Hamilton, I elected to reformulate several algorithms that common streaming services might use to recommend films to users, and to review which of them worked most effectively.

The data source for the project can be found here. The project used Python for algorithm development, with particular emphasis on the following libraries:

  • keras
  • scikit-learn
  • tensorflow
  • pandas
  • numpy
  • matplotlib
  • seaborn
  • random

The full report and development can be read in detail below. Of the eleven matrix factorization methods we generated, three achieved the highest success rates (lowest error): Low Rank Matrix Recovery, Soft-Impute, and Mean Imputation. Given more time in the course, our team would have liked to build an ensemble model, with the ideal equation outlined in the report.
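
To give a feel for two of the methods named above, here is a minimal NumPy sketch of Mean Imputation and a basic Soft-Impute iteration (repeated singular value soft-thresholding) on a toy ratings matrix. It is an illustration under stated assumptions, not the code from the report: the function names, the toy data, and the parameters `lam`, `n_iters`, and `tol` are all hypothetical choices made for this example.

```python
import numpy as np


def mean_impute(ratings, mask):
    """Fill each missing rating with the mean of the observed ratings for that movie (column)."""
    observed_sum = np.where(mask, ratings, 0.0).sum(axis=0)
    observed_count = mask.sum(axis=0)
    # Fall back to the global mean for movies with no observed ratings.
    global_mean = ratings[mask].mean()
    col_means = np.where(
        observed_count > 0,
        observed_sum / np.maximum(observed_count, 1),
        global_mean,
    )
    return np.where(mask, ratings, col_means)


def soft_impute(ratings, mask, lam=1.0, n_iters=100, tol=1e-5):
    """Basic Soft-Impute: alternate between filling missing entries with the
    current estimate and soft-thresholding the singular values by lam."""
    estimate = np.zeros(ratings.shape, dtype=float)
    for _ in range(n_iters):
        # Keep observed ratings, use the current estimate everywhere else.
        combined = np.where(mask, ratings, estimate)
        U, s, Vt = np.linalg.svd(combined, full_matrices=False)
        s_shrunk = np.maximum(s - lam, 0.0)  # shrink singular values toward zero
        new_estimate = (U * s_shrunk) @ Vt
        change = np.linalg.norm(new_estimate - estimate) / max(
            np.linalg.norm(estimate), 1e-12
        )
        estimate = new_estimate
        if change < tol:
            break
    return estimate


def rmse(predicted, actual, mask):
    """Root mean squared error over the entries selected by mask."""
    diff = (predicted - actual)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy user-by-movie ratings matrix (made up for illustration); NaN marks "not rated".
toy_ratings = np.array(
    [
        [5.0, np.nan, 1.0],
        [4.0, 2.0, np.nan],
        [np.nan, 4.0, 2.0],
    ]
)
toy_mask = ~np.isnan(toy_ratings)

mean_filled = mean_impute(toy_ratings, toy_mask)
soft_filled = soft_impute(toy_ratings, toy_mask, lam=0.5)
```

In practice, a comparison like ours would hide a portion of the known ratings, run each method on what remains, and compute `rmse` on the hidden entries only. Larger values of `lam` push the Soft-Impute estimate toward lower rank.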