Amazon Product Recommender (MIT IDSS – Capstone Project)
As part of the MIT IDSS machine learning capstone, this project developed a product recommendation system using a real-world dataset of Amazon product reviews. The goal was to design a recommender that improves customer engagement by suggesting relevant items based on user behaviour.
The work began with cleaning and preprocessing the dataset: removing duplicates, handling missing values, and addressing sparse user–item interactions. Exploratory analysis investigated cold-start issues and examined user and product activity distributions, which informed the minimum rating thresholds used to reduce noise from infrequent interactions.
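The thresholding step described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the threshold values, the tuple layout, and the function name `filter_sparse` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical thresholds; the cut-offs actually used in the project are not stated.
MIN_USER_RATINGS = 2
MIN_ITEM_RATINGS = 2


def filter_sparse(ratings, min_user=MIN_USER_RATINGS, min_item=MIN_ITEM_RATINGS):
    """Keep ratings whose user and item both have enough interactions.

    `ratings` is a list of (user_id, item_id, rating) tuples (assumed layout).
    """
    # Drop exact duplicate rows while preserving the original order.
    ratings = list(dict.fromkeys(ratings))
    user_counts = Counter(user for user, _, _ in ratings)
    item_counts = Counter(item for _, item, _ in ratings)
    # Single pass: counts are taken before filtering and are not recomputed.
    return [
        (user, item, rating)
        for user, item, rating in ratings
        if user_counts[user] >= min_user and item_counts[item] >= min_item
    ]


# Toy data: u3 and item C each appear once; the last row is a duplicate.
sample = [
    ("u1", "A", 5.0),
    ("u1", "B", 4.0),
    ("u2", "A", 3.0),
    ("u2", "B", 2.0),
    ("u3", "C", 1.0),
    ("u1", "A", 5.0),
]
filtered = filter_sparse(sample)
```

Because the counts are computed once, a user could still fall below the threshold after sparse items are removed; repeating the filter until nothing changes is one way to handle that if it matters.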
Multiple models were implemented and evaluated, including:
- Rank-based recommender based on product popularity.
- User-based and item-based collaborative filtering with tuned similarity metrics.
- Matrix factorisation using SVD.
- A hybrid approach combining strengths of different models.
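As an illustration of the simplest model in the list above, a rank-based recommender can be sketched in a few lines. This is a hedged example only: ranking by mean rating with a minimum interaction count is one common formulation, and the function name, parameters, and toy data here are assumptions rather than the project's implementation.

```python
from collections import defaultdict


def rank_based_recommendations(ratings, n=3, min_interactions=2, exclude=()):
    """Return up to n item ids ranked by mean rating.

    Only items with at least `min_interactions` ratings are eligible, and
    items in `exclude` (e.g. ones the user already rated) are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1

    eligible = [
        (totals[item] / counts[item], counts[item], item)
        for item in counts
        if counts[item] >= min_interactions and item not in exclude
    ]
    # Highest mean first; ties broken by rating count, then by item id.
    eligible.sort(key=lambda row: (-row[0], -row[1], row[2]))
    return [item for _, _, item in eligible[:n]]


# Toy data: item C has a single rating and is therefore not eligible.
sample = [
    ("u1", "A", 5.0),
    ("u2", "A", 4.0),
    ("u1", "B", 3.0),
    ("u2", "B", 3.0),
    ("u3", "B", 4.0),
    ("u3", "C", 5.0),
]
top_items = rank_based_recommendations(sample, n=2)
```

A popularity ranking like this is not personalised, which is why it is often used as a fallback for cold-start users who have no rating history.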
All models were evaluated using precision@k and recall@k. The final solution was a deployable recommender served through Streamlit, which looks up precomputed predictions on demand to keep inference fast.
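For reference, precision@k and recall@k for a single user can be computed as sketched below. This is a simplified version that treats relevance as a set of item ids; the project may instead have derived relevance from a rating threshold, which is not stated. The function name and toy data are illustrative.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Compute precision@k and recall@k for one user.

    `recommended` is a ranked list of item ids and `relevant` is the set of
    items the user actually found relevant (assumed to be known).
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Precision: share of the top-k recommendations that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: share of the relevant items that appear in the top k.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


recommended = ["A", "B", "C", "D"]
relevant = {"A", "C", "E"}
precision, recall = precision_recall_at_k(recommended, relevant, k=3)
```

Here the precision denominator is the number of items actually recommended (at most k); some definitions divide by k regardless. Per-user values are typically averaged across users to score a model.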
Tech used: Python, Google Colab, Streamlit | Main libraries: Surprise, pickle.
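The precomputed-predictions approach can be illustrated with a small pickle round trip. This is a sketch under stated assumptions: the dictionary layout (user id mapped to a ranked item list), the file name, and the helper names are hypothetical, and the fallback behaviour for unknown users is one possible design rather than the project's.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical precomputed output: user id -> ranked list of item ids.
precomputed = {"u1": ["B", "C"], "u2": ["A"]}


def save_recommendations(recs, path):
    """Serialise the precomputed recommendations to disk."""
    with open(path, "wb") as fh:
        pickle.dump(recs, fh)


def load_recommendations(path):
    """Load recommendations saved by save_recommendations (trusted files only)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def recommend(user_id, recs, fallback=()):
    """Look up a user's list, or return the fallback (e.g. popular items)."""
    return recs.get(user_id, list(fallback))


# Round trip through a temporary file so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "recs.pkl"
    save_recommendations(precomputed, path)
    loaded = load_recommendations(path)

known = recommend("u1", loaded)
unknown = recommend("u9", loaded, fallback=["A"])
```

In a Streamlit app the load step would usually be cached (for example with `st.cache_data`) so the file is read once rather than on every rerun. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.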
Project Links
Notebook Preview
Due to MIT IDSS copyright restrictions, the notebook cannot be shared publicly. I am happy to discuss further details upon request.