Big Data Project
Instacart Market Basket Analysis (Kaggle Competition)
Copied from this blog post; all credit goes to the blog author.
User features - what is this user like?
Item features - what is this item like?
User x item features - how does this user feel about this item?
Datetime features - what is this day and hour like?

Here are some of the ideas behind the features I created.
User features:
- How often the user reordered items
- Time between orders
- Time of day the user visits
- Whether the user ordered organic, gluten-free, or Asian items in the past
- Features based on order sizes
- How many of the user’s orders contained no previously purchased items
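A few of the user features above can be sketched in plain Python. This is only an illustration on toy data, not the blog author's code: the row layout, the `user_features` helper, and the output names are assumptions, loosely modeled on the `order_number`, `days_since_prior_order`, and `reordered` fields of the Kaggle data.

```python
from collections import defaultdict
from statistics import mean

# Toy rows (assumed layout):
# (user_id, order_number, days_since_prior_order, [(product, reordered_flag), ...])
ORDERS = [
    (1, 1, None, [("milk", 0), ("bread", 0)]),
    (1, 2, 7, [("milk", 1), ("eggs", 0)]),
    (1, 3, 14, [("tea", 0)]),
    (2, 1, None, [("rice", 0)]),
    (2, 2, 30, [("rice", 1), ("beans", 0), ("salsa", 0)]),
]


def user_features(orders):
    """Aggregate a handful of per-user statistics (hypothetical helper)."""
    by_user = defaultdict(list)
    for user_id, order_number, gap, items in orders:
        by_user[user_id].append((order_number, gap, items))

    features = {}
    for user_id, rows in by_user.items():
        rows.sort(key=lambda row: row[0])  # chronological order
        flags = [flag for _, _, items in rows for _, flag in items]
        gaps = [gap for _, gap, _ in rows if gap is not None]
        features[user_id] = {
            # Share of all purchased items that were reorders
            "reorder_rate": sum(flags) / len(flags),
            # Average number of days between consecutive orders
            "mean_days_between_orders": mean(gaps) if gaps else None,
            # Average number of items per order
            "mean_order_size": mean(len(items) for _, _, items in rows),
            # Orders after the first one that contain no reordered item
            "orders_with_no_reorders": sum(
                1 for _, _, items in rows[1:]
                if not any(flag for _, flag in items)
            ),
        }
    return features


USER_FEATURES = user_features(ORDERS)
print(USER_FEATURES[1])
```

The first order is skipped when counting orders with no reorders, since nothing can be a reorder there. Whether the original features handled it the same way is not stated in the source.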
Item features:
- How often the item is purchased
- Position in the cart
- How many users buy it as a "one shot" item
- Stats on the number of items that co-occur with this item
- Stats on the order streak
- Probability of being reordered within N orders
- Distribution of the day of week it is ordered
- Probability it is reordered after the first order
- Statistics around the time between orders
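As a rough sketch of some of the item features above, the snippet below computes purchase counts, mean cart position, and the share of buyers who bought an item only once. The toy rows, the `item_features` helper, and the definition of "one shot" (bought exactly once by a user) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy rows (assumed layout): (user_id, order_number, product, add_to_cart_order)
ROWS = [
    (1, 1, "milk", 1), (1, 1, "bread", 2),
    (1, 2, "milk", 1), (1, 2, "eggs", 2),
    (2, 1, "milk", 2), (2, 1, "tea", 1),
    (2, 2, "tea", 1),
    (3, 1, "bread", 1),
]


def item_features(rows):
    """Aggregate a handful of per-item statistics (hypothetical helper)."""
    positions = defaultdict(list)     # product -> cart positions
    per_user = defaultdict(Counter)   # product -> user -> number of purchases
    for user_id, _order_number, product, position in rows:
        positions[product].append(position)
        per_user[product][user_id] += 1

    features = {}
    for product, counts in per_user.items():
        buyers = len(counts)
        one_shot = sum(1 for n in counts.values() if n == 1)
        features[product] = {
            "times_purchased": len(positions[product]),
            "mean_cart_position": sum(positions[product]) / len(positions[product]),
            # Share of buyers who purchased the item exactly once
            "one_shot_share": one_shot / buyers,
            # Share of buyers who came back for it at least once
            "reordered_after_first": (buyers - one_shot) / buyers,
        }
    return features


ITEM_FEATURES = item_features(ROWS)
print(ITEM_FEATURES["milk"])
```

On real data the one-shot share is biased upward by users whose first purchase of the item is very recent, so a cutoff on how many later orders a user has may be worth considering.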
User x item features:
- Number of orders in which the user purchases the item
- Days since the user last purchased the item
- Streak (number of orders in a row the user has purchased the item)
- Position in the cart
- Whether the user already ordered the item today
- Co-occurrence statistics
- Replacement items
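Three of the user x item features above (order count, streak, and days since last purchase) could be computed along these lines. The `HISTORY` structure and the `user_item_features` helper are hypothetical, and "streak" is interpreted here as the current run of most recent consecutive orders containing the item, which is one possible reading of the description.

```python
# Toy history (assumed layout): user_id -> chronological list of
# (days_since_prior_order, set of products in that order)
HISTORY = {
    1: [
        (None, {"milk", "bread"}),
        (7, {"milk"}),
        (3, {"milk", "eggs"}),
        (10, {"bread"}),
    ],
}


def user_item_features(history):
    """Compute per (user, product) statistics (hypothetical helper)."""
    features = {}
    for user_id, orders in history.items():
        products = set().union(*(items for _, items in orders))
        for product in products:
            contains = [product in items for _, items in orders]

            # Current streak: most recent consecutive orders with the product
            streak = 0
            for hit in reversed(contains):
                if not hit:
                    break
                streak += 1

            # Days since last purchase, measured at the user's latest order:
            # sum the gaps of every order placed after the last purchase
            last = max(i for i, hit in enumerate(contains) if hit)
            days_since = sum(gap for gap, _ in orders[last + 1:])

            features[(user_id, product)] = {
                "orders_with_item": sum(contains),
                "current_streak": streak,
                "days_since_last_purchase": days_since,
            }
    return features


USER_ITEM_FEATURES = user_item_features(HISTORY)
print(USER_ITEM_FEATURES[(1, "milk")])
```

When building a training row for an upcoming order, the gap before that order would normally be added to `days_since_last_purchase`; the sketch leaves that step out.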
Datetime features:
- Counts by day of week
- Counts by hour
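The datetime counts are simple frequency tables. A minimal sketch, assuming each order is reduced to a (day of week, hour of day) pair; the `ORDER_TIMES` data and the `datetime_counts` helper are made up for illustration.

```python
from collections import Counter

# Toy data (assumed layout): one (day_of_week, hour_of_day) pair per order,
# with day_of_week in 0..6 and hour_of_day in 0..23
ORDER_TIMES = [(0, 10), (0, 10), (0, 15), (1, 9), (6, 10)]


def datetime_counts(order_times):
    """Count orders per day of week and per hour (hypothetical helper)."""
    by_dow = Counter(dow for dow, _ in order_times)
    by_hour = Counter(hour for _, hour in order_times)
    return by_dow, by_hour


BY_DOW, BY_HOUR = datetime_counts(ORDER_TIMES)
print(BY_DOW.most_common(1), BY_HOUR.most_common(1))
```

Each order's feature row can then look up the count for its own day and hour, which gives the model a sense of how busy that time slot usually is.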