Paper:
"LORE: A large-scale offer recommendation engine through the lens of an online subscription service"
- Objectives:
- Optimizing recommendations under constraints is usually NP-hard
- In the online allocation problem, the goal is to maximize revenue as users arrive in an online manner, with decisions that are irrevocable and instantaneous.
- The paper introduces a Min-Cost Flow network optimization, which satisfies the constraints within the optimization itself and can be solved in polynomial time.
- Min-Cost Flow Network
- The Min-Cost Flow network is formulated with constraints similar to an Integer Linear Program, but instead of solving over integers, the variables are relaxed to real numbers (the "relaxed" LP form), as sketched below
- Framing it in the context of uplift modelling for conversion requires propensity modelling
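- A minimal sketch of this relaxed formulation, assuming a small user × offer matrix of predicted values `p` and per-offer capacity caps `caps` (both illustrative, not from the paper):

```python
import numpy as np
from scipy.optimize import linprog

# Predicted value (e.g., conversion uplift x revenue) of offer j for user i.
# Purely illustrative numbers, not from the paper.
p = np.array([[0.30, 0.10],
              [0.20, 0.40],
              [0.25, 0.35]])
n_users, n_offers = p.shape
caps = np.array([2, 1])  # at most caps[j] users may receive offer j

# Decision variable x[i, j] in [0, 1] (relaxed from {0, 1}), flattened row-wise.
c = -p.ravel()  # linprog minimizes, so negate to maximize total value

# Each user receives at most one offer: sum_j x[i, j] <= 1
A_user = np.kron(np.eye(n_users), np.ones((1, n_offers)))
# Offer capacity: sum_i x[i, j] <= caps[j]
A_offer = np.kron(np.ones((1, n_users)), np.eye(n_offers))

res = linprog(
    c,
    A_ub=np.vstack([A_user, A_offer]),
    b_ub=np.concatenate([np.ones(n_users), caps]),
    bounds=[(0.0, 1.0)] * (n_users * n_offers),
)
print(res.x.reshape(n_users, n_offers).round(2))  # fractional assignment matrix
```

- Because this bipartite user-to-offer constraint matrix is totally unimodular, a vertex solution of the relaxation is integral anyway, which is what lets the flow formulation sidestep the NP-hard integer program.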
- Offline Policy Evaluation
- An alternative to A/B testing is offline policy evaluation that can serve as a proof of concept test before deployment
- Possible methods (a code sketch of all four estimators follows this list):
- Direct Method (DM) Estimator
- High bias, since it relies entirely on the accuracy of a learned reward model
- Inverse Propensity Score (IPS) Estimator
- Unbiased estimator, but has high variance, especially when the logging policy's propensity is close to 0
- Self-Normalized IPS (SNIPS) Estimator
- Controls the variance of IPS by normalizing with the sum of importance weights, which acts as a control variate
- Doubly Robust (DR) Estimator
- Balances bias and variance
- Uses the DM estimate as a baseline and applies an IPS-based correction whenever reward data is available (i.e., when the new and logging policy actions match)
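- A minimal sketch of the four estimators under the standard logged-bandit-feedback setup; the names (`pi_new`, `pi_log`, `r_hat`, `actions`, `rewards`) are illustrative assumptions, not the paper's notation:

```python
import numpy as np

def dm(pi_new, r_hat):
    """Direct Method: average a learned reward model over the new policy.
    pi_new: (n, k) action probabilities under the new policy per context.
    r_hat:  (n, k) model-predicted reward for every context/action pair."""
    return np.mean(np.sum(pi_new * r_hat, axis=1))

def ips(pi_new, pi_log, actions, rewards):
    """Inverse Propensity Score: reweight logged rewards by the policy ratio.
    actions: (n,) logged action indices; rewards: (n,) observed rewards."""
    idx = np.arange(len(actions))
    w = pi_new[idx, actions] / pi_log[idx, actions]  # explodes as pi_log -> 0
    return np.mean(w * rewards)

def snips(pi_new, pi_log, actions, rewards):
    """Self-Normalized IPS: normalize by the weight sum (a control variate)."""
    idx = np.arange(len(actions))
    w = pi_new[idx, actions] / pi_log[idx, actions]
    return np.sum(w * rewards) / np.sum(w)

def dr(pi_new, pi_log, actions, rewards, r_hat):
    """Doubly Robust: DM baseline plus an IPS correction on the model residual."""
    idx = np.arange(len(actions))
    w = pi_new[idx, actions] / pi_log[idx, actions]
    baseline = np.sum(pi_new * r_hat, axis=1)         # DM term per context
    correction = w * (rewards - r_hat[idx, actions])  # importance-weighted residual
    return np.mean(baseline + correction)
```

- Each function returns a scalar estimate of the new policy's expected reward, so candidate policies can be compared offline before committing to an A/B test.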