A PHP-based recommender system that can predict values, find similar items, or recommend items to a user.
The MovieLens small dataset file ratings.csv is included. The example code uses this file to build the dataset for its test cases.
- Cosine of angle
- Manhattan distance
- Euclidean distance
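The three measures listed above can be sketched in plain PHP as follows. This is a minimal illustration, not the library's DistanceCalculator code: the function names are hypothetical, and it assumes both vectors are list-style arrays of equal length with missing values already filled in.

```php
<?php
// Hypothetical helpers for illustration only; not part of UglyRecommender.

/** Reject vectors of different length (the library has its own exception for this). */
function requireSameLength(array $a, array $b): void
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }
}

/** Cosine of the angle between two vectors; 1.0 means same direction. */
function cosineSimilarity(array $a, array $b): float
{
    requireSameLength($a, $b);
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // the angle is undefined for a zero vector; treated as "no similarity"
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

/** Manhattan distance: sum of absolute coordinate differences. */
function manhattanDistance(array $a, array $b): float
{
    requireSameLength($a, $b);
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += abs($value - $b[$i]);
    }
    return $sum;
}

/** Euclidean distance: square root of the sum of squared differences. */
function euclideanDistance(array $a, array $b): float
{
    requireSameLength($a, $b);
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $diff = $value - $b[$i];
        $sum += $diff * $diff;
    }
    return sqrt($sum);
}

$u = [3, 4];
$v = [4, 3];
$cosine    = cosineSimilarity($u, $v);  // 24 / 25 = 0.96
$manhattan = manhattanDistance($u, $v); // |3-4| + |4-3| = 2
$euclidean = euclideanDistance($u, $v); // sqrt(1 + 1)
```

Note that cosine is a similarity (larger means closer), while the other two are distances (smaller means closer).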
include "src/autoload.php";
use UglyRecommender\DistanceCalculator;
use UglyRecommender\VectorsUnequalException;
use UglyRecommender\NeighborsNotFoundException;
use UglyRecommender\RecommenderSystem;
use UglyRecommender\DataHelper\CsvLoadHelper;
Every dataset has its own format, but the RecommenderSystem class expects data to be normalized into a specific format before it is passed to the class for operations. We use the MovieLens dataset "ratings.csv" throughout our examples. Each row in ratings.csv has the format userid,movieid,rating_given
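To make the row format concrete, here is a rough sketch of how such rows could be grouped by user. The function name parseRatings and the nested-array layout are assumptions made for illustration; the format the library actually uses internally is not described here.

```php
<?php
// Hypothetical illustration: turns "userid,movieid,rating_given" lines into
// a nested array [userId => [movieId => rating]]. This layout is an
// assumption, not the library's internal representation.
function parseRatings(array $lines): array
{
    $ratings = [];
    foreach ($lines as $line) {
        $fields = explode(',', trim($line));
        // Skip blank lines, short rows, and a header row (non-numeric rating).
        if (count($fields) < 3 || !is_numeric($fields[2])) {
            continue;
        }
        // Any columns after the third are ignored.
        [$userId, $movieId, $rating] = $fields;
        $ratings[$userId][$movieId] = (float) $rating;
    }
    return $ratings;
}

$ratings = parseRatings([
    "1,31,2.5",
    "1,1029,3.0",
    "2,10,4.0",
]);
// $ratings[1] now holds the two movies rated by user 1.
```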
This recommender system uses the DataMatrix class to represent a dataset. We will use CsvLoadHelper to construct a DataMatrix filled with values from "ratings.csv".
$dataMatrix=CsvLoadHelper::load("ratings.csv");
//we will fill missing values with 0 for this dataset
$dataMatrix->setMissingValues(0);
$recommender=new RecommenderSystem();
$recommender->setDataMatrix($dataMatrix);
$recommender->setUnknownValue(0); //this tells RecommenderSystem that missing values are filled with 0
$recommender->setDistanceMethod("cosine");
//get recommendations for user "448": use at most 100 neighbors (similar users) and return at most 5 results
$recommendations=$recommender->getRecommendations("448",100,5);
$recommender->setDistanceMethod("manhattan");
$recommendations=$recommender->getRecommendations("448",100,5);
//use maximum 100 neighbors to predict
$value=$recommender->predict("448","5",100);
//use maximum 100 neighbors to predict and apply weights to the nearest neighbors (9 weights are supplied here)
$value=$recommender->predict("448","5",100,[1.3,1.8,1.7,1.4,1.2,1.1,1.05,1.12,1.13]);
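One plausible reading of neighbor weights is a weighted average of the neighbors' ratings, sketched below. How predict() actually combines the weights is not stated here, so the function weightedPrediction and its behavior (unweighted neighbors default to a weight of 1.0) are assumptions for illustration only.

```php
<?php
// Hypothetical sketch: weighted average of neighbor ratings.
// $neighborRatings is ordered from nearest to farthest neighbor;
// $weights[$rank] applies to the neighbor at that rank, and neighbors
// without an explicit weight are assumed to count with weight 1.0.
function weightedPrediction(array $neighborRatings, array $weights = []): ?float
{
    $weightedSum = 0.0;
    $weightTotal = 0.0;
    foreach (array_values($neighborRatings) as $rank => $rating) {
        $weight = $weights[$rank] ?? 1.0;
        $weightedSum += $weight * $rating;
        $weightTotal += $weight;
    }
    // No neighbors (or zero total weight) means no prediction can be made.
    return $weightTotal > 0.0 ? $weightedSum / $weightTotal : null;
}

$plain    = weightedPrediction([4.0, 2.0]);        // (4 + 2) / 2 = 3.0
$weighted = weightedPrediction([4.0, 2.0], [3.0]); // (3*4 + 1*2) / (3 + 1) = 3.5
```

With weights above 1.0 for the closest neighbors, their ratings pull the prediction toward themselves, which matches the intent of weighting the nearest users more heavily.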
$recommender->setDistanceMethod("cosine"); //use cosine similarity
$similar_users=$recommender->getNearestNeighbors("448","",15);
$recommender->setDistanceMethod("cosine");
$similar_users=$recommender->getNearestNeighbors("448","5",15);
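For intuition, a nearest-neighbor search over a small ratings matrix might look like the sketch below. It is not the library's implementation: the function names are hypothetical, and treating the optional item argument as "only keep neighbors who have rated this item" is a guess at what the second argument of getNearestNeighbors does.

```php
<?php
// Hypothetical sketch, not the library's implementation.

/** Cosine of the angle between two rows keyed by item id. */
function cosineOfAngle(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $key => $value) {
        $other  = $b[$key] ?? 0;
        $dot   += $value * $other;
        $normA += $value * $value;
        $normB += $other * $other;
    }
    $denominator = sqrt($normA) * sqrt($normB);
    return $denominator > 0.0 ? $dot / $denominator : 0.0;
}

/**
 * Returns up to $max users most similar to $userId as [userId => similarity].
 * If $itemId is given, users whose rating for that item equals $unknown
 * are skipped (assumed meaning of the item filter).
 */
function nearestNeighbors(array $matrix, string $userId, int $max, ?string $itemId = null, $unknown = 0): array
{
    $scores = [];
    foreach ($matrix as $otherId => $row) {
        if ((string) $otherId === $userId) {
            continue; // a user is not their own neighbor
        }
        if ($itemId !== null && ($row[$itemId] ?? $unknown) == $unknown) {
            continue; // neighbor has not rated the requested item
        }
        $scores[$otherId] = cosineOfAngle($matrix[$userId], $row);
    }
    arsort($scores); // highest similarity first, keys preserved
    return array_slice($scores, 0, $max, true);
}

// Toy matrix with missing ratings already filled with 0.
$matrix = [
    "448" => ["5" => 4, "7" => 0, "9" => 5],
    "12"  => ["5" => 4, "7" => 0, "9" => 5],
    "30"  => ["5" => 0, "7" => 3, "9" => 0],
    "77"  => ["5" => 0, "7" => 0, "9" => 2],
];
$anyItem  = nearestNeighbors($matrix, "448", 2);      // users 12 and 77, in that order
$ratedFive = nearestNeighbors($matrix, "448", 2, "5"); // only user 12 rated item 5
```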
- Add error calculation methods
- Adjust weights depending on the error value
- Include Hamming distance to support string-based data
- Matrix factorization
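As a rough idea of what the planned error calculation could involve, the sketch below computes two common measures, mean absolute error (MAE) and root mean squared error (RMSE), over paired predicted and actual ratings. Nothing here exists in the library yet; the function names and the choice of measures are assumptions.

```php
<?php
// Hypothetical sketch of two common error measures; not part of the library.

/** Validate that both lists are non-empty and of equal length. */
function requirePairedValues(array $predicted, array $actual): void
{
    if (count($predicted) === 0 || count($predicted) !== count($actual)) {
        throw new InvalidArgumentException('Need two non-empty lists of equal length.');
    }
}

/** Mean absolute error: average of |predicted - actual|. */
function meanAbsoluteError(array $predicted, array $actual): float
{
    requirePairedValues($predicted, $actual);
    $sum = 0.0;
    foreach (array_values($predicted) as $i => $value) {
        $sum += abs($value - array_values($actual)[$i]);
    }
    return $sum / count($predicted);
}

/** Root mean squared error: sqrt of the average squared difference. */
function rootMeanSquaredError(array $predicted, array $actual): float
{
    requirePairedValues($predicted, $actual);
    $sum = 0.0;
    foreach (array_values($predicted) as $i => $value) {
        $diff = $value - array_values($actual)[$i];
        $sum += $diff * $diff;
    }
    return sqrt($sum / count($predicted));
}

$mae  = meanAbsoluteError([3.0, 4.0], [4.0, 2.0]);    // (1 + 2) / 2 = 1.5
$rmse = rootMeanSquaredError([3.0, 4.0], [4.0, 2.0]); // sqrt((1 + 4) / 2)
```

RMSE penalizes large misses more strongly than MAE, which may matter if the error value is later used to adjust neighbor weights.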