This is an MXNet implementation of the paper:
Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu and Tat-Seng Chua (2017). Neural Collaborative Filtering. In Proceedings of WWW '17, Perth, Australia, April 03-07, 2017.
This example provides three collaborative filtering models: Generalized Matrix Factorization (GMF), Multi-Layer Perceptron (MLP), and Neural Matrix Factorization (NeuMF). To target implicit feedback and the ranking task, we optimize the models using log loss with negative sampling.
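As a rough illustration of log loss with negative sampling, the sketch below pairs each observed (positive) interaction with a few randomly drawn unobserved items labeled 0 and averages the binary cross-entropy over all of them. The helper names `sample_negatives` and `log_loss` are hypothetical and are not taken from `train.py`; the actual sampling strategy and loss implementation in this example may differ.

```python
import math
import random

def sample_negatives(user_items, num_items, num_neg, rng):
    """Draw num_neg item IDs that the user has not interacted with.

    Assumes the user has fewer than num_items interactions; otherwise
    the loop would never find a valid negative.
    """
    negatives = []
    while len(negatives) < num_neg:
        item = rng.randrange(num_items)
        if item not in user_items:
            negatives.append(item)
    return negatives

def log_loss(labels, scores, eps=1e-12):
    """Mean binary cross-entropy over (label, predicted probability) pairs."""
    total = 0.0
    for y, p in zip(labels, scores):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

# One positive item plus four sampled negatives for a single user.
rng = random.Random(0)
negatives = sample_negatives({1, 3}, num_items=10, num_neg=4, rng=rng)
labels = [1] + [0] * len(negatives)
scores = [0.9] + [0.1] * len(negatives)  # stand-in model outputs
print(negatives, log_loss(labels, scores))
```

The number of negatives per positive is a tunable trade-off: more negatives give the model more contrast per observed interaction at the cost of more computation per epoch.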
Author: Dr. Xiangnan He (http://www.comp.nus.edu.sg/~xiangnan/)
Code Reference: https://github.com/hexiangnan/neural_collaborative_filtering
We use MXNet with MKL-DNN as the backend.
- MXNet version: MXNet master (TBD)
pip install -r requirements.txt
We provide the processed MovieLens 20 Million (ml-20m) dataset on Google Drive. You can download it directly or run the script below to prepare the dataset:
python convert.py ./data/
train-ratings.csv
- Train file (positive instances).
- Each line is a training instance: userID\t itemID\t
test-ratings.csv
- Test file (positive instances).
- Each line is a testing instance: userID\t itemID\t
test-negative.csv
- Test file (negative instances).
- Each line corresponds to the same line of test-ratings.csv and contains 999 negative samples.
- Each line is in the format: userID,\t negativeItemID1\t negativeItemID2 ...
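The file formats above can be read with a few lines of Python. The following is a minimal sketch, assuming the fields are tab-separated exactly as described; the function names `parse_rating_line` and `parse_negative_line` are hypothetical and do not come from `convert.py`, whose actual output may differ in detail.

```python
def parse_rating_line(line):
    """Parse a tab-separated 'userID itemID' line into a pair of ints."""
    fields = line.strip().split("\t")
    return int(fields[0]), int(fields[1])

def parse_negative_line(line):
    """Parse a 'userID, neg1 neg2 ...' line into (user, [negative items]).

    A trailing comma after the user ID is tolerated, since the format
    description shows one.
    """
    fields = [f.strip().rstrip(",") for f in line.strip().split("\t")]
    user = int(fields[0])
    negatives = [int(f) for f in fields[1:] if f]
    return user, negatives

# Example lines written by hand for illustration (not real dataset rows).
print(parse_rating_line("1\t5\t\n"))
print(parse_negative_line("1,\t7\t9\t11\n"))
```

Because line N of test-negative.csv is described as matching line N of test-ratings.csv, the two files can be read in lockstep (for example with `zip`) to build one evaluation case per test user.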
We provide the pretrained ml-20m model on Google Drive. You can download it directly for evaluation or calibration.
dtype | HR@10 | NDCG@10 |
---|---|---|
float32 | 0.6393 | 0.3849 |
float32 opt | 0.6393 | 0.3849 |
int8 | 0.6395 | 0.3852 |
int8 opt | 0.6396 | 0.3852 |
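
For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute HR@10 and NDCG@10 for a single test case with exactly one held-out positive item ranked against its negatives. It is an illustration under that assumption, and the function name `hit_ratio_and_ndcg` is hypothetical; the evaluation code in `ncf.py` may be organized differently.

```python
import math

def hit_ratio_and_ndcg(ranked_items, positive_item, k=10):
    """Return (HR@k, NDCG@k) for one test case with a single positive item.

    ranked_items is the candidate list sorted by predicted score,
    highest first.
    """
    top_k = ranked_items[:k]
    if positive_item not in top_k:
        return 0.0, 0.0
    rank = top_k.index(positive_item)  # 0-based position in the top-k list
    # With a single relevant item the ideal DCG is 1, so NDCG equals DCG.
    return 1.0, 1.0 / math.log2(rank + 2)

# The positive item (ID 7) is ranked second among the candidates.
print(hit_ratio_and_ndcg([3, 7, 1, 8], positive_item=7))
```

Averaging both values over all test users yields dataset-level figures comparable in kind to those reported in the table.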
# train ncf model with ml-20m dataset
python train.py # --gpu=0
# optimize ncf model
python model_optimizer.py
# neumf calibration on ml-20m dataset
python ncf.py --prefix=./model/ml-20m/neumf --calibration
# optimized neumf calibration on ml-20m dataset
python ncf.py --prefix=./model/ml-20m/neumf-opt --calibration
# neumf float32 inference on ml-20m dataset
python ncf.py --batch-size=1000 --prefix=./model/ml-20m/neumf
# optimized neumf float32 inference on ml-20m dataset
python ncf.py --batch-size=1000 --prefix=./model/ml-20m/neumf-opt
# neumf int8 inference on ml-20m dataset
python ncf.py --batch-size=1000 --prefix=./model/ml-20m/neumf-quantized
# optimized neumf int8 inference on ml-20m dataset
python ncf.py --batch-size=1000 --prefix=./model/ml-20m/neumf-opt-quantized
usage: bash ./benchmark.sh [[[-p prefix ] [-e epoch] [-d dataset] [-b batch_size] [-i instance] [-c cores/instance]] | [-h]]
# neumf float32 benchmark on ml-20m dataset
sh benchmark.sh -p model/ml-20m/neumf
# optimized neumf float32 benchmark on ml-20m dataset
sh benchmark.sh -p model/ml-20m/neumf-opt
# neumf int8 benchmark on ml-20m dataset
sh benchmark.sh -p model/ml-20m/neumf-quantized
# optimized neumf int8 benchmark on ml-20m dataset
sh benchmark.sh -p model/ml-20m/neumf-opt-quantized