ankane/disco-rust

Disco Rust

🔥 Recommendations for Rust using collaborative filtering

  • Supports user-based and item-based recommendations
  • Works with explicit and implicit feedback
  • Uses high-performance matrix factorization

🎉 Zero dependencies

Installation

Add this line to your application’s Cargo.toml under [dependencies]:

discorec = "0.3"

Getting Started

Prep your data in the format (user_id, item_id, value)

let data = vec![
    ("user_a", "item_a", 5.0),
    ("user_a", "item_b", 3.5),
    ("user_b", "item_a", 4.0),
];

IDs can be integers, strings, or any other hashable data type

(1, "item_a".to_string(), 5.0)

If users rate items directly, this is known as explicit feedback. Fit the recommender with:

use discorec::Recommender;

let recommender = Recommender::fit_explicit(&data);

If users don’t rate items directly (for instance, they’re purchasing items or reading posts), this is known as implicit feedback. Use 1.0 or a value like number of purchases or page views for the dataset, and fit the recommender with:

let recommender = Recommender::fit_implicit(&data);
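
When the raw data is an event log rather than ready-made rows, the events can be collapsed into counts first. The following is a minimal sketch using only the standard library; `count_events` is a hypothetical helper and not part of discorec.

```rust
use std::collections::BTreeMap;

/// Collapse raw (user_id, item_id) events into (user_id, item_id, count) rows.
/// Hypothetical helper for illustration; not part of discorec.
fn count_events(events: &[(&str, &str)]) -> Vec<(String, String, f32)> {
    // BTreeMap keeps the output in a stable, sorted order.
    let mut counts: BTreeMap<(&str, &str), f32> = BTreeMap::new();
    for &(user_id, item_id) in events {
        *counts.entry((user_id, item_id)).or_insert(0.0) += 1.0;
    }
    counts
        .into_iter()
        .map(|((user_id, item_id), n)| (user_id.to_string(), item_id.to_string(), n))
        .collect()
}
```

The resulting rows have the same (user_id, item_id, value) shape as the data above, so they should be usable as input to fit_implicit.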

Get user-based recommendations - “users like you also liked”

recommender.user_recs(&user_id, 5);

Get item-based recommendations - “users who liked this item also liked”

recommender.item_recs(&item_id, 5);

Get the predicted rating for a specific user and item

recommender.predict(&user_id, &item_id);

Get similar users

recommender.similar_users(&user_id, 5);

Examples

MovieLens

Download the MovieLens 100K dataset and use:

use discorec::RecommenderBuilder;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let mut data = Vec::with_capacity(100000);

    let file = File::open("path/to/ml-100k/u.data").unwrap();
    let rdr = BufReader::new(file);
    for line in rdr.lines() {
        let line = line.unwrap();
        let mut row = line.split('\t');

        let user_id: i32 = row.next().unwrap().parse().unwrap();
        let item_id: i32 = row.next().unwrap().parse().unwrap();
        let rating: f32 = row.next().unwrap().parse().unwrap();

        data.push((user_id, item_id, rating));
    }

    let (train_set, valid_set) = data.split_at(80000);

    let recommender = RecommenderBuilder::new()
        .factors(20)
        .fit_explicit(train_set);
    println!("RMSE: {:?}", recommender.rmse(valid_set));
}

Storing Recommendations

Save recommendations to your database.

Alternatively, you can store only the factors and use a library like pgvector-rust. See an example.
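
Once factors are stored, finding similar users or items becomes a nearest-neighbor search over vectors. Cosine similarity is one common measure for that search. The sketch below shows the calculation with the standard library only; `cosine_similarity` is a hypothetical helper, and nothing here confirms which measure Disco uses internally.

```rust
/// Cosine similarity between two factor vectors of equal length.
/// Returns None if the lengths differ or either vector has zero norm.
/// Hypothetical helper for illustration; not part of discorec.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```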

Algorithms

Disco uses high-performance matrix factorization.

Specify the number of factors and iterations

RecommenderBuilder::new()
    .factors(8)
    .iterations(20)
    .fit_explicit(&train_set);

Progress

Pass a callback to show progress

RecommenderBuilder::new()
    .callback(|info| println!("{:?}", info))
    .fit_explicit(&train_set);

Note: train_loss and valid_loss are not available for implicit feedback

Validation

Pass a validation set with explicit feedback

RecommenderBuilder::new()
    .callback(|info| println!("{:?}", info))
    .fit_eval_explicit(&train_set, &valid_set);

The loss function is RMSE
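
For reference, RMSE (root mean squared error) is the square root of the mean squared difference between predicted and actual ratings. The sketch below computes it from two parallel slices; the `rmse` function here is a standalone illustration and is separate from the recommender's own rmse method.

```rust
/// Root mean squared error between predicted and actual ratings.
/// Returns None for empty input or mismatched lengths.
/// Standalone illustration; not the library's implementation.
fn rmse(predicted: &[f32], actual: &[f32]) -> Option<f32> {
    if predicted.is_empty() || predicted.len() != actual.len() {
        return None;
    }
    let sum_sq: f32 = predicted
        .iter()
        .zip(actual)
        .map(|(p, a)| (p - a).powi(2))
        .sum();
    Some((sum_sq / predicted.len() as f32).sqrt())
}
```

Lower values mean the predictions are closer to the held-out ratings.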

Cold Start

Collaborative filtering suffers from the cold start problem. It’s unable to make good recommendations without data on a user or item, which is problematic for new users and items.

recommender.user_recs(&new_user_id, 5); // returns an empty Vec

There are a number of ways to deal with this, but here are some common ones:

  • For user-based recommendations, show new users the most popular items
  • For item-based recommendations, make content-based recommendations
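
The first option can be sketched as a simple popularity ranking over the training rows. `most_popular` is a hypothetical helper, and counting rows per item is only one way to define popularity (summing values would be another).

```rust
use std::collections::HashMap;

/// Return up to `count` item IDs, ranked by how many rows mention each item.
/// Hypothetical helper for illustration; not part of discorec.
fn most_popular(data: &[(i32, i32, f32)], count: usize) -> Vec<i32> {
    let mut totals: HashMap<i32, usize> = HashMap::new();
    for &(_, item_id, _) in data {
        *totals.entry(item_id).or_insert(0) += 1;
    }
    let mut ranked: Vec<(i32, usize)> = totals.into_iter().collect();
    // Most interactions first; ties broken by item ID for a deterministic order.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
        .into_iter()
        .take(count)
        .map(|(item_id, _)| item_id)
        .collect()
}
```

An application could call this whenever user_recs returns nothing for a user.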

Reference

Get IDs

recommender.user_ids();
recommender.item_ids();

Get the global mean

recommender.global_mean();

Get factors

recommender.user_factors(&user_id);
recommender.item_factors(&item_id);
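
In matrix factorization models generally, a rating is often estimated as a bias term plus the dot product of the user's and item's factor vectors. The sketch below shows that textbook formulation with the global mean as the bias. It is an assumption for illustration: this README does not state that predict combines these values in exactly this way, and `predict_from_factors` is a hypothetical helper.

```rust
/// Textbook matrix factorization estimate: global mean plus the dot product
/// of the user and item factor vectors. Returns None if the lengths differ.
/// Hypothetical helper; an assumed formulation, not confirmed as Disco's own.
fn predict_from_factors(
    global_mean: f32,
    user_factors: &[f32],
    item_factors: &[f32],
) -> Option<f32> {
    if user_factors.len() != item_factors.len() {
        return None;
    }
    let dot: f32 = user_factors
        .iter()
        .zip(item_factors)
        .map(|(u, v)| u * v)
        .sum();
    Some(global_mean + dot)
}
```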

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone https://github.com/ankane/disco-rust.git
cd disco-rust
cargo test
