#suffix-array #bioinformatics #saca #suffixarray

sais_drum

An implementation of the SAIS algorithm for suffix array construction

2 releases

Uses new Rust 2024

0.1.1 Sep 17, 2025
0.1.0 Sep 17, 2025

#400 in Biology

MIT/Apache

65KB
1.5K SLoC

🥁 SAIS-drum 🥁

Build Status Crates.io Documentation

A Rust implementation of the Suffix Array Induced Sort (SAIS) algorithm for suffix array construction. Inspired by Ilya Grebnov's libsais and based on the following paper:

G. Nong, S. Zhang and W. H. Chan: Two Efficient Algorithms for Linear Time Suffix Array Construction (2011) DOI: 10.1109/TC.2010.188

State of the implementation

The algorithm is implemented and tested using proptest, but not yet fully optimized. I highly recommend using my bindings to libsais instead. Other Rust solutions include Amos Wenger's port of divsufsort and Andrew Gallant's suffix crate.

In the future, the following optimizations could be added (inspired by libsais):

  • Algorithmic improvements laid out in this paper:

    N. Timoshevskaya and W. -c. Feng: SAIS-OPT: On the characterization and optimization of the SA-IS algorithm for suffix array construction (2014) DOI: 10.1109/ICCABS.2014.6863917

  • Multithreading, based on this paper:

    Lao, B., Nong, G., Chan, W.H. et al. : Fast induced sorting suffixes on a multicore machine (2018) DOI: 10.1007/s11227-018-2395-5

  • Implementation techniques laid out by Ilya Grebnov in the README of libsais

  • General optimizations such as writing vectorization-friendly code

  • Some of my own ideas that leverage Rust-specific features such as the easy creation of generic code compared to C

Why drum?

Who doesn't like drums? Also, it's a pretty funny wordplay in German.

Dependencies

~1.5MB
~32K SLoC