# squashfs-async
Parsing and reading of SquashFS archives, on top of any implementor of the `tokio::io::AsyncRead` and `tokio::io::AsyncSeek` traits.
More precisely, this crate provides:

- A `SquashFs` structure to read SquashFS archives on top of any asynchronous reader.
- An implementation of `fuser_async::Filesystem` on `SquashFs`, making it easy to build FUSE filesystems backed by SquashFS archives (a usage sketch follows this list).
- A `squashfuse-rs` binary for mounting SquashFS images via FUSE, with async IO and multithreaded decompression.
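A minimal usage sketch is below. The constructor name is an assumption for illustration, not necessarily the crate's exact API; consult the crate documentation for the real entry points. Any `AsyncRead + AsyncSeek` implementor works as the backing reader, `tokio::fs::File` being the simplest.

```rust
use tokio::fs::File;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // `tokio::fs::File` implements both `AsyncRead` and `AsyncSeek`.
    let file = File::open("archive.squashfs").await?;

    // Parse the archive (hypothetical constructor name).
    let sqfs = squashfs_async::SquashFs::open(file).await?;

    // Files can then be read through the `fuser_async::Filesystem`
    // implementation, or directly through the `SquashFs` API.
    let _ = sqfs;
    Ok(())
}
```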
## Motivation: multithreaded/async SquashFS reading
The main motivation was to provide a squashfuse implementation that could:

- Decompress blocks in parallel.
- Benefit from async I/O where relevant (mostly with networked backends in mind), with easy integration with `tokio::io`.
To the author's understanding, `squashfuse` uses a single-threaded FUSE loop, and while the kernel driver does multithreaded decompression (when compiled with this option), it doesn't support parallel reads. Note that a patch exists to add multi-threading to the low-level `squashfuse`; see Benchmarks.
## Example: squashfs-on-S3
This crate has been used to implement a FUSE filesystem providing transparent access to squashfs images hosted behind an S3 API, using the S3 example in `fuser_async`. With a local MinIO server, throughputs of 365 MB/s (resp. 680 MB/s) are achieved for sequential (resp. parallel) access to zstd1-compressed images containing 20 MB files.
## `squashfuse-rs` binary
The `squashfuse-rs` binary is an example that implements an analogue of `squashfuse` using this crate, allowing squashfs images to be mounted via FUSE.
```
$ squashfuse-rs --help
USAGE:
    squashfuse-rs [OPTIONS] <INPUT> <MOUNTPOINT>

ARGS:
    <INPUT>         Input squashfs image
    <MOUNTPOINT>    Mountpoint

OPTIONS:
        --backend <BACKEND>              [default: memmap] [possible values: tokio, async-fs, memmap]
        --cache-mb <CACHE_MB>            Cache size (MB) [default: 100]
    -d, --debug
        --direct-limit <DIRECT_LIMIT>    Limit (B) for fetching small files with direct access [default: 0]
    -h, --help                           Print help information
        --readers <READERS>              Number of readers [default: 4]
```
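For example, to mount an image with the `tokio` backend and a 200 MB cache (the image path and mountpoint are illustrative):

```
$ squashfuse-rs --backend tokio --cache-mb 200 image.squashfs /mnt/squashfs
```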
## Benchmarks
The following benchmarks (see `tests/`) compute the mean and standard deviation of 10 runs, dropping caches after each run, with the following variations:

- Sequential or parallel (with 4 threads) reads.
- Compressed archive (gzip and zstd1) or uncompressed.
- Different backends for reading the underlying file in `squashfuse-rs`.
The archives are either:

- Case A: sixteen random files of 20 MB each, generated by `tests/testdata.rs` (since the files are random, zstd compression has minimal effect on the data blocks; see the sketch after this list).
- Case B: three hundred 20 MB images (with a compression ratio of 1.1 with zstd-1).
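The case A data generation amounts to writing incompressible files. A hedged sketch of what `tests/testdata.rs` does (names and structure here are illustrative assumptions, not the actual test code):

```rust
use rand::RngCore;
use std::io::Write;

// Sketch of case-A test data: sixteen 20 MB files of random bytes.
fn generate_case_a(dir: &std::path::Path) -> std::io::Result<()> {
    let mut rng = rand::thread_rng();
    for i in 0..16 {
        // Random bytes are essentially incompressible, which is why
        // compression has minimal effect on the data blocks.
        let mut data = vec![0u8; 20 * 1024 * 1024];
        rng.fill_bytes(&mut data);
        std::fs::File::create(dir.join(format!("file_{i:02}")))?
            .write_all(&data)?;
    }
    Ok(())
}
```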
Entries are normalized per (case, comp.) pair (i.e. pairs of rows) with respect to the duration of the sequential `squashfuse` run. Numbers smaller than 1 indicate results faster than this baseline. The last 3 columns are `squashfuse-rs` with different backends (MemMap being the most performant).
| Case | Mode | Comp. | squashfuse | squashfuse_ll_mt | MemMap | Tokio | AsyncFs |
|------|------|-------|------------|------------------|--------|-------|---------|
| A    | Seq  | -     | 1          | 1.16             | 1.01   | 1.93  | 1.56    |
| A    | Par  | -     | 1.8        | 0.5              | 0.54   | 0.8   | 0.76    |
| A    | Seq  | gzip  | 1          | 0.92             | 0.94   | 1.79  | 1.48    |
| A    | Par  | gzip  | 2.07       | 0.46             | 0.51   | 0.75  | 0.71    |
| A    | Seq  | zstd1 | 1          | 0.96             | 1.04   | 1.78  | 1.47    |
| A    | Par  | zstd1 | 2.35       | 0.48             | 0.51   | 0.76  | 0.71    |
| B    | Seq  | -     | 1          | 0.89             | 0.93   | 2.08  | 1.43    |
| B    | Par  | -     | 1.6        | 0.54             | 0.6    | 0.89  | 0.91    |
| B    | Seq  | zstd1 | 1          | 0.59             | 0.65   | 0.98  | 0.87    |
| B    | Par  | zstd1 | 1.07       | 0.3              | 0.35   | 0.3   | 0.54    |
*Smaller numbers are better; numbers smaller than 1 denote an improvement over the baseline.*
> [!WARNING]
> These results should be updated with the latest versions of the code and of `squashfuse`.
To execute the tests (case A), cargo needs to run with root privileges to be able to clear caches between runs, e.g.

```
$ N_RUNS=10 CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER='sudo -E' cargo test -r --test main -- --nocapture
```
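Root is required because dropping the Linux page cache between runs goes through `/proc/sys/vm/drop_caches`; presumably the test harness performs the equivalent of:

```
$ sync && echo 3 > /proc/sys/vm/drop_caches
```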
## Differences with similar crates
- `squashfs` is a work in progress that only supports parsing some structures (superblock, fragment table, uid/gid table).
- `backhand` and this crate were implemented independently at roughly the same time. Some differences are (see also Limitations below):
  - The primary goal of this crate was to allow mounting squashfs images with FUSE, with async IO and multithreaded decompression. `backhand` uses a synchronous `std::io::Read`/`std::io::Seek` backend, while this crate uses a `tokio::io::AsyncRead`/`tokio::io::AsyncSeek` backend (see the sketch after this list).
  - This crate provides caching for decompressed blocks.
  - `backhand` supports write operations, while `squashfs-async` doesn't.
- `squashfs-ng-rs` wraps the C API, while this crate is a pure Rust implementation.
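The backend difference boils down to the trait bounds on the reader. A simplified illustration (these signatures are assumptions for contrast, not either crate's exact API):

```rust
use tokio::io::{AsyncRead, AsyncSeek};

// backhand-style entry point: a blocking reader.
fn open_sync<R: std::io::Read + std::io::Seek>(_reader: R) {
    // ...
}

// squashfs-async-style entry point: any tokio reader, e.g. tokio::fs::File
// or a networked backend such as the fuser_async S3 example.
fn open_async<R: AsyncRead + AsyncSeek + Unpin + Send>(_reader: R) {
    // ...
}
```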
## Limitations/TODOs
- For now, only file and directory inodes are supported.
- The tables are loaded into memory on initial parsing for caching, rather than being accessed lazily.
- ...