Table of Contents generated with DocToc
- Getting Started with Rust
- Some links on Rust
- Borrowing and Lifetime Tricks
- Macros
- Cool Rust Projects
- Rust Error Handling
- Rust Concurrency
- Data Processing and Data Structures
- Rust and Scala/Java
- CLI and Misc
- IDE/Editor/Tooling
- Testing and CI/CD
- Performance and Low-Level Stuff
Why the developers who use Rust love it so much - from StackOverflow survey, really good quotes
If you want a Rust REPL, check out evcxr.
I highly recommend rust-analyzer to support fast compile checks, references, refactorings, etc. in your editor. VSCode works pretty well - install rust-analyzer and the "Even Better TOML" extension and you should be set.
- The Rust Book - probably the best starting point
- Comprehensive Rust - The Google Android team's guide to learning Rust
- Rustlings - small exercises to learn
- Easy Rust Youtube Channel - great videos
- Rust By Example - also the guide on their site is pretty good.
- Complete(sh) Rust Cheat Sheet
- explaine.rs - paste Rust code into the window and hover over keywords to get explanations. Great for learning.
- Rustlang in a Nutshell - great introduction
- Mental models for learning Rust - really really short blurb on how to approach learning Rust
- Rust Borrowing and Ownership - easy-to-read, short summary of basic ownership, borrowing, and lifetime references
- A Java Programmer Understanding Rust Ownership
- Rust Error Handling for Pythonistas
- Zero to Production in Rust
Easy short intros:
- A Rust Gem: The Rust Map API - comparing C++, Java, and Rust Map APIs and why Option and not having nulls makes the Rust Map API superior
Online resources and help:
- cheats.rs - Awesome quick ref.
- The Rust Discord #beginners channel has been pretty helpful for me
- Rust IRC channel
- Rust for Rubyists
- Rust Playpen - closest thing to a REPL :(
- makepad - Web-based Rust + WebASM multimedia playground
- Awesome Rust Mentors
-
Rust Design Patterns - super helpful resource
-
What you Can't Do in Rust and What To Do Instead - great guide for anti-patterns
-
Rust: A Unique Perspective - comprehensive summary about Rust ownership from angle of unique access, covers RC/Arc etc.
-
Rust is for Professionals - great perspective on what makes Rust unique and so appealing
-
References are like Jumps - on why Rust's core principle of ensuring one cannot both mutate and alias references at the same time is so brilliant
-
Tour of Rust's Standard Library Traits - really great detailed guide with an explanation about traits, generics, associated types, etc.
-
- Common Rust Lifetime Misconceptions -- a great detailed dive into nuances
-
Learn Rust with Too Many Linked Lists - hilarious.
-
Learning Rust - a very detailed treatise on borrowing, ownership, `dyn Trait` and trait objects
-
In Rust, Ordinary Vectors are Values - about why persistent collections are not quite as useful in Rust, because Rust already prevents shared mutation
-
Shared Mutability in Rust: Acyclic Graphs - really good article on mutable child entities, and how to share things which need to be mutable (hint: don't, instead use an "arena" pattern where a single owner mutates things)
-
Jon Gjengset on Rust Lifetime Annotations - actually check out his Youtube channel, lots of great tutorials
-
The Evolution of Rust Programmers - hilarious look at different coding styles
-
Fireflowers: Rust in the words of its Practitioners - just brilliant commentary on what Rust is.
-
Oxidizing the Interview - hilarious read on a Rust technical interview
-
Rust and the Three Laws of Informatics - great detailed guide to how Rust allows developers to uncompromisingly achieve correctness, maintainability, AND efficiency
-
Why Scientists are turning to Rust - from Nature mag
-
Rust After the Honeymoon - by Bryan Cantrill, a list of top favorite Rust tricks/properties. Did you know that `{:#x?}` will pretty-print structs in HEX??
-
Prefer Rust over C/C++ - when to and when not to prefer Rust
-
- C2Rust and Quake - a tool to auto translate C to Rust!
-
Clear Explanation of Rust's Module System - easy intro guide
-
On Rusts Module System - good explanation of paths, naming, modules -- see this when compiler complains about cannot find symbols
-
Null The Billion Dollar Mistake and how Rust Provides a Solution
Speed without wizardry - how using Rust is safer and better than using hacks in Javascript
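The `{:#x?}` formatting trick mentioned in the Rust After the Honeymoon entry is easy to try. Below is a minimal sketch; the `Header` struct and the `hex_debug` helper are made-up names used only for illustration.

```rust
// Hypothetical struct, only here to demonstrate the formatting flags.
#[allow(dead_code)]
#[derive(Debug)]
struct Header {
    magic: u32,
    flags: u8,
}

/// Formats any Debug value with pretty-printing (`#`) and lowercase hex (`x`).
fn hex_debug<T: std::fmt::Debug>(value: &T) -> String {
    format!("{:#x?}", value)
}

fn main() {
    let h = Header { magic: 0xCAFE, flags: 3 };
    // Integer fields are rendered as 0x-prefixed hex, one field per line.
    println!("{}", hex_debug(&h));
}
```

The same flags work in `println!` and `dbg!`-style debugging, as long as the type derives `Debug`.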
Dealing with strings is confusing in Rust, because there are two types: a heap-allocated `String` and a pointer to a slice of string bytes, `&str`. Knowing which to use, and defining structures on them, immediately exposes the steep learning curve of ownership.
See the Guide to Strings for some help.
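As a rough rule of thumb (a sketch, not the only reasonable convention): borrow with `&str` when a function only reads text, and take a `String` when it needs to own or modify it. The function names below are hypothetical.

```rust
/// Borrowing API: accepts both `&String` and string literals via `&str`.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

/// Owning API: takes a `String` because it modifies and returns the data.
fn shout(mut text: String) -> String {
    text.make_ascii_uppercase();
    text.push('!');
    text
}

fn main() {
    let owned: String = String::from("hello rust world"); // heap-allocated, growable
    let literal: &str = "static text"; // borrowed slice, no allocation

    // `&owned` is a &String, which derefs to &str automatically.
    println!("{}", first_word(&owned));
    println!("{}", first_word(literal));

    // Convert a borrowed slice to an owned String only when ownership is needed.
    println!("{}", shout(literal.to_owned()));
}
```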
Specific topics:
-
Default Values for Maintainability - short and easy guide
-
Async Rust - A really concise and great intro to async/await
-
Async Rust: Futures, Tasks, Wakers; Oh My! - another great concise intro, starting with basic async concepts/syntax and diving into details about Wakers and the Future mechanism
-
Async Rust can be a Pleasure to Work with - great post on "structured concurrency" and thread-per-core as alternatives to standard work-stealing tasks
-
Rust Async is Colored - great deep dive into async vs sync, connecting the two worlds, and implications
-
Book: Rust Atomics and Locks
-
Shared/Exclusive Refs, not Mutable/Immutable - excellent explanation from @dtolnay on thinking about `&mut T` as exclusive rather than mutable. Also explains interior mutability, `RefCell` etc., and why they can take `&self` safely while still providing mutation.
-
Ultimate Guide to Rust NewTypes - great guide to the `struct Foo(InnerType)` pattern
-
Elegant library APIs in Rust - lots of good tips here
-
Rain's Rust CLI Guide - how to write and organize Rust CLI apps
-
Effectively using Iterators in Rust - on differences between `iter()`, `into_iter()`, types, etc.
-
Generics and Associated Types - when to use each one
-
Defeating Coherence in Rust Traits - How to implement multiple different methods of the same name for traits
-
Returning Iterators - really helpful article, this is not easy
- Recursive Iterators in Rust - yelch, using Box
- Internal-iterator - a potentially better solution for easily implementing some iterators
- propane - Creating iterators via generator/yield API
-
Generic Return Types in Rust - deep dive into `Iterator::collect()`, traits, and Rust's type system
-
Rust-san - sanitizers for Rust code, if the basic compiler checks are not enough :)
-
Pretty State Machines in Rust - great article on diff state machine patterns, use of enums and structs
-
Init Struct Pattern - on patterns for initializing structs
-
In-place construction Seems Surprisingly Simple? - avoiding a move when constructing new structs, and using `MaybeUninit` instead
-
COW, Rust vs C++ - great dive into details of copy-on-write. Might be a great pattern for working with things like strings, where cloning might be expensive.
-
Stacked Borrows - a deep dive into the mental model behind Rust's borrow checker and tools like Miri
-
Magical Zero-Sized Types and Proofs - for type masochists
-
Structural Typing in Rust - HLists, ability to use path-based and shape/signature based trait typing instead of by name
-
How Rust Solved Dependency Hell - neat look at what's underneath Cargo to help solve dep issues. Rustc can handle multiple versions of a dependency.
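To make the `iter()` / `into_iter()` distinction and the returning-iterators topic from the list above concrete, here is a minimal sketch. The `evens` function is a made-up example, and `impl Iterator` is only one of the options those articles discuss.

```rust
/// Returns a lazy iterator over the even numbers in a slice.
/// `impl Iterator` avoids naming the concrete adapter type or boxing it.
fn evens(values: &[u32]) -> impl Iterator<Item = u32> + '_ {
    values.iter().copied().filter(|v| v % 2 == 0)
}

fn main() {
    let names = vec![String::from("a"), String::from("b")];

    // iter() borrows: items are &String, and `names` stays usable afterwards.
    let lengths: Vec<usize> = names.iter().map(|s| s.len()).collect();

    // into_iter() consumes the Vec: items are String, moved out without cloning.
    let joined: String = names.into_iter().collect();

    println!("{:?} {}", lengths, joined);
    println!("{:?}", evens(&[1, 2, 3, 4]).collect::<Vec<_>>());
}
```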
If you need to borrow multiple items mutably from a Vec/array/SmallVec/etc.:
- The thread on solutions
- You can use split_at_mut() but this is clumsy
- Arref gives a great solution
- There is a nightly get_many_mut() API
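The `split_at_mut()` approach is less clumsy when wrapped in a small helper. The sketch below assumes you need exactly two distinct elements; `two_mut` is a hypothetical name, not a standard library function.

```rust
/// Returns mutable references to two distinct elements of a slice.
/// Panics if `i == j` or if either index is out of bounds.
fn two_mut<T>(items: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i != j, "indices must differ");
    if i < j {
        // `i` lands in the left half, `j` is the first element of the right half.
        let (left, right) = items.split_at_mut(j);
        (&mut left[i], &mut right[0])
    } else {
        let (left, right) = items.split_at_mut(i);
        (&mut right[0], &mut left[j])
    }
}

fn main() {
    let mut scores = vec![10, 20, 30];
    let (a, b) = two_mut(&mut scores, 0, 2);
    std::mem::swap(a, b);
    *a += 1;
    println!("{:?}", scores);
}
```

The split proves to the borrow checker that the two references cannot alias, so no `unsafe` is needed.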
If you have a Trait with an associated type that must deal with lifetimes: https://stackoverflow.com/questions/33734640/how-do-i-specify-lifetime-parameters-in-an-associated-type
I started writing Rust macros and it is not only lots of fun, but pretty essential for writing concise, performant code IMO. Writing Rust involves lots of boilerplate sometimes, especially owing to not having real inheritance. I recommend starting with `macro_rules!` macros, which are fancy templates and really easy. Here are some links to help:
- Macros in Rust - a Tutorial - really easy tutorial, esp for `macro_rules!`
- Macros - a Methodical Introduction - very detailed book mostly about `macro_rules!` with explanations for the minutiae of parsing
- How to Write Hygienic Rust Macros - important and short. Read this to ensure your macros work everywhere - so users don't have to worry about imports, etc.
- Rust Macros case studies
- Overview of Macros in Rust - from Steve Klabnik
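To give a feel for `macro_rules!` as a "fancy template" for boilerplate, here is a minimal sketch. The `newtype!` macro and the `UserId` / `Email` types are invented for this example and do not come from any of the linked articles.

```rust
// Hypothetical macro: generates a newtype wrapper plus common conversions,
// the kind of boilerplate that would otherwise be repeated for every type.
macro_rules! newtype {
    ($name:ident, $inner:ty) => {
        #[derive(Debug, Clone, PartialEq)]
        pub struct $name(pub $inner);

        impl From<$inner> for $name {
            fn from(value: $inner) -> Self {
                $name(value)
            }
        }

        impl $name {
            pub fn into_inner(self) -> $inner {
                self.0
            }
        }
    };
}

newtype!(UserId, u64);
newtype!(Email, String);

fn main() {
    let id = UserId::from(42);
    let email: Email = String::from("a@example.com").into();
    println!("{:?} {:?}", id, email);
    println!("{}", id.into_inner());
}
```

Each invocation expands to a struct, a `From` impl, and an accessor, so adding another wrapper type is a one-line change.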
Some crates that may help write macros:
- spez - match and specialize on the type of an expression. "A trick to do specialization in non-generic contexts on stable Rust"
- concat-ident - macro to concat multiple identifiers etc. and use the result, perhaps as a struct or method name. Very useful in macros
NOTE: there's a separate section for Data-related projects.
CLI tools:
- XSV - a fast CSV parsing and analysis tool
- zoxide - a supercharged replacement for cd with rank-based search of your most frequently used dirs
- mcfly - Upgraded, smarter Ctrl-R for bash etc. (note: fish users already have this built in, basically)
- Ripgrep - insanely fast grep utility, great for code searches. Shows off power of Rust regex library
- Bat - A super `cat` with syntax highlighting, git integration, other features
- Bottom - Cross-platform fancy `top` in Rust - process/sys mon with graphs, very useful!
- eza - a better `ls` with tree view, git info and color-coding (ps exa is not maintained anymore)
- procs - a better `ps`
- gitui - awesome, fast Git terminal UI. It will change your life!
- skim - `sk` is a general purpose fuzzy-finder; it can work with ripgrep and other utils too
- zellij - terminal mux/session detach like tmux/screen, but with a pretty UI and plugins
- pueue - instead of using tmux, queue and manage your background tasks
- xh - HTTPie clone / much better `curl` alternative
- Dust - Rust graphical-text faster and friendlier version of du
- Diskonaut - another Text-UI folder/file space usage and browsing tool
- fd - Rust CLI, friendlier and faster replacement for `find`
- rustscan - Really fast port scanner, this should easily replace lsof / netstat
- sd - Easier to use sed. You can search and replace in like all files under subdir with `sd old_str new_str **`.
- Nushell - Rust shell that turns all output into tabular data. Pretty cool!
- delta - git-delta: colorful git diff viewer
- ruplacer - Source code search and replace tool
- imagecli - CLI for image batch processing
- Hyperfine - Rust performance benchmarking CLI
- Alacritty - GPU accelerated terminal emulator
- jql - Rust version of popular `jq` JSON CLI processor, though not as powerful
- rq - a Record Query/Transform tool, translate CSV, Avro, CBOR, Json etc etc to and from each other
- htmlq - like jq but for HTML
- Starship - "The minimal, blazing-fast, and infinitely customizable prompt for any shell!"
- Kubesql - SQL queries for Kube metadata!
- grex - CLI tool to create regexes given a set of strings to match!
- Scaphandre - Metrics agent for collecting power consumption metrics!
- kdash - Text UI Kubernetes dashboard
- Josh - Cool git proxy allows you to treat part of a large monorepo like its own smaller git repo!
Wasm:
- Wasmer - general purpose WASM runtime
- Krustlet - WebAssembly (instead of containers) runtime on Kubernetes!! Use Rust + wasm + WASI for a truly portable k8s-based deploy!
- wasm-gpu - run WASM containers on GPU! (Interpreted)
- Extism - a universal WASM-based plugin system, multi language, but written in Rust
- lunatic - Erlang-like server side WASM runtime with supervision and channel-based message passing, plus hot reloading!
- CosmWasm - Rust/WASM for programming smart contract on Cosmos ecosystem
Others:
- TabNine - an ML-based autocompleter, written in Rust
- kiro - a CLI text editor with syntax highlighting, like a friendlier vim
- ox - another CLI/Text UI lightweight text editor
- async-std - the standard library with async APIs
- Convey - Layer 4 load balancer
- Ockam - End to end secure messaging lib/platform between cloud and IoT devices
- Parsec - abstraction layer for hardware security and cryptography
- Gazebo - useful utilities for all apps, by the Facebook Rust team. They also have blog posts such as on Dupe
Do Rust in Turkish, Spanish and other languages! :)
Languages etc.
- BLisp - a statically-typed Lisp built on top of Rust
- RustPython
Error handling survey - really good summary of the Rust error library landscape as of late 2019.
- Anyhow - streamlined error handling with context....
- Snafu - adding context to errors
- Error-stack - I really like the philosophy behind this crate. It makes it easy to "stack" errors - you get not a backtrace, but stacked detailed errors, with the inner error showing through.
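The idea these crates share, wrapping a low-level error with context while letting the inner error show through, can be sketched with the standard library alone. This is not the API of Anyhow, Snafu, or Error-stack; `ContextError`, `parse_port`, and `render_chain` are hypothetical names for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper error: a context message plus the underlying cause.
#[derive(Debug)]
struct ContextError {
    context: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.context)
    }
}

impl Error for ContextError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Parses a port number, attaching context describing what was being parsed.
fn parse_port(raw: &str) -> Result<u16, ContextError> {
    raw.trim().parse::<u16>().map_err(|e| ContextError {
        context: format!("invalid port {:?}", raw),
        source: Box::new(e),
    })
}

/// Renders an error followed by all of its causes, outermost first.
fn render_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(cause) = current {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}

fn main() {
    match parse_port("80a") {
        Ok(port) => println!("port {}", port),
        Err(e) => println!("{}", render_chain(&e)),
    }
}
```

The crates above automate this wrapping (and add backtraces, macros, or typed contexts), which is why they are usually preferable to hand-rolled wrappers.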
-
Rust Concurrency: Five Easy Pieces - a great intro to threads, using message queues, determinism, and more
-
Async stacktraces - this is SUPER COOL!!!
-
tokio-console - remote async debugging facility!
-
Rust Parallelism for non C/C++ Devs - great resource on the low-level primitives like `Mutex` and `RwLock`
-
Fearless Concurrency with Hazard Pointers - using the `conc` crate and `Atomic`, which implements hazard pointers for fine-grained and safe protection of readers and garbage
-
Tasks are the Wrong Abstraction - on having unified, better APIs for concurrency and parallelism decoupling, by Yoshua Wuyts
-
Pin - A detailed treatise on `Pin` references in Rust, the history of how Pin came to be, and why it's hard to use
-
Bastion - Erlang/Akka-style, remote supervised actor framework
-
Kompact - Kompics style message-passing "component system" with actor model and networking built in
-
Actyx - really cool "decentralized event database, streaming and processing engine" based on event sourcing concepts, built by one of Akka's founders
-
Actors with Tokio - not using any Actor framework, just channels
-
Kameo - new actor library built on Tokio, with supervision
-
Use mio if you want a lower-level event loop, or thin_main_loop
-
monoio - Very fast thread-per-core async I/O Rust runtime, based on io_uring etc. Much faster at top end than Tokio, for servers
Sometimes one needs to share a large data structure across threads and several of them must access it.
The most general way to share a data structure is to use `Arc<RwLock<...>>` or `Arc<Mutex<...>>`. The `Arc` keeps the data alive through reference counting, so different threads can exist for different lengths of time, and it is inexpensive since it is usually only cloned once at thread spawn. The `Mutex` or `RwLock` lets different threads mutate the data safely, assuming the data structure is not itself thread-safe.
A thread-safe data structure could be used in place of the `RwLock` or `Mutex`.
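A minimal sketch of the `Arc<Mutex<...>>` pattern, using a `HashMap` as a stand-in for the large shared structure. The `fill_shared_map` function and the worker naming are assumptions made for the example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each record one entry in a shared map.
fn fill_shared_map(workers: usize) -> HashMap<usize, String> {
    let shared = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            // Cloning the Arc only bumps a reference count; done once per thread.
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // The lock guard is released when it goes out of scope.
                let mut map = shared.lock().unwrap();
                map.insert(id, format!("worker-{}", id));
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Every worker has finished, so copy the contents out under the lock.
    let map = shared.lock().unwrap();
    map.clone()
}

fn main() {
    let result = fill_shared_map(4);
    println!("{} entries", result.len());
}
```

Swapping `Mutex` for `RwLock` (with `read()` / `write()` instead of `lock()`) is worthwhile mainly when reads heavily outnumber writes.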
Scoped threads can be used if only one owner will mutate the data structure and one wants to share immutable refs with other threads for reading. Plain `thread::spawn` does not work for this, because rustc by itself has no way of proving the lifetime of a thread or when it will be joined, so any immutable refs created from the owner thread fail its lifetime checks and cannot be shared. Scoped threads, from the Crossbeam crate or (since Rust 1.63) `std::thread::scope`, are a way around that, as they give rustc a guarantee that the threads will be joined before the owner goes away.
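Here is a minimal sketch of sharing immutable refs through scoped threads. It uses the standard library's `std::thread::scope` (stable since Rust 1.63); Crossbeam's scoped threads follow the same idea. The `parallel_sum` function is a made-up example.

```rust
use std::thread;

/// Sums two halves of a slice in parallel while only borrowing the data.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);

    // `thread::scope` joins every spawned thread before it returns, which is
    // what allows the closures to borrow `left` and `right`.
    thread::scope(|s| {
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    // No Arc and no cloning: the threads read the owner's data directly.
    println!("{}", parallel_sum(&data));
}
```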
Arc-swap is an alternative to Arc that is designed for occasional updates - enables atomic swapping of the object underneath the Arc, and allows one to read without contention (unlike Mutex/RwLock).
Also see beef - a leaner version of Cow.
There is a neat crate hybrid-rc which gives a version of Rc which can be switched to an Arc. Also has some unsized-sized coercion utils, like `[T; N]` to `[T]` etc.
-
Are we learning yet? - list of ML Rust crates
- Linfa - Rust ML framework
-
Timely Dataflow - distributed data-parallel compute engine in Rust!!!
-
Hydroflow - a brand new Rust based optimized streaming dataflow engine, relational data, based on very advanced UCBerkeley research on optimization.
-
DataFusion - a Rust query engine which is part of Apache Arrow!
- NOTE: there is now a Ballista project that is basically like Spark - distributed Data Fusion.
-
Amadeus - distributed streams / Parquet / big data processing
-
Fluvio - distributed, persistent queuing / stream processing framework using WASM for programmability, written in Rust!
-
Arroyo - another stream processing framework, streaming SQL and Rust pipelines!
-
Weld - Stanford's high-performance runtime for data analytics
-
Cleora - Super fast Rust tool for billion-scale hypergraph vector embedding ML
-
Node crunch - simple lightweight distributed compute framework
-
Project Midas - distributed compute framework and terminal UI using Lua as scripting language
-
Cube Store - Rust and Arrow/DataFusion-based rollup/aggregation/cache layer for SQL datastores, too bad it's mostly for JS
-
Noria - "data-flow for high-performance web apps" - basically a materialized view cache that updates in real time as database data updates
-
polars - super fast and high level DataFrame implementation for both Rust and Python, much faster and higher level than using Arrow itself
-
Bagua - distributed learning/training framework, the very fast communication core is written in Rust
-
Similari - similarity search/computation engine for ML in Rust
-
Toshi - ElasticSearch written in Rust using Tantivy as the engine
-
MeiliDB - fast full-text search engine
-
Quickwit - Log search DB, like Elastic but built on top of Tantivy
-
Datafuse - distributed "Real-Time Data Processing & Analytics DBMS", similar to Clickhouse "but faster"
-
sonic - Fast, very lightweight and schemaless search/text index. NOT a document store, but an index store.
-
Tonbo - embedded database based on Arrow and Parquet
-
Sanakirja - a transactional KV DB engine/local store, claims to be fastest around
-
Sled - an embedded database engine using latch-free Bw-tree on latch-free page cache techniques for speed
-
SlateDB - embedded LSM object storage engine plus caching layer. Seems pretty promising.
-
Lance - "Modern columnar data format for ML"
-
Skytable - Rust "realtime NoSQL" key-value database
-
IOx - New in-memory columnar InfluxDB engine using Arrow, DataFusion, rust! Persists using parquet. Super awesome stuff.
-
IndraDB - Graph database/library written in Rust! and inspired by Facebook's TAO.
-
TerminusDB-store - a Rust RDF triple data store
-
BonsaiDB - NoSQL document store written in Rust with Rust schemas
-
Vector - high performance observability data pipeline, for transforming, aggregating, routing logs, metrics, traces, etc.
- includes a Vector Remap Language for general transformation
-
Tremor - a simple event processing / log and metric processing and forwarding system, with scripting and streaming query support. Much more capable than Telegraf.
-
MinSQL - interesting POC on lightweight SQL based log search, w automatic field parsing etc.
-
pq - Parse and Query log files as time series, extracting structured records out of common log files
-
plotters - Rust data visualization / graphing library
-
Stateright - distributed protocol/model checker with UI, linearizability checker!
-
Clepsydra - Graydon Hoare working on distributed database protocol - in Rust!
-
crepe - Datalog, declarative logic programs as macros in Rust
For JSON DOM (IR) processing, using the mimalloc allocator provided me a 2x speedup with serde-json. Then, switching to json-rust provided another 1.8x speedup. The speedup is completely unreal, much faster than JVM. The main reason I guess is that json-rust has a `Short` DOM class for short strings, which requires no heap allocation.
- simdjson-rs - SIMD-enabled JSON parser. NOTE: no writing of JSON.
- pjson - JSON streaming parser
- streamson - efficient JSON processing for large documents
-
leapfrog - fast, concurrent `HashMap`, lock-free if types support atomic ops.
- What's neat about its API is that instead of locking at bucket level, and blocking inserts if a reader is taking too long, it never returns references to data and relies on an atomic API
-
concread - Concurrently Readable (Copy on Write, MVCC) datastructures - "allow multiple readers with transactions to proceed while single writers can operate" - guaranteeing readers same version. There is a hashmap and ARCache.
-
flashmap - lock free, partially wait free, eventually consistent concurrent hash map
-
flurry - Rust impl of Java's ConcurrentHashMap. Uses seize for ref-count-based GC.
-
im - Immutable data structures for Rust
- WARNING: `im::HashMap` seems to allocate way more memory than needed.
-
immutable-chunkmap - another immutable persistent map
-
slice_deque - A really clever Ringbuffer implementation that uses mmap and virtual pages to allow one to treat ranges of the buffer as slices!
-
rust-phf - generate efficient lookup tables at compile time using perfect hash functions!
-
odht - "hash table that can be mapped from disk into memory without need for up-front decoding" - deterministic binary representation, and platform and endianness independent. Sounds sweet!
-
orx-split-vec - vector with dynamic capacity and pinned elements using chunks (ie pointers/refs are stable)
-
Patricia Tree - Radix-tree based map for more compact storage
-
probabilistic-collections - Bloom/Cuckoo/Quotient filters, CountMinSketch, HyperLogLog, streaming approx set membership, etc.
-
priq - "blazing fast" priority queue built using arrays
-
Using Finite State Automata and Rust to quickly index and find data amongst HUGE amount of strings
-
ahash - this seems to be the fastest hash algo for hash keys
-
Metrohash - a really fast hash algorithm
-
IndexMap - O(1) obtain by index, iteration by index order
-
FM-Index, a neat structure that allows for fast exact string indexing and counting while compressing original string data at the same time. There is a Rust crate
-
Heapless - static data structures with fixed size; Vec, heap, map, set, queues
-
dashmap - "Blazing fast concurrent HashMap for Rust". NOTE: I don't recommend this project, I used it in my Ying profiler but it can deadlock in unpredictable ways
-
Easy Persistent Data Structures in Rust - replacing `Box` with `Rc`
-
VecMap - map for small integer keys, may use less space
-
The base Geometry processing crate is geo.
- Geo does not (as of 0.18) handle intersections, difference, XOR etc. Try geo-booleanop for a Rust-only implementation using Martinez-Rueda algorithm
- Or use geos based on the C library
-
spatial-join - Spatial joins and proximity maps!
-
Rstar - n-dimensional R*-Tree for geospatial indexing and nearest-neighbor
-
spade - R-trees and Delaunay triangulations
-
Hora Search - Nearest-Neighbor (NN) / geo search library that includes multiple algorithms including HNSW, SSG, PQIVF, etc.
-
Petgraph - Graph data structure for Rust, considered perhaps most mature right now
Rust has native UTF8 string processing, which is AWESOME for performance. However, there are two concerns usually:
- Small string memory efficiency. The native `String` type uses three words (24 bytes on 64-bit targets) just for pointer, length and capacity, which might be longer than the string itself;
- Minimizing number of heap allocations
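The overhead of the `String` handle is easy to check. The sketch below relies on the layout used by current compilers (pointer, length, capacity), which is not a formally documented guarantee; `string_handle_bytes` is a hypothetical helper name.

```rust
use std::mem::size_of;

/// Bytes used by the `String` handle itself, excluding its heap buffer.
fn string_handle_bytes() -> usize {
    size_of::<String>()
}

fn main() {
    let word = String::from("hi");
    // Pointer + length + capacity: three machine words.
    println!("String handle: {} bytes", string_handle_bytes());
    // A borrowed slice is a pointer + length: two machine words.
    println!("&str handle:   {} bytes", size_of::<&str>());
    // The handle alone is larger than this two-byte payload, which also
    // lives in its own heap allocation.
    println!("payload: {} bytes, capacity: {}", word.len(), word.capacity());
}
```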
Here are some solutions:
- String - string type with configurable byte storage, including stack byte arrays!
- Inlinable String - stores strings up to 30 chars inline, automatic promotion to heap string if needed.
- flexstr - Enum String type to unify literals, inlined, and heap strings
- kstring - intended for map keys: immutable, inlined for small keys, and have Ref/Cow types to allow efficient sharing. :)
- nested - reduce Vec type structures to just two allocations, probably more memory efficient too.
- tinyset - space efficient sets and maps, can be combined with nested perhaps
- bumpalo can do really cheap group allocations in a `Bump` and has custom `String` and `Vec` versions. At least lowers allocation overhead.
Here is a comparison of inline string libraries:
So picking a good base library to use for string processing is not so simple. We avoid the base String type because that always results in an allocation, and every clone results in further allocs.
Here are some alternative String libraries:
- smallstr - https://crates.io/crates/smallstr
- flexstr - https://docs.rs/flexstr/latest/flexstr/
- kstring - https://docs.rs/kstring/2.0.0/kstring/
- compact_str - https://crates.io/crates/compact_str
- smol_str - https://crates.io/crates/smol_str
- smartstring - https://crates.io/crates/smartstring
They vary based on API and basically on the below two features:
A. How many bytes can be "inlined" on the stack, in the normal 24 bytes required for a normal String? This is one key optimization for small string processing.
- smallstr - 16
- flexstr - 22
- kstring - 15 or 22, depending on max_inline feature
- compact_str - 24*
- smol_str - 22
- smartstring - 23
B. How expensive is it to clone the heap-based version when the string doesn't inline on the stack?
- smallstr - O(n), similar to regular String allocation
- flexstr - O(1), Arc or Rc (when using LocalStr)
- kstring - O(1) when using Arc (feature), otherwise O(n) Box same as String
- compact_str - O(n), but heap size grows more slowly (1.5x) compared to normal String
- smol_str - O(1)
- smartstring - O(n)
-
The presence of true unsigned types is really nice for low-level work. I hit a bug in Scala where I used >> instead of >>>. In Rust you declare a type as unsigned and don't have to worry about this.
-
Immutable byte slices and reference types again are awesome for low-level work.
-
Trait monomorphisation is awesome for ensuring trait methods can be inlined. JVM cannot do this when there is more than one implementation of a trait.
-
Being able to examine assembly directly from compiler output is super nice for low level perf work (compared to examining bytecode and not knowing the final output until runtime)
-
OTOH, rustc is definitely much much stricter (IMO) compared to scalac. Much of this is for good reason though, for example lack of integer/primitive coercion, ownership, etc. gives safety guarantees.
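The unsigned-shift point from the list above can be shown in a few lines. In Rust the operand type decides the shift semantics, so there is no separate `>>>` operator to forget. The function names are made up for this sketch.

```rust
/// Extracts the top byte of a 32-bit value.
/// On an unsigned type `>>` is a logical shift: zeros are shifted in.
fn top_byte(value: u32) -> u32 {
    value >> 24
}

/// The same shift on a signed type is arithmetic: the sign bit is copied in.
fn top_byte_signed(value: i32) -> i32 {
    value >> 24
}

fn main() {
    let bits: u32 = 0xF000_0000;
    println!("{}", top_byte(bits));
    println!("{}", top_byte_signed(bits as i32));
}
```

Declaring the value as `u32` is enough to get the behavior that Scala and Java require `>>>` for.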
- PyO3 seems to be a gold standard of Rust-based Python module development.
- Maturin - for building and publishing PyO3-based/Rust Python modules, or mixed Rust/Python projects
- There are older posts too: Wrapping Rust Types as Python classes and RustyPy but they are much more work than PyO3
- PyOxidizer - a Rust tool to package Python apps, interpreter, and all dependencies as a single binary, by wrapping app in a Rust program with a custom Rust Py module importer. Also helps embed Python code in Rust apps.
- Oh no, my data science is getting Rusty! - neat post from CrowdStrike on integrating Rust with Python for improved performance AND safety
- Calling Rust from Java - especially see the hint for using jnr-ffi
- There is also j4rs for calling Java from Rust
- SaferFFI - a neat library to make exposing C-like APIs much safer esp dealing with pointers, nulls, borrowing etc.
- Exposing a Rust library to C - has some great tips on creating .so's and working with strings
- cc-rs - C/C++ build integration with Cargo
- It seems to me Circle CI's support for multiple docker images and explicit manifest style makes it very easy to set up multiple language and dependency support
- Supporting multiple languages in Travis CI
- Running LLVM on GraalVM - using GraalVM to embed and run LLVM bitcode! Too bad GraalVM is commercial/Oracle only
-
Structopt - define CLI options using a struct!
-
tui-rs - Rust terminal UI for CLI apps. Check out list of projects it refers to also. Lots of options!
- Ratatui is the new project btw
-
cursive is another text UI library, based on curses
-
Hot Reloading in Rust - great article on how to hot-reload dynamic linked libraries in Rust, and on the potential pitfalls, with plenty of links.
-
inventory and ctor - static "plugin" registration of different things in your repo, say you need a static list of function implementations, metadata, etc.
-
quote - the standard way to generate a Rust code TokenStream from quoting rust code. Great for procedural macros or code generation.
-
prettyplease - Rust TokenStream pretty printer - great for code generation
- EVCXR - a Rust REPL!!! With deps, and tab-completion for methods!!
- comby-rust - rewrite Rust code using comby
- rustviz - Visualize borrowing and ownership!
- no-panics-whatsoever - crate to detect and ensure at compile time there aren't panics in your code
- cargo-bloat - what's taking up space in my Rust binary
- cargo-limit - clean up, sort and limit error/warning output. Great for those of us running cargo in shells!
- cargo-readme - tool to generate a README based on the RustDoc in lib.rs, to avoid duplicated effort!
- roxygen - documenting Rust function parameters!
- mutagen - mutation testing tool for Rust programs. Generates "mutations" in your code to try to break test coverage!
- cargo-rr - time travel/recording/reverse debugger framework for Rust using rr
- For more explanation see Print debugging should go away
- cargo_hakari - A crate to speed up builds of workspace-hack packages ... for when you have multiple crates or complex builds, and you have duplicate dependencies
- inkwell - LLVM API, including LLVM IR generation and running LLVM JIT to run snippets in your code
Dependency conflicts? Use `cargo tree -i` to look up reverse dependencies for specific packages (which crates are using which deps). For example, `cargo tree -i arrow:5.0.0-SNAPSHOT`.
- RustAnalyzer - LSP-based plugin/server for IDE functionality in Sublime/VSCode/Emacs/etc
- Configuring Rustfmt
- Godbolt - A "compiler explorer", not Rust specific but neat to play with compiler settings and diff targets.
- Cargo-play - run Rust scripts without needing to set up a project
- Also see cargo-eval and runner for diff ways of easily running scripts without projects
BTW for Rust 1.51+ you can speed up MacOS builds with this in your Cargo.toml (see the release notes):
[profile.dev]
split-debuginfo = "unpacked"
The two standard property testing crates are Quickcheck and proptest. Personally I prefer proptest due to much better control over input generation (without having to define your own type class).
- Rust Continuous Delivery - hints on using Docker, caching deps, and automated cloud-based CI/CD workflows for Rust
- Cargo-nextest looks like a really good project to help with test organization, test CI, running tests faster, etc.
- rstest - Fixture based test framework. Think of being able to inject arguments into a test function. Setup and teardown can be built into fixtures.
- Faster Build Times on MacOS
- 5x Faster Rust Docker Builds with cargo-chef - you need this for faster Rust app deploys!
- Are We Observable Yet? - an introduction to Rust telemetry
- Miri - can run binaries and test suites of cargo projects and detect certain classes of undefined behavior, including memory leaks!!
Notes from a RustConf talk on tracing crate usage:
- Consider not using EnvFilter, it's complicated and buggy
- Beware of OpenTelemetry, it may not do what you want.
- Aggressively filter out spans esp when working with OpenTelemetry as it adds much more $$
- Each layer in tracing-subscriber should be a "Filtered"
- Use callsite registration to filter out logging that we are not interested in. Call sites can be dynamically reconfigured, so that we don't have to analyze on every single call whether something is logged or not.
A common concern - how do I build different versions of my Rust lib/app for say OSX and also Linux?
- Easiest way now seems to be to use cross - I tried it and it is literally as easy as `cargo install cross` and `cross build --target ...` as long as you have Docker.
- NOTE: crates with non-Rust code (eg jemalloc, mimalloc) often have trouble
- Also see rust-musl-builder, another Docker-based solution
- musl is the best target for Linux as it removes need for G/LIBC dependencies and versioning. Musl creates a single static binary for super easy deploys.
- For automation, maybe better to create a single Docker image which combines crossbuild (which has a recipe for OSXCross + other targets) with a rustup container like abronan/rust-circleci which allows building both nightly and stable. Use Docker multi-stage builds to make combining multiple images easier
Finally, the Taking Rust everywhere with Rustup blog has a good guide on how to use rustup to install cross toolchains, but the above steps to install OS specific linkers are still important.
A big part of the appeal of Rust for me is super fast, SAFE, built in UTF8 string processing, access to detailed memory layout, things like SIMD. Basically, to be able to idiomatically, safely, and beautifully (functionally?) do super fast and efficient data processing.
Many of the links/crates/techniques in the sections below use unsafe. Be sure to run the Miri compiler/checker to find and help debug subtle bugs - it is your friend! Also see Learn Rust the Dangerous Way which covers many topics in this space and talks about how to limit and reason about unsafe code.
-
Rustonomicon - Officially part of Rust-lang, this is a guide to "The Dark Arts of Unsafe Rust"
-
Cheap Tricks - Rust Performance - set of quick Cargo settings to try
-
How to Write Fast Rust Code - really good guide
-
High Performance Rust - a book
-
Optimizing String Processing in Rust - really useful stuff
-
Achieving warp speed with Rust - great tips on performance optimization
-
Rust Match vs Lookup - remember that rustc heavily optimizes matches. Just rely on match!
-
Deep Dive into Dynamic Dispatch - great details and perf comparison
-
Rust to Assembly - great series of blog posts detailing how various parts of Rust compile down to assembly
-
- BTW, a super efficient thread local crate is scoped-tls - basically just storing a mutable pointer. If it fits your use case, it's awesome!
-
Modern storage is plenty fast - using a new Rust crate called glommio one can achieve multi-GB per sec read throughputs from modern SSDs. So maybe we don't need memory after all.
- Along the same lines, not Rust-specific but ScyllaDB and I/O Access Methods - discussions of mmap vs AIO/DIO vs standard Linux I/O
- Direct I/O Writes - why doing direct I/O writes may end up better than buffered
-
Representations - super important to understand low-level memory layouts for structs. C vs packed vs .... including alignment issues.
-
Precise memory layouts and how to dump out Rust struct memory layouts
- Or just use the memoffset crate
-
MemFlow - framework to inspect machine memory. Think about DMA/IO, debugger, or Plasma-type memory/DB applications.
-
Rust uses system malloc by default. How to switch the default allocator.
- Use jemallocator and jemalloc-ctl crates for stats, deep dives, etc. Jemalloc from Facebook supposed to be fast.
- Also see MiMalloc - a high perf allocator from Microsoft. I got 2x improvement for JSON workloads!
- There are even epoch GCs available
- Also look into the arena and typed_arena crates... very cheap allocations within a region, then free entire region at once.
- Also see bumpalo - bump allocator which includes custom versions of standard collections
- Also: Garbage Collection for System Programmers - great writeup
-
Watch out for dynamic dispatch (when you need to use `Box<dyn MyTrait>` etc). One solution is to use enum_dispatch.
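For the allocator items above, here is a minimal sketch of how the default allocator gets swapped. It uses only the standard library: `CountingAlloc`, `TOTAL_ALLOCATED` and `total_allocated` are made-up names for illustration, and the wrapper simply forwards to the system allocator while counting requested bytes. Crates like jemallocator and mimalloc plug in through the same `#[global_allocator]` attribute, with their own allocator type in place of the wrapper.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards every request to the system allocator
/// and keeps a running total of the bytes that were requested.
pub struct CountingAlloc;

/// Total bytes ever requested through `alloc` (never decremented on free).
static TOTAL_ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        TOTAL_ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        // SAFETY: the caller upholds the `GlobalAlloc::alloc` contract.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` was returned by `System.alloc` with this layout.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This attribute is what replaces the default allocator for the whole program.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Bytes requested from the allocator since program start.
pub fn total_allocated() -> usize {
    TOTAL_ALLOCATED.load(Ordering::Relaxed)
}
```

Reading `total_allocated()` before and after a piece of code gives a rough idea of how much it allocates. It is not a replacement for a real profiler.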
If small binary size is what you're after, check out Min-sized-Rust.
Rust has a super slick asm! inline assembly feature (nightly-only when this was first written, stable since Rust 1.59). The way that it integrates Rust variables/expressions with auto register assignment is super awesome.
NOTE: simplest way to increase perf may be to enable certain CPU instructions: set -x RUSTFLAGS "-C target-feature=+sse3,+sse4.2,+lzcnt,+avx,+avx2"
NOTE2: `lazy_static` accesses are not cheap. Don't use it in hot code paths.
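A minimal sketch of the usual workaround for NOTE2: fetch the lazily initialised value once, outside the loop. This uses `std::sync::OnceLock` (Rust 1.70+) as a stand-in for `lazy_static`. The `weights` table and the `score_*` functions are invented for the example, and whether the difference matters for your workload is something to measure.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical lazily initialised global lookup table.
fn weights() -> &'static HashMap<u8, u64> {
    static WEIGHTS: OnceLock<HashMap<u8, u64>> = OnceLock::new();
    WEIGHTS.get_or_init(|| (0..=255u8).map(|b| (b, u64::from(b) * 2)).collect())
}

/// Goes through the lazy accessor (and its "already initialised?" check)
/// on every iteration.
pub fn score_slow(data: &[u8]) -> u64 {
    data.iter().map(|b| weights()[b]).sum()
}

/// Fetches the reference once, then the loop only touches the table itself.
pub fn score_fast(data: &[u8]) -> u64 {
    let table = weights();
    data.iter().map(|b| table[b]).sum()
}
```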
In my experience, if you know all the possible types, using an enum is the fastest way to store
something dynamic. No allocations in most cases, good data locality. enum-dispatch is a big big help for enums. There
are other solutions like using `dyn Any` but they are all slower and usually involve dynamic dispatch of some kind.
Trait objects also have limitations - mainly around object safety, so some trait methods are not usable in `dyn`
situations, esp Serde traits. OTOH, nested enums can cause serious memory bloat when you have large enum variants and
they are used in collections. Here are some "better dyn Any" alternatives:
- Related: auto_enum - a way to return enums when you might need to return `impl A` for some trait A when you might be returning diff implementations
- Can also use ambassador - to delegate trait implementations
- delegate - general purpose method delegation
- See dynamic for a faster alternative to `dyn Any`. However in my usage I didn't see a massive improvement.
- box_any is another fast solution which actually keeps `*void` style pointers but still drops properly
- smallbox - a box that can store smaller values on stack for speed, also has Clone and PartialEq support. Questionable Any support though.
- Also see unibox - for another solution to storing dynamic data
- Mopa - allows you to derive Any-like methods like downcasting for your traits. Pretty useful.
- typetag - Serde serializable trait objects
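To make the enum vs trait object tradeoff concrete, here is a small sketch using only the standard library. All names (`Shape`, `AnyShape`, `Bloated`, ...) are invented for the example. The hand-written `match` in `impl Shape for AnyShape` is roughly the boilerplate that a crate like enum_dispatch is meant to generate for you, and `Bloated` shows the memory cost of one large variant.

```rust
use std::mem::size_of;

pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);
pub struct Rect(pub f64, pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Closed set of types: values are stored inline, dispatch is a `match`.
pub enum AnyShape {
    Square(Square),
    Rect(Rect),
}

impl Shape for AnyShape {
    fn area(&self) -> f64 {
        match self {
            AnyShape::Square(s) => s.area(),
            AnyShape::Rect(r) => r.area(),
        }
    }
}

/// Static dispatch over a contiguous slice, no per-element allocation.
pub fn total_area_enum(shapes: &[AnyShape]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Dynamic dispatch: one heap allocation and one vtable call per element.
pub fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Every value of this enum is as large as its largest variant (plus a tag),
/// which is where the bloat in collections comes from.
pub enum Bloated {
    Small(u8),
    Large([u64; 32]),
}

pub fn bloated_size() -> usize {
    size_of::<Bloated>()
}
```

If `bloated_size()` is a problem, boxing only the large variant is the usual compromise between the two approaches.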
Note: this section is mostly about profiling tools -- detailed breakdowns of bottlenecks, as opposed to benchmarking (which is repeatable, systematic measurement). The two benchmarking tools I recommend are criterion and Iai.
NEW: I've created a Docker image for Linux perf profiling, super easy to use. The best combo is cargo flamegraph followed by perf and asm analysis.
-
cargo-flamegraph -- this is now the easiest way to get a FlameGraph on OSX and profile your Rust binaries. To make it work with bench and Criterion:
- First run `cargo bench` to build your bench executable
- If you haven't already, `cargo install flamegraph` (recommend at least v0.1.13)
- `sudo flamegraph target/release/bench-aba573ea464f3f67 --profile-time 180 <filter> --bench` (replace bench-aba* with the name of your bench executable)
  - The `--profile-time` is needed for flamegraph to collect enough stats
- `open -a Safari flamegraph.svg`
- NOTE: you need to turn on `debug = true` in release profile for symbols
- This method works better for apps than small benchmarks btw, as inlined methods won't show up in the graph.
-
Rust Profiling with Instruments on OSX - but apparently cannot export CSV to FlameGraph :(
- Note that you can now just install cargo instruments
- Also useful for heap/memory analysis, including tracking retained vs transient allocations
-
Rust Performance: Perf and Flamegraph - including finding hot assembly instructions
-
samply - used to be called perfrecord, Rust CPU CLI command profiler using Firefox as UI. WIP.
-
Iai - a one-shot Rust profiler that uses Valgrind underneath
-
Top-down Microarchitecture Analysis Method - TMAM is a formal microprocessor perf analysis method from Intel, works with perf to find out what CPU-level bottlenecks are (mem IO? branch predictions? etc.)
-
Rust Profiling with DTrace and FlameGraphs on OSX - probably the best bet (besides Instruments), can handle any native executable too
- From `@blaagh`: though the predicate should be `"/pid == $target/"` rather than using execname.
- DTrace Guide is probably pretty useful here
-
Hyperfine - Rust performance benchmarking CLI
-
Tools for Profiling Rust - cpuprofiler might possibly work on OSX. It does compile. The cpuprofiler crate requires surrounding blocks of your code though.
-
Rust Profiling talk - discusses both OSX and Linux, as well as Instruments and Intel VTune
-
2017 RustConf - Improving Rust Performance through Profiling
-
Flamer - an alternative to generating FlameGraphs if one is willing to instrument code. Warning: might require nightly Rust features.
-
cargo-profiler - only works in Linux :(
-
coz and its Cargo plugin, coz-rs -- "a new kind of profiler that unlocks optimization opportunities missed by traditional profilers. Coz employs a novel technique we call causal profiling that measures optimization potential"
-
Rust Perf Book Profiling Page - lots of good links
-
Divan - easy macro to benchmark functions
cargo-asm can dump out assembly or LLVM/IR output from a particular method. I have found this useful for really low level perf analysis. NOTE: if the method is generic, you need to give a "monomorphised" or filled out method. Also, methods declared inline won't show up.
- What I like to do with asm output: check if rustc has inlined certain methods. Also you can clearly see where dynamic dispatch happens and how complicated generated code seems. More complicated code usually == slower.
- llvm-mca - really detailed static analysis and runtime prediction at the machine instruction level
- Godbolt assembly exploring without crate limitations, in Visual Studio Code - great guide to generating disassembly and visualizing it in VSCode
I have found that `cargo rustc` can often generate more assembly than `cargo asm` where you have to specify a method name. However, in general one needs to make generic structs concrete, perhaps by adding stub functions in `lib.rs`, in order to view assembly. Also, LLVM IR might be easier to read.
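A sketch of the stub-function trick mentioned above. `sum_all` and `sum_all_u32_stub` are made-up names. The idea is that generic code only gets compiled once a concrete type is picked, so a small non-generic, non-inlined wrapper gives the assembly tools a named symbol to look at.

```rust
use std::iter::Sum;

/// Generic code: no machine code is emitted for this until some caller
/// picks a concrete `T`.
pub fn sum_all<T: Copy + Sum<T>>(values: &[T]) -> T {
    values.iter().copied().sum()
}

/// Hypothetical stub for lib.rs: pins `T = u32` so there is a concrete,
/// named function to inspect. `#[inline(never)]` keeps it from being
/// folded into its callers.
#[inline(never)]
pub fn sum_all_u32_stub(values: &[u32]) -> u32 {
    sum_all(values)
}
```

You would then point your assembly viewer of choice at `sum_all_u32_stub` rather than at the generic function.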
What works on a Mac (but see cargo flamegraph above for easier way):
sudo dtrace -c './target/release/bench-2022f41cf9c87baf --profile-time 120' -o out.stacks -n 'profile-997 /pid == $target/ { @[ustack(100)] = count(); }'
~/src/github/FlameGraph/stackcollapse.pl out.stacks | ~/src/github/FlameGraph/flamegraph.pl >rust-bench.svg
open -a Safari rust-bench.svg
where -c bench.... is the executable output of cargo bench.
I was hoping cargo-with would allow us to run above dtrace command with the name of the bench output, but alas it doesn't seem to work with bench. (NOTE: they are working on a PR to fix this! :)
For benchmarking I highly recommend criterion, which works on stable and has extra features such as gnuplot, parameterized benchmarking and run-to-run comparisons, as well as being able to run for a longer time to work with profilers such as dtrace.
The options I've tried out:
- Bytehound - really slick, but only works on Linux (using perf).
- No need to modify apps, uses `LD_PRELOAD`
- extracts full stack traces plus every alloc/dealloc, but claims it uses custom unwinding code that's much much faster
- tracks memory usage over time, as well as leaks explicitly, and memory fragmentation
- can give you flamegraphs of memory allocations or just leaks!
- Has a really nice UI/webapp that's bundled together
- Has many options to write out profiling data to different locations or over network
- Problems:
  - Creates giant profiling data files. There are options to slim it down though, such as keeping only allocations that live longer than a particular threshold
  - Bundled viewer does not seem to be able to load debug symbols when profiling data does not include them :(
  - It seems the only way to really include full symbols in the profiling info is to run profiling with a debug build. However this blows up the size of the data file even more... hundreds of MBs from just a few minutes of run time!
- jeprof: If you use jemallocator and install jemalloc as your global allocator, you can get some profiling for free.
- Jemalloc Heap Profiling
- How to parse jeprof text output
- Pros: Jemalloc profiling is sampling based and very lightweight. It can be used in production with minimal perf impact.
- The profile files are also very small
- Cons: it's, like, really hard to use. For example, enabling it via environment variable - the instructions are not very clear, and there is no way to write the files to anything other than the current directory
  - Runtime config: set both environment variables `MALLOC_CONF` and `_RJEM_MALLOC_CONF` (which one works depends on environment)
  - Compile time config, for jemallocator users: `JEMALLOC_SYS_WITH_MALLOC_CONF`
- Con: The stats collected are about total memory allocated, with no differentiation for short/temporal vs long-lived allocations
- Con: It's not built for Rust and difficult to infer stacktraces. Many symbols are mangled.
- It is possible to do differential analysis: use one profile as a "base" and then diff vs other profiles. However, the profile files use sequence numbers, so it's hard to tell which profile to use for what time.
- Also there is no way to sort the output and the options for simplifying the output don't work very well
- dhat - Swap out the global allocator, will profile your allocations & max heap usage
- One advantage DHAT has over jeprof/jemalloc is lifetime / allocation length information. This can be used to figure out long-held things
- DHAT also tracks the entire call graph so it can produce a useful tree
- Its online viewer is also much easier to use than `jeprof`
- Unfortunately DHAT tracks every allocation so it's not good for production use
- DHAT also crashes on some workloads. This is really annoying.
- Heaptrack and working with Rust - works for Rust, but only on Linux.
After the above frustrations and investigations, I decided to write my own custom memory profiler - Ying - a sampling profiler, built for rich Rust stack traces including inlined methods, which tracks retained memory and lifetimes. Definitely experimental right now.
- Phantom Menace - nice article on a phantom memory leak caused by container measurement problems. Has nice hints on how to set up jemalloc memory profiling.
- memory-profiler - written in Rust by the Nokia team!
- allocative - generate runtime memory usage (not allocations) flamegraphs of structs you tag/derive using a custom trait. From Facebook.
- memuse - another approach to tag your structs and get dynamic (including heap) memory usage info
- stats_alloc can dump out incremental stats about allocation. Or just use `jemalloc-ctl`.
- deepsize - macro to recursively find size of an object
- Parity-util-mem - can find the size of collections as well?
- Measuring Memory Usage in Rust - thoughts on working around the fact we don't have a GC to track deep memory usage
- How to Create a Custom Allocator - great post on many details, page allocation, multi-threading etc.
-
nom - a direct parser using macros, commonly accepted as fastest generic parser
-
pest is a PEG parser using an external, easy to understand syntax file. Not quite as fast but might be easier to understand and debug. There is also a book.
-
combine is a parser combinator library, supposedly just as fast as nom, syntax seems slightly
-
simdutf8 - SIMD lightning fast UTF-8 validation
- bitpacking - insanely fast integer bitpacking library
- packed_struct - bitfield packing/unpacking; can also pack arrays of bitfields; mixed endianness, etc. However you have to explicitly pack/unpack.
- Similar but easier to use: bit_field
- rkyv - Zero-copy deserialization, for generic Rust structs, even trait objects. Uses relative pointers.
- binary-layout - "type-safe, inplace, zero-copy access to structured binary data" including open-ended byte arrays at the end
- FlexBuffers - version of FlatBuffers for schema-less data!
- zerovec - zero-copy Vec and Map types for dealing with alignment, endianness, and variable-length str types
- aligned-vec - Vecs that are aligned!!
- Speeding up incoming message parsing using nom - a detailed guide to using nom for deserialization, much faster than Serde
The ideal performance-wise is to not need serialization at all; ie be able to read directly from portions of a binary byte slice. There are some libraries for doing this, such as flatbuffers, or flatdata for which there is a Rust crate; or Cap'n Proto. However, there may be times when you want more control or things like Cap'n Proto are not good enough.
How do we perform low-level byte/bit twiddling and precise memory access? Unfortunately, all structs in Rust basically need to have known sizes. There's something called dynamically sized types basically like slices where you can have the last element of a struct be an array of unknown size; however, they are virtually impossible to create and work with, and this only covers some cases anyhow. So we will unfortunately need a combination of techniques. In order of preference:
- Overall scroll is the best general-purpose struct serialization crate; it helps with reading integers and other fields too, and takes care of endianness. It generates pretty efficient code. It is a bit of a pain working with numeric enums however.
- num_enum - a way to derive TryFrom for numeric enums helps a little bit.
- I have found plain works really well. Mark your structs with `#[repr(C)]`. It only helps with size and alignment, not endianness - so maybe more for in-memory structures or when you are sure you don't need code to work across endianness platforms. If your structures are not aligned then use `#[repr(C, packed)]` (the same as `#[repr(C, packed(1))]`).
- Use a crate such as bytes or scroll to help extract and write structs and primitives to/from buffers. Might need extra copying though. Also see iobuf
- rel-ptr - small library for relative pointers/offsets, should be super useful for custom file formats and binary/persistent data structures
- tagptr - use a few bits in pointer words for metadata
- zerocopy - utilities for zero-copy parsing/deserialization and auto byteorder flipping/alignment, with `FromBytes` and `AsBytes` traits for easy transmuting
- Erasable - type erased pointers
- arrayref might help extract fixed size arrays from longer ones.
- bytemuck for casts
- bitmatch could be great for bitfield parsing
- Also see zero
- Allocate a `Vec::<u8>` and transmute specific portions to/from structs of known size, or convert pointers within regions back to references:
// mybytes must be mutable, long enough, and suitably aligned for Foobar
let foobar: *mut Foobar = mybytes.as_mut_ptr() as *mut Foobar;
let foobar_ref: &mut Foobar = unsafe { foobar.as_mut() }.expect("Cannot convert foobar to ref");
- Or structview which offers types for unaligned integers etc.
- There are some DST crates worth checking out: slice-dst, thin-dst
- See dyn_struct - a way to allocate DSTs on the heap using safe Rust
- As a last resort, work with raw pointer math using the add/sub/offset methods, but this is REALLY UNSAFE.
// mybytes must be mutable, at least size_of::<Foobar>() bytes long, and aligned for Foobar
let foobar: *mut Foobar = mybytes.as_mut_ptr() as *mut Foobar;
unsafe {
    (*foobar).foo = 17;
    (*foobar).bar = -1;
}
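Before reaching for raw pointer casts, it is worth seeing how far the standard library alone gets you. The sketch below uses a made-up `Header` record, not anything from the crates above. It shows an explicit little-endian decode (safe, endianness-aware), a `read_unaligned` copy (unsafe, native endianness, but with no alignment requirement), and what `#[repr(C)]` vs `#[repr(C, packed)]` does to struct size.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical record header; `repr(C)` fixes the field order and layout.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Header {
    pub id: u32,
    pub len: u16,
    pub flags: u16,
}

/// Safe route: decode each field explicitly as little-endian.
pub fn parse_header(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < 8 {
        return None;
    }
    Some(Header {
        id: u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]),
        len: u16::from_le_bytes([bytes[4], bytes[5]]),
        flags: u16::from_le_bytes([bytes[6], bytes[7]]),
    })
}

/// Unsafe route: copy the struct straight out of the buffer.
/// Works at any offset, but the result uses the host's endianness.
pub fn read_header_unaligned(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < size_of::<Header>() {
        return None;
    }
    // SAFETY: the length was checked above, `Header` contains only integers
    // with no padding (any bit pattern is valid), and `read_unaligned` does
    // not require the pointer to be aligned.
    Some(unsafe { std::ptr::read_unaligned(bytes.as_ptr() as *const Header) })
}

/// (size, alignment) of `Header`.
pub fn header_layout() -> (usize, usize) {
    (size_of::<Header>(), align_of::<Header>())
}

/// Same fields, with and without padding.
#[repr(C)]
pub struct Padded {
    pub tag: u8,
    pub value: u32,
}

#[repr(C, packed)]
pub struct Packed {
    pub tag: u8,
    pub value: u32,
}

/// Sizes of the padded and the packed variant.
pub fn padded_and_packed_sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Packed>())
}
```

The explicit decode costs a few more lines per struct, which is exactly the boilerplate that crates like scroll or zerocopy aim to remove.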
Sometimes you want to make independent parts of byte buffers mutable. Some crates help with this:
- bytes can be used, but its API is more geared towards network use cases
- deferred_reference is a clever crate that can return independent mutable slices
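For the simple case of two non-overlapping regions, the standard library's `split_at_mut` already hands out independent mutable slices. A minimal sketch, where `init_frame` and its 4-byte length header are invented for the example:

```rust
/// Splits `buf` into a 4-byte header and a body, then mutates both halves
/// while both borrows are alive. Returns `false` if `buf` is too short.
pub fn init_frame(buf: &mut [u8]) -> bool {
    if buf.len() < 4 {
        return false;
    }
    // Two independent `&mut [u8]` into the same buffer.
    let (header, body) = buf.split_at_mut(4);
    body.fill(0xFF);
    // Store the body length in the header as a little-endian u32.
    header.copy_from_slice(&(body.len() as u32).to_le_bytes());
    true
}
```

The crates above are for the cases this cannot express, such as many regions or regions decided at runtime.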
Want to zero memory quickly? Use slice_fill for memset optimization. Note that since Rust 1.50 the standard library also has `slice::fill`, so the crate is mainly useful on older compilers.
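On a current toolchain the standard library version looks like this (a minimal sketch, `zero_buffer` is just an illustrative name):

```rust
/// Zeroes a byte slice in place using `slice::fill` (stable since Rust 1.50).
pub fn zero_buffer(buf: &mut [u8]) {
    buf.fill(0);
}
```

It works on sub-slices too, so `zero_buffer(&mut bytes[4..12])` clears only that range.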
Also check out the crazy number of crates available under compression - including various interesting radix and trie data structures, and more compression algorithms that one has never heard of.
A frequent problem, esp when working with data, is to have a "union" of different types. Perhaps `Option` will suffice,
but sometimes we need to wrap `Vec<A>` and `Vec<B>` together in the same type. We don't want to just use `Box<dyn MyTrait>`
as that allocates and results in dynamic dispatch. Here are some crates and patterns that may help in
working with enums, or alternatives (do see the section on dyn Any above):
- enum_dispatch - macro to implement the `dyn MyTrait` trait object pattern for enums, so we get fast static dispatch. Basically implements traits for underlying types in enums
- enum_delegate is an alternative that works with associated types in traits - but not generics
- strum - derive strings and discriminant enums using macros
- You can use `std::mem::discriminant`, a built-in function, to get an opaque, comparable discriminant value for an enum (it is not a number)
- Also enum discriminants can be explicitly specified using `#[repr(..)]`, see here - you can then transmute the enum into something explicit
- Efficient Memory Layouts using Unsafe and Unions
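A short sketch of the two discriminant techniques from the list, standard library only. `Opcode` and `Column` are invented types. For the `#[repr(u8)]` enum the example uses an `as` cast and a checked `from_u8` rather than `transmute`, which sidesteps undefined behaviour on invalid bytes.

```rust
use std::mem;

/// Field-less enum with explicit discriminants and a fixed-size tag.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Opcode {
    Nop = 0,
    Load = 1,
    Store = 2,
    Halt = 255,
}

impl Opcode {
    /// Enum -> number is a plain cast.
    pub fn as_u8(self) -> u8 {
        self as u8
    }

    /// Number -> enum is checked, since not every byte is a valid variant.
    pub fn from_u8(byte: u8) -> Option<Opcode> {
        match byte {
            0 => Some(Opcode::Nop),
            1 => Some(Opcode::Load),
            2 => Some(Opcode::Store),
            255 => Some(Opcode::Halt),
            _ => None,
        }
    }
}

/// A "union" of differently typed vectors in one type.
pub enum Column {
    Ints(Vec<i64>),
    Floats(Vec<f64>),
}

/// `mem::discriminant` compares variants while ignoring the payload.
pub fn same_variant(a: &Column, b: &Column) -> bool {
    mem::discriminant(a) == mem::discriminant(b)
}
```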
Some non-enum crates that can also help:
- ptr_union - "Pointer union types the size of a pointer by storing the tag in the alignment bits" :)
- erasable - "Type-erased thin pointers" - need to see how this is different from `std::any::Any`
There is this great article, Towards fearless SIMD, about why SIMD is hard and how to make it easier, along with pointers to many interesting crates doing SIMD. (There is a built in crate, `std::simd`, but it is really lacking. However, packed_simd will soon be merged into it.)
Another great article: Learning SIMD with Rust by finding planets. SIMD is really about parallelism. It is better to do multiple operations in a parallel (vertical) fashion, vector on vector, than to do horizontal operations where the different components of a wide register depend on one another.
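To illustrate vertical vs horizontal without any SIMD crate, here is a sketch in plain Rust where fixed-size arrays stand in for SIMD registers. `LANES`, `add_vertical` and `dot` are made-up names. Whether the compiler actually vectorises these loops depends on target and optimisation settings, so treat it as a picture of the data flow, not a performance claim.

```rust
/// Number of f32 "lanes" in our pretend register.
pub const LANES: usize = 8;

/// Vertical operation: lane `i` of the result depends only on lane `i`
/// of the inputs, so all lanes can be computed independently.
pub fn add_vertical(a: &[f32; LANES], b: &[f32; LANES]) -> [f32; LANES] {
    let mut out = [0.0f32; LANES];
    for i in 0..LANES {
        out[i] = a[i] + b[i];
    }
    out
}

/// Dot product that stays vertical for as long as possible: each lane keeps
/// its own accumulator, and the single horizontal reduction (summing across
/// lanes) happens only once at the end.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    let n = a.len().min(b.len());
    let (a, b) = (&a[..n], &b[..n]);
    let chunks_a = a.chunks_exact(LANES);
    let chunks_b = b.chunks_exact(LANES);
    // Elements that do not fill a whole register are handled as scalars.
    let tail: f32 = chunks_a
        .remainder()
        .iter()
        .zip(chunks_b.remainder())
        .map(|(x, y)| x * y)
        .sum();
    let mut acc = [0.0f32; LANES];
    for (ca, cb) in chunks_a.zip(chunks_b) {
        for i in 0..LANES {
            acc[i] += ca[i] * cb[i]; // vertical multiply-accumulate
        }
    }
    acc.iter().sum::<f32>() + tail // the one horizontal step
}
```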
-
ssimd - an effort to bring std::simd/packed_simd to Rust stable, with auto vectorization (meaning auto detect and implement code paths and fallbacks for when SIMD not available!)
-
faster - "SIMD for Humans" -- probably my favorite one, very high level translation of numeric map loops into SIMD
-
fearless_simd, the blog post author's crate. Runtime CPU detection and use of the most optimal code, no need for unsafe, but only focused on f32.
-
SIMDeez - abstracts intrinsic SIMD instructions over different instruction sets & vector widths, runtime detection
-
simd_aligned and simd_aligned_rust - work with SIMD and packed_simd using vectors which have guaranteed alignment
-
aligned - newtype with byte alignment, for stack or heap!
NOTE: `shuffle` in `packed_simd` is not very fast. Replace with native instructions if possible.