Vector index storage and loading for frankensearch.
This crate implements the FSVI binary format reader/writer plus exact brute-force top-k vector search, with optional HNSW ANN acceleration.
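As a rough illustration of what "exact brute-force top-k" means here, the sketch below scores every stored vector against the query and keeps the `k` best. It is not the crate's implementation: `dot` and `brute_force_top_k` are hypothetical names, the slab is assumed to be a flat row-major `f32` slice, and the score is assumed to be a plain dot product.

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hypothetical helper (not part of this crate's API): scans every row of a
/// flat, row-major `slab` and returns `(row index, score)` for the `k`
/// highest-scoring rows, best first.
fn brute_force_top_k(
    slab: &[f32],
    dimension: usize,
    query: &[f32],
    k: usize,
) -> Vec<(usize, f32)> {
    assert!(dimension > 0 && slab.len() % dimension == 0);
    assert_eq!(query.len(), dimension);

    // Score every vector: this is what makes the search exact.
    let mut scored: Vec<(usize, f32)> = slab
        .chunks_exact(dimension)
        .enumerate()
        .map(|(i, v)| (i, dot(v, query)))
        .collect();

    // Highest score first; `total_cmp` gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Sorting all scores costs O(n log n); a bounded heap of size `k` would lower that to O(n log k), at the price of a slightly longer example.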
§FSVI File Layout
All multi-byte integers are little-endian. The vector slab is 64-byte aligned for cache-line / SIMD friendliness.
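The two rules above can be sketched in a few lines, using the field names and sizes from the layout diagram in this section. This is a minimal sketch, not the crate's reader: `RecordEntry`, `parse_record`, `align_up`, `header_len`, and `vectors_offset` are hypothetical names, record fields are assumed to be packed in the listed order, and alignment is assumed to be measured from the start of the file.

```rust
use std::convert::TryInto;

/// One 16-byte record-table entry (hypothetical struct; field names follow
/// the layout diagram, fields assumed packed in the listed order).
#[derive(Debug, PartialEq)]
struct RecordEntry {
    doc_id_hash: u64,
    doc_id_offset: u32,
    doc_id_len: u16,
    flags: u16,
}

/// Decode a record entry; every multi-byte integer is little-endian.
fn parse_record(bytes: &[u8; 16]) -> RecordEntry {
    RecordEntry {
        doc_id_hash: u64::from_le_bytes(bytes[0..8].try_into().unwrap()),
        doc_id_offset: u32::from_le_bytes(bytes[8..12].try_into().unwrap()),
        doc_id_len: u16::from_le_bytes(bytes[12..14].try_into().unwrap()),
        flags: u16::from_le_bytes(bytes[14..16].try_into().unwrap()),
    }
}

/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Header length: the fixed-size fields sum to 38 bytes, plus the two
/// variable-length embedder strings.
fn header_len(embedder_id: &str, embedder_revision: &str) -> u64 {
    const FIXED: u64 = 4 + 2 + 2 + 2 + 4 + 1 + 3 + 8 + 8 + 4;
    FIXED + embedder_id.len() as u64 + embedder_revision.len() as u64
}

/// Offset of the vector slab: header, record table, and string table,
/// padded up to a 64-byte boundary (assumed relative to file start).
fn vectors_offset(header_len: u64, record_count: u64, string_table_len: u64) -> u64 {
    align_up(header_len + record_count * 16 + string_table_len, 64)
}
```

For example, a 43-byte header, 10 records, and a 100-byte string table end at byte 303, so under these assumptions the slab would start at byte 320.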
┌───────────────────────────────────────────┐
│ Header (variable length)                  │
│   magic: b"FSVI"             (4 bytes)    │
│   version: u16               (2 bytes)    │
│   embedder_id_len: u16       (2 bytes)    │
│   embedder_id: [u8]          (variable)   │
│   embedder_revision_len: u16 (2 bytes)    │
│   embedder_revision: [u8]    (variable)   │
│   dimension: u32             (4 bytes)    │
│   quantization: u8           (1 byte)     │
│   reserved: [u8; 3]          (3 bytes)    │
│   record_count: u64          (8 bytes)    │
│   vectors_offset: u64        (8 bytes)    │
│   header_crc32: u32          (4 bytes)    │
├───────────────────────────────────────────┤
│ Record Table                              │
│   record_count × 16 bytes each:           │
│     doc_id_hash: u64         (8 bytes)    │
│     doc_id_offset: u32       (4 bytes)    │
│     doc_id_len: u16          (2 bytes)    │
│     flags: u16               (2 bytes)    │
├───────────────────────────────────────────┤
│ String Table                              │
│   Concatenated UTF-8 doc_id strings       │
├───────────────────────────────────────────┤
│ Padding (to 64-byte alignment)            │
├───────────────────────────────────────────┤
│ Vector Slab                               │
│   record_count × dimension × elem_size    │
│   (2 bytes/elem for f16, 4 for f32)       │
└───────────────────────────────────────────┘
Re-exports§
pub use mrl::MrlConfig;
pub use mrl::MrlSearchStats;
pub use quantization::ScalarQuantizer;
pub use search::PARALLEL_CHUNK_SIZE;
pub use search::PARALLEL_THRESHOLD;
pub use search::SearchParams;
pub use simd::cosine_similarity_f16;
pub use simd::dot_product_f16_f32;
pub use simd::dot_product_f32_f32;
pub use two_tier::TwoTierIndex;
pub use two_tier::TwoTierIndexBuilder;
pub use two_tier::VECTOR_INDEX_FALLBACK_FILENAME;
pub use two_tier::VECTOR_INDEX_FAST_FILENAME;
pub use two_tier::VECTOR_INDEX_QUALITY_FILENAME;
pub use wal::CompactionStats;
pub use wal::WalConfig;
pub use wal::wal_path_for;
pub use warmup::AdaptiveConfig;
pub use warmup::HeatMap;
pub use warmup::WarmUpConfig;
pub use warmup::WarmUpResult;
pub use warmup::WarmUpStrategy;
Modules§
- mrl
- Matryoshka Representation Learning (MRL) adaptive dimensionality at search time.
- quantization
- Scalar quantization for vector compression.
- search
- Brute-force top-k vector search over an opened crate::VectorIndex.
- simd
- Portable SIMD dot-product helpers for vector search.
- two_tier
- Two-tier index wrapper for fast and quality vector indices.
- wal
- Write-ahead log for incremental FSVI index updates.
- warmup
- Index warm-up and adaptive page prefaulting for memory-mapped FSVI indices.
Structs§
- VacuumStats
- Statistics returned by VectorIndex::vacuum.
- VectorIndex
- VectorIndexWriter
- VectorMetadata
- Parsed metadata from an FSVI file header.
Enums§
- Quantization
- Vector element quantization stored in the FSVI slab.
Constants§
- FSVI_MAGIC
- Magic bytes at the start of every FSVI file.
- FSVI_VERSION
- Supported FSVI format version.