#ranking #sorting #differentiable-sorting #fenchel-young #fastsoftsort

fynch

Differentiable sorting and ranking: PAVA, Fenchel-Young losses, and O(n log n) FastSoftSort

2 releases

0.1.1 Mar 8, 2026
0.1.0 Jan 18, 2026

#2600 in Algorithms


Used in rankit

MIT/Apache

105KB
1.5K SLoC

fynch


Differentiable sorting and ranking primitives for structured prediction and isotonic regression. Implements PAVA, Fenchel-Young losses, and entropy-regularized operators.

Dual-licensed under MIT or Apache-2.0.

crates.io | docs.rs

Quickstart

[dependencies]
fynch = "0.1.1"

use fynch::{entmax, pava, soft_rank};

let theta = [2.0, 1.0, 0.1];
let p = entmax(&theta, 1.5);

let y = [3.0, 1.0, 2.0, 5.0, 4.0];
let isotonic = pava(&y);
let r = soft_rank(&y, 1.0).unwrap();

println!("p={p:?}\nisotonic={isotonic:?}\nsoft_rank={r:?}");

lib.rs:

fynch

Fenchel-Young losses and differentiable sorting/ranking.

The name comes from Fenchel-Young losses (Blondel et al. 2020, JMLR), a unifying framework connecting prediction functions and loss functions through convex duality.

The Fenchel-Young Framework

Given a regularizer Ω, the Fenchel-Young loss is:

L_Ω(θ; y) = Ω*(θ) - ⟨θ, y⟩ + Ω(y)

where Ω* is the Fenchel conjugate. The prediction function is:

ŷ_Ω(θ) = ∇Ω*(θ) = argmax_p { ⟨θ, p⟩ - Ω(p) }
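
As a concrete worked instance (a hand-rolled sketch independent of the crate; fy_loss_shannon and log_sum_exp are illustrative names, not crate API): with Shannon negentropy Ω(p) = Σᵢ pᵢ ln pᵢ, the conjugate Ω*(θ) is log-sum-exp, the prediction ∇Ω*(θ) is softmax, and L_Ω reduces to the cross-entropy.

// FY loss for Ω(p) = Σ p_i ln p_i, where Ω*(θ) = ln Σ exp(θ_i).
fn log_sum_exp(theta: &[f64]) -> f64 {
    let m = theta.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    m + theta.iter().map(|&t| (t - m).exp()).sum::<f64>().ln()
}

fn fy_loss_shannon(theta: &[f64], y: &[f64]) -> f64 {
    let dot: f64 = theta.iter().zip(y).map(|(t, yi)| t * yi).sum();
    // 0·ln 0 = 0 by convention, so skip zero entries of y.
    let omega_y: f64 = y.iter().filter(|&&yi| yi > 0.0).map(|&yi| yi * yi.ln()).sum();
    log_sum_exp(theta) - dot + omega_y
}

// For a one-hot y this is exactly -ln softmax(θ)_k, i.e. cross-entropy.
let loss = fy_loss_shannon(&[2.0, 1.0, 0.1], &[1.0, 0.0, 0.0]);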

Different regularizers give different behaviors:

| Regularizer Ω | Prediction ŷ_Ω | Loss L_Ω | Sparsity |
|---|---|---|---|
| Shannon negentropy | softmax | cross-entropy | Dense |
| ½‖·‖² (squared L2) | sparsemax | sparsemax loss | Sparse |
| Tsallis α-entropy | α-entmax | entmax loss | Tunable |

See the fenchel module for the generic framework.

Differentiable Sorting/Ranking

Ranking is discontinuous: a small input change can swap two ranks, so the output jumps and the gradient is zero almost everywhere. Sorting is continuous but piecewise-linear, so its derivatives carry no useful learning signal either. Both properties break gradient-based optimization, as the sketch below illustrates.
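
A standalone sketch of the jump (hard_ranks is an illustrative helper, not part of the crate): an arbitrarily small perturbation flips two hard ranks by a whole unit.

fn hard_ranks(x: &[f64]) -> Vec<usize> {
    // rank[i] = number of entries strictly smaller than x[i]
    x.iter()
        .map(|&xi| x.iter().filter(|&&xj| xj < xi).count())
        .collect()
}

println!("{:?}", hard_ranks(&[0.50, 0.51])); // [0, 1]
println!("{:?}", hard_ranks(&[0.51, 0.50])); // [1, 0]: a 0.01 shift swaps both ranks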

| Approach | Module | Regularization | Complexity |
|---|---|---|---|
| PAVA + sigmoid | Root | L2 | O(n) / O(n²) |
| Sinkhorn OT | sinkhorn | Entropy (Shannon Ω) | O(n² × iter) |

Sinkhorn sorting is exactly FY with Shannon regularization applied to the permutation polytope (Birkhoff polytope).
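
For intuition, here is a minimal unweighted pool-adjacent-violators sketch (pava_sketch is illustrative and not necessarily how the crate's pava is implemented): scan left to right, and whenever a new value breaks monotonicity, merge it into the preceding block and replace the block with its mean.

// Minimal unweighted PAVA: a stack of (block mean, block size) pairs.
fn pava_sketch(y: &[f64]) -> Vec<f64> {
    let mut blocks: Vec<(f64, usize)> = Vec::new();
    for &v in y {
        let (mut mean, mut count) = (v, 1usize);
        // Merge backwards while the previous block's mean exceeds ours.
        while let Some(&(m, c)) = blocks.last() {
            if m <= mean { break; }
            mean = (m * c as f64 + mean * count as f64) / (c + count) as f64;
            count += c;
            blocks.pop();
        }
        blocks.push((mean, count));
    }
    // Expand the blocks back into a flat, non-decreasing vector.
    blocks.iter().flat_map(|&(m, c)| std::iter::repeat(m).take(c)).collect()
}

assert_eq!(pava_sketch(&[3.0, 1.0, 2.0, 5.0, 4.0]), vec![2.0, 2.0, 2.0, 4.5, 4.5]);

Blondel et al. (2020) show that soft sorting and ranking reduce to isotonic regression on a sorted input, which is why an O(n log n) sort followed by linear-time PAVA gives the overall O(n log n) cost advertised above.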

Key Functions

| Function | Purpose | Module |
|---|---|---|
| pava | Isotonic regression | Root |
| soft_rank | Continuous ranks | Root |
| soft_sort | Continuous sorting | Root |
| fenchel::softmax | Dense prediction | fenchel |
| fenchel::sparsemax | Sparse prediction | fenchel |
| fenchel::entmax | Tunable sparsity | fenchel |

Quick Start

Fenchel-Young Predictions

use fynch::fenchel::{softmax, sparsemax, entmax};

let theta = [2.0, 1.0, 0.1];

// Dense (softmax)
let p_soft = softmax(&theta);
assert!(p_soft.iter().all(|&x| x > 0.0));

// Sparse (sparsemax)
let p_sparse = sparsemax(&theta);
assert!(p_sparse.iter().any(|&x| x == 0.0));

// Tunable (1.5-entmax)
let p_ent = entmax(&theta, 1.5);
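
Sparsemax is the Euclidean projection of θ onto the probability simplex (Martins & Astudillo 2016). A minimal sketch of the standard sort-and-threshold computation follows (sparsemax_sketch is illustrative, not the crate's internals):

fn sparsemax_sketch(theta: &[f64]) -> Vec<f64> {
    // Sort scores in descending order.
    let mut z = theta.to_vec();
    z.sort_by(|a, b| b.partial_cmp(a).unwrap());
    // Find the threshold tau so that the clipped values sum to 1.
    let (mut cumsum, mut tau) = (0.0, 0.0);
    for (k, &zk) in z.iter().enumerate() {
        cumsum += zk;
        let t = (cumsum - 1.0) / (k as f64 + 1.0);
        if zk > t { tau = t } else { break }
    }
    // Coordinates below tau are clipped exactly to zero.
    theta.iter().map(|&v| (v - tau).max(0.0)).collect()
}

assert_eq!(sparsemax_sketch(&[2.0, 1.0, 0.1]), vec![1.0, 0.0, 0.0]);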

Fenchel-Young Losses

use fynch::fenchel::{Regularizer, Shannon, SquaredL2, Tsallis};

let theta = [2.0, 1.0, 0.1];
let y = [1.0, 0.0, 0.0];  // one-hot target

// Cross-entropy (Shannon)
let loss_ce = Shannon.loss(&theta, &y);

// Sparsemax-style FY loss via squared L2 regularizer
let loss_sp = SquaredL2.loss(&theta, &y);

// 1.5-entmax loss
let loss_ent = Tsallis::entmax15().loss(&theta, &y);

Differentiable Sorting

use fynch::{pava, soft_rank, soft_sort};

// PAVA: isotonic regression
let y = [3.0, 1.0, 2.0, 5.0, 4.0];
let monotonic = pava(&y);  // [2.0, 2.0, 2.0, 4.5, 4.5]

// Soft ranking
let scores = [0.5, 0.2, 0.8, 0.1];
let ranks = soft_rank(&scores, 0.1).unwrap();

Learning to Rank

use fynch::loss::{listnet_loss, spearman_loss};

let pred = [0.9, 0.1, 0.5];
let target = [3.0, 1.0, 2.0];
// Differentiable Spearman loss; listnet_loss is the ListNet-style
// alternative exported by the same module.
let loss = spearman_loss(&pred, &target, 0.1);
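
To show how such a loss can be assembled from the crate's own primitives, here is a hypothetical sketch (spearman_mse_sketch is not the crate's spearman_loss, which may normalize or weight differently): soft-rank the predictions and penalize the squared distance to the target ranks, keeping everything differentiable.

use fynch::soft_rank;

// Hypothetical: mean squared error between soft ranks of the predictions
// and the given target ranks.
fn spearman_mse_sketch(pred: &[f64], target_ranks: &[f64], temp: f64) -> f64 {
    let soft = soft_rank(pred, temp).unwrap();
    soft.iter()
        .zip(target_ranks)
        .map(|(s, t)| (s - t) * (s - t))
        .sum::<f64>()
        / pred.len() as f64
}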

Modules

| Module | Contents |
|---|---|
| fenchel | Generic FY framework: regularizers, predictions, losses |
| sinkhorn | Entropic OT for soft permutations |
| loss | Learning-to-rank losses |
| metrics | IR evaluation: MRR, NDCG, Hits@k |

Connections

  • Sinkhorn sorting is the FY prediction with Shannon Ω over the Birkhoff polytope, as noted above.
  • softmax, sparsemax, and α-entmax are the FY predictions for the Shannon, squared-L2, and Tsallis regularizers respectively.
  • Soft sorting/ranking in the style of Blondel et al. (2020) reduces to isotonic regression, which pava solves.

What Can Go Wrong

  1. Temperature too low: numerical instability and vanishing gradients (see the sweep after this list)
  2. Temperature too high: predictions flatten toward uniform and the ranking signal is lost
  3. Sinkhorn not converging: increase max_iter or the entropic regularization epsilon
  4. Wrong regularizer: use sparsemax when you need sparse, top-k-like outputs, and softmax for soft attention
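
A quick way to diagnose items 1 and 2 is to sweep the temperature and watch the soft ranks move between hard ranks and a uniform vector (exact values depend on the implementation):

use fynch::soft_rank;

let scores = [0.5, 0.2, 0.8, 0.1];
// Low temperature: near-hard ranks, but gradients start to vanish.
// High temperature: ranks flatten toward uniform and the signal is lost.
for temp in [0.01, 0.1, 1.0, 10.0] {
    println!("temp={temp}: {:?}", soft_rank(&scores, temp).unwrap());
}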

References

  • Blondel, Martins & Niculae (2020). "Learning with Fenchel-Young Losses". JMLR.
  • Martins & Astudillo (2016). "From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification". ICML.
  • Blondel, Teboul, Berthet & Djolonga (2020). "Fast Differentiable Sorting and Ranking". ICML.
  • Cuturi, Teboul & Vert (2019). "Differentiable Ranking and Sorting using Optimal Transport". NeurIPS.

Dependencies

~0.4–1.2MB
~24K SLoC