Equibles.ParadeDB.EntityFrameworkCore

EF Core integration for ParadeDB pg_search — BM25 full-text search indexes on PostgreSQL.

Provides a [Bm25Index] attribute for automatic index creation via migrations, and LINQ-friendly query methods for BM25 search, fuzzy matching, boosting, scoring, snippets, and more. No raw SQL needed.

Requirements

PostgreSQL with the pg_search extension installed
Npgsql.EntityFrameworkCore.PostgreSQL provider
.NET 10+

Installation

dotnet add package Equibles.ParadeDB.EntityFrameworkCore

Setup

1. Enable ParadeDB in your DbContext

services.AddDbContext<MyDbContext>(options =>
    options.UseNpgsql(connectionString, npgsql => npgsql.UseParadeDb()));

2. Add BM25 indexes to your entities

using Equibles.ParadeDB.EntityFrameworkCore;

[Bm25Index(nameof(Id), nameof(Title), nameof(Content))]
public class Article
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
}

The first argument is the key field, which pg_search requires to identify rows (for example, when scoring with pdb.score()). The remaining arguments are the columns to index for full-text search. The key field itself is not searchable; ParadeDB uses it only internally.
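The query examples in this README read from `dbContext.Articles`. A minimal sketch of a context that exposes that set is shown here, assuming the `MyDbContext` name from step 1 and plain EF Core conventions. The class body is illustrative and not something this library requires or generates.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context for the Article entity above; only standard EF Core APIs are used.
public class MyDbContext : DbContext
{
    public MyDbContext(DbContextOptions<MyDbContext> options)
        : base(options)
    {
    }

    // Exposed so that queries can start from dbContext.Articles.
    public DbSet<Article> Articles => Set<Article>();
}
```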

3. Create a migration

dotnet ef migrations add AddBm25Index
dotnet ef database update

EF Core will generate the migration automatically, creating:

The pg_search PostgreSQL extension
A BM25 index on the specified columns with the correct key_field storage parameter

Querying

Basic Search

Uses the ||| operator — matches documents containing any of the query terms (OR).

var results = await dbContext.Articles
    .Where(a => EF.Functions.Matches(a.Content, "machine learning"))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Content" ||| 'machine learning'

Conjunction Search

Uses the &&& operator — matches documents containing all of the query terms (AND).

var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesAll(a.Content, "machine learning"))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Content" &&& 'machine learning'

Phrase Search

Matches the query terms in exact order. The optional slop argument allows up to N words between terms, or transposition of adjacent terms.

// Exact phrase
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesPhrase(a.Content, "neural networks"))
    .ToListAsync();

// Phrase with slop — allows up to 2 words between terms
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesPhrase(a.Content, "neural networks", 2))
    .ToListAsync();

-- Exact phrase
SELECT * FROM "Articles" WHERE "Content" ### 'neural networks'

-- With slop
SELECT * FROM "Articles" WHERE "Content" ### 'neural networks'::pdb.slop(2)

Term Search

Exact token match: the query is NOT tokenized, so no stemming or lowercasing is applied to it. Because most tokenizers lowercase text at index time, pass the term in lowercase.

// Single term
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesTerm(a.Content, "gpu"))
    .ToListAsync();

// Multiple terms (matches any)
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesTermSet(a.Content, "gpu", "tpu", "npu"))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Content" === 'gpu'
SELECT * FROM "Articles" WHERE "Content" === ARRAY['gpu', 'tpu', 'npu']
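Because term queries skip tokenization, the caller is responsible for normalizing input. The sketch below shows one way to do that, assuming the index uses a tokenizer that only lowercases (no stemming). `TermNormalizer` is a hypothetical helper and not part of this library.

```csharp
using System;

// Hypothetical helper, not part of Equibles.ParadeDB.EntityFrameworkCore.
public static class TermNormalizer
{
    // Trims surrounding whitespace and lowercases with the invariant culture,
    // approximating what a lowercasing tokenizer would have stored in the index.
    public static string Normalize(string input)
    {
        ArgumentNullException.ThrowIfNull(input);
        return input.Trim().ToLowerInvariant();
    }
}
```

For example, `TermNormalizer.Normalize(" GPU ")` yields `gpu`, which could then be passed as the term to `MatchesTerm`.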

Fuzzy Search (Levenshtein Distance)

Tolerates typos by allowing up to N single-character edits (insertions, deletions, substitutions). Max distance is 2.

prefix: exempts the initial substring from the edit distance
transpositionCostOne: counts swapping two adjacent characters as one edit instead of two

// Basic fuzzy (distance 2)
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesFuzzy(a.Content, "machin", 2))
    .ToListAsync();

// Fuzzy with all options (distance 2, then prefix and transpositionCostOne, assumed to be in the order listed above)
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesFuzzy(a.Content, "machin", 2, true, false))
    .ToListAsync();

// Fuzzy AND match
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesAllFuzzy(a.Content, "machin lerning", 2))
    .ToListAsync();

// Fuzzy term match
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesTermFuzzy(a.Content, "machin", 1))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Content" ||| 'machin'::pdb.fuzzy(2)
SELECT * FROM "Articles" WHERE "Content" ||| 'machin'::pdb.fuzzy(2, true, false)
SELECT * FROM "Articles" WHERE "Content" &&& 'machin lerning'::pdb.fuzzy(2)
SELECT * FROM "Articles" WHERE "Content" === 'machin'::pdb.fuzzy(1)
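To make the distance argument concrete, here is a standalone sketch of edit distance with an optional flag that counts an adjacent swap as one edit. It illustrates the concept only; it is not how pg_search computes matches, it does not model the prefix option, and `EditDistance` is a hypothetical name.

```csharp
using System;

// Hypothetical illustration of edit distance; not part of this library or pg_search.
public static class EditDistance
{
    // Counts insertions, deletions, and substitutions as one edit each.
    // When transpositionCostOne is true, swapping two adjacent characters
    // also counts as one edit instead of two.
    public static int Compute(string a, string b, bool transpositionCostOne = false)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (var i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (var j = 0; j <= b.Length; j++) d[0, j] = j;

        for (var i = 1; i <= a.Length; i++)
        {
            for (var j = 1; j <= b.Length; j++)
            {
                var cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);

                // Adjacent transposition: "ab" versus "ba".
                if (transpositionCostOne && i > 1 && j > 1
                    && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                {
                    d[i, j] = Math.Min(d[i, j], d[i - 2, j - 2] + 1);
                }
            }
        }

        return d[a.Length, b.Length];
    }
}
```

With this sketch, `EditDistance.Compute("machin", "machine")` is 1, so a distance of 1 or 2 would tolerate that typo. `EditDistance.Compute("ab", "ba")` is 2 by default and 1 when `transpositionCostOne` is true.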

Boost

Increases the BM25 relevance weight of a specific search term. Higher boost = higher score for matches on that term. Factor range: -2048 to 2048.

// Boosted OR match
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesBoosted(a.Title, "transformers", 2.0))
    .ToListAsync();

// Boosted AND match
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesAllBoosted(a.Content, "attention mechanism", 1.5))
    .ToListAsync();

// Combined fuzzy + boost
var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesFuzzyBoosted(a.Title, "transfomers", 2, 2.0))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Title" ||| 'transformers'::pdb.boost(2)
SELECT * FROM "Articles" WHERE "Content" &&& 'attention mechanism'::pdb.boost(1.5)
SELECT * FROM "Articles" WHERE "Title" ||| 'transfomers'::pdb.fuzzy(2)::pdb.boost(2)

BM25 Scoring

BM25 (Best Matching 25) ranks documents by relevance considering term frequency, inverse document frequency, and document length.

var results = await dbContext.Articles
    .Where(a => EF.Functions.Matches(a.Content, "deep learning"))
    .Select(a => new
    {
        a.Title,
        Score = EF.Functions.Score(a.Id)
    })
    .OrderByDescending(a => a.Score)
    .Take(10)
    .ToListAsync();

SELECT "Title", pdb.score("Id") AS "Score"
FROM "Articles"
WHERE "Content" ||| 'deep learning'
ORDER BY pdb.score("Id") DESC
LIMIT 10
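For reference, the three factors named in this section appear in the textbook Okapi BM25 formula below. This is the general form and not a statement of pg_search's exact implementation; the constants k_1 and b are tuning parameters whose values in pg_search are not given in this README.

```latex
\mathrm{score}(D, Q) = \sum_{q_i \in Q} \mathrm{IDF}(q_i) \cdot
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}
\qquad
\mathrm{IDF}(q_i) = \ln\!\left(1 + \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}\right)
```

Here f(q_i, D) is the frequency of term q_i in document D, |D| is the document length, avgdl is the average document length in the index, N is the number of documents, and n(q_i) is the number of documents containing q_i.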

Snippets

Returns text excerpts with matched terms highlighted using configurable HTML tags.

// Basic snippet (default highlighting)
var results = await dbContext.Articles
    .Where(a => EF.Functions.Matches(a.Content, "neural networks"))
    .Select(a => new
    {
        a.Title,
        Snippet = EF.Functions.Snippet(a.Content)
    })
    .ToListAsync();

// Parameterized snippet (custom tags and length)
var results = await dbContext.Articles
    .Where(a => EF.Functions.Matches(a.Content, "neural networks"))
    .Select(a => new
    {
        a.Title,
        Snippet = EF.Functions.Snippet(a.Content, "<b>", "</b>", 100)
    })
    .ToListAsync();

// Multiple snippets (max 15 characters each, limit 5, offset 0)
var results = await dbContext.Articles
    .Where(a => EF.Functions.Matches(a.Content, "neural networks"))
    .Select(a => new
    {
        a.Title,
        Snippets = EF.Functions.Snippets(a.Content, 15, 5, 0)
    })
    .ToListAsync();

SELECT "Title", pdb.snippet("Content") AS "Snippet" FROM "Articles" WHERE ...
SELECT "Title", pdb.snippet("Content", start_tag => '<b>', end_tag => '</b>', max_num_chars => 100) AS "Snippet" FROM "Articles" WHERE ...
SELECT "Title", pdb.snippets("Content", max_num_chars => 15, "limit" => 5, "offset" => 0) AS "Snippets" FROM "Articles" WHERE ...

Parse Query (Tantivy Syntax)

Full query parser supporting field:value, boolean operators (AND/OR/NOT), ranges (rating:>3), and wildcards.

lenient: ignores syntax errors
conjunctionMode: defaults terms to AND instead of OR

// Basic parse query
var results = await dbContext.Articles
    .Where(a => EF.Functions.Parse(a.Id, "title:transformers AND content:attention"))
    .ToListAsync();

// With options (lenient and conjunctionMode both enabled)
var results = await dbContext.Articles
    .Where(a => EF.Functions.Parse(a.Id, "transformers attention", true, true))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Id" @@@ pdb.parse('title:transformers AND content:attention')
SELECT * FROM "Articles" WHERE "Id" @@@ pdb.parse('transformers attention', lenient => TRUE, conjunction_mode => TRUE)

Regex Search

Matches indexed tokens against a regular expression (Rust regex syntax).

var results = await dbContext.Articles
    .Where(a => EF.Functions.Regex(a.Content, "neuro.*"))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Content" @@@ pdb.regex('neuro.*')

Phrase Prefix

Matches a phrase where the last term is treated as a prefix — useful for autocomplete/type-ahead.

// Basic phrase prefix
var results = await dbContext.Articles
    .Where(a => EF.Functions.PhrasePrefix(a.Content, "running", "sh"))
    .ToListAsync();

// With max expansions
var results = await dbContext.Articles
    .Where(a => EF.Functions.PhrasePrefix(a.Content, 10, "running", "sh"))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Content" @@@ pdb.phrase_prefix(ARRAY['running', 'sh'])
SELECT * FROM "Articles" WHERE "Content" @@@ pdb.phrase_prefix(ARRAY['running', 'sh'], max_expansions => 10)

More Like This

Finds documents similar to a given document by analyzing its indexed terms.

// Find documents similar to the one with key 3 (assumes an integer key; the Article entity above uses a Guid)
var results = await dbContext.Articles
    .Where(a => EF.Functions.MoreLikeThis(a.Id, 3))
    .ToListAsync();

// Restrict similarity analysis to specific fields (here an indexed "description" column, which the Article entity above does not define)
var results = await dbContext.Articles
    .Where(a => EF.Functions.MoreLikeThis(a.Id, 3, "description"))
    .ToListAsync();

SELECT * FROM "Articles" WHERE "Id" @@@ pdb.more_like_this(3)
SELECT * FROM "Articles" WHERE "Id" @@@ pdb.more_like_this(3, ARRAY['description'])

Combining with LINQ

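All search methods compose with standard LINQ operators such as Where, Select, OrderByDescending, and Take. The example below assumes the entity also has a CreatedAt property: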

var results = await dbContext.Articles
    .Where(a => EF.Functions.MatchesFuzzy(a.Content, "transfomers", 2)
                && a.CreatedAt > DateTime.UtcNow.AddMonths(-6))
    .Select(a => new
    {
        a.Title,
        Snippet = EF.Functions.Snippet(a.Content, "<mark>", "</mark>", 200),
        Score = EF.Functions.Score(a.Id)
    })
    .OrderByDescending(a => a.Score)
    .Take(20)
    .ToListAsync();

How it works

Index creation

The library hooks into EF Core's model finalization pipeline via IConventionSetPlugin. During model building, it:

Scans entity types for [Bm25Index] attributes
Creates database indexes with the bm25 index method
Sets the key_field storage parameter (required by pg_search)
Registers the pg_search PostgreSQL extension

All of this is translated into standard EF Core migrations — no manual SQL required.

Query translation

LINQ methods on EF.Functions are translated to SQL via IMethodCallTranslatorPlugin:

| C# Method | SQL |
| --- | --- |
| `Matches(col, "q")` | `col \|\|\| 'q'` |
| `MatchesAll(col, "q")` | `col &&& 'q'` |
| `MatchesPhrase(col, "q")` | `col ### 'q'` |
| `MatchesPhrase(col, "q", 2)` | `col ### 'q'::pdb.slop(2)` |
| `MatchesTerm(col, "q")` | `col === 'q'` |
| `MatchesTermSet(col, "a", "b")` | `col === ARRAY['a', 'b']` |
| `MatchesFuzzy(col, "q", 2)` | `col \|\|\| 'q'::pdb.fuzzy(2)` |
| `MatchesFuzzy(col, "q", 2, true, false)` | `col \|\|\| 'q'::pdb.fuzzy(2, true, false)` |
| `MatchesAllFuzzy(col, "q", 2)` | `col &&& 'q'::pdb.fuzzy(2)` |
| `MatchesTermFuzzy(col, "q", 1)` | `col === 'q'::pdb.fuzzy(1)` |
| `MatchesBoosted(col, "q", 2.0)` | `col \|\|\| 'q'::pdb.boost(2)` |
| `MatchesAllBoosted(col, "q", 2.0)` | `col &&& 'q'::pdb.boost(2)` |
| `MatchesFuzzyBoosted(col, "q", 2, 2.0)` | `col \|\|\| 'q'::pdb.fuzzy(2)::pdb.boost(2)` |
| `MatchesAllFuzzyBoosted(col, "q", 2, 2.0)` | `col &&& 'q'::pdb.fuzzy(2)::pdb.boost(2)` |
| `Score(id)` | `pdb.score(id)` |
| `Snippet(col)` | `pdb.snippet(col)` |
| `Snippet(col, "<b>", "</b>", 100)` | `pdb.snippet(col, start_tag => '<b>', end_tag => '</b>', max_num_chars => 100)` |
| `Snippets(col, 15, 5, 0)` | `pdb.snippets(col, max_num_chars => 15, "limit" => 5, "offset" => 0)` |
| `Parse(id, "desc:shoes")` | `id @@@ pdb.parse('desc:shoes')` |
| `Parse(id, "q", true, true)` | `id @@@ pdb.parse('q', lenient => true, conjunction_mode => true)` |
| `Regex(col, "key.*")` | `col @@@ pdb.regex('key.*')` |
| `PhrasePrefix(col, "running", "sh")` | `col @@@ pdb.phrase_prefix(ARRAY['running', 'sh'])` |
| `PhrasePrefix(col, 10, "running", "sh")` | `col @@@ pdb.phrase_prefix(ARRAY['running', 'sh'], max_expansions => 10)` |
| `MoreLikeThis(id, 3)` | `id @@@ pdb.more_like_this(3)` |
| `MoreLikeThis(id, 3, "description")` | `id @@@ pdb.more_like_this(3, ARRAY['description'])` |

License

MIT

Author

Daniel Oliveira
