
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Python 2,304 160 Updated Aug 21, 2024

Building AI agents, atomically

Python 2,018 144 Updated Jan 17, 2025

A simple zero-config tool to make locally trusted development certificates with any names you'd like.

Go 51,689 2,731 Updated Aug 13, 2024

Fast hamming-distance range searches via native GiST Indexing facility in PostgreSQL

C 168 19 Updated Oct 11, 2019
Python 10 Updated Dec 11, 2024

LLM inference in C/C++

C++ 70,885 10,258 Updated Jan 18, 2025

Distribute and run LLMs with a single file.

C++ 21,281 1,096 Updated Jan 5, 2025

Vision model based document ingestion

Rust 1,307 69 Updated Jan 17, 2025

Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured …

Python 2,064 158 Updated Jan 19, 2025

PDF to Markdown with vision models

Python 8,555 542 Updated Dec 18, 2024

A fast, simple Java tool for processing Excel files that avoids out-of-memory errors on large files

Java 33,076 7,608 Updated Oct 29, 2024

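A visual no-code/code-free web crawler/spider. 易采集 (EasySpider): a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.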

JavaScript 37,034 4,543 Updated Jan 8, 2025

An alternative to original alert, confirm and prompt.

JavaScript 178 16 Updated Mar 16, 2017

🧑‍🚀 The better identity infrastructure for developers and the open-source alternative to Auth0.

TypeScript 9,268 465 Updated Jan 17, 2025

FastGPT is a knowledge-based platform built on the LLMs, offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…

TypeScript 19,810 5,217 Updated Jan 18, 2025

Data annotation component library, provided as NPM packages

TypeScript 71 18 Updated Jan 16, 2025

A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.

Python 9,526 340 Updated Jan 18, 2025

A curated list of python scripts for automating your tasks

Python 616 287 Updated Jan 7, 2025

qpdf: A content-preserving PDF document transformer

C++ 3,651 287 Updated Jan 5, 2025

AiEditor is a next-generation rich text editor for AI.

TypeScript 1,201 134 Updated Jan 13, 2025
C 404 121 Updated Aug 22, 2014

Python PDF Parser (Not actively maintained). Check out pdfminer.six.

Python 5,275 1,128 Updated Dec 7, 2022

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

Python 5,978 481 Updated Nov 3, 2024

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 7,067 691 Updated Jan 1, 2025

360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute

259 Updated Sep 10, 2024

Convert PDF to markdown + JSON quickly with high accuracy

Python 19,383 1,154 Updated Jan 17, 2025

Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy.

Python 795 80 Updated Oct 15, 2024

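text2vec, text to vector. A text vector representation tool that converts text into vector matrices and implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.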

Python 4,589 405 Updated Jan 2, 2025

Mirror of Apache PDFBox

Java 2,734 876 Updated Jan 18, 2025