learnerdaimler/DocuMagic


DocuMagic – Intelligent Document Processing Platform

DocuMagic is an end-to-end automated document ingestion, parsing, classification, and storage solution built with FastAPI, PostgreSQL, and modern NLP/AI capabilities. It accelerates enterprise workflows by turning raw documents into structured, searchable insights.

🚀 Key Features

Email Ingestion Pipeline: Automatically fetches emails and their attachments from configured mailboxes.

Document Parsing Engine: Extracts text, tables, metadata, and entities using Python libraries and custom logic.

Metadata Extraction & Classification: Categorizes documents using rules or ML-driven classification.

Secure Storage Layer: Saves parsed documents and metadata in PostgreSQL plus file/object storage.

REST API with FastAPI: A clean, fast, async API for consumption by dashboards or other systems.

Streamlit Dashboard (Optional): A business-friendly interface for viewing processed documents.

Automation & Scheduling: Background jobs for periodic ingestion, cleaning, and reporting.
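The rule-based side of the classification feature can be illustrated with a small keyword matcher. This is a minimal sketch under stated assumptions, not the project's actual classifier: the `Rule` type, the `classify` function, and the categories and keywords in `RULES` are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    """A category plus the keywords that count as evidence for it (hypothetical)."""
    category: str
    keywords: Tuple[str, ...]


# Hypothetical categories and keywords, for illustration only.
RULES = (
    Rule("invoice", ("invoice", "amount due", "bill to")),
    Rule("contract", ("agreement", "hereinafter", "party")),
    Rule("resume", ("curriculum vitae", "work experience", "education")),
)


def classify(text: str, rules=RULES, default: str = "uncategorized") -> str:
    """Return the category whose keywords occur most often, or `default` if none match."""
    lowered = text.lower()
    best_category, best_hits = default, 0
    for rule in rules:
        # Count whole-word (or whole-phrase) occurrences of every keyword.
        hits = sum(
            len(re.findall(r"\b" + re.escape(keyword) + r"\b", lowered))
            for keyword in rule.keywords
        )
        # Strict comparison: on a tie, the earlier rule wins.
        if hits > best_hits:
            best_category, best_hits = rule.category, hits
    return best_category


if __name__ == "__main__":
    print(classify("Invoice #12: amount due 300 EUR"))  # invoice
```

Keeping the rules as plain data makes it straightforward to load them from configuration, or to fall back to an ML model only when no rule matches.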

🏗️ Tech Stack

| Layer | Technology |
| --- | --- |
| Backend API | FastAPI |
| Database | PostgreSQL |
| ORM | SQLAlchemy |
| Document Processing | PyPDF2, pdfminer, Tesseract, custom NLP |
| Email Client | IMAP / SMTP |
| Dashboard | Streamlit |
| Deployment | Docker, Uvicorn, Gunicorn |
| Cloud | Azure / AWS (optional integrations) |

📦 Project Structure

DocuMagic/
│── app/
│   ├── api/
│   ├── core/
│   ├── models/
│   ├── services/
│   ├── utils/
│   └── main.py
│
│── tests/
│── requirements.txt
│── README.md
│── docker-compose.yml
│── .env.example
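The tech stack names IMAP as the email client. The attachment-handling half of ingestion can be sketched with the Python standard library alone. This is an illustrative sketch, not the repository's code: `extract_attachments` and `build_sample_message` are hypothetical helpers, and the sample message is built in memory so the example runs without a mailbox. In a real pipeline the raw bytes would presumably come from an IMAP fetch (for instance via `imaplib`).

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List, Tuple


def extract_attachments(raw: bytes) -> List[Tuple[str, str, bytes]]:
    """Return (filename, content_type, payload) for each attachment in a raw message."""
    msg = message_from_bytes(raw, policy=policy.default)
    found = []
    # iter_attachments() skips the body parts and yields the remaining sub-parts.
    for part in msg.iter_attachments():
        filename = part.get_filename() or "unnamed"
        payload = part.get_payload(decode=True) or b""  # undo base64/quoted-printable
        found.append((filename, part.get_content_type(), payload))
    return found


def build_sample_message() -> bytes:
    """Build a small in-memory email with one PDF-like attachment (sample data only)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "inbox@example.com"
    msg["Subject"] = "Monthly report"
    msg.set_content("Report attached.")
    msg.add_attachment(
        b"%PDF-1.4 sample",
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    for name, content_type, data in extract_attachments(build_sample_message()):
        print(name, content_type, len(data))
```

Returning plain tuples keeps this step independent of the parsing and storage layers, which can then decide how to handle each attachment by its content type.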
