LOTUS is an open-source framework and query engine for processing structured and unstructured datasets with large language models. It provides a declarative programming model: developers express complex AI data operations as high-level commands instead of manually orchestrating model calls. Its Python interface has a Pandas-like API, which makes it familiar to data scientists and engineers who already work with data analysis libraries. The core concept is the semantic operator, which extends traditional relational database operations to support reasoning over text and other unstructured data. With these operators, tasks such as semantic filtering, ranking, clustering, and summarization can be expressed directly within data processing pipelines. During execution, the LOTUS engine automatically optimizes how language models are used, which can significantly improve performance and reduce computational cost.
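The idea of a semantic operator can be illustrated with a minimal, self-contained sketch. It does not use the LOTUS API: the `sem_filter` function, its `{column}` template handling, and the `toy_model` stand-in for a language model are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict, List

Row = Dict[str, str]

def sem_filter(rows: List[Row], instruction: str, model: Callable[[str], bool]) -> List[Row]:
    """Keep rows for which the model judges the filled-in instruction to be true."""
    kept = []
    for row in rows:
        # Substitute {column} placeholders with the row's values to build the prompt.
        prompt = re.sub(r"\{(\w+)\}", lambda m: row[m.group(1)], instruction)
        if model(prompt):
            kept.append(row)
    return kept

def toy_model(prompt: str) -> bool:
    # Hypothetical stand-in for an LLM call: a keyword check, not real reasoning.
    return any(word in prompt.lower() for word in ("calculus", "algebra", "statistics"))

courses = [
    {"title": "Linear Algebra"},
    {"title": "Art History"},
    {"title": "Intro to Statistics"},
]
math_heavy = sem_filter(courses, "The course {title} requires a lot of math", toy_model)
print([c["title"] for c in math_heavy])
```

The caller states what to keep in natural language and leaves the per-row model calls to the operator, which is the declarative style described above. In a real pipeline the model callable would wrap an actual language model.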
Features
- Declarative query engine for AI-powered data processing
- Pandas-like Python API for building data pipelines
- Semantic operators for filtering, joining, and ranking unstructured data
- Integration with large language models for reasoning tasks
- Optimized execution engine that improves LLM performance and efficiency
- Support for processing structured documents and multimedia datasets
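One simple way an execution engine could cut model cost is to avoid repeated calls for identical prompts. The sketch below assumes this caching strategy purely for illustration. The description above does not say which optimizations LOTUS applies, and `CachedModel` and `keyword_model` are hypothetical names.

```python
from typing import Callable, Dict

class CachedModel:
    """Wrap a model callable so identical prompts are evaluated only once."""

    def __init__(self, model: Callable[[str], bool]) -> None:
        self._model = model
        self._cache: Dict[str, bool] = {}
        self.calls = 0  # number of underlying model invocations

    def __call__(self, prompt: str) -> bool:
        if prompt not in self._cache:
            self.calls += 1
            self._cache[prompt] = self._model(prompt)
        return self._cache[prompt]

def keyword_model(prompt: str) -> bool:
    # Hypothetical stand-in for an expensive LLM call.
    return "refund" in prompt.lower()

model = CachedModel(keyword_model)
tickets = ["Refund please", "Where is my order?", "Refund please"]
flags = [model(f"Ticket: {t}") for t in tickets]
print(flags, model.calls)
```

Three rows are classified with only two underlying model calls, because the duplicate ticket is served from the cache.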