Add support for Python UDFs in distributed queries #173

@andygrove

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to be able to execute queries containing custom Python code. This is already supported in DataFusion but we need to add the serde aspect in Ballista.

Describe the solution you'd like
It looks like we can serialize Python functions with https://docs.rs/pyo3/0.7.0/pyo3/marshal/index.html

We then need to store the serialized Python functions in protobuf and deserialize them in the executors.
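
To make the round trip concrete, here is a minimal sketch using Python's standard `marshal` module, which exposes the same marshalling format on the Python side. It is not Ballista or pyo3 code. The helper names `serialize_udf` and `deserialize_udf` and the sample function `add_one` are hypothetical, and the assumption is that the resulting bytes would travel in a protobuf `bytes` field.

```python
import builtins
import marshal
import types


def serialize_udf(func):
    """Marshal the function's code object into bytes.

    Hypothetical helper: the bytes are what would be stored in a
    protobuf `bytes` field and shipped to the executors.
    """
    return marshal.dumps(func.__code__)


def deserialize_udf(payload, name="udf"):
    """Rebuild a callable from marshalled code bytes (executor side)."""
    code = marshal.loads(payload)
    # Only the code object travels; globals must be supplied by the receiver.
    return types.FunctionType(code, {"__builtins__": builtins}, name)


def add_one(x):
    # Sample UDF with no closures, defaults, or module-level dependencies.
    return x + 1


payload = serialize_udf(add_one)      # scheduler/client side
restored = deserialize_udf(payload)   # executor side
result = restored(41)
```

Some limitations to keep in mind: the marshal format is specific to the Python version, so client and executors would need matching interpreters. Only the code object is serialized, so default arguments, closures, and referenced globals or imports are not carried along. Unmarshalling and running bytes from an untrusted source is unsafe.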

Describe alternatives you've considered
None

Additional context
None

Metadata

Labels: enhancement (New feature or request)
