Dockerfile #61

@michabbb

Hi,

I am not a Python person and was not able to create a working Docker image.
Maybe someone can help here, thanks!

Here is my attempt, but when I start a container I always get errors that libraries are not found, although they were installed:

FROM ubuntu:24.10

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y \
    python3.10 \
    python3-pip \
    python3-venv \
    git \
    wget \
    libjpeg-dev \
    zlib1g-dev \
    libopencv-dev \
    libgl1 \
    && apt-get clean

WORKDIR /app

RUN git clone https://github.com/opendatalab/PDF-Extract-Kit.git

WORKDIR /app/PDF-Extract-Kit

RUN echo "matplotlib\nPyMuPDF\nultralytics\npaddlepaddle\npaddleocr==2.7.3\npaddlepaddle-gpu" > requirements.txt

RUN python3 -m venv venv && \
    . venv/bin/activate && \
    pip install --upgrade pip
RUN  pip install -r requirements.txt --break-system-packages
RUN  pip install --break-system-packages --extra-index-url https://miropsota.github.io/torch_packages_builder detectron2==0.6+pt2.3.1cu121
RUN  pip install "pillow>=10.0.0" --break-system-packages
RUN  pip install opencv-python opencv-python-headless --break-system-packages

COPY . .

# Set the command used to start the container
CMD ["bash", "-c", "source venv/bin/activate && python3 pdf_extract.py"]
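
One possible explanation, offered as a sketch rather than a confirmed diagnosis: every `RUN` instruction starts a fresh shell, so a venv activated in one `RUN` is no longer active in the next. The later `pip install ... --break-system-packages` steps would then install into the system Python, while a `CMD` that activates the venv runs an interpreter that only sees the venv's own, nearly empty, site-packages. The fragment below assumes that this is the cause and that the missing libraries are Python packages. It puts the venv first on `PATH` so that every later step uses it. The venv location and `requirements.txt` are carried over from the Dockerfile above.

```dockerfile
# Sketch only: stands in for the venv, pip, and CMD steps of the Dockerfile above.
# Assumption: the errors are Python import errors caused by packages being
# installed into the system interpreter instead of the venv.

# Create the venv, then put its bin/ directory first on PATH.
# ENV persists across later RUN steps and into the running container.
RUN python3 -m venv /app/PDF-Extract-Kit/venv
ENV PATH="/app/PDF-Extract-Kit/venv/bin:$PATH"

# "pip" now resolves to the venv's pip, so --break-system-packages is not needed.
RUN pip install --upgrade pip && \
    pip install -r requirements.txt

# The version specifier is quoted so the shell does not treat ">" as a redirection.
RUN pip install "pillow>=10.0.0"

# Any remaining pip install steps would go here, also without --break-system-packages.

# "python3" resolves to the venv's interpreter through PATH.
CMD ["python3", "pdf_extract.py"]
```

With the venv first on `PATH`, `pip` and `python3` resolve to the venv's copies in every later `RUN` and in `CMD`, so `source venv/bin/activate` is no longer needed.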
