Update papermill transient dependencies on docker build #923

Merged
merged 7 commits into from
Sep 17, 2020

lresende
Member

Some papermill transient dependencies are not properly
declared and require a manual install/update to avoid
runtime issues.

Fixes #920

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@lresende
Member Author

I have also updated elyra/elyra:dev and it's being uploaded now.

@akchinSTC
Member

Do these need to be bumped to a particular version in the runtime Docker images, or is this just an issue when we run pipelines locally?

@@ -32,7 +32,8 @@ RUN chmod ugo+x /usr/local/bin/start-elyra.sh && \
 USER $NB_USER

 RUN python -m pip install --upgrade pip && \
-python -m pip install setuptools pandas --ignore-installed --upgrade && \
+python -m pip install --ignore-installed --upgrade setuptools && \
+python -m pip install --upgrade pandas numpy papermill nbclient jupyter-client notebook && \
Member

Why do we have pandas and numpy here?

Member Author

pandas was already there; numpy usually needs to be kept in sync with pandas.

@@ -32,7 +32,8 @@ RUN chmod ugo+x /usr/local/bin/start-elyra.sh && \
 USER $NB_USER

 RUN python -m pip install --upgrade pip && \
-python -m pip install setuptools pandas --ignore-installed --upgrade && \
+python -m pip install --ignore-installed --upgrade setuptools && \
+python -m pip install --upgrade pandas numpy papermill nbclient jupyter-client notebook && \
Member

The upgrade references to papermill, nbclient, jupyter-client, and notebook are redundant with what make install does below. Is this because we aren't able to specify --upgrade-strategy eager in make install? Would it be better to add a macro, call something like make UPGRADE_STRATEGY=eager install, and update the Makefile install-server target to use --upgrade-strategy as follows...

UPGRADE_STRATEGY?=only-if-needed

...

install-server: build-server ## Install backend
	pip install --upgrade --upgrade-strategy $(UPGRADE_STRATEGY) dist/elyra-*-py3-none-any.whl
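
To make the two strategies concrete, here is a minimal Python sketch that assembles the pip command line the suggested install-server recipe would run. The helper name build_pip_install and the example wheel filename are hypothetical and not part of Elyra; the only pip options used are --upgrade and --upgrade-strategy, whose accepted values are only-if-needed and eager.

```python
from typing import List

# The two values pip accepts for --upgrade-strategy.
VALID_STRATEGIES = ("only-if-needed", "eager")


def build_pip_install(wheel: str, strategy: str = "only-if-needed") -> List[str]:
    """Return the argv for installing a wheel with a given upgrade strategy.

    Hypothetical helper that mirrors the Makefile recipe suggested above.
    """
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown upgrade strategy: {strategy!r}")
    return [
        "pip", "install", "--upgrade",
        "--upgrade-strategy", strategy,
        wheel,
    ]


if __name__ == "__main__":
    # 'eager' upgrades dependencies even when the installed version already
    # satisfies the requirement; 'only-if-needed' upgrades them only when
    # the installed version does not satisfy it.
    # The wheel filename below is a made-up placeholder.
    print(" ".join(build_pip_install("dist/elyra-example-py3-none-any.whl", "eager")))
```

With the default, already-satisfied dependencies are left alone, which is why an older jupyter-client can survive an install; passing eager asks pip to bring them up to the newest versions the requirements allow.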

Member Author

These are the papermill transient dependencies that always bite us, which is why I added them explicitly.

Member

Understood, but we list most of those dependencies as part of Elyra's installation requirements, and an eager upgrade would address the others, thus my comment about redundancy. Should we ever need to pin (cap) one of these dependencies, the approach in the PR could break the container at runtime, since a module can be installed at a version greater than its cap.

I'm OK moving forward without this but wanted to note the reason I'm bringing this up in the first place.
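
The cap concern can be sketched in Python: if the Dockerfile upgrades a package unconditionally while the project's requirements cap it, the installed version can land above the cap. The helper names and version numbers below are made up for illustration, and the parser only handles plain dotted numeric versions (no pre-release tags).

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain dotted numeric version such as '2.1.3'."""
    return tuple(int(part) for part in text.split("."))


def exceeds_cap(installed: str, cap: Optional[str]) -> bool:
    """Return True when `installed` is at or above the exclusive upper bound `cap`."""
    if cap is None:
        return False  # no cap declared, so nothing can be violated
    return parse_version(installed) >= parse_version(cap)


if __name__ == "__main__":
    # Hypothetical requirement 'somepkg<3.0' and two possible installed versions.
    print(exceeds_cap("3.1.0", "3.0"))  # True: an unconditional upgrade went past the cap
    print(exceeds_cap("2.9.4", "3.0"))  # False: still within the cap
```

A check along these lines could run as a smoke test after the image build; a real implementation would more likely rely on pip's own resolver or `pip check` than on hand-rolled version parsing.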

@ptitzler
Member

Tested the updated dev Docker image https://hub.docker.com/layers/elyra/elyra/dev/images/sha256-7df010895dfb76168eb3e1c1deae7e5f8ed51558ecedd2aa9502656a03973aca?context=explore

  • Pipeline execution worked in a local environment
  • Pipeline failed running on KFP:
[I 17:31:16.142 LabApp] Pipeline : www-0910173116
[E 17:31:16.147 LabApp] Uncaught exception POST /elyra/pipeline/schedule?1599759076098 (172.17.0.1)
    HTTPServerRequest(protocol='http', host='127.0.0.1:8888', method='POST', uri='/elyra/pipeline/schedule?1599759076098', version='HTTP/1.1', remote_ip='172.17.0.1')
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/elyra/pipeline/processor_kfp.py", line 64, in process
        kfp.compiler.Compiler().compile(pipeline_function, pipeline_path)
      File "/opt/conda/lib/python3.7/site-packages/kfp/compiler/compiler.py", line 905, in compile
        package_path=package_path)
      File "/opt/conda/lib/python3.7/site-packages/kfp/compiler/compiler.py", line 960, in _create_and_write_workflow
        pipeline_conf)
      File "/opt/conda/lib/python3.7/site-packages/kfp/compiler/compiler.py", line 804, in _create_workflow
        pipeline_func(*args_list)
      File "/opt/conda/lib/python3.7/site-packages/elyra/pipeline/processor_kfp.py", line 62, in <lambda>
        pipeline_function = lambda: self._cc_pipeline(pipeline, pipeline_name)  # nopep8 E731
      File "/opt/conda/lib/python3.7/site-packages/elyra/pipeline/processor_kfp.py", line 243, in _cc_pipeline
        image=operation.runtime_image)
      File "/opt/conda/lib/python3.7/site-packages/kfp_notebook/pipeline/_notebook_op.py", line 138, in __init__
        super().__init__(**kwargs)
    TypeError: __init__() got an unexpected keyword argument 'emptydir_volume_size'
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/tornado/web.py", line 1699, in _execute
        result = await result
      File "/opt/conda/lib/python3.7/site-packages/elyra/pipeline/handlers.py", line 89, in post
        response = await PipelineProcessorManager.instance().process(pipeline)
      File "/opt/conda/lib/python3.7/site-packages/elyra/pipeline/processor.py", line 70, in process
        res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
      File "/opt/conda/lib/python3.7/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/opt/conda/lib/python3.7/site-packages/elyra/pipeline/processor_kfp.py", line 70, in process
        format(pipeline_name, pipeline_path), str(ex)) from ex
    RuntimeError: ('Error compiling pipeline www-0910173116 at /tmp/tmp7eryairf/www-0910173116.tar.gz', "__init__() got an unexpected keyword argument 'emptydir_volume_size'")

Dunno whether this is caused by the requirements update or code that's in the master branch.

@kevin-bates
Member

kevin-bates commented Sep 10, 2020

You're using Elyra from master but not kfp-notebook from master. As a result, a recent change that spans both packages is not recognized by the older kfp-notebook.
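
A simplified Python sketch of this kind of mismatch: a caller built against a newer release passes a keyword argument that an older release's constructor does not know. OldNotebookOp and NewNotebookOp are hypothetical stand-ins, not the real kfp-notebook classes, and the argument values are invented; only the keyword name comes from the traceback above.

```python
import inspect


class OldNotebookOp:
    """Stand-in for an older release that predates the new argument."""

    def __init__(self, name: str, image: str) -> None:
        self.name = name
        self.image = image


class NewNotebookOp:
    """Stand-in for a newer release that accepts the extra argument."""

    def __init__(self, name: str, image: str, emptydir_volume_size: str = "") -> None:
        self.name = name
        self.image = image
        self.emptydir_volume_size = emptydir_volume_size


def accepts_keyword(cls: type, keyword: str) -> bool:
    """Return True if cls.__init__ declares `keyword` or a **kwargs catch-all."""
    params = inspect.signature(cls.__init__).parameters
    if keyword in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


if __name__ == "__main__":
    # Invented argument values, for illustration only.
    kwargs = {"name": "step", "image": "example/image:tag", "emptydir_volume_size": "20Gi"}
    for cls in (OldNotebookOp, NewNotebookOp):
        try:
            cls(**kwargs)
            print(cls.__name__, "ok")
        except TypeError as err:
            print(cls.__name__, "failed:", err)
```

The older stand-in raises a TypeError about an unexpected keyword argument, the same class of error shown in the traceback. Installing both packages from matching revisions avoids it.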

@ptitzler
Member

Just a formality but the following Docker images need to be part of the PR delivery:

@lresende
Member Author

lresende commented Sep 10, 2020

Well @ptitzler, officially elyra/elyra:1.1.0 should be updated with a new elyra/elyra:1.1.1 release.

@ptitzler
Member

Ah, yes, you are right. I figured that because 1.1.0 was not that useful we might as well replace it, but that might open up a can of worms. So this could be shipped as 1.1.1 or delivered as part of 1.2.

elyra/elyra:1.x.y (built using https://github.com/elyra-ai/elyra/releases/tag/v1.x.y)
elyra/elyra:latest (built using https://github.com/elyra-ai/elyra/releases/tag/v1.x.y)
elyra/elyra:dev (master)

This change will allow users to specify, at build time, whether to
update their transient Python dependencies with an 'eager' strategy
@lresende
Member Author

@akchinSTC @kevin-bates Using the updated PR with upgrade_strategy, I get the following:

Traceback (most recent call last):
  File "/opt/conda/bin/papermill", line 8, in <module>
    sys.exit(papermill())
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/papermill/cli.py", line 238, in papermill
    execution_timeout=execution_timeout,
  File "/opt/conda/lib/python3.7/site-packages/papermill/execute.py", line 106, in execute_notebook
    **engine_kwargs
  File "/opt/conda/lib/python3.7/site-packages/papermill/engines.py", line 49, in execute_notebook_with_engine
    return self.get_engine(engine_name).execute_notebook(nb, kernel_name, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/papermill/engines.py", line 343, in execute_notebook
    cls.execute_managed_notebook(nb_man, kernel_name, log_output=log_output, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/papermill/engines.py", line 402, in execute_managed_notebook
    return PapermillNotebookClient(nb_man, **final_kwargs).execute()
  File "/opt/conda/lib/python3.7/site-packages/papermill/clientwrap.py", line 43, in execute
    with self.setup_kernel(**kwargs):
  File "/opt/conda/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/opt/conda/lib/python3.7/site-packages/nbclient/client.py", line 430, in setup_kernel
    self.km = self.create_kernel_manager()
  File "/opt/conda/lib/python3.7/site-packages/nbclient/client.py", line 341, in create_kernel_manager
    self.km = self.kernel_manager_class(kernel_name=self.kernel_name, config=self.config)
  File "/opt/conda/lib/python3.7/site-packages/traitlets/traitlets.py", line 556, in __get__
    return self.get(obj, cls)
  File "/opt/conda/lib/python3.7/site-packages/traitlets/traitlets.py", line 535, in get
    value = self._validate(obj, dynamic_default())
  File "/opt/conda/lib/python3.7/site-packages/nbclient/client.py", line 233, in _kernel_manager_class_default
    from jupyter_client import AsyncKernelManager

So I guess we are back to the papermill issue where they don't take good care of their transient dependencies.

@kevin-bates
Member

The UPGRADE_STRATEGY change wasn't correct. I've suggested the changes above. @ajbozarth submitted a PR to papermill that adds a sufficient floor to jupyter_client and that's in their 2.1.3 release, so please make sure that version of papermill is what is in play when retrying, but I suspect the fix for upgrade strategy should be sufficient.
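
When retrying, a quick way to confirm the environment is healthy is to check whether the name that nbclient imports is actually available. The traceback above stops at the import of AsyncKernelManager from jupyter_client, which suggests the installed jupyter_client predates that class. The helper module_exports below is a hypothetical name for a minimal sketch of such a check.

```python
import importlib


def module_exports(module_name: str, attribute: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself is missing or broken
    return hasattr(module, attribute)


if __name__ == "__main__":
    # nbclient does `from jupyter_client import AsyncKernelManager`;
    # this reports whether that name is available in this environment.
    print(module_exports("jupyter_client", "AsyncKernelManager"))
```

Running this inside the built image before executing a pipeline would show whether the dependency floor was actually applied.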

@lresende - I didn't understand your 'revert' commit since it only addressed the addition of pandas which is not applicable to the traceback.

@akchinSTC
Member

@lresende - pushed changes. I thought we wanted a way to configure the update strategy at build time so that we could switch it back and forth.

@kevin-bates
Member

I thought we wanted a way to configure the update strategy at build time so that we could switch it back and forth.

I see. We may want that someday, but there's nothing indicating we'd need it now. This "strategy" approach was purely a means of reusing what we already list in our setup.py rather than duplicating modules, as the original update in the PR did.

If we include the ARG setting, then we'd need to use eager from the Makefile when building the image, which wasn't the case either.

I'm curious why you removed the pandas since that is something completely separate from elyra. This commit from Luciano was a separate question I had above since it has no relation to the issue he ran into regarding the upgrade strategy.

@lresende
Member Author

The whole point of the pandas discussion is that it's not related to this PR and is already in master.
@ptitzler Is that being used by some of the samples/binder/etc?
Anyway, to keep a proper history I will put it back, and we can remove it in a different commit if necessary.

@ptitzler
Member

Is that being used by some of the samples/binder/etc?

If you are asking whether I tested the Docker images with the example pipelines then yes, I did. The examples are what I tend to use for smoke testing and what I was referring to in my earlier comment "Both, local pipeline execution and pipeline execution on kfp, must succeed to pass the readiness test."

Member

@kevin-bates kevin-bates left a comment

Looks good Luciano - thank you.

@lresende lresende merged this pull request into elyra-ai:master Sep 17, 2020
@lresende lresende deleted the fix-docker-kernel-manager-class branch September 17, 2020 03:25
lresende added a commit that referenced this pull request Sep 17, 2020
Some papermill transient dependencies are not properly
declared and require a manual install/update to avoid
runtime issues.

Co-authored-by: Alan Chin <akchin@us.ibm.com>
Successfully merging this pull request may close these issues.

[Docker] pipeline execution fails with KeyError: 'kernel_manager_class'