Add support for Python Script node #722

lresende · 2020-07-08T19:12:32Z

Note for testing:
Requires elyra-ai/kfp-notebook#36 and the following environment variables:

KFP_NOTEBOOK_BRANCH=python-script
KFP_NOTEBOOK_ORG=lresende

Todos

output of python execution
workdir of python execution
validate clicking ok, and call other api
increment pipeline version

Fixes #187

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

lresende · 2020-09-14T19:08:34Z

So this is now working with the latest code, in both KFP and local mode.
I modified my test pipeline to use a Python script and all seems to be good, but it could still use some more testing.

ptitzler · 2020-09-14T20:02:11Z

"Open Notebook" -> "Open Python File"

lresende · 2020-09-14T20:37:06Z

@ptitzler Updated to Open File

ptitzler · 2020-09-14T20:41:16Z

For notebooks we upload the completed notebooks to the COS bucket. For Python scripts we should probably do "the same" and capture STDOUT and STDERR. The script I used writes to STDOUT, but the output is not logged in the KFP log, nor is a .stdout (or .stderr) file uploaded to COS.

Relevant excerpt from execution log:

Executing Python Script : download_data.py ==> download_data.log
Processing outputs........

Not having access to STDOUT/STDERR is going to be a problem should troubleshooting or validation be performed. For example, in another run the script failed and there's no information available why:

Executing Python Script : download_data.py ==> download_data.log
Unexpected error: <class 'subprocess.CalledProcessError'>
Error details: Command '['python', 'download_data.py']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "bootstrapper.py", line 355, in <module>
    main()
  File "bootstrapper.py", line 349, in main
    file_op.execute()
  File "bootstrapper.py", line 257, in execute
    raise ex
  File "bootstrapper.py", line 244, in execute
    subprocess.check_call(['python', python_script])
  File "/usr/local/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['python', 'download_data.py']' returned non-zero exit status 1.
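
A minimal sketch of how a launcher might keep the script's output when the script fails, assuming it is started with subprocess.run() instead of subprocess.check_call(). The function name execute_script and the RuntimeError wrapper are hypothetical and are not taken from bootstrapper.py:

```python
import subprocess
import sys
from typing import Sequence


def execute_script(args: Sequence[str]) -> str:
    """Run `python <args>`; return its stdout, or raise with its stderr on failure."""
    try:
        result = subprocess.run(
            [sys.executable, *args],
            capture_output=True,  # collect stdout and stderr in memory
            text=True,            # decode the captured bytes to str
            check=True,           # raise CalledProcessError on a non-zero exit status
        )
    except subprocess.CalledProcessError as ex:
        # ex.stderr holds whatever the script wrote to STDERR before exiting
        raise RuntimeError(
            f"Script exited with status {ex.returncode}:\n{ex.stderr}"
        ) from ex
    return result.stdout
```

With this shape, a failing run would report the script's own error text next to the exit status, rather than only "returned non-zero exit status 1".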

ptitzler · 2020-09-14T21:39:12Z

... Updated to Open File

Confirmed "Open File" fix. Just realized it now always says this, which I guess is fine.

ptitzler · 2020-09-14T23:57:20Z

One more issue. The file browser seems to apply an ipynb filter, which now prevents selection of a Python script.

ptitzler · 2020-09-15T00:01:19Z

Possibly a related issue here:

I don't think there is a reason at all to apply a filter (if one was added intentionally) as a notebook or Python script can require any type of file.

elyra/pipeline/pipeline.py

elyra/pipeline/processor_local.py

elyra/pipeline/processor.py

Co-authored-by: Kevin Bates <kbates4@gmail.com>

kevin-bates · 2020-09-20T15:12:39Z

@lresende, I don't see the increment of PIPELINE_CURRENT_VERSION in constants.ts. Is that what should be incremented?

ptitzler · 2020-09-21T14:39:32Z

Please give it a quick try when you have a chance

Will do today (Monday 9/21)

kevin-bates · 2020-09-21T15:25:45Z

Regarding the version increment, I think this particular increment should be conditional on whether the pipeline contains a python script node or not - as I touched on here.

I really don't think we should unconditionally trigger a migration dialog when there is literally nothing to change. By making this particular increment conditional, older elyra versions can continue working with shared pipelines until those shared pipelines contain a python node - which will trigger a "You're running an older version of Elyra, please upgrade" message. Likewise, users of the current version will continue operating just fine.

This does mean that the version check to determine whether changes are warranted probably needs to use a list of versions, and the check would be: is this pipeline version not in the list of "acceptable" versions? If it is not in the list, prompt the migration dialog and set the pipeline version to the "minimum acceptable" version. Only when a python node is added would the pipeline version then be incremented to the "current" version. Of course, when there are changes that DO warrant a migration, the "acceptable" versions list is reset to contain only the "current" version and continues as a single entry until a "conditional" version is introduced again.
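
The proposal above could be sketched roughly as follows. This is an illustration only, written in Python for brevity even though the real check would live in the front end. The versions 2 and 3 come from this thread; the names ACCEPTABLE_VERSIONS, PYTHON_SCRIPT_NODE, open_pipeline and add_node, and the dictionary layout of a pipeline, are hypothetical:

```python
from typing import Any, Dict, List

CURRENT_VERSION = 3
# Versions that can be opened without a migration dialog (hypothetical list).
ACCEPTABLE_VERSIONS: List[int] = [2, 3]
PYTHON_SCRIPT_NODE = "python-script"  # hypothetical node type identifier


def needs_migration(version: int) -> bool:
    """A pipeline needs migration when its version is not in the acceptable list."""
    return version not in ACCEPTABLE_VERSIONS


def open_pipeline(pipeline: Dict[str, Any]) -> Dict[str, Any]:
    """Return the pipeline as it would look after opening it in the editor."""
    version = pipeline["version"]
    if version > CURRENT_VERSION:
        # Created by a newer release: ask the user to upgrade instead of migrating.
        raise ValueError("You're running an older version of Elyra, please upgrade")
    if needs_migration(version):
        # The migration dialog would be shown here; afterwards the pipeline
        # carries the minimum acceptable version, not the current one.
        return {**pipeline, "version": min(ACCEPTABLE_VERSIONS)}
    return pipeline


def add_node(pipeline: Dict[str, Any], node_type: str) -> Dict[str, Any]:
    """Add a node; only a Python script node bumps the pipeline to the current version."""
    version = CURRENT_VERSION if node_type == PYTHON_SCRIPT_NODE else pipeline["version"]
    return {**pipeline, "nodes": [*pipeline["nodes"], node_type], "version": version}
```

Under these assumptions, a version 2 pipeline that only contains notebooks stays at version 2 and remains usable by older releases, while adding a Python script node moves it to version 3.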

lresende · 2020-09-21T15:37:39Z

@lresende, I don't see the increment of PIPELINE_CURRENT_VERSION in constants.ts. Is that what should be incremented?

Forgot to push :)

lresende · 2020-09-21T15:40:33Z

For the version increment, we don't have the necessary infrastructure to enable what you are describing at the moment, but it would be something good to have for the future.

kevin-bates · 2020-09-21T17:40:26Z

I'm not seeing the version ever move from 2 to 3. I get prompted to migrate (unconditionally), save the pipeline and zero changes are made to the file (I copied the original and there are no differences after saving after migrating). As a result, I'm prompted to migrate every time I open a pipeline.

Does saving a pipeline only persist changes if there are actual differences (which, in this case, there won't be, since migration doesn't do anything 😄)?

lresende · 2020-09-21T17:43:54Z

This last commit restricts changing a node's associated file to files of the same type (e.g. if you originally created a Python script node, you can only update the file to point to another Python script file).

kevin-bates

Thanks Luciano - this is a nice feature.

ptitzler · 2020-09-21T19:14:04Z

Confirmed local execution works now as expected. Opened #939, which is not specific to this PR.
Verifying KFP execution next.

ptitzler · 2020-09-21T19:38:09Z

Confirmed that KFP execution works, but noticed:

STDERR is not captured in the node's .log file on COS
STDERR output is displayed in the JL log out of sequence:

 unpacking Complete.
 Executing Python Script : load_data.py ==> load_data.log
Uploading Python Script execution log back to Object Storage
Uploading file load_data.log as load_data.log to bucket pipeline-artifacts
Processing outputs........
Uploading file data/noaa-weather-data-jfk-airport/jfk_weather.csv as data/noaa-weather-data-jfk-airport/jfk_weather.csv to bucket pipeline-artifacts
Execution and Upload Complete.
Hello STDERR world

The last message was produced while a node was executed.

ptitzler · 2020-09-21T19:54:06Z

Confirmed that an exported pipeline containing a Python node can be successfully uploaded to KFP using the KFP UI and runs there.

LGTM for the PR pending resolution of STDERR behavior.

ajbozarth

I did a code review of the front-end code (no local testing or back-end review) and have a handful of questions and code cleanup comments. Nothing is blocking, though, if you'd rather address them in a follow-up PR.

packages/pipeline-editor/src/PipelineEditorWidget.tsx

ajbozarth · 2020-09-21T21:59:47Z

packages/pipeline-editor/src/PipelineEditorWidget.tsx

          position += 20;
+          this.setState({ showValidationError: false });
+        } else {
+          // handle error


I think you meant to actually handle the error here and not leave it as a comment

packages/pipeline-editor/src/PipelineService.tsx

packages/pipeline-editor/src/canvas.ts

lresende · 2020-09-22T06:38:16Z

The KFP execution of Python scripts now captures both stdout and stderr into the same log file. Please update/build kfp-notebook to get these changes.

ptitzler · 2020-09-22T13:47:44Z

I believe somehow a regression was introduced because I am no longer able to drag a Python script onto the canvas.

$ git status
On branch pipeline-python-script
Your branch is up to date with 'origin/pipeline-python-script'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   packages/pipeline-editor/src/canvas.ts

no changes added to commit (use "git add" and/or "git commit -a")

$ git pull origin
Already up to date.

ptitzler · 2020-09-22T17:56:07Z

With the latest fix, both local and KFP execution of a mixed pipeline work. It does appear, however, that we are not yet capturing STDOUT and STDERR output in the right sequence for KFP execution.

This code snippet in a Python node:

    sys.stderr.write('Hello STDERR 1')
    sys.stderr.flush()

    # Try to process the URL
    download_from_public_url(dataset_url)
    
    sys.stderr.write('Hello STDERR 2')
    sys.stderr.flush()

produces the following log file content:

Hello STDERR 1Hello STDERR 2Downloading data file https://dax-cdn.cdn.appdomain.cloud/dax-noaa-weather-data-jfk-airport/1.1.4/noaa-weather-data-jfk-airport.tar.gz ...
Saving downloaded file "noaa-weather-data-jfk-airport.tar.gz" as ...
Extracting downloaded file in directory "data" ...
Removing downloaded file ...

Local execution is fine:

Processing Pipeline : ww
Hello STDERR 1Downloading data file https://dax-cdn.cdn.appdomain.cloud/dax-noaa-weather-data-jfk-airport/1.1.4/noaa-weather-data-jfk-airport.tar.gz ...
Saving downloaded file "noaa-weather-data-jfk-airport.tar.gz" as ...
Extracting downloaded file in directory "data" ...
Removing downloaded file ...
Hello STDERR 2

lresende · 2020-09-22T22:01:32Z

I believe this might be a limitation on how the subprocess python API redirects stderr to stdout.

ptitzler

LGTM

kevin-bates · 2020-09-22T23:20:13Z

I believe this might be a limitation on how the subprocess python API redirects stderr to stdout.

Actually, I think this may be because remote execution is using subprocess.check_call(), while the local execution is using subprocess.run() and there's this statement from the docs...

Code needing to capture stdout or stderr should use run() instead:
run(..., check=True)

I believe I mentioned this in an earlier review. It might be worth updating kfp_notebook's python processing to use run() - primarily so the two are also the same, but we may find the two also produce the same behavior.
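
As a rough illustration of the run() suggestion, the sketch below sends a script's stdout and stderr to a single log file. The helper names run_script_with_log and demo are hypothetical, and the "-u" flag (unbuffered child output) is an added assumption: when stdout is redirected to a file the child normally buffers it, which may be one reason for out-of-order output, though this sketch does not establish that as the cause of the behavior reported above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script_with_log(script: Path, log_file: Path) -> int:
    """Run a Python script, writing its stdout and stderr to one log file."""
    with open(log_file, "w") as log:
        completed = subprocess.run(
            [sys.executable, "-u", str(script)],  # -u: do not buffer the child's output
            stdout=log,
            stderr=subprocess.STDOUT,  # merge stderr into the same file as stdout
            check=True,                # raise CalledProcessError on failure
        )
    return completed.returncode


def demo() -> str:
    """Write a small script that mixes stderr and stdout, run it, return the log."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "demo_script.py"
        script.write_text(
            "import sys\n"
            "sys.stderr.write('Hello STDERR 1\\n')\n"
            "print('Downloading data file ...')\n"
            "sys.stderr.write('Hello STDERR 2\\n')\n"
        )
        log_file = Path(tmp) / "demo_script.log"
        run_script_with_log(script, log_file)
        return log_file.read_text()
```

Calling demo() returns the log content, so the relative order of the STDOUT and STDERR lines can be inspected directly.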

ptitzler · 2020-09-24T19:20:35Z

Actually, ...

@kevin-bates @lresende do we need a separate issue to follow up on this?

kevin-bates · 2020-09-24T20:14:05Z

@kevin-bates @lresende do we need a separate issue to follow up on this?

Thanks Patrick. I just created a kfp-notebook issue for this (linked above).

lresende changed the title ~~Pipeline python script~~ [WIP] Add support for Python Script node Jul 8, 2020

lresende added the status:Work in Progress label Jul 8, 2020

lresende force-pushed the pipeline-python-script branch from f60bd79 to da63e40 Compare July 8, 2020 22:29

Add support for Python Script pipeline component

d41697c

lresende force-pushed the pipeline-python-script branch from da63e40 to d41697c Compare September 11, 2020 18:59

Enable local mode for python scripts

71c11f7

lresende force-pushed the pipeline-python-script branch from b12f9c6 to 71c11f7 Compare September 14, 2020 19:06

lresende marked this pull request as ready for review September 14, 2020 19:09

lresende changed the title ~~[WIP] Add support for Python Script node~~ Add support for Python Script node Sep 14, 2020

lresende removed the status:Work in Progress label Sep 14, 2020

Fix tests

ebdc2d4

lresende force-pushed the pipeline-python-script branch from daae64c to ebdc2d4 Compare September 14, 2020 19:28

ptitzler self-requested a review September 14, 2020 19:47

lresende requested review from ajbozarth, kevin-bates and akchinSTC September 14, 2020 20:25

Rename context menu to Open File

e8b5c3c

kevin-bates reviewed Sep 15, 2020

lresende and others added 5 commits September 15, 2020 22:26

Update elyra/pipeline/processor_local.py

43ca4e8

Co-authored-by: Kevin Bates <kbates4@gmail.com>

Refactoring common code

5a9365d

Merge branch 'master'

f4fbdf0

Fix string not callable runtime issue

5c9f489

Enable python script on property file dialog

5a12761

Another try on changing the node icon

04b9c3e

Disable changing operation type

970a433

Properly save new pipeline version

a8f22db

kevin-bates approved these changes Sep 21, 2020

ajbozarth reviewed Sep 21, 2020

lresende added 2 commits September 21, 2020 17:15

Re-enable env-vars parsing

ce5ba27

Addressing pr comments

cffeea3

Properly parse only notebooks for env_vars

ceb2580

ptitzler approved these changes Sep 22, 2020

lresende merged commit 746b347 into elyra-ai:master Sep 23, 2020

lresende deleted the pipeline-python-script branch September 23, 2020 01:02

kevin-bates mentioned this pull request Oct 8, 2020

Cannot define two nodes as an input dependency in pipelines #953

Closed

lresende mentioned this pull request Nov 13, 2020

Pipeline editor should support python script operators #106

Closed
