Fix variable formatting in jinja template used by export #1027

Merged 1 commit on Nov 4, 2020

@akchinSTC akchinSTC commented Oct 30, 2020

- Remove single quotes around pipeline inputs and outputs
- Add link and description to pipeline experiment
- Filter paths from artifact names

Fixes #1025
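
A minimal sketch of why the quoting matters for the first item. It assumes the template previously wrapped the rendered list in single quotes (the pre-fix template text is not shown in this thread) and that the template engine emits a list as its `str()` form. The helpers `render_outputs` and `is_valid_python` and the sample path are hypothetical, for illustration only.

```python
import ast


def render_outputs(outputs, quoted):
    """Mimic a template emitting a Python list into generated source (hypothetical helper)."""
    rendered = str(outputs)  # assumption: the engine renders a list as str(list)
    if quoted:
        rendered = f"'{rendered}'"  # assumed pre-fix behaviour: extra single quotes
    return f"pipeline_outputs={rendered}"


def is_valid_python(fragment):
    """Return True if the fragment parses as keyword arguments of a call."""
    try:
        ast.parse(f"dict({fragment})")
        return True
    except SyntaxError:
        return False


outputs = ["data/noaa.csv"]  # hypothetical output file
quoted_line = render_outputs(outputs, quoted=True)
unquoted_line = render_outputs(outputs, quoted=False)
```

With a non-empty list, the quoted form nests single quotes inside single quotes and no longer parses, while the unquoted form is an ordinary list literal.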

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

elyra-bot bot commented Oct 30, 2020

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@akchinSTC akchinSTC requested a review from ptitzler October 30, 2020 23:56

ptitzler commented Nov 2, 2020

The latest update produces the following output:

   # Operator for "examples/pipelines/dax_noaa_weather_data/Part 1 - Data Cleaning.ipynb"
    notebook_op_2 = NotebookOp(name='examples_pipelines_dax_noaa_weather_data_Part_1___Data_Cleaning',
                               notebook='examples/pipelines/dax_noaa_weather_data/Part 1 - Data Cleaning.ipynb',
                               cos_endpoint='http://...',
                               cos_bucket='pipeline-artifacts',
                               cos_directory='analyze_NOAA_weather_data',
                               cos_dependencies_archive='Part 1 - Data Cleaning-e07e1b7f-568b-4bc3-9fc6-da372fd58daf.tar.gz',

Note the path prefix examples/pipelines/dax_noaa_weather_data/, which shouldn't be included because references are relative to the location of the pipeline file.

Neither the original pipeline file

          "app_data": {
            "filename": "Part 1 - Data Cleaning.ipynb",
            "runtime_image": "amancevice/pandas:1.0.3",
            "env_vars": [],

nor the generated tar archives contain the path information:

$ tar -tvf Part\ 1\ -\ Data\ Cleaning-e07e1b7f-568b-4bc3-9fc6-da372fd58daf.tar.gz 
drwxr-xr-x  0 patti  staff       0 Nov  2 14:43 /
-rw-r--r--  0 patti  staff   37843 Nov  2 12:38 Part 1 - Data Cleaning.ipynb
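
One way to express "relative to the location of the pipeline file" is sketched below. This is not the change made in this PR; the function name `relative_to_pipeline` and the pipeline file name `example.pipeline` are hypothetical.

```python
from pathlib import PurePosixPath


def relative_to_pipeline(notebook_path: str, pipeline_file: str) -> str:
    """Return the notebook path relative to the directory holding the pipeline file."""
    pipeline_dir = PurePosixPath(pipeline_file).parent
    # relative_to raises ValueError if the notebook is not under pipeline_dir
    return str(PurePosixPath(notebook_path).relative_to(pipeline_dir))


# Hypothetical pipeline file name; the directory and notebook come from the output above.
pipeline_file = "examples/pipelines/dax_noaa_weather_data/example.pipeline"
notebook = "examples/pipelines/dax_noaa_weather_data/Part 1 - Data Cleaning.ipynb"
relative = relative_to_pipeline(notebook, pipeline_file)
```

Here `relative` holds only the file name, matching the `filename` entry in the pipeline file and the archive contents.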

pipeline_outputs={{ operation.pipeline_outputs }},
image='{{ operation.image }}')

notebook_op_{{ loop.index }}.name = '{{ operation.name }}'

Can you clarify why we are overriding the name instead of setting it to the appropriate value right away?

notebook_op_{{ loop.index }} = NotebookOp(name='{{ operation_id }}',
...
notebook_op_{{ loop.index }}.name = '{{ operation.name }}'

I'm not sure what the question is here. If it is about using operation_id versus operation.name, we definitely don't want to use the id as the operation name.

I think this had to do with the way the notebook op initializes its parent classes.
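
A hedged illustration of how parent-class initialization could force this pattern. The classes below are hypothetical stand-ins, not Elyra's or Kubeflow Pipelines' actual code, and the validation rule is an assumption: if a parent constructor rejects certain names, a caller can construct the op with a safe name and assign the display name afterwards.

```python
import re


class StrictBaseOp:
    """Hypothetical parent class that rejects names containing unusual characters."""

    _VALID_NAME = re.compile(r"^[A-Za-z0-9_-]+$")  # assumed rule, for illustration

    def __init__(self, name: str):
        if not self._VALID_NAME.match(name):
            raise ValueError(f"invalid operation name: {name!r}")
        self.name = name


class SketchNotebookOp(StrictBaseOp):
    """Hypothetical subclass; the name check runs inside the parent constructor."""

    def __init__(self, name: str, notebook: str):
        super().__init__(name)
        self.notebook = notebook


display_name = "Part 1 - Data Cleaning"

# Passing the display name straight to the constructor fails the parent's check.
try:
    SketchNotebookOp(name=display_name, notebook="Part 1 - Data Cleaning.ipynb")
    rejected = False
except ValueError:
    rejected = True

# Constructing with a safe name, then overriding the attribute, sidesteps the check.
op = SketchNotebookOp(name="Part_1___Data_Cleaning", notebook="Part 1 - Data Cleaning.ipynb")
op.name = display_name
```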

ptitzler commented Nov 3, 2020

Confirmed that the exported Python DSL works as expected for the tutorial pipeline.

ptitzler commented Nov 3, 2020

The exported YAML file still uses fully qualified names

...
--file "examples/pipelines/dax_noaa_weather_data/load_data.ipynb"
...
--file "examples/pipelines/dax_noaa_weather_data/Part
          2 - Data Analysis.ipynb
...
--file "examples/pipelines/dax_noaa_weather_data/Part 3 - Time Series Forecasting.ipynb"

We wouldn't have these issues if export always generated the Python DSL first and then, if YAML was selected as the output format, compiled that DSL to produce the final output.
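
The flow proposed here could look roughly like the sketch below. All names (`export_pipeline`, `render_dsl`, `compile_dsl`, the `"py"`/`"yaml"` format strings) are hypothetical; the rendering and compilation steps are passed in as callables so that the sketch makes no claim about Elyra's or the KFP SDK's real interfaces.

```python
from typing import Callable


def export_pipeline(
    pipeline: dict,
    export_format: str,
    render_dsl: Callable[[dict], str],
    compile_dsl: Callable[[str], str],
) -> str:
    """Always render the Python DSL first; derive YAML from it only when requested."""
    dsl_source = render_dsl(pipeline)  # single source of truth for both formats
    if export_format == "py":
        return dsl_source
    if export_format == "yaml":
        return compile_dsl(dsl_source)  # YAML is derived from the DSL, never rendered separately
    raise ValueError(f"unsupported export format: {export_format!r}")
```

Because both formats pass through the same rendering step, a fix such as stripping the path prefix would only need to be made once.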

@lresende lresende left a comment

Tested and compared before/after results. LGTM

@lresende lresende changed the title from "Fix kfp jinja template variable formatting" to "Fix variable formatting in jinja template used by export" Nov 3, 2020
@lresende lresende merged commit f163885 into elyra-ai:master Nov 4, 2020
Successfully merging this pull request may close these issues.

pipeline export to Python DSL produces invalid script
3 participants