Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor script processors, include brief detail on generic errors #1485

Merged
merged 2 commits into from
Mar 29, 2021

Conversation

kevin-bates
Copy link
Member

Although this change was instigated by #1337, it primarily focuses on the aftermath of the merges for #1411 and #1418 - which essentially crossed paths, leaving the R script processor behind the Python script processor in terms of its error handling.

While looking into #1337, it was found that Papermill release 2.3.3 handles the missing kernelspec much more gracefully than raising a KeyError. Instead, it raises a ValueError indicating the kernel could not be determined nor was one provided. However, the general error processing for notebook and both forms of script operations drop any level of details due to the unknown nature of error messages.

This PR does the following:

  • It introduces a new base class named ScriptOperationProcessor from which the existingPythonScriptOperationProcessor and RScriptOperationProcessor classes now derive. This base class contains 90% of the applicable code with the subclasses providing their name and argument vectors.
  • It introduces a log_and_raise() method on the base FIleOperationProcessor class that is available to all file-based operations. Building on the work done in Expose error due to failure of local python pipeline node execution #1411, this method checks the length of the error message and truncates it to around the max (80), replacing overflow with an ellipses (...).
  • Adds a test that removes the kernel metadata from a notebook node and ensures the appropriate error is raised.

When the kernelspec information is missing, the following message will be produced:
Screen Shot 2021-03-25 at 2 35 42 PM

Resolves #1337

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@elyra-bot
Copy link

elyra-bot bot commented Mar 25, 2021

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@lresende
Copy link
Member

Should we really truncate messages? I would say that the current issues we see on the dialog box limitation might not be present on different UIs? and if we truncate the message it would not be available for UIs that have the proper way to display them? Also, if it's only in the logs, what happens with managed environments such as Hub/Binder that does not expose the logs to the user?

Copy link
Member

@lresende lresende left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing to requested changes to discuss the direction of truncating error messages on the server-side versus on the UI.

@kevin-bates kevin-bates marked this pull request as draft March 26, 2021 02:19
@kevin-bates
Copy link
Member Author

We can open up the message, it wasn't there prior to this, so I thought I'd add a length to it. My concern is that we can easily get back to situations described here where some exception decides to be more verbose about things.

Also, if it's only in the logs, what happens with managed environments such as Hub/Binder that does not expose the logs to the user?

The details twisty thing essentially contains the portion of the "logs" for a user to send to an admin.

@kevin-bates
Copy link
Member Author

Drilling down in this comment:

what happens with managed environments such as Hub/Binder that does not expose the logs to the user?

I'm curious what you're getting at here? In my opinion, exposing access to a server's log - whether that be directly or via some propagation mechanism using third-party software like Grafana - is the responsibility of the administrator/installation. We don't even instruct users to redirect the server's console output to a file where the "logs" can be persisted and "exposed".

Even if the logs were exposed to the users, are we expecting users to decipher what's going on - especially in error situations?

Are you proposing we capture things like status information during a given request so that that information can be returned to the user, or that we expose logs via some kind of service to work around the scenario where logs are intentionally not made available to the end-user?

I'm just trying to understand the end-goal here because we'll need to design/conform our logging differently (and perhaps add state management altogether).

One thought would be to assign every request a, well, request ID, that is randomly generated at the start of each request. The request ID would be logged on entry and exit from the handlers and potentially at interesting data points and certainly be included in log_pipeline_info() along with logged exceptions. It would also be included on the error dialog. Then the user would simply reference the request id as the "handle" that an administrator could use to decipher that particular request. This is tantamount to a session id associated with each request.

@akchinSTC akchinSTC added the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label Mar 26, 2021
@akchinSTC akchinSTC added this to the 2.2.0 milestone Mar 26, 2021
@kevin-bates
Copy link
Member Author

After a sidebar (scrum), it was determined that having the traceback in the error details is sufficient log information and inclusion of state or progress is not necessary.

We also discussed possibly moving the truncation to the client since other client interfaces may be able to handle the more unwieldy messages better. We decided, for now, to continue with the truncation on the server after confirming that the full (non-truncated) message is available in the traceback. Here's a screenshot, in which I temporarily shorted the max message length from 80 to 40 characters, showing that the full message is indeed in the traceback:
Screen Shot 2021-03-26 at 9 36 52 AM

Again, note that this truncation only happens for general exceptions. We explicitly handle PapermillExceutionError (for notebooks) and CalledProcessError (for scripts) exceptions and they do not apply to this discussion.

SInce it sounds like the discussion is complete, I'm removing this from DRAFT.

@kevin-bates kevin-bates removed the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label Mar 26, 2021
@kevin-bates kevin-bates marked this pull request as ready for review March 26, 2021 16:50
Copy link
Member

@kiersten-stokes kiersten-stokes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@akchinSTC akchinSTC requested a review from lresende March 26, 2021 21:37
@lresende lresende merged commit 9132c16 into elyra-ai:master Mar 29, 2021
@kevin-bates kevin-bates deleted the refactor-script-processors branch September 11, 2021 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing kernelspec on new notebooks created in VSCode
4 participants