Refactor script processors, include brief detail on generic errors #1485

kevin-bates · 2021-03-25T23:25:57Z

Although this change was instigated by #1337, it primarily focuses on the aftermath of the merges for #1411 and #1418 - which essentially crossed paths, leaving the R script processor behind the Python script processor in terms of its error handling.

While looking into #1337, it was found that Papermill release 2.3.3 handles the missing kernelspec much more gracefully than raising a KeyError. Instead, it raises a ValueError indicating the kernel could not be determined nor was one provided. However, the general error processing for notebook and both forms of script operations drop any level of details due to the unknown nature of error messages.

This PR does the following:

It introduces a new base class named ScriptOperationProcessor from which the existingPythonScriptOperationProcessor and RScriptOperationProcessor classes now derive. This base class contains 90% of the applicable code with the subclasses providing their name and argument vectors.
It introduces a log_and_raise() method on the base FIleOperationProcessor class that is available to all file-based operations. Building on the work done in Expose error due to failure of local python pipeline node execution #1411, this method checks the length of the error message and truncates it to around the max (80), replacing overflow with an ellipses (...).
Adds a test that removes the kernel metadata from a notebook node and ensures the appropriate error is raised.

When the kernelspec information is missing, the following message will be produced:

Resolves #1337

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

elyra-bot · 2021-03-25T23:26:06Z

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link:

lresende · 2021-03-26T01:12:21Z

Should we really truncate messages? I would say that the current issues we see on the dialog box limitation might not be present on different UIs? and if we truncate the message it would not be available for UIs that have the proper way to display them? Also, if it's only in the logs, what happens with managed environments such as Hub/Binder that does not expose the logs to the user?

lresende

Changing to requested changes to discuss the direction of truncating error messages on the server-side versus on the UI.

kevin-bates · 2021-03-26T02:36:52Z

We can open up the message, it wasn't there prior to this, so I thought I'd add a length to it. My concern is that we can easily get back to situations described here where some exception decides to be more verbose about things.

Also, if it's only in the logs, what happens with managed environments such as Hub/Binder that does not expose the logs to the user?

The details twisty thing essentially contains the portion of the "logs" for a user to send to an admin.

kevin-bates · 2021-03-26T15:18:00Z

Drilling down in this comment:

what happens with managed environments such as Hub/Binder that does not expose the logs to the user?

I'm curious what you're getting at here? In my opinion, exposing access to a server's log - whether that be directly or via some propagation mechanism using third-party software like Grafana - is the responsibility of the administrator/installation. We don't even instruct users to redirect the server's console output to a file where the "logs" can be persisted and "exposed".

Even if the logs were exposed to the users, are we expecting users to decipher what's going on - especially in error situations?

Are you proposing we capture things like status information during a given request so that that information can be returned to the user, or that we expose logs via some kind of service to work around the scenario where logs are intentionally not made available to the end-user?

I'm just trying to understand the end-goal here because we'll need to design/conform our logging differently (and perhaps add state management altogether).

One thought would be to assign every request a, well, request ID, that is randomly generated at the start of each request. The request ID would be logged on entry and exit from the handlers and potentially at interesting data points and certainly be included in log_pipeline_info() along with logged exceptions. It would also be included on the error dialog. Then the user would simply reference the request id as the "handle" that an administrator could use to decipher that particular request. This is tantamount to a session id associated with each request.

kevin-bates · 2021-03-26T16:49:51Z

After a sidebar (scrum), it was determined that having the traceback in the error details is sufficient log information and inclusion of state or progress is not necessary.

We also discussed possibly moving the truncation to the client since other client interfaces may be able to handle the more unwieldy messages better. We decided, for now, to continue with the truncation on the server after confirming that the full (non-truncated) message is available in the traceback. Here's a screenshot, in which I temporarily shorted the max message length from 80 to 40 characters, showing that the full message is indeed in the traceback:

Again, note that this truncation only happens for general exceptions. We explicitly handle PapermillExceutionError (for notebooks) and CalledProcessError (for scripts) exceptions and they do not apply to this discussion.

SInce it sounds like the discussion is complete, I'm removing this from DRAFT.

kiersten-stokes

Looks great!

Refactor script processors, include brief detail on generic errors

3b62a8d

kevin-bates requested a review from kiersten-stokes March 25, 2021 23:26

Cleanup initializers

df352d6

lresende approved these changes Mar 26, 2021

View reviewed changes

lresende requested changes Mar 26, 2021

View reviewed changes

kevin-bates marked this pull request as draft March 26, 2021 02:19

akchinSTC added the status:Work in Progress label Mar 26, 2021

akchinSTC added this to the 2.2.0 milestone Mar 26, 2021

kevin-bates removed the status:Work in Progress label Mar 26, 2021

kevin-bates marked this pull request as ready for review March 26, 2021 16:50

kiersten-stokes approved these changes Mar 26, 2021

View reviewed changes

akchinSTC requested a review from lresende March 26, 2021 21:37

lresende approved these changes Mar 29, 2021

View reviewed changes

lresende merged commit 9132c16 into elyra-ai:master Mar 29, 2021

kevin-bates deleted the refactor-script-processors branch September 11, 2021 23:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor script processors, include brief detail on generic errors #1485

Refactor script processors, include brief detail on generic errors #1485

kevin-bates commented Mar 25, 2021

elyra-bot bot commented Mar 25, 2021

lresende commented Mar 26, 2021

lresende left a comment •

edited

Loading

kevin-bates commented Mar 26, 2021

kevin-bates commented Mar 26, 2021

kevin-bates commented Mar 26, 2021

kiersten-stokes left a comment

Refactor script processors, include brief detail on generic errors #1485

Refactor script processors, include brief detail on generic errors #1485

Conversation

kevin-bates commented Mar 25, 2021

elyra-bot bot commented Mar 25, 2021

lresende commented Mar 26, 2021

lresende left a comment • edited Loading

Choose a reason for hiding this comment

kevin-bates commented Mar 26, 2021

kevin-bates commented Mar 26, 2021

kevin-bates commented Mar 26, 2021

kiersten-stokes left a comment

Choose a reason for hiding this comment

lresende left a comment •

edited

Loading