TFX 0.23 complains upstream components were not run in interactive mode #2500

Closed
sidharths opened this issue Sep 16, 2020 · 9 comments

Comments

@sidharths

The notebook trained fine with TFX 0.22 but fails with the error below in TFX 0.23.

Please find the notebook here
https://colab.research.google.com/drive/1Qb-uv5JtyZhMl1BEVHplZmfULgV-BN45?usp=sharing

The upstream components (transformed examples in this case) were executed successfully with context.run(), yet the trainer still complains that they were not run.

ValueError                                Traceback (most recent call last)
<ipython-input-26-a501b7646f34> in <module>()
----> 1 context.run(trainer)

5 frames
/usr/local/lib/python3.6/dist-packages/tfx/components/base/base_driver.py in resolve_input_artifacts(self, input_dict, exec_properties, driver_args, pipeline_info)
    158                 'components must first be run with '
    159                 '`interactive_context.run(component)` before their outputs can '
--> 160                 'be used in downstream components.') % (artifact, name))
    161         result[name] = artifacts
    162       else:

ValueError: Unresolved input channel Artifact(artifact: custom_properties {
  key: "name"
  value {
    string_value: "transformed_examples"
  }
}
custom_properties {
  key: "pipeline_name"
  value {
    string_value: "tfx-segmentation"
  }
}
custom_properties {
  key: "producer_component"
  value {
    string_value: "Transform"
  }
}
, artifact_type: name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
) for input 'examples' was passed in interactive mode. When running in interactive mode, upstream components must first be run with `interactive_context.run(component)` before their outputs can be used in downstream components.
@sidharths sidharths changed the title TFX 0.23 complaints upstream components were not run in interactive mode TFX 0.23 complains upstream components were not run in interactive mode Sep 16, 2020
@rmothukuru rmothukuru self-assigned this Sep 17, 2020
@rmothukuru
Contributor

@sidharths,
I tried to reproduce your error but hit a different one first: FileNotFoundError: [Errno 2] No such file or directory: '/content/gdrive/My Drive/Colab Notebooks/TFX'. Could you please provide the complete code so that we can help you? Thanks!

@nroberts1

nroberts1 commented Sep 18, 2020

I found this issue and others while running the tutorial with 0.23 - https://www.tensorflow.org/tfx/tutorials/tfx/components_keras so this may be an easier way to reproduce the error.

The issues seem to start with transform.outputs, where
transform.outputs['transform_graph'].get()[0].uri
exists but
transform.outputs['transformed_examples'].get()[0].uri
does not.
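
The check described above can be sketched as a small helper that reports which outputs have no usable URI. This is only an illustration: it relies on nothing more than the `.get()` and `.uri` access shown above, and `FakeArtifact` and `FakeChannel` are hypothetical stand-ins (not TFX classes) so the snippet runs without TFX installed.

```python
from dataclasses import dataclass, field
from typing import Any, List, Mapping, Optional


@dataclass
class FakeArtifact:
    """Hypothetical stand-in for an artifact; only the `uri` field is modelled."""
    uri: Optional[str] = None


@dataclass
class FakeChannel:
    """Hypothetical stand-in for a channel; `get()` returns its artifacts."""
    artifacts: List[FakeArtifact] = field(default_factory=list)

    def get(self) -> List[FakeArtifact]:
        return self.artifacts


def outputs_missing_uri(outputs: Mapping[str, Any]) -> List[str]:
    """Return the sorted output keys whose channel has no artifact with a URI."""
    missing = []
    for key, channel in outputs.items():
        artifacts = list(channel.get())
        # An empty channel, or any artifact whose uri is None or '', counts
        # as unresolved.
        if not artifacts or any(not getattr(a, "uri", None) for a in artifacts):
            missing.append(key)
    return sorted(missing)
```

In a notebook, one would pass a plain dict built from the real component, for example `{k: transform.outputs[k] for k in ('transform_graph', 'transformed_examples')}`.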

When you get to the step in the tutorial:
trainer = Trainer( ...
you will get the output relating to 'not run in interactive mode':

ValueError: Unresolved input channel Artifact(artifact: custom_properties {
  key: "name"
  value {
    string_value: "transformed_examples"
  }
}
custom_properties {
  key: "pipeline_name"
  value {
    string_value: "interactive-2020-09-18T10_59_08.256588"
  }
}
custom_properties {
  key: "producer_component"
  value {
    string_value: "Transform"
  }
}
, artifact_type: name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
) for input 'examples' was passed in interactive mode. When running in interactive mode, upstream components must first be run with `interactive_context.run(component)` before their outputs can be used in downstream components.

In 0.22 everything works okay

Tried 0.24.0-rc0 out of curiosity but don't get very far as CsvExampleGen returns 0 artifacts

@hanneshapke

hanneshapke commented Sep 19, 2020

Hi @nroberts1, @rmothukuru & @sidharths,

I can confirm this issue!

Under 0.23: transform.outputs['transformed_examples'].get()[0].uri is None.
I have checked the MLMD (via SQLite) and I don't see any Transformed Example registered.
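
For anyone who wants to repeat the MLMD check, a minimal sketch with Python's built-in `sqlite3` module could look like this. The table and column names (`Artifact`, `Type`, `type_id`, `uri`, `name`) are assumptions about the MLMD SQLite schema and may differ between versions; the location of the database file depends on how the interactive context was set up.

```python
import sqlite3
from typing import List, Tuple


def list_artifacts(conn: sqlite3.Connection, type_name: str) -> List[Tuple[int, str]]:
    """Return (artifact id, uri) rows for all artifacts of the given type name.

    Assumes (not verified against a specific MLMD release) an `Artifact`
    table with `id`, `type_id`, `uri` columns and a `Type` table with
    `id`, `name` columns.
    """
    query = (
        "SELECT a.id, a.uri FROM Artifact AS a "
        "JOIN Type AS t ON a.type_id = t.id "
        "WHERE t.name = ? ORDER BY a.id"
    )
    return conn.execute(query, (type_name,)).fetchall()
```

Calling `list_artifacts(sqlite3.connect(path_to_metadata_db), "Examples")`, where `path_to_metadata_db` is a placeholder for your own metadata file, should list the example artifacts that were registered, if the schema assumption holds.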

I checked the Do method of the Transform component and the output_dict[TRANSFORMED_EXAMPLES_KEY] looks good.

[Artifact(artifact: id: 6
type_id: 5
uri: "/tmp/tfx-interactive-2020-09-19T21_26_25.653535-h5_omuzl/Transform/transformed_examples/5"
, artifact_type: id: 5
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]

Furthermore, the variable materialize_output_paths contains the correct example path info. In my case:

['/tmp/tfx-interactive-2020-09-19T21_26_25.653535-h5_omuzl/Transform/transformed_examples/5/train/transformed_examples', '/tmp/tfx-interactive-2020-09-19T21_26_25.653535-h5_omuzl/Transform/transformed_examples/5/eval/transformed_examples']
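
To confirm that files were actually written under those prefixes, something like the following could be used. It is a sketch that assumes each listed path is a file-name prefix and that the materialized files have names starting with that prefix; the helper name `count_materialized_files` is made up for this example.

```python
import glob
from typing import Dict, Iterable


def count_materialized_files(prefixes: Iterable[str]) -> Dict[str, int]:
    """Map each path prefix to the number of files on disk that start with it."""
    counts = {}
    for prefix in prefixes:
        # Escape the prefix so characters such as '[' in a path are matched
        # literally, then match any suffix after it.
        counts[prefix] = len(glob.glob(glob.escape(prefix) + "*"))
    return counts
```

A count of zero for a prefix would suggest that the files were never written, while a non-zero count would point to the registration step rather than the executor.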

The base_component_launcher module provided the following info for the execution_decision.output_dict:

{'transform_graph': [Artifact(artifact: id: 5
type_id: 13
uri: "/tmp/tfx-interactive-2020-09-19T21_26_25.653535-h5_omuzl/Transform/transform_graph/5"
custom_properties {
  key: "name"
  value {
    string_value: "transform_graph"
  }
}
custom_properties {
  key: "pipeline_name"
  value {
    string_value: "interactive-2020-09-19T21_26_25.653535"
  }
}
custom_properties {
  key: "producer_component"
  value {
    string_value: "Transform"
  }
}
, artifact_type: id: 13
name: "TransformGraph"
)], 'transformed_examples': [Artifact(artifact: id: 6
type_id: 5
uri: "/tmp/tfx-interactive-2020-09-19T21_26_25.653535-h5_omuzl/Transform/transformed_examples/5"
, artifact_type: id: 5
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}

No uri was registered for the transformed example, only for the graph.

Regarding 0.24: I can also confirm the issue @nroberts1 is reporting.

I don't see where the example path info is being dropped: output_dict[TRANSFORMED_EXAMPLES_KEY] looks good inside the Transform executor (Do method), but the info is missing in the launcher's execution_decision.output_dict.

@joe4k

joe4k commented Sep 19, 2020 via email

@hanneshapke

hanneshapke commented Sep 19, 2020

Hi @nroberts1, @rmothukuru & @sidharths,

Update from my side: I wasn't able to run any pipeline with 0.23, 0.24rc0, and master!

I have set up a public Colab notebook for each version's errors (@rmothukuru, those notebooks download a public data set to reproduce the errors):

Version 0.23: Colab with 0.23 errors
Version 0.24rc0: Colab with 0.24rc0 errors
(Links were updated to be accessible)

I hope this helps. @rmothukuru, feel free to reach out if you have any questions about reproducing the errors.

@hanneshapke

@rmothukuru One more comment: The Transform step runs fine in pipelines based on 0.22. I noticed that the Transform components in versions > 0.22 take a new attribute, materialize, as input. Could the bug be related to the new implementation? I have tried setting the attribute to True, but the error persists.

@hanneshapke

@nroberts1, @rmothukuru & @sidharths
My pipelines seem to run with 0.24.0rc1.

@venkat2469
Contributor

No active issues found on TFX 0.23.0. We will reopen this if any issues are found in the future.

