-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MAINTENANCE] Prevent unneeded test setup/teardown #10619
Conversation
✅ Deploy Preview for niobium-lead-7998 canceled.
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found. Additional details and impacted files@@ Coverage Diff @@
## develop #10619 +/- ##
===========================================
- Coverage 80.31% 80.31% -0.01%
===========================================
Files 463 463
Lines 40117 40117
===========================================
- Hits 32221 32220 -1
- Misses 7896 7897 +1 Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@@ -148,29 +148,29 @@ class TestExpectTableRowCountToEqualOtherTable: | |||
data_source_configs=[ | |||
PostgreSQLDatasourceTestConfig( | |||
column_types={"col_a": sqltypes.INTEGER}, | |||
extra_assets={"test_table_two": {"col_b": sqltypes.VARCHAR}}, | |||
extra_assets={"test_table_a": {"col_b": sqltypes.VARCHAR}}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO in a subsequent task: figure out how to generate these names ourselves like we do for the main asset. https://greatexpectations.atlassian.net/browse/CORE-586
# We need to implement this ourselves to call `.equals` on dataframes.` | ||
if not isinstance(value, TestConfig): | ||
return False | ||
return all( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do an equality check on the hashes? Not sure if that's better but this seems to very similar to that and then we'd only have 1 block of code to update if this changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to push back a bit here. I considered similar approaches. I went with this because, while more tedious, it saves us from the (very unlikely) case of hash collisions. But the likelihood of that is sooo small, so I'm totally happy to change to a hash comparison, but wanted to lay out my thoughts. LMK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this is better.
tests/integration/data_sources_and_expectations/test_test_performance.py
Outdated
Show resolved
Hide resolved
|
||
@override | ||
def __hash__(self) -> int: | ||
hashable_col_types = dict_to_tuple(self.column_types) if self.column_types else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If 2 objects have the same value of __eq__
are they guaranteed to have the same hash? This seems to be a requirement of hash (eg so putting them in dictionaries works as expected): https://docs.python.org/3/reference/datamodel.html#object.__hash__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If 2 objects have the same value of eq are they guaranteed to have the same hash
If they have the same value of eq AND implement hash, it must be the same, but most (probably all well defined?) mutable objects do not implement hash.
>>> hash({"foo": "bar"})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
The trick I've always seen is to use tuples, but maybe there's something cleaner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we are talking about the same thing? Using tuples is fine. I'm asking if a == b
are we guaranteed that a.__hash__() == b.__hash__()
? Maybe it's true but it wasn't apparent when I read this code.
My link didn't work because the last characters got removed from the url (fixed now) but I'm referring to this from docs:
The only required property is that objects which compare equal have the same hash value; it is advised to mix together the hash values of the components of the object that also play a part in comparison of objects by packing them into a tuple and hashing the tuple.
If we use this for a key in a dict, and this isn't true, the lookup can fail since both hash and equality are used in the lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, I misunderstood your question. My intention was definitely the implement this so if a == b are we guaranteed that a.hash() == b.hash(). I'm pretty confident it does this. But LMK if you aren't convinced (or if I'm still missing something 😄 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
See the comment toward the bottom of
conftest.py
for details on the approach. But a couple notes on some of the changes here:TestConfig
->BatchTestSetup
that is shared across test runs so that we can use them when appropriate and only run setup/teardown onceTestConfig
to be hashable, so I implemented both__hash__
and__eq__
.extra_data
down to the baseBatchTestSetup
. I'm interested in reworking this a bit in a subsequent refactor.set_context
in our fixture before we run yield to the test. Without this, tests were able to run with different tests' contexts, resulting in errors around not finding the datasource referenced by batchesBatchTestSetup._context
was made public asBatchTestSetup.context
invoke lint
(usesruff format
+ruff check
)For more information about contributing, visit our community resources.
After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!