[MAINTENANCE] Prevent unneeded test setup/teardown #10619

tyler-hoffman · 2024-11-02T00:26:13Z

See the comment toward the bottom of conftest.py for details on the approach. A couple of notes on some of the changes here:

- We now have a mapping of TestConfig -> BatchTestSetup that is shared across test runs, so that we can reuse setups when appropriate and only run setup/teardown once.
- For this behavior to work, TestConfig needs to be hashable, so I implemented both __hash__ and __eq__.
- Because of ^, it was easiest just to push extra_data down to the base BatchTestSetup. I'm interested in reworking this a bit in a subsequent refactor.
- Because gx doesn't support multiple concurrent contexts, there's now a call to set_context in our fixture before we yield to the test. Without this, tests could run with a different test's context, resulting in errors about not finding the datasource referenced by batches.
- Because of ^, BatchTestSetup._context was made public as BatchTestSetup.context.
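
For readers who want a concrete picture of the shared mapping described above, here is a minimal sketch. It is not the code from this PR: `DemoConfig`, `DemoSetup`, and `SetupCache` are hypothetical stand-ins, and the real fixture wiring in conftest.py may look quite different. It only illustrates how a hashable config used as a dict key lets setup run once per distinct config.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical stand-in for a hashable TestConfig (frozen => hashable)."""

    name: str
    column_types: Tuple[Tuple[str, str], ...] = ()


class DemoSetup:
    """Hypothetical stand-in for a BatchTestSetup; it only counts calls."""

    def __init__(self, config: DemoConfig) -> None:
        self.config = config
        self.setup_calls = 0
        self.teardown_calls = 0

    def setup(self) -> None:
        self.setup_calls += 1

    def teardown(self) -> None:
        self.teardown_calls += 1


class SetupCache:
    """Maps each distinct config to one shared setup object."""

    def __init__(self) -> None:
        self._setups: Dict[DemoConfig, DemoSetup] = {}

    def get(self, config: DemoConfig) -> DemoSetup:
        # Equal configs hash alike, so a repeated request reuses the entry
        # instead of running setup again.
        if config not in self._setups:
            setup = DemoSetup(config)
            setup.setup()
            self._setups[config] = setup
        return self._setups[config]

    def teardown_all(self) -> None:
        # Run teardown once per cached setup, e.g. at the end of a session.
        for setup in self._setups.values():
            setup.teardown()
        self._setups.clear()


cache = SetupCache()
first = cache.get(DemoConfig("postgres", (("col_a", "INTEGER"),)))
second = cache.get(DemoConfig("postgres", (("col_a", "INTEGER"),)))
other = cache.get(DemoConfig("sqlite"))
```

Here `first` and `second` are the same object, so setup ran once for the two equal configs, while `other` got its own setup. In a pytest suite, an object like `SetupCache` would plausibly live in a session-scoped fixture that calls `teardown_all()` after the final test.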

netlify · 2024-11-02T00:26:28Z

✅ Deploy Preview for niobium-lead-7998 canceled.

Name	Link
🔨 Latest commit	`0467e98`
🔍 Latest deploy log	https://app.netlify.com/sites/niobium-lead-7998/deploys/672a7d166e65380008d49a7d

codecov · 2024-11-02T00:28:40Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.31%. Comparing base (da65e16) to head (0467e98).
Report is 1 commits behind head on develop.

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop   #10619      +/-   ##
===========================================
- Coverage    80.31%   80.31%   -0.01%     
===========================================
  Files          463      463              
  Lines        40117    40117              
===========================================
- Hits         32221    32220       -1     
- Misses        7896     7897       +1

Flag	Coverage Δ
3.10	`68.03% <ø> (ø)`
3.10 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`?`
3.10 aws_deps	`?`
3.10 big	`?`
3.10 clickhouse	`?`
3.10 filesystem	`?`
3.10 mssql	`?`
3.10 mysql	`?`
3.10 postgresql	`?`
3.10 spark_connect	`?`
3.10 trino	`?`
3.11	`68.03% <ø> (ø)`
3.11 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`?`
3.11 aws_deps	`?`
3.11 big	`?`
3.11 clickhouse	`?`
3.11 filesystem	`?`
3.11 mssql	`?`
3.11 mysql	`?`
3.11 postgresql	`?`
3.11 spark_connect	`?`
3.11 trino	`?`
3.12	`68.03% <ø> (ø)`
3.12 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`55.41% <ø> (ø)`
3.12 aws_deps	`46.14% <ø> (ø)`
3.12 big	`54.75% <ø> (ø)`
3.12 databricks	`47.88% <ø> (ø)`
3.12 filesystem	`61.71% <ø> (ø)`
3.12 mssql	`50.25% <ø> (ø)`
3.12 mysql	`50.31% <ø> (ø)`
3.12 postgresql	`54.63% <ø> (ø)`
3.12 snowflake	`48.85% <ø> (-0.01%)`	⬇️
3.12 spark	`58.06% <ø> (ø)`
3.12 spark_connect	`46.44% <ø> (ø)`
3.12 trino	`52.68% <ø> (ø)`
3.9	`68.06% <ø> (-0.01%)`	⬇️
3.9 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`55.41% <ø> (ø)`
3.9 aws_deps	`46.17% <ø> (ø)`
3.9 big	`54.76% <ø> (ø)`
3.9 clickhouse	`43.03% <ø> (ø)`
3.9 databricks	`47.89% <ø> (ø)`
3.9 filesystem	`61.72% <ø> (ø)`
3.9 mssql	`50.23% <ø> (ø)`
3.9 mysql	`50.30% <ø> (ø)`
3.9 postgresql	`54.61% <ø> (ø)`
3.9 snowflake	`48.86% <ø> (-0.01%)`	⬇️
3.9 spark	`58.02% <ø> (ø)`
3.9 spark_connect	`46.45% <ø> (ø)`
3.9 trino	`52.66% <ø> (ø)`
cloud	`0.00% <ø> (ø)`
docs-basic	`53.36% <ø> (ø)`
docs-creds-needed	`52.93% <ø> (ø)`
docs-spark	`52.41% <ø> (ø)`

Flags with carried forward coverage won't be shown.

tyler-hoffman · 2024-11-02T18:13:19Z

tests/integration/data_sources_and_expectations/test_canonical_expectations.py

@@ -148,29 +148,29 @@ class TestExpectTableRowCountToEqualOtherTable:
        data_source_configs=[
            PostgreSQLDatasourceTestConfig(
                column_types={"col_a": sqltypes.INTEGER},
-                extra_assets={"test_table_two": {"col_b": sqltypes.VARCHAR}},
+                extra_assets={"test_table_a": {"col_b": sqltypes.VARCHAR}},


TODO in a subsequent task: figure out how to generate these names ourselves like we do for the main asset. https://greatexpectations.atlassian.net/browse/CORE-586

tests/integration/conftest.py

billdirks · 2024-11-04T20:59:50Z

tests/integration/conftest.py

+        # We need to implement this ourselves to call `.equals` on dataframes.
+        if not isinstance(value, TestConfig):
+            return False
+        return all(


Can we do an equality check on the hashes? Not sure if that's better, but this seems to be very similar to that, and then we'd only have one block of code to update if this changed.

tyler-hoffman · 2024-11-05

I want to push back a bit here. I considered similar approaches. I went with this because, while more tedious, it saves us from the (very unlikely) case of hash collisions. The likelihood of that is sooo small that I'm totally happy to change to a hash comparison, but I wanted to lay out my thoughts. LMK.

billdirks · 2024-11-06

You are right, this is better.
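
To make the collision concern concrete, here is a small sketch. Both classes are hypothetical and are not the PR's TestConfig. It relies on a CPython detail: `hash(-1)` and `hash(-2)` are both `-2`, so an `__eq__` that only compares hashes treats two different values as equal, while a field-by-field `__eq__` does not.

```python
class HashEqConfig:
    """Hypothetical config whose __eq__ only compares hashes."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __hash__(self) -> int:
        return hash(self.value)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, HashEqConfig):
            return False
        # Equal hashes do not imply equal values.
        return hash(self) == hash(other)


class FieldEqConfig:
    """Hypothetical config whose __eq__ compares the underlying field."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __hash__(self) -> int:
        return hash(self.value)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, FieldEqConfig):
            return False
        return self.value == other.value


# In CPython, -1 and -2 hash to the same integer.
hash_based_equal = HashEqConfig(-1) == HashEqConfig(-2)
field_based_equal = FieldEqConfig(-1) == FieldEqConfig(-2)

# With hash-only equality, a cache keyed on the config returns the entry
# that was stored for a different config.
setups = {HashEqConfig(-1): "setup for -1"}
wrong_hit = setups.get(HashEqConfig(-2))
```

Under CPython, `hash_based_equal` is true and `wrong_hit` is the setup stored for `-1`, whereas `field_based_equal` is false. Real collisions between realistic configs are far less likely than this contrived case, which matches the "very unlikely" framing in the thread.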

tests/integration/data_sources_and_expectations/test_test_performance.py

billdirks · 2024-11-04T21:22:51Z

tests/integration/test_utils/data_source_config/base.py

+
+    @override
+    def __hash__(self) -> int:
+        hashable_col_types = dict_to_tuple(self.column_types) if self.column_types else None


If 2 objects have the same value of __eq__ are they guaranteed to have the same hash? This seems to be a requirement of hash (eg so putting them in dictionaries works as expected): https://docs.python.org/3/reference/datamodel.html#object.__hash__

tyler-hoffman · 2024-11-05

> If 2 objects have the same value of __eq__ are they guaranteed to have the same hash

If they have the same value of __eq__ AND implement __hash__, it must be the same, but most (probably all well-defined?) mutable objects do not implement __hash__.

>>> hash({"foo": "bar"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

The trick I've always seen is to use tuples, but maybe there's something cleaner?

billdirks · 2024-11-06

I don't know if we are talking about the same thing? Using tuples is fine. I'm asking: if a == b, are we guaranteed that a.__hash__() == b.__hash__()? Maybe it's true, but it wasn't apparent when I read this code.

My link didn't work because the last characters got removed from the URL (fixed now), but I'm referring to this from the docs:

> The only required property is that objects which compare equal have the same hash value; it is advised to mix together the hash values of the components of the object that also play a part in comparison of objects by packing them into a tuple and hashing the tuple.

If we use this for a key in a dict, and this isn't true, the lookup can fail since both hash and equality are used in the lookup.

tyler-hoffman · 2024-11-07

Oh yeah, I misunderstood your question. My intention was definitely to implement this so that if a == b, we are guaranteed that a.__hash__() == b.__hash__(). I'm pretty confident it does this. But LMK if you aren't convinced (or if I'm still missing something 😄 )
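
One way to make that guarantee easy to see is to derive both `__eq__` and `__hash__` from the same tuple of fields, as the Python docs quoted above suggest. The sketch below is an illustration only: `SketchConfig` is hypothetical, and this `dict_to_tuple` is an assumed implementation, since the PR's helper of that name is not shown in this thread.

```python
from typing import Dict, Optional, Tuple


def dict_to_tuple(d: Dict[str, str]) -> Tuple[Tuple[str, str], ...]:
    """Assumed helper: an order-independent, hashable view of a flat dict."""
    return tuple(sorted(d.items()))


class SketchConfig:
    """Hypothetical config; equality and hash share one key tuple."""

    def __init__(
        self, name: str, column_types: Optional[Dict[str, str]] = None
    ) -> None:
        self.name = name
        self.column_types = column_types

    def _key(self) -> tuple:
        # Dicts are unhashable, so convert to a sorted tuple of items.
        cols = dict_to_tuple(self.column_types) if self.column_types else None
        return (self.name, cols)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SketchConfig):
            return False
        return self._key() == other._key()

    def __hash__(self) -> int:
        # Same tuple as __eq__, so a == b implies hash(a) == hash(b).
        return hash(self._key())


a = SketchConfig("pg", {"col_a": "INTEGER", "col_b": "VARCHAR"})
b = SketchConfig("pg", {"col_b": "VARCHAR", "col_a": "INTEGER"})

# A dict keyed on `a` can be looked up with the equal object `b`.
registry = {a: "shared setup"}
found = registry[b]
```

Because both methods read from `_key()`, any field added to the comparison is automatically added to the hash, which also addresses the earlier concern about having only one block of code to update.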

tests/integration/conftest.py

joshua-stauffer

✅

tyler-hoffman added 3 commits October 31, 2024 15:41

Sketch out some ideas on caching

a7c0c02

Merge remote-tracking branch 'origin' into m/CORE-567/improve-test-pe…

c7d1a37

…rformance

More

8f74651

tyler-hoffman added 4 commits November 2, 2024 08:28

Less

99cce89

Closer

c0bf563

Actually get it working

8d7d687

Remove unneeded dunder methods

d86da8b

tyler-hoffman changed the title ~~M/core 567/improve test performance~~ [MAINTENANCE] Prevent unneeded test setup/teardown Nov 2, 2024

tyler-hoffman added 3 commits November 2, 2024 14:02

Undo change in unrelated file

75e49ef

Cleanup

794b4fe

More cleanup

3ad1e7e

tyler-hoffman commented Nov 2, 2024

tyler-hoffman marked this pull request as ready for review November 2, 2024 20:10

tyler-hoffman and others added 2 commits November 2, 2024 16:10

Merge branch 'develop' into m/CORE-567/improve-test-performance

fcad993

Update comments

d7d3397

tyler-hoffman requested a review from a team November 2, 2024 20:22

tyler-hoffman added 11 commits November 2, 2024 16:25

Another comment

9be17ad

Back out unwanted change

428fbd0

Rename test file.

0d72863

Merge remote-tracking branch 'origin' into m/CORE-567/improve-test-pe…

8937b81

…rformance

Spelling

d77ca2b

Remove unneeded null check

ad565bd

Back out unneeded change

c526219

Back out another unneeded change

3b3670e

Remove arg from call to DummyBatchTestSetup

b69758d

Use fixture for session-scoped cached test configs

9815374

Remove more global stuff

fb17a2f

billdirks reviewed Nov 4, 2024

tyler-hoffman and others added 5 commits November 4, 2024 16:41

Actually cache the setup teardown counts obj

b3d5138

Switch order of decorators for python 3.9

a70ade3

Merge branch 'develop' into m/CORE-567/improve-test-performance

06634e1

Spell better

03fa222

Remove a line

3d8fd78

joshua-stauffer reviewed Nov 5, 2024

joshua-stauffer approved these changes Nov 5, 2024

tyler-hoffman added this pull request to the merge queue Nov 5, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 5, 2024

tyler-hoffman added this pull request to the merge queue Nov 5, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 5, 2024

tyler-hoffman added this pull request to the merge queue Nov 5, 2024

tyler-hoffman removed this pull request from the merge queue due to a manual request Nov 5, 2024

Merge branch 'develop' into m/CORE-567/improve-test-performance

31df8b0

tyler-hoffman enabled auto-merge November 5, 2024 18:46

Temporarily skip snowflake tests around extra assets

0467e98

tyler-hoffman added this pull request to the merge queue Nov 5, 2024

Merged via the queue into develop with commit dbd4746 Nov 5, 2024
70 checks passed

tyler-hoffman deleted the m/CORE-567/improve-test-performance branch November 5, 2024 21:08
