
Make sure metrics work with column-wise distributed training #9020

Merged: 13 commits merged into dmlc:master from vertical-metrics on Apr 17, 2023

Conversation

@rongou (Contributor) commented on Apr 10, 2023:

This is mostly moving things around so that we can test the metrics under distributed training, for both CPU and GPU, row split and column split.

rongou added 8 commits April 4, 2023 15:45

@rongou (Contributor, Author) commented on Apr 10, 2023:

@trivialfis I'll follow up with some refactoring to make the logic for row vs. column split, and for horizontal vs. vertical federated learning, more encapsulated.


Reviewed code:

    namespace xgboost {
    namespace metric {

    TEST(Metric, DeclareUnifiedTest(BinaryAUC)) {
      auto ctx = xgboost::CreateEmptyGenericParam(GPUIDX);
trivialfis (Member):

Is it necessary to move all this code to headers? Maybe copying the function declarations is enough?

rongou (Contributor, Author):

I think the problem is that since we are including the .cc file in the .cu file, we'd get duplicate symbols if we put the implementations in the .cc file.
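
For readers following along, a minimal sketch of the include trick under discussion, assuming hypothetical file names and a VerifyBinaryAUC helper (neither is the actual xgboost source layout, and the macro definitions are a guess at the real DeclareUnifiedTest):

    // metric_common.h -- shared test body; inline so that both the CPU and the
    // GPU translation unit can include it without duplicate symbols.
    #include <gtest/gtest.h>
    inline void VerifyBinaryAUC() {
      // Build predictions/labels, create the metric, check the result here.
    }

    // test_auc.cc -- CPU translation unit.  DeclareUnifiedTest maps the test
    // name so that the CPU and GPU builds register the test under different
    // names (the exact definition here is an assumption for illustration).
    #ifndef DeclareUnifiedTest
    #define DeclareUnifiedTest(name) name
    #endif
    #include "metric_common.h"
    TEST(Metric, DeclareUnifiedTest(BinaryAUC)) { VerifyBinaryAUC(); }

    // test_auc.cu -- GPU translation unit: defines the macro to prefix the
    // name, then reuses the .cc wholesale by including it.  If VerifyBinaryAUC
    // were a non-inline definition in the .cc instead of the header, both
    // object files would define it and linking would fail with duplicate
    // symbols -- the problem described above.
    #define DeclareUnifiedTest(name) GPU##name
    #include "test_auc.cc"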

trivialfis (Member):

Since we are declaring all the test bodies in headers, I think we can remove the .cc file include trick, right?

rongou (Contributor, Author):

The TESTs are still defined in the .cc file, so if we don't want to use the include trick, we'd have to redefine all the TESTs in the .cu file as well.

trivialfis (Member) commented on Apr 11, 2023:

We will probably do that in the future; let's keep the trick for now. Just out of curiosity, do we really need to modify every test for this? It seems to take quite a lot of work and a deep understanding to keep the tests up to date. Can we have something similar to the objective tests, which just compare the results from single-worker and multi-worker runs?

trivialfis (Member):

Would testing that property be easier than the current tests? My thought is that we could use something similar to the objective tests for federated learning instead of adding new parameters to the existing metric tests.

rongou (Contributor, Author):

The objective tests are at a higher level: we only run through each objective once to obtain the result, so it's simple to compare it with the result from the column-split run. The metric tests are lower level: we calculate the metric result repeatedly under different conditions, so it's harder to compare the results. I suppose we could refactor the tests further to make comparisons easier, but with the current setup this seems like the simplest approach.

trivialfis (Member) commented on Apr 15, 2023:

Apologies for the many questions, but can we make it high-level? What would be the downside of something like the following?

    for m : all_metrics:
        local_data = random()
        column_split_data = copy(local_data)
        column_split_data.info.column_split = true

        expected_result = m.evaluate(local_data)
        // alternatively, `for world_size : {1, 3}` and assert the results are the same
        column_split_result = run_with_communicator(world_size=3, []() { return m.evaluate(column_split_data); })
        assert_equal(expected_result, column_split_result)

rongou (Contributor, Author):

The main issue is that for each metric we test many variations of inputs (each GetMetricEval call is a different test case). In order to compare single-node results with column-split results, we'd have to either:

  • only test a single input per metric (a happy path), thus missing a lot of corner cases, or
  • still test all the inputs, which means repeating all the tests and introducing a lot of duplication.

Since we've already calculated the expected metric outputs, there is really no need to compare column-split results with single-node ones.

This is slightly different from the objective tests, where the outputs depend on the randomly generated data. For the metric tests we manually specify the predictions and labels, so the outputs are fixed.
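
To make the trade-off concrete, here is a rough sketch of one such hand-specified case; the exact GetMetricEval signature and the split-mode argument are assumptions for illustration, not necessarily the precise helper in the repository:

    // Sketch only: the argument order and the DataSplitMode parameter are assumed.
    TEST(Metric, DeclareUnifiedTest(BinaryAUC)) {
      auto ctx = xgboost::CreateEmptyGenericParam(GPUIDX);
      std::unique_ptr<xgboost::Metric> metric{xgboost::Metric::Create("auc", &ctx)};

      // Hand-specified predictions/labels with a known answer (AUC == 1 when the
      // positive example scores higher), so the expectation is fixed up front and
      // holds for row split and column split alike -- no single-node baseline run
      // is needed for comparison.
      EXPECT_NEAR(GetMetricEval(metric.get(), {0.1f, 0.9f}, {0.0f, 1.0f}), 1.0f, 1e-6);
      EXPECT_NEAR(GetMetricEval(metric.get(), {0.1f, 0.9f}, {0.0f, 1.0f},
                                /*weights=*/{}, /*groups=*/{},
                                xgboost::DataSplitMode::kCol),  // assumed parameter
                  1.0f, 1e-6);
    }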

trivialfis (Member) commented on Apr 17, 2023:

Got it. For future reference, it would be great if we could hide this or make it more automatic (for example, a metric test helper that generates the different test cases). Personally, I would like to keep simple things like metrics beginner friendly.

By beginner, I mean a person who knows only the mathematical equation of the metric and nothing else.

@trivialfis merged commit ba9d24f into dmlc:master on Apr 17, 2023.
@rongou deleted the vertical-metrics branch on September 25, 2023.